SQL Backend #54

Merged
merged 58 commits into from
Sep 27, 2016
58 commits
3f66de7
Add SQLAlchemy dependency
Antar1011 Sep 6, 2016
fef31ca
Declarative classes
Antar1011 Sep 7, 2016
baf44d1
Register relationships
Antar1011 Sep 7, 2016
c8238d9
SQLAlchemy should be "sa" not "sq"
Antar1011 Sep 8, 2016
97726cb
compute_fid utility
Antar1011 Sep 9, 2016
3a5fed9
Merge branch 'master' of https://github.com/Antar1011/Onix into backend
Antar1011 Sep 9, 2016
c95a3d2
Merge branch 'master' of https://github.com/Antar1011/Onix into backend
Antar1011 Sep 9, 2016
32d4295
Refactor
Antar1011 Sep 9, 2016
8e5d2af
Close sinks when you're done with them.
Antar1011 Sep 9, 2016
fea304a
Let sinks handle batching instead of LogProcessor
Antar1011 Sep 17, 2016
7116f21
Added documentation, reworked some table structures.
Antar1011 Sep 19, 2016
a2bac04
Init test file for SQL sinks
Antar1011 Sep 19, 2016
384be35
Ignore scratch files
Antar1011 Sep 19, 2016
2500b9f
Refactor: declarative -> model
Antar1011 Sep 19, 2016
ea7e60f
Tests for model module
Antar1011 Sep 20, 2016
15e436d
Use context managers for sinks
Antar1011 Sep 21, 2016
c60f52b
Merge branch 'master' of github.com:Antar1011/Onix into backend
Antar1011 Sep 21, 2016
aac730d
Doctests for the first few sink functions
Antar1011 Sep 21, 2016
120931e
Finished conversion methods (with examples!)
Antar1011 Sep 21, 2016
ef82807
No real reason for these to be protected methods...
Antar1011 Sep 21, 2016
c6bd697
log_generator.generate_pokemon bugfix
Antar1011 Sep 21, 2016
5c8ba59
Refactor `compute_tid` to backend
Antar1011 Sep 22, 2016
253d7dd
Tests for everything but the sinks themselves
Antar1011 Sep 22, 2016
7849ddd
docstrings on fixtures are stupid
Antar1011 Sep 22, 2016
4216a27
Better way of accessing table
Antar1011 Sep 22, 2016
b56cd72
Tests for SQL implementation of MovesetSink
Antar1011 Sep 22, 2016
e6e208c
Prevent duplicate entries in join table
Antar1011 Sep 22, 2016
bd30fa6
Remove foreign key constraint from teams table
Antar1011 Sep 22, 2016
28570da
Finished SQL BattleInfoSink tests
Antar1011 Sep 22, 2016
49b0932
Initial commit for SQL DAO
Antar1011 Sep 22, 2016
8b8d91c
Initial commit for SQL DAO test file
Antar1011 Sep 22, 2016
b7a2a9c
fixtures to generate dtos
Antar1011 Sep 22, 2016
9ca41e7
Laying out what I'm expecting from the dao tests
Antar1011 Sep 22, 2016
e8e6ad3
Merge branch 'master' of https://github.com/Antar1011/Onix into backend
Antar1011 Sep 23, 2016
ce00f6e
Update SQL ReportingDAO to new interface
Antar1011 Sep 23, 2016
dc4e9b6
Renamed some tables for consistency
Antar1011 Sep 24, 2016
bd81ed2
Tests for ReportingDAO
Antar1011 Sep 24, 2016
bd2c9b9
Implemented get_number_of_battles
Antar1011 Sep 24, 2016
0c5e18b
"skill_chance" (aka weighting) metric
Antar1011 Sep 25, 2016
834bb3b
implemented get_total_weight
Antar1011 Sep 25, 2016
1a76d90
null-rating handling
Antar1011 Sep 25, 2016
c5d85b1
filter, then weight--not weight, then filter
Antar1011 Sep 25, 2016
c2545bd
Refactor queries so they chain more transparently
Antar1011 Sep 26, 2016
ca24520
Finished get_usage_by_species
Antar1011 Sep 26, 2016
b4af6d1
Finished a half-written doctest...
Antar1011 Sep 26, 2016
068fd3d
py3 test fix
Antar1011 Sep 26, 2016
b301d15
Prevent sum from being computed twice
Antar1011 Sep 26, 2016
896b2bd
need to order formes
Antar1011 Sep 26, 2016
40628a1
order formes
Antar1011 Sep 26, 2016
4efff8c
rewrite model in SQLAlchemy Core
Antar1011 Sep 26, 2016
d49c15c
rewrite sink conversion methods
Antar1011 Sep 26, 2016
7547faa
Rewrite MovesetSink
Antar1011 Sep 26, 2016
3f907ca
Rewrite BattleInfoSink
Antar1011 Sep 26, 2016
2601f87
Updated DAO
Antar1011 Sep 26, 2016
f6e9ef3
Merge pull request #55 from Antar1011/sa-core
Antar1011 Sep 26, 2016
3597316
Update sorting
Antar1011 Sep 27, 2016
dd578f5
Merge branch 'backend' of https://github.com/Antar1011/Onix into sa-core
Antar1011 Sep 27, 2016
81a0ee9
Sinks should not be closing connections
Antar1011 Sep 27, 2016
4 changes: 3 additions & 1 deletion .gitignore
@@ -3,4 +3,6 @@
*.pyc
.coverage
.cache
sample_logs
sample_logs
test.sqlite
scratch.py
1 change: 1 addition & 0 deletions onix-env2.yaml
@@ -4,6 +4,7 @@ dependencies:
- ipython
- pip
- python=2
- sqlalchemy
- pip:
  - pytest
  - pytest-cov
1 change: 1 addition & 0 deletions onix-env3.yaml
@@ -4,6 +4,7 @@ dependencies:
- ipython
- pip
- python=3
- sqlalchemy
- pip:
  - pytest
  - pytest-cov
Empty file added onix/backend/__init__.py
Empty file.
Empty file added onix/backend/sql/__init__.py
Empty file.
218 changes: 218 additions & 0 deletions onix/backend/sql/dao.py
@@ -0,0 +1,218 @@
"""DAO implementations for SQL backend"""
import calendar
import datetime

import sqlalchemy as sa

from onix import metrics
from onix.reporting import dao as _dao
from onix.backend.sql import model


class ReportingDAO(_dao.ReportingDAO):
    """
    SQL implementation of ReportingDAO interface

    Args:
        connection (sqlalchemy.engine.base.Connection) :
            connection to the SQL backend
    """
    def __init__(self, connection):
        self.conn = connection
        self.conn.connection.create_function('weight', 3, metrics.skill_chance)

    def _filtered_battles(self, month, metagame):
        """
        Filter out battles that are not in the date range, not the right
        metagame or are too short (early forfeit policy)

        Args:
            month (str) :
                the month to analyze
            metagame (str) :
                the sanitized name of the metagame

        Returns:
            sa.sql.expression.Alias :
                the filtered view of the battle_infos table
        """
        battle_infos = model.battle_infos

        month_start = datetime.datetime.strptime(month, '%Y%m').date()
        _, last_day_of_month = calendar.monthrange(month_start.year,
                                                   month_start.month)
        month_end = datetime.date(month_start.year, month_start.month,
                                  last_day_of_month)

        query = (sa.select([battle_infos.c.id])
                 .select_from(battle_infos)
                 .where(battle_infos.c.format == metagame)
                 .where(battle_infos.c.date.between(month_start, month_end)))

        # filter out early forfeits -- TODO: logic to handle non-6v6
        query = query.where(battle_infos.c.turns >= 6)
        return query.alias()

    def get_number_of_battles(self, month, metagame):
        query = self._filtered_battles(month, metagame)
        query = sa.select([sa.func.count()]).select_from(query)

        result = self.conn.execute(query)
        return result.fetchone()[0]

    def _weighted_players(self, battles, baseline):
        """
        Gets player weights for the specified battles

        Args:
            battles (sa.sql.expression.Alias) :
                the relevant battles
            baseline (float) :
                the baseline to use for skill_chance. This helper has no
                default; the public methods pass 1630 unless told otherwise.

                .. note ::
                   a baseline of zero corresponds to unweighted stats

        Returns:
            sa.sql.expression.Alias :
                the relevant battle_players table with weight added
        """
        players = model.battle_players

        join = sa.join(battles, players, onclause=battles.c.id == players.c.bid)
        query = sa.select([battles.c.id.label('bid'),
                           players.c.side.label('side'),
                           players.c.pid.label('pid'),
                           players.c.tid.label('tid'),
                           players.c.w.label('w'),
                           players.c.l.label('l'),
                           players.c.t.label('t'),
                           players.c.elo.label('elo'),
                           players.c.r.label('r'),
                           players.c.rd.label('rd'),
                           players.c.rpr.label('rpr'),
                           players.c.rprd.label('rprd')]).select_from(join)
        filtered = query.alias()

        # policy is to use provisional ratings
        r = sa.func.ifnull(filtered.c.rpr, 1500.)
        rd = sa.func.ifnull(filtered.c.rprd, 130.)

        if baseline == 0.:
            weight = sa.literal_column("1")
        elif baseline > 1500.:
            weight = sa.case([(rd > 100., 0)],
                             else_=sa.func.weight(r, rd, baseline))
        else:
            weight = sa.func.weight(r, rd, baseline)
        query = sa.select([filtered.c.bid,
                           filtered.c.side,
                           filtered.c.pid,
                           filtered.c.tid,
                           weight.label('weight')]).select_from(filtered)
        return query.alias()

    def get_total_weight(self, month, metagame, baseline=1630.):
        players = self._weighted_players(
            self._filtered_battles(month, metagame), baseline)

        query = sa.select([sa.func.sum(players.c.weight)]).select_from(players)

        result = self.conn.execute(query)
        return result.fetchone()[0] or 0

    def _weighted_team_members(self, players, species_lookup):
        """
        Gets weights for individual team members and prettifies species names

        Args:
            players (sa.sql.expression.Alias) :
                The relevant players with weights
            species_lookup (dict) :
                mapping of species names or forme-concatenations to their
                display names. This is what handles things like determining
                whether megas are tiered together or separately or what
                counts as an "appearance-only" forme.

        Returns:
            sa.sql.expression.Alias :
                the relevant teams table with prettified species names and
                weights added

        """
        teams = model.teams
        mf = model.moveset_forme
        formes = model.formes

        join = sa.join(players, teams, onclause=players.c.tid == teams.c.tid)
        join = join.join(mf, onclause=teams.c.sid == mf.c.sid)
        join = join.join(formes, onclause=mf.c.fid == formes.c.id)

        join = (sa.select([players.c.bid.label('bid'),
                           players.c.side.label('side'),
                           players.c.weight.label('weight'),
                           teams.c.idx.label('slot'),
                           teams.c.sid.label('sid'),
                           formes.c.species.label('species'),
                           mf.c.prime.label('prime')])
                .select_from(join)
                .order_by(formes.c.species)
                .order_by(mf.c.prime.desc())).alias()

        combo_formes = sa.func.group_concat(join.c.species
                                            ).label('combined_formes')
        pretty = sa.case(species_lookup,
                         value=combo_formes,
                         else_='-' + combo_formes)

        query = (sa.select([join.c.bid,
                            join.c.side,
                            join.c.weight,
                            join.c.slot,
                            join.c.sid,
                            pretty.label('species')])
                 .select_from(join)
                 .group_by(join.c.bid,
                           join.c.side,
                           join.c.slot))
        return query.alias()

    def _remove_duplicates(self, team_members):
        """
        Prevent double-counting in usage stats for metagames without species
        clause by combining team members of the same species

        Args:
            team_members (sa.sql.expression.Alias) :
                the relevant weighted team members

        Returns:
            sa.sql.expression.Alias :
                the input table with duplicate team members combined
        """
        query = (sa.select([team_members.c.bid,
                            team_members.c.side,
                            team_members.c.weight,
                            sa.func.count(team_members.c.slot).label('count'),
                            team_members.c.species])
                 .select_from(team_members).group_by(team_members.c.bid,
                                                     team_members.c.side,
                                                     team_members.c.species))
        return query.alias()

    def get_usage_by_species(self, month, metagame, species_lookup,
                             baseline=1630.):
        team_members = self._remove_duplicates(
            self._weighted_team_members(
                self._weighted_players(
                    self._filtered_battles(month, metagame),
                    baseline),
                species_lookup))

        total = sa.func.sum(team_members.c.weight).label('sum')
        query = (sa.select([team_members.c.species,
                            total])
                 .select_from(team_members)
                 .group_by(team_members.c.species)
                 .order_by(total.desc()))
        return list(self.conn.execute(query))
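
Review note on the `weight` function registered in `__init__`: the idea can be reproduced with only the standard-library `sqlite3` module, as sketched below. This is an illustration under stated assumptions, not the PR's code. `placeholder_weight` is a hypothetical stand-in for `metrics.skill_chance` (whose definition is not part of this diff), the table is cut down to four columns, and `month_bounds` is a made-up helper name that mirrors the date arithmetic in `_filtered_battles`.

```python
import calendar
import datetime
import math
import sqlite3


def month_bounds(month):
    """Return (first_day, last_day) for a 'YYYYMM' string."""
    start = datetime.datetime.strptime(month, '%Y%m').date()
    _, last_day = calendar.monthrange(start.year, start.month)
    return start, datetime.date(start.year, start.month, last_day)


def placeholder_weight(r, rd, baseline):
    """Hypothetical stand-in for metrics.skill_chance (a normal CDF)."""
    return 0.5 * (1.0 + math.erf((r - baseline) / (math.sqrt(2.0) * rd)))


conn = sqlite3.connect(':memory:')
# Expose the Python function to SQL under the name 'weight', taking 3 args.
conn.create_function('weight', 3, placeholder_weight)

conn.execute('CREATE TABLE battle_players '
             '(bid INTEGER, side INTEGER, rpr REAL, rprd REAL)')
conn.executemany('INSERT INTO battle_players VALUES (?, ?, ?, ?)',
                 [(1, 1, 1630.0, 50.0),    # rated player
                  (1, 2, None, None),      # no rating: falls back to 1500/130
                  (2, 1, 1700.0, 120.0)])  # deviation too high

# Same shape as the baseline > 1500 branch: NULL ratings are replaced by
# defaults, and players with a deviation above 100 get zero weight.
query = """
SELECT bid, side,
       CASE WHEN ifnull(rprd, 130.0) > 100.0 THEN 0
            ELSE weight(ifnull(rpr, 1500.0), ifnull(rprd, 130.0), ?)
       END AS weight
FROM battle_players
ORDER BY bid, side
"""
rows = conn.execute(query, (1630.0,)).fetchall()
print(rows)
print(month_bounds('201602'))
```

Only the first player receives a non-zero weight here, because the other two have a (defaulted or actual) deviation above 100.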
121 changes: 121 additions & 0 deletions onix/backend/sql/model.py
@@ -0,0 +1,121 @@
"""Table structure for SQL Backend"""
import sqlalchemy as sa


# SQLite foreign key enforcement derived from: https://goo.gl/okJmTL
@sa.event.listens_for(sa.engine.Engine, "connect")
def set_sqlite_pragma(dbapi_connection, connection_record):
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()

_ignore_tables = set()


# INSERT OR IGNORE handling derived from: http://goo.gl/ih2NbY
@sa.event.listens_for(sa.engine.Engine, "before_execute", retval=True)
def _ignore_insert(conn, element, multiparams, params):
    if isinstance(element, sa.sql.Insert) \
            and element.table.name in _ignore_tables:
        element = element.prefix_with("OR IGNORE")
    return element, multiparams, params

metadata = sa.MetaData()

# moveset info that's shared across formes
movesets = sa.Table('movesets', metadata,
                    sa.Column('id', sa.String(512), primary_key=True),
                    sa.Column('gender', sa.CHAR),
                    sa.Column('item', sa.String(64)),
                    sa.Column('level', sa.SmallInteger),
                    sa.Column('happiness', sa.SmallInteger))

# moves
moveslots = sa.Table('moveslots', metadata,
                     sa.Column('sid', sa.String(512),
                               sa.ForeignKey('movesets.id'), primary_key=True),
                     sa.Column('idx', sa.SmallInteger, primary_key=True),
                     sa.Column('move', sa.String(64)))
'''Note that the "idx" column refers to the position of the move after
sorting / sanitizing and doesn't reflect the actual position of the move'''

# forme info
formes = sa.Table('formes', metadata,
                  sa.Column('id', sa.String(512), primary_key=True),
                  sa.Column('species', sa.String(64), nullable=False),
                  sa.Column('ability', sa.String(64)),
                  sa.Column('hp', sa.SmallInteger),
                  sa.Column('atk', sa.SmallInteger),
                  sa.Column('dfn', sa.SmallInteger),
                  sa.Column('spa', sa.SmallInteger),
                  sa.Column('spd', sa.SmallInteger),
                  sa.Column('spe', sa.SmallInteger))

# association table for many-to-many mappings of movesets to formes
moveset_forme = sa.Table('moveset_forme', metadata,
                         sa.Column('sid', sa.String(512),
                                   sa.ForeignKey('movesets.id'),
                                   primary_key=True),
                         sa.Column('fid', sa.String(512),
                                   sa.ForeignKey('formes.id'),
                                   primary_key=True),
                         sa.Column('prime', sa.Boolean))

# team members
teams = sa.Table('teams', metadata,
                 sa.Column('tid', sa.String(512), primary_key=True),
                 sa.Column('idx', sa.SmallInteger, primary_key=True),
                 sa.Column('sid', sa.String(512), nullable=False))
'''
Note that the "idx" column refers to the position of the member after sorting
the team by SID, *not* its position on a team during battle.

Note also that the sid column should really be a foreign key in the movesets
table, but it's not so as to allow movesets and battle info to be written to the
DB in any order. Be aware, and take special care to preserve the integrity of
this table.
'''

# battle metadata
battle_infos = sa.Table('battle_infos', metadata,
                        sa.Column('id', sa.Integer, primary_key=True),
                        sa.Column('format', sa.String(64)),
                        sa.Column('date', sa.Date),
                        sa.Column('turns', sa.Integer),
                        sa.Column('end_type', sa.String(64)))

# player-instance metadata
battle_players = sa.Table('battle_players', metadata,
                          sa.Column('bid', sa.Integer,
                                    sa.ForeignKey('battle_infos.id'),
                                    primary_key=True),
                          sa.Column('side', sa.SmallInteger, primary_key=True),
                          sa.Column('pid', sa.String(512), nullable=False),
                          sa.Column('tid', sa.String(512), nullable=False),
                          sa.Column('w', sa.Integer),
                          sa.Column('l', sa.Integer),
                          sa.Column('t', sa.Integer),
                          sa.Column('elo', sa.Float),
                          sa.Column('r', sa.Float),
                          sa.Column('rd', sa.Float),
                          sa.Column('rpr', sa.Float),
                          sa.Column('rprd', sa.Float))

'''
It's possible that in the future we'll have a table that we don't want to
ignore, but for now...'''
_ignore_tables.update(metadata.tables.keys())


def create_tables(engine):
    """
    Creates all the tables for the SQL backend

    Args:
        engine (sqlalchemy.engine.base.Engine) : the database engine to use

    Returns:
        None

    """
    metadata.create_all(engine)
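
Review note on the two event listeners at the top of `model.py`: both rely on SQLite-specific behaviour, which the sketch below demonstrates with the standard-library `sqlite3` module instead of SQLAlchemy. The schema is a cut-down, hypothetical version of `movesets` and `moveslots`, and the row values are made up for illustration. It shows that `INSERT OR IGNORE` silently skips a row whose primary key already exists, and that foreign keys are only enforced once `PRAGMA foreign_keys=ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# SQLite leaves foreign keys unenforced unless this is set per connection,
# which is what the "connect" listener in model.py takes care of.
conn.execute('PRAGMA foreign_keys=ON')

conn.execute('CREATE TABLE movesets (id TEXT PRIMARY KEY, item TEXT)')
conn.execute('CREATE TABLE moveslots ('
             'sid TEXT REFERENCES movesets(id), '
             'idx INTEGER, '
             'move TEXT, '
             'PRIMARY KEY (sid, idx))')

# The second insert reuses the primary key, so OR IGNORE drops it quietly
# instead of raising an IntegrityError.
conn.execute("INSERT OR IGNORE INTO movesets VALUES ('abc', 'leftovers')")
conn.execute("INSERT OR IGNORE INTO movesets VALUES ('abc', 'lifeorb')")
moveset_rows = conn.execute('SELECT id, item FROM movesets').fetchall()

# A moveslot that points at a non-existent moveset is rejected.
try:
    conn.execute("INSERT INTO moveslots VALUES ('missing', 0, 'tackle')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(moveset_rows, fk_enforced)
```

The first-written row wins under `OR IGNORE`, which matters for tables whose ids are derived from their contents: a repeated write of the same record becomes a no-op.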