# retag tappedout decks

to normalize and tag all decks from one central source of tags, I built the neo4j tag database. the issue: the up-to-date, normalized source for the "canonical" tags is metamox, which uses its own set of tags. this means I have some aliasing to do before the "source" tags and the tappedout tags are in sync. this notebook is that work!

## imports

In [1]:
import sys, os

sys.path.insert(0, os.path.realpath('../'))

In [3]:
import logging
import pickle

import networkx as nx
import numpy as np
import pandas as pd

from IPython.display import display
from ipywidgets import Button, Checkbox, Dropdown, Output, Text
from neo4j import basic_auth, GraphDatabase

from mtg.credentials import F_NEO_CONF, load_neo_config
from mtg.extract.neo4j import get_neo_tags, get_card_official_tags
from mtg.extract.tappedout import (TAPPEDOUT_SPECIAL_TAGS,
                                   TAPPEDOUT_TAGS_TO_REPLACE,
                                   TappedoutDeck,
                                   build_categories_df,
                                   get_all_categories)
from mtg.load.neo import digraph_to_neo
from mtg.utils import init_logging

In [4]:
init_logging()

## function defs

In [5]:
deck_id = '19-09-19-grixis-rogues'
d = TappedoutDeck(deck_id=deck_id, ignore_lands=False, with_tags=True)

2019-09-21 22:36:55,953 DEBUG    [mtg.extract.tappedout.get_categories:271] loading categories for deck 19-09-19-grixis-rogues


In [6]:
print(d.text_description)

1x AEtherize 
1x Abyssal Specter 
1x Acquire 
1x Adaptive Automaton 
1x Akki Underminer 
1x Aqueous Form 
1x Arbiter of the Ideal 
1x Arcane Denial 
1x Ash Barrens 
1x Ashling, the Extinguisher 
1x Auntie's Snitch 
1x Balefire Dragon 
1x Bane Alley Broker 
1x Banshee of the Dread Choir 
1x Barren Moor 
1x Bident of Thassa 
1x Bitterblossom 
1x Blasphemous Act 
1x Blatant Thievery 
1x Blazing Specter 
1x Blighted Agent 
1x Blind Zealot 
1x Blizzard Specter 
1x Blood Crypt 
1x Bloodchief Ascension 
1x Bloodforged Battle-Axe 
1x Bloodstained Mire 
1x Bojuka Bog 
1x Bonesplitter 
1x Brainstorm 
1x Bribery 
1x Broodbirth Viper 
1x Cabal Coffers 
1x Cabal Executioner 
1x Call of the Nightwing 
1x Call to the Kindred 
1x Cavalcade of Calamity 
1x Cavern of Souls 
1x Cephalid Constable 
1x Changeling Outcast 
1x Choked Estuary 
1x Chromatic Lantern 
1x Cloak and Dagger 
1x Clutch of the Undercity 
1x Coastal Piracy 
1x Coat of Arms 
1x Collective Restraint 
1x Command Tower 
1x Commander's Sph

goal: every card gets (and only gets) tags that satisfy one of the following

+ it is any of the "official" tags connected to it in neo (at any distance)
    + e.g. `(c:Card)-[*]->(t:Tag)`
+ it is a tappedout tag that has no alias to real tags
    + e.g. `#already_have_this` stays, but `#lifegain` doesn't
        + `(to:TappedoutTag {name: "#lifegain"})-[:IS_ALIAS_OF]->(t:Tag {name: "Lifegain"})`
+ it is one of the special tags that is deck-specific
    + e.g. `#amplify` stays for cards in decks where `(c:Card)-[:HAS_TAG]->(t:Tag {name: "{deck_id} - Amplify"})`

if a given tag doesn't satisfy one of those criteria, it will be removed. *note: this basically means "remove all tags that have 'real' aliases"*
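
the three rules above can be read as plain set operations. below is a minimal sketch on made-up data. `select_tags` and its arguments are hypothetical names used only for illustration and are not part of the `mtg` package. it also simplifies the third rule by keeping any special tag, without checking the per-deck `HAS_TAG` relationship in neo.

```python
def select_tags(current, official, aliased, special):
    """Return the tags one card should keep.

    All arguments are collections of already-normalized tag names
    (hypothetical inputs, for illustration only).
    """
    current = set(current)
    special = set(special)

    keep = set(official)            # rule 1: official tags always apply
    keep |= current & special       # rule 3: deck-specific tags stay
    # rule 2: remaining tappedout tags survive only if they have no alias
    keep |= current - special - set(aliased)
    return keep


kept = select_tags(
    current={"lifegain", "already have this", "amplify"},
    official={"Lifegain"},
    aliased={"lifegain"},
    special={"amplify", "engine", "standalone", "stopgap", "wincon"},
)
print(sorted(kept))
```

here `lifegain` is dropped because it has an alias, and its official counterpart `Lifegain` comes back in through the first rule.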

In [7]:
neo_conf = load_neo_config(F_NEO_CONF)

In [8]:
# hack to make sure that the get_card_official_tags function gets
# updated; remove when done
import mtg.extract.neo4j as N
import importlib
importlib.reload(N)

get_card_official_tags = N.get_card_official_tags

### official tags

In [9]:
tags_official = pd.DataFrame(get_card_official_tags(neo_conf,
                                                    card_names=d.df.name.unique().tolist()),
                             columns=['name', 'tag'])

2019-09-21 22:37:55,334 INFO     [bulk_rename_tappedout_tags.get_card_official_tags:77] loading all official tags for 387 cards


In [10]:
tags_official.head()

| | name | tag |
|---|---|---|
| 0 | Acquire | tutor |
| 1 | Acquire | tutor - opponent library |
| 2 | Arbiter of the Ideal | card advantage |
| 3 | Arcane Denial | card advantage |
| 4 | Arcane Denial | counter |


drop the deck-specific tags: they can only have reached the database from tappedout in the first place, so they are not "official"

In [11]:
is_special = (tags_official
              .tag
              .str.contains('|'.join(TAPPEDOUT_SPECIAL_TAGS), case=False))

tags_official = tags_official[~is_special]
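
as a sketch of what this filter does, here it is on a few made-up rows (the deck-specific tag names are invented, following the `{deck_id} - Amplify` pattern described earlier). joining the list with `|` builds a regex alternation, and `str.contains` does a case-insensitive substring match, so any tag that merely contains one of the special words is dropped as well.

```python
import pandas as pd

# values copied from the TAPPEDOUT_SPECIAL_TAGS output in this notebook
special = ['amplify', 'engine', 'standalone', 'stopgap', 'wincon']

# made-up rows, for illustration only
toy = pd.DataFrame(
    [("Acquire", "tutor"),
     ("Acquire", "19-09-19-grixis-rogues - Wincon"),
     ("Bitterblossom", "token source"),
     ("Bitterblossom", "19-09-19-grixis-rogues - Engine")],
    columns=["name", "tag"],
)

# 'amplify|engine|standalone|stopgap|wincon'
pattern = "|".join(special)
is_special = toy.tag.str.contains(pattern, case=False)

filtered = toy[~is_special]
print(filtered)
```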

### deck specific tags

this set of tags is constant across tappedout decks and is available as `TAPPEDOUT_SPECIAL_TAGS`:

In [12]:
TAPPEDOUT_SPECIAL_TAGS

['amplify', 'engine', 'standalone', 'stopgap', 'wincon']

### aliased tags

these are to be *removed*, not kept

In [13]:
_, _, aliases = get_neo_tags(neo_conf)
tags_aliases = pd.DataFrame(aliases, columns=['tappedout_tag', 'tag'])

2019-09-21 22:38:40,181 INFO     [bulk_rename_tappedout_tags.get_neo_tags:46] loading all known tags from bolt://localhost:7687
2019-09-21 22:38:40,277 INFO     [bulk_rename_tappedout_tags.get_neo_tags:52] loading all known tapped out alias tags from bolt://localhost:7687
2019-09-21 22:38:40,310 INFO     [bulk_rename_tappedout_tags.get_neo_tags:58] loading all known tapped out alias connections from bolt://localhost:7687


In [14]:
tags_aliases.head()

| | tappedout_tag | tag |
|---|---|---|


### actually doing the mapping

we'll create one function to do this and apply it across the records

In [15]:
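def clean_tappedout_tag(t):
    """convert a tappedout tag ('#some_tag') to its plain form ('some tag')"""
    t = t.replace('#', '').replace('_', ' ')
    # remap some of them
    t = TAPPEDOUT_TAGS_TO_REPLACE.get(t, t)
    return t

def fix_tag_list(rec):
    """build the validated, tappedout-formatted tag list for one card record"""
    name = rec['name']
    tag_list = rec.tag_list
    try:
        tags = {clean_tappedout_tag(t) for t in tag_list}
    except TypeError:
        # tag_list is not iterable (e.g. NaN for a card with no tags)
        tags = set()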
    
    validated_tags = set()

    # add the "white list" tags from the database
    validated_tags.update(tags_official
                          [tags_official.name == name]
                          .tag
                          .unique())

    # add any "deck specific" tags
    special_tags = tags.intersection(TAPPEDOUT_SPECIAL_TAGS)
    validated_tags.update(special_tags)
    tags.difference_update(special_tags)

    # drop anything aliased (alias on the other side should
    # have been added above)
    tags.difference_update(tags_aliases.tappedout_tag.values)

    # anything that remains is good to add back in
    validated_tags.update(tags)

    return ['#{}'.format(tag.replace(' ', '_'))
            for tag in sorted(validated_tags)]
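
the functions above move tag names between two spellings: the tappedout form (`#tutor_-_land`) and the plain form used on the database side (`tutor - land`). a minimal sketch of that round trip follows. `REPLACEMENTS` stands in for `TAPPEDOUT_TAGS_TO_REPLACE`, whose contents are not shown in this notebook, so its single entry is invented, and the two helper names are hypothetical.

```python
# stand-in for TAPPEDOUT_TAGS_TO_REPLACE; this entry is invented
REPLACEMENTS = {"card draw": "carddraw"}


def to_plain_form(tappedout_tag, replacements=REPLACEMENTS):
    """'#tutor_-_land' -> 'tutor - land', then apply any remapping."""
    cleaned = tappedout_tag.replace("#", "").replace("_", " ")
    return replacements.get(cleaned, cleaned)


def to_tappedout_form(plain_tag):
    """'tutor - land' -> '#tutor_-_land'."""
    return "#{}".format(plain_tag.replace(" ", "_"))


plain = to_plain_form("#tutor_-_land")
back = to_tappedout_form(plain)
remapped = to_plain_form("#card_draw")
print(plain, back, remapped)
```

one consequence of this scheme: a plain tag that contains a literal underscore would come back with a space after a round trip, since every underscore is read as a space.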

replace the existing tag column in the df with the fixed one

In [16]:
d.df.loc[:, 'tag_list_old'] = d.df.tag_list
d.df.tag_list = d.df.apply(fix_tag_list, axis=1)
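
since the old list is kept in `tag_list_old`, the two columns can be compared to see what the rewrite changed. a minimal sketch on a made-up frame that reuses the same column names (the tags are invented, and it assumes both columns hold lists, which may not be true for cards that had no tags):

```python
import pandas as pd

# made-up records, for illustration only
toy = pd.DataFrame({
    "name": ["Acquire", "Bitterblossom"],
    "tag_list_old": [["#tutor", "#steal"], ["#token_source"]],
    "tag_list": [["#tutor", "#tutor_-_opponent_library"], ["#token_source"]],
})

pairs = list(zip(toy.tag_list_old, toy.tag_list))

# tags that disappeared / appeared for each card
toy["removed"] = pd.Series([sorted(set(old) - set(new)) for old, new in pairs],
                           index=toy.index)
toy["added"] = pd.Series([sorted(set(new) - set(old)) for old, new in pairs],
                         index=toy.index)

# keep only the cards whose tags actually changed
changed = toy[(toy.removed.map(len) > 0) | (toy.added.map(len) > 0)]
print(changed[["name", "removed", "added"]])
```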

In [17]:
print(d.text_description)

1x AEtherize 
1x Abyssal Specter 
1x Acquire #tutor #tutor_-_opponent_library
1x Adaptive Automaton 
1x Akki Underminer 
1x Aqueous Form 
1x Arbiter of the Ideal #card_advantage
1x Arcane Denial #card_advantage #counter
1x Ash Barrens #cycling #land #land_-_cycling #tutor #tutor_-_land
1x Ashling, the Extinguisher 
1x Auntie's Snitch 
1x Balefire Dragon #mass_damage_-_creature #wrath
1x Bane Alley Broker 
1x Banshee of the Dread Choir #copy #copy_-_creature #copy_-_planeswalker #multiplayer
1x Barren Moor #land #land_-_cycling
1x Bident of Thassa #card_advantage #carddraw
1x Bitterblossom #token_source
1x Blasphemous Act #mass_damage_-_creature #mass_removal #wrath
1x Blatant Thievery #mass_theft #theft #theft_-_indefinite
1x Blazing Specter 
1x Blighted Agent 
1x Blind Zealot #destroy_-_creature #removal
1x Blizzard Specter 
1x Blood Crypt #land #land_-_shock
1x Bloodchief Ascension #drain_and_gain
1x Bloodforged Battle-Axe #copy #equipment
1x Bloodstained Mire #land
1x Bojuka Bog #ha

## remapping tool