# retag tappedout decks

to get every deck normalized and tagged from one central source of tags, I built the neo4j tag database. the issue: the up-to-date, normalized source for the "canonical" tags is metamox, which uses its own set of tags. this means I have some aliasing to do before the "source" tags and the tappedout tags are in sync. this is that work!

## imports

In [1]:
import os
import sys

sys.path.insert(0, os.path.realpath('../'))

In [2]:
import logging
import pickle

import networkx as nx
import numpy as np
import pandas as pd

from IPython.display import display
from ipywidgets import Button, Checkbox, Dropdown, Output, Text
from neo4j import basic_auth, GraphDatabase

from mtg.credentials import F_NEO_CONF, load_neo_config
from mtg.extract.neo4j import get_neo_tags, get_card_official_tags
from mtg.extract.tappedout import (TAPPEDOUT_SPECIAL_TAGS,
                                   TAPPEDOUT_TAGS_TO_REPLACE,
                                   TappedoutDeck,
                                   build_categories_df,
                                   get_all_categories)
from mtg.load.nx2neo import digraph_to_neo
from mtg.utils import init_logging

In [3]:
init_logging()

## function defs

In [4]:
deck_id = '12-07-19-jeskai-edh'
d = TappedoutDeck(deck_id=deck_id, ignore_lands=False, with_tags=True)

2019-07-25 23:57:05,568 DEBUG    [mtg.extract.tappedout.get_categories:272] loading categories for deck 12-07-19-jeskai-edh


In [5]:
print(d.text_description)

1x Adarkar Wastes 
1x Aetherflux Reservoir #lifegain #spell_slinger #amplify #wincon
1x Ancient Tomb 
1x Anointed Procession #copy #token_source #amplify
1x Arcane Denial #card_advantage #counter
1x Arclight Phoenix 
1x Arid Mesa 
1x Ashnod's Altar 
1x Austere Command 
1x Azorius Chancery 
1x Azorius Signet 
1x Battle Hymn #ramp #amplify
1x Battlefield Forge 
1x Bident of Thassa 
1x Blasphemous Act 
1x Bloodstained Mire 
1x Bonus Round 
1x Boros Charm #burn #indestructible #stopgap
1x Boros Garrison 
1x Boros Signet 
1x Brainstorm #cantrip #card_advantage
1x Cascade Bluffs 
1x Cathars' Crusade 
1x Chaos Warp #removal #removal_all
1x Chromatic Lantern 
1x Chrome Mox 
1x City of Brass 
1x Clifftop Retreat 
1x Comet Storm #burn #other_removal #removal #storm
1x Command Tower 
1x Commander's Sphere 
1x Counterflux #counter
1x Counterspell #cantrip #counter #hard_counter #mass_removal #removal #stopgap
1x Cyclonic Rift #board_wipes #mass_removal #permanents_-_mass_bounce #removal #removal_a

goal: every card ends up with exactly those tags that satisfy one of the following

+ it is one of the "official" tags connected to the card in neo (at any distance)
    + e.g. `(c:Card)-[*]->(t:Tag)`
+ it is a tappedout tag that has no alias to a real tag
    + e.g. `#already_have_this` stays, but `#lifegain` doesn't
        + `(to:TappedoutTag {name: "#lifegain"})-[:IS_ALIAS_OF]->(t:Tag {name: "Lifegain"})`
+ it is one of the special deck-specific tags
    + e.g. `#amplify` stays for cards in decks where `(c:Card)-[:HAS_TAG]->(t:Tag {name: "{deck_id} - Amplify"})`

if a given tag doesn't satisfy any of those criteria, it is removed. *note: in practice this means "remove every tappedout tag that has a 'real' alias"*
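as a sanity check on the rules above, here is a minimal sketch on toy data. everything in it is hypothetical (the `SPECIAL`, `ALIASED`, and `OFFICIAL` lookups are hand-written stand-ins, not pulled from neo or tappedout), and it skips the neo `HAS_TAG` check for deck-specific tags by simply keeping any special tag the card already carries.

```python
# toy stand-ins (assumptions for illustration only)
SPECIAL = {'amplify', 'engine', 'standalone', 'stopgap', 'wincon'}
ALIASED = {'lifegain', 'counter'}  # tappedout tags that alias a real tag
OFFICIAL = {'Aetherflux Reservoir': {'Lifegain', 'Spell Slinger'}}


def keep_tags(card, tappedout_tags):
    """Apply the three rules to one card's (already cleaned) tappedout tags."""
    # rule 1: every official tag connected to the card
    kept = set(OFFICIAL.get(card, set()))
    # rule 3: deck-specific tags the card already carries
    kept |= {t for t in tappedout_tags if t in SPECIAL}
    # rule 2: tappedout tags that have no alias to a real tag
    kept |= {t for t in tappedout_tags
             if t not in SPECIAL and t not in ALIASED}
    return sorted(kept)


result = keep_tags('Aetherflux Reservoir',
                   ['lifegain', 'amplify', 'already have this'])
print(result)
```

`lifegain` is dropped because it is aliased (its official counterpart `Lifegain` comes in through rule 1), while `amplify` and `already have this` survive.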

In [6]:
neo_conf = load_neo_config(F_NEO_CONF)

In [7]:
# hack to make sure that the get_card_official_tags function gets
# updated; remove when done
import mtg.extract.neo4j as N
import importlib
importlib.reload(N)

get_card_official_tags = N.get_card_official_tags

### official tags

In [8]:
tags_official = pd.DataFrame(get_card_official_tags(neo_conf,
                                                    card_names=d.df.name.unique().tolist()),
                             columns=['name', 'tag'])

2019-07-25 23:57:53,920 INFO     [bulk_rename_tappedout_tags.get_card_official_tags:76] loading all official tags for 200 cards


In [9]:
tags_official.head()

|   | name | tag |
|---|------|-----|
| 0 | Adarkar Wastes | Pain Lands |
| 1 | Adarkar Wastes | Lands |
| 2 | Adarkar Wastes | Azorius Mana |
| 3 | Aetherflux Reservoir | 12-07-19-jeskai-edh - Amplify |
| 4 | Aetherflux Reservoir | 12-07-19-jeskai-edh - Wincon |


drop the deck-specific tags: they can only have reached neo from tappedout in the first place

In [10]:
is_special = (tags_official
              .tag
              .str.contains('|'.join(TAPPEDOUT_SPECIAL_TAGS), case=False))

tags_official = tags_official[~is_special]
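one thing to watch with the filter above: `str.contains` with a joined pattern is a substring match, so it would also drop any official tag that merely contains one of the special words. the sketch below shows the difference on a toy frame, assuming deck-specific tags always look like `{deck_id} - {Special}`. the `Search Engine` tag is made up for the example, and the anchored pattern is one possible tightening, not what the cell above does.

```python
import pandas as pd

SPECIAL = ['amplify', 'engine', 'standalone', 'stopgap', 'wincon']

# toy rows; 'Search Engine' is a hypothetical non-deck tag
toy = pd.DataFrame({
    'name': ['Aetherflux Reservoir', 'Aetherflux Reservoir', 'Some Card'],
    'tag': ['12-07-19-jeskai-edh - Amplify', 'Lifegain', 'Search Engine'],
})

# substring match, same shape as the filter above
loose = toy.tag.str.contains('|'.join(SPECIAL), case=False)

# anchored match: only "<anything> - <special tag>" at the end of the string
strict_pattern = r' - (?:{})$'.format('|'.join(SPECIAL))
strict = toy.tag.str.contains(strict_pattern, case=False)

kept_loose = toy[~loose].tag.tolist()
kept_strict = toy[~strict].tag.tolist()
print(kept_loose)
print(kept_strict)
```

if no official tag contains one of the special words, both filters give the same result and the simpler one is fine.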

### deck specific tags

this set of tags is a constant shared across all tappedout decks, available as

In [11]:
TAPPEDOUT_SPECIAL_TAGS

['amplify', 'engine', 'standalone', 'stopgap', 'wincon']

### aliased tags

these are to be *removed*, not kept

In [12]:
_, _, aliases = get_neo_tags(neo_conf)
tags_aliases = pd.DataFrame(aliases, columns=['tappedout_tag', 'tag'])

2019-07-25 23:57:54,286 INFO     [bulk_rename_tappedout_tags.get_neo_tags:45] loading all known tags from bolt://localhost:7687
2019-07-25 23:57:54,297 INFO     [bulk_rename_tappedout_tags.get_neo_tags:51] loading all known tapped out alias tags from bolt://localhost:7687
2019-07-25 23:57:54,305 INFO     [bulk_rename_tappedout_tags.get_neo_tags:57] loading all known tapped out alias connections from bolt://localhost:7687


In [13]:
tags_aliases.head()

|   | tappedout_tag | tag |
|---|---------------|-----|
| 0 | evasion | Evasive |
| 1 | extort | Extort |
| 2 | faerie | Tribal - Faerie |
| 3 | fearie | Tribal - Faerie |
| 4 | first strike | Combat Tricks |


### actually doing the mapping

we'll create one function to do this and apply it across the records

In [23]:
def clean_tappedout_tag(t):
    """normalize a tappedout tag: '#spell_slinger' -> 'spell slinger'"""
    t = t.replace('#', '').replace('_', ' ')
    # remap some of them by hand
    t = TAPPEDOUT_TAGS_TO_REPLACE.get(t, t)
    return t

def fix_tag_list(rec):
    name = rec['name']
    tag_list = rec.tag_list
    try:
        tags = {clean_tappedout_tag(t) for t in tag_list}
    except TypeError:
        # tag_list is not iterable (e.g. missing for an untagged card)
        tags = set()
    
    validated_tags = set()

    # add the "white list" tags from the database
    validated_tags.update(tags_official
                          [tags_official.name == name]
                          .tag
                          .unique())

    # add any "deck specific" tags
    special_tags = tags.intersection(TAPPEDOUT_SPECIAL_TAGS)
    validated_tags.update(special_tags)
    tags.difference_update(special_tags)

    # drop anything aliased (the official tag it aliases should
    # already have been added above)
    tags.difference_update(tags_aliases.tappedout_tag.values)

    # anything that remains is good to add back in
    validated_tags.update(tags)

    return ['#{}'.format(tag.replace(' ', '_'))
            for tag in sorted(validated_tags)]
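`fix_tag_list` filters `tags_official` and reads `tags_aliases` again for every row. that is fine for a single deck, but if it gets slow across many decks, one option is to build both lookups once up front. a minimal sketch on toy frames with the same columns (the `_toy` frames and the `official_by_card` / `aliased` names are hypothetical):

```python
import pandas as pd

# toy frames with the same columns as tags_official / tags_aliases
tags_official_toy = pd.DataFrame(
    [('Adarkar Wastes', 'Pain Lands'),
     ('Adarkar Wastes', 'Lands'),
     ('Battle Hymn', 'Ramp')],
    columns=['name', 'tag'])
tags_aliases_toy = pd.DataFrame(
    [('ramp', 'Ramp'), ('lifegain', 'Lifegain')],
    columns=['tappedout_tag', 'tag'])

# build each lookup once: card name -> set of official tags
official_by_card = tags_official_toy.groupby('name').tag.apply(set).to_dict()
# and a plain set of every aliased tappedout tag
aliased = set(tags_aliases_toy.tappedout_tag)

# per-row work is then a dict lookup and a set difference
card_tags = official_by_card.get('Adarkar Wastes', set())
leftover = {'ramp', 'pet card'} - aliased
print(sorted(card_tags), sorted(leftover))
```

inside `fix_tag_list` these would stand in for the boolean-mask filter and the `difference_update` against `tags_aliases.tappedout_tag.values`.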

replace the existing tag column in the df with the fixed one

In [24]:
d.df.loc[:, 'tag_list_old'] = d.df.tag_list
d.df.tag_list = d.df.apply(fix_tag_list, axis=1)

In [26]:
print(d.text_description)

1x Adarkar Wastes #Azorius_Mana #Lands #Pain_Lands
1x Aetherflux Reservoir #Lifegain #Spell_Slinger #amplify #spell_slinger #wincon
1x Ancient Tomb #Lands #Multiple_Colorless_Lands #Ramp
1x Anointed Procession #Copy #Token_Source #amplify #copy #token_source
1x Arcane Denial #Card_Advantage #Counter #card_advantage
1x Arclight Phoenix #Reanimate #Reanimate_-_Creature #Recursion
1x Arid Mesa #Fetches #Lands #Pain_Lands
1x Ashnod's Altar #2_Colorless #Mana_Rocks #Sacrifice_Outlet
1x Austere Command #Artifact_Removal #Artifacts_-_Mass_Destroy #Board_Wipes #Cost_Reduction #Creature_Removal #Creatures_-_Mass_Destroy #Enchantment_Removal #Enchantments_-_Mass_Destroy #Mass_Removal #Other_Cost_Reduction #Removal
1x Azorius Chancery #Azorius_Mana #Bounce_Lands #Lands
1x Azorius Signet #Azorius_Mana #Mana_Rocks #Ramp #Signets
1x Battle Hymn #Ramp #amplify
1x Battlefield Forge #Lands #Pain_Lands
1x Bident of Thassa #Bomb #Card_Advantage #Combat_Tricks #Draw #Recurring_Card_Advantage
1x Blasphemou

## remapping tool