# Magic: The Gathering Recommender System
___

##### Problem Statement:  
I will use data on Magic: The Gathering cards to build a content-based recommender system that suggests similar cards, with the goal of improving card selection during deck building.

##### Outline:  
1. Gathering Data  
    a. The data can be gathered from Scryfall's bulk data section, which provides every card in a JSON file
2. Cleaning Data  
    a. There is a lot of unnecessary data that I can drop  
    b. Extract the nested json objects
3. EDA
4. Recommender System  
    a. Content-Based Recommender  
    b. Cosine similarity
5. Stretch Goals  
    a. Keep a running tally and rating system for a user-based collaborative recommender
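
To make item 4 of the outline concrete, here is a minimal sketch of a content-based lookup driven by cosine similarity over word counts. It is not the final model: the function names `cosine_similarity` and `recommend`, the `toy_cards` dictionary, and the rules text in it are all made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(name, cards, top_n=3):
    """Rank every other card by text similarity to `name`, highest first."""
    query = Counter(cards[name].lower().split())
    scores = {
        other: cosine_similarity(query, Counter(text.lower().split()))
        for other, text in cards.items()
        if other != name
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy data: invented rules text, not real oracle text
toy_cards = {
    "Card A": "flying draw a card",
    "Card B": "flying draw two cards",
    "Card C": "trample destroy target land",
}
print(recommend("Card A", toy_cards, top_n=1))
```

In this toy data, "Card B" shares two of its four words with "Card A" while "Card C" shares none, so "Card B" ranks first. A real version would likely swap the raw word counts for weighted features built from the cleaned columns.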

##### Risks and Assumptions:  
One risk is that the data comes as nested JSON objects, which will need to be flattened into a format I can use.  
I am also limiting the scope of the data to unique cards; taking alternate printings into account can be a stretch goal.  
Another open question is what the recommender should actually recommend. For example, if a user enters 'Prized Amalgam', should it recommend other 3-mana 3/3s in U/B, or cards that work well with 'Prized Amalgam', like 'Bloodghast' or 'Narcomoeba'?

##### Data Sources:  
[Scryfall Bulk Data](https://scryfall.com/docs/api/bulk-data)  
[Scryfall Oracle Cards](https://archive.scryfall.com/json/scryfall-oracle-cards.json)

## 01 - Cleaning
___

### Imports

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import re

from nltk.tokenize import RegexpTokenizer

pd.options.display.max_columns = 35

In [2]:
df_og = pd.read_json('../Data/scryfall-oracle-cards.json')
# df = pd.read_json('../Data/oracle-cards-20200704050652.json') # older version
df = pd.read_json('../Data/oracle-cards-20200812210701.json')

In [3]:
df_og.head()

Unnamed: 0,object,id,oracle_id,multiverse_ids,mtgo_id,mtgo_foil_id,tcgplayer_id,name,lang,released_at,uri,scryfall_uri,layout,highres_image,image_uris,mana_cost,cmc,...,related_uris,preview,power,toughness,arena_id,watermark,promo_types,all_parts,frame_effects,card_faces,life_modifier,hand_modifier,loyalty,color_indicator,printed_name,flavor_name,variation_of
0,card,86bf43b1-8d4e-4759-bb2d-0b2e03ba7012,0004ebd0-dfd6-4276-b4a6-de0003e94237,[15862],15870.0,15871.0,3094.0,Static Orb,en,2001-04-11,https://api.scryfall.com/cards/86bf43b1-8d4e-4...,https://scryfall.com/card/7ed/319/static-orb?u...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{3},3.0,...,{'gatherer': 'https://gatherer.wizards.com/Pag...,,,,,,,,,,,,,,,,
1,card,7050735c-b232-47a6-a342-01795bfd0d46,0006faf6-7a61-426c-9034-579f2cfcfa83,[370780],49283.0,49284.0,69965.0,Sensory Deprivation,en,2013-07-19,https://api.scryfall.com/cards/7050735c-b232-4...,https://scryfall.com/card/m14/71/sensory-depri...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{U},1.0,...,{'gatherer': 'https://gatherer.wizards.com/Pag...,,,,,,,,,,,,,,,,
2,card,e718b21b-46d1-4844-985c-52745657b1ac,0007c283-5b7a-4c00-9ca1-b455c8dff8c3,[470580],77122.0,,196536.0,Road of Return,en,2019-08-23,https://api.scryfall.com/cards/e718b21b-46d1-4...,https://scryfall.com/card/c19/34/road-of-retur...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{G}{G},2.0,...,{'gatherer': 'https://gatherer.wizards.com/Pag...,"{'source': 'Magicshibby', 'source_uri': 'https...",,,,,,,,,,,,,,,
3,card,2955a257-302c-48df-9eec-8561cbc8374c,000d5588-5a4c-434e-988d-396632ade42c,[],,,,Storm Crow,en,2020-03-08,https://api.scryfall.com/cards/2955a257-302c-4...,https://scryfall.com/card/fmb1/31/storm-crow?u...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{1}{U},2.0,...,{'tcgplayer_decks': 'https://decks.tcgplayer.c...,,1.0,2.0,,,,,,,,,,,,,
4,card,b125d1e7-5d9b-4997-88b0-71bdfc19c6f2,000e5d65-96c3-498b-bd01-72b1a1991850,[12380],12637.0,12638.0,6412.0,Walking Sponge,en,1999-02-15,https://api.scryfall.com/cards/b125d1e7-5d9b-4...,https://scryfall.com/card/ulg/47/walking-spong...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{1}{U},2.0,...,{'gatherer': 'https://gatherer.wizards.com/Pag...,,1.0,1.0,,,,,,,,,,,,,


In [4]:
df.head()

Unnamed: 0,object,id,oracle_id,multiverse_ids,mtgo_id,mtgo_foil_id,tcgplayer_id,name,lang,released_at,uri,scryfall_uri,layout,highres_image,image_uris,mana_cost,cmc,...,preview,power,toughness,arena_id,watermark,produced_mana,all_parts,frame_effects,promo_types,card_faces,life_modifier,hand_modifier,loyalty,color_indicator,content_warning,printed_name,flavor_name
0,card,86bf43b1-8d4e-4759-bb2d-0b2e03ba7012,0004ebd0-dfd6-4276-b4a6-de0003e94237,[15862],15870.0,15871.0,3094.0,Static Orb,en,2001-04-11,https://api.scryfall.com/cards/86bf43b1-8d4e-4...,https://scryfall.com/card/7ed/319/static-orb?u...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{3},3.0,...,,,,,,,,,,,,,,,,,
1,card,7050735c-b232-47a6-a342-01795bfd0d46,0006faf6-7a61-426c-9034-579f2cfcfa83,[370780],49283.0,49284.0,69965.0,Sensory Deprivation,en,2013-07-19,https://api.scryfall.com/cards/7050735c-b232-4...,https://scryfall.com/card/m14/71/sensory-depri...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{U},1.0,...,,,,,,,,,,,,,,,,,
2,card,e718b21b-46d1-4844-985c-52745657b1ac,0007c283-5b7a-4c00-9ca1-b455c8dff8c3,[470580],77122.0,,196536.0,Road of Return,en,2019-08-23,https://api.scryfall.com/cards/e718b21b-46d1-4...,https://scryfall.com/card/c19/34/road-of-retur...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{G}{G},2.0,...,"{'source': 'Magicshibby', 'source_uri': 'https...",,,,,,,,,,,,,,,,
3,card,036ef8c9-72ac-46ce-af07-83b79d736538,000d5588-5a4c-434e-988d-396632ade42c,[83282],22609.0,22610.0,12835.0,Storm Crow,en,2005-07-29,https://api.scryfall.com/cards/036ef8c9-72ac-4...,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{1}{U},2.0,...,,1.0,2.0,,,,,,,,,,,,,,
4,card,b125d1e7-5d9b-4997-88b0-71bdfc19c6f2,000e5d65-96c3-498b-bd01-72b1a1991850,[12380],12637.0,12638.0,6412.0,Walking Sponge,en,1999-02-15,https://api.scryfall.com/cards/b125d1e7-5d9b-4...,https://scryfall.com/card/ulg/47/walking-spong...,normal,True,{'small': 'https://img.scryfall.com/cards/smal...,{1}{U},2.0,...,,1.0,1.0,,,,,,,,,,,,,,


In [5]:
[col for col in df.columns if col not in df_og.columns]



In [6]:
[col for col in df_og.columns if col not in df.columns]

['variation_of']

In [7]:
df['content_warning'].value_counts()

1.0    7

In [8]:
df['keywords'].value_counts()

[]                                                                                                          12834
[Flying]                                                                                                     1205
[Enchant]                                                                                                     846
[Trample]                                                                                                     275
[Equip]                                                                                                       245
                                                                                                            ...  
[Trample, Myriad]                                                                                               1
[Flying, Shroud, Defender]                                                                                      1
[Landwalk, Forestwalk, Protection]                                                      

In [9]:
df['prices'].head()

0    {'usd': '16.03', 'usd_foil': '84.99', 'eur': '...
1    {'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....
2    {'usd': '0.43', 'usd_foil': None, 'eur': '0.77...
3    {'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....
4    {'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....
Name: prices, dtype: object
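
The `prices` column holds one dictionary per card, with prices stored as strings or `None`. As a rough sketch of how one of those nested values could be pulled out into a plain number, the helper below (the name `usd_price` is hypothetical) reads the `usd` key and converts it to a float. The sample records are hand-written to match the shape of the output above.

```python
def usd_price(prices):
    """Return the non-foil USD price as a float, or None when it is missing."""
    value = prices.get("usd")
    return float(value) if value is not None else None


# Hypothetical records shaped like the `prices` dictionaries shown above
sample = [
    {"usd": "16.03", "usd_foil": "84.99"},
    {"usd": None, "usd_foil": "2.33"},
]
print([usd_price(p) for p in sample])
```

On the DataFrame, something like `df['prices'].apply(usd_price)` should produce a numeric column, assuming every entry is a dictionary.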

In [10]:
df['produced_mana'].value_counts()

[B, G, R, U, W]       309
[C]                   304
[G]                   110
[R]                    81
[B]                    72
[U]                    41
[B, C, G, R, U, W]     40
[G, R]                 38
[W]                    37
[B, R]                 30
[U, W]                 29
[B, U]                 29
[G, W]                 28
[B, G]                 20
[G, U]                 19
[R, W]                 17
[R, U]                 16
[B, W]                 15
[G, R, W]               8
[B, C, U]               7
[C, G, W]               7
[B, R, U]               7
[B, C, R]               7
[C, G, R]               7
[C, U, W]               7
[G, U, W]               6
[B, R, W]               5
[C, G, U]               5
[B, C, W]               5
[B, G, W]               5
[B, G, U]               5
[R, U, W]               5
[B, G, R]               5
[C, R]                  5
[B, U, W]               5
[C, U]                  5
[G, R, U]               5
[B, C, G]               5
[C, R, U]   

In [11]:
df.shape

(21635, 73)

In [12]:
df_og.shape

(21369, 70)

In [13]:
df.columns

Index(['object', 'id', 'oracle_id', 'multiverse_ids', 'mtgo_id',
       'mtgo_foil_id', 'tcgplayer_id', 'name', 'lang', 'released_at', 'uri',
       'scryfall_uri', 'layout', 'highres_image', 'image_uris', 'mana_cost',
       'cmc', 'type_line', 'oracle_text', 'colors', 'color_identity',
       'keywords', 'legalities', 'games', 'reserved', 'foil', 'nonfoil',
       'oversized', 'promo', 'reprint', 'variation', 'set', 'set_name',
       'set_type', 'set_uri', 'set_search_uri', 'scryfall_set_uri',
       'rulings_uri', 'prints_search_uri', 'collector_number', 'digital',
       'rarity', 'flavor_text', 'card_back_id', 'artist', 'artist_ids',
       'illustration_id', 'border_color', 'frame', 'full_art', 'textless',
       'booster', 'story_spotlight', 'edhrec_rank', 'prices', 'related_uris',
       'preview', 'power', 'toughness', 'arena_id', 'watermark',
       'produced_mana', 'all_parts', 'frame_effects', 'promo_types',
       'card_faces', 'life_modifier', 'hand_modifier', 'loyalty',

In [14]:
df.loc[df['layout']=='transform'].head()

Unnamed: 0,object,id,oracle_id,multiverse_ids,mtgo_id,mtgo_foil_id,tcgplayer_id,name,lang,released_at,uri,scryfall_uri,layout,highres_image,image_uris,mana_cost,cmc,...,preview,power,toughness,arena_id,watermark,produced_mana,all_parts,frame_effects,promo_types,card_faces,life_modifier,hand_modifier,loyalty,color_indicator,content_warning,printed_name,flavor_name
229,card,0dbaef61-fa39-4ea7-bc21-445401c373e7,0272ca81-e727-4f4b-b06e-072d70bb5558,"[414479, 414480]",61462.0,61463.0,120122.0,Ulvenwald Captive // Ulvenwald Abomination,en,2016-07-22,https://api.scryfall.com/cards/0dbaef61-fa39-4...,https://scryfall.com/card/emn/175/ulvenwald-ca...,transform,True,,,2.0,...,,,,,,"[C, G]",,[mooneldrazidfc],,"[{'object': 'card_face', 'name': 'Ulvenwald Ca...",,,,,,,
453,card,c1f53d7a-9dad-46e8-b686-cd1362867445,04eeb9ad-5c59-411b-8809-db8349838588,"[410049, 410050]",59810.0,59811.0,115917.0,"Westvale Abbey // Ormendahl, Profane Prince",en,2016-04-08,https://api.scryfall.com/cards/c1f53d7a-9dad-4...,https://scryfall.com/card/soi/281/westvale-abb...,transform,True,,,0.0,...,,,,,,[C],"[{'object': 'related_card', 'id': 'c1f53d7a-9d...",[sunmoondfc],,"[{'object': 'card_face', 'name': 'Westvale Abb...",,,,,,,
732,card,b6867ddd-f953-41c6-ba36-86ae2c14c908,08b3328c-1d96-4a05-ae8b-f1b654084faa,"[414313, 414314]",61316.0,61317.0,120485.0,Extricator of Sin // Extricator of Flesh,en,2016-07-22,https://api.scryfall.com/cards/b6867ddd-f953-4...,https://scryfall.com/card/emn/23/extricator-of...,transform,True,,,3.0,...,,,,,,,"[{'object': 'related_card', 'id': 'b6867ddd-f9...",[mooneldrazidfc],,"[{'object': 'card_face', 'name': 'Extricator o...",,,,,,,
948,card,c0f9c733-0818-4a03-8f0c-a163d09e0fff,0b55eac6-a745-4bf4-8926-5ce83bc38d7d,"[435410, 435411]",65528.0,65529.0,144537.0,Treasure Map // Treasure Cove,en,2017-09-29,https://api.scryfall.com/cards/c0f9c733-0818-4...,https://scryfall.com/card/xln/250/treasure-map...,transform,True,,,2.0,...,,,,66477.0,,"[B, C, G, R, U, W]","[{'object': 'related_card', 'id': 'c0f9c733-08...",[compasslanddfc],,"[{'object': 'card_face', 'name': 'Treasure Map...",,,,,,,
1114,card,3e2011f0-a640-4579-bd67-1dfbc09b8c09,0d397c05-a680-4274-972f-6a5f778b5133,"[414496, 414497]",61268.0,61269.0,119477.0,"Ulrich of the Krallenhorde // Ulrich, Uncontes...",en,2016-07-22,https://api.scryfall.com/cards/3e2011f0-a640-4...,https://scryfall.com/card/emn/191/ulrich-of-th...,transform,True,,,5.0,...,,,,,,,,[sunmoondfc],,"[{'object': 'card_face', 'name': 'Ulrich of th...",,,,,,,


In [15]:
df.loc[220, 'scryfall_uri']

'https://scryfall.com/card/dde/38/nomadic-elf?utm_source=api'

___
### Drop unneeded columns

In [16]:
unneeded = ['id', 'oracle_id', 'multiverse_ids', 'tcgplayer_id', 'uri', 'image_uris',
            'highres_image', 'games', 'set_uri', 'set_search_uri',  'scryfall_set_uri', 'rulings_uri', 
            'prints_search_uri', 'collector_number', 'card_back_id', 'artist_ids', 'illustration_id', 
            'story_spotlight', 'related_uris', 'preview', 'arena_id', 'all_parts', 'mtgo_id',
            'color_indicator', 'mtgo_foil_id', 'life_modifier', 'hand_modifier', 'frame_effects', 'flavor_text',
            'watermark', 'lang', 'released_at', 'reserved', 'foil', 'nonfoil', 'promo', 'reprint', 'variation',
            'artist', 'frame', 'full_art', 'textless', 'booster', 'promo_types', 'edhrec_rank']
df = df.drop(columns=unneeded)

In [17]:
df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,oversized,set,set_name,set_type,digital,rarity,border_color,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name
0,card,Static Orb,https://scryfall.com/card/7ed/319/static-orb?u...,normal,{3},3.0,Artifact,"As long as Static Orb is untapped, players can...",[],[],[],"{'standard': 'not_legal', 'future': 'not_legal...",False,7ed,Seventh Edition,core,False,rare,white,"{'usd': '16.03', 'usd_foil': '84.99', 'eur': '...",,,,,,,,
1,card,Sensory Deprivation,https://scryfall.com/card/m14/71/sensory-depri...,normal,{U},1.0,Enchantment — Aura,Enchant creature\nEnchanted creature gets -3/-0.,[U],[U],[Enchant],"{'standard': 'not_legal', 'future': 'not_legal...",False,m14,Magic 2014,core,False,common,black,"{'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....",,,,,,,,
2,card,Road of Return,https://scryfall.com/card/c19/34/road-of-retur...,normal,{G}{G},2.0,Sorcery,Choose one —\n• Return target permanent card f...,[G],[G],[Entwine],"{'standard': 'not_legal', 'future': 'not_legal...",False,c19,Commander 2019,commander,False,rare,black,"{'usd': '0.43', 'usd_foil': None, 'eur': '0.77...",,,,,,,,
3,card,Storm Crow,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,{1}{U},2.0,Creature — Bird,Flying (This creature can't be blocked except ...,[U],[U],[Flying],"{'standard': 'not_legal', 'future': 'not_legal...",False,9ed,Ninth Edition,core,False,common,white,"{'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....",1.0,2.0,,,,,,
4,card,Walking Sponge,https://scryfall.com/card/ulg/47/walking-spong...,normal,{1}{U},2.0,Creature — Sponge,{T}: Target creature loses your choice of flyi...,[U],[U],[],"{'standard': 'not_legal', 'future': 'not_legal...",False,ulg,Urza's Legacy,expansion,False,uncommon,black,"{'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....",1.0,1.0,,,,,,


In [18]:
df.columns

Index(['object', 'name', 'scryfall_uri', 'layout', 'mana_cost', 'cmc',
       'type_line', 'oracle_text', 'colors', 'color_identity', 'keywords',
       'legalities', 'oversized', 'set', 'set_name', 'set_type', 'digital',
       'rarity', 'border_color', 'prices', 'power', 'toughness',
       'printed_name', 'flavor_name'],
      dtype='object')

Also drop oversized cards

In [19]:
df = df.drop(df[df['oversized'] == True].index)

In [20]:
df = df.drop(columns=['oversized', 'digital'])

___
### Check for nulls

In [21]:
df.isnull().sum()

object                 0
name                   0
scryfall_uri           0
layout                 0
mana_cost            193
cmc                    0
type_line              0
oracle_text          333
colors               193
color_identity         0
keywords               0
legalities             0
set                    0
set_name               0
set_type               0
rarity                 0
border_color           0
prices                 0
power              10034
toughness          10034
produced_mana      20078
card_faces         21112
loyalty            21231
printed_name       21444
flavor_name        21444
dtype: int64

In [22]:
df.shape

(21445, 26)

Art Series cards only existed in the Modern Horizons set and are not actual cards, so we should drop them from our data set.

In [23]:
df = df.drop(df[df['layout'] == 'art_series'].index)

In [24]:
# drop all the cards from the joke sets, because they are not legal in any format
df = df.drop(df[df['set'].isin(['unh', 'ugl', 'ust'])].index)

In [25]:
df['border_color'].value_counts()

black         20125
white           690
silver          111
gold             72
borderless        1
Name: border_color, dtype: int64

In [26]:
# drop any remaining gold or silver bordered cards because those are not legal either
df = df.drop(df.loc[(df['border_color'] == 'gold') | (df['border_color'] == 'silver')].index)

### Drop tokens and non-legal cards

In [27]:
df['layout'].value_counts()

normal                19980
token                   374
transform               104
split                    86
vanguard                 75
emblem                   55
double_faced_token       33
adventure                30
saga                     25
leveler                  25
flip                     20
meld                      9
Name: layout, dtype: int64

In [28]:
non_cards_index = df[df['layout'].isin(['double_faced_token', 'token', 'vanguard', 'emblem'])].index

In [29]:
df = df.drop(non_cards_index)
df = df.drop(df[df['set_type'] == 'token'].index)
df.shape

(20246, 26)

In [30]:
df['set_type'].value_counts()

expansion           11370
masters              2744
core                 2037
commander            1588
draft_innovation     1114
duel_deck             686
starter               286
funny                 127
planechase            120
archenemy              84
memorabilia            45
box                    26
promo                  11
treasure_chest          8
Name: set_type, dtype: int64

In [31]:
df.loc[df['set_type'] == 'funny']['set'].value_counts()

cmb1     120
htr18      4
htr17      3
Name: set, dtype: int64

In [32]:
# the sets cmb1, htr17, and hho are not legal in any format, so let's drop them
# (note: htr18 also shows up under 'funny' above but is not dropped here)
joke_cards_index = df.loc[df['set'].isin(['hho', 'htr17', 'cmb1'])].index
df = df.drop(joke_cards_index)

In [33]:
# cards with the memorabilia set_type are also not legal in any format, so let's drop those as well.
non_legal_index = df.loc[df['set_type'] == 'memorabilia'].index
df = df.drop(non_legal_index)

In [34]:
# conspiracy cards are also not legal in any format
conspiracy_index = df.loc[df['type_line'] == 'Conspiracy'].index
df = df.drop(conspiracy_index)
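
The drops above identify non-legal cards by set code, border color, and type line. Another possible check, sketched here as an alternative rather than as what this notebook does, is to read the `legalities` dictionary directly. The helper name `playable_somewhere` is hypothetical, and counting `'restricted'` as playable is an assumption. The sample dictionaries are hand-written.

```python
# Statuses that still allow a card to be played in a format (assumption:
# 'restricted' counts as playable, 'banned' and 'not_legal' do not)
PLAYABLE = {"legal", "restricted"}


def playable_somewhere(legalities):
    """True if the card can be played in at least one format."""
    return any(status in PLAYABLE for status in legalities.values())


# Hypothetical records shaped like the `legalities` column
not_legal_anywhere = {"standard": "not_legal", "future": "not_legal"}
legal_in_one_format = {"standard": "not_legal", "vintage": "restricted"}

print(playable_somewhere(not_legal_anywhere))
print(playable_somewhere(legal_in_one_format))
```

A filter like this may not select exactly the same rows as the set-based drops, so the two approaches would need to be compared before switching.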

In [35]:
# now that we've cleaned up the df a little, we can drop some more extraneous columns
df = df.drop(columns=['set', 'set_name', 'set_type', 'border_color'])

In [36]:
df.isnull().sum()

object                 0
name                   0
scryfall_uri           0
layout                 0
mana_cost            104
cmc                    0
type_line              0
oracle_text          237
colors               104
color_identity         0
keywords               0
legalities             0
rarity                 0
prices                 0
power               9304
toughness           9304
produced_mana      18737
card_faces         19816
loyalty            19848
printed_name       20052
flavor_name        20052
dtype: int64

In [37]:
df[df['colors'].isnull()]['layout'].value_counts()

transform    104
Name: layout, dtype: int64

In [38]:
df[df['mana_cost'].isnull()]['layout'].value_counts()

transform    104
Name: layout, dtype: int64

In [39]:
df[df['oracle_text'].isnull()]['layout'].value_counts()

transform    104
split         83
adventure     30
flip          20
Name: layout, dtype: int64

In [40]:
df = df.reset_index(drop=True)
df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name
0,card,Static Orb,https://scryfall.com/card/7ed/319/static-orb?u...,normal,{3},3.0,Artifact,"As long as Static Orb is untapped, players can...",[],[],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '16.03', 'usd_foil': '84.99', 'eur': '...",,,,,,,,
1,card,Sensory Deprivation,https://scryfall.com/card/m14/71/sensory-depri...,normal,{U},1.0,Enchantment — Aura,Enchant creature\nEnchanted creature gets -3/-0.,[U],[U],[Enchant],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....",,,,,,,,
2,card,Road of Return,https://scryfall.com/card/c19/34/road-of-retur...,normal,{G}{G},2.0,Sorcery,Choose one —\n• Return target permanent card f...,[G],[G],[Entwine],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.43', 'usd_foil': None, 'eur': '0.77...",,,,,,,,
3,card,Storm Crow,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,{1}{U},2.0,Creature — Bird,Flying (This creature can't be blocked except ...,[U],[U],[Flying],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....",1.0,2.0,,,,,,
4,card,Walking Sponge,https://scryfall.com/card/ulg/47/walking-spong...,normal,{1}{U},2.0,Creature — Sponge,{T}: Target creature loses your choice of flyi...,[U],[U],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....",1.0,1.0,,,,,,


___
From here I'd like to deal with the dual cards (transform, split, adventure, and flip cards).  

For flip and transform cards I want to keep the names the same because I want the recommender to recommend the whole card, not just one half of it.  

First let's deal with the transform cards.  
Since the two sides of a transform card can have different values, I'm going to add separate columns that keep track of the transformed side's values. For example, Delver of Secrets // Insectile Aberration will stay a single row, with one power column holding Delver of Secrets's power and another holding Insectile Aberration's power. The same goes for the other values on the card.
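
Before applying this to the DataFrame, the idea can be sketched on a single hand-written record. The helper name `face_field` is hypothetical and the record is a heavily trimmed stand-in for a real Scryfall entry. The `'NONE'` default mirrors the placeholder used in the cells below.

```python
def face_field(card, face, key, default="NONE"):
    """Look up `key` on one face of a multi-faced card, falling back to `default`."""
    return card["card_faces"][face].get(key, default)


# Hypothetical, heavily trimmed record in the shape of a transform card
delver = {
    "name": "Delver of Secrets // Insectile Aberration",
    "card_faces": [
        {"name": "Delver of Secrets", "power": "1", "toughness": "1"},
        {"name": "Insectile Aberration", "power": "3", "toughness": "2"},
    ],
}

# One row for the whole card: front values plus *_back values
row = {
    "name": delver["name"],
    "power": face_field(delver, 0, "power"),
    "power_back": face_field(delver, 1, "power"),
    "loyalty": face_field(delver, 0, "loyalty"),  # missing key -> 'NONE'
}
print(row)
```

Using `dict.get` with a default covers faces that lack a key (here, `loyalty`) without needing a `try`/`except` around each lookup.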

In [41]:
trans_df = df.loc[df['layout'] == 'transform']
trans_df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name
217,card,Ulvenwald Captive // Ulvenwald Abomination,https://scryfall.com/card/emn/175/ulvenwald-ca...,transform,,2.0,Creature — Werewolf Horror // Creature — Eldra...,,,[G],"[Transform, Defender]","{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.14', 'usd_foil': '0.49', 'eur': '0....",,,"[C, G]","[{'object': 'card_face', 'name': 'Ulvenwald Ca...",,,,
424,card,"Westvale Abbey // Ormendahl, Profane Prince",https://scryfall.com/card/soi/281/westvale-abb...,transform,,0.0,Land // Legendary Creature — Demon,,,[B],"[Flying, Lifelink, Indestructible, Transform, ...","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '6.23', 'usd_foil': '9.98', 'eur': '4....",,,[C],"[{'object': 'card_face', 'name': 'Westvale Abb...",,,,
681,card,Extricator of Sin // Extricator of Flesh,https://scryfall.com/card/emn/23/extricator-of...,transform,,3.0,Creature — Human Cleric // Creature — Eldrazi ...,,,[W],"[Delirium, Transform]","{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.09', 'usd_foil': '0.38', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Extricator o...",,,,
885,card,Treasure Map // Treasure Cove,https://scryfall.com/card/xln/250/treasure-map...,transform,,2.0,Artifact // Land,,,[],"[Transform, Scry]","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '1.86', 'usd_foil': '2.97', 'eur': '1....",,,"[B, C, G, R, U, W]","[{'object': 'card_face', 'name': 'Treasure Map...",,,,
1037,card,"Ulrich of the Krallenhorde // Ulrich, Uncontes...",https://scryfall.com/card/emn/191/ulrich-of-th...,transform,,5.0,Legendary Creature — Human Werewolf // Legenda...,,,"[G, R]","[Transform, Fight]","{'standard': 'not_legal', 'future': 'not_legal...",mythic,"{'usd': '0.87', 'usd_foil': '3.56', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Ulrich of th...",,,,


In [42]:
# set up empty lists to fill
# first set of lists are for the front half
mana_cost_list = []
oracle_text_list = []
colors_list = []
power_list = []
toughness_list = []
loyalty_list = []
card_type_list = []

# second set of lists are for the back half. We don't need mana cost for transformed sides because they are treated
# as the same as the front side
oracle_text_back_list = []
colors_back_list = []
power_back_list = []
toughness_back_list = []
loyalty_back_list = []
card_type_back_list = []

# iterate through our list of transform cards
for index in trans_df.index:
    
    # Front half of the cards
    mana_cost_list.append(trans_df.loc[index, 'card_faces'][0]['mana_cost'])
    oracle_text_list.append(trans_df.loc[index, 'card_faces'][0]['oracle_text'])
    colors_list.append(trans_df.loc[index, 'card_faces'][0]['colors'])
    # wrap these lookups in try/except because not all cards have power, toughness, or loyalty
    try:
        power_list.append(trans_df.loc[index, 'card_faces'][0]['power'])
    except KeyError:
        power_list.append('NONE')
    try:
        toughness_list.append(trans_df.loc[index, 'card_faces'][0]['toughness'])
    except KeyError:
        toughness_list.append('NONE')
    try:
        loyalty_list.append(trans_df.loc[index, 'card_faces'][0]['loyalty'])
    except KeyError:
        loyalty_list.append('NONE')
    card_type_list.append(trans_df.loc[index, 'card_faces'][0]['type_line'].split(' — ')[0])
    
    # Back half of the cards
    oracle_text_back_list.append(trans_df.loc[index, 'card_faces'][1]['oracle_text'])
    colors_back_list.append(trans_df.loc[index, 'card_faces'][1]['colors'])
    try:
        power_back_list.append(trans_df.loc[index, 'card_faces'][1]['power'])
    except KeyError:
        power_back_list.append('NONE')
    try:
        toughness_back_list.append(trans_df.loc[index, 'card_faces'][1]['toughness'])
    except KeyError:
        toughness_back_list.append('NONE')
    try:
        loyalty_back_list.append(trans_df.loc[index, 'card_faces'][1]['loyalty'])
    except KeyError:
        loyalty_back_list.append('NONE')
    card_type_back_list.append(trans_df.loc[index, 'card_faces'][1]['type_line'].split(' — ')[0])
    
# fill in our values for the front half
df.loc[trans_df.index, 'mana_cost'] = mana_cost_list
df.loc[trans_df.index, 'oracle_text'] = oracle_text_list
df.loc[trans_df.index, 'colors'] = colors_list
df.loc[trans_df.index, 'power'] = power_list
df.loc[trans_df.index, 'toughness'] = toughness_list
df.loc[trans_df.index, 'loyalty'] = loyalty_list
df.loc[trans_df.index, 'card_type'] = card_type_list

# fill in our values for the back half
df.loc[trans_df.index, 'oracle_text_back'] = oracle_text_back_list
df.loc[trans_df.index, 'colors_back'] = colors_back_list
df.loc[trans_df.index, 'power_back'] = power_back_list
df.loc[trans_df.index, 'toughness_back'] = toughness_back_list
df.loc[trans_df.index, 'loyalty_back'] = loyalty_back_list
df.loc[trans_df.index, 'card_type_back'] = card_type_back_list

In [43]:
df.loc[trans_df.index].head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back
217,card,Ulvenwald Captive // Ulvenwald Abomination,https://scryfall.com/card/emn/175/ulvenwald-ca...,transform,{1}{G},2.0,Creature — Werewolf Horror // Creature — Eldra...,Defender\n{T}: Add {G}.\n{5}{G}{G}: Transform ...,[G],[G],"[Transform, Defender]","{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.14', 'usd_foil': '0.49', 'eur': '0....",1,2,"[C, G]","[{'object': 'card_face', 'name': 'Ulvenwald Ca...",NONE,,,,Creature,{T}: Add {C}{C}.,[],4,6,NONE,Creature
424,card,"Westvale Abbey // Ormendahl, Profane Prince",https://scryfall.com/card/soi/281/westvale-abb...,transform,,0.0,Land // Legendary Creature — Demon,"{T}: Add {C}.\n{5}, {T}, Pay 1 life: Create a ...",[],[B],"[Flying, Lifelink, Indestructible, Transform, ...","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '6.23', 'usd_foil': '9.98', 'eur': '4....",NONE,NONE,[C],"[{'object': 'card_face', 'name': 'Westvale Abb...",NONE,,,,Land,"Flying, lifelink, indestructible, haste",[B],9,7,NONE,Legendary Creature
681,card,Extricator of Sin // Extricator of Flesh,https://scryfall.com/card/emn/23/extricator-of...,transform,{2}{W},3.0,Creature — Human Cleric // Creature — Eldrazi ...,"When Extricator of Sin enters the battlefield,...",[W],[W],"[Delirium, Transform]","{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.09', 'usd_foil': '0.38', 'eur': '0....",0,3,,"[{'object': 'card_face', 'name': 'Extricator o...",NONE,,,,Creature,"Eldrazi you control have vigilance.\n{2}, {T},...",[],3,5,NONE,Creature
885,card,Treasure Map // Treasure Cove,https://scryfall.com/card/xln/250/treasure-map...,transform,{2},2.0,Artifact // Land,"{1}, {T}: Scry 1. Put a landmark counter on Tr...",[],[],"[Transform, Scry]","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '1.86', 'usd_foil': '2.97', 'eur': '1....",NONE,NONE,"[B, C, G, R, U, W]","[{'object': 'card_face', 'name': 'Treasure Map...",NONE,,,,Artifact,(Transforms from Treasure Map.)\n{T}: Add {C}....,[],NONE,NONE,NONE,Land
1037,card,"Ulrich of the Krallenhorde // Ulrich, Uncontes...",https://scryfall.com/card/emn/191/ulrich-of-th...,transform,{3}{R}{G},5.0,Legendary Creature — Human Werewolf // Legenda...,Whenever this creature enters the battlefield ...,"[G, R]","[G, R]","[Transform, Fight]","{'standard': 'not_legal', 'future': 'not_legal...",mythic,"{'usd': '0.87', 'usd_foil': '3.56', 'eur': '0....",4,4,,"[{'object': 'card_face', 'name': 'Ulrich of th...",NONE,,,,Legendary Creature,"Whenever this creature transforms into Ulrich,...","[G, R]",6,6,NONE,Legendary Creature
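
The `card_type` and `card_type_back` columns above come from splitting each face's `type_line` on its dash separator and keeping the left-hand part. As a small sketch of that step, the hypothetical helper `split_type_line` below also returns the subtypes, on the assumption that each subtype is a single word.

```python
def split_type_line(type_line):
    """Split one face's type line into (card types, list of subtypes)."""
    # partition returns empty strings when the separator is absent (e.g. 'Land')
    main, _, sub = type_line.partition(" — ")
    return main.strip(), sub.split()


print(split_type_line("Legendary Creature — Human Werewolf"))
print(split_type_line("Land"))
```

A type line without a dash, such as `Land`, simply yields an empty subtype list.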


Next let's deal with flip cards

In [44]:
flip_df = df.loc[df['layout'] == 'flip'].copy()
flip_df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back
1545,card,Nezumi Graverobber // Nighteyes the Desecrator,https://scryfall.com/card/cm2/71/nezumi-graver...,flip,{1}{B},2.0,Creature — Rat Rogue // Legendary Creature — R...,,[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.65', 'usd_foil': None, 'eur': '0.21...",2,1,,"[{'object': 'card_face', 'name': 'Nezumi Grave...",,,,,,,,,,,
2744,card,"Faithful Squire // Kaiso, Memory of Loyalty",https://scryfall.com/card/bok/3/faithful-squir...,flip,{1}{W}{W},3.0,Creature — Human Soldier // Legendary Creature...,,[W],[W],[Flying],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.16', 'usd_foil': '0.36', 'eur': '0....",2,2,,"[{'object': 'card_face', 'name': 'Faithful Squ...",,,,,,,,,,,
3323,card,Jushi Apprentice // Tomoya the Revealer,https://scryfall.com/card/chk/70/jushi-apprent...,flip,{1}{U},2.0,Creature — Human Wizard // Legendary Creature ...,,[U],[U],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.98', 'usd_foil': '2.99', 'eur': '0....",1,2,,"[{'object': 'card_face', 'name': 'Jushi Appren...",,,,,,,,,,,
4755,card,"Cunning Bandit // Azamuki, Treachery Incarnate",https://scryfall.com/card/bok/99/cunning-bandi...,flip,{1}{R}{R},3.0,Creature — Human Warrior // Legendary Creature...,,[R],[R],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.17', 'usd_foil': '0.43', 'eur': '0....",2,2,,"[{'object': 'card_face', 'name': 'Cunning Band...",,,,,,,,,,,
5439,card,Nezumi Shortfang // Stabwhisker the Odious,https://scryfall.com/card/chk/131/nezumi-short...,flip,{1}{B},2.0,Creature — Rat Rogue // Legendary Creature — R...,,[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '3.43', 'usd_foil': '5.89', 'eur': '1....",1,1,,"[{'object': 'card_face', 'name': 'Nezumi Short...",,,,,,,,,,,


In [45]:
# set up empty lists to fill
# first set of lists are for the front half
oracle_text_list = []
power_list = []
toughness_list = []
card_type_list = []

# second set of lists is for the flipped half. We don't need a separate mana cost for the flipped side
# because it shares the mana cost of the front side
oracle_text_back_list = []
power_back_list = []
toughness_back_list = []
card_type_back_list = []

# iterate through our list of flip cards
for index in flip_df.index:
    faces = flip_df.loc[index, 'card_faces']
    
    # Front half of the cards
    oracle_text_list.append(faces[0]['oracle_text'])
    # not all card faces have power or toughness, so fall back to 'NONE' when the key is missing
    power_list.append(faces[0].get('power', 'NONE'))
    toughness_list.append(faces[0].get('toughness', 'NONE'))
    card_type_list.append(faces[0]['type_line'].split(' — ')[0])
    
    # Back half of the cards
    oracle_text_back_list.append(faces[1]['oracle_text'])
    power_back_list.append(faces[1].get('power', 'NONE'))
    toughness_back_list.append(faces[1].get('toughness', 'NONE'))
    card_type_back_list.append(faces[1]['type_line'].split(' — ')[0])
    
# fill in our values for the front half
df.loc[flip_df.index, 'oracle_text'] = oracle_text_list
df.loc[flip_df.index, 'power'] = power_list
df.loc[flip_df.index, 'toughness'] = toughness_list
df.loc[flip_df.index, 'card_type'] = card_type_list

# fill in our values for the back half
df.loc[flip_df.index, 'oracle_text_back'] = oracle_text_back_list
df.loc[flip_df.index, 'power_back'] = power_back_list
df.loc[flip_df.index, 'toughness_back'] = toughness_back_list
df.loc[flip_df.index, 'card_type_back'] = card_type_back_list

In [46]:
df.loc[flip_df.index].head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back
1545,card,Nezumi Graverobber // Nighteyes the Desecrator,https://scryfall.com/card/cm2/71/nezumi-graver...,flip,{1}{B},2.0,Creature — Rat Rogue // Legendary Creature — R...,{1}{B}: Exile target card from an opponent's g...,[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.65', 'usd_foil': None, 'eur': '0.21...",2,1,,"[{'object': 'card_face', 'name': 'Nezumi Grave...",,,,,Creature,{4}{B}: Put target creature card from a gravey...,,4,2,,Legendary Creature
2744,card,"Faithful Squire // Kaiso, Memory of Loyalty",https://scryfall.com/card/bok/3/faithful-squir...,flip,{1}{W}{W},3.0,Creature — Human Soldier // Legendary Creature...,"Whenever you cast a Spirit or Arcane spell, yo...",[W],[W],[Flying],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.16', 'usd_foil': '0.36', 'eur': '0....",2,2,,"[{'object': 'card_face', 'name': 'Faithful Squ...",,,,,Creature,"Flying\nRemove a ki counter from Kaiso, Memory...",,3,4,,Legendary Creature
3323,card,Jushi Apprentice // Tomoya the Revealer,https://scryfall.com/card/chk/70/jushi-apprent...,flip,{1}{U},2.0,Creature — Human Wizard // Legendary Creature ...,"{2}{U}, {T}: Draw a card. If you have nine or ...",[U],[U],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.98', 'usd_foil': '2.99', 'eur': '0....",1,2,,"[{'object': 'card_face', 'name': 'Jushi Appren...",,,,,Creature,"{3}{U}{U}, {T}: Target player draws X cards, w...",,2,3,,Legendary Creature
4755,card,"Cunning Bandit // Azamuki, Treachery Incarnate",https://scryfall.com/card/bok/99/cunning-bandi...,flip,{1}{R}{R},3.0,Creature — Human Warrior // Legendary Creature...,"Whenever you cast a Spirit or Arcane spell, yo...",[R],[R],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.17', 'usd_foil': '0.43', 'eur': '0....",2,2,,"[{'object': 'card_face', 'name': 'Cunning Band...",,,,,Creature,"Remove a ki counter from Azamuki, Treachery In...",,5,2,,Legendary Creature
5439,card,Nezumi Shortfang // Stabwhisker the Odious,https://scryfall.com/card/chk/131/nezumi-short...,flip,{1}{B},2.0,Creature — Rat Rogue // Legendary Creature — R...,"{1}{B}, {T}: Target opponent discards a card. ...",[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '3.43', 'usd_foil': '5.89', 'eur': '1....",1,1,,"[{'object': 'card_face', 'name': 'Nezumi Short...",,,,,Creature,"At the beginning of each opponent's upkeep, th...",,3,3,,Legendary Creature
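
The face-by-face extraction above follows one pattern: index into `card_faces`, read a key, and fall back to `'NONE'` when the key is absent. As a minimal sketch of that pattern, the helper below wraps it in one function. The name `face_field` and the toy `faces` list are hypothetical and only stand in for a single row's `card_faces` value.

```python
def face_field(faces, position, key, default='NONE'):
    """Return faces[position][key], or `default` when that face lacks the key."""
    return faces[position].get(key, default)

# Toy stand-in for one row's `card_faces` value (hypothetical, abbreviated data)
faces = [
    {'name': 'Front Face', 'type_line': 'Creature — Rat Rogue',
     'oracle_text': 'front text', 'power': '2', 'toughness': '1'},
    {'name': 'Back Face', 'type_line': 'Legendary Creature — Rat Rogue',
     'oracle_text': 'back text'},  # no power/toughness on purpose
]

front_power = face_field(faces, 0, 'power')
back_power = face_field(faces, 1, 'power')
# the card type is everything before the ' — ' separator in the type line
back_type = face_field(faces, 1, 'type_line').split(' — ')[0]

print(front_power, back_power, back_type)
```

Using `dict.get` with a default keeps the fallback narrow: only a missing key produces `'NONE'`, whereas a bare `except` would also hide unrelated errors such as a face list with fewer than two entries.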


Next, let's deal with the adventure cards.  
Adventure cards differ from flip and transform cards in that you can play either half of the card without the other. For cleaning purposes we'll treat them much the same way, except that each half has its own mana cost, so we also extract a `mana_cost_back`.

In [47]:
adv_df = df.loc[df['layout'] == 'adventure'].copy()
adv_df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back
427,card,Faerie Guidemother // Gift of the Fae,https://scryfall.com/card/eld/11/faerie-guidem...,adventure,{W} // {1}{W},1.0,Creature — Faerie // Sorcery — Adventure,,[W],[W],[Flying],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.07', 'usd_foil': '0.18', 'eur': '0....",1,1,,"[{'object': 'card_face', 'name': 'Faerie Guide...",,,,,,,,,,,
712,card,Tuinvale Treefolk // Oaken Boon,https://scryfall.com/card/eld/180/tuinvale-tre...,adventure,{5}{G} // {3}{G},6.0,Creature — Treefolk Druid // Sorcery — Adventure,,[G],[G],[],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.03', 'usd_foil': '0.07', 'eur': '0....",6,5,,"[{'object': 'card_face', 'name': 'Tuinvale Tre...",,,,,,,,,,,
1306,card,Murderous Rider // Swift End,https://scryfall.com/card/eld/97/murderous-rid...,adventure,{1}{B}{B} // {1}{B}{B},3.0,Creature — Zombie Knight // Instant — Adventure,,[B],[B],[Lifelink],"{'standard': 'legal', 'future': 'legal', 'hist...",rare,"{'usd': '1.66', 'usd_foil': '2.33', 'eur': '1....",2,3,,"[{'object': 'card_face', 'name': 'Murderous Ri...",,,,,,,,,,,
1335,card,Foulmire Knight // Profane Insight,https://scryfall.com/card/eld/90/foulmire-knig...,adventure,{B} // {2}{B},1.0,Creature — Zombie Knight // Instant — Adventure,,[B],[B],[Deathtouch],"{'standard': 'legal', 'future': 'legal', 'hist...",uncommon,"{'usd': '0.10', 'usd_foil': '0.38', 'eur': '0....",1,1,,"[{'object': 'card_face', 'name': 'Foulmire Kni...",,,,,,,,,,,
1397,card,Smitten Swordmaster // Curry Favor,https://scryfall.com/card/eld/105/smitten-swor...,adventure,{1}{B} // {B},2.0,Creature — Human Knight // Sorcery — Adventure,,[B],[B],[Lifelink],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.06', 'usd_foil': '0.18', 'eur': '0....",2,1,,"[{'object': 'card_face', 'name': 'Smitten Swor...",,,,,,,,,,,


In [48]:
# set up empty lists to fill
# first set of lists are for the creature half
mana_cost_list = []
oracle_text_list = []
power_list = []
toughness_list = []
card_type_list = []

# second set of lists are for the adventure half
oracle_text_back_list = []
card_type_back_list = []
mana_cost_back_list = []

# iterate through our list of adventure cards
for index in adv_df.index:
    
    # creature half of the cards
    mana_cost_list.append(adv_df.loc[index, 'card_faces'][0]['mana_cost'])
    oracle_text_list.append(adv_df.loc[index, 'card_faces'][0]['oracle_text'])
    power_list.append(adv_df.loc[index, 'card_faces'][0]['power'])
    toughness_list.append(adv_df.loc[index, 'card_faces'][0]['toughness'])
    card_type_list.append(adv_df.loc[index, 'card_faces'][0]['type_line'].split(' — ')[0])
    
    # adventure half of the cards
    mana_cost_back_list.append(adv_df.loc[index, 'card_faces'][1]['mana_cost'])
    oracle_text_back_list.append(adv_df.loc[index, 'card_faces'][1]['oracle_text'])
    card_type_back_list.append(adv_df.loc[index, 'card_faces'][1]['type_line'].split(' — ')[0])
    
# fill in our values for the creature half
df.loc[adv_df.index, 'mana_cost'] = mana_cost_list
df.loc[adv_df.index, 'oracle_text'] = oracle_text_list
df.loc[adv_df.index, 'power'] = power_list
df.loc[adv_df.index, 'toughness'] = toughness_list
df.loc[adv_df.index, 'card_type'] = card_type_list

# fill in our values for the adventure half
df.loc[adv_df.index, 'mana_cost_back'] = mana_cost_back_list
df.loc[adv_df.index, 'oracle_text_back'] = oracle_text_back_list
df.loc[adv_df.index, 'card_type_back'] = card_type_back_list

In [49]:
df.loc[adv_df.index].head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back
427,card,Faerie Guidemother // Gift of the Fae,https://scryfall.com/card/eld/11/faerie-guidem...,adventure,{W},1.0,Creature — Faerie // Sorcery — Adventure,Flying,[W],[W],[Flying],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.07', 'usd_foil': '0.18', 'eur': '0....",1,1,,"[{'object': 'card_face', 'name': 'Faerie Guide...",,,,,Creature,Target creature gets +2/+1 and gains flying un...,,,,,Sorcery,{1}{W}
712,card,Tuinvale Treefolk // Oaken Boon,https://scryfall.com/card/eld/180/tuinvale-tre...,adventure,{5}{G},6.0,Creature — Treefolk Druid // Sorcery — Adventure,,[G],[G],[],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.03', 'usd_foil': '0.07', 'eur': '0....",6,5,,"[{'object': 'card_face', 'name': 'Tuinvale Tre...",,,,,Creature,Put two +1/+1 counters on target creature. (Th...,,,,,Sorcery,{3}{G}
1306,card,Murderous Rider // Swift End,https://scryfall.com/card/eld/97/murderous-rid...,adventure,{1}{B}{B},3.0,Creature — Zombie Knight // Instant — Adventure,"Lifelink\nWhen Murderous Rider dies, put it on...",[B],[B],[Lifelink],"{'standard': 'legal', 'future': 'legal', 'hist...",rare,"{'usd': '1.66', 'usd_foil': '2.33', 'eur': '1....",2,3,,"[{'object': 'card_face', 'name': 'Murderous Ri...",,,,,Creature,Destroy target creature or planeswalker. You l...,,,,,Instant,{1}{B}{B}
1335,card,Foulmire Knight // Profane Insight,https://scryfall.com/card/eld/90/foulmire-knig...,adventure,{B},1.0,Creature — Zombie Knight // Instant — Adventure,Deathtouch,[B],[B],[Deathtouch],"{'standard': 'legal', 'future': 'legal', 'hist...",uncommon,"{'usd': '0.10', 'usd_foil': '0.38', 'eur': '0....",1,1,,"[{'object': 'card_face', 'name': 'Foulmire Kni...",,,,,Creature,You draw a card and you lose 1 life. (Then exi...,,,,,Instant,{2}{B}
1397,card,Smitten Swordmaster // Curry Favor,https://scryfall.com/card/eld/105/smitten-swor...,adventure,{1}{B},2.0,Creature — Human Knight // Sorcery — Adventure,Lifelink,[B],[B],[Lifelink],"{'standard': 'legal', 'future': 'legal', 'hist...",common,"{'usd': '0.06', 'usd_foil': '0.18', 'eur': '0....",2,1,,"[{'object': 'card_face', 'name': 'Smitten Swor...",,,,,Creature,You gain X life and each opponent loses X life...,,,,,Sorcery,{B}
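
An alternative to looping over `card_faces` is to flatten the nested list with `pd.json_normalize`, which yields one row per face. The sketch below is only an illustration on hypothetical, abbreviated records shaped like the Scryfall card objects; it is not the approach used in this notebook. The `record_prefix` argument keeps the face-level `name` from colliding with the card-level `name`.

```python
import pandas as pd

# Hypothetical, abbreviated records shaped like Scryfall card objects
cards = [
    {
        'name': 'Creature Half // Adventure Half',
        'layout': 'adventure',
        'card_faces': [
            {'name': 'Creature Half', 'mana_cost': '{W}',
             'type_line': 'Creature — Faerie', 'power': '1', 'toughness': '1'},
            {'name': 'Adventure Half', 'mana_cost': '{1}{W}',
             'type_line': 'Sorcery — Adventure'},
        ],
    },
]

# one row per face; card-level fields are repeated on each row via `meta`
faces = pd.json_normalize(
    cards,
    record_path='card_faces',
    meta=['name', 'layout'],
    record_prefix='face_',
)

print(faces[['name', 'face_name', 'face_mana_cost', 'face_power']])
```

Keys that a face lacks (here the adventure half has no power) come back as NaN, so they would still need the same `'NONE'` imputation used elsewhere in this notebook.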


Finally, we'll deal with the split cards

In [50]:
split_df = df.loc[df['layout'] == 'split'].copy()
split_df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back
326,card,Heaven // Earth,https://scryfall.com/card/akr/M-H2E/heaven-ear...,split,{X}{G} // {X}{R}{R},3.0,Instant // Sorcery,,"[G, R]","[G, R]",[Aftermath],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': None, 'usd_foil': None, 'eur': None, '...",,,,"[{'object': 'card_face', 'name': 'Heaven', 'ma...",,,,,,,,,,,,
356,card,Breaking // Entering,https://scryfall.com/card/dgm/124/breaking-ent...,split,{U}{B} // {4}{B}{R},8.0,Sorcery // Sorcery,,"[B, R, U]","[B, R, U]","[Mill, Fuse]","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.45', 'usd_foil': '0.74', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Breaking', '...",,,,,,,,,,,,
595,card,Flesh // Blood,https://scryfall.com/card/dgm/128/flesh-blood?...,split,{3}{B}{G} // {R}{G},7.0,Sorcery // Sorcery,,"[B, G, R]","[B, G, R]",[Fuse],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.20', 'usd_foil': '0.41', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Flesh', 'man...",,,,,,,,,,,,
658,card,Assure // Assemble,https://scryfall.com/card/grn/221/assure-assem...,split,{G/W}{G/W} // {4}{G}{W},8.0,Instant // Instant,,"[G, W]","[G, W]",[],"{'standard': 'legal', 'future': 'legal', 'hist...",rare,"{'usd': '0.21', 'usd_foil': '0.41', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Assure', 'ma...",,,,,,,,,,,,
871,card,Struggle // Survive,https://scryfall.com/card/akr/M-S2S/struggle-s...,split,{2}{R} // {1}{G},5.0,Instant // Sorcery,,"[G, R]","[G, R]",[Aftermath],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': None, 'usd_foil': None, 'eur': None, '...",,,,"[{'object': 'card_face', 'name': 'Struggle', '...",,,,,,,,,,,,


In [51]:
# set up empty lists to fill
# first set of lists are for the front half
mana_cost_list = []
oracle_text_list = []
card_type_list = []

# second set of lists are for the back half.
mana_cost_back_list = []
oracle_text_back_list = []
card_type_back_list = []

# iterate through our list of split cards
for index in split_df.index:
    
    # Front half of the cards
    mana_cost_list.append(split_df.loc[index, 'card_faces'][0]['mana_cost'])
    oracle_text_list.append(split_df.loc[index, 'card_faces'][0]['oracle_text'])
    card_type_list.append(split_df.loc[index, 'card_faces'][0]['type_line'].split(' — ')[0])
    
    # Back half of the cards
    mana_cost_back_list.append(split_df.loc[index, 'card_faces'][1]['mana_cost'])
    oracle_text_back_list.append(split_df.loc[index, 'card_faces'][1]['oracle_text'])
    card_type_back_list.append(split_df.loc[index, 'card_faces'][1]['type_line'].split(' — ')[0])
    
# fill in our values for the front half
df.loc[split_df.index, 'mana_cost'] = mana_cost_list
df.loc[split_df.index, 'oracle_text'] = oracle_text_list
df.loc[split_df.index, 'card_type'] = card_type_list

# fill in our values for the back half
df.loc[split_df.index, 'mana_cost_back'] = mana_cost_back_list
df.loc[split_df.index, 'oracle_text_back'] = oracle_text_back_list
df.loc[split_df.index, 'card_type_back'] = card_type_back_list

In [52]:
df.loc[split_df.index].head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back
326,card,Heaven // Earth,https://scryfall.com/card/akr/M-H2E/heaven-ear...,split,{X}{G},3.0,Instant // Sorcery,Heaven deals X damage to each creature with fl...,"[G, R]","[G, R]",[Aftermath],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': None, 'usd_foil': None, 'eur': None, '...",,,,"[{'object': 'card_face', 'name': 'Heaven', 'ma...",,,,,Instant,Aftermath (Cast this spell only from your grav...,,,,,Sorcery,{X}{R}{R}
356,card,Breaking // Entering,https://scryfall.com/card/dgm/124/breaking-ent...,split,{U}{B},8.0,Sorcery // Sorcery,Target player mills eight cards.\nFuse (You ma...,"[B, R, U]","[B, R, U]","[Mill, Fuse]","{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.45', 'usd_foil': '0.74', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Breaking', '...",,,,,Sorcery,Put a creature card from a graveyard onto the ...,,,,,Sorcery,{4}{B}{R}
595,card,Flesh // Blood,https://scryfall.com/card/dgm/128/flesh-blood?...,split,{3}{B}{G},7.0,Sorcery // Sorcery,Exile target creature card from a graveyard. P...,"[B, G, R]","[B, G, R]",[Fuse],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.20', 'usd_foil': '0.41', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Flesh', 'man...",,,,,Sorcery,Target creature you control deals damage equal...,,,,,Sorcery,{R}{G}
658,card,Assure // Assemble,https://scryfall.com/card/grn/221/assure-assem...,split,{G/W}{G/W},8.0,Instant // Instant,Put a +1/+1 counter on target creature. That c...,"[G, W]","[G, W]",[],"{'standard': 'legal', 'future': 'legal', 'hist...",rare,"{'usd': '0.21', 'usd_foil': '0.41', 'eur': '0....",,,,"[{'object': 'card_face', 'name': 'Assure', 'ma...",,,,,Instant,Create three 2/2 green and white Elf Knight cr...,,,,,Instant,{4}{G}{W}
871,card,Struggle // Survive,https://scryfall.com/card/akr/M-S2S/struggle-s...,split,{2}{R},5.0,Instant // Sorcery,Struggle deals damage to target creature equal...,"[G, R]","[G, R]",[Aftermath],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': None, 'usd_foil': None, 'eur': None, '...",,,,"[{'object': 'card_face', 'name': 'Struggle', '...",,,,,Instant,Aftermath (Cast this spell only from your grav...,,,,,Sorcery,{1}{G}


In [53]:
df.isnull().sum()

object                  0
name                    0
scryfall_uri            0
layout                  0
mana_cost               0
cmc                     0
type_line               0
oracle_text             0
colors                  0
color_identity          0
keywords                0
legalities              0
rarity                  0
prices                  0
power                9200
toughness            9200
produced_mana       18737
card_faces          19816
loyalty             19744
printed_name        20052
flavor_name         20052
card_type           19816
oracle_text_back    19816
colors_back         19949
power_back          19929
toughness_back      19929
loyalty_back        19949
card_type_back      19816
mana_cost_back      19940
dtype: int64

___
Let's get the card types from the type line

In [54]:
missing_card_type_index = df.loc[df['card_type'].isnull()].index

card_type_list = []

for index in missing_card_type_index:
    card_type_list.append(df.loc[index, 'type_line'].split(' — ')[0])
    
df.loc[missing_card_type_index, 'card_type'] = card_type_list

In [55]:
df['card_type'].value_counts()

Creature                          9099
Instant                           2453
Enchantment                       2262
Sorcery                           2188
Artifact                          1265
Legendary Creature                 911
Land                               600
Artifact Creature                  593
Legendary Planeswalker             207
Enchantment Creature               105
Legendary Artifact                  68
Legendary Land                      41
Legendary Enchantment               38
Snow Creature                       37
Legendary Enchantment Creature      30
World Enchantment                   26
Tribal Instant                      20
Tribal Sorcery                      16
Tribal Enchantment                  13
Legendary Artifact Creature         13
Snow Land                            8
Snow Enchantment                     6
Basic Land                           6
Legendary Sorcery                    6
Artifact Land                        6
Tribal Artifact          
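
The same type-line split can be written without an explicit loop by using pandas string methods. This is a sketch on a toy frame (the `toy` DataFrame and its rows are made up for illustration), assuming the type line always separates types from subtypes with `' — '`.

```python
import pandas as pd

toy = pd.DataFrame({
    'type_line': ['Creature — Bird', 'Legendary Land', 'Enchantment — Aura'],
    'card_type': [None, None, 'Enchantment'],
})

# only fill rows where card_type is still missing
missing = toy['card_type'].isnull()
toy.loc[missing, 'card_type'] = (
    toy.loc[missing, 'type_line'].str.split(' — ').str[0]
)

print(toy)
```

A type line with no subtype, such as `Legendary Land`, has nothing to split on and is returned unchanged.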

In [56]:
df.isnull().sum()

object                  0
name                    0
scryfall_uri            0
layout                  0
mana_cost               0
cmc                     0
type_line               0
oracle_text             0
colors                  0
color_identity          0
keywords                0
legalities              0
rarity                  0
prices                  0
power                9200
toughness            9200
produced_mana       18737
card_faces          19816
loyalty             19744
printed_name        20052
flavor_name         20052
card_type               0
oracle_text_back    19816
colors_back         19949
power_back          19929
toughness_back      19929
loyalty_back        19949
card_type_back      19816
mana_cost_back      19940
dtype: int64

In [57]:
df.loc[df['power'].isnull()]['card_type'].value_counts()

Instant                           2453
Enchantment                       2260
Sorcery                           2187
Artifact                          1232
Land                               599
Legendary Planeswalker             205
Legendary Artifact                  62
Legendary Land                      41
Legendary Enchantment               28
World Enchantment                   26
Tribal Instant                      20
Tribal Sorcery                      16
Tribal Enchantment                  13
Snow Land                            8
Snow Enchantment                     6
Artifact Land                        6
Basic Land                           6
Legendary Sorcery                    6
Basic Snow Land                      5
Legendary Enchantment Artifact       5
Tribal Artifact                      5
Hero Artifact                        5
Hero                                 2
Snow Artifact                        2
Legendary Snow Land                  1
Legendary Snow Enchantmen

In [58]:
# The rest of the Nulls in the power and toughness columns are non-creatures so they don't have a power or
# toughness. We will impute those nulls as 'NONE'
df['power'] = df['power'].fillna('NONE')
df['toughness'] = df['toughness'].fillna('NONE')
df.isnull().sum()

object                  0
name                    0
scryfall_uri            0
layout                  0
mana_cost               0
cmc                     0
type_line               0
oracle_text             0
colors                  0
color_identity          0
keywords                0
legalities              0
rarity                  0
prices                  0
power                   0
toughness               0
produced_mana       18737
card_faces          19816
loyalty             19744
printed_name        20052
flavor_name         20052
card_type               0
oracle_text_back    19816
colors_back         19949
power_back          19929
toughness_back      19929
loyalty_back        19949
card_type_back      19816
mana_cost_back      19940
dtype: int64
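
One consequence of filling with the string `'NONE'` is that `power` and `toughness` stay as text columns. If a numeric version is needed later (for example as a model feature), `pd.to_numeric` with `errors='coerce'` is one option. The sketch below uses a toy frame and a hypothetical `power_num` column; it assumes power values are stored as strings and that some are non-numeric, such as `'*'`.

```python
import pandas as pd

toy = pd.DataFrame({'power': ['2', 'NONE', '*', '10']})

# anything that is not a number (including the 'NONE' sentinel) becomes NaN
toy['power_num'] = pd.to_numeric(toy['power'], errors='coerce')

print(toy)
```

Keeping the original text column alongside the numeric one preserves the distinction between "no power at all" and "variable power", which the coerced column alone cannot express.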

In [59]:
df.loc[df['card_faces'].isnull()]['layout'].value_counts()

normal     19757
saga          25
leveler       25
meld           9
Name: layout, dtype: int64

In [60]:
# the rest of the Nulls for card_faces are for non-dual-cards, so let's impute those Nulls as 'NONE'
df['card_faces'] = df['card_faces'].fillna('NONE')

In [61]:
df.loc[df['loyalty'].isnull()]['card_type'].value_counts()

Creature                          9032
Instant                           2453
Enchantment                       2260
Sorcery                           2187
Artifact                          1257
Legendary Creature                 903
Land                               599
Artifact Creature                  590
Enchantment Creature               105
Legendary Artifact                  66
Legendary Land                      41
Snow Creature                       37
Legendary Enchantment Creature      30
Legendary Enchantment               28
World Enchantment                   26
Tribal Instant                      20
Tribal Sorcery                      16
Tribal Enchantment                  13
Legendary Artifact Creature         13
Snow Land                            8
Snow Enchantment                     6
Basic Land                           6
Legendary Sorcery                    6
Artifact Land                        6
Tribal Artifact                      5
Basic Snow Land          

In [62]:
# the Nulls for loyalty are for non-planeswalker cards, so fill those Nulls as 'NONE'
df['loyalty'] = df['loyalty'].fillna('NONE')

In [63]:
# The remaining Nulls are mostly back-half columns for single-faced cards (plus sparse columns such as
# produced_mana, printed_name and flavor_name), so let's impute them all as 'NONE'
df = df.fillna('NONE')
df.isnull().sum()

object              0
name                0
scryfall_uri        0
layout              0
mana_cost           0
cmc                 0
type_line           0
oracle_text         0
colors              0
color_identity      0
keywords            0
legalities          0
rarity              0
prices              0
power               0
toughness           0
produced_mana       0
card_faces          0
loyalty             0
printed_name        0
flavor_name         0
card_type           0
oracle_text_back    0
colors_back         0
power_back          0
toughness_back      0
loyalty_back        0
card_type_back      0
mana_cost_back      0
dtype: int64

In [64]:
df.loc[(df['mana_cost'] == "")]['card_type'].value_counts()

Land                   600
Legendary Land          41
Snow Land                8
Artifact Land            6
Sorcery                  6
Basic Land               6
Basic Snow Land          5
Hero Artifact            5
Hero                     2
Legendary Creature       2
Artifact                 2
Land Creature            1
Instant                  1
Legendary Snow Land      1
Creature                 1
Name: card_type, dtype: int64

In [65]:
# these blank mana costs should be the string 'NONE' so they are not mistaken for missing values later on
# according to the rules of Magic, having no mana cost is different from having a mana cost of {0}
no_mana_cost = df.loc[(df['mana_cost'] == "")].index
df.loc[no_mana_cost, 'mana_cost'] = 'NONE'

In [66]:
df.loc[no_mana_cost].head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back
35,card,Savai Triome,https://scryfall.com/card/iko/253/savai-triome...,normal,NONE,0.0,Land — Mountain Plains Swamp,"({T}: Add {R}, {W}, or {B}.)\nSavai Triome ent...",[],"[B, R, W]",[Cycling],"{'standard': 'legal', 'future': 'legal', 'hist...",rare,"{'usd': '4.47', 'usd_foil': '4.92', 'eur': '4....",NONE,NONE,"[B, R, W]",NONE,NONE,NONE,NONE,NONE,Land,NONE,NONE,NONE,NONE,NONE,NONE,NONE
52,card,"Shizo, Death's Storehouse",https://scryfall.com/card/chk/283/shizo-deaths...,normal,NONE,0.0,Legendary Land,"{T}: Add {B}.\n{B}, {T}: Target legendary crea...",[],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '12.22', 'usd_foil': '45.32', 'eur': '...",NONE,NONE,[B],NONE,NONE,NONE,NONE,NONE,Legendary Land,NONE,NONE,NONE,NONE,NONE,NONE,NONE
65,card,Timber Gorge,https://scryfall.com/card/m19/258/timber-gorge...,normal,NONE,0.0,Land,Timber Gorge enters the battlefield tapped.\n{...,[],"[G, R]",[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.11', 'usd_foil': '0.44', 'eur': '0....",NONE,NONE,"[G, R]",NONE,NONE,NONE,NONE,NONE,Land,NONE,NONE,NONE,NONE,NONE,NONE,NONE
72,card,Game Trail,https://scryfall.com/card/soi/276/game-trail?u...,normal,NONE,0.0,Land,"As Game Trail enters the battlefield, you may ...",[],"[G, R]",[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '2.32', 'usd_foil': '3.79', 'eur': '1....",NONE,NONE,"[G, R]",NONE,NONE,NONE,NONE,NONE,Land,NONE,NONE,NONE,NONE,NONE,NONE,NONE
79,card,Selesnya Sanctuary,https://scryfall.com/card/c20/308/selesnya-san...,normal,NONE,0.0,Land,Selesnya Sanctuary enters the battlefield tapp...,[],"[G, W]",[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.19', 'usd_foil': None, 'eur': '0.17...",NONE,NONE,"[G, W]",NONE,NONE,NONE,NONE,NONE,Land,NONE,NONE,NONE,NONE,NONE,NONE,NONE
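
The distinction between no mana cost and a cost of `{0}` is easy to see once the cost string is broken into symbols. The sketch below uses the `re` module to pull out whatever sits between braces; the helper name `mana_symbols` is hypothetical, and the example only assumes the brace-delimited format visible in the `mana_cost` column above.

```python
import re

# each mana symbol is whatever sits between one pair of braces
MANA_SYMBOL = re.compile(r'\{([^}]+)\}')

def mana_symbols(mana_cost):
    """Split a mana cost string such as '{3}{R}{G}' into its symbols."""
    return MANA_SYMBOL.findall(mana_cost)

print(mana_symbols('{3}{R}{G}'))   # ['3', 'R', 'G']
print(mana_symbols('{G/W}{G/W}'))  # ['G/W', 'G/W']
print(mana_symbols('{0}'))         # ['0']
print(mana_symbols('NONE'))        # []
```

A card with a cost of `{0}` yields one symbol, while the `'NONE'` sentinel yields an empty list, so the two cases stay distinguishable downstream.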


In [67]:
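# Any leftover blank oracle text entries are for vanilla creatures (meaning they have no abilities)
blank_front = df['oracle_text'] == ""
blank_back = df['oracle_text_back'] == ""
vanilla_creatures = df.loc[blank_front | blank_back].index

# fill each column separately so a blank front does not overwrite real text on the back (and vice versa)
df.loc[blank_front, 'oracle_text'] = 'NONE'
df.loc[blank_back, 'oracle_text_back'] = 'NONE'

df.loc[vanilla_creatures].head()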

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back
29,card,Leopard-Spotted Jiao,https://scryfall.com/card/gs1/23/leopard-spott...,normal,{1}{R},2.0,Creature — Beast,NONE,[R],[R],[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.11', 'usd_foil': None, 'eur': '0.03...",3,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE
44,card,Dakmor Scorpion,https://scryfall.com/card/s99/73/dakmor-scorpi...,normal,{1}{B},2.0,Creature — Scorpion,NONE,[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': None, 'usd_foil': None, 'eur': '0.22',...",2,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE
57,card,Scaled Wurm,https://scryfall.com/card/cns/178/scaled-wurm?...,normal,{7}{G},8.0,Creature — Wurm,NONE,[G],[G],[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.07', 'usd_foil': '0.22', 'eur': '0....",7,6,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE
116,card,Oreskos Swiftclaw,https://scryfall.com/card/m19/31/oreskos-swift...,normal,{1}{W},2.0,Creature — Cat Warrior,NONE,[W],[W],[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.03', 'usd_foil': '0.21', 'eur': '0....",3,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE
124,card,Minotaur Abomination,https://scryfall.com/card/m14/107/minotaur-abo...,normal,{4}{B}{B},6.0,Creature — Zombie Minotaur,NONE,[B],[B],[],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.11', 'usd_foil': '0.24', 'eur': '0....",4,6,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE


Now I'd like to break out the super-type of each card, because under the rules of the game super-types are not considered card types. Super-types include Basic, Legendary, Snow, and World.
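
As a small illustration of the idea on toy data, the sketch below pulls the super-type words out of a `card_type` string with a per-value function and `Series.apply`. The function name `super_type_of` and the toy frame are hypothetical; the list of super-types is the one named above.

```python
import pandas as pd

SUPER_TYPES = ['Legendary', 'Snow', 'World', 'Basic']

def super_type_of(card_type):
    """Return the super-type words in a card type, or 'NONE' if there are none."""
    found = [word for word in card_type.split() if word in SUPER_TYPES]
    return ' '.join(found) if found else 'NONE'

toy = pd.DataFrame({
    'card_type': ['Legendary Snow Land', 'Artifact Creature', 'Basic Land', 'NONE'],
})
toy['super_type'] = toy['card_type'].apply(super_type_of)

print(toy)
```

Applying a function to the whole column avoids writing to the frame one cell at a time, which tends to be slow on a table of this size.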

In [68]:
super_types = ['Legendary', 'Snow', 'World', 'Basic']

for card in df.index:
    
    card_super_type_list = []
    card_super_type_list_back = []

    for word in df.loc[card, 'card_type'].split():

        if word in super_types:
            card_super_type_list.append(word)
    
    for word in df.loc[card, 'card_type_back'].split():

        if word in super_types:
            card_super_type_list_back.append(word)
            
    
    if card_super_type_list == []:
        card_super_type_list = ['NONE']
        
    if card_super_type_list_back == []:
        card_super_type_list_back = ['NONE']
        
    df.loc[card, 'super_type'] = " ".join(card_super_type_list)
    df.loc[card, 'super_type_back'] = " ".join(card_super_type_list_back)
           
df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back,super_type,super_type_back
0,card,Static Orb,https://scryfall.com/card/7ed/319/static-orb?u...,normal,{3},3.0,Artifact,"As long as Static Orb is untapped, players can...",[],[],[],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '16.03', 'usd_foil': '84.99', 'eur': '...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Artifact,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
1,card,Sensory Deprivation,https://scryfall.com/card/m14/71/sensory-depri...,normal,{U},1.0,Enchantment — Aura,Enchant creature\nEnchanted creature gets -3/-0.,[U],[U],[Enchant],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Enchantment,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
2,card,Road of Return,https://scryfall.com/card/c19/34/road-of-retur...,normal,{G}{G},2.0,Sorcery,Choose one —\n• Return target permanent card f...,[G],[G],[Entwine],"{'standard': 'not_legal', 'future': 'not_legal...",rare,"{'usd': '0.43', 'usd_foil': None, 'eur': '0.77...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Sorcery,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
3,card,Storm Crow,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,{1}{U},2.0,Creature — Bird,Flying (This creature can't be blocked except ...,[U],[U],[Flying],"{'standard': 'not_legal', 'future': 'not_legal...",common,"{'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....",1,2,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
4,card,Walking Sponge,https://scryfall.com/card/ulg/47/walking-spong...,normal,{1}{U},2.0,Creature — Sponge,{T}: Target creature loses your choice of flyi...,[U],[U],[],"{'standard': 'not_legal', 'future': 'not_legal...",uncommon,"{'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....",1,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE


In [69]:
# now I'd like to remove the super_type from the card_type for each card
for index in df.loc[df['super_type'] != 'NONE'].index:
    df.loc[index, 'card_type'] = df.loc[index, 'card_type'].replace(df.loc[index, 'super_type'], '').strip()
    
# back half
for index in df.loc[df['super_type_back'] != 'NONE'].index:
    df.loc[index, 'card_type_back'] = df.loc[index, 'card_type_back'].replace(df.loc[index, 'super_type_back'], '').strip()
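
The two steps above (collecting the supertypes, then stripping them from `card_type`) can also be written as one helper that is applied per value. This is only a minimal sketch, not the code used in this notebook: `split_super_types` is a hypothetical name, and it assumes the input is a space-separated type string with `'NONE'` standing in for missing values. Unlike the loops above, it returns `'NONE'` instead of an empty string if nothing but supertypes is present.

```python
SUPER_TYPES = ('Legendary', 'Snow', 'World', 'Basic')

def split_super_types(card_type):
    """Return (super_type, remaining_card_type) for a space-separated type string."""
    words = card_type.split()
    # keep the original word order in both halves
    supers = [word for word in words if word in SUPER_TYPES]
    rest = [word for word in words if word not in SUPER_TYPES]
    # fall back to the 'NONE' placeholder used throughout this notebook
    return (" ".join(supers) or 'NONE', " ".join(rest) or 'NONE')

print(split_super_types('Legendary Snow Creature'))
print(split_super_types('Artifact'))
print(split_super_types('NONE'))
```

With pandas, `df['card_type'].apply(split_super_types)` would give a Series of tuples that can be unpacked into the `super_type` and `card_type` columns.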

In [70]:
df['card_type'].value_counts()

Creature                10047
Instant                  2453
Enchantment              2333
Sorcery                  2194
Artifact                 1335
Land                      661
Artifact Creature         610
Planeswalker              207
Enchantment Creature      135
Tribal Instant             20
Tribal Sorcery             16
Tribal Enchantment         13
Artifact Land               6
Tribal Artifact             5
Enchantment Artifact        5
Hero Artifact               5
Hero                        2
Summon Dragon               2
Summon Knights              1
Land Creature               1
Summon Goblin               1
Autobot Character           1
Name: card_type, dtype: int64

In [71]:
df['card_type_back'].value_counts()

NONE                 19816
Creature                88
Sorcery                 69
Instant                 44
Land                    17
Planeswalker             8
Enchantment              7
Artifact                 3
Artifact Creature        1
Name: card_type_back, dtype: int64

In [72]:
# Clean up the legalities column: pull every format in which the card is legal or restricted out of the nested json object
for index in df.index:
    legal_formats = []
    for form in df.loc[index, 'legalities']:
        if df.loc[index, 'legalities'][form] == 'legal' or df.loc[index, 'legalities'][form] == 'restricted':
            legal_formats.append(form)
    df.loc[index, 'legalities'] = " ".join(legal_formats)

In [73]:
df.loc[df['legalities'] == ''].shape

(42, 32)

In [74]:
# there are some other older cards that are not legal in any format, i.e. cards that refer to ante, so those 
# legalities need to be set to NONE to avoid Null values later.
non_legal_cards = df.loc[df['legalities'] == ""].index
df.loc[non_legal_cards, 'legalities'] = 'NONE'
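
The legalities clean-up in the last few cells can be sketched as a single function that takes one card's legalities dictionary and returns the final string, including the `'NONE'` fallback. This is an illustrative sketch under assumptions: `playable_formats` is a hypothetical name, and the sample dictionary is made up for the example rather than taken from the data.

```python
def playable_formats(legalities):
    """Join the formats where a card is 'legal' or 'restricted'; 'NONE' if there are none."""
    formats = [fmt for fmt, status in legalities.items()
               if status in ('legal', 'restricted')]
    return " ".join(formats) or 'NONE'

# made-up sample in the shape of the nested legalities object
sample = {'standard': 'not_legal', 'legacy': 'legal',
          'vintage': 'restricted', 'modern': 'banned'}

print(playable_formats(sample))
print(playable_formats({'standard': 'not_legal'}))
```

Applied with `df['legalities'].apply(playable_formats)`, this would replace both the loop and the separate step that sets empty strings to `'NONE'`.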

In [75]:
df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back,super_type,super_type_back
0,card,Static Orb,https://scryfall.com/card/7ed/319/static-orb?u...,normal,{3},3.0,Artifact,"As long as Static Orb is untapped, players can...",[],[],[],legacy vintage commander duel,rare,"{'usd': '16.03', 'usd_foil': '84.99', 'eur': '...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Artifact,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
1,card,Sensory Deprivation,https://scryfall.com/card/m14/71/sensory-depri...,normal,{U},1.0,Enchantment — Aura,Enchant creature\nEnchanted creature gets -3/-0.,[U],[U],[Enchant],pioneer modern legacy pauper vintage penny com...,common,"{'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Enchantment,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
2,card,Road of Return,https://scryfall.com/card/c19/34/road-of-retur...,normal,{G}{G},2.0,Sorcery,Choose one —\n• Return target permanent card f...,[G],[G],[Entwine],legacy vintage commander duel,rare,"{'usd': '0.43', 'usd_foil': None, 'eur': '0.77...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Sorcery,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
3,card,Storm Crow,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,{1}{U},2.0,Creature — Bird,Flying (This creature can't be blocked except ...,[U],[U],[Flying],modern legacy pauper vintage penny commander duel,common,"{'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....",1,2,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
4,card,Walking Sponge,https://scryfall.com/card/ulg/47/walking-spong...,normal,{1}{U},2.0,Creature — Sponge,{T}: Target creature loses your choice of flyi...,[U],[U],[],legacy vintage penny commander duel,uncommon,"{'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....",1,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE


Now is a good time to also get the sub_type of a card.  
Sub_types include Aura, Vehicle, Arcane, Equipment, etc.  
For creatures, sub_types also include any further classifications, e.g. Human, Elf, Bird.  
Planeswalkers also have a sub_type (with the exception of one card, The Wanderer), which is the name of the character, e.g. Jace, Chandra, Oko.

In [76]:
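for index in df.index:
    # the front face is everything before ' // '; its sub_type follows ' — '
    try:
        df.loc[index, 'sub_type'] = df.loc[index, 'type_line'].split(' // ')[0].split(' — ')[1]
    except (IndexError, AttributeError):
        # no ' — ' separator (or no usable type_line), so there is no sub_type
        df.loc[index, 'sub_type'] = 'NONE'
    # the back face, if any, is everything after ' // '
    try:
        df.loc[index, 'sub_type_back'] = df.loc[index, 'type_line'].split(' // ')[1].split(' — ')[1]
    except (IndexError, AttributeError):
        df.loc[index, 'sub_type_back'] = 'NONE'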
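
The same parsing can be done without `try`/`except` by checking the length of each split. The sketch below is an alternative under assumptions, not the notebook's method: `split_sub_types` is a hypothetical name, and the inputs are example type lines in the `front // back` format with ` — ` before the sub_type.

```python
def split_sub_types(type_line):
    """Return (sub_type, sub_type_back) parsed from a type line."""
    faces = type_line.split(' // ')
    subs = []
    for face in faces[:2]:
        parts = face.split(' — ')
        # a face without ' — ' has no sub_type
        subs.append(parts[1] if len(parts) > 1 else 'NONE')
    # single-faced cards have no back half
    while len(subs) < 2:
        subs.append('NONE')
    return tuple(subs)

print(split_sub_types('Creature — Bird'))
print(split_sub_types('Artifact'))
print(split_sub_types('Creature — Human Werewolf // Creature — Werewolf'))
```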

In [77]:
df.head()

Unnamed: 0,object,name,scryfall_uri,layout,mana_cost,cmc,type_line,oracle_text,colors,color_identity,keywords,legalities,rarity,prices,power,toughness,produced_mana,card_faces,loyalty,content_warning,printed_name,flavor_name,card_type,oracle_text_back,colors_back,power_back,toughness_back,loyalty_back,card_type_back,mana_cost_back,super_type,super_type_back,sub_type,sub_type_back
0,card,Static Orb,https://scryfall.com/card/7ed/319/static-orb?u...,normal,{3},3.0,Artifact,"As long as Static Orb is untapped, players can...",[],[],[],legacy vintage commander duel,rare,"{'usd': '16.03', 'usd_foil': '84.99', 'eur': '...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Artifact,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
1,card,Sensory Deprivation,https://scryfall.com/card/m14/71/sensory-depri...,normal,{U},1.0,Enchantment — Aura,Enchant creature\nEnchanted creature gets -3/-0.,[U],[U],[Enchant],pioneer modern legacy pauper vintage penny com...,common,"{'usd': '0.08', 'usd_foil': '0.21', 'eur': '0....",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Enchantment,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Aura,NONE
2,card,Road of Return,https://scryfall.com/card/c19/34/road-of-retur...,normal,{G}{G},2.0,Sorcery,Choose one —\n• Return target permanent card f...,[G],[G],[Entwine],legacy vintage commander duel,rare,"{'usd': '0.43', 'usd_foil': None, 'eur': '0.77...",NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Sorcery,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE
3,card,Storm Crow,https://scryfall.com/card/9ed/100/storm-crow?u...,normal,{1}{U},2.0,Creature — Bird,Flying (This creature can't be blocked except ...,[U],[U],[Flying],modern legacy pauper vintage penny commander duel,common,"{'usd': '0.14', 'usd_foil': '2.33', 'eur': '0....",1,2,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Bird,NONE
4,card,Walking Sponge,https://scryfall.com/card/ulg/47/walking-spong...,normal,{1}{U},2.0,Creature — Sponge,{T}: Target creature loses your choice of flyi...,[U],[U],[],legacy vintage penny commander duel,uncommon,"{'usd': '0.18', 'usd_foil': '0.73', 'eur': '0....",1,1,NONE,NONE,NONE,NONE,NONE,NONE,Creature,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,Sponge,NONE


In [78]:
df.columns

Index(['object', 'name', 'scryfall_uri', 'layout', 'mana_cost', 'cmc',
       'type_line', 'oracle_text', 'colors', 'color_identity', 'keywords',
       'legalities', 'rarity', 'prices', 'power', 'toughness', 'produced_mana',
       'flavor_name', 'card_type', 'oracle_text_back', 'colors_back',
       'power_back', 'toughness_back', 'loyalty_back', 'card_type_back',
       'mana_cost_back', 'super_type', 'super_type_back', 'sub_type',
       'sub_type_back'],
      dtype='object')

In [79]:
# reordering the columns
df = df[['name', 'layout', 'colors', 'color_identity', 'mana_cost', 'cmc', 'type_line', 'card_type', 'super_type', 
    'sub_type', 'oracle_text', 'legalities', 'rarity', 'power', 'toughness', 'loyalty', 'card_faces',
    'oracle_text_back', 'colors_back', 'power_back', 'toughness_back', 'loyalty_back', 'card_type_back',
    'super_type_back', 'sub_type_back', 'mana_cost_back', 'scryfall_uri'
   ]]

___
I'd like to make one column that denotes whether a card has an activated ability and another that denotes whether it has a triggered ability. I originally had these as separate processes, then realized I could combine them into a single pass over the data to save computing time. To further increase performance I could do cluster computing with Scala, but that might not be worth the hassle and may not save time in the long run.

In [80]:
for index in df.index:
    # an activated ability on a card is denoted by a ':'
    if ':' in df.loc[index, 'oracle_text'] or ':' in df.loc[index, 'oracle_text_back']:
        df.loc[index, 'activated_ability'] = 1
    else:
        df.loc[index, 'activated_ability'] = 0

    # triggered abilities occur when certain conditions are met and use the keywords 'when', 'whenever', and 'at'
    trigger_pattern = r'\b(?:whenever|when|at)\b'
    if (re.search(trigger_pattern, df.loc[index, 'oracle_text'].lower()) is not None
            or re.search(trigger_pattern, df.loc[index, 'oracle_text_back'].lower()) is not None):
        df.loc[index, 'triggered_ability'] = 1
    else:
        df.loc[index, 'triggered_ability'] = 0
        
    # While I'm at it, I want to get rid of all the '\n' from the oracle texts
    df.loc[index, 'oracle_text'] = df.loc[index, 'oracle_text'].replace('\n', ' ')
    df.loc[index, 'oracle_text_back'] = df.loc[index, 'oracle_text_back'].replace('\n', ' ')
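
The two flags from the loop above can be isolated in a small function, which makes the rules easy to check on single cards before running them over the whole DataFrame. This is a sketch under assumptions: `ability_flags` and `TRIGGER_PATTERN` are hypothetical names, and the sample texts are made up for illustration.

```python
import re

# 'whenever', 'when', or 'at' as standalone words
TRIGGER_PATTERN = re.compile(r'\b(?:whenever|when|at)\b')

def ability_flags(text, text_back='NONE'):
    """Return (activated_ability, triggered_ability) as 0/1 flags."""
    combined = f"{text} {text_back}"
    # an activated ability is denoted by a ':'
    activated = int(':' in combined)
    triggered = int(TRIGGER_PATTERN.search(combined.lower()) is not None)
    return activated, triggered

print(ability_flags('{T}: Target creature loses flying.'))
print(ability_flags('When this creature dies, draw a card.'))
print(ability_flags('Flying'))
```

Note that `\bat\b` matches any standalone "at", so this rule may also flag text that is not a triggered ability.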

___

Now I need to clean up the oracle_text and oracle_text_back columns by tokenizing them with a regular expression.

In [81]:
# instantiate the tokenizer
tknr = RegexpTokenizer(r"[a-zA-Z{}+'0-9-/−]+")

odf = df.loc[df['oracle_text'] != 'NONE']
obdf = df.loc[df['oracle_text_back'] != 'NONE']

# start with empty lists
tokens = []
tokens_back = []

# fill the lists with tokenized versions of each card's oracle_text
for card in odf['oracle_text']:
    tokens.append(" ".join(tknr.tokenize(card.lower())))

for card in obdf['oracle_text_back']:
    tokens_back.append(" ".join(tknr.tokenize(card.lower())))

# store the tokenized text in new columns
df.loc[odf.index, 'oracle_text_token'] = tokens
df.loc[obdf.index, 'oracle_text_back_token'] = tokens_back
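
To see what the tokenizer pattern keeps and drops, it can be tried on a couple of strings with the standard `re` module. This is a sketch under one assumption: with its default settings, `RegexpTokenizer.tokenize` returns the pattern's matches, so `re.findall` with the same character class should behave the same way. `tokenize_oracle_text` is a hypothetical helper name.

```python
import re

# same character class as the tokenizer above: letters, digits, mana-symbol
# braces, '+', apostrophes, '/', and both hyphen-minus and the minus sign
TOKEN_PATTERN = re.compile(r"[a-zA-Z{}+'0-9-/−]+")

def tokenize_oracle_text(text):
    """Lowercase the text and keep only the runs of allowed characters."""
    return " ".join(TOKEN_PATTERN.findall(text.lower()))

print(tokenize_oracle_text('Enchant creature Enchanted creature gets -3/-0.'))
print(tokenize_oracle_text("{T}: Target creature can't block."))
```

Punctuation such as periods, colons, commas, and parentheses is dropped, while mana symbols like `{t}`, stat changes like `-3/-0`, and contractions like `can't` survive as single tokens.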

In [82]:
# reordering the columns
df = df[['name', 'layout', 'colors', 'color_identity', 'mana_cost', 'cmc', 'type_line', 'card_type', 'super_type', 
    'sub_type', 'oracle_text', 'oracle_text_token', 'legalities', 'rarity', 'power', 'toughness', 'loyalty',
    'activated_ability', 'triggered_ability', 'oracle_text_back', 'oracle_text_back_token',
    'colors_back', 'power_back', 'toughness_back', 'loyalty_back', 'card_type_back', 'super_type_back',
    'sub_type_back', 'mana_cost_back', 'scryfall_uri'
   ]]

df.fillna('NONE', inplace=True)

In [83]:
df.head()

Unnamed: 0,name,layout,colors,color_identity,mana_cost,cmc,type_line,card_type,super_type,sub_type,oracle_text,oracle_text_token,legalities,rarity,power,toughness,loyalty,activated_ability,triggered_ability,oracle_text_back,oracle_text_back_token,colors_back,power_back,toughness_back,loyalty_back,card_type_back,super_type_back,sub_type_back,mana_cost_back,scryfall_uri
0,Static Orb,normal,[],[],{3},3.0,Artifact,Artifact,NONE,NONE,"As long as Static Orb is untapped, players can...",as long as static orb is untapped players can'...,legacy vintage commander duel,rare,NONE,NONE,NONE,0.0,0.0,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,https://scryfall.com/card/7ed/319/static-orb?u...
1,Sensory Deprivation,normal,[U],[U],{U},1.0,Enchantment — Aura,Enchantment,NONE,Aura,Enchant creature Enchanted creature gets -3/-0.,enchant creature enchanted creature gets -3/-0,pioneer modern legacy pauper vintage penny com...,common,NONE,NONE,NONE,0.0,0.0,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,https://scryfall.com/card/m14/71/sensory-depri...
2,Road of Return,normal,[G],[G],{G}{G},2.0,Sorcery,Sorcery,NONE,NONE,Choose one — • Return target permanent card fr...,choose one return target permanent card from y...,legacy vintage commander duel,rare,NONE,NONE,NONE,0.0,0.0,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,https://scryfall.com/card/c19/34/road-of-retur...
3,Storm Crow,normal,[U],[U],{1}{U},2.0,Creature — Bird,Creature,NONE,Bird,Flying (This creature can't be blocked except ...,flying this creature can't be blocked except b...,modern legacy pauper vintage penny commander duel,common,1,2,NONE,0.0,0.0,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,https://scryfall.com/card/9ed/100/storm-crow?u...
4,Walking Sponge,normal,[U],[U],{1}{U},2.0,Creature — Sponge,Creature,NONE,Sponge,{T}: Target creature loses your choice of flyi...,{t} target creature loses your choice of flyin...,legacy vintage penny commander duel,uncommon,1,1,NONE,1.0,0.0,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,NONE,https://scryfall.com/card/ulg/47/walking-spong...


In [84]:
df.shape

(20053, 30)

___

In [85]:
# save out our cleaned df
df.to_csv('../Data/cards_cleaned.csv', index=False)
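
One thing to keep in mind when loading this file again: a CSV stores every cell as text, so if `colors` and `color_identity` still hold Python lists at this point (an assumption based on the output above), they will come back as strings such as `"['U']"`. A minimal sketch of converting them back follows; `parse_list_cell` is a hypothetical helper name.

```python
import ast

def parse_list_cell(cell):
    """Turn a CSV cell such as "['U', 'B']" back into a list; pass other values through."""
    if isinstance(cell, str) and cell.startswith('['):
        # literal_eval only evaluates Python literals, not arbitrary code
        return ast.literal_eval(cell)
    return cell

print(parse_list_cell("['U', 'B']"))
print(parse_list_cell("[]"))
print(parse_list_cell("NONE"))
```

After `pd.read_csv`, this could be applied to the list columns only, for example `df['colors'].apply(parse_list_cell)`.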