<a href="https://www.kaggle.com/code/bugsydor/mtg-color-guesser?scriptVersionId=142920704" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # basic plotting
from sklearn.preprocessing import OrdinalEncoder # ordinal encoder for categorical features
from sklearn.preprocessing import LabelEncoder # for the target
from sklearn.feature_selection import mutual_info_regression # Mutual Information for feature selection
from sklearn.model_selection import train_test_split # Train-Test Splits for validation and testing
from sklearn.ensemble import RandomForestClassifier # Random Forest Classifier model

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

My goal here is to train a classifier to guess the color of a monocolored Magic card, i.e. a card with *exactly* one color. (The target will be a card's color identity, rather than its color.)
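The `color_identity` column appears to store each identity as a stringified Python list such as `"['W']"`. Here is a minimal sketch, assuming that format holds for every row, of how "exactly one color" could be tested on a single value. The helper name `is_monocolored` is hypothetical and is not used elsewhere in this notebook.

```python
import ast

def is_monocolored(color_identity: str) -> bool:
    """Return True when the stringified list holds exactly one color."""
    try:
        colors = ast.literal_eval(color_identity)
    except (ValueError, SyntaxError):
        return False  # unparsable entries (e.g. placeholder strings) are not monocolored
    return isinstance(colors, list) and len(colors) == 1

print(is_monocolored("['W']"))       # True
print(is_monocolored("['W', 'U']"))  # False
print(is_monocolored("[]"))          # False
```

Parsing the string keeps the check independent of which of the five colors is present, at the cost of being slower than a plain string comparison on a large table.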

In [None]:
# load in the massive file
all_cards = pd.read_csv("/kaggle/input/mtg-all-cards/all_mtg_cards.csv")

In [None]:
# display the first few rows
all_cards.head()

In [None]:
all_cards.dtypes

In [None]:
all_cards.isnull().any()

In [None]:
# extract monocolored cards, remove cards with missing multiverse_id
mono_identities = ["['W']", "['U']", "['B']", "['R']", "['G']"]
monocolored = all_cards.loc[all_cards.color_identity.isin(mono_identities) & all_cards.multiverse_id.notnull()]

# set the index to multiverse_id
monocolored = monocolored.set_index('multiverse_id')

# fill remaining NaNs with "none"
monocolored = monocolored.fillna("none")

# sample shows things working as expected
monocolored.head(3)

In [None]:
# select all potentially helpful features
they_might_be_features = ['name', 'cmc', 'color_identity', 'type', 'supertypes', 'subtypes', 'rarity', 'text', 'artist', 'power',
                         'toughness', 'loyalty'] # multiverse_id is the index, and thus doesn't need to be selected
data = monocolored[they_might_be_features].copy() # copy so that columns added later don't write into a slice of monocolored
data.head()

In [None]:
data.isnull().any()

In [None]:
# remove duplicate card names
data = data.drop_duplicates(subset = "name", keep = "first")

I'm thinking of splitting it into three separate models based on type: Creatures, Planeswalkers, and Others. This is largely because the first two have their own specific stats (power/toughness for creatures, loyalty for planeswalkers) that don't apply to the other two types. Recombining the final three tables into one should be trivial.

In [None]:
# Turns out the database lists star/star p/t creatures correctly.
#data.loc[data.name.str.contains("Maro")]
#data.loc[data.name == "Negate"]

I want to create one-hot-encoded categories for frequently used keyword abilities. My domain knowledge suggests that certain keywords are more common in some colors than in others (for instance, a green card is unlikely to have "flying"), so it's probably useful data.
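As a small illustration of the idea, here is a sketch of one keyword flag built from two regular expressions: one that finds the effect and one that vetoes false positives. The function name and the three sample rules texts are my own stand-ins, not rows from the dataset.

```python
import re

# Match the effect itself, then veto cards that only mention it via ward.
COUNTER = re.compile(r"[Cc]ounter\s(?:it|target|all)")
WARD = re.compile(r"[wW]ard(?:\s\{|—)")

def is_counterspell(text: str) -> bool:
    """Flag rules text that counters something, ignoring ward's reminder text."""
    return bool(COUNTER.search(text)) and not WARD.search(text)

# Illustrative rules texts (assumed, not taken from the dataset)
samples = {
    "Counter target spell.": True,
    "Ward {2} (Whenever this creature becomes the target of a spell or ability "
    "an opponent controls, counter it unless that player pays {2}.)": False,
    "Flying": False,
}
for text, expected in samples.items():
    print(is_counterspell(text), expected)
```

Raw strings (`r"..."`) keep escapes like `\s` from being interpreted by Python before the regex engine sees them.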

In [None]:
# add keyword columns

## counterspells
data["counterspell"] = (data.text.str.contains(r"[Cc]ounter\s(?:it|target|all)") & # find counterspells
         ~data.text.str.contains(r"[wW]ard(?:\s\{|—)")) # leave out ward cards ("Ward {N}" or "Ward—cost")

## exile
data["exile"] = (data.text.str.contains(r"[eE]xile\s(?:target|each|all|the|up\sto)") & # find exile
        ~data.text.str.contains(r"the\stop")) # leave out impulse draw

## fight
data["fight"] = (data.text.str.contains("fights"))

## mill
data["mill"] = (data.text.str.contains("[mM]ill"))

## sacrifice
data["sacrifice"] = (data.text.str.contains("[sS]acrifice"))

## scry
data["scry"] = (data.text.str.contains("[sS]cry"))

## tap other
data["tap"] = (data.text.str.contains(r"(?:\st|T)ap\s(?:it|target|each|all|or\suntap)") | # find active tappers
         data.text.str.contains(r"enter\sthe\sbattlefield\stapped")) # as well as passive ones

## untap other
data["untap"] = (data.text.str.contains(r"[uU]ntap\s(?:it|target|each|all)")) # find untappers

## deathtouch
data["deathtouch"] = (data.text.str.contains("[dD]eathtouch") | # find creatures that have deathtouch
        data.text.str.contains("deals combat damage to a creature, destroy that creature", regex = False)) # or that have "derptouch"

## defender
data["defender"] = (data.text.str.contains("[dD]efender") & # find creatures with defender or things that give defender
        ~data.text.str.contains(r"(?:[tT]arget|[eE]ach|[aA]ll)\screature(?:\s|s\s)with\sdefender")) # remove things that specifically affect defenders

## double_strike
data["double_strike"] = (data.text.str.contains(r"[dD]ouble\sstrike"))

## first_strike
data["first_strike"] = (data.text.str.contains(r"[fF]irst\sstrike"))

## flash
data["flash"] = (data.text.str.contains(r"(?:f|\nF|^F)lash") & # some engineering to avoid incorrectly grabbing cards with Flash in the name
        ~data.text.str.contains("[fF]lashback")) # don't want to capture flashback

## flying
data["flying"] = (data.text.str.contains("[fF]lying"))

## haste
data["haste"] = (data.text.str.contains("[hH]aste"))

## hexproof
data["hexproof"] = (data.text.str.contains("[hH]exproof"))

## indestructible
data["indestructible"] = (data.text.str.contains("[iI]ndestructible") &
                         ~data.text.str.contains(r"loses\sindestructible"))

## lifelink
data["lifelink"] = (data.text.str.contains("[lL]ifelink"))

## menace
data["menace"] = (data.text.str.contains("[mM]enace"))

## protection
data["protection"] = (data.text.str.contains(r"[pP]rotection\sfrom"))

## prowess
data["prowess"] = (data.text.str.contains("[pP]rowess"))

## reach
data["reach"] = (data.text.str.contains(r"(?:\sr|\nR|^R)each") &
        ~data.text.str.contains("can't be blocked except by creatures with flying or reach", regex = False)) # don't want flying reminder text

## trample
data["trample"] = (data.text.str.contains("[tT]rample"))

## vigilance
data["vigilance"] = (data.text.str.contains("[vV]igilance"))

## draw
data["draw"] = (data.text.str.contains(r"(?:\sd|\nD|^D)raw"))

## discard
data["discard"] = (data.text.str.contains("[dD]iscard"))

## damage
data["damage"] = (data.text.str.contains(r"deals\s\d+\sdamage")) # \d+ so that two-digit amounts also match

## damage prevention
data["damage_prevention"] = (data.text.str.contains(r"[pP]revent\s"))

## life_gain
data["life_gain"] = (data.text.str.contains(r"gain(?:\s|s\s)\d+\slife"))

## life_loss
data["life_loss"] = (data.text.str.contains("loses") &
                   data.text.str.contains(r"(?:their|\d+)\slife")) # capture both fixed amounts and ones relative to a life total

## tokens
data["tokens"] = (data.text.str.contains("[cC]reate"))

## destroy
data["destroy"] = (data.text.str.contains("[dD]estroy") &
                  ~data.text.str.contains(r"don't\sdestroy\sit\.")) # reject indestructible's reminder text

## bounce
data["bounce"] = (data.text.str.contains("[rR]eturn") &
        data.text.str.contains(r"owner's\s(?:hand|library)") & # capture hand or library bounce effects
        ~data.text.str.contains(r"graveyard\sto")) # exclude grave recursion

## recursion
data["recursion"] = (data.text.str.contains(r"\sput|[rR]eturn") & # also match a sentence-initial "Return"
        data.text.str.contains("graveyard") &
        data.text.str.contains("hand|battlefield"))

data.sample(5)

In [None]:
# Split out the three populations:

## Creatures (drop loyalty column)
creatures = data.loc[data.type.str.contains("Creature")].drop("loyalty", axis = 1)

## Planeswalkers (drop power/toughness)
planeswalkers = data.loc[data.type.str.contains("Planeswalker")].drop(["power", "toughness"], axis = 1)

## Others (drop loyalty and power/toughness)
others = data.loc[data.type.str.contains("Creature|Planeswalker") == False].drop(["loyalty", "power", "toughness"], axis = 1)

In [None]:
# convert appropriate creature columns to floats
creatures[["power", "toughness"]] = creatures[["power", "toughness"]].replace(r"\d*\+*\*|\?|none|∞", 0, regex = True)
# asterisks successfully replaced
creatures = creatures.astype({"power": "float", "toughness": "float"})
print(creatures[["power", "toughness"]].dtypes) # it worked

In [None]:
# convert planeswalker loyalty to float
planeswalkers["loyalty"] = planeswalkers.loyalty.replace(r"\*", 0, regex = True)
planeswalkers["loyalty"] = planeswalkers["loyalty"].astype("float")
print(planeswalkers["loyalty"].dtypes) # worked far more easily than the creatures

In [None]:
# Creature Feature Engineering

## P/T
creatures["p/t"] = creatures["power"] / creatures["toughness"] # didn't throw an error. Probably creates NaNs
# it does, indeed, create NaNs and infs. That works.

## (P + T) / (2cmc)
creatures["(p+t)/(2cmc)"] = (creatures["power"] + creatures["toughness"]) / (2 * creatures["cmc"])

# there were no 0 cmc monocolored creatures, so only p/t had NaNs

## fill NaNs and infs w/ 0
creatures[["p/t", "(p+t)/(2cmc)"]] = creatures[["p/t", "(p+t)/(2cmc)"]].replace([np.inf, -np.inf], 0).fillna(0)
creatures.sort_values("p/t").tail()

In [None]:
# Planeswalker Feature Engineering

## L / cmc
planeswalkers["loyalty/cmc"] = planeswalkers.loyalty / planeswalkers.cmc
# no NaNs or infs, but let's replace them in case more show up in an updated dataset.
planeswalkers["loyalty/cmc"] = planeswalkers["loyalty/cmc"].replace([np.inf, -np.inf], 0).fillna(0)

In [None]:
creatures.dtypes

In [None]:
# Ordinal Encoding for categorical features and Label Encoding the target

## Create list of categorical features
cats = ["type", "supertypes", "subtypes", "rarity", "artist"] # don't forget to drop the text column now that you've extracted the features from it

## Encoders

### Create and fit label encoder
le = LabelEncoder()
le.fit(data.color_identity)
#le.classes_ # array(["['B']", "['G']", "['R']", "['U']", "['W']"], dtype=object)

### create ordinal encoder
oe = OrdinalEncoder()
oe.fit(data[cats])
#print(oe.categories_) # it works, but it's an extensive list

## Creatures

### crt label transform
y_crt = le.transform(creatures.color_identity)
#print(len(y_crt)) # 10607

### crt cat transform
X_crt = creatures.copy().drop(["name", "color_identity", "text"], axis = 1)
X_crt[cats] = oe.transform(X_crt[cats])

## Planeswalkers

### pln label transform
y_pln = le.transform(planeswalkers.color_identity)

### pln cat transform
X_pln = planeswalkers.copy().drop(["name", "color_identity", "text"], axis = 1)
X_pln[cats] = oe.transform(X_pln[cats])

## Others

### oth label transform
y_oth = le.transform(others.color_identity)
#print(len(y_oth)) # 7976

### oth cat transform
X_oth = others.copy().drop(["name", "color_identity", "text"], axis = 1)
X_oth[cats] = oe.transform(X_oth[cats])

In [None]:
X_crt.head() # go back and limit the scope of fill_na away from categorical columns

In [None]:
X_crt.dtypes

In [None]:
# determine discrete variable columns for each set

## Creatures
for colname in ["cmc", "type", "supertypes", "subtypes", "rarity", "artist", "power", "toughness"]:
    X_crt[colname] = X_crt[colname].astype("int") # convert discrete columns to int
### boolean keyword flags and integer columns both count as discrete
crt_discrete = (X_crt.dtypes == bool) | (X_crt.dtypes == int)

## Planeswalkers
for colname in ["cmc", "type", "supertypes", "subtypes", "rarity", "artist", "loyalty"]:
    X_pln[colname] = X_pln[colname].astype("int") # convert discrete columns to int
pln_discrete = (X_pln.dtypes == bool) | (X_pln.dtypes == int)

## Others
for colname in ["cmc", "type", "supertypes", "subtypes", "rarity", "artist"]:
    X_oth[colname] = X_oth[colname].astype("int") # convert discrete columns to int
oth_discrete = (X_oth.dtypes == bool) | (X_oth.dtypes == int)

In [None]:
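# Mutual Information (MI) time! Split by card type.
from sklearn.feature_selection import mutual_info_classif # the target is a class label, so use the classification estimator

## define the function
def make_mi_scores(X, y, discrete_features):
    mi_scores = mutual_info_classif(X, y, discrete_features = discrete_features, random_state = 42)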
    mi_scores = pd.Series(mi_scores, name="MI Scores", index=X.columns)
    mi_scores = mi_scores.sort_values(ascending=False)
    return mi_scores

## Creatures
crt_mi_scores = make_mi_scores(X_crt, y_crt, crt_discrete)

## Planeswalkers
pln_mi_scores = make_mi_scores(X_pln, y_pln, pln_discrete)

## Others
oth_mi_scores = make_mi_scores(X_oth, y_oth, oth_discrete)

In [None]:
crt_mi_scores[::3]

In [None]:
pln_mi_scores[::3]

In [None]:
oth_mi_scores[::3]

In [None]:
# set up plotting
def plot_mi_scores(scores):
    scores = scores.sort_values(ascending=True)
    width = np.arange(len(scores))
    ticks = list(scores.index)
    plt.barh(width, scores)
    plt.yticks(width, ticks)
    plt.title("Mutual Information Scores")

In [None]:
# plot creature MI scores
plt.figure(dpi=100, figsize=(8, 8))
plot_mi_scores(crt_mi_scores)

For creatures, type/subtype (i.e. the specific type of creature) appears to have an outsized influence on final color, which makes sense given my experience. You are far more likely to find blue merfolk and black zombies than the other way around, after all. The next biggest impact appears to be the artist, which will be discussed more below.

Flying seems to be the most impactful keyword, which also makes sense due to its relative concentration in blue and scarcity in green. It even edges out the power/toughness ratio, something no other keyword does.

In [None]:
# plot planeswalker MI scores
plt.figure(dpi=100, figsize=(8, 8))
plot_mi_scores(pln_mi_scores)

For planeswalkers, subtype again carries the day. Since a planeswalker's subtype tends to be the name of a recurring character, and those tend to be associated with a consistent color, I'm tempted to call it cheating to identify color by subtype here.

Keywords are the next most interesting features, with damage, exile, and double strike being the most strongly colored.

Artist is, unusually, a *bad* indicator of color for planeswalkers. This is likely because a single artist often does a complete cycle of different-colored planeswalkers in a given set.

In [None]:
# plot other MI scores
plt.figure(dpi=100, figsize=(8, 8))
plot_mi_scores(oth_mi_scores)

For other card types, the artist edges out keywords as the most informative feature. Oftentimes, an artist will specialize more in one color or faction in Magic than in the others, so this makes sense to me.

Dealing direct damage to something is often a good sign (though far from a guarantee) that a card is red, but it surprised me that that was a stronger indicator than whether something countered spells (a mechanic almost entirely exclusive to blue). Maybe it's more that something *not* being a counterspell doesn't reveal much about what color a card is.
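A hand-worked toy calculation makes that last point concrete. The counts below are invented (1000 cards, 200 per color), and the function is a plain plug-in estimate written for illustration, not the estimator used for the plots above. A rare keyword that only appears in blue still scores far below a common keyword that merely leans red, because the 990 cards without the rare keyword are spread almost evenly over the five colors.

```python
from collections import Counter
from math import log

def mutual_information(pairs):
    """Mutual information (in nats) between two discrete variables,
    estimated from a list of (feature, label) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    feature_counts = Counter(x for x, _ in pairs)
    label_counts = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log(c * n / (feature_counts[x] * label_counts[y]))
        for (x, y), c in joint.items()
    )

# Invented counts: "counterspell" is rare but exclusive to blue (10 cards).
counter_pairs = [(True, "U")] * 10 + [(False, "U")] * 190
counter_pairs += [(False, c) for c in "WBRG" for _ in range(200)]

# "damage" is common and skewed toward red, but not exclusive to it.
damage_pairs = [(True, "R")] * 120 + [(False, "R")] * 80
for c in "WUBG":
    damage_pairs += [(True, c)] * 20 + [(False, c)] * 180

mi_counter = mutual_information(counter_pairs)
mi_damage = mutual_information(damage_pairs)
print(f"counterspell: {mi_counter:.3f} nats, damage: {mi_damage:.3f} nats")
# counterspell: 0.016 nats, damage: 0.106 nats
```

Mutual information averages over every card, so a feature's score depends on how often it fires as well as how decisive it is when it does.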

Now to select the features for each dataset.

In [None]:
# select creature features
crt_features = ['subtypes', 'flying']
X_crt_sub = X_crt[crt_features]

In [None]:
# select planeswalker features
pln_features = ['subtypes', 'trample', 'double_strike']
X_pln_sub = X_pln[pln_features]

In [None]:
# select other features
oth_features = ['damage', 'type', 'life_gain', 'draw', 'life_loss', 'haste', 'destroy', 'trample',
                'vigilance', 'reach']
#X_oth_sub = X_oth[oth_features]
X_oth_sub = X_oth.drop(['artist', 'subtypes', 'cmc', 'prowess', 'hexproof', 'double_strike'], axis = 1) # Trying out some negative selection

In [None]:
# prepare train-test-validation splits for each card type set

## Creatures
X_crt_mid, X_crt_test, y_crt_mid, y_crt_test = train_test_split(X_crt_sub, y_crt, test_size = 0.2, random_state = 42)
X_crt_train, X_crt_valid, y_crt_train, y_crt_valid = train_test_split(X_crt_mid, y_crt_mid, test_size = 0.2, random_state = 42)

## Planeswalkers
X_pln_mid, X_pln_test, y_pln_mid, y_pln_test = train_test_split(X_pln_sub, y_pln, test_size = 0.2, random_state = 42)
X_pln_train, X_pln_valid, y_pln_train, y_pln_valid = train_test_split(X_pln_mid, y_pln_mid, test_size = 0.2, random_state = 42)

## Other
X_oth_mid, X_oth_test, y_oth_mid, y_oth_test = train_test_split(X_oth_sub, y_oth, test_size = 0.2, random_state = 42)
X_oth_train, X_oth_valid, y_oth_train, y_oth_valid = train_test_split(X_oth_mid, y_oth_mid, test_size = 0.2, random_state = 42)

Now that all of the various splits are created from the cleaned data, I should be able to make, validate, and test models for each set.

That is, once I select the most salient features for each dataset using the information gleaned in the above MI studies.

For the moment, I only plan to try Random Forest models, since those are easier than XGBoost or TensorFlow to integrate into an sklearn pipeline, and I see pipelines being handy for streamlining things here and cutting down on programmer errors.
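To show what I mean by a pipeline, here is a rough sketch, assuming a tiny invented table with one categorical column and one boolean keyword column. The step names, the toy rows, and the choice to map unseen categories to -1 are all my own assumptions; nothing below is wired into the models in this notebook yet.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Toy stand-in for one of the card-type tables; the rows are invented.
toy = pd.DataFrame({
    "subtypes": ["Merfolk", "Zombie", "Merfolk", "Zombie", "Elf", "Elf"],
    "flying": [True, False, False, False, False, False],
    "color_identity": ["U", "B", "U", "B", "G", "G"],
})
cat_cols = ["subtypes"]

# Encode the categorical columns and pass everything else through untouched.
preprocess = ColumnTransformer(
    transformers=[
        ("cats", OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1), cat_cols),
    ],
    remainder="passthrough",
)
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=42)),
])

X = toy.drop(columns="color_identity")
y = toy["color_identity"]
model.fit(X, y)

# "Goblin" never appeared in training, so the encoder maps it to -1.
new_cards = pd.DataFrame({"subtypes": ["Merfolk", "Goblin"], "flying": [False, True]})
preds = model.predict(new_cards)
print(preds)
```

Because the encoder is fitted inside the pipeline, it only ever sees the rows passed to `fit`, which removes one way of leaking test data into preprocessing.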

In [None]:
# Random Forest Creature Model

## instantiate the model
crt_forest = RandomForestClassifier(n_estimators = 100, random_state = 42)

## fit the model
crt_forest.fit(X_crt_train, y_crt_train)

## validate the model
crt_forest.score(X_crt_valid, y_crt_valid) # With ALL features selected, accuracy was 62%.
                                           # With just the top five features selected, accuracy crashes to 52%. Adding the next one hurts accuracy.
                                           # Pruning back to the top 3 raises accuracy to 53%. Adding p/t raises it to 54%.
                                           # Going all the way down to just subtypes raises accuracy to 66%. Adding artist back crashes to 53%.
                                           # Adding anything at all (except flying) appears to worsen the accuracy.

In [None]:
# Random Forest Planeswalker Model

## instantiate the model
pln_forest = RandomForestClassifier(n_estimators = 100, random_state = 42)

## fit the model
pln_forest.fit(X_pln_train, y_pln_train)

## validate the model
pln_forest.score(X_pln_valid, y_pln_valid) # With ALL features selected, accuracy was 61%.
                                           # With the top five features selected, accuracy shot up to 78%.
                                           # Adding any more seems to do nothing, but truncating down to just subtype boosts it to 83%
                                           # Adding just damage to subtypes crashes accuracy to 70%.
                                           # Removing damage and adding trample improves the score to 87%.
                                           # Adding double_strike improves it to 91%

In [None]:
# Random Forest Other Model

## instantiate the model
oth_forest = RandomForestClassifier(n_estimators = 100, random_state = 42)

## fit the model
oth_forest.fit(X_oth_train, y_oth_train)

## validate the model
oth_forest.score(X_oth_valid, y_oth_valid) # With ALL features selected, accuracy was 43%. I'd figured this would be the hard one.
                                           # Top five features crashes accuracy to 30%.
                                           # Just Artist is practically worthless (22%). Removing artist raises the top 5 (-artist) to 33%.
                                           # Removing type hurts the model. Removing damage hurts it more. Removing life_gain hurts slightly.
                                           # Removing counterspell doesn't seem to hurt the model at all. Slightly improves it.
                                           # Adding draw improves it to 34%. +life_loss +haste +destroy +trample 39.8%
                                           # -discard, -bounce, -exile, -subtypes, +vigilance
                                           # negative selection (just dropping artist and subtypes) improves the score to 45.5%.
                                           # 46.24%

In [None]:
# Find the format that .predict spits out
test_array = crt_forest.predict(X_crt_test) # It spits out an array. array([2, 2, 4, ..., 0, 0, 4])

X_crt_final = X_crt_test.copy()
X_crt_final["guesses"] = test_array
X_crt_final['answers'] = y_crt_test
X_crt_final = X_crt_final[['guesses', 'answers']]
X_crt_final.head()

My plan here is:
1. Create predictions for each subset
2. Add the predictions and answers to their respective test X-data frames as columns (and drop all other columns but the index)
3. Append the dataframes together into a master list
4. Do math/comparison stuff to the guesses and answers columns to get the overall accuracy.
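The four steps can be sketched on invented numbers as follows. The helper `guesses_frame`, the ids, and the labels are all made up for illustration. The sketch uses `pd.concat` to stack the frames, since `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

def guesses_frame(index, guesses, answers):
    """Pair predictions with true labels, keeping the original row index."""
    return pd.DataFrame({"guesses": guesses, "answers": answers}, index=index)

# Invented predictions for three tiny subsets, indexed by made-up ids.
crt = guesses_frame([101, 102, 103], [0, 1, 2], [0, 1, 1])
pln = guesses_frame([201], [3], [3])
oth = guesses_frame([301, 302], [4, 0], [4, 2])

# Stack the subsets, then score every row at once.
combined = pd.concat([crt, pln, oth]).sort_index()
accuracy = (combined["guesses"] == combined["answers"]).mean() * 100
print(f"{accuracy:.2f}%")  # 66.67%
```

Scoring the combined frame weights each subset by its size, so the large creature set dominates the overall accuracy.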

In [None]:
# create X_crt_final again, for the sake of having all three in one place
y_crt_preds = crt_forest.predict(X_crt_test)
X_crt_final = X_crt_test.copy()
X_crt_final["guesses"] = y_crt_preds
X_crt_final['answers'] = y_crt_test
X_crt_final = X_crt_final[['guesses', 'answers']]

# Do again for X_pln_final
y_pln_preds = pln_forest.predict(X_pln_test)
X_pln_final = X_pln_test.copy()
X_pln_final['guesses'] = y_pln_preds
X_pln_final['answers'] = y_pln_test
X_pln_final = X_pln_final[['guesses', 'answers']]

# And once more for X_oth_final
y_oth_preds = oth_forest.predict(X_oth_test)
X_oth_final = X_oth_test.copy()
X_oth_final['guesses'] = y_oth_preds
X_oth_final['answers'] = y_oth_test
X_oth_final = X_oth_final[['guesses', 'answers']]

# Combine all into a single dataframe, then sort if necessary
# (DataFrame.append was removed in pandas 2.0, so use pd.concat)
final_frame = pd.concat([X_crt_final, X_pln_final, X_oth_final])
final_frame.sort_index(inplace = True)

In [None]:
# Score aggregate accuracy
## count how often each (guess, answer) pair occurs
final_frame_out = final_frame.groupby(['guesses', 'answers']).size().reset_index(name = 'count')
print(final_frame_out)

In [None]:
final_score = final_frame_out[final_frame_out['guesses'] == final_frame_out['answers']]['count'].sum() / len(final_frame) * 100
print(f"The color guesser has an overall accuracy of {round(final_score, ndigits = 2)}%.")

**To-Do List:**
* ~~Convert Power, Toughness, and Loyalty to numeric~~
    * ~~(non-numbers like 'X' or '\*' will be converted to '0')~~
* ~~Add relevant non-keywords as columns~~
    * ~~draw~~
    * ~~discard~~
    * ~~damage~~
    * ~~damage prevention~~
    * ~~life gain~~
    * ~~life loss~~
    * ~~destroy~~
    * ~~bounce (hand or library)~~
    * ~~recursion (hand or battlefield)~~
    * ~~create tokens~~
* ~~Engineer the following features w/o dividing by 0:~~
    * ~~P/T~~
    * ~~(P + T) / (2 * cmc)~~
    * ~~(loyalty) / (cmc)~~
* ~~Convert target color to numbers (0 - 4; LabelEncoder sorts the classes alphabetically: B, G, R, U, W)~~
* ~~Convert categorical features with ordinal encoder~~
* ~~Run pre-model algorithms~~
    * ~~Mutual Information~~
    * K-means clustering? (Not for now, at least. Might not be appropriate.)
* ~~Prepare Train-Test Splits~~
    * ~~Separate train-test split for each set of card types~~
        * ~~Creature~~
        * ~~Planeswalker~~
        * ~~Other~~
* ~~Select model type~~
    * **Random Forest** (for now)
    * Gradient Boosted Trees
    * Deep Neural Network
* ~~Create and test models for each subset~~
    * ~~Select features~~
    * ~~Fit model~~
    * ~~Test model~~
    * ~~(Repeat above steps for each subset)~~
* Combine predictions into a single dataframe and score aggregate accuracy
    * ~~Figure out whether the predictions are in a dataframe format and have their indices intact~~
        * ~~If they don't, fix that.~~
    * ~~Use .append with ignore_index = FALSE and sort = TRUE to add the rows together~~
    * ~~Find a way to compare the combined predictions with the combined true answers and score the accuracy.~~
* **To be added later:** Integrate separate models into a single pipeline that takes in raw data and outputs predictions
    * Create different train-test split before the pipeline (i.e. splitting off the final testing data)
    * Split by type
    * Clean data
    * Engineer features
    * Select Features
    * Create validation splits for each of the three models
    * Train the three models
    * Merge their outputs into a list of predictions indexed by MultiverseID.
    * Create predictions based on the test data after the pipeline.