# Classifying Magic Cards by Role Performed

# Goal

My goal is to classify Magic: the Gathering cards by the role they play in their respective decks. <br>
<br>
Magic: the Gathering is a competitive card game. One of the major ways that Magic is played is a format called draft. In drafts 8 players will sit down at a table with three sealed packs of 15 cards each. Players will open the first pack and select one card from it for their deck. They will then pass the remaining 14 cards to the player on their left. When they recieve 14 cards from the player on their right, they will pick one card and pass the remaining 13 cards along. As each pack is passed around the table each player selects one card and passes the rest along until all the cards are gone. Then each player opens the second pack and the process is repeated until all the cards are gone. Once the third pack is drafted, each player should have selected 45 cards for their deck. They will then choose the best 22-24 of those cards to comprise their deck.<br>
<br>
All of the different cards fill different roles in a deck. Some cards are creatures that can attack the opponent to win the game. Some cards destroy opponents creatures. Some cards allow you to draw more cards, others force your opponent to discard cards. Some prevent the opponent from playing their cards, some negate the effects of the opponents cards after they've been played. In order to build the best deck, you need to identify the role that all of your cards will play. For example if all of your cards make the creatures you play better, but your deck contains no creatures for those cards to help, you are unlikely to win. <br>
<br>
My goal for this project is to build a classifier that will read the text on Magic cards and classify them by role. This is a necessary step for training a learner to build synergistic draft decks. 

# Data
For data, I am using a dataset generated through the mtgsdk library. The library allows me to create Card objects for every Magic card in print. For each of those Card objects I can access the text printed on the card as well as other metadata.<br>
<br>
Unfortunately the role of each card in a deck is not a part of the metadata provided. I need to generate those labels by hand. Because there are approximately 20,000 Magic cards in print I have chosen a subset to work with. I have chosen to include cards that are legal in the Pioneer format, which includes cards printed since October 2012. From those Pioneer legal cards I have selected all cards that have the type Instant or the type Sorcery. I chose to use instants and sorceries because they have a one time effect on the game. They do whatever the text of the card says and then are discarded from play. <br>
<br>
All told, there were 1686 instants and sorceries printed since Oct 2012. I wrote a script to label them 100 at a time and then save the labels in .txt files. I wound up dropping one row in which I mislabeled a card, bringing the total to 1685 cards. For each card I have the label, the text on the card, and the type of the card (instant vs sorcery).<br>
<br>
I have not included the actual dataset generation and labeling in this file. Once the dataset was generated, the labels were pickled and saved to a file named 'y.data' while the card objects were pickled and saved to 'IS.data'. In this file I simply load the pickled data I previously generated.<br>
<br>


### Description of Metadata
The primary data I will be working with is the card text. It is the text that describes what the card does when played. I will also be looking at the card type. For this dataset it will be restricted to 'Instant' or 'Sorcery'. Instants are cards that can be played at any time during the game, while Sorceries are cards that can be played only at specific moments during your turn. The last category is the Converted Mana Cost (cmc) of each card. It is the total amount of resources that need to be spent to play the card. <br>
<br>


In [1]:
#importing useful libraries
import mtgsdk
import pickle
from mtgsdk import Set
from mtgsdk import Card
import numpy as np
import pandas as pd


In [2]:
#starting by loading the complete list of labels
with open('y.data', 'rb') as filehandle:
     y = pickle.load(filehandle)
#loading the complete list of instants and sorceries
with open('IS.data', 'rb') as filehandle:
    IS = pickle.load(filehandle)

In [3]:
#pulling out card text and card type
X_text=[]
X_type=[]
X_cmc=[]
for thing in IS:
    X_text.append(thing.text)
    X_type.append(thing.type)
    X_cmc.append(thing.cmc)

In [4]:
#making them into pandas series
Type=pd.Series(X_type)
Y=pd.Series(y)
Text=pd.Series(X_text)
CMC=pd.Series(X_cmc)

In [5]:
#checking labels since I hand labeled them
Y.unique()

array(['draw', 'cond', 'token', 'trick', 'misc', 'tempo', 'threaten',
       'rdead', 'removal', 'burn', 'pump', 'counter', 'discard', 'aerem',
       'tutor', 'sweeper', 'ramp', 'mill', ''], dtype=object)

### Data Cont.

When I labeled the target I chose 18 different categories for the cards. One card was mislabeled as ' ', this will be deleted in the following cells. The dataset isn't balanced.

In [6]:
#looking at counts will probably use this to trim the dataset
Y.value_counts()

trick       262
cond        226
draw        186
removal     127
tempo       113
token       104
misc         99
counter      98
burn         91
sweeper      73
aerem        73
discard      54
tutor        50
pump         40
threaten     32
rdead        28
ramp         18
mill         11
              1
dtype: int64

In [7]:
#making a dataframe
frame={'type':Type,'text':Text,'cmc':CMC,'y':y}
df=pd.DataFrame(frame)

In [8]:
#drop the row with a missing value
df=df[df.y!='']
df=df.dropna()

### Data Cont.
The first ten entries in the dataframe are displayed below. The dataframe contains the text of the card, whether it is an instant or a sorcery, and the converted mana cost (cmc) of the card. The y label is also included in the dataframe. Since the text of the card is truncated when looking at the full dataframe, I have also printed the full text of the first 10 cards in the next cell.

In [9]:
df.head(10)

Unnamed: 0,type,text,cmc,y
0,Sorcery,Choose one or both —\n• Return target creature...,2.0,draw
1,Sorcery,Choose one —\n• Target player sacrifices an ar...,2.0,cond
2,Instant,Put nine +1/+1 counters on target land you con...,5.0,token
3,Instant,Up to two target creatures you control each de...,3.0,cond
4,Instant,Put a +1/+1 counter on target creature. That c...,2.0,trick
5,Sorcery,Up to one target creature gets -2/-2 until end...,3.0,cond
6,Instant,Blindblast deals 1 damage to target creature. ...,3.0,cond
7,Instant,This spell costs {3} less to cast if you contr...,4.0,misc
8,Sorcery,Tap all creatures your opponents control. Crea...,5.0,tempo
9,Sorcery,Look at the top three cards of your library. Y...,2.0,draw


In [10]:
for thing in df.text.head(10):
    print(thing+'\n')

Choose one or both —
• Return target creature card from your graveyard to your hand.
• Return target planeswalker card from your graveyard to your hand.

Choose one —
• Target player sacrifices an artifact.
• Target player sacrifices a creature.
• Target player sacrifices a planeswalker.

Put nine +1/+1 counters on target land you control. It becomes a legendary 0/0 Elemental creature with haste named Vitu-Ghazi. It's still a land.

Up to two target creatures you control each deal damage equal to their power to another target creature.

Put a +1/+1 counter on target creature. That creature gains first strike until end of turn. You gain 2 life.

Up to one target creature gets -2/-2 until end of turn. Amass 2. (Put two +1/+1 counters on an Army you control. If you don't control one, create a 0/0 black Zombie Army creature token first.)

Blindblast deals 1 damage to target creature. That creature can't block this turn.
Draw a card.

This spell costs {3} less to cast if you control a creat

# Model Development

I built my first model using just the text data from each card. I built a pipeline that used a CountVectorizer with nltk's default tokenization. I then fed that into a TfidfTransformer to generated tf-idf scores for all of the words used. Ultimately I used a LinearSVC as a classifier. I chose to use a LinearSVC because it tends to work quickly and effectively with datasets that contain many sparse features. I chose not to remove stopwords, or symbols in the preprocessing stage, as they often contain important contextual information.<br>
<br>
The initial model achieved .731 +/- 0.058 accuracy on 10-fold cross validation. This is an unbalanced 18 class problem, so I'm pleased with that initial accuracy

In [11]:
#building a pipeline
from sklearn.pipeline import Pipeline
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer, TfidfVectorizer
import matplotlib.pyplot as plt
import nltk
from sklearn.model_selection import cross_val_score
import warnings
from sklearn.exceptions import ConvergenceWarning

In [12]:
#prob redundant but just setting up my X and Ys
X=df.text
Y=df.y

In [13]:
#running and scoring a pipeline on just card text

from sklearn.svm import LinearSVC

svm_lin = Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer()),
                    ('clf', LinearSVC(class_weight='balanced'))
                   ])
scores = cross_val_score(estimator=svm_lin,
                         X=X,
                         y=Y,
                         cv=10,
                         n_jobs=1)
print('CV accuracy scores: %s' % scores)
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))


CV accuracy scores: [0.71186441 0.78285714 0.80346821 0.68235294 0.71428571 0.67857143
 0.72289157 0.69090909 0.67283951 0.85093168]
CV accuracy: 0.731 +/- 0.058


### Model Development Cont.
My next step was to look at n-grams. The categories the cards fall into are fairly nuanced. For example, 'destroy target creature' falls into the 'removal' category, 'destroy all creatures' falls into the 'sweeper' category, and 'destroy target creature with power 2 or less' falls into the 'conditional removal' category. By looking at n-grams I hope to capture more of that nuance and improve predictive accuracy.<br>
<br>
To incorporate n-grams, I used a TfidfVectorizer with ngram_range of (1,3). I tried several different ranges of ngram_range  with 10-fold cross validation and found (1,3) to be the best fit.<br>
<br>
I made a new dataframe (X2) that contained the feature names and tf-idf scores for the n-grams. I also included the cmc (converted mana cost) of the card and a binary feature instant_or_sorcery that differentiated instants and sorceries. The dataframe X2 is shown below.<br>
<br>
Again i used a linearSVC for classification. It achieved a 10-fold cross validation accuracy of .777 +/-0.055. This was a significant improvement from the .731 +/-0.058 from when n-grams, cmc, and instant_or_sorcery weren't considered.


In [14]:
#now looking at n-grams
tfidfvect=TfidfVectorizer(ngram_range=(1,3))
X2=tfidfvect.fit_transform(X,Y)

In [15]:
#making a dataframe
X2=pd.DataFrame(X2.todense(),columns=tfidfvect.get_feature_names())

In [16]:
#adding CMC to see if it helps
df['cmc'].unique()
X2['_cmc']=df['cmc']
#getting an error where a single nan appears when I set X2['_cmc']=df['cmc']
X2['_cmc'].unique()
#imputing mean CMC for the one nan
X2=X2.fillna(X2.mean())

In [17]:
# adding an instant_or_sorcery feature derived from type. 1 if instant 0 if sorcery
X2['_type']=df['type']
X2['instant_or_sorcery'] = np.where(X2['_type']=='Instant', 1, 0)
X2.drop(['_type'],axis=1,inplace=True)

In [18]:
X2

Unnamed: 0,10,10 damage,10 damage divided,10 life,10 life instead,10 or,10 or more,13,13 13,13 13 until,...,zombie in addition,zombie that,zombie that player,zombie this,zombie this turn,zombies,zombies you,zombies you control,_cmc,instant_or_sorcery
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.0,1
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,1
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1680,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1
1681,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1
1682,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0
1683,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,1


In [19]:
#trying a linearSVC with ngram_range=(1,3) and CMC and instant_or_sorcery added
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=ConvergenceWarning)
    scores = cross_val_score(estimator=LinearSVC(class_weight='balanced'),
                             X=X2,
                             y=Y,
                             cv=10,
                             n_jobs=1)
    print('CV accuracy scores: %s' % scores)
    print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))

CV accuracy scores: [0.76836158 0.82857143 0.80924855 0.75882353 0.83333333 0.71428571
 0.72891566 0.72727273 0.72222222 0.88198758]
CV accuracy: 0.777 +/- 0.055


### Model Development Cont.
Next I tried using word embeddings rather than tf-idf scores. I used gensim word2vec to generate a Continuous Bag of Words (CBOW) model and a Skip-Gram model. For each model I used hiden layer size 200, window size of 5, min count of 1, and 20 iterations. I generated an X-matrix for each using the mean of the word vectors.<br>
<br>
By using word embeddings, I was able to reduce the number of features down to 200. This meant I was no longer limited to LinearSVC for speed reasons. I set up a list of 5 classifiers that included a LinearSVC, RandomForest, GaussianNB (Naive-Bayes), rbf kernel SVC, and Logistic Regression. I then iterated through that list and tried 10-fold cross validation for each using the cbow model and the skip-gram model.<br>
<br>
Unfortunately the highest accuracy was the skip-gram using LinearSVC with .656 +/- 0.058 accuracy. 

In [20]:
# trying gensim word2vec
from gensim.models import Word2Vec
from nltk.tokenize import word_tokenize

In [21]:
#word tokenizing the documents
w2vText = [word_tokenize(doc) for doc in df['text']]
#encoding y
ydocs = np.array([list(Y.unique()).index(_) for _ in Y])

In [22]:
#generating a cbow model
cbowmodel = Word2Vec(w2vText, size=200, sg=0, window=5, min_count=1, iter=20, workers=4)
Xdocs_cbow = np.array([np.mean([cbowmodel.wv[word] for word in doc], axis=0) for doc in w2vText])
#generating a skip-gram model
sgmodel = Word2Vec(w2vText, size=200, sg=1, window=5, min_count=1, iter=20, workers=4)
Xdocs_sg = np.array([np.mean([sgmodel.wv[word] for word in doc], axis=0) for doc in w2vText])

In [23]:
#importing classifiers to try
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

In [24]:
#making some lists to iterate through for model evaluation
clfs=[LinearSVC(class_weight='balanced'),GaussianNB(),RandomForestClassifier(n_jobs=4, n_estimators=300, max_depth=10, random_state=None, class_weight='balanced'),SVC(kernel='rbf', gamma='scale', class_weight='balanced'),LogisticRegression(solver='lbfgs', multi_class='auto', max_iter =300, class_weight='balanced')]
Xes=[Xdocs_cbow,Xdocs_sg]
X_labels=['CBOW','Skip-Gram']

In [25]:
#catching convergence errors
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=ConvergenceWarning)
    #doing cbow first then skip-gram
    for Xdocs,xlab in zip(Xes,X_labels):
        #trying each classifier
        for clf in clfs:
            scores = cross_val_score(estimator=clf,
                             X=Xdocs,
                             y=ydocs,
                             cv=10,
                             n_jobs=1)
            #print('\n')
            print(xlab)
            print(clf)
            #print('CV accuracy scores: %s' % scores)
            print('CV accuracy: %.3f +/- %.3f \n' % (np.mean(scores), np.std(scores)))
        
        

CBOW
LinearSVC(C=1.0, class_weight='balanced', dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
          verbose=0)
CV accuracy: 0.614 +/- 0.071 

CBOW
GaussianNB(priors=None, var_smoothing=1e-09)
CV accuracy: 0.426 +/- 0.053 

CBOW
RandomForestClassifier(bootstrap=True, class_weight='balanced',
                       criterion='gini', max_depth=10, max_features='auto',
                       max_leaf_nodes=None, min_impurity_decrease=0.0,
                       min_impurity_split=None, min_samples_leaf=1,
                       min_samples_split=2, min_weight_fraction_leaf=0.0,
                       n_estimators=300, n_jobs=4, oob_score=False,
                       random_state=None, verbose=0, warm_start=False)
CV accuracy: 0.581 +/- 0.060 

CBOW
SVC(C=1.0, cache_size=200, class_weight='balanced', coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma=

# Conclusion
I found tf-idf to work much better with the dataset than word embedding did. The two models that used tf-idf both exceeded the best word embedding model on 10-fold cross validation. Ultimately the the model that performed best was the tf-idf model that used n-grams and included cmc and instant_or_sorcery. That model achieved 10-fold cross validation accuracy of 0.777 +/- 0.055. Considering this is an unbalanced 18-class problem, I am very pleased with that perfomance. Especially considering how much overlap and nuance exists between the classes.<br>
<br>
I think further impovements would come from labeling the data differently. I think the model would achieve higher accuracy if I reduced the number of levels in the target variable from 18 down to around 10. I could do this by combining similar categories like 'conditional removal' and 'removal' or 'pump' spells and 'combat tricks'. I could also create a 'multiple' category that included spells that combine two categories. 