# Goals:

**1. Develop a natural language processor capable of predicting the COLOR of a Magic card based on the rules text of that card.**

**2. Develop a natural language processor capable of predicting the TYPE of a Magic card based on the rules text of that card.**

In [1]:
# imports and display options
import pandas as pd
import numpy as np
import math
from math import sqrt

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, explained_variance_score

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

import unicodedata
import re
import nltk
from nltk.corpus import stopwords

import prepare as p
# import explore as e
# import model as m

# show the full text of each column (None means no width limit)
pd.set_option('display.max_colwidth', None)

# Acquire

* Reused the data file from a previous project
* A CSV containing an up-to-date breakdown of each card printed so far was obtained from MTGJSON.com
* Each row represents a card or a version of a card
* The CSV was read into a pandas dataframe
* The original dataframe contained 50,412 rows and 71 columns
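
As a rough sketch of the acquisition step above: the notebook does not state the file name or path, so a small in-memory CSV stands in for the real MTGJSON export here. The variable names `sample_csv` and `cards` are illustrative only.

```python
import io
import pandas as pd

# Stand-in for the MTGJSON export. The real file path is not given in this
# notebook, so a two-row in-memory CSV is used for illustration.
sample_csv = io.StringIO(
    "name,colorIdentity,isPaper,types,text\n"
    "Air Elemental,U,1,Creature,Flying\n"
    "Afflict,B,1,Instant,Target creature gets -1/-1 until end of turn.\n"
)

# Each row is one card (or one printing of a card).
cards = pd.read_csv(sample_csv)
print(cards.shape)  # (rows, columns)
```

With the real export, the same `pd.read_csv` call would be pointed at the downloaded file instead of the in-memory buffer.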

# Prepare (beginning)

The following steps were taken to prepare the data:

1. Restricted dataframe to relevant columns
 
2. Restricted dataframe to rows containing cards that exist in physical form

3. Restricted dataframe to rows containing a flavor text

4. Restricted dataframe to rows with a single 'color identity' (see data dictionary Color)

5. Merged rows containing multiple similar types into one of the seven major game types

6. Dropped rows containing multiple types that could not be merged

7. Cleaned up flavor text by removing quote attributions so that rows could be matched on flavor text, eliminating most of the duplicates

8. Reordered columns

9. Dropped rows where the flavor text was not in English

10. Dropped duplicate rows (all that could be found; some duplicates may remain)

11. Added sentiment column showing compound sentiment score using VADER

12. Added intensity column showing the absolute value of sentiment

13. Renamed columns

14. Rounded numeric values in the dataframe to two decimals

15. Wrote prepared data to ‘mtgprep.csv’ for ease of access
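
A few of the steps listed above (10, 12 and 14) can be sketched with plain pandas. This is a minimal illustration, not the contents of `prepare.py`: the column names `flavor_text`, `sentiment` and `intensity` are assumed, and the sentiment values are made up rather than computed with VADER.

```python
import pandas as pd

# Toy rows; the sentiment values are invented for illustration only.
toy = pd.DataFrame({
    "name": ["Card A", "Card A", "Card B"],
    "flavor_text": ["same text", "same text", "other text"],
    "sentiment": [-0.4567, -0.4567, 0.1234],
})

# Step 10: drop rows that share the same flavor text.
toy = toy.drop_duplicates(subset="flavor_text").reset_index(drop=True)

# Step 12: intensity is the absolute value of the sentiment score.
toy["intensity"] = toy["sentiment"].abs()

# Step 14: round numeric columns to two decimals.
toy = toy.round(2)
print(toy)
```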

In [2]:
df = p.wrangle_mtg()

In [3]:
df.shape

(50412, 71)

In [4]:
df.columns

Index(['index', 'id', 'artist', 'borderColor', 'colorIdentity',
       'colorIndicator', 'colors', 'convertedManaCost', 'duelDeck',
       'edhrecRank', 'faceConvertedManaCost', 'flavorText', 'frameEffect',
       'frameEffects', 'frameVersion', 'hand', 'hasFoil', 'hasNoDeckLimit',
       'hasNonFoil', 'isAlternative', 'isArena', 'isBuyABox', 'isDateStamped',
       'isFullArt', 'isMtgo', 'isOnlineOnly', 'isOversized', 'isPaper',
       'isPromo', 'isReprint', 'isReserved', 'isStarter', 'isStorySpotlight',
       'isTextless', 'isTimeshifted', 'layout', 'leadershipSkills', 'life',
       'loyalty', 'manaCost', 'mcmId', 'mcmMetaId', 'mtgArenaId', 'mtgoFoilId',
       'mtgoId', 'multiverseId', 'name', 'names', 'number', 'originalText',
       'originalType', 'otherFaceIds', 'power', 'printings', 'purchaseUrls',
       'rarity', 'scryfallId', 'scryfallIllustrationId', 'scryfallOracleId',
       'setCode', 'side', 'subtypes', 'supertypes', 'tcgplayerProductId',
       'text', 'toughness', 

In [5]:
df = df[['name','colorIdentity','isPaper','types','text']]

df = df[df.isPaper==1]
df = df.drop(columns='isPaper')

df = df[df.text.notna()]

In [6]:
df.head(5)

Unnamed: 0,name,colorIdentity,types,text
0,Abundance,G,Enchantment,"If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order."
1,Academy Researchers,U,Creature,"When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers."
2,Adarkar Wastes,"U,W",Land,{T}: Add {C}.\n{T}: Add {W} or {U}. Adarkar Wastes deals 1 damage to you.
3,Afflict,B,Instant,Target creature gets -1/-1 until end of turn.\nDraw a card.
4,Aggressive Urge,G,Instant,Target creature gets +1/+1 until end of turn.\nDraw a card.


In [7]:
df.text

0        If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order.                                                
1        When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers.                                                                                                                                                                               
2        {T}: Add {C}.\n{T}: Add {W} or {U}. Adarkar Wastes deals 1 damage to you.                                                                                                                                                                                                                                                    
3        Target cre

In [8]:
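def symble_to_word(text):
    '''
    replaces tap and mana symbols in rules text with words
    only {T}, {C}, {W} and {B} are handled; other symbols are left as-is
    '''

    text = text.replace("{T}","Tap")
    text = text.replace("{C}","ColorlessMana")
    text = text.replace("{W}","WhiteMana")
    text = text.replace("{B}","BlackMana")

    return text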


In [9]:
df["text"] = df.text.apply(symble_to_word)
df.head(10)

Unnamed: 0,name,colorIdentity,types,text
0,Abundance,G,Enchantment,"If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order."
1,Academy Researchers,U,Creature,"When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers."
2,Adarkar Wastes,"U,W",Land,Tap: Add ColorlessMana.\nTap: Add WhiteMana or {U}. Adarkar Wastes deals 1 damage to you.
3,Afflict,B,Instant,Target creature gets -1/-1 until end of turn.\nDraw a card.
4,Aggressive Urge,G,Instant,Target creature gets +1/+1 until end of turn.\nDraw a card.
5,Agonizing Memories,B,Sorcery,Look at target player's hand and choose two cards from it. Put them on top of that player's library in any order.
6,Air Elemental,U,Creature,Flying
7,Air Elemental,U,Creature,Flying
8,Ambassador Laquatus,U,Creature,{3}: Target player puts the top three cards of their library into their graveyard.
9,Anaba Bodyguard,R,Creature,First strike (This creature deals combat damage before creatures without first strike.)
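
The output above still contains `{U}` (row 2) and `{3}` (row 8), because only four symbols are replaced. One possible way to cover the remaining color symbols is a lookup table plus a regular expression. This is a sketch under assumptions: the names `SYMBOL_WORDS` and `symbols_to_words` are hypothetical, and the words chosen for `{U}`, `{R}` and `{G}` simply follow the pattern of the existing four. Applying it would change the outputs shown in the rest of this notebook.

```python
import re

# Hypothetical extension of the mapping. The entries for U, R and G follow
# the naming pattern of the original four and are assumptions.
SYMBOL_WORDS = {
    "T": "Tap",
    "C": "ColorlessMana",
    "W": "WhiteMana",
    "U": "BlueMana",
    "B": "BlackMana",
    "R": "RedMana",
    "G": "GreenMana",
}

def symbols_to_words(text):
    """Replace known single-letter {X} symbols; leave anything else untouched."""
    return re.sub(
        r"\{([A-Z])\}",
        lambda m: SYMBOL_WORDS.get(m.group(1), m.group(0)),
        text,
    )

print(symbols_to_words("{T}: Add {W} or {U}."))
```

Numeric symbols such as `{3}` are deliberately left alone in this sketch.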


In [10]:
def get_ASCII(article):
    '''
    normalizes a string into ASCII characters
    '''

    article = (
        unicodedata.normalize('NFKD', article)
        .encode('ascii', 'ignore')
        .decode('utf-8', 'ignore')
    )
    
    return article

In [11]:

def purge_non_characters(article):
    '''
    replaces every character other than lowercase a-z and whitespace with a space
    expects input that has already been lowercased
    '''
    
    article = re.sub(r"[^a-z\s]", ' ', article)
    
    return article

In [12]:
def basic_clean(article):
    '''
    calls child functions to perform basic cleaning on a string
    converts string to lowercase, ASCII characters,
    and eliminates special characters
    '''
    # lowercases letters
    article = article.lower()

    # convert to ASCII characters
    article = get_ASCII(article)

    # remove non characters
    article = purge_non_characters(article)
    
    return article
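
To make the effect of these three steps concrete, here is a small self-contained sketch that inlines them. `basic_clean_demo` is a hypothetical name used so the example runs without the cells above. The order matters: lowercasing has to come first, because the regular expression only keeps `a-z` and whitespace.

```python
import re
import unicodedata

def basic_clean_demo(text):
    # Same three steps as basic_clean, inlined so this example runs alone.
    text = text.lower()
    text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("utf-8", "ignore")
    )
    # Digits, punctuation and symbols each become a single space.
    return re.sub(r"[^a-z\s]", " ", text)

cleaned = basic_clean_demo("Target creature gets -1/-1 until end of turn.\nDraw a card.")
print(cleaned.split())
```

The `-1/-1` modifier disappears entirely, so cards that differ only in numbers end up with identical cleaned text.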

In [13]:
def remove_stopwords(article, extra_words=None, exclude_words=None):
    '''
    removes stopwords from a string
    words in extra_words are removed from the stopword list (they stay in the text)
    words in exclude_words are added to the stopword list (they are dropped from the text)
    '''

    # create stopword set using english
    stopword_set = set(stopwords.words('english') + ["a", "aa","aaa","aaaa","aaaaa","aaaaaa"])

    # remove words in extra_words from stopword set
    stopword_set -= set(extra_words or [])

    # add words in exclude_words to stopword set
    stopword_set |= set(exclude_words or [])

    # split article into list of words
    words = article.split()

    # remove words in stopwords from list of words
    filtered_words = [w for w in words if w not in stopword_set]

    # rejoin list of words into article
    article_without_stopwords = ' '.join(filtered_words)

    return article_without_stopwords
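
The filtering logic itself can be tried without downloading the NLTK corpus. In the sketch below, `demo_stopwords` is a three-word stand-in (an assumption for this example only) and `filter_words` is a hypothetical helper; the notebook itself uses `stopwords.words('english')`.

```python
# Stand-in stopword set, assumed for this example; the notebook uses
# nltk.corpus.stopwords.words('english') instead.
demo_stopwords = {"until", "of", "a"}

def filter_words(text, stopword_set):
    """Drop every whitespace-separated word that appears in stopword_set."""
    return " ".join(w for w in text.split() if w not in stopword_set)

result = filter_words(
    "target creature gets until end of turn draw a card", demo_stopwords
)
print(result)
```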

In [14]:
def lemmatize(article):
    '''
    lemmatizes words in a string
    '''

    # create lemmatize object
    wnl = nltk.stem.WordNetLemmatizer()
    
    # split article into list of words and lemmatize each word
    lemmas = [wnl.lemmatize(word) for word in article.split()]

    #  join words in list into a string
    article_lemmatized = ' '.join(lemmas)
    
    return article_lemmatized

In [15]:
# create column applying the basic_clean, remove_stopwords and lemmatize functions
df['text_lemmatized'] = df.text.apply(basic_clean).apply(remove_stopwords).apply(lemmatize)

In [16]:
df.text_lemmatized

0        would draw card may instead choose land nonland reveal card top library reveal card chosen kind put card hand put card revealed way bottom library order                                                                                      
1        academy researcher enters battlefield may put aura card hand onto battlefield attached academy researcher                                                                                                                                     
2        tap add colorlessmana tap add whitemana u adarkar waste deal damage                                                                                                                                                                           
3        target creature get end turn draw card                                                                                                                                                                                                        
4       

In [17]:
# use only cards with a single color identity 
colors = ['W','U','B','R','G']
df_color = df.loc[df.colorIdentity.isin(colors)]
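
A small illustration of what this filter keeps, using a toy frame (the last row is a made-up colorless card). Because `isin` compares whole values, a multi-color identity such as `"U,W"` is excluded, and so is a missing identity.

```python
import pandas as pd

# Toy frame; "Colorless Example" is an invented row with no color identity.
toy = pd.DataFrame({
    "name": ["Abundance", "Adarkar Wastes", "Afflict", "Colorless Example"],
    "colorIdentity": ["G", "U,W", "B", None],
})

colors = ["W", "U", "B", "R", "G"]

# Keep only rows whose colorIdentity is exactly one of the five colors.
single_color = toy.loc[toy.colorIdentity.isin(colors)]
print(single_color.name.tolist())
```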

In [18]:
df['types'] = np.where(df['types'] == 'Tribal,Instant', 'Instant', df['types'])

df['types'] = np.where(df['types'] == 'Tribal,Sorcery', 'Sorcery', df['types'])

df['types'] = np.where(df['types'] == 'Tribal,Enchantment', 'Enchantment', df['types'])

df['types'] = np.where(df['types'] == 'instant', 'Instant', df['types'])

# remove remaining cards that are not exclusive to one of the seven card types
types = ['Creature','Instant','Sorcery','Enchantment','Land','Artifact','Planeswalker']
df_type = df.loc[df.types.isin(types)]
# display without colorIdentity (drop returns a copy, so df_type itself keeps the column)
df_type.drop(columns='colorIdentity')

Unnamed: 0,name,types,text,text_lemmatized
0,Abundance,Enchantment,"If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order.",would draw card may instead choose land nonland reveal card top library reveal card chosen kind put card hand put card revealed way bottom library order
1,Academy Researchers,Creature,"When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers.",academy researcher enters battlefield may put aura card hand onto battlefield attached academy researcher
2,Adarkar Wastes,Land,Tap: Add ColorlessMana.\nTap: Add WhiteMana or {U}. Adarkar Wastes deals 1 damage to you.,tap add colorlessmana tap add whitemana u adarkar waste deal damage
3,Afflict,Instant,Target creature gets -1/-1 until end of turn.\nDraw a card.,target creature get end turn draw card
4,Aggressive Urge,Instant,Target creature gets +1/+1 until end of turn.\nDraw a card.,target creature get end turn draw card
...,...,...,...,...
50407,Windborne Charge,Sorcery,Two target creatures you control each get +2/+2 and gain flying until end of turn.,two target creature control get gain flying end turn
50408,Windrider Eel,Creature,"Flying\nLandfall — Whenever a land enters the battlefield under your control, Windrider Eel gets +2/+2 until end of turn.",flying landfall whenever land enters battlefield control windrider eel get end turn
50409,World Queller,Creature,"At the beginning of your upkeep, you may choose a card type. If you do, each player sacrifices a permanent of that type.",beginning upkeep may choose card type player sacrifice permanent type
50410,Zektar Shrine Expedition,Enchantment,"Landfall — Whenever a land enters the battlefield under your control, you may put a quest counter on Zektar Shrine Expedition.\nRemove three quest counters from Zektar Shrine Expedition and sacrifice it: Create a 7/1 red Elemental creature token with trample and haste. Exile it at the beginning of the next end step.",landfall whenever land enters battlefield control may put quest counter zektar shrine expedition remove three quest counter zektar shrine expedition sacrifice create red elemental creature token trample haste exile beginning next end step
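
The four `np.where` calls in the cell above could also be expressed as a single lookup table. This is an alternative sketch, not the notebook's code: `merge_map` is a hypothetical name, and the toy frame (including the `"Artifact,Creature"` row) is illustrative. `Series.replace` with a dict swaps whole values only, so combined types that are not in the map pass through unchanged and are then filtered out.

```python
import pandas as pd

toy = pd.DataFrame(
    {"types": ["Tribal,Instant", "instant", "Creature", "Artifact,Creature"]}
)

# Hypothetical lookup table covering the same four cases as the np.where calls.
merge_map = {
    "Tribal,Instant": "Instant",
    "Tribal,Sorcery": "Sorcery",
    "Tribal,Enchantment": "Enchantment",
    "instant": "Instant",
}
toy["types"] = toy["types"].replace(merge_map)

# Keep only rows that now belong to exactly one of the seven card types.
types = ["Creature", "Instant", "Sorcery", "Enchantment", "Land", "Artifact", "Planeswalker"]
kept = toy.loc[toy.types.isin(types)]
print(kept.types.tolist())
```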


In [19]:
df_color.head()

Unnamed: 0,name,colorIdentity,types,text,text_lemmatized
0,Abundance,G,Enchantment,"If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order.",would draw card may instead choose land nonland reveal card top library reveal card chosen kind put card hand put card revealed way bottom library order
1,Academy Researchers,U,Creature,"When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers.",academy researcher enters battlefield may put aura card hand onto battlefield attached academy researcher
3,Afflict,B,Instant,Target creature gets -1/-1 until end of turn.\nDraw a card.,target creature get end turn draw card
4,Aggressive Urge,G,Instant,Target creature gets +1/+1 until end of turn.\nDraw a card.,target creature get end turn draw card
5,Agonizing Memories,B,Sorcery,Look at target player's hand and choose two cards from it. Put them on top of that player's library in any order.,look target player hand choose two card put top player library order


In [20]:
df_type.head()

Unnamed: 0,name,colorIdentity,types,text,text_lemmatized
0,Abundance,G,Enchantment,"If you would draw a card, you may instead choose land or nonland and reveal cards from the top of your library until you reveal a card of the chosen kind. Put that card into your hand and put all other cards revealed this way on the bottom of your library in any order.",would draw card may instead choose land nonland reveal card top library reveal card chosen kind put card hand put card revealed way bottom library order
1,Academy Researchers,U,Creature,"When Academy Researchers enters the battlefield, you may put an Aura card from your hand onto the battlefield attached to Academy Researchers.",academy researcher enters battlefield may put aura card hand onto battlefield attached academy researcher
2,Adarkar Wastes,"U,W",Land,Tap: Add ColorlessMana.\nTap: Add WhiteMana or {U}. Adarkar Wastes deals 1 damage to you.,tap add colorlessmana tap add whitemana u adarkar waste deal damage
3,Afflict,B,Instant,Target creature gets -1/-1 until end of turn.\nDraw a card.,target creature get end turn draw card
4,Aggressive Urge,G,Instant,Target creature gets +1/+1 until end of turn.\nDraw a card.,target creature get end turn draw card


In [21]:
df.shape

(43937, 5)

In [22]:
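# re-check for missing rules text (already filtered in an earlier cell, so the shape does not change)
df = df[df.text.notna()]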

In [23]:
df.shape

(43937, 5)