## Goals:

**1. Determine drivers of sentiment in the flavor text in Magic: The Gathering cards.**

**2. Develop a model to predict the sentiment of flavor text in Magic: The Gathering cards.**

In [1]:
# imports and display options

import pandas as pd
import numpy as np
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

import prepare as p

pd.set_option('display.max_colwidth', None)  # None shows full text; -1 is deprecated since pandas 1.0

# Acquire

1. A CSV containing an up-to-date breakdown of every Magic card printed so far was obtained from MTGJSON.com. Each row represents a card or a version of a card. The dataframe contained 50,412 rows and 71 columns.

2. The CSV was read into a pandas dataframe.
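
As a rough sketch of the read step: the notebook does not show the file name or the loading code (presumably it sits inside `p.wrangle_mtg()`), so `load_cards` and the in-memory sample below are hypothetical stand-ins rather than the actual implementation.

```python
import io
import pandas as pd

def load_cards(source):
    """Read a card CSV into a DataFrame (hypothetical helper).

    `source` can be a file path or any file-like object.
    """
    # low_memory=False lets pandas infer each column's dtype from the whole file
    return pd.read_csv(source, low_memory=False)

# Tiny in-memory stand-in for the real file (made-up rows)
sample = io.StringIO(
    "name,colorIdentity,flavorText\n"
    "Card A,G,Some text\n"
    "Card B,B,\n"
)
cards = load_cards(sample)
print(cards.shape)  # (rows, columns)
```

With the real export, `cards.shape` is where the 50,412 x 71 figure would come from.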

# Prepare

1. Restricted dataframe to only the columns I considered relevant (colorIdentity, types, convertedManaCost, rarity, flavorText, isPaper)
 
2. Restricted dataframe to only rows containing cards that exist in physical form

3. Restricted dataframe to only rows containing flavor text

4. Restricted dataframe to only rows with a single color-identity

5. Merged rows with similar or overlapping types into one of the seven major game types

6. Restricted dataframe to include only rows with a single type belonging to one of the major game types

7. Cleaned up flavor text, then aggregated on flavorText in an attempt to eliminate duplicates. This gave me some success. However, it is likely that a few duplicates remain.

8. Reordered columns

9. Restricted dataframe to rows with English flavor text 

10. Dropped rows with duplicates I happened to spot

11. Added sentiment column showing compound sentiment score using VADER

12. Added intensity column showing the absolute value of the compound sentiment score 
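
The real logic lives in `prepare.py`, which is not shown here. The block below is only a minimal sketch of steps 2, 3, 11 and 12. The helper names, the assumption that `isPaper` is stored as 1/0, and the toy scorer are all mine. When no scorer is passed in, it falls back to VADER's `polarity_scores(...)["compound"]`.

```python
import pandas as pd

def keep_paper_cards_with_flavor(df):
    """Steps 2 and 3: physical cards that have flavor text (assumes isPaper is 1/0)."""
    mask = (df["isPaper"] == 1) & df["flavorText"].notna()
    return df[mask].reset_index(drop=True)

def add_sentiment(df, score_fn=None):
    """Steps 11 and 12: add `sentiment` and `intensity` columns to a copy of df."""
    if score_fn is None:
        # Imported lazily so the sketch can be tried with a stand-in scorer
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
        analyzer = SentimentIntensityAnalyzer()
        score_fn = lambda text: analyzer.polarity_scores(text)["compound"]
    out = df.copy()
    out["sentiment"] = out["flavorText"].apply(score_fn)
    out["intensity"] = out["sentiment"].abs()  # distance from neutral
    return out

# Toy rows and a toy scorer, for illustration only
raw = pd.DataFrame({
    "isPaper": [1, 0, 1, 1],
    "flavorText": ["We bury our dead.", "Online only.", None, "The wind is passing by."],
})
toy_scorer = lambda text: -0.5 if "dead" in text else 0.0
prepared = add_sentiment(keep_paper_cards_with_flavor(raw), score_fn=toy_scorer)
print(prepared)
```

Calling `add_sentiment(df)` without `score_fn` would score with VADER instead, provided the `vaderSentiment` package is installed.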

In [2]:
# load and prepare data
#df = p.prepare_mgt(p.wrangle_mtg())

In [3]:
#df.to_csv('mtgprep.csv', index=False)

In [4]:
df = pd.read_csv('mtgprep.csv')

In [5]:
df.head(5)

Unnamed: 0,colorIdentity,types,convertedManaCost,rarity,flavorText,sentiment,intensity
0,Green,Creature,5.0,common,""" . . . And the third little boar built his house out of rootwalla plates . . . .""",0.0,0.0
1,Black,Creature,1.0,common,""" . . . Cao Pi, Cao Rui, Fang, Mao, and briefly, Huan— The Sima took the empire in their turn. . . .""",0.0,0.0
2,Blue,Creature,5.0,uncommon,""" . . . When the trees bow down their heads, The wind is passing by.""",0.0,0.0
3,White,Creature,4.0,uncommon,""" . . . and you must also apply for an application license, file documents 136(iv) and 22-C and -D in triplicate, pay all requisite fees, request a . . .""",-0.1027,0.1027
4,Green,Creature,4.0,common,"""'Air superiority?' Not while our archers scan the skies.""",0.0,0.0


# Explore

In [6]:
df.shape

(12450, 7)

In [7]:
df.describe()

Unnamed: 0,convertedManaCost,sentiment,intensity
count,12450.0,12450.0,12450.0
mean,3.175382,-0.024309,0.321123
std,1.611502,0.423125,0.276581
min,0.0,-0.9792,0.0
25%,2.0,-0.34,0.0
50%,3.0,0.0,0.3182
75%,4.0,0.296,0.5423
max,15.0,0.9545,0.9792


In [8]:
df.sort_values('sentiment').head(10)

Unnamed: 0,colorIdentity,types,convertedManaCost,rarity,flavorText,sentiment,intensity
5112,Black,Creature,3.0,common,"""We mourn our dead. We shroud our dead. We bury our dead. Too often, it seems, we must kill our dead again.""",-0.9792,0.9792
2943,White,Creature,6.0,uncommon,"""No more fear. No more failure. No more death. No more!""",-0.9605,0.9605
9135,White,Enchantment,2.0,uncommon,"No one spoke. There was no need. The threat of the Eldrazi presented a simple choice: lay down your weapons and die for nothing, or hold them fast and die for something.",-0.9552,0.9552
2434,Black,Enchantment,3.0,uncommon,"""Kill a creature, destroy the present. Kill the land, destroy the future.""",-0.9545,0.9545
11432,Black,Creature,3.0,common,"There are laws against it, but the dead have no one to complain to and the living are too frightened to investigate.",-0.9505,0.9505
7752,Black,Creature,4.0,common,"Heartless killer in life, brainless killer in death.",-0.9493,0.9493
9904,Red,Creature,5.0,common,"Some tried cremating their dead to stop the ghoulcallers. But the dead returned, furious about their fate.",-0.9455,0.9455
4705,Black,Enchantment,3.0,uncommon,"""This pestilence robs us of glorious death in battle. We starve to death with full bellies and drown trying to slake our unnatural thirst.""",-0.9413,0.9413
3319,Red,Instant,1.0,common,"""Rage is a dangerous weapon. Your enemies will try to use your anger against you. Use it against them first.""",-0.9413,0.9413
1212,Red,Enchantment,3.0,common,"""Goblins charge with a deafening war cry. The cry doesn't mean anything—it just drowns out the drums!""",-0.9412,0.9412


In [9]:
df.sort_values('sentiment',ascending=False).head(10)

Unnamed: 0,colorIdentity,types,convertedManaCost,rarity,flavorText,sentiment,intensity
9887,Blue,Creature,3.0,common,"Some spectators love an underdog, but others are just as happy to support a proven winner.",0.9545,0.9545
8850,Green,Instant,2.0,common,"MORE TO LOVE: Friendly, nature-loving, Bunyonesque SEM seeks SEF looking for a huge commitment. . . . seeks atog prince",0.9426,0.9426
3853,White,Creature,2.0,common,"""The aven are heralds of divinity. The greatest glory is to join them in the sky.""",0.9274,0.9274
11773,White,Creature,2.0,rare,"To become an officer, an Icatian Soldier had to pass a series of tests. These evaluated not only fighting and leadership skills, but also integrity, honor, and moral strength.",0.9268,0.9268
5313,White,Instant,1.0,common,"""When I wish to be strong, I train. When I wish to be wise, I study. When I wish to rest, I start again.""",0.926,0.926
7968,Green,Creature,4.0,uncommon,"If you find yourself and a friend being chased by a king cheetah, you have but one chance: Trip your friend. —Suq'Ata wisdom",0.926,0.926
7967,Green,Creature,4.0,common,"If you find yourself and a friend being chased by a King Cheetah, you have but one chance: Trip your friend. —Suq'Ata wisdom",0.926,0.926
4610,Red,Enchantment,5.0,uncommon,"""They said obey and you'll be happy. They said you'll be safe. But we're not safe. We're not happy. And we will not obey.""",0.9217,0.9217
6562,White,Creature,2.0,common,"Before a woman marries in the village of Sursi, she must visit the land of the Mesa Pegasus. Legend has it that if the woman is pure of heart and her love is true, a Mesa Pegasus will appear, blessing her family with long life and good fortune.",0.9201,0.9201
6563,White,Creature,2.0,common,"Before a woman marries in the village of Sursi, she must visit the land of the mesa pegasus. Legend has it that if the woman is pure of heart and her love is true, a mesa pegasus will appear, blessing her family with long life and good fortune.",0.9201,0.9201


# To do: remove rows 7968 and 6562 (case-only duplicates of 7967 and 6563)
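
Rather than dropping rows by hand, duplicates that differ only in capitalization or spacing could be caught with a normalized key. This is a sketch on made-up rows, not the method used in `prepare.py`.

```python
import pandas as pd

# Toy rows: the first two differ only in capitalization
toy = pd.DataFrame({
    "rarity": ["common", "uncommon", "common"],
    "flavorText": [
        "Beware the King Cheetah.",
        "Beware the king cheetah.",
        "The archers scan the skies.",
    ],
})

# Normalized key: lowercase with runs of whitespace collapsed to one space
key = toy["flavorText"].str.lower().str.split().str.join(" ")

# Keep the first row of each group of matching keys
deduped = toy[~key.duplicated(keep="first")]
print(deduped)
```

Which copy survives depends on row order, and the copies can differ in other columns (rows 7967 and 7968 above have different rarities), so this choice may need more care on the real data.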

In [10]:
colors = ['White','Blue','Black','Red','Green']

for color in colors:

    number = df[df.colorIdentity==f'{color}'].sentiment.mean()
      
    print(f'{color}: {number}')

White: 0.033336136096988614
Blue: 0.009366791510611757
Black: -0.11650165790537789
Red: -0.05752559500585269
Green: 0.010246740016299917


In [11]:
colors = ['White','Blue','Black','Red','Green']

for color in colors:

    number = df[df.colorIdentity==f'{color}'].intensity.mean()
      
    print(f'{color}: {number}')

White: 0.34625189675400925
Blue: 0.2877126092384525
Black: 0.3489686211079672
Red: 0.3209056964494746
Green: 0.299823186634067


In [12]:
colors = ['White','Blue','Black','Red','Green']

for color in colors:

    number = df[df.colorIdentity==f'{color}'].intensity.median()
      
    print(f'{color}: {number}')

White: 0.3612
Blue: 0.2755
Black: 0.3612
Red: 0.3382
Green: 0.29600000000000004


In [13]:
colors = ['White','Blue','Black','Red','Green']

for color in colors:

    number = df[(df.colorIdentity == color) & (df.sentiment != 0)].sentiment.median()
      
    print(f'{color}: {number}')

White: 0.1275
Blue: 0.0442
Black: -0.2933
Red: -0.1695
Green: 0.0516
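
The per-color loops above can also be collapsed into one grouped summary. A minimal sketch on toy rows (the values are made up; the column names match the dataframe):

```python
import pandas as pd

toy = pd.DataFrame({
    "colorIdentity": ["White", "White", "Black", "Black", "Black"],
    "sentiment": [0.5, 0.0, -0.6, -0.2, 0.0],
})
toy["intensity"] = toy["sentiment"].abs()

# One row per color, one column per statistic
summary = toy.groupby("colorIdentity").agg(
    mean_sentiment=("sentiment", "mean"),
    mean_intensity=("intensity", "mean"),
    median_intensity=("intensity", "median"),
)

# Median sentiment among cards with a non-zero score, aligned on color
nonzero = toy[toy["sentiment"] != 0]
summary["median_nonzero_sentiment"] = (
    nonzero.groupby("colorIdentity")["sentiment"].median()
)
print(summary)
```

Swapping `toy` for `df` should reproduce the four sets of numbers printed by the loops, in a single table.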




In [14]:
df[df.sentiment==0].colorIdentity.value_counts()

Green    820
Blue     788
Red      734
White    639
Black    599
Name: colorIdentity, dtype: int64
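
Raw counts depend on how many cards each color has, so the share of zero-score cards per color may be easier to compare. A sketch on toy rows:

```python
import pandas as pd

toy = pd.DataFrame({
    "colorIdentity": ["White", "White", "Black", "Black", "Black"],
    "sentiment": [0.5, 0.0, -0.6, -0.2, 0.0],
})

# The mean of a boolean column is the fraction of True values
zero_share = (toy["sentiment"] == 0).groupby(toy["colorIdentity"]).mean()
print(zero_share)
```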

In [15]:
rarity = ['common','uncommon','rare','mythic']

for grade in rarity:

    number = df[df.rarity==f'{grade}'].sentiment.mean()
      
    print(f'{grade}: {number}')

common: -0.02718227058029704
uncommon: -0.016754747530186628
rare: -0.028907476979742233
mythic: -0.012133128834355836


In [16]:
rarity = ['common','uncommon','rare','mythic']

for grade in rarity:

    number = df[df.rarity==f'{grade}'].intensity.median()
      
    print(f'{grade}: {number}')

common: 0.33299999999999996
uncommon: 0.3182
rare: 0.3182
mythic: 0.3182


In [17]:
types = ['Artifact','Creature','Enchantment','Land','Planeswalker','Instant','Sorcery']

for group in types:

    number = df[df.types==f'{group}'].sentiment.mean()
      
    print(f'{group}: {number}')

Artifact: 0.036036585365853656
Creature: -0.021316346704871165
Enchantment: -0.023603877940241574
Land: -0.07022207792207792
Planeswalker: 0.5789
Instant: -0.017080009920634912
Sorcery: -0.04480777525539167


In [18]:
types = ['Artifact','Creature','Enchantment','Land','Planeswalker','Instant','Sorcery']

for group in types:

    number = df[df.types==f'{group}'].intensity.mean()
      
    print(f'{group}: {number}')

Artifact: 0.20490975609756096
Creature: 0.32212187679082915
Enchantment: 0.3315320406865859
Land: 0.1988662337662337
Planeswalker: 0.5789
Instant: 0.3155666170634929
Sorcery: 0.3221342224744609
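
Planeswalker shows the same value (0.5789) for mean sentiment and mean intensity, which is what a group with a single card, or with no negative scores, would produce. Reporting the group size next to the mean makes such cases visible. A sketch on toy rows:

```python
import pandas as pd

toy = pd.DataFrame({
    "types": ["Creature", "Creature", "Creature", "Planeswalker"],
    "sentiment": [-0.4, 0.0, 0.1, 0.6],
})

# count shows how many cards stand behind each mean
by_type = toy.groupby("types")["sentiment"].agg(["count", "mean"])
print(by_type)
```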


In [19]:
costs = [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0]

for cost in costs: 

    number = df[df.convertedManaCost == cost].sentiment.mean()

    number2 = df[df.convertedManaCost == cost].intensity.mean()
      
    print(f'{cost}: {number}  {number2}')

1.0: -0.012216839677047288  0.3213468281430217
2.0: -0.015910475199445023  0.32042133194589056
3.0: -0.023210602568324  0.32740342443200676
4.0: -0.028640524017467285  0.311132314410481
5.0: -0.034786757337151086  0.3312229778095921
6.0: -0.05863503875969003  0.3260719379844958
7.0: 0.007904918032786895  0.3175581967213113
8.0: -0.10053444444444447  0.2773433333333332
9.0: -0.12317931034482758  0.33608965517241374
10.0: 0.3361272727272727  0.3361272727272727
11.0: -0.15926  0.62946
12.0: 0.2202  0.2202
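
The `costs` list runs from 1.0 to 12.0, while `df.describe()` reports a minimum of 0.0 and a maximum of 15.0, so some cards are left out of the loop. Grouping on the column itself covers every cost that occurs. A sketch on toy rows:

```python
import pandas as pd

toy = pd.DataFrame({
    "convertedManaCost": [0.0, 1.0, 1.0, 15.0],
    "sentiment": [0.2, -0.4, 0.0, 0.5],
})
toy["intensity"] = toy["sentiment"].abs()

# Every cost present in the data gets a row, with no hard-coded list
by_cost = (
    toy.groupby("convertedManaCost")[["sentiment", "intensity"]]
    .mean()
    .sort_index()
)
print(by_cost)
```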


# Look at a frequency distribution of total cards in sentiment and intensity buckets
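
One possible way to bucket the scores is `pd.cut`. The bucket edges below are arbitrary choices for illustration, and the scores are toy values:

```python
import pandas as pd

scores = pd.Series([-0.98, -0.34, 0.0, 0.0, 0.30, 0.95], name="sentiment")

# Four equal-width buckets over VADER's compound range of -1 to 1
edges = [-1.0, -0.5, 0.0, 0.5, 1.0]
buckets = pd.cut(scores, bins=edges, include_lowest=True)

# Count per bucket, ordered from most negative to most positive
counts = buckets.value_counts().sort_index()
print(counts)
```

With these edges a score of exactly 0 lands in the (-0.5, 0.0] bucket, so neutral cards are counted together with mildly negative ones. A separate zero bucket may be worth adding. The same call on `intensity` would use edges from 0 to 1.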

# Examine frequency of positive and negative sentiment
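
A sketch of how the positive/negative split could be counted, using the cutoffs commonly suggested for VADER's compound score (0.05 and above is positive, -0.05 and below is negative). The `label` helper and the toy scores are mine:

```python
import pandas as pd

def label(score, threshold=0.05):
    """Map a compound score to 'positive', 'negative' or 'neutral'."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

scores = pd.Series([-0.98, -0.34, 0.0, 0.0, 0.30, 0.95])

freq = scores.apply(label).value_counts()                 # raw counts
share = scores.apply(label).value_counts(normalize=True)  # fractions of all cards
print(freq)
print(share)
```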