# Introduction
This work is inspired by [this paper](https://www.elie.net/publication/i-am-a-legend) by Elie and Celine Bursztein and tries to reproduce their findings while applying some different ideas.

In [1]:
from hearthpricer import hearthpricer
import numpy
import os.path
import pandas

# Cards data

## Load the collectible cards inside JSON data

Let's start by getting the game card data from [Hearthstone JSON](http://hearthstonejson.com/) and loading it into Python using the json library.

In [2]:
all_sets_filename = os.path.join('data', 'AllSets.json')

# Uncomment the following lines to update the data file
#import urllib.request
#urllib.request.urlretrieve('http://hearthstonejson.com/json/AllSets.json', all_sets_filename)

all_collectible_cards = hearthpricer.load_json(all_sets_filename)
print('# of collectible cards:', len(all_collectible_cards))

# of collectible cards: 566
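
The internals of `hearthpricer.load_json` are not shown here. As a rough sketch of what such a loader could do, the snippet below assumes the file maps set names to lists of card dicts carrying a `collectible` flag; the `collect_cards` helper, the sample data, and the filtering rules are all hypothetical.

```python
# Hypothetical filtering rules, inferred from the card types and attributes
# printed further down in this notebook.
PLAYABLE_TYPES = {'Minion', 'Spell', 'Weapon'}
KEPT_KEYS = ('name', 'text', 'playerClass', 'attack', 'cost', 'health',
             'type', 'mechanics', 'durability')


def collect_cards(all_sets):
    """Flatten {set_name: [card, ...]} into a list of collectible, playable cards."""
    cards = []
    for set_cards in all_sets.values():
        for card in set_cards:
            if card.get('collectible') and card.get('type') in PLAYABLE_TYPES:
                # Keep only the attributes used by the analysis.
                cards.append({k: card[k] for k in KEPT_KEYS if k in card})
    return cards


# Tiny made-up input standing in for the parsed JSON file.
sample_sets = {
    'SetA': [
        {'name': 'Wisp', 'type': 'Minion', 'cost': 0, 'attack': 1,
         'health': 1, 'collectible': True},
        {'name': 'Some Hero', 'type': 'Hero', 'health': 30, 'collectible': True},
    ],
    'SetB': [
        {'name': 'Some Token', 'type': 'Minion', 'cost': 0, 'attack': 1, 'health': 1},
    ],
}
sample_cards = collect_cards(sample_sets)
print('# of collectible cards:', len(sample_cards))
```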


## Card types

In [3]:
print('Card types:', ', '.join(set((x['type'] for x in all_collectible_cards))))

Card types: Minion, Weapon, Spell


## Card attributes

In [4]:
print('Card attributes:', ', '.join(set(sum((list(x.keys()) for x in all_collectible_cards), list()))))

Card attributes: name, text, playerClass, attack, cost, health, type, mechanics, durability


## Card mechanics

In [5]:
print('Card mechanics:', ', '.join(
        set(sum((x['mechanics'] for x in all_collectible_cards if 'mechanics' in x), list()))))

Card mechanics: Divine Shield, AdjacentBuff, Charge, Battlecry, HealTarget, Poisonous, Taunt, Combo, Secret, Freeze, Spellpower, Windfury, ImmuneToSpellpower, Stealth, Enrage, Aura, Silence, Deathrattle, AffectedBySpellPower


# The model
Card analysis is based on the following model equation:
$$cost = \sum (attribute_i \cdot coeff_i) + intrinsic$$
The *intrinsic* value represents the cost of having *that* card in your deck; it can also be viewed as the *slot_cost*.
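
To make the equation concrete, here is a minimal sketch of fitting such a model with ordinary least squares. The stats and the "true" coefficients are invented for illustration, and using `numpy.linalg.lstsq` is an assumption; the source does not say which solver `hearthpricer` uses.

```python
import numpy

# Hypothetical (attack, health) pairs, not real cards.
stats = numpy.array([[1, 1], [2, 3], [3, 2], [4, 5], [6, 7]], dtype=float)

# Made-up coefficients used to generate the toy costs: intrinsic, attack, health.
true_coeffs = numpy.array([-0.2, 0.5, 0.4])

# Prepend a column of ones so the intrinsic term is fitted as a constant.
design = numpy.column_stack([numpy.ones(len(stats)), stats])
costs = design.dot(true_coeffs)

# Solve cost = design . coeffs in the least-squares sense.
fitted, _, _, _ = numpy.linalg.lstsq(design, costs, rcond=None)
print(dict(zip(['intrinsic', 'attack', 'health'], fitted.round(3))))
```

Because the toy costs are generated exactly from the model, the fit recovers the coefficients it started from; with real cards the residuals are what make some cards look over- or under-priced.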

## Modelling the cards
With this model in mind, let's create a matrix with all the information we need to work with.

In [6]:
all_collectible_cards_df = pandas.DataFrame(all_collectible_cards)
all_collectible_cards_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 566 entries, 0 to 565
Data columns (total 9 columns):
attack         380 non-null float64
cost           566 non-null int64
durability     18 non-null float64
health         362 non-null float64
mechanics      294 non-null object
name           566 non-null object
playerClass    324 non-null object
text           548 non-null object
type           566 non-null object
dtypes: float64(3), int64(1), object(5)
memory usage: 33.2+ KB


## Vanilla minions modelling
To test the model, let's extract the coefficients for *attack* and *health* using only the minions with no text (vanilla minions).

In [7]:
vanilla_minions_df = pandas.DataFrame(
    all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion') &
                             (all_collectible_cards_df['text'].isnull())])
vanilla_minions_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 14 entries, 14 to 559
Data columns (total 9 columns):
attack         14 non-null float64
cost           14 non-null int64
durability     0 non-null float64
health         14 non-null float64
mechanics      0 non-null object
name           14 non-null object
playerClass    0 non-null object
text           0 non-null object
type           14 non-null object
dtypes: float64(3), int64(1), object(5)
memory usage: 840.0+ bytes


Let the pricing begin! Results are stored in a new *price* attribute on each card. The coeffs are computed assuming that a card is worth $2 \cdot cost + 1$, but the resulting *price* value is directly comparable to *cost*.

In [8]:
vanilla_columns = ['attack', 'health']
vanilla_coeffs = hearthpricer.pricing(vanilla_minions_df, vanilla_columns, debug=True)

   intrinsic    attack    health
0  -0.501034  1.167118  0.984048
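
As a worked example of how these coeffs turn into a *price*, the sketch below evaluates the model on the $2 \cdot cost + 1$ scale and maps the result back to the cost scale. The rescaling step `(value - 1) / 2` is an assumption about what `hearthpricer.pricing` does internally; the `vanilla_price` helper is hypothetical.

```python
# Coefficients copied from the output above.
coeffs = {'intrinsic': -0.501034, 'attack': 1.167118, 'health': 0.984048}


def vanilla_price(attack, health):
    """Evaluate the model, then undo the assumed 2*cost + 1 scaling."""
    value = (coeffs['intrinsic']
             + coeffs['attack'] * attack
             + coeffs['health'] * health)
    return (value - 1) / 2


print('Wisp (1/1): {:.6f}'.format(vanilla_price(1, 1)))
print('Chillwind Yeti (4/5): {:.6f}'.format(vanilla_price(4, 5)))
```

Under this assumption the values agree with the *price* column of the vanilla ranking in this notebook to about five decimal places, which is as close as the rounded coeffs allow.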


With these coeffs, we can define a *ratio* attribute as the relative difference between the computed price and the real cost:
$$ratio = \frac{(price - intrinsic) - (cost - intrinsic)}{cost - intrinsic} = \frac{price - cost}{cost - intrinsic}$$
Then we sort the results from best to worst in terms of *ratio*.

In [9]:
intrinsic = vanilla_coeffs[0][0]
vanilla_minions_df['ratio'] = (vanilla_minions_df['price'] -  vanilla_minions_df['cost']) / \
                              (vanilla_minions_df['cost'] - intrinsic)
# DataFrame.sort() was removed from pandas; sort_values() is its replacement
vanilla_minions_df[['name', 'cost', 'price', 'ratio']].sort_values('ratio', ascending=False)

Unnamed: 0,name,cost,price,ratio
559,Wisp,0,0.325066,0.648789
286,Salty Dog,5,5.30249,0.054988
258,Lost Tallstrider,4,4.135373,0.030076
18,Boulderfist Ogre,6,6.195004,0.029996
20,Chillwind Yeti,4,4.043838,0.00974
24,Core Hound,7,6.961632,-0.005115
14,Bloodfen Raptor,2,1.984207,-0.006315
281,Puddlestomper,2,1.984207,-0.006315
301,Spider Tank,3,2.968255,-0.009067
125,War Golem,7,6.778562,-0.029521
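
As a quick check of the ratio formula, the sketch below recomputes two rows of the table above from their *price* and *cost*, using the vanilla *intrinsic* value. The `ratio` helper is only an illustration.

```python
def ratio(price, cost, intrinsic):
    """Relative gap between price and cost, measured against cost - intrinsic."""
    return (price - cost) / (cost - intrinsic)


intrinsic = -0.501034  # vanilla intrinsic from the pricing output

print('Wisp: {:.6f}'.format(ratio(0.325066, 0, intrinsic)))
print('Salty Dog: {:.6f}'.format(ratio(5.30249, 5, intrinsic)))
```

Note how the small denominator for a 0-cost card inflates the ratio: *Wisp* gains only about 0.33 mana of value over its cost, yet its ratio is far larger than that of any other card.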


These results will look wrong to anyone with some experience: *Wisp* takes first place by a wide margin, and *River Crocolisk* (a good vanilla) is on the low end. But this was only an example of how the model works. More complex examples follow below.

## Adding simple mechanics
To enrich the model, let's add simple mechanics to a minion-only matrix.

### Processing the cards mechanics
Cards have to be processed to extract the simple *mechanics* (Charge, Stealth, Windfury, Taunt, Divine Shield) from the *text*; the remaining complex mechanics are stored in a new *text_mechanics* attribute. All cards with unknown *mechanics* are discarded when processed.

In [10]:
all_mechanics_cards = hearthpricer.process_mechanics(all_collectible_cards)
print('# of processed cards: {} ({:.2%})'.format(
        len(all_mechanics_cards), 1.0 * len(all_mechanics_cards) / len(all_collectible_cards)))

# of processed cards: 90 (15.90%)
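
The implementation of `hearthpricer.process_mechanics` is not shown in this notebook. The sketch below illustrates the general idea of separating simple keywords from the rest of a card's text; the `split_mechanics` helper and its splitting rule are hypothetical and far simpler than what real card texts require.

```python
# Simple keywords named in the text above.
KNOWN_KEYWORDS = ('Charge', 'Stealth', 'Windfury', 'Taunt', 'Divine Shield')


def split_mechanics(text):
    """Return (keywords, leftover_text) for a card text."""
    # Treat line breaks like sentence ends, then split into sentences.
    parts = [p.strip() for p in text.replace('\n', '. ').split('.')]
    parts = [p for p in parts if p]
    keywords = [p for p in parts if p in KNOWN_KEYWORDS]
    # Whatever is not a known keyword stays as free text (the complex mechanics).
    leftover = '. '.join(p for p in parts if p not in KNOWN_KEYWORDS)
    return keywords, leftover


print(split_mechanics('Taunt. Divine Shield'))
print(split_mechanics('Charge. Battlecry: Deal 1 damage.'))
```

A card whose leftover text is empty can be priced from keyword columns alone, while a non-empty leftover would correspond to the *text_mechanics* attribute.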


Let's price this bunch of minions as described before.

In [11]:
all_mechanics_cards_df = pandas.DataFrame(all_mechanics_cards)
mechanics_coeffs = hearthpricer.pricing(all_mechanics_cards_df, debug=True)

   intrinsic    attack    charge    clumsy  deal_board_damage  deal_damage  \
0  -0.035427  1.155116  0.661401 -1.931062           1.075559     1.107001   

   deal_enemy_hero_damage  deal_own_hero_damage  discard_card  divine shield  \
0                0.558085             -0.543073     -2.231999       2.427762   

    elusive    health  overload  poisonous  spell_damage   stealth     taunt  \
0  0.590709  0.910626 -2.004747   2.501755      0.865698  0.897925  0.219994   

   windfury  
0  0.786156  


In [12]:
all_processed_cards = hearthpricer.process_mechanics(all_collectible_cards,
                                                     discard_unknown_mechanics=False)

text_mechanics = dict()
for card in all_processed_cards:
    if 'text_mechanics' in card:
        text_mechanics[card['text_mechanics']] = text_mechanics.get(card['text_mechanics'], 0) + 1
sorted(((y, x) for x, y in text_mechanics.items()), reverse=True)

[(6, 'Spell Damage +1'),
 (3, 'Destroy any minion damaged by this minion'),
 (3, "Can't be targeted by spells or Hero Powers"),
 (2, 'Freeze any character damaged by this minion'),
 (2, 'Enrage: +3 Attack'),
 (2, 'Costs (1) less for each minion that died this turn'),
 (2, 'Battlecry: Silence a minion'),
 (2, 'Battlecry: Return a friendly minion from the battlefield to your hand'),
 (2, 'Battlecry: Give a minion +2 Attack this turn'),
 (2, 'Battlecry: Draw a card'),
 (2, 'Battlecry: Deal 1 damage'),
 (2, 'At the end of your turn give another random friendly minion +1 Health'),
 (2, '50% chance to attack the wrong enemy'),
 (1, 'Your spells cost (1) less'),
 (1, 'Your other minions have +1/+1'),
 (1, 'Your other minions have +1 Attack'),
 (1, 'Your other Pirates have +1/+1'),
 (1, 'Your other Demons have +2/+2 Your hero is Immune'),
 (1, 'Your other Beasts have +1 Attack'),
 (1, 'Your minions trigger their Deathrattles twice'),
 (1, 'Your minions cost (3) more'),
 (1, 'Your minions cost 

In [13]:
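# This cell reuses the vanilla intrinsic and divides by (cost + intrinsic),
# unlike the (cost - intrinsic) denominator in the ratio formula above.
intrinsic = vanilla_coeffs[0][0]
all_mechanics_cards_df['ratio'] = (all_mechanics_cards_df['price'] -  all_mechanics_cards_df['cost']) / \
                                  (all_mechanics_cards_df['cost'] + intrinsic)
# DataFrame.sort() was removed from pandas; sort_values() is its replacement
results_df = all_mechanics_cards_df[['playerClass', 'name', 'cost', 'price', 'ratio']].sort_values(
    'ratio', ascending=False)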

## Results

### 0 cost minions

In [14]:
results_df[results_df.cost == 0]

Unnamed: 0,playerClass,name,cost,price,ratio
87,,Wisp,0,0.515157,-1.028188
58,,Target Dummy,0,0.612906,-1.223283


### 1 cost minions

In [15]:
results_df[results_df.cost == 1]

Unnamed: 0,playerClass,name,cost,price,ratio
32,Warlock,Voidwalker,1,1.755775,1.514683
81,,Shieldbearer,1,1.743527,1.490136
63,,Argent Squire,1,1.729039,1.461099
73,,Leper Gnome,1,1.650801,1.3043
88,,Worgen Infiltrator,1,1.541678,1.085602
70,Warlock,Flame Imp,1,1.310977,0.623244
12,,Goldshire Footman,1,1.190465,0.381719
54,Priest,Shadowbomber,1,1.115234,0.230946
20,,Murloc Raider,1,1.092716,0.185815
9,,Elven Archer,1,1.068658,0.137601


### 2 cost minions

In [16]:
results_df[results_df.cost == 2]

Unnamed: 0,playerClass,name,cost,price,ratio
59,Shaman,Whirling Zap-o-matic,2,3.304821,0.870481
55,Paladin,Shielded Minibot,2,2.76191,0.50829
45,,Gilblin Stalker,2,2.452304,0.301744
68,,Faerie Dragon,2,2.420941,0.280821
38,,Annoy-o-Tron,2,2.404346,0.26975
37,,Unstable Ghoul,2,2.293554,0.195838
76,Rogue,Patient Assassin,2,2.214997,0.14343
1,,Bloodfen Raptor,2,2.125587,0.083782
51,,Puddlestomper,2,2.125587,0.083782
31,Warlock,Succubus,2,2.042458,0.028325


### 3 cost minions

In [17]:
results_df[results_df.cost == 3]

Unnamed: 0,playerClass,name,cost,price,ratio
79,Rogue,SI:7 Agent,3,3.687901,0.275274
53,Paladin,Scarlet Purifier,3,3.259991,0.104039
67,,Emperor Cobra,3,3.254219,0.10173
71,,Jungle Panther,3,3.152107,0.060868
57,,Spider Tank,3,3.036213,0.014491
56,Mage,Soot Spewer,3,3.013749,0.005502
15,,Ironfur Grizzly,3,2.910891,-0.035658
80,,Scarlet Crusader,3,2.884155,-0.046357
19,,Magma Rager,3,2.82539,-0.069873
85,,Thrallmar Farseer,3,2.789498,-0.084236


### 4 cost minions

In [18]:
results_df[results_df.cost == 4]

Unnamed: 0,playerClass,name,cost,price,ratio
17,Warrior,Kor'kron Elite,4,4.481259,0.137543
47,,Lost Tallstrider,4,4.191329,0.054682
41,Shaman,Dunemaul Shaman,4,4.188815,0.053963
5,,Chillwind Yeti,4,4.069084,0.019744
23,,Ogre Magi,4,4.04662,0.013324
26,,Sen'jin Shieldmasta,4,4.041511,0.011864
75,,Mogu'shan Warden,4,4.017015,0.004863
22,,Oasis Snapjaw,4,3.824594,-0.050131
82,,Silvermoon Guardian,4,3.794781,-0.058651
39,,Arcane Nullifier X-21,4,3.759308,-0.06879


### 5 cost minions

In [19]:
results_df[results_df.cost == 5]

Unnamed: 0,playerClass,name,cost,price,ratio
52,,Salty Dog,5,5.346445,0.077006
50,Rogue,Ogre Ninja,5,5.162945,0.036218
60,,Abomination,5,5.129318,0.028744
83,,Stranglethorn Tiger,5,5.095604,0.02125
66,Shaman,Earth Elemental,5,5.040554,0.009014
64,Warlock,Doomguard,5,4.978771,-0.004719
36,,Spectral Knight,5,4.819751,-0.040064
40,,Bomb Lobber,5,4.794903,-0.045588
3,,Booty Bay Bodyguard,5,4.631317,-0.081948
69,,Fen Creeper,5,4.606821,-0.087393


### 6 cost minions

In [20]:
results_df[results_df.cost == 6]

Unnamed: 0,playerClass,name,cost,price,ratio
10,Shaman,Fire Elemental,6,6.884702,0.160885
8,Warlock,Dread Infernal,6,6.217292,0.039515
4,,Boulderfist Ogre,6,6.134826,0.024518
84,,Sunwalker,6,5.83295,-0.030378
18,,Lord of the Arena,6,5.774186,-0.041065
86,,Windfury Harpy,6,5.641397,-0.065213
35,,Maexxna,6,5.530784,-0.085328
0,,Archmage,6,5.412559,-0.106828
62,,Argent Commander,6,5.239827,-0.138239
24,,Reckless Rocketeer,6,4.934204,-0.193817


### 7+ cost minions

In [21]:
results_df[results_df.cost >= 7]

Unnamed: 0,playerClass,name,cost,price,ratio
72,Hunter,King Krush,9,10.390858,0.16365
13,Druid,Ironbark Protector,8,8.625232,0.083376
74,,Malygos,9,9.420521,0.049479
6,,Core Hound,7,6.956875,-0.006636
44,,Force-Tank MAX,8,7.926265,-0.009833
33,,War Golem,7,6.712384,-0.044256
61,Shaman,Al'Akir the Windlord,8,7.426728,-0.076447
78,,Ravenholdt Assassin,7,6.250721,-0.115292


# Some statistics

## Deathrattle minions
*Scarlet Purifier* deals 2 damage to all minions with Deathrattle, so let's see how common Deathrattle minions are to get an idea of its expected damage.

In [22]:
minions_df = all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion')]
deathrattle_minions_df = minions_df[minions_df.apply(
        # Cards without mechanics hold NaN instead of a list, so check the type first
        lambda x: isinstance(x['mechanics'], list) and ('Deathrattle' in x['mechanics']), axis=1)]

print('Deathrattle minions population: {} ({:.2%})'.format(
        len(deathrattle_minions_df), 1.0 * len(deathrattle_minions_df) / len(minions_df)))

Deathrattle minions population: 33 (9.12%)


## 2-cost minions
The *Piloted Shredder* Deathrattle summons a random 2-cost minion, so let's see what stats to expect from it.

In [23]:
two_cost_minions_df = all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion') &
                                               (all_collectible_cards_df['cost'] == 2)]
print('2-cost minion mean attack: {:.2f}'.format(two_cost_minions_df.attack.mean()))
print('2-cost minion mean health: {:.2f}'.format(two_cost_minions_df.health.mean()))

2-cost minion mean attack: 1.88
2-cost minion mean health: 2.45
