# Introduction
This work is inspired by [this paper](https://www.elie.net/publication/i-am-a-legend) by Elie and Celine Bursztein. It tries to reproduce their findings while applying some different ideas.

In [34]:
from hearthpricer import hearthpricer
import numpy
import os.path
import pandas

# Cards data

## Load the collectible cards inside JSON data

Let's start by getting the card data from [Hearthstone JSON](http://hearthstonejson.com/) and loading it into Python using the json library.

In [2]:
all_sets_filename = os.path.join('data', 'AllSets.json')

# Uncomment the following lines to update the data file
#import urllib.request
#urllib.request.urlretrieve('http://hearthstonejson.com/json/AllSets.json', all_sets_filename)

all_collectible_cards = hearthpricer.load_json(all_sets_filename)
print('# of collectible cards:', len(all_collectible_cards))

# of collectible cards: 535
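
The loading step can be sketched as follows. This is only an illustration, not the code of `hearthpricer.load_json`: the function name `load_collectible_cards`, the file layout (set names mapped to lists of card dictionaries), the `collectible` flag and the type filter are all assumptions, and the sample data is hand-written rather than taken from the real file.

```python
import json

# Assumption: only these card types are kept, matching the types printed in this notebook.
CARD_TYPES = {'Minion', 'Spell', 'Weapon'}

def load_collectible_cards(json_text):
    """Flatten AllSets-style JSON text into a list of collectible card dicts."""
    all_sets = json.loads(json_text)
    cards = []
    # Assumed layout: {set name: [card dict, card dict, ...], ...}
    for set_cards in all_sets.values():
        for card in set_cards:
            if card.get('collectible') and card.get('type') in CARD_TYPES:
                cards.append(card)
    return cards

# Tiny hand-written sample in the assumed layout (hypothetical cards, not real file contents).
sample_json = json.dumps({
    'SetA': [
        {'name': 'Some Minion', 'type': 'Minion', 'cost': 1, 'collectible': True},
        {'name': 'Some Token', 'type': 'Minion', 'cost': 1},
    ],
    'SetB': [
        {'name': 'Some Spell', 'type': 'Spell', 'cost': 2, 'collectible': True},
        {'name': 'Some Hero', 'type': 'Hero', 'collectible': True},
    ],
})
sample_cards = load_collectible_cards(sample_json)
print('# of collectible cards:', len(sample_cards))
```

Filtering at load time keeps every later cell free of token and hero entries, which would otherwise distort the statistics.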


## Card types

In [3]:
print('Card types:', ', '.join(set((x['type'] for x in all_collectible_cards))))

Card types: Minion, Spell, Weapon


## Card attributes

In [4]:
print('Card attributes:', ', '.join(set(sum((list(x.keys()) for x in all_collectible_cards), list()))))

Card attributes: cost, attack, name, mechanics, playerClass, health, type, durability, text


## Card mechanics

In [5]:
print('Card mechanics:', ', '.join(
        set(sum((x['mechanics'] for x in all_collectible_cards if 'mechanics' in x), list()))))

Card mechanics: Battlecry, Aura, Taunt, Spellpower, Silence, Enrage, Poisonous, Combo, HealTarget, Secret, Windfury, Stealth, ImmuneToSpellpower, Deathrattle, Divine Shield, Freeze, AdjacentBuff, AffectedBySpellPower, Charge


# The model
Card analysis will be done based on the following base model equation:
$$cost = \sum (attribute_i \cdot coeff_i) + intrinsic$$
The *intrinsic* value represents the cost of having *that* card in your deck and can also be viewed as the *slot cost*.
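
One common way to estimate the coefficients of such a linear model is ordinary least squares. The sketch below fits toy data with `numpy.linalg.lstsq`. Whether `hearthpricer` uses this method is an assumption, and the minions and coefficients here are made up for illustration.

```python
import numpy

# Hypothetical toy minions: each row is (attack, health).
attributes = numpy.array([[1, 1], [2, 3], [3, 2], [4, 5], [6, 7]], dtype=float)

# Made-up "true" parameters used to generate the costs of the toy minions.
true_coeffs = numpy.array([0.5, 0.4])
true_intrinsic = -0.3
cost = attributes @ true_coeffs + true_intrinsic

# Appending a column of ones lets the intrinsic value be fitted
# as one more coefficient of the linear model.
design = numpy.hstack([attributes, numpy.ones((len(attributes), 1))])
solution, _, _, _ = numpy.linalg.lstsq(design, cost, rcond=None)
coeffs, intrinsic = solution[:-1], solution[-1]

print('coeffs:', coeffs, 'intrinsic:', intrinsic)
```

Because the toy costs are generated without noise, the fit recovers the made-up parameters up to floating-point error. Real card data would leave residuals.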

## Modelling the cards
With the previous model, let's create a matrix with all the information to work with.

In [6]:
all_collectible_cards_df = pandas.DataFrame(all_collectible_cards)
all_collectible_cards_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 535 entries, 0 to 534
Data columns (total 9 columns):
attack         357 non-null float64
cost           535 non-null int64
durability     18 non-null float64
health         339 non-null float64
mechanics      279 non-null object
name           535 non-null object
playerClass    306 non-null object
text           517 non-null object
type           535 non-null object
dtypes: float64(3), int64(1), object(5)
memory usage: 31.3+ KB


## Vanilla minions modelling
To test the model, let's extract the coefficients for *attack* and *health* using only the minions with no text (vanilla minions).

In [7]:
vanilla_minions_df = pandas.DataFrame(
    all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion') &
                             (all_collectible_cards_df['text'].isnull())])
vanilla_minions_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 14 entries, 272 to 527
Data columns (total 9 columns):
attack         14 non-null float64
cost           14 non-null int64
durability     0 non-null float64
health         14 non-null float64
mechanics      0 non-null object
name           14 non-null object
playerClass    0 non-null object
text           0 non-null object
type           14 non-null object
dtypes: float64(3), int64(1), object(5)
memory usage: 840.0+ bytes


Let the pricing begin! The results are stored in a new *price* attribute on each card. The coefficients are fitted assuming that a card is worth $2 \cdot cost + 1$, but the resulting *price* is expressed on the same scale as *cost*, so the two can be compared directly.

In [8]:
vanilla_columns = ['attack', 'health']
vanilla_coeffs = hearthpricer.pricing(vanilla_minions_df, vanilla_columns, debug=True)

   intrinsic    attack    health
0  -0.501034  1.167118  0.984048
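
As a worked example, the coefficients above can be turned back into a price by evaluating the model and undoing the $2 \cdot cost + 1$ transformation. The relationship used here is inferred from the numbers in this notebook, not taken from the `hearthpricer` source, and `vanilla_price` is a hypothetical helper name.

```python
# Coefficients copied from the debug output above.
INTRINSIC = -0.501034
ATTACK_COEFF = 1.167118
HEALTH_COEFF = 0.984048

def vanilla_price(attack, health):
    """Price of a vanilla minion on the same scale as its mana cost."""
    # Evaluate the model, which is assumed to live on the 2 * cost + 1 scale.
    budget = INTRINSIC + ATTACK_COEFF * attack + HEALTH_COEFF * health
    # Undo the transformation to get back to the cost scale.
    return (budget - 1) / 2

wisp_price = vanilla_price(1, 1)   # Wisp is a 1/1
yeti_price = vanilla_price(4, 5)   # Chillwind Yeti is a 4/5
print(round(wisp_price, 4), round(yeti_price, 4))
```

Under this assumption the values agree, up to rounding of the printed coefficients, with the prices the notebook reports for *Wisp* and *Chillwind Yeti*.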


With these coefficients, we can define a *ratio* attribute that compares the computed price with the real cost, both measured net of the intrinsic value:
$$ratio = \frac{(price - intrinsic) - (cost - intrinsic)}{cost - intrinsic} = \frac{price - cost}{cost - intrinsic}$$
Then we sort the results from best to worst in terms of *ratio*.

In [9]:
intrinsic = vanilla_coeffs[0][0]
vanilla_minions_df['ratio'] = (vanilla_minions_df['price'] -  vanilla_minions_df['cost']) / \
                              (vanilla_minions_df['cost'] - intrinsic)
vanilla_minions_df[['name', 'cost', 'price', 'ratio']].sort_values('ratio', ascending=False)

|  | name | cost | price | ratio |
|---|---|---|---|---|
| 272 | Wisp | 0 | 0.325066 | 0.648789 |
| 369 | Salty Dog | 5 | 5.30249 | 0.054988 |
| 341 | Lost Tallstrider | 4 | 4.135373 | 0.030076 |
| 420 | Boulderfist Ogre | 6 | 6.195004 | 0.029996 |
| 422 | Chillwind Yeti | 4 | 4.043838 | 0.00974 |
| 426 | Core Hound | 7 | 6.961632 | -0.005115 |
| 364 | Puddlestomper | 2 | 1.984207 | -0.006315 |
| 416 | Bloodfen Raptor | 2 | 1.984207 | -0.006315 |
| 384 | Spider Tank | 3 | 2.968255 | -0.009067 |
| 527 | War Golem | 7 | 6.778562 | -0.029521 |


These results will look wrong to anyone with some experience: *Wisp* is ranked first by a wide margin, and *River Crocolisk* (a good vanilla minion) sits at the low end. But this was only an example of how the model works; more complex examples follow below.
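
The *ratio* formula can be checked by hand against two rows of the vanilla table. The snippet below only re-does the arithmetic with the printed values. The function name `price_ratio` is introduced here for illustration.

```python
def price_ratio(price, cost, intrinsic):
    """Relative gap between price and cost, both net of the intrinsic value."""
    return (price - cost) / (cost - intrinsic)

# Intrinsic value from the vanilla coefficients printed earlier.
vanilla_intrinsic = -0.501034

# Price and cost values copied from the vanilla table.
wisp_ratio = price_ratio(0.325066, 0, vanilla_intrinsic)
salty_dog_ratio = price_ratio(5.30249, 5, vanilla_intrinsic)
print(round(wisp_ratio, 4), round(salty_dog_ratio, 4))
```

The *Wisp* case shows why a 0-cost card dominates the ranking: its denominator is just the small intrinsic value, so even a modest price gap produces a large ratio.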

## Adding simple mechanics
To enrich the model, let's add simple mechanics to a minion-only matrix.

### Processing the cards mechanics
Cards have to be processed to extract the simple *mechanics* (Charge, Stealth, Windfury, Taunt, Divine Shield) from the *text*. The more complex mechanics are stored in a new *text_mechanics* attribute. Cards with unknown *mechanics* are discarded during processing.

In [10]:
all_mechanics_cards = hearthpricer.process_mechanics(all_collectible_cards)
print('# of processed cards:', len(all_mechanics_cards))

# of processed cards: 77
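
A minimal sketch of how keyword mechanics could be separated from the rest of a card's text is shown below. It is not the implementation of `hearthpricer.process_mechanics`: the helper `split_mechanics`, the assumption that keywords are wrapped in `<b>` tags, and the sentence-splitting rule are all illustrative guesses.

```python
import re

# The simple keyword mechanics named in this section.
SIMPLE_MECHANICS = ('Charge', 'Stealth', 'Windfury', 'Taunt', 'Divine Shield')

def split_mechanics(text):
    """Return (simple keyword mechanics, remaining text fragments) for a card text."""
    # Assumption: keywords may be wrapped in HTML-like tags such as <b>...</b>.
    plain = re.sub(r'</?[^>]+>', '', text or '')
    keywords, remaining = [], []
    # Treat each sentence or line as one candidate mechanic.
    for fragment in re.split(r'[.\n]', plain):
        fragment = fragment.strip()
        if not fragment:
            continue
        if fragment in SIMPLE_MECHANICS:
            keywords.append(fragment.lower())
        else:
            remaining.append(fragment)
    return keywords, remaining

print(split_mechanics('<b>Taunt</b>. <b>Divine Shield</b>'))
print(split_mechanics('At the end of each turn gain +1/+1.'))
```

Anything left in the second list would correspond to a complex mechanic that the simple model cannot price, which is why such cards are dropped by default.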


Let's price this bunch of minions as described before.

In [14]:
all_mechanics_cards_df = pandas.DataFrame(all_mechanics_cards)
mechanics_coeffs = hearthpricer.pricing(all_mechanics_cards_df, debug=True)

   intrinsic    attack    charge  deal_board_damage  deal_damage  \
0  -0.154823  1.162568  0.666579           1.129695     1.131602   

   deal_enemy_hero_damage  deal_own_hero_damage  discard_card  divine shield  \
0                 0.58311             -0.532429     -2.220561       2.476247   

     health  overload  poisonous   stealth     taunt  windfury  
0  0.919149 -1.986093    2.63774  0.708624  0.218591  0.843321  


In [17]:
all_processed_cards = hearthpricer.process_mechanics(all_collectible_cards,
                                                     discard_unknown_mechanics=False)

text_mechanics = set()
for card in all_processed_cards:
    if 'text_mechanics' in card:
        text_mechanics.add(card['text_mechanics'])
text_mechanics

{'50% chance to attack the wrong enemy',
 'ALL minions cost (1) more',
 'ALL other Murlocs have +1 Attack',
 'ALL other Murlocs have +2/+1',
 "Adjacent minions can't be targeted by spells or Hero Powers",
 'Adjacent minions have +1 Attack',
 'Adjacent minions have +2 Attack',
 'After you cast a spell deal 1 damage to ALL minions',
 'After you summon a minion deal 1 damage to a random enemy',
 'All minions have a 50% chance to attack the wrong enemy',
 'Also damages the minions next to whomever he attacks',
 "At the end of each player's turn that player draws until they have 3 cards",
 "At the end of each turn destroy this minion if it's your only one",
 'At the end of each turn gain +1/+1',
 'At the end of each turn summon all friendly minions that died this turn',
 'At the end of your turn deal 1 damage to this minion and summon a 1/1 Imp',
 'At the end of your turn deal 2 damage to ALL other characters',
 'At the end of your turn deal 2 damage to a non-Mech minion',
 'At the end of y

In [18]:
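# Note: this cell reuses the intrinsic value fitted on the vanilla minions and
# divides by (cost + intrinsic), whereas the ratio formula defined earlier
# divides by (cost - intrinsic). The table below reflects the code as written.
intrinsic = vanilla_coeffs[0][0]
all_mechanics_cards_df['ratio'] = (all_mechanics_cards_df['price'] -  all_mechanics_cards_df['cost']) / \
                                  (all_mechanics_cards_df['cost'] + intrinsic)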
all_mechanics_cards_df[['name', 'cost', 'price', 'ratio']].sort_values('ratio', ascending=False)

|  | name | cost | price | ratio |
|---|---|---|---|---|
| 74 | Voidwalker | 1 | 1.710483 | 1.423911 |
| 5 | Argent Squire | 1 | 1.701570 | 1.406049 |
| 21 | Shieldbearer | 1 | 1.698069 | 1.399032 |
| 14 | Leper Gnome | 1 | 1.627841 | 1.258285 |
| 45 | Whirling Zap-o-matic | 2 | 3.350570 | 0.901001 |
| 28 | Worgen Infiltrator | 1 | 1.399043 | 0.799740 |
| 11 | Flame Imp | 1 | 1.286945 | 0.575080 |
| 42 | Shielded Minibot | 2 | 2.742429 | 0.495294 |
| 56 | Goldshire Footman | 1 | 1.141613 | 0.283812 |
| 19 | SI:7 Agent | 3 | 3.676765 | 0.270818 |


# Some statistics

## Deathrattle minions
To estimate the expected damage from a *Scarlet Purifier*, let's see how common Deathrattle minions are.

In [41]:
minions_df = all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion')]
# Cards without mechanics hold NaN in this column, so only inspect real lists
deathrattle_minions_df = minions_df[minions_df.apply(
        lambda x: isinstance(x['mechanics'], list) and ('Deathrattle' in x['mechanics']), axis=1)]

print('Deathrattle minions population: {} ({:.2%})'.format(
        len(deathrattle_minions_df), len(deathrattle_minions_df) / len(minions_df)))

Deathrattle minions population: 32 (9.44%)
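
The population share can be turned into a rough damage estimate. *Scarlet Purifier* deals 2 damage to every minion with Deathrattle, so the sketch below multiplies the share by the board size. It rests on a strong assumption, stated here and in the code: every minion on the board is equally likely to be any collectible minion, which ignores how often each card is actually played. The helper `expected_purifier_damage` is hypothetical.

```python
# Figures from this notebook: 32 Deathrattle minions; 339 minions in total is
# inferred from the 9.44% share and the 339 cards that have a health value.
DEATHRATTLE_SHARE = 32 / 339

def expected_purifier_damage(minions_on_board, share=DEATHRATTLE_SHARE, damage_per_hit=2):
    """Expected total damage if each minion has Deathrattle with probability `share`."""
    # Linearity of expectation: add up the expected damage of each minion.
    # Assumption: board minions are uniform draws from all collectible minions.
    return damage_per_hit * minions_on_board * share

for board_size in (1, 4, 7):
    print(board_size, 'minions:', round(expected_purifier_damage(board_size), 2))
```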


## 2-cost minions
To estimate the expected minion from a *Piloted Shredder*, let's compute the mean stats of the 2-cost minions.

In [42]:
two_cost_minions_df = all_collectible_cards_df[(all_collectible_cards_df['type'] == 'Minion') &
                                               (all_collectible_cards_df['cost'] == 2)]
print('2-cost minion mean attack: {:.2f}'.format(two_cost_minions_df.attack.mean()))
print('2-cost minion mean health: {:.2f}'.format(two_cost_minions_df.health.mean()))

2-cost minion mean attack: 1.88
2-cost minion mean health: 2.45
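
The same statistic can be computed for every mana cost at once with a `groupby`. The sketch below runs on a small hand-built frame rather than on the full card data. The frame and the spell row are illustrative, and the four minions are only a sample, so the means differ from the ones above.

```python
import pandas

# Small hand-built sample; the spell row is a hypothetical placeholder.
toy_cards_df = pandas.DataFrame({
    'name': ['Bloodfen Raptor', 'River Crocolisk', 'Chillwind Yeti',
             'Boulderfist Ogre', 'Some Spell'],
    'type': ['Minion', 'Minion', 'Minion', 'Minion', 'Spell'],
    'cost': [2, 2, 4, 6, 2],
    'attack': [3, 2, 4, 6, float('nan')],
    'health': [2, 3, 5, 7, float('nan')],
})

# Keep minions only, then average attack and health per mana cost.
toy_minions_df = toy_cards_df[toy_cards_df['type'] == 'Minion']
mean_stats_df = toy_minions_df.groupby('cost')[['attack', 'health']].mean()
print(mean_stats_df)
```

Looking up a single row of the result, for example `mean_stats_df.loc[2]`, gives the expected body of a random minion summoned at that cost.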
