# More modeling

After a cleanup of the code from notebook 10, let's build some more features into our model.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pandas as pd
import statsmodels.api as sm
pd.options.display.max_columns = 50

In [3]:
from terra_mystica_models.features.model_data import player_level_df
from terra_mystica_models.models import train_model

## Setup and validate refactor

In [4]:
predict_df = player_level_df()

Start by re-running the model from the previous notebook, just to make sure nothing weird happened during the refactor.

In [5]:
simple_model = train_model.simple_model(predict_df)
simple_model.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | vp_margin | R-squared: | 0.032 |
| Model: | OLS | Adj. R-squared: | 0.032 |
| Method: | Least Squares | F-statistic: | 464.2 |
| Date: | Sun, 29 Mar 2020 | Prob (F-statistic): | 0.0 |
| Time: | 11:38:31 | Log-Likelihood: | -846390.0 |
| No. Observations: | 198960 | AIC: | 1693000.0 |
| Df Residuals: | 198945 | BIC: | 1693000.0 |
| Df Model: | 14 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -3.7683 | 0.220 | -17.166 | 0.000 | -4.199 | -3.338 |
| player_num | -0.3301 | 0.035 | -9.372 | 0.000 | -0.399 | -0.261 |
| faction_auren | -0.2274 | 0.315 | -0.723 | 0.470 | -0.844 | 0.389 |
| faction_chaosmagicians | 4.0712 | 0.233 | 17.444 | 0.000 | 3.614 | 4.529 |
| faction_cultists | 6.7553 | 0.257 | 26.303 | 0.000 | 6.252 | 7.259 |
| faction_darklings | 9.7877 | 0.225 | 43.527 | 0.000 | 9.347 | 10.228 |
| faction_dwarves | 3.3857 | 0.284 | 11.904 | 0.000 | 2.828 | 3.943 |
| faction_engineers | 6.3924 | 0.240 | 26.622 | 0.000 | 5.922 | 6.863 |
| faction_fakirs | -4.2379 | 0.396 | -10.692 | 0.000 | -5.015 | -3.461 |

| | | | |
|---|---|---|---|
| Omnibus: | 1026.977 | Durbin-Watson: | 2.495 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 1251.219 |
| Skew: | -0.107 | Prob(JB): | 2.00e-272 |
| Kurtosis: | 3.325 | Cond. No. | 59.4 |


OK, that's always nice to see: the refactor didn't mess things up. Let's move on to test some more assumptions and then build the model that will actually identify what we care about.

## Check some assumptions

The thing we actually care about here is how the presence of various bonus or scoring tiles impacts our faction preference. To accomplish this I'm going to try to estimate the marginal impact of specific scoring/bonus tiles conditional on the player being a specific faction. I have a binary column identifying whether or not a player is playing each faction, and a binary column for whether a specific scoring or bonus tile is present (and in the case of scoring tiles I can further narrow that down to which turn it's present for). Finally, I've generated interaction terms that are true when both the faction and the tile condition are met. For example, if a player is the Swarmlings and bonus tile 1 is present then the column "faction_swarmlings_x_BON1" will be true. If either of those conditions is not met then it will be false. The coefficient assigned to that column is the estimated marginal impact on score of that combination.

In the case of factions we have pretty convincing evidence from the initial regression that we'll need to consider their impact on their own, on top of the interaction impact. As an example, say we find that the coefficient for "faction_fakirs_x_BON2" is 2, and that this is greater than the BON2 interaction coefficient for any other faction. This still doesn't mean that you should pick the Fakirs when BON2 is present. Their base penalty (about -4.2 in the regression above) is such that they're still quite possibly a poor choice. What we really want is to find which faction has the highest marginal score based on its base level plus its interaction coefficients for a given game setup.
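The interaction columns described above could be built along these lines. This is a minimal sketch on a toy frame, not the code in `terra_mystica_models`; the helper name `add_interactions` and the toy data are assumptions, and only the `faction_..._x_BON1` naming pattern comes from the text.

```python
import pandas as pd


def add_interactions(df, faction_cols, tile_cols):
    """Return a copy of df with one faction-by-tile indicator per pair.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    for faction in faction_cols:
        for tile in tile_cols:
            # True only when the player is this faction AND the tile is in the game
            out[f"{faction}_x_{tile}"] = out[faction] & out[tile]
    return out


# Toy data: four players, two faction indicators, one bonus tile indicator
toy = pd.DataFrame({
    "faction_swarmlings": [True, False, True, False],
    "faction_fakirs": [False, True, False, False],
    "BON1": [True, True, False, False],
})
toy_x = add_interactions(toy, ["faction_swarmlings", "faction_fakirs"], ["BON1"])
print(toy_x["faction_swarmlings_x_BON1"].tolist())  # [True, False, False, False]
```

Only the first row has both the Swarmlings indicator and BON1 set, so it is the only row where the interaction column is true.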

So based on this I clearly need to include the standalone faction indicators in my regression in addition to their interactions. However, I don't think I should have to include the standalone indicators for the scoring or bonus tiles. Because those are the same for every player in a game, any advantage they give to one player should be offset by a net disadvantage across the other three players. Therefore their overall marginal impact on victory point margin should be zero. Let's run a regression to make sure that is indeed the case.
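The offsetting argument can be checked on synthetic data. The sketch below assumes that the margin is a player's score minus the mean score of their game, which may not be exactly how `vp_margin` is defined in this project; the data is randomly generated and the tile boost of 10 points is invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_games, players_per_game = 500, 4

# Synthetic raw scores for every player in every game
scores = rng.normal(100, 15, size=(n_games, players_per_game))
# A game-level tile indicator, identical for all four players in a game
tile_present = rng.integers(0, 2, size=n_games)
# The tile raises everyone's raw score in that game by the same amount
scores += 10 * tile_present[:, None]

# ASSUMPTION: margin = own score minus the game's mean score
margin = (scores - scores.mean(axis=1, keepdims=True)).ravel()
tile_col = np.repeat(tile_present, players_per_game)

# OLS of margin on a constant and the game-level tile indicator
X = np.column_stack([np.ones_like(margin), tile_col])
coef, *_ = np.linalg.lstsq(X, margin, rcond=None)
print(coef)  # both entries are zero up to floating-point error
```

Under that definition the margins sum to zero within every game, so a regressor that is constant within a game cannot pick up any effect on its own. Only its interaction with something that varies between players, such as faction, can.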

In [6]:
assumption_test_model = train_model.base_score_model(predict_df)
assumption_test_model.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | vp_margin | R-squared: | 0.032 |
| Model: | OLS | Adj. R-squared: | 0.032 |
| Method: | Least Squares | F-statistic: | 211.1 |
| Date: | Sun, 29 Mar 2020 | Prob (F-statistic): | 0.0 |
| Time: | 11:38:32 | Log-Likelihood: | -846370.0 |
| No. Observations: | 198960 | AIC: | 1693000.0 |
| Df Residuals: | 198928 | BIC: | 1693000.0 |
| Df Model: | 31 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -3.8976 | 0.714 | -5.458 | 0.000 | -5.297 | -2.498 |
| player_num | -0.3276 | 0.035 | -9.301 | 0.000 | -0.397 | -0.259 |
| faction_auren | -0.2148 | 0.315 | -0.682 | 0.495 | -0.832 | 0.403 |
| faction_chaosmagicians | 4.1234 | 0.234 | 17.643 | 0.000 | 3.665 | 4.581 |
| faction_cultists | 6.8308 | 0.257 | 26.548 | 0.000 | 6.326 | 7.335 |
| faction_darklings | 9.8812 | 0.226 | 43.799 | 0.000 | 9.439 | 10.323 |
| faction_dwarves | 3.4167 | 0.285 | 11.984 | 0.000 | 2.858 | 3.976 |
| faction_engineers | 6.5009 | 0.241 | 27.000 | 0.000 | 6.029 | 6.973 |
| faction_fakirs | -4.1955 | 0.397 | -10.575 | 0.000 | -4.973 | -3.418 |

| | | | |
|---|---|---|---|
| Omnibus: | 1049.853 | Durbin-Watson: | 2.496 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 1277.21 |
| Skew: | -0.109 | Prob(JB): | 4.54e-278 |
| Kurtosis: | 3.326 | Cond. No. | 88.6 |


OK, almost all of the bonus and scoring tiles have individual coefficients that are not statistically significantly different from zero (look at the last two columns for the confidence interval: when an interval contains 0 we can say that coefficient is not significant). SCORE9 is the one exception, which we'll come back to below. I can do a more formal test to determine if they're jointly significant or not, just to be safe.
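The "does the interval contain zero" check can be done programmatically rather than by eye. The sketch below hard-codes four confidence intervals copied from the summary above, since the tile rows are not shown there; on a fitted statsmodels result the same bounds should be available from `conf_int()`. The column names `lower` and `upper` are my own labels.

```python
import pandas as pd

# 95% confidence bounds copied from the regression summary above
ci = pd.DataFrame(
    {
        "lower": [-0.397, -0.832, 3.665, -4.973],
        "upper": [-0.259, 0.403, 4.581, -3.418],
    },
    index=["player_num", "faction_auren", "faction_chaosmagicians", "faction_fakirs"],
)

# A coefficient is not individually significant at the 5% level
# when its 95% interval straddles zero
ci["contains_zero"] = (ci["lower"] <= 0) & (ci["upper"] >= 0)
not_significant = ci.index[ci["contains_zero"]].tolist()
print(not_significant)  # ['faction_auren']
```

Of these four rows only `faction_auren` has an interval that straddles zero, which matches its p-value of 0.495 in the table.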

For comparison, let's first test the assumption that faction choice has no impact on marginal points.

In [7]:
# Standalone faction indicator columns to test jointly
factions = [
    'faction_auren', 'faction_chaosmagicians', 'faction_cultists',
    'faction_darklings', 'faction_dwarves', 'faction_engineers',
    'faction_fakirs', 'faction_giants', 'faction_halflings',
    'faction_mermaids', 'faction_nomads', 'faction_swarmlings',
    'faction_witches',
]
hypotheses = ", ".join(f"({faction} = 0)" for faction in factions)
f_test = assumption_test_model.f_test(hypotheses)
print(f_test)

<F test: F=array([[467.46892028]]), p=0.0, df_denom=1.99e+05, df_num=13>


For this one we can reject the null hypothesis, which was that all of the faction coefficients are jointly equal to 0.
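To make the joint test less of a black box, here is a sketch of the same kind of F statistic computed by hand from restricted and unrestricted sums of squared residuals. With the default nonrobust covariance this should match what `f_test` reports for "these coefficients are all zero", but treat that as my understanding rather than a guarantee. The data is synthetic and the helper names `ssr` and `joint_f` are hypothetical.

```python
import numpy as np


def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_f(X_unrestricted, drop_cols, y):
    """F statistic for H0: the coefficients on drop_cols are all zero."""
    X_restricted = np.delete(X_unrestricted, drop_cols, axis=1)
    q = len(drop_cols)  # number of restrictions (df_num)
    df_denom = X_unrestricted.shape[0] - X_unrestricted.shape[1]
    ssr_u = ssr(X_unrestricted, y)
    ssr_r = ssr(X_restricted, y)
    return ((ssr_r - ssr_u) / q) / (ssr_u / df_denom)


rng = np.random.default_rng(1)
n = 2000
x_real = rng.normal(size=n)        # truly affects y
x_noise = rng.normal(size=(n, 3))  # three regressors whose true coefficient is 0
y = 1.0 + 2.0 * x_real + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x_real, x_noise])

f_real = joint_f(X_full, [1], y)         # large: dropping x_real hurts the fit a lot
f_noise = joint_f(X_full, [2, 3, 4], y)  # small: dropping noise barely matters
print(f_real, f_noise)
```

A large F means the restricted model fits much worse, so the null is rejected; an F close to 1 means the dropped columns were not adding much.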

OK, now let's test the ones we care about: the bonus and scoring tiles.

In [8]:
hypotheses = ", ".join(f"(BON{i} = 0)" for i in range(2, 11))
f_test = assumption_test_model.f_test(hypotheses)
print(f_test)

<F test: F=array([[1.13040284]]), p=0.3366194633898626, df_denom=1.99e+05, df_num=9>


In [9]:
hypotheses = ", ".join(f"(SCORE{i} = 0)" for i in range(2, 10))
f_test = assumption_test_model.f_test(hypotheses)
print(f_test)

<F test: F=array([[4.03661717]]), p=8.260628182910269e-05, df_denom=1.99e+05, df_num=8>


Interesting. As we'd expect, the bonus tiles don't have a detectable impact on victory point margin, but the scoring tiles seem to have a very small but statistically significant one. I'm not totally sure what to make of this, but the effect is still quite small, so I think at least for a first pass we can still exclude the base scoring tile features. As one final test: it looks like SCORE9 is the only one that's individually significant, so let's see what happens if I exclude it from the test.

In [10]:
hypotheses = ", ".join(f"(SCORE{i} = 0)" for i in range(2, 9))
f_test = assumption_test_model.f_test(hypotheses)
print(f_test)

<F test: F=array([[0.60436588]]), p=0.752870331313469, df_denom=1.99e+05, df_num=7>


OK, so it is just SCORE9. I remember that being an expansion tile, but I'm still not sure how that would produce a difference. Worth keeping in mind though.
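The hypothesis strings in the cells above all follow one pattern, so a small helper makes it explicit which tiles are in or out of a given test. This is only a sketch; `zero_hypotheses` is a hypothetical name, not something in `train_model`.

```python
def zero_hypotheses(prefix, numbers):
    """Build an f_test-style string asserting each named coefficient equals 0."""
    return ", ".join(f"({prefix}{i} = 0)" for i in numbers)


# All scoring tile indicators in the model, SCORE2 through SCORE9
all_score = zero_hypotheses("SCORE", range(2, 10))
# Same test with SCORE9 left out explicitly rather than by shortening the range
without_9 = zero_hypotheses("SCORE", (i for i in range(2, 10) if i != 9))
print(without_9)
```

Filtering by tile number says "everything except SCORE9" directly, which is easier to adapt if the tile to exclude is not the last one.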

## The actual model we care about

Let's see what these interaction effects look like.

In [11]:
interact_model = train_model.interact_model(predict_df)
interact_model.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | vp_margin | R-squared: | 0.045 |
| Model: | OLS | Adj. R-squared: | 0.044 |
| Method: | Least Squares | F-statistic: | 40.21 |
| Date: | Sun, 29 Mar 2020 | Prob (F-statistic): | 0.0 |
| Time: | 11:38:37 | Log-Likelihood: | -844970.0 |
| No. Observations: | 198960 | AIC: | 1690000.0 |
| Df Residuals: | 198724 | BIC: | 1693000.0 |
| Df Model: | 235 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | -4.0918 | 0.218 | -18.738 | 0.000 | -4.520 | -3.664 |
| player_num | -0.1919 | 0.035 | -5.434 | 0.000 | -0.261 | -0.123 |
| faction_auren | -7.0841 | 4.223 | -1.677 | 0.093 | -15.361 | 1.193 |
| faction_chaosmagicians | 7.6305 | 2.015 | 3.787 | 0.000 | 3.682 | 11.579 |
| faction_cultists | 25.9287 | 2.779 | 9.329 | 0.000 | 20.481 | 31.376 |
| faction_darklings | 3.3317 | 1.674 | 1.991 | 0.047 | 0.052 | 6.612 |
| faction_dwarves | 32.6837 | 3.633 | 8.995 | 0.000 | 25.562 | 39.805 |
| faction_engineers | 22.6196 | 2.259 | 10.012 | 0.000 | 18.192 | 27.047 |
| faction_fakirs | -14.8806 | 6.162 | -2.415 | 0.016 | -26.958 | -2.803 |

| | | | |
|---|---|---|---|
| Omnibus: | 1052.135 | Durbin-Watson: | 2.49 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 1284.617 |
| Skew: | -0.108 | Prob(JB): | 1.12e-279 |
| Kurtosis: | 3.329 | Cond. No. | 521.0 |


OK, well this is all pretty hard to read. It does look like at least some scoring/bonus tiles have an impact on the marginal scoring of factions. The way the data is presented isn't super useful, so the next step would be to turn this into a function where you could pass your scoring and bonus tile setup and have it return the factions ranked by their estimated marginal VP.

We'll leave that for the next round. I'd like to get some feedback on whether these coefficients match intuition. So pick a couple of factions that you know have particularly advantageous (or disadvantageous) scoring or bonus tiles and see if the coefficients reflect that.
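For reference, the ranking function described above might look roughly like the sketch below. It assumes the fitted coefficients are available as a pandas Series (for example the `params` of a fitted model) and that interaction columns follow the `faction_..._x_BON1` pattern. The name `rank_factions` is hypothetical, and the coefficients in the example are invented for illustration, not taken from the model. How turn-specific scoring tile features are named is not shown here, so the caller passes whatever tile feature names the model uses.

```python
import pandas as pd


def rank_factions(params, factions, tiles):
    """Rank factions by base coefficient plus interactions for the tiles in play.

    Hypothetical helper. Coefficients missing from params are treated as 0.
    """
    scores = {}
    for faction in factions:
        total = params.get(faction, 0.0)  # standalone faction effect
        for tile in tiles:
            # add the faction-by-tile interaction effect, if the model has one
            total += params.get(f"{faction}_x_{tile}", 0.0)
        scores[faction] = total
    return pd.Series(scores).sort_values(ascending=False)


# INVENTED coefficients, for illustration only
toy_params = pd.Series({
    "faction_fakirs": -4.0,
    "faction_darklings": 9.0,
    "faction_fakirs_x_BON2": 2.0,
    "faction_darklings_x_BON2": -1.0,
})
ranking = rank_factions(toy_params, ["faction_fakirs", "faction_darklings"], ["BON2"])
print(ranking)
```

With these made-up numbers the Fakirs have the best BON2 interaction but still rank last once their base penalty is added in, which is the situation described in the assumptions section.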