## **OUTPUT**

### Load data from last prediction

In [4]:
import pandas as pd

pd.set_option('display.max_columns', 30)
pd.set_option('display.max_rows', 30)

In [2]:
df = pd.read_csv('../data/results/predicted_nxt_ct_winner.csv')

In [5]:
df.head()

Unnamed: 0,file,round,wp_ct_val,wp_t_val,nade_ct_val,nade_t_val,ct_alive,t_alive,prev_ct_winner,ct_winner,prev_bomb_planted,bomb_planted,ct_cons_wins,t_cons_wins,ct_val_pred,t_val_pred,ct_round_type,t_round_type,ct_nxt_rnd_type_pred,t_nxt_rnd_type_pred,nxt_ct_winner,nxt_ct_winner_pred
0,0,1,1000.0,1166.666667,550,1200,5,5,0.5,1,0.5,0,0,0,4078.134589,3943.272665,0,0,2,1,0,0
1,0,2,10100.0,3687.5,1100,50,4,0,1.0,0,0.0,0,1,0,17819.702711,6290.616771,2,1,3,1,1,0
2,0,3,4125.0,11700.0,900,2450,0,1,0.0,0,0.0,1,0,1,7038.468589,19600.790638,2,3,1,3,1,1
3,0,4,1000.0,11700.0,0,1600,0,3,0.0,0,1.0,1,0,2,1452.468928,22568.098741,1,2,3,3,0,0
4,0,5,15500.0,12750.0,1400,1700,0,4,0.0,1,1.0,0,0,3,22676.205763,24459.855175,3,3,1,3,0,0


### Let's modify the DataFrame to make it more legible as output

### **First** 

Rename the <code>ct_alive</code> and <code>t_alive</code> columns. These columns refer to the players alive in the previous round, so let's rename them to <code>prev_ct_alive</code> and <code>prev_t_alive</code>.

In [7]:
df.rename(columns={'ct_alive':'prev_ct_alive', 't_alive':'prev_t_alive'}, inplace=True)

In [8]:
df.head()

Unnamed: 0,file,round,wp_ct_val,wp_t_val,nade_ct_val,nade_t_val,prev_ct_alive,prev_t_alive,prev_ct_winner,ct_winner,prev_bomb_planted,bomb_planted,ct_cons_wins,t_cons_wins,ct_val_pred,t_val_pred,ct_round_type,t_round_type,ct_nxt_rnd_type_pred,t_nxt_rnd_type_pred,nxt_ct_winner,nxt_ct_winner_pred
0,0,1,1000.0,1166.666667,550,1200,5,5,0.5,1,0.5,0,0,0,4078.134589,3943.272665,0,0,2,1,0,0
1,0,2,10100.0,3687.5,1100,50,4,0,1.0,0,0.0,0,1,0,17819.702711,6290.616771,2,1,3,1,1,0
2,0,3,4125.0,11700.0,900,2450,0,1,0.0,0,0.0,1,0,1,7038.468589,19600.790638,2,3,1,3,1,1
3,0,4,1000.0,11700.0,0,1600,0,3,0.0,0,1.0,1,0,2,1452.468928,22568.098741,1,2,3,3,0,0
4,0,5,15500.0,12750.0,1400,1700,0,4,0.0,1,1.0,0,0,3,22676.205763,24459.855175,3,3,1,3,0,0


### **Second** 

Drop the <code>nxt_ct_winner</code> column. We only have this value because the dataset contains complete games. Once the code goes into production this column will not exist, because the information will arrive round by round.

In [10]:
df.drop('nxt_ct_winner', axis=1, inplace=True)
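
Since the label column is expected to be missing in production, the same drop can be written so that it tolerates both cases. The following is a minimal sketch, not the notebook's actual code: the two toy DataFrames and the `drop_label` helper are hypothetical, and the values are illustrative only. It relies on the `errors="ignore"` option of `DataFrame.drop`, which skips labels that are not present.

```python
import pandas as pd

# Toy frames standing in for the real data (values are illustrative only).
historical = pd.DataFrame(
    {"round": [1, 2], "nxt_ct_winner": [0, 1], "nxt_ct_winner_pred": [0, 0]}
)
live = pd.DataFrame({"round": [1, 2], "nxt_ct_winner_pred": [0, 0]})


def drop_label(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy without the ground-truth column, if present."""
    # errors="ignore" makes drop a no-op when the column does not exist.
    return frame.drop(columns=["nxt_ct_winner"], errors="ignore")


historical_out = drop_label(historical)  # label column removed
live_out = drop_label(live)              # unchanged, no KeyError raised
```

Returning a new DataFrame instead of using `inplace=True` also leaves the input untouched, which can be convenient when the same frame is still needed for evaluation.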

In [11]:
df.head()

Unnamed: 0,file,round,wp_ct_val,wp_t_val,nade_ct_val,nade_t_val,prev_ct_alive,prev_t_alive,prev_ct_winner,ct_winner,prev_bomb_planted,bomb_planted,ct_cons_wins,t_cons_wins,ct_val_pred,t_val_pred,ct_round_type,t_round_type,ct_nxt_rnd_type_pred,t_nxt_rnd_type_pred,nxt_ct_winner_pred
0,0,1,1000.0,1166.666667,550,1200,5,5,0.5,1,0.5,0,0,0,4078.134589,3943.272665,0,0,2,1,0
1,0,2,10100.0,3687.5,1100,50,4,0,1.0,0,0.0,0,1,0,17819.702711,6290.616771,2,1,3,1,0
2,0,3,4125.0,11700.0,900,2450,0,1,0.0,0,0.0,1,0,1,7038.468589,19600.790638,2,3,1,3,1
3,0,4,1000.0,11700.0,0,1600,0,3,0.0,0,1.0,1,0,2,1452.468928,22568.098741,1,2,3,3,0
4,0,5,15500.0,12750.0,1400,1700,0,4,0.0,1,1.0,0,0,3,22676.205763,24459.855175,3,3,1,3,0


### **Third** 

Decode the encoded features to make them legible.

Encoded features:
- <code>ct_round_type</code>
- <code>t_round_type</code>
- <code>ct_nxt_rnd_type_pred</code>
- <code>t_nxt_rnd_type_pred</code>

In [13]:
round_type_dic_decode = {0:'PISTOL_ROUND', 1:'ECO', 2:'MEDIUM', 3:'FULL', 4:'LAST'}

In [14]:
features = ['ct_round_type', 't_round_type', 'ct_nxt_rnd_type_pred', 't_nxt_rnd_type_pred']

for feature in features:
    df[feature] = df[feature].apply(lambda x: round_type_dic_decode[x])
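
As an alternative to the `apply` call with a lambda, the decoding can be done with `Series.map`, which accepts the dictionary directly. The sketch below uses a toy Series rather than the real DataFrame, and the code `7` is an invented value added only to show what happens with a code missing from the dictionary: `map` returns `NaN` for it instead of raising a `KeyError`, so unknown codes can be collected and reported.

```python
import pandas as pd

round_type_dic_decode = {0: "PISTOL_ROUND", 1: "ECO", 2: "MEDIUM", 3: "FULL", 4: "LAST"}

# Toy column; the code 7 is deliberately not in the mapping.
codes = pd.Series([0, 2, 3, 7], name="ct_round_type")

# Passing the dict to map decodes every value; unmapped codes become NaN.
decoded = codes.map(round_type_dic_decode)

# Collect the codes that could not be decoded so they can be inspected.
unknown_codes = sorted(codes[decoded.isna()].unique().tolist())
```

Whether a missing code should fail loudly (as the lambda version does) or be flagged and handled later is a design choice that depends on how the output is consumed.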

In [15]:
df.head()

Unnamed: 0,file,round,wp_ct_val,wp_t_val,nade_ct_val,nade_t_val,prev_ct_alive,prev_t_alive,prev_ct_winner,ct_winner,prev_bomb_planted,bomb_planted,ct_cons_wins,t_cons_wins,ct_val_pred,t_val_pred,ct_round_type,t_round_type,ct_nxt_rnd_type_pred,t_nxt_rnd_type_pred,nxt_ct_winner_pred
0,0,1,1000.0,1166.666667,550,1200,5,5,0.5,1,0.5,0,0,0,4078.134589,3943.272665,PISTOL_ROUND,PISTOL_ROUND,MEDIUM,ECO,0
1,0,2,10100.0,3687.5,1100,50,4,0,1.0,0,0.0,0,1,0,17819.702711,6290.616771,MEDIUM,ECO,FULL,ECO,0
2,0,3,4125.0,11700.0,900,2450,0,1,0.0,0,0.0,1,0,1,7038.468589,19600.790638,MEDIUM,FULL,ECO,FULL,1
3,0,4,1000.0,11700.0,0,1600,0,3,0.0,0,1.0,1,0,2,1452.468928,22568.098741,ECO,MEDIUM,FULL,FULL,0
4,0,5,15500.0,12750.0,1400,1700,0,4,0.0,1,1.0,0,0,3,22676.205763,24459.855175,FULL,FULL,ECO,FULL,0


### **Now the DataFrame is legible and full of info** 

Let's define each feature to make it clearer:

- <code>file</code>: The file/game analyzed. In real time prediction this feature will not be necessary.
- <code>round</code>: Number of the round analyzed.
- <code>wp_ct_val</code>: **Prediction** of the value of the weapons the ct side is carrying.
- <code>wp_t_val</code>: **Prediction** of the value of the weapons the t side is carrying.
- <code>nade_ct_val</code>: **Prediction** of the value of the grenades the ct side is carrying.
- <code>nade_t_val</code>: **Prediction** of the value of the grenades the t side is carrying.
- <code>prev_ct_alive</code>: The number of players on the ct side that survived the previous round.
- <code>prev_t_alive</code>: The number of players on the t side that survived the previous round.
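- <code>prev_ct_winner</code>: 1 if ct side won the previous round. 0 if t side won the previous round. The first round in the sample above shows 0.5, which appears to be a placeholder because there is no previous round.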
- <code>ct_winner</code>: 1 if ct side wins the current round. 0 if t side wins the current round.
- <code>prev_bomb_planted</code>: 1 if the bomb was planted in the previous round. 0 if not.
- <code>bomb_planted</code>: 1 if the bomb is planted in the current round. 0 if not.
- <code>ct_cons_wins</code>: The ct side's current streak of consecutive round wins.
- <code>t_cons_wins</code>: The t side's current streak of consecutive round wins.
- <code>ct_val_pred</code>: **Prediction** of the value of all the equipment the ct side is carrying.
- <code>t_val_pred</code>: **Prediction** of the value of all the equipment the t side is carrying.
- <code>ct_round_type</code>: Type of round of the ct side.
- <code>t_round_type</code>: Type of round of the t side.
- <code>ct_nxt_rnd_type_pred</code>: **Prediction** of the type of round of the ct side for the next round.
- <code>t_nxt_rnd_type_pred</code>: **Prediction** of the type of round of the t side for the next round.
- <code>nxt_ct_winner_pred</code>: **Prediction** of the winner side for the next round: 1 if ct side, 0 if t side.
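
The definitions of the `prev_` features can be checked against the current-round features. The sketch below is an illustration under assumptions, not part of the original analysis: it assumes that <code>prev_ct_winner</code> is the one-round lag of <code>ct_winner</code> within each game, and it uses only the first three rounds of file 0 copied from the sample output above.

```python
import pandas as pd

# Rows copied from the first three rounds of file 0 in the sample output.
sample = pd.DataFrame(
    {
        "file": [0, 0, 0],
        "round": [1, 2, 3],
        "ct_winner": [1, 0, 0],
        "prev_ct_winner": [0.5, 1.0, 0.0],
    }
)

# Shift ct_winner by one round within each game to rebuild the "previous" value.
expected_prev = (
    sample.sort_values(["file", "round"]).groupby("file")["ct_winner"].shift(1)
)

# The first round of a game has no predecessor, so only later rounds are compared.
has_prev = expected_prev.notna()
consistent = (sample.loc[has_prev, "prev_ct_winner"] == expected_prev[has_prev]).all()
```

On these three rows `consistent` evaluates to true. The same check could be run on the full DataFrame and on <code>prev_bomb_planted</code> versus <code>bomb_planted</code>.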

### Let's create a smaller DataFrame with the most relevant information so it is easy to read

In [23]:
df_condensed = df[['file', 'round', 'ct_val_pred', 't_val_pred', 'ct_round_type', 't_round_type',
       'ct_nxt_rnd_type_pred', 't_nxt_rnd_type_pred', 'nxt_ct_winner_pred']]

In [24]:
df_condensed.head()

Unnamed: 0,file,round,ct_val_pred,t_val_pred,ct_round_type,t_round_type,ct_nxt_rnd_type_pred,t_nxt_rnd_type_pred,nxt_ct_winner_pred
0,0,1,4078.134589,3943.272665,PISTOL_ROUND,PISTOL_ROUND,MEDIUM,ECO,0
1,0,2,17819.702711,6290.616771,MEDIUM,ECO,FULL,ECO,0
2,0,3,7038.468589,19600.790638,MEDIUM,FULL,ECO,FULL,1
3,0,4,1452.468928,22568.098741,ECO,MEDIUM,FULL,FULL,0
4,0,5,22676.205763,24459.855175,FULL,FULL,ECO,FULL,0


Other features, such as consecutive wins or players alive, are also relevant, but players already know this information from the match itself or from the game's HUD, so it is not necessary to include it in the output DataFrame.


## Conclusion

Throughout this process we have built three types of models, which predict:
- Value of the team: 2 regression models.
- Next round type: 2 multiclass classification models. 
- Winner for the next round: 1 classification model.

The models that predict the value of the teams are **very accurate** and reach high R² scores.

We cannot say the same about the other 3 models. They are reasonably accurate, but their error is higher. It would be worth working further on these models with more feature engineering, other algorithms, or perhaps neural networks.

This is the first stage of a larger project to integrate the predictions into the game and deliver them in real time.

The next step is to create a pipeline that takes the round data as input and returns the predictions for the <code>df_condensed</code> features.
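
As a starting point, the formatting steps from this notebook could be wrapped in a single function. This is only a sketch of the output-formatting part of such a pipeline: `format_output` and the constant names are hypothetical, the model inference step is not covered, and the input row is the first round from the sample output with only the columns needed here.

```python
import pandas as pd

ROUND_TYPE_DECODE = {0: "PISTOL_ROUND", 1: "ECO", 2: "MEDIUM", 3: "FULL", 4: "LAST"}
ROUND_TYPE_COLS = [
    "ct_round_type", "t_round_type", "ct_nxt_rnd_type_pred", "t_nxt_rnd_type_pred",
]
CONDENSED_COLS = [
    "file", "round", "ct_val_pred", "t_val_pred", *ROUND_TYPE_COLS, "nxt_ct_winner_pred",
]


def format_output(predictions: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: turn raw model output into the condensed, readable table."""
    # Drop the ground-truth label if present (it will not exist in production).
    out = predictions.drop(columns=["nxt_ct_winner"], errors="ignore")
    # Rename the "alive" columns to make clear they refer to the previous round.
    out = out.rename(columns={"ct_alive": "prev_ct_alive", "t_alive": "prev_t_alive"})
    # Decode the integer round types into readable labels.
    for col in ROUND_TYPE_COLS:
        out[col] = out[col].map(ROUND_TYPE_DECODE)
    # Keep only the columns used in df_condensed, in the same order.
    return out[CONDENSED_COLS]


# One raw row taken from the first round of the sample output (subset of columns).
raw = pd.DataFrame(
    [
        {
            "file": 0, "round": 1, "ct_alive": 5, "t_alive": 5,
            "ct_val_pred": 4078.134589, "t_val_pred": 3943.272665,
            "ct_round_type": 0, "t_round_type": 0,
            "ct_nxt_rnd_type_pred": 2, "t_nxt_rnd_type_pred": 1,
            "nxt_ct_winner": 0, "nxt_ct_winner_pred": 0,
        }
    ]
)
condensed = format_output(raw)
```

Because `drop` and `rename` return new DataFrames, the function does not modify its input, so the same raw frame can still be used for evaluation afterwards.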