# Modeling

Begin writing a function that creates and attempts to optimize a Random Forest Classifier model. It will utilize cross-validation and grid search. Once this is complete and functional, we can begin adding other algorithms.

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score, GridSearchCV, train_test_split, cross_validate
from sklearn.ensemble import RandomForestClassifier

# Get Some Data

In [27]:
df = pd.read_csv('../final.csv')
df

Unnamed: 0,assistsplayer_1,assistsplayer_10,assistsplayer_2,assistsplayer_3,assistsplayer_4,assistsplayer_5,assistsplayer_6,assistsplayer_7,assistsplayer_8,assistsplayer_9,...,team_totalGold_100,team_totalGold_200,team_trueDamageDoneToChampions_100,team_trueDamageDoneToChampions_200,team_ward_player_100,team_ward_player_200,team_assistsplayer_100,team_assistsplayer_200,team_xp_100,team_xp_200
0,2.0,3.0,3.0,3.0,2.0,3.0,3.0,3.0,1.0,2.0,...,36356,35237,2951,2594,74,391,13,12,42198,41697
1,1.0,15.0,3.0,1.0,4.0,5.0,0.0,5.0,11.0,7.0,...,33239,47104,1757,1697,38,95,14,38,37906,47483
2,0.0,5.0,4.0,2.0,1.0,3.0,1.0,5.0,4.0,7.0,...,33257,37239,3897,4351,158,90,10,22,37746,41185
3,3.0,4.0,0.0,7.0,4.0,14.0,5.0,4.0,5.0,2.0,...,40216,35871,4308,1738,82,108,28,20,41354,36424
4,3.0,4.0,6.0,6.0,6.0,5.0,3.0,1.0,2.0,0.0,...,37900,31360,873,1885,50,41,26,10,40723,37217
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8768,5.0,7.0,6.0,10.0,5.0,12.0,1.0,8.0,4.0,7.0,...,38176,36746,4398,4646,102,68,38,27,42435,39616
8769,2.0,5.0,9.0,2.0,1.0,6.0,1.0,2.0,1.0,2.0,...,38015,37013,2933,2496,68,79,20,11,42133,41796
8770,4.0,11.0,10.0,3.0,6.0,7.0,1.0,2.0,3.0,6.0,...,43423,43224,2726,4244,130,67,30,23,48350,46779
8771,5.0,17.0,3.0,3.0,0.0,3.0,0.0,9.0,3.0,5.0,...,33444,40786,3882,1100,36,74,14,34,38668,41425


Although this data has already been prepared, I still need to drop the 'killsplayer_0' column. It represents kills made by game objects rather than players, and it contains several null values. After that, all I need to do is split the data into X and y groups and then into train and test sets. Please keep in mind that this data set is only a fraction of our expected data set and is being used only to check the functionality of my model.

__Drop 'killsplayer_0' Column__

In [None]:
#killsplayer_0 can be dropped because it's not an actual player.
df.drop(columns = ['killsplayer_0'], inplace = True)

__Split into X and y Groups__

In [45]:
X, y = df.drop(columns = ['winningTeam']), df.winningTeam

__Create Train and Test Sets__

In [46]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 123)

In [47]:
X_train.shape, y_train.shape

((7018, 206), (7018,))

__Create Dummy Variables__

In [48]:
X_train = pd.get_dummies(X_train, drop_first = True)
X_train

Unnamed: 0,assistsplayer_1,assistsplayer_10,assistsplayer_2,assistsplayer_3,assistsplayer_4,assistsplayer_5,assistsplayer_6,assistsplayer_7,assistsplayer_8,assistsplayer_9,...,gameVersion_11.16.390.1945,gameVersion_11.17.393.607,gameVersion_11.17.394.4489,gameVersion_11.18.395.7538,gameVersion_11.19.398.2521,gameVersion_11.19.398.9466,gameVersion_11.20.400.7328,gameVersion_11.21.403.3002,gameVersion_11.22.406.3587,gameVersion_11.23.409.111
8449,0.0,10.0,6.0,4.0,2.0,4.0,1.0,4.0,4.0,7.0,...,0,0,0,0,0,0,0,1,0,0
1364,2.0,12.0,3.0,7.0,3.0,7.0,2.0,4.0,9.0,10.0,...,0,0,0,0,0,0,0,0,1,0
1822,2.0,6.0,10.0,5.0,3.0,14.0,0.0,1.0,1.0,2.0,...,0,0,0,0,0,0,0,0,1,0
6069,10.0,12.0,7.0,7.0,7.0,17.0,4.0,9.0,3.0,2.0,...,0,0,0,0,0,0,0,1,0,0
390,0.0,9.0,5.0,4.0,4.0,4.0,0.0,6.0,0.0,3.0,...,0,0,0,0,0,0,0,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7382,1.0,11.0,1.0,0.0,3.0,7.0,4.0,6.0,0.0,1.0,...,0,0,1,0,0,0,0,0,0,0
7763,0.0,5.0,1.0,5.0,5.0,6.0,5.0,3.0,2.0,1.0,...,0,0,0,0,0,0,0,0,1,0
5218,2.0,5.0,0.0,1.0,2.0,2.0,5.0,3.0,2.0,4.0,...,0,0,0,0,0,0,0,1,0,0
1346,1.0,6.0,5.0,6.0,3.0,11.0,0.0,4.0,2.0,4.0,...,0,0,0,0,0,0,1,0,0,0
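The held-out test set will need the same dummy columns as X_train before it can be scored. Below is a minimal sketch of one way to do that, using two tiny made-up frames (the values are illustrative and not taken from final.csv). The test frame is encoded without drop_first and then reindexed to the training columns, so a category that happens to sort first in the test set is not dropped by mistake, and categories never seen in training are discarded.

```python
import pandas as pd

# Toy frames standing in for X_train / X_test (hypothetical values)
train = pd.DataFrame({'kills': [3, 5, 2],
                      'gameVersion': ['11.21', '11.22', '11.23']})
test = pd.DataFrame({'kills': [4, 1],
                     'gameVersion': ['11.22', '11.24']})

# Encode the training set as in the notebook
train_enc = pd.get_dummies(train, drop_first=True)

# Encode the test set fully, then keep exactly the training columns.
# Columns missing from the test set are filled with 0; unseen ones are dropped.
test_enc = pd.get_dummies(test).reindex(columns=train_enc.columns, fill_value=0)

print(list(test_enc.columns))
```

After this step both frames have identical column order, which is what a fitted scikit-learn model expects at prediction time.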


__Create a Baseline__

Since this is a classification problem, I will set the baseline to whichever team has the most wins.

In [32]:
#Set team 100.0 to be blue_team and team 200.0 to be red_team
def get_team_color(value):
    if value == 100.0:
        return 'blue_team'
    else:
        return 'red_team'
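The function above labels every value other than 100.0 as 'red_team', including nulls or unexpected team codes. As a hedged alternative, here is a sketch of a stricter variant (the name `get_team_color_strict` and the `TEAM_COLORS` mapping are my own, not part of the project) that raises an error when it sees a code it does not know.

```python
# Explicit mapping from team code to label (hypothetical helper, not in the project)
TEAM_COLORS = {100.0: 'blue_team', 200.0: 'red_team'}

def get_team_color_strict(value):
    """Return the team label, raising ValueError for any unknown team code."""
    try:
        return TEAM_COLORS[value]
    except KeyError:
        raise ValueError(f'Unexpected team code: {value!r}') from None

print(get_team_color_strict(100.0), get_team_color_strict(200.0))
```

It can be passed to `Series.apply` the same way as `get_team_color`; the only difference is that bad labels fail loudly instead of silently becoming red-team wins.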

In [51]:
y_train = y_train.apply(get_team_color)

In [52]:
y_train.value_counts()

red_team     3649
blue_team    3369
Name: winningTeam, dtype: int64

In [53]:
#Use the dummy classifier to set the baseline
#red_team has the most wins
from sklearn.dummy import DummyClassifier

baseline = DummyClassifier(strategy = 'constant', constant = 'red_team')
baseline.fit(X_train, y_train)

#Now get the baseline accuracy
baseline.score(X_train, y_train)

0.5199487033342832
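As a sanity check, the baseline accuracy is just the share of the majority class. A small sketch using only the class counts printed above (3649 red-team wins and 3369 blue-team wins) reproduces the same number without scikit-learn:

```python
from collections import Counter

# Rebuild the label list from the value counts shown above
labels = ['red_team'] * 3649 + ['blue_team'] * 3369

# The majority class and its share of all matches
majority_label, majority_count = Counter(labels).most_common(1)[0]
baseline_accuracy = majority_count / len(labels)

print(majority_label, round(baseline_accuracy, 4))
```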

__Train a Single Model__

Train a single model to find out roughly how long training takes with so many features. From there, I will be able to estimate how long the grid search might take to complete.

In [11]:
#Create the model (just use default hyperparameters for now, except random_state)
model = RandomForestClassifier(random_state = 123)

#Fit the model
model.fit(X_train, y_train)

#Score the model
model.score(X_train, y_train)

1.0

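The above model finished training extremely quickly, so training time should not be a concern. The accuracy of 1.0 is measured on the training data, so it mainly shows that the default forest can memorize the training set, not how well it generalizes. Just be mindful of how many models will actually be produced with the given ranges for the hyperparameters.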
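To make "how many models" concrete, here is a rough sketch that counts the fits a grid search performs for a single-dictionary grid: one fit per parameter combination per fold, plus one final refit of the best combination on the full training set (the GridSearchCV default). The helper name `count_grid_fits` is my own.

```python
from math import prod

def count_grid_fits(param_grid, cv=5, refit=True):
    """Count model fits for a grid search over one dict of parameter ranges."""
    # Number of parameter combinations is the product of the range lengths
    n_candidates = prod(len(values) for values in param_grid.values())
    # Each candidate is fit once per fold; the winner is optionally refit once
    return n_candidates * cv + (1 if refit else 0)

small_grid = {'max_depth': range(5, 11), 'min_samples_leaf': range(5, 11)}
wide_grid = {'max_depth': range(1, 16), 'min_samples_leaf': range(1, 16)}

print(count_grid_fits(small_grid))  # 36 candidates * 5 folds + 1 refit
print(count_grid_fits(wide_grid))   # 225 candidates * 5 folds + 1 refit
```

Multiplying the count by the time one fit takes gives a ballpark for the total run time, ignoring any parallelism.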

__Implement GridSearchCV__

In [12]:
clf = RandomForestClassifier(random_state = 123)

grid = GridSearchCV(clf, {'max_depth': range(5, 11), 'min_samples_leaf': range(5, 11)}, cv = 5)
grid.fit(X_train, y_train)

GridSearchCV(cv=5, estimator=RandomForestClassifier(random_state=123),
             param_grid={'max_depth': range(5, 11),
                         'min_samples_leaf': range(5, 11)})

In [13]:
#What were the best score and best parameters?
grid.best_score_, grid.best_params_

(0.9532983508245877, {'max_depth': 6, 'min_samples_leaf': 5})

__Write RandomForestClassifier Function__

In [3]:
rf_dict = {
    'max_depth': range(1, 16),
    'min_samples_leaf': range(1, 16)
}

In [2]:
def get_random_forest_models(X_train, y_train, param_dict, cv = 5):
    """
    This function creates and returns an optimized random forest classification model. It also
    prints out the best model's mean cross-validated accuracy score and parameters.
    
    This function takes in the X and y training sets to fit the models.
    
    This function takes in a dictionary that contains the parameters to be iterated through.
    
    This function also takes in a value for the number of cross validation folds to do.
    The cv value defaults to 5.
    """
    #Create the classifier model
    clf = RandomForestClassifier(random_state = 123)
    
    #Create the GridSearchCV object, passing through the caller's cv value
    grid = GridSearchCV(clf, param_dict, cv = cv)
    
    #Fit the GridSearchCV object
    grid.fit(X_train, y_train)
    
    #Print the best model's score and parameters
    print('Mean Cross-Validated Accuracy: ', round(grid.best_score_, 4))
    print('Max Depth: ', grid.best_params_['max_depth'])
    print('Min Samples Per Leaf: ', grid.best_params_['min_samples_leaf'])
    
    #Return the best model
    return grid.best_estimator_

In [65]:
best_model = get_random_forest_models(X_train, y_train, rf_dict)

Mean Cross-Validated Accuracy:  0.9674
Max Depth:  14
Min Samples Per Leaf:  3


In [57]:
#Check to see if the function returned the model correctly
#best_estimator_ is refit on the full training set, so its training accuracy
#is expected to be higher than the mean cross-validated score
best_model.score(X_train, y_train)

0.9933029353092049

__What were the Most Important Features?__

In [67]:
best_features = pd.DataFrame(best_model.feature_importances_, X_train.columns)
best_features.sort_values(by = 0, ascending = False).head(10)

Unnamed: 0,0
towers_lost_team200,0.153878
inhibs_lost_team100,0.10776
towers_lost_team100,0.10712
inhibs_lost_team200,0.088003
baron_team200,0.044381
team_totalGold_200,0.035326
baron_team100,0.035008
dragon_team200,0.033469
dragon_team100,0.031174
team_totalGold_100,0.026489
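The importance table is easier to read when the values carry a name instead of the default column label 0. A small sketch, assuming only that the importances and column names have the same length (the helper `top_features` is hypothetical, and the toy values are made up):

```python
import pandas as pd

def top_features(importances, feature_names, n=10):
    """Return the n largest importances as a named Series indexed by feature."""
    return pd.Series(importances, index=feature_names, name='importance').nlargest(n)

# Toy example with made-up values
print(top_features([0.1, 0.6, 0.3], ['a', 'b', 'c'], n=2))
```

With the fitted model from above, the call would look like `top_features(best_model.feature_importances_, X_train.columns)`.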


### AdaBoostClassifier

I will use the AdaBoostClassifier with a RandomForestClassifier as the base_estimator.

In [58]:
from sklearn.ensemble import AdaBoostClassifier

In [19]:
#Create the RandomForestClassifier object
rf = RandomForestClassifier(random_state = 123)

#Create the AdaBoostClassifier object
adaBoost = AdaBoostClassifier(rf, random_state = 123)

#Create GridSearchCV object
grid = GridSearchCV(adaBoost, {'n_estimators': range(50, 101, 10)}, cv = 5)

#Fit the grid object
grid.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=AdaBoostClassifier(base_estimator=RandomForestClassifier(random_state=123),
                                          random_state=123),
             param_grid={'n_estimators': range(50, 101, 10)})

In [20]:
#What were the best score and best parameters?
grid.best_score_, grid.best_params_

(0.9567316341829086, {'n_estimators': 50})

__Let's see if it can improve performance of our best RandomForest model from earlier__

In [21]:
#Create the AdaBoostClassifier object
adaBoost = AdaBoostClassifier(best_model, random_state = 123)

#Create GridSearchCV object
grid = GridSearchCV(adaBoost, {'n_estimators': range(50, 101, 5)}, cv = 5)

#Fit the grid object
grid.fit(X_train, y_train)

#What were the best score and best parameters?
grid.best_score_, grid.best_params_

(0.9584557721139431, {'n_estimators': 50})

It is actually slightly better than the previous AdaBoost run: 0.9585 versus 0.9567 mean cross-validated accuracy.

In [59]:
#Create a function for AdaBoost
def get_adaBoosted_model(X_train, y_train, model_to_boost, param_dict, cv = 5):
    """
    This function creates and returns an optimized AdaBoost classification model built on the
    given base model. It also prints out the best model's mean cross-validated accuracy score
    and parameters.
    
    This function takes in the X and y training sets to fit the models.
    
    This function takes in the model to boost (for example, a random forest classifier).
    
    This function takes in a dictionary that contains the parameters to be iterated through.
    It must include the keys 'n_estimators' and 'learning_rate', which are printed below.
    
    This function also takes in a value for the number of cross validation folds to do.
    The cv value defaults to 5.
    """
    #Create the AdaBoost Classifier
    adaBoost_clf = AdaBoostClassifier(model_to_boost, random_state = 123)
    
    #Create the GridSearchCV object, passing through the caller's cv value
    grid = GridSearchCV(adaBoost_clf, param_dict, cv = cv)
    
    #Fit the GridSearchCV object
    grid.fit(X_train, y_train)
    
    #Print the best model's score and parameters
    print('Mean Cross-Validated Accuracy: ', round(grid.best_score_, 4))
    print('Num Estimators: ', grid.best_params_['n_estimators'])
    print('Learning Rate: ', grid.best_params_['learning_rate'])
    
    #Return the best model
    return grid.best_estimator_

In [60]:
adaBoost_params = {
    'n_estimators': range(50, 61),
    'learning_rate': range(1, 6)
}

In [61]:
#Test the above function
ada_boosted_clf = get_adaBoosted_model(X_train, y_train, best_model, adaBoost_params)

Mean Cross-Validated Accuracy:  0.9675
Num Estimators:  50
Learning Rate:  1


In [62]:
#This performed marginally better than the random forest alone (0.9675 vs. 0.9674).
#What were the most important features?
best_features = pd.DataFrame(ada_boosted_clf.feature_importances_, X_train.columns)
best_features.sort_values(by = 0, ascending = False).head(10)

Unnamed: 0,0
towers_lost_team200,0.116428
towers_lost_team100,0.104307
inhibs_lost_team200,0.098691
inhibs_lost_team100,0.098336
baron_team200,0.040657
baron_team100,0.034809
dragon_team100,0.0274
dragon_team200,0.025379
team_totalGold_100,0.014561
team_xp_100,0.014471


# Test Dataset at 15 Minute Mark

To get the data at the 15-minute mark, I'll have to reload all of the match data JSON files and run them through the prepare function.

In [2]:
#Create a list of timeline files to iterate through
timeline_files = ['timeline_data_start_4000_end_5000.json',
                  'timeline_data_start_5000_end_6000.json',
                  'timeline_data_start_6000_end_7000.json', 
                  'timeline_data_start_7000_end_8000.json',
                  'timeline_data_start_8000_end_9000.json',
                  'timeline_data_start_9000_end_10000.json',
                  'timeline_data_start_10000_end_10657.json']

In [3]:
#Create a list of other game data files to iterate through
other_data_files = ['other_game_data_start_4000_end_5000.json',
                  'other_game_data_start_5000_end_6000.json',
                  'other_game_data_start_6000_end_7000.json', 
                  'other_game_data_start_7000_end_8000.json',
                  'other_game_data_start_8000_end_9000.json',
                  'other_game_data_start_9000_end_10000.json',
                  'other_game_data_start_10000_end_10657.json']
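Both lists follow the same naming pattern, so they could also be generated rather than typed out. A minimal sketch, assuming the chunks always step by 1000 and the last chunk ends at the final match index (the helper `chunk_file_names` is my own):

```python
def chunk_file_names(prefix, start, stop, step=1000):
    """Build names like '<prefix>_start_4000_end_5000.json' for each chunk."""
    names = []
    for lo in range(start, stop, step):
        # The last chunk is cut short at the overall stop index
        hi = min(lo + step, stop)
        names.append(f'{prefix}_start_{lo}_end_{hi}.json')
    return names

timeline_files = chunk_file_names('timeline_data', 4000, 10657)
other_data_files = chunk_file_names('other_game_data', 4000, 10657)

print(timeline_files[0], timeline_files[-1])
```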

In [4]:
#Create empty list to store the timeline info
#Save the single file so we don't have to do this again in the future
timeline_list = []

#Now loop through the files list, read each file, and extend the timeline_list with each entry
for file in timeline_files:
    #Read the file
    temp_file = pd.read_json(file)
    
    #Turn it into a list of dicts
    temp_file = temp_file.to_dict(orient = 'records')
    
    #Extend the timeline_list with the temp file
    timeline_list.extend(temp_file)

In [8]:
#Convert to df
timeline_df = pd.DataFrame(timeline_list)

#Now save this complete file as a single json
timeline_df.to_json('timeline_data_start_4000_end_10657.json')

In [9]:
#Create empty list to store the other game data
#Save the single file so we don't have to do this again in the future
game_data_list = []

#Now loop through the files list, read each file, and extend the game_data_list with each entry
for file in other_data_files:
    #Read the file
    temp_file = pd.read_json(file)
    
    #Turn it into a list of dicts
    temp_file = temp_file.to_dict(orient = 'records')
    
    #Extend the game_data_list with the temp file
    game_data_list.extend(temp_file)
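The timeline loop and the game data loop do exactly the same work, so they could share one helper. A sketch of that refactor, using the same pandas calls as the loops above (the function name `load_records` is hypothetical):

```python
import pandas as pd

def load_records(files):
    """Read each JSON file with pandas and return all rows as one list of dicts."""
    records = []
    for file in files:
        # Same two steps as the loops above: read, then convert to row dicts
        records.extend(pd.read_json(file).to_dict(orient='records'))
    return records

# Usage with the file lists defined earlier in this notebook:
# timeline_list = load_records(timeline_files)
# game_data_list = load_records(other_data_files)
```

Keeping the logic in one place also avoids copy-paste slips in the comments and variable names.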

In [10]:
#Convert to df
game_data_df = pd.DataFrame(game_data_list)

#Now save this complete file as a single json
game_data_df.to_json('other_game_data_start_4000_end_10657.json')

In [12]:
import prepare

#Now that the lists are created, run them through Joshua C's prepare and prep functions
match_info_minute_15 = prepare.prepare(timeline_list, game_data_list, 15)

Skipping: 0 due to <20 min or not classic
Skipping: 1 due to <20 min or not classic
Finished with: 2 of 6654
Finished with: 3 of 6654
Skipping: 4 due to <20 min or not classic
Finished with: 5 of 6654
Finished with: 6 of 6654
Finished with: 7 of 6654
Finished with: 8 of 6654
Finished with: 9 of 6654
Skipping: 10 due to <20 min or not classic
Finished with: 11 of 6654
Skipping: 12 due to <20 min or not classic
Finished with: 13 of 6654
Finished with: 14 of 6654
Finished with: 15 of 6654
Finished with: 16 of 6654
Finished with: 17 of 6654
Finished with: 18 of 6654
Finished with: 19 of 6654
Finished with: 20 of 6654
Finished with: 21 of 6654
Finished with: 22 of 6654
Skipping: 23 due to <20 min or not classic
Skipping: 24 due to <20 min or not classic
Skipping: 25 due to <20 min or not classic
Finished with: 26 of 6654
Finished with: 27 of 6654
Skipping: 28 due to <20 min or not classic
Skipping: 29 due to <20 min or not classic
Finished with: 30 of 6654
Finished with: 31 of 6654
Finished

Finished with: 263 of 6654
Skipping: 264 due to <20 min or not classic
Finished with: 265 of 6654
Finished with: 266 of 6654
Finished with: 267 of 6654
Finished with: 268 of 6654
Finished with: 269 of 6654
Finished with: 270 of 6654
Finished with: 271 of 6654
Finished with: 272 of 6654
Finished with: 273 of 6654
Skipping: 274 due to <20 min or not classic
Finished with: 275 of 6654
Finished with: 276 of 6654
Finished with: 277 of 6654
Finished with: 278 of 6654
Finished with: 279 of 6654
Finished with: 280 of 6654
Skipping: 281 due to <20 min or not classic
Skipping: 282 due to <20 min or not classic
Skipping: 283 due to <20 min or not classic
Finished with: 284 of 6654
Finished with: 285 of 6654
Skipping: 286 due to <20 min or not classic
Finished with: 287 of 6654
Skipping: 288 due to <20 min or not classic
Skipping: 289 due to <20 min or not classic
Finished with: 290 of 6654
Finished with: 291 of 6654
Finished with: 292 of 6654
Finished with: 293 of 6654
Finished with: 294 of 6654


Finished with: 520 of 6654
Finished with: 521 of 6654
Finished with: 522 of 6654
Skipping: 523 due to <20 min or not classic
Finished with: 524 of 6654
Finished with: 525 of 6654
Finished with: 526 of 6654
Skipping: 527 due to <20 min or not classic
Skipping: 528 due to <20 min or not classic
Finished with: 529 of 6654
Skipping: 530 due to <20 min or not classic
Finished with: 531 of 6654
Finished with: 532 of 6654
Finished with: 533 of 6654
Finished with: 534 of 6654
Finished with: 535 of 6654
Finished with: 536 of 6654
Skipping: 537 due to <20 min or not classic
Finished with: 538 of 6654
Skipping: 539 due to <20 min or not classic
Skipping: 540 due to <20 min or not classic
Finished with: 541 of 6654
Finished with: 542 of 6654
Finished with: 543 of 6654
Skipping: 544 due to <20 min or not classic
Finished with: 545 of 6654
Finished with: 546 of 6654
Finished with: 547 of 6654
Finished with: 548 of 6654
Finished with: 549 of 6654
Finished with: 550 of 6654
Finished with: 551 of 6654


Finished with: 775 of 6654
Skipping: 776 due to <20 min or not classic
Skipping: 777 due to <20 min or not classic
Finished with: 778 of 6654
Skipping: 779 due to <20 min or not classic
Finished with: 780 of 6654
Finished with: 781 of 6654
Finished with: 782 of 6654
Finished with: 783 of 6654
Finished with: 784 of 6654
Finished with: 785 of 6654
Skipping: 786 due to <20 min or not classic
Finished with: 787 of 6654
Finished with: 788 of 6654
Skipping: 789 due to <20 min or not classic
Finished with: 790 of 6654
Finished with: 791 of 6654
Skipping: 792 due to <20 min or not classic
Finished with: 793 of 6654
Skipping: 794 due to <20 min or not classic
Finished with: 795 of 6654
Finished with: 796 of 6654
Finished with: 797 of 6654
Finished with: 798 of 6654
Finished with: 799 of 6654
Finished with: 800 of 6654
Finished with: 801 of 6654
Skipping: 802 due to <20 min or not classic
Finished with: 803 of 6654
Finished with: 804 of 6654
Finished with: 805 of 6654
Finished with: 806 of 6654


Finished with: 1028 of 6654
Finished with: 1029 of 6654
Finished with: 1030 of 6654
Finished with: 1031 of 6654
Skipping: 1032 due to <20 min or not classic
Skipping: 1033 due to <20 min or not classic
Finished with: 1034 of 6654
Finished with: 1035 of 6654
Finished with: 1036 of 6654
Finished with: 1037 of 6654
Skipping: 1038 due to <20 min or not classic
Finished with: 1039 of 6654
Finished with: 1040 of 6654
Finished with: 1041 of 6654
Skipping: 1042 due to <20 min or not classic
Finished with: 1043 of 6654
Finished with: 1044 of 6654
Skipping: 1045 due to <20 min or not classic
Finished with: 1046 of 6654
Skipping: 1047 due to <20 min or not classic
Skipping: 1048 due to <20 min or not classic
Finished with: 1049 of 6654
Finished with: 1050 of 6654
Finished with: 1051 of 6654
Skipping: 1052 due to <20 min or not classic
Finished with: 1053 of 6654
Finished with: 1054 of 6654
Finished with: 1055 of 6654
Skipping: 1056 due to <20 min or not classic
Finished with: 1057 of 6654
Skippin

Finished with: 1280 of 6654
Finished with: 1281 of 6654
Finished with: 1282 of 6654
Finished with: 1283 of 6654
Finished with: 1284 of 6654
Skipping: 1285 due to <20 min or not classic
Finished with: 1286 of 6654
Finished with: 1287 of 6654
Finished with: 1288 of 6654
Finished with: 1289 of 6654
Skipping: 1290 due to <20 min or not classic
Finished with: 1291 of 6654
Finished with: 1292 of 6654
Finished with: 1293 of 6654
Finished with: 1294 of 6654
Finished with: 1295 of 6654
Skipping: 1296 due to <20 min or not classic
Finished with: 1297 of 6654
Finished with: 1298 of 6654
Finished with: 1299 of 6654
Finished with: 1300 of 6654
Finished with: 1301 of 6654
Finished with: 1302 of 6654
Finished with: 1303 of 6654
Finished with: 1304 of 6654
Skipping: 1305 due to <20 min or not classic
Skipping: 1306 due to <20 min or not classic
Finished with: 1307 of 6654
Skipping: 1308 due to <20 min or not classic
Skipping: 1309 due to <20 min or not classic
Skipping: 1310 due to <20 min or not clas

Finished with: 1533 of 6654
Finished with: 1534 of 6654
Finished with: 1535 of 6654
Finished with: 1536 of 6654
Finished with: 1537 of 6654
Finished with: 1538 of 6654
Finished with: 1539 of 6654
Finished with: 1540 of 6654
Skipping: 1541 due to <20 min or not classic
Finished with: 1542 of 6654
Skipping: 1543 due to <20 min or not classic
Finished with: 1544 of 6654
Finished with: 1545 of 6654
Finished with: 1546 of 6654
Finished with: 1547 of 6654
Finished with: 1548 of 6654
Skipping: 1549 due to <20 min or not classic
Finished with: 1550 of 6654
Finished with: 1551 of 6654
Finished with: 1552 of 6654
Finished with: 1553 of 6654
Finished with: 1554 of 6654
Finished with: 1555 of 6654
Skipping: 1556 due to <20 min or not classic
Finished with: 1557 of 6654
Finished with: 1558 of 6654
Finished with: 1559 of 6654
Finished with: 1560 of 6654
Skipping: 1561 due to <20 min or not classic
Finished with: 1562 of 6654
Finished with: 1563 of 6654
Finished with: 1564 of 6654
Skipping: 1565 due 

Finished with: 1781 of 6654
Skipping: 1782 due to <20 min or not classic
Skipping: 1783 due to <20 min or not classic
Finished with: 1784 of 6654
Finished with: 1785 of 6654
Finished with: 1786 of 6654
Finished with: 1787 of 6654
Finished with: 1788 of 6654
Finished with: 1789 of 6654
Finished with: 1790 of 6654
Finished with: 1791 of 6654
Skipping: 1792 due to <20 min or not classic
Finished with: 1793 of 6654
Finished with: 1794 of 6654
Skipping: 1795 due to <20 min or not classic
Finished with: 1796 of 6654
Finished with: 1797 of 6654
Finished with: 1798 of 6654
Finished with: 1799 of 6654
Finished with: 1800 of 6654
Skipping: 1801 due to <20 min or not classic
Finished with: 1802 of 6654
Finished with: 1803 of 6654
Skipping: 1804 due to <20 min or not classic
Finished with: 1805 of 6654
Finished with: 1806 of 6654
Finished with: 1807 of 6654
Finished with: 1808 of 6654
Finished with: 1809 of 6654
Finished with: 1810 of 6654
Skipping: 1811 due to <20 min or not classic
Finished with

Finished with: 2031 of 6654
Finished with: 2032 of 6654
Finished with: 2033 of 6654
Finished with: 2034 of 6654
Finished with: 2035 of 6654
Finished with: 2036 of 6654
Finished with: 2037 of 6654
Skipping: 2038 due to <20 min or not classic
Finished with: 2039 of 6654
Finished with: 2040 of 6654
Skipping: 2041 due to <20 min or not classic
Skipping: 2042 due to <20 min or not classic
Finished with: 2043 of 6654
Finished with: 2044 of 6654
Skipping: 2045 due to <20 min or not classic
Finished with: 2046 of 6654
Finished with: 2047 of 6654
Finished with: 2048 of 6654
Finished with: 2049 of 6654
Finished with: 2050 of 6654
Skipping: 2051 due to <20 min or not classic
Skipping: 2052 due to <20 min or not classic
Finished with: 2053 of 6654
Finished with: 2054 of 6654
Skipping: 2055 due to <20 min or not classic
Skipping: 2056 due to <20 min or not classic
Finished with: 2057 of 6654
Skipping: 2058 due to <20 min or not classic
Skipping: 2059 due to <20 min or not classic
Finished with: 206

Finished with: 2279 of 6654
Finished with: 2280 of 6654
Finished with: 2281 of 6654
Skipping: 2282 due to <20 min or not classic
Skipping: 2283 due to <20 min or not classic
Finished with: 2284 of 6654
Finished with: 2285 of 6654
Skipping: 2286 due to <20 min or not classic
Finished with: 2287 of 6654
Finished with: 2288 of 6654
Finished with: 2289 of 6654
Finished with: 2290 of 6654
Skipping: 2291 due to <20 min or not classic
Finished with: 2292 of 6654
Skipping: 2293 due to <20 min or not classic
Finished with: 2294 of 6654
Finished with: 2295 of 6654
Finished with: 2296 of 6654
Finished with: 2297 of 6654
Finished with: 2298 of 6654
Skipping: 2299 due to <20 min or not classic
Skipping: 2300 due to <20 min or not classic
Finished with: 2301 of 6654
Finished with: 2302 of 6654
Skipping: 2303 due to <20 min or not classic
Finished with: 2304 of 6654
Finished with: 2305 of 6654
Skipping: 2306 due to <20 min or not classic
Skipping: 2307 due to <20 min or not classic
Finished with: 230

Finished with: 2525 of 6654
Skipping: 2526 due to <20 min or not classic
Finished with: 2527 of 6654
Finished with: 2528 of 6654
Finished with: 2529 of 6654
Skipping: 2530 due to <20 min or not classic
Skipping: 2531 due to <20 min or not classic
Finished with: 2532 of 6654
Finished with: 2533 of 6654
Finished with: 2534 of 6654
Skipping: 2535 due to <20 min or not classic
Finished with: 2536 of 6654
Finished with: 2537 of 6654
Skipping: 2538 due to <20 min or not classic
Skipping: 2539 due to <20 min or not classic
Finished with: 2540 of 6654
Finished with: 2541 of 6654
Finished with: 2542 of 6654
Skipping: 2543 due to <20 min or not classic
Finished with: 2544 of 6654
Skipping: 2545 due to <20 min or not classic
Finished with: 2546 of 6654
Finished with: 2547 of 6654
Skipping: 2548 due to <20 min or not classic
Skipping: 2549 due to <20 min or not classic
Finished with: 2550 of 6654
Finished with: 2551 of 6654
Finished with: 2552 of 6654
Finished with: 2553 of 6654
Skipping: 2554 due

Finished with: 2773 of 6654
Finished with: 2774 of 6654
Finished with: 2775 of 6654
Finished with: 2776 of 6654
Finished with: 2777 of 6654
Skipping: 2778 due to <20 min or not classic
Finished with: 2779 of 6654
Finished with: 2780 of 6654
Finished with: 2781 of 6654
Finished with: 2782 of 6654
Skipping: 2783 due to <20 min or not classic
Finished with: 2784 of 6654
Finished with: 2785 of 6654
Finished with: 2786 of 6654
Skipping: 2787 due to <20 min or not classic
Finished with: 2788 of 6654
Finished with: 2789 of 6654
Skipping: 2790 due to <20 min or not classic
Finished with: 2791 of 6654
Finished with: 2792 of 6654
Finished with: 2793 of 6654
Skipping: 2794 due to <20 min or not classic
Finished with: 2795 of 6654
Finished with: 2796 of 6654
Finished with: 2797 of 6654
Finished with: 2798 of 6654
Finished with: 2799 of 6654
Finished with: 2800 of 6654
Finished with: 2801 of 6654
Finished with: 2802 of 6654
Finished with: 2803 of 6654
Finished with: 2804 of 6654
Skipping: 2805 due 

Finished with: 3025 of 6654
Skipping: 3026 due to <20 min or not classic
Finished with: 3027 of 6654
Skipping: 3028 due to <20 min or not classic
Finished with: 3029 of 6654
Skipping: 3030 due to <20 min or not classic
Finished with: 3031 of 6654
Finished with: 3032 of 6654
Skipping: 3033 due to <20 min or not classic
Finished with: 3034 of 6654
Finished with: 3035 of 6654
Skipping: 3036 due to <20 min or not classic
Finished with: 3037 of 6654
Skipping: 3038 due to <20 min or not classic
Skipping: 3039 due to <20 min or not classic
Skipping: 3040 due to <20 min or not classic
Skipping: 3041 due to <20 min or not classic
Finished with: 3042 of 6654
Finished with: 3043 of 6654
Skipping: 3044 due to <20 min or not classic
Finished with: 3045 of 6654
Finished with: 3046 of 6654
Finished with: 3047 of 6654
Finished with: 3048 of 6654
Finished with: 3049 of 6654
Finished with: 3050 of 6654
Finished with: 3051 of 6654
Finished with: 3052 of 6654
Skipping: 3053 due to <20 min or not classic
F

Finished with: 3277 of 6654
Skipping: 3278 due to <20 min or not classic
Finished with: 3279 of 6654
Finished with: 3280 of 6654
Finished with: 3281 of 6654
Finished with: 3282 of 6654
Finished with: 3283 of 6654
Finished with: 3284 of 6654
Finished with: 3285 of 6654
Finished with: 3286 of 6654
Finished with: 3287 of 6654
Finished with: 3288 of 6654
Skipping: 3289 due to <20 min or not classic
Skipping: 3290 due to <20 min or not classic
Finished with: 3291 of 6654
Skipping: 3292 due to <20 min or not classic
Finished with: 3293 of 6654
Finished with: 3294 of 6654
Skipping: 3295 due to <20 min or not classic
Finished with: 3296 of 6654
Finished with: 3297 of 6654
Finished with: 3298 of 6654
Finished with: 3299 of 6654
Skipping: 3300 due to <20 min or not classic
Skipping: 3301 due to <20 min or not classic
Finished with: 3302 of 6654
Skipping: 3303 due to <20 min or not classic
Finished with: 3304 of 6654
Skipping: 3305 due to <20 min or not classic
Finished with: 3306 of 6654
Finishe

Finished with: 3524 of 6654
Finished with: 3525 of 6654
Skipping: 3526 due to <20 min or not classic
Finished with: 3527 of 6654
Finished with: 3528 of 6654
Skipping: 3529 due to <20 min or not classic
Skipping: 3530 due to <20 min or not classic
Finished with: 3531 of 6654
Finished with: 3532 of 6654
Finished with: 3533 of 6654
Finished with: 3534 of 6654
Skipping: 3535 due to <20 min or not classic
Finished with: 3536 of 6654
Skipping: 3537 due to <20 min or not classic
Skipping: 3538 due to <20 min or not classic
Finished with: 3539 of 6654
Finished with: 3540 of 6654
Finished with: 3541 of 6654
Finished with: 3542 of 6654
Finished with: 3543 of 6654
Finished with: 3544 of 6654
Finished with: 3545 of 6654
Finished with: 3546 of 6654
Skipping: 3547 due to <20 min or not classic
Finished with: 3548 of 6654
Finished with: 3549 of 6654
Finished with: 3550 of 6654
Finished with: 3551 of 6654
Finished with: 3552 of 6654
Finished with: 3553 of 6654
Skipping: 3554 due to <20 min or not clas

Finished with: 3779 of 6654
Finished with: 3780 of 6654
Finished with: 3781 of 6654
Finished with: 3782 of 6654
Skipping: 3783 due to <20 min or not classic
Finished with: 3784 of 6654
Finished with: 3785 of 6654
Finished with: 3786 of 6654
Finished with: 3787 of 6654
Finished with: 3788 of 6654
Finished with: 3789 of 6654
Finished with: 3790 of 6654
Skipping: 3791 due to <20 min or not classic
Skipping: 3792 due to <20 min or not classic
Finished with: 3793 of 6654
Skipping: 3794 due to <20 min or not classic
Skipping: 3795 due to <20 min or not classic
Skipping: 3796 due to <20 min or not classic
Finished with: 3797 of 6654
Skipping: 3798 due to <20 min or not classic
Skipping: 3799 due to <20 min or not classic
Finished with: 3800 of 6654
Finished with: 3801 of 6654
Finished with: 3802 of 6654
Finished with: 3803 of 6654
Finished with: 3804 of 6654
Finished with: 3805 of 6654
Finished with: 3806 of 6654
Finished with: 3807 of 6654
Finished with: 3808 of 6654
Finished with: 3809 of 6

Finished with: 4028 of 6654
Finished with: 4029 of 6654
Finished with: 4030 of 6654
Skipping: 4031 due to <20 min or not classic
Finished with: 4032 of 6654
Finished with: 4033 of 6654
Finished with: 4034 of 6654
Skipping: 4035 due to <20 min or not classic
Finished with: 4036 of 6654
Finished with: 4037 of 6654
Finished with: 4038 of 6654
Finished with: 4039 of 6654
Finished with: 4040 of 6654
Finished with: 4041 of 6654
Finished with: 4042 of 6654
Skipping: 4043 due to <20 min or not classic
Finished with: 4044 of 6654
Finished with: 4045 of 6654
Finished with: 4046 of 6654
Skipping: 4047 due to <20 min or not classic
Finished with: 4048 of 6654
Skipping: 4049 due to <20 min or not classic
Skipping: 4050 due to <20 min or not classic
Finished with: 4051 of 6654
Finished with: 4052 of 6654
Finished with: 4053 of 6654
Finished with: 4054 of 6654
Finished with: 4055 of 6654
Skipping: 4056 due to <20 min or not classic
Skipping: 4057 due to <20 min or not classic
Finished with: 4058 of 6

Finished with: 4282 of 6654
Finished with: 4283 of 6654
Finished with: 4284 of 6654
Skipping: 4285 due to <20 min or not classic
Skipping: 4286 due to <20 min or not classic
Finished with: 4287 of 6654
Finished with: 4288 of 6654
Skipping: 4289 due to <20 min or not classic
Skipping: 4290 due to <20 min or not classic
Finished with: 4291 of 6654
Finished with: 4292 of 6654
Finished with: 4293 of 6654
Finished with: 4294 of 6654
Finished with: 4295 of 6654
Skipping: 4296 due to <20 min or not classic
Skipping: 4297 due to <20 min or not classic
Finished with: 4298 of 6654
Finished with: 4299 of 6654
Skipping: 4300 due to <20 min or not classic
Finished with: 4301 of 6654
Finished with: 4302 of 6654
Finished with: 4303 of 6654
Finished with: 4304 of 6654
Finished with: 4305 of 6654
Finished with: 4306 of 6654
Finished with: 4307 of 6654
Finished with: 4308 of 6654
Finished with: 4309 of 6654
Finished with: 4310 of 6654
Skipping: 4311 due to <20 min or not classic
Finished with: 4312 of 6

Finished with: 4530 of 6654
Finished with: 4531 of 6654
Finished with: 4532 of 6654
Finished with: 4533 of 6654
Finished with: 4534 of 6654
Finished with: 4535 of 6654
Finished with: 4536 of 6654
Finished with: 4537 of 6654
Finished with: 4538 of 6654
Finished with: 4539 of 6654
Skipping: 4540 due to <20 min or not classic
Finished with: 4541 of 6654
Finished with: 4542 of 6654
Skipping: 4543 due to <20 min or not classic
Skipping: 4544 due to <20 min or not classic
Finished with: 4545 of 6654
Finished with: 4546 of 6654
Finished with: 4547 of 6654
Finished with: 4548 of 6654
Finished with: 4549 of 6654
Skipping: 4550 due to <20 min or not classic
Skipping: 4551 due to <20 min or not classic
Finished with: 4552 of 6654
Skipping: 4553 due to <20 min or not classic
Finished with: 4554 of 6654
Finished with: 4555 of 6654
Finished with: 4556 of 6654
Finished with: 4557 of 6654
Finished with: 4558 of 6654
Finished with: 4559 of 6654
Finished with: 4560 of 6654
Finished with: 4561 of 6654
Fi

Finished with: 4784 of 6654
Finished with: 4785 of 6654
Finished with: 4786 of 6654
Finished with: 4787 of 6654
Finished with: 4788 of 6654
Finished with: 4789 of 6654
Skipping: 4790 due to <20 min or not classic
Skipping: 4791 due to <20 min or not classic
Finished with: 4792 of 6654
Finished with: 4793 of 6654
Skipping: 4794 due to <20 min or not classic
Finished with: 4795 of 6654
Skipping: 4796 due to <20 min or not classic
Finished with: 4797 of 6654
Finished with: 4798 of 6654
Finished with: 4799 of 6654
Finished with: 4800 of 6654
Skipping: 4801 due to <20 min or not classic
Skipping: 4802 due to <20 min or not classic
Skipping: 4803 due to <20 min or not classic
Skipping: 4804 due to <20 min or not classic
Finished with: 4805 of 6654
Finished with: 4806 of 6654
Skipping: 4807 due to <20 min or not classic
Skipping: 4808 due to <20 min or not classic
Finished with: 4809 of 6654
Finished with: 4810 of 6654
Finished with: 4811 of 6654
Finished with: 4812 of 6654
Finished with: 481

Finished with: 5031 of 6654
Finished with: 5032 of 6654
Finished with: 5033 of 6654
Skipping: 5034 due to <20 min or not classic
Skipping: 5035 due to <20 min or not classic
Finished with: 5036 of 6654
Skipping: 5037 due to <20 min or not classic
Finished with: 5038 of 6654
Finished with: 5039 of 6654
Finished with: 5040 of 6654
Finished with: 5041 of 6654
Finished with: 5042 of 6654
Skipping: 5043 due to <20 min or not classic
Skipping: 5044 due to <20 min or not classic
Finished with: 5045 of 6654
Skipping: 5046 due to <20 min or not classic
Skipping: 5047 due to <20 min or not classic
Finished with: 5048 of 6654
Finished with: 5049 of 6654
Finished with: 5050 of 6654
Finished with: 5051 of 6654
Finished with: 5052 of 6654
Finished with: 5053 of 6654
Finished with: 5054 of 6654
Finished with: 5055 of 6654
Skipping: 5056 due to <20 min or not classic
Skipping: 5057 due to <20 min or not classic
Finished with: 5058 of 6654
Skipping: 5059 due to <20 min or not classic
Finished with: 506

Finished with: 5280 of 6654
Finished with: 5281 of 6654
Finished with: 5282 of 6654
Finished with: 5283 of 6654
Skipping: 5284 due to <20 min or not classic
Skipping: 5285 due to <20 min or not classic
Finished with: 5286 of 6654
Skipping: 5287 due to <20 min or not classic
Skipping: 5288 due to <20 min or not classic
Skipping: 5289 due to <20 min or not classic
Finished with: 5290 of 6654
Skipping: 5291 due to <20 min or not classic
Skipping: 5292 due to <20 min or not classic
Finished with: 5293 of 6654
Finished with: 5294 of 6654
Finished with: 5295 of 6654
Finished with: 5296 of 6654
Skipping: 5297 due to <20 min or not classic
Finished with: 5298 of 6654
Finished with: 5299 of 6654
Skipping: 5300 due to <20 min or not classic
Skipping: 5301 due to <20 min or not classic
Finished with: 5302 of 6654
Finished with: 5303 of 6654
Skipping: 5304 due to <20 min or not classic
Finished with: 5305 of 6654
Skipping: 5306 due to <20 min or not classic
Finished with: 5307 of 6654
Finished wit

Finished with: 5529 of 6654
Finished with: 5530 of 6654
Finished with: 5531 of 6654
Skipping: 5532 due to <20 min or not classic
Finished with: 5533 of 6654
Skipping: 5534 due to <20 min or not classic
Finished with: 5535 of 6654
Finished with: 5536 of 6654
Finished with: 5537 of 6654
Finished with: 5538 of 6654
Skipping: 5539 due to <20 min or not classic
Finished with: 5540 of 6654
Finished with: 5541 of 6654
Finished with: 5542 of 6654
Finished with: 5543 of 6654
Finished with: 5544 of 6654
Finished with: 5545 of 6654
Skipping: 5546 due to <20 min or not classic
Finished with: 5547 of 6654
Finished with: 5548 of 6654
Finished with: 5549 of 6654
Skipping: 5550 due to <20 min or not classic
Finished with: 5551 of 6654
Finished with: 5552 of 6654
Finished with: 5553 of 6654
Finished with: 5554 of 6654
Finished with: 5555 of 6654
Finished with: 5556 of 6654
Skipping: 5557 due to <20 min or not classic
Skipping: 5558 due to <20 min or not classic
Finished with: 5559 of 6654
Finished with

Finished with: 5786 of 6654
Finished with: 5787 of 6654
Finished with: 5788 of 6654
Finished with: 5789 of 6654
Finished with: 5790 of 6654
Skipping: 5791 due to <20 min or not classic
Finished with: 5792 of 6654
Finished with: 5793 of 6654
Finished with: 5794 of 6654
Finished with: 5795 of 6654
Finished with: 5796 of 6654
Finished with: 5797 of 6654
Skipping: 5798 due to <20 min or not classic
Skipping: 5799 due to <20 min or not classic
Finished with: 5800 of 6654
Finished with: 5801 of 6654
Finished with: 5802 of 6654
Finished with: 5803 of 6654
Finished with: 5804 of 6654
Skipping: 5805 due to <20 min or not classic
Skipping: 5806 due to <20 min or not classic
Finished with: 5807 of 6654
Finished with: 5808 of 6654
Finished with: 5809 of 6654
Finished with: 5810 of 6654
Finished with: 5811 of 6654
Finished with: 5812 of 6654
Finished with: 5813 of 6654
Skipping: 5814 due to <20 min or not classic
Finished with: 5815 of 6654
Skipping: 5816 due to <20 min or not classic
Skipping: 581

Finished with: 6037 of 6654
Skipping: 6038 due to <20 min or not classic
Skipping: 6039 due to <20 min or not classic
Finished with: 6040 of 6654
Finished with: 6041 of 6654
Finished with: 6042 of 6654
Finished with: 6043 of 6654
Finished with: 6044 of 6654
Finished with: 6045 of 6654
Finished with: 6046 of 6654
Finished with: 6047 of 6654
Finished with: 6048 of 6654
Finished with: 6049 of 6654
Finished with: 6050 of 6654
Finished with: 6051 of 6654
Skipping: 6052 due to <20 min or not classic
Finished with: 6053 of 6654
Finished with: 6054 of 6654
Finished with: 6055 of 6654
Finished with: 6056 of 6654
Finished with: 6057 of 6654
Finished with: 6058 of 6654
Skipping: 6059 due to <20 min or not classic
Finished with: 6060 of 6654
Finished with: 6061 of 6654
Skipping: 6062 due to <20 min or not classic
Skipping: 6063 due to <20 min or not classic
Finished with: 6064 of 6654
Finished with: 6065 of 6654
Skipping: 6066 due to <20 min or not classic
Skipping: 6067 due to <20 min or not clas

Finished with: 6294 of 6654
Finished with: 6295 of 6654
Finished with: 6296 of 6654
Finished with: 6297 of 6654
Finished with: 6298 of 6654
Skipping: 6299 due to <20 min or not classic
Finished with: 6300 of 6654
Finished with: 6301 of 6654
Finished with: 6302 of 6654
Skipping: 6303 due to <20 min or not classic
Skipping: 6304 due to <20 min or not classic
Skipping: 6305 due to <20 min or not classic
Finished with: 6306 of 6654
Finished with: 6307 of 6654
Finished with: 6308 of 6654
Skipping: 6309 due to <20 min or not classic
Skipping: 6310 due to <20 min or not classic
Finished with: 6311 of 6654
Skipping: 6312 due to <20 min or not classic
Finished with: 6313 of 6654
Finished with: 6314 of 6654
Finished with: 6315 of 6654
Skipping: 6316 due to <20 min or not classic
Finished with: 6317 of 6654
Finished with: 6318 of 6654
Finished with: 6319 of 6654
Finished with: 6320 of 6654
Finished with: 6321 of 6654
Skipping: 6322 due to <20 min or not classic
Skipping: 6323 due to <20 min or no

Finished with: 6543 of 6654
Finished with: 6544 of 6654
Skipping: 6545 due to <20 min or not classic
Finished with: 6546 of 6654
Finished with: 6547 of 6654
Skipping: 6548 due to <20 min or not classic
Finished with: 6549 of 6654
Finished with: 6550 of 6654
Finished with: 6551 of 6654
Skipping: 6552 due to <20 min or not classic
Finished with: 6553 of 6654
Finished with: 6554 of 6654
Finished with: 6555 of 6654
Finished with: 6556 of 6654
Finished with: 6557 of 6654
Finished with: 6558 of 6654
Finished with: 6559 of 6654
Finished with: 6560 of 6654
Skipping: 6561 due to <20 min or not classic
Finished with: 6562 of 6654
Finished with: 6563 of 6654
Finished with: 6564 of 6654
Finished with: 6565 of 6654
Finished with: 6566 of 6654
Skipping: 6567 due to <20 min or not classic
Skipping: 6568 due to <20 min or not classic
Skipping: 6569 due to <20 min or not classic
Finished with: 6570 of 6654
Finished with: 6571 of 6654
Skipping: 6572 due to <20 min or not classic
Finished with: 6573 of 6

In [13]:
#Save this csv
match_info_minute_15.to_csv('match_data_start_4000_end_10657_minute_15.csv', index = False)

In [15]:
#Now that we have the prepared data for the 15 minute mark, go through the same process you did before
match_info_15 = prepare.prep(match_info_minute_15)

In [17]:
#killsplayer_0 can be dropped because it's not an actual player.
match_info_15.drop(columns = ['killsplayer_0'], inplace = True)

In [20]:
#Split into X and y
X, y = match_info_15.drop(columns = ['winningTeam']), match_info_15.winningTeam

In [21]:
#Create dummy vars
X = pd.get_dummies(X, drop_first = True)

In [22]:
#Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 123)

In [24]:
#Create the dict of params to iterate through
rf_dict = {
    'max_depth': range(1, 16),
    'min_samples_leaf': range(1, 16)
}

In [26]:
best_model = get_random_forest_models(X_train, y_train, rf_dict, cv = 5)

Mean Cross-Validated Accuracy:  0.9678
Max Depth:  7
Min Samples Per Leaf:  7
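
The definition of `get_random_forest_models` is not shown in this part of the notebook. As a rough sketch only, a helper with the same call shape could wrap `GridSearchCV` as below. The name `grid_search_random_forest`, the `n_estimators` value, the accuracy scoring, and the synthetic demo data are all assumptions for illustration, not the notebook's actual implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def grid_search_random_forest(X, y, param_grid, cv=5, random_state=123):
    """Hypothetical stand-in for get_random_forest_models (an assumption)."""
    search = GridSearchCV(
        # Small forest so the demo runs quickly; tune n_estimators as needed
        RandomForestClassifier(n_estimators=25, random_state=random_state),
        param_grid=param_grid,
        cv=cv,
        scoring="accuracy",
    )
    search.fit(X, y)
    # best_score_ is the mean cross-validated score of the best parameter set
    print("Mean Cross-Validated Accuracy: ", round(search.best_score_, 4))
    print("Max Depth: ", search.best_params_["max_depth"])
    print("Min Samples Per Leaf: ", search.best_params_["min_samples_leaf"])
    # With the default refit=True, this estimator is refit on all of X, y
    return search.best_estimator_


# Synthetic data and a tiny grid, purely for demonstration
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)
demo_grid = {"max_depth": [2, 4], "min_samples_leaf": [1, 5]}
demo_model = grid_search_random_forest(X_demo, y_demo, demo_grid, cv=3)
```

Because the returned estimator is refit on the full training data, its `feature_importances_` attribute is available right away.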


In [27]:
#What were the most important features?
best_features = pd.DataFrame(best_model.feature_importances_, X_train.columns)
best_features.sort_values(by = 0, ascending = False).head(10)

Unnamed: 0,0
towers_lost_team200,0.142449
inhibs_lost_team100,0.127329
inhibs_lost_team200,0.122605
towers_lost_team100,0.121551
baron_team100,0.066082
dragon_team100,0.055502
baron_team200,0.04106
dragon_team200,0.034043
RedTeamTotalGold,0.023914
BlueTeamTotalGold,0.022807


# Build Models For New Data at 15 Minutes

In [4]:
#Load the extracted dataframe for the new data
extracted_df = pd.read_csv('new_extracted_data_smith.csv')

In [6]:
#Prepare the extracted data
import prepare

train, test = prepare.prepare(extracted_df)

In [7]:
train.shape, test.shape

((961, 230), (241, 230))

In [17]:
#Drop columns that are categorical. These columns don't offer any value
cols_to_drop = train.select_dtypes('object').columns
cols_to_drop

train.drop(columns = cols_to_drop, inplace = True)
test.drop(columns = cols_to_drop, inplace = True)

In [18]:
train.shape, test.shape

((961, 225), (241, 225))
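
The column drop above can be tried on a toy frame first. The sketch below is an illustration under assumptions: the `gameMode` column and all values are made up, and it swaps in `select_dtypes(exclude="number")` for `select_dtypes('object')` so the result does not depend on how pandas types string columns. It also avoids `inplace=True` by reassigning.

```python
import pandas as pd

# Toy stand-ins for train and test; 'gameMode' and the values are hypothetical
toy_train = pd.DataFrame({
    "gameMode": ["CLASSIC", "CLASSIC", "ARAM"],
    "BlueTeamXp": [100, 200, 150],
    "winningTeam": [100, 200, 100],
})
toy_test = pd.DataFrame({
    "gameMode": ["CLASSIC"],
    "BlueTeamXp": [175],
    "winningTeam": [200],
})

# Pick the non-numeric columns from the training frame only,
# then drop the same columns from both frames so their shapes stay aligned
non_numeric = toy_train.select_dtypes(exclude="number").columns
toy_train = toy_train.drop(columns=non_numeric)
toy_test = toy_test.drop(columns=non_numeric)

print(toy_train.shape, toy_test.shape)
```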

In [19]:
#Now split into X and y groups
X_train, X_test = train.drop(columns = ['winningTeam']), test.drop(columns = ['winningTeam'])
y_train, y_test = train.winningTeam, test.winningTeam

In [20]:
#Create the dict of params to iterate through
rf_dict = {
    'max_depth': range(1, 16),
    'min_samples_leaf': range(1, 16)
}

In [21]:
#Now create models and return the best one
best_model = get_random_forest_models(X_train, y_train, rf_dict, cv = 5)

Mean Cross-Validated Accuracy:  0.9417
Max Depth:  11
Min Samples Per Leaf:  1


In [27]:
#What were the top ten features?
def get_best_features(model, X_train, num_features = 10):
    """
    This function gets the best features of the desired model and prints them out.
    You can change how many features are shown with num_features.
    This function returns nothing.
    """
    #Rank features by importance, using the model passed in rather than the global best_model
    best_features = pd.DataFrame(model.feature_importances_, X_train.columns)
    print(best_features.sort_values(by = 0, ascending = False).head(num_features))

In [28]:
get_best_features(best_model, X_train)

                                   0
towers_lost_team200         0.148232
towers_lost_team100         0.127159
inhibs_lost_team100         0.121042
inhibs_lost_team200         0.098885
dragon_team200              0.028439
dragon_team100              0.023849
baron_team100               0.022604
baron_team200               0.021653
RedTeamTotalGoldDifference  0.014865
BlueTeamXp                  0.009312
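
A ranking like the one above can also be returned as a value instead of printed, which makes it easier to compare the 15-minute and 10-minute models later. This is a minimal sketch, assuming a fitted tree-based model; the name `top_features` and the synthetic data are hypothetical and not part of the notebook.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def top_features(model, feature_names, num_features=10):
    """Return the largest feature importances as a named, sorted Series."""
    importances = pd.Series(model.feature_importances_, index=feature_names)
    return importances.sort_values(ascending=False).head(num_features)


# Synthetic data with made-up feature names, purely for demonstration
X_toy, y_toy = make_classification(n_samples=150, n_features=6, random_state=0)
toy_names = [f"feature_{i}" for i in range(X_toy.shape[1])]
toy_model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_toy, y_toy)

top_three = top_features(toy_model, toy_names, num_features=3)
print(top_three)
```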


# Build Models For New Data at 10 Minutes

In [1]:
import acquire
import prepare
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score, GridSearchCV, train_test_split, cross_validate
from sklearn.ensemble import RandomForestClassifier

__Extract Function Does Not Work For Time = 10__

I need to wait until Joshua can take a look and fix the issue. However, this is not top priority.

In [None]:
#Need to extract data for the 10 minute mark
extracted_data = acquire.build_extracted_df(username = 'smith', path = './new_data/', time = 10)