# Final Model Performance
This notebook reports the performance of the final models after hyperparameter tuning. The best hyperparameters were found in `model.ipynb` for the offense-only model and in `full_model.ipynb` for the full model.

## Library Imports

In [None]:
from pipeline import FullPipeWrapper
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

## Pipeline and Model Instantiation and Fitting
Next, we instantiated the pipeline wrapper class, which reads the positional data when it is created.

In [2]:
full_pipe = FullPipeWrapper()

positional data already downloaded.
reading positional data.
returning positional data.


Then, we instantiated the model and defined the best parameters for the offense-only model predicting the x grid.

In [32]:
# Instantiate model
lr_model = LogisticRegression(random_state=0)

# Create list of columns to reference for offense-only model
perm_feats = ['possessionTeam','down','offenseFormation']
situational = ['score_differential', 'timeRemaining','yardline_first_dir','yardline_100_dir']
reduction = ['perc_left','perc_right','perc_behind_los']
pos = ['FB','HB','QB','RB','TE','WR']
coords = ['FBL0_x','FBR0_x','HBL0_x','HBL1_x','HBR0_x','HBR1_x','QB0_x','QB1_x','RBL0_x','RBL1_x','RBL2_x','RBR0_x','RBR1_x','RBR2_x','TEL0_x','TEL1_x','TEL2_x','TER0_x','TER1_x','TER2_x','WRL0_x','WRL1_x','WRL2_x','WRL3_x','WRR0_x','WRR1_x','WRR2_x','WRR3_x','FBL0_y','FBR0_y','HBL0_y','HBL1_y','HBR0_y','HBR1_y','QB0_y','QB1_y','RBL0_y','RBL1_y','RBL2_y','RBR0_y','RBR1_y','RBR2_y','TEL0_y','TEL1_y','TEL2_y','TER0_y','TER1_y','TER2_y','WRL0_y','WRL1_y','WRL2_y','WRL3_y','WRR0_y','WRR1_y','WRR2_y','WRR3_y','FBL0_in','FBR0_in','HBL0_in','HBL1_in','HBR0_in','HBR1_in','QB0_in','QB1_in','RBL0_in','RBL1_in','RBL2_in','RBR0_in','RBR1_in','RBR2_in','TEL0_in','TEL1_in','TEL2_in','TER0_in','TER1_in','TER2_in','WRL0_in','WRL1_in','WRL2_in','WRL3_in','WRR0_in','WRR1_in','WRR2_in','WRR3_in']

# Create dictionary with best parameters for offensive x prediction.
best_params_off_x = {'model__C': [1],
                     'model__class_weight': ['balanced'],
                     'model__penalty': ['l2'],
                     'model__solver': ['newton-cg'],
                     'off_full_pipe__full_cols__select_cols__columns': [perm_feats + situational + coords]}

# Create pipeline object and grid search, then fit.
lr_pipe_off_x = full_pipe.build_pipe(side='off', model=lr_model)
best_off_x = GridSearchCV(lr_pipe_off_x, best_params_off_x, cv=5, scoring="f1_macro")
best_off_x.fit(full_pipe.X_train, full_pipe.y_train_x)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('select_cols',
                                        FunctionTransformer(func=<function FullPipeWrapper.build_pipe.<locals>.<lambda> at 0x7fae1cc73c10>)),
                                       ('off_full_pipe',
                                        Pipeline(steps=[('full_cols',
                                                         Pipeline(steps=[('off_pre_one',
                                                                          ColumnTransformer(transformers=[('info_scale',
                                                                                                           StandardScaler(),
                                                                                                           Index(['perc_left', 'perc_right', 'perc_behind_los', 'FB...
                         'off_full_pipe__full_cols__select_cols__columns': [['possessionTeam',
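
A grid with a single value per hyperparameter, like `best_params_off_x`, makes the search fit exactly one candidate, so the grid search here mainly serves to refit the tuned settings and report a 5-fold cross-validated macro F1. The sketch below illustrates this on synthetic data with a plain scikit-learn pipeline. The dataset, the step names `scale` and `model`, and the variable names are illustrative assumptions and are not part of `FullPipeWrapper`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 3-class data standing in for the real features (assumption).
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)

# Hypothetical two-step pipeline; parameters are addressed as <step>__<param>.
pipe = Pipeline([("scale", StandardScaler()),
                 ("model", LogisticRegression(random_state=0, max_iter=1000))])

# One value per key means one candidate in the grid.
single_grid = {"model__C": [1], "model__class_weight": ["balanced"]}

search = GridSearchCV(pipe, single_grid, cv=5, scoring="f1_macro")
search.fit(X, y)

n_candidates = len(search.cv_results_["params"])
cv_macro_f1 = search.best_score_  # mean macro F1 over the 5 folds
print(n_candidates, round(cv_macro_f1, 3))
```

After fitting, `search.predict` uses the pipeline refit on all of the training data, which is how the fitted searches are used on the test set later in this notebook.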
                                                            

Next, we created a dictionary holding the best offense-only hyperparameters for predicting the y grid and fitted a grid search with it.

In [35]:
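# Note: with penalty 'none' the model is unregularized, so scikit-learn ignores C.
# Newer scikit-learn releases spell this option as penalty=None.
best_params_off_y = {'model__C': [0.001],
                      'model__class_weight': ['balanced'],
                      'model__penalty': ['none'],
                      'model__solver': ['newton-cg'],
                      'off_full_pipe__full_cols__select_cols__columns': [perm_feats + situational + coords]}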
lr_pipe_off_y = full_pipe.build_pipe(side='off', model=lr_model)
best_off_y = GridSearchCV(lr_pipe_off_y, best_params_off_y, cv=5, scoring = 'f1_macro')
best_off_y.fit(full_pipe.X_train, full_pipe.y_train_y)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('select_cols',
                                        FunctionTransformer(func=<function FullPipeWrapper.build_pipe.<locals>.<lambda> at 0x7fadd2382790>)),
                                       ('off_full_pipe',
                                        Pipeline(steps=[('full_cols',
                                                         Pipeline(steps=[('off_pre_one',
                                                                          ColumnTransformer(transformers=[('info_scale',
                                                                                                           StandardScaler(),
                                                                                                           Index(['perc_left', 'perc_right', 'perc_behind_los', 'FB...
                         'off_full_pipe__full_cols__select_cols__columns': [['possessionTeam',
                                                            

We then created a dictionary holding the best parameters for the full model predicting the x grid and fitted a grid search over the full pipeline with these hyperparameters.

In [43]:
best_params_full_x = {'full_pipe__def__def_clust_pass__def_clust__cols': [['%B', '%M', '%Z']],
                      'full_pipe__def__def_clust_pass__def_clust__n_clusters': [13],
                      'full_pipe__def__def_clust_pass__def_clust__pca_variance': [0.55],
                      'full_pipe__def__def_clust_pass__pass__select_cols__columns': [[]],
                      'model__C': [100],
                      'model__multi_class': ['auto'],
                      'model__penalty': ['l2'],
                      'model__solver': ['newton-cg']}
lr_pipe_full_x = full_pipe.build_pipe(side='both', model=lr_model)
best_full_x = GridSearchCV(lr_pipe_full_x, best_params_full_x, cv=5, scoring = 'f1_macro')
best_full_x.fit(full_pipe.X_train, full_pipe.y_train_x)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('full_pipe',
                                        ColumnTransformer(transformers=[('off',
                                                                         Pipeline(steps=[('full_cols',
                                                                                          Pipeline(steps=[('off_pre_one',
                                                                                                           ColumnTransformer(transformers=[('info_scale',
                                                                                                                                            StandardScaler(),
                                                                                                                                            Index(['perc_left', 'perc_right', 'perc_behind_los', 'FB', 'HB', 'QB', 'RB',
       'TE', 'WR', 'score_differential', 'timeRemaining', 'yardline_firs...
             param

We then did the same for the full model predicting the y grid.

In [46]:
best_params_full_y = {'full_pipe__def__def_clust_pass__def_clust__cols': [['DB',
                                                     'LB',
                                                     'DL',
                                                     '%B',
                                                     '%M',
                                                     '%Z']],
                      'full_pipe__def__def_clust_pass__def_clust__n_clusters': [5],
                      'full_pipe__def__def_clust_pass__def_clust__pca_variance': [0.95],
                      'full_pipe__def__def_clust_pass__pass__select_cols__columns': [full_pipe.def_start_col_x],
                      'model__C': [0.1],
                      'model__multi_class': ['auto'],
                      'model__penalty': ['l2'],
                      'model__solver': ['lbfgs']}
lr_pipe_full_y = full_pipe.build_pipe(side='both', model=lr_model)
best_full_y = GridSearchCV(lr_pipe_full_y, best_params_full_y, cv=5, scoring = 'f1_macro')
best_full_y.fit(full_pipe.X_train, full_pipe.y_train_y)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('full_pipe',
                                        ColumnTransformer(transformers=[('off',
                                                                         Pipeline(steps=[('full_cols',
                                                                                          Pipeline(steps=[('off_pre_one',
                                                                                                           ColumnTransformer(transformers=[('info_scale',
                                                                                                                                            StandardScaler(),
                                                                                                                                            Index(['perc_left', 'perc_right', 'perc_behind_los', 'FB', 'HB', 'QB', 'RB',
       'TE', 'WR', 'score_differential', 'timeRemaining', 'yardline_firs...
                  

## Model Predictions and Performance
### Full Model X Grid
The full model performed about as well on the test set as on the training set. Recall for the second quadrant was particularly high at 77%, while recall for the fourth quadrant was only 1%. Overall accuracy was 46%, well above the 25% expected from uniform guessing across four classes.

In [47]:
from sklearn.metrics import classification_report

y_pred_full_x = best_full_x.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_x, y_pred_full_x))

              precision    recall  f1-score   support

           1       0.40      0.17      0.24       551
           2       0.48      0.77      0.59      1213
           3       0.43      0.45      0.44       806
           4       0.18      0.01      0.03       492

    accuracy                           0.46      3062
   macro avg       0.37      0.35      0.32      3062
weighted avg       0.40      0.46      0.40      3062
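
The `macro avg` and `weighted avg` rows can differ noticeably when classes are imbalanced, as they are here. The sketch below recomputes both averages of the F1 column from the rounded per-class values in the report above. The function and variable names are illustrative, and because the inputs are rounded to two decimals the results can differ from the report in the last digit.

```python
# Per-class F1 scores and supports copied from the report above.
f1_by_class = {1: 0.24, 2: 0.59, 3: 0.44, 4: 0.03}
support = {1: 551, 2: 1213, 3: 806, 4: 492}


def macro_average(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_average(scores, weights):
    """Mean weighted by class support: frequent classes count more."""
    total = sum(weights.values())
    return sum(scores[label] * weights[label] for label in scores) / total


macro_f1 = macro_average(f1_by_class)
weighted_f1 = weighted_average(f1_by_class, support)
print(f"macro F1 ~ {macro_f1:.3f}, weighted F1 ~ {weighted_f1:.3f}")
```

The macro average is pulled down by the weak fourth quadrant, whereas the weighted average is dominated by the second quadrant, which has the largest support.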



### Full Model Y Grid
The full model performed relatively poorly at predicting the Y grid. Its macro F1 of 29% was close to the 32% it achieved on the X grid, but its accuracy was far lower at 30% versus 46%.

In [48]:
y_pred_full_y = best_full_y.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_y, y_pred_full_y))

              precision    recall  f1-score   support

           0       0.32      0.27      0.29       806
           1       0.29      0.41      0.34       842
           2       0.28      0.33      0.30       741
           3       0.31      0.16      0.21       673

    accuracy                           0.30      3062
   macro avg       0.30      0.29      0.29      3062
weighted avg       0.30      0.30      0.29      3062
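
Each value in the `f1-score` column is the harmonic mean of that row's precision and recall, which is why a class with low recall, such as class 3, ends up with a low F1 even when its precision is comparable to the others. A minimal sketch using the rounded values from the report above (the helper name is illustrative):

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the report above.
rows = {0: (0.32, 0.27), 1: (0.29, 0.41), 2: (0.28, 0.33), 3: (0.31, 0.16)}

f1_by_class = {label: f1_from(p, r) for label, (p, r) in rows.items()}
for label, f1 in f1_by_class.items():
    print(label, round(f1, 2))
```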



### Offensive Model X Grid
The offense-only model had a higher macro F1 than the full model (36% vs 32%). However, its accuracy was substantially lower (38% vs 46%), and it performed worse on the most common quadrant, the second, where recall fell from 77% to 44%.

In [49]:
y_pred_off_x = best_off_x.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_x, y_pred_off_x))

              precision    recall  f1-score   support

           1       0.28      0.39      0.33       551
           2       0.54      0.44      0.48      1213
           3       0.42      0.35      0.38       806
           4       0.21      0.28      0.24       492

    accuracy                           0.38      3062
   macro avg       0.36      0.36      0.36      3062
weighted avg       0.41      0.38      0.39      3062
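
One difference between the two X-grid configurations is that the offense-only parameters set `class_weight='balanced'` while the full-model parameters do not, which may contribute to the trade-off between macro F1 and accuracy seen here. scikit-learn documents the balanced weights as `n_samples / (n_classes * class_count)`. The sketch below applies that formula to the test-set supports purely as an illustration; the fitted model derives its weights from the training labels, which are not shown in this notebook.

```python
# Test-set supports from the report above, used only for illustration.
support = {1: 551, 2: 1213, 3: 806, 4: 492}


def balanced_class_weights(counts):
    """Weights in the style of class_weight='balanced'."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


weights = balanced_class_weights(support)
for label, weight in weights.items():
    print(label, round(weight, 2))
```

Under this weighting, errors on the rarer first and fourth quadrants cost more than errors on the second, which is consistent with the higher recall on classes 1 and 4 and the lower recall on class 2 in the report above.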



### Offensive Model Y Grid
The offense-only model had a slightly higher macro F1 score than the full model (30% vs 29%), and both reached 30% accuracy. Its F1 score was lower for class 1 (0.26 vs 0.34), marginally higher for class 2 (0.31 vs 0.30), and clearly higher for class 3 (0.29 vs 0.21).

In [50]:
y_pred_off_y = best_off_y.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_y, y_pred_off_y))

              precision    recall  f1-score   support

           0       0.32      0.33      0.33       806
           1       0.30      0.23      0.26       842
           2       0.29      0.33      0.31       741
           3       0.28      0.30      0.29       673

    accuracy                           0.30      3062
   macro avg       0.30      0.30      0.30      3062
weighted avg       0.30      0.30      0.30      3062



### Dummy Classifier (Most Frequent) X Grid

As a baseline, we fitted a dummy classifier that always predicts the most frequent class. Its accuracy on the X grid (40%) was lower than the full model's (46%) but higher than the offense-only model's (38%).

In [51]:
from sklearn.dummy import DummyClassifier

dummy_x = DummyClassifier(strategy="most_frequent")
dummy_x.fit(full_pipe.X_train, full_pipe.y_train_x)

DummyClassifier(strategy='most_frequent')

In [52]:
y_pred_dummy_x = dummy_x.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_x, y_pred_dummy_x))

              precision    recall  f1-score   support

           1       0.00      0.00      0.00       551
           2       0.40      1.00      0.57      1213
           3       0.00      0.00      0.00       806
           4       0.00      0.00      0.00       492

    accuracy                           0.40      3062
   macro avg       0.10      0.25      0.14      3062
weighted avg       0.16      0.40      0.22      3062
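
The numbers in this report follow directly from the class supports: a classifier that always predicts one class has accuracy equal to that class's share of the test set, recall 1.0 for it, and zero scores elsewhere. The sketch below reproduces the figures from the supports above, assuming (as the report indicates) that the most frequent training class is also class 2.

```python
# Test-set supports from the report above.
support = {1: 551, 2: 1213, 3: 806, 4: 492}

majority = max(support, key=support.get)
total = sum(support.values())

# Always predicting the majority class: accuracy equals its share of the
# data, which is also its precision; its recall is 1.0.
accuracy = support[majority] / total
precision, recall = accuracy, 1.0
f1_majority = 2 * precision * recall / (precision + recall)

# Every other class has F1 = 0, so the macro average divides by 4 classes.
macro_f1 = f1_majority / len(support)
print(majority, round(accuracy, 2), round(f1_majority, 2), round(macro_f1, 2))
```

This is why the baseline reaches 40% accuracy while its macro F1 is only 14%: accuracy rewards the majority class, whereas macro F1 penalizes the three classes that are never predicted.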



### Dummy Classifier (Stratified) X Grid
We used a stratified dummy classifier as a baseline for the macro F1 scores. Its macro F1 was 24%, lower than both the full model (32%) and the offense-only model (36%).

In [61]:
# Stratified baseline for the X grid: samples labels from the training class distribution.
dummy_strat_x = DummyClassifier(strategy="stratified", random_state=0)
dummy_strat_x.fit(full_pipe.X_train, full_pipe.y_train_x)
print(classification_report(full_pipe.y_test_x, dummy_strat_x.predict(full_pipe.X_test)))

              precision    recall  f1-score   support

           1       0.17      0.18      0.17       551
           2       0.40      0.40      0.40      1213
           3       0.27      0.28      0.27       806
           4       0.13      0.12      0.12       492

    accuracy                           0.28      3062
   macro avg       0.24      0.24      0.24      3062
weighted avg       0.28      0.28      0.28      3062
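
The stratified baseline's accuracy can be sanity-checked analytically. If predictions are drawn at random with class probabilities p_i, and the true labels follow the same proportions, the expected accuracy is the sum of p_i squared and the expected recall of class i is p_i. The sketch below uses the test-set supports for both distributions, which assumes the training and test class proportions are roughly equal (not confirmed by this notebook).

```python
# Test-set supports from the report above; assumed to mirror the training mix.
support = {1: 551, 2: 1213, 3: 806, 4: 492}
total = sum(support.values())

proportions = {label: count / total for label, count in support.items()}

# P(correct) = sum over classes of P(true = i) * P(predicted = i).
expected_accuracy = sum(p * p for p in proportions.values())

# Expected recall per class equals the probability of predicting that class.
expected_recall = dict(proportions)
print(round(expected_accuracy, 2))
print({label: round(r, 2) for label, r in expected_recall.items()})
```

Under this assumption the expected accuracy comes out at about 28%, in line with the report above.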



### Dummy Classifier (Most Frequent) Y Grid
The dummy classifier predicting the most frequent class had lower accuracy (27%) than both the full model and the offense-only model (30% each).

In [54]:
dummy_y = DummyClassifier(strategy="most_frequent")
dummy_y.fit(full_pipe.X_train, full_pipe.y_train_y)
y_pred_dummy_y = dummy_y.predict(full_pipe.X_test)
print(classification_report(full_pipe.y_test_y, y_pred_dummy_y))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       806
           1       0.27      1.00      0.43       842
           2       0.00      0.00      0.00       741
           3       0.00      0.00      0.00       673

    accuracy                           0.27      3062
   macro avg       0.07      0.25      0.11      3062
weighted avg       0.08      0.27      0.12      3062



### Dummy Classifier (Stratified) Y Grid
A stratified dummy classifier had a macro F1 score of 25%, lower than the 29% and 30% achieved by the full model and the offense-only model, respectively.

In [60]:
# Stratified baseline for the Y grid: samples labels from the training class distribution.
dummy_strat_y = DummyClassifier(strategy="stratified", random_state=0)
dummy_strat_y.fit(full_pipe.X_train, full_pipe.y_train_y)
print(classification_report(full_pipe.y_test_y, dummy_strat_y.predict(full_pipe.X_test)))

              precision    recall  f1-score   support

           0       0.25      0.22      0.24       806
           1       0.29      0.29      0.29       842
           2       0.23      0.25      0.24       741
           3       0.23      0.23      0.23       673

    accuracy                           0.25      3062
   macro avg       0.25      0.25      0.25      3062
weighted avg       0.25      0.25      0.25      3062
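
To compare the models side by side, the headline numbers from the reports in this notebook can be collected into one structure. The sketch below copies the accuracy and macro F1 values by hand and computes each model's macro F1 gain over the stratified baseline. The dictionary layout and helper name are illustrative assumptions, not part of the project code.

```python
# (accuracy, macro F1) copied from the classification reports in this notebook.
results = {
    "x": {"full": (0.46, 0.32), "offense_only": (0.38, 0.36),
          "dummy_most_frequent": (0.40, 0.14), "dummy_stratified": (0.28, 0.24)},
    "y": {"full": (0.30, 0.29), "offense_only": (0.30, 0.30),
          "dummy_most_frequent": (0.27, 0.11), "dummy_stratified": (0.25, 0.25)},
}


def macro_f1_lift(rows, baseline="dummy_stratified"):
    """Macro F1 gain of each non-dummy model over the chosen baseline."""
    base = rows[baseline][1]
    return {name: round(f1 - base, 2)
            for name, (_, f1) in rows.items()
            if not name.startswith("dummy")}


lift = {grid: macro_f1_lift(rows) for grid, rows in results.items()}
best_macro_f1 = {grid: max(rows, key=lambda name: rows[name][1])
                 for grid, rows in results.items()}

for grid in results:
    print(grid, lift[grid], "best macro F1:", best_macro_f1[grid])
```

On these numbers, both models beat the stratified baseline on macro F1 for both grids, with a larger margin on the X grid than on the Y grid.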

