# Model Insights
---

## Objective
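Compare the two best models, one for players with below-average club head speed (the slow model) and one for players with above-average club head speed (the fast model), and interpret their coefficients to see which statistics best explain total strokes gained.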

-----
#### External Libraries Import

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pickle
import warnings
warnings.filterwarnings('ignore')

#### Read Data and Models

In [2]:
df_pga = pd.read_csv('../Data/Sets/final_model.csv')
# load the pickled models, closing each file handle after reading
with open('../Best_Models/slow_model.pk', 'rb') as f:
    slow_model = pickle.load(f)
with open('../Best_Models/fast_model.pk', 'rb') as f:
    fast_model = pickle.load(f)

#### Prepare Data

In [3]:
features = [
    col for col in df_pga.columns if col not in ['date', 'finish', 
                                                  'player', 'event', 
                                                  'sg:_off-the-tee',
                                                  'sg:_approach-the-green',
                                                  'sg:_around-the-green',
                                                  'sg:_putting',
                                                  'sg:_total']]

X = df_pga[features]
y = df_pga['sg:_total']

# train, test split
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                        test_size = 0.3, random_state = 77)

# standardize the data using StandardScaler
ss = StandardScaler()
X_train_sc = ss.fit_transform(X_train)
X_test_sc = ss.transform(X_test)

## <u>Compare Models</u>

### Score Models on Full Dataset

In [4]:
print(f'The slow model explains {round((slow_model.score(X_test_sc, y_test)*100), 2)}%\
 of variation in total strokes gained.')
print(f'The fast model explains {round((fast_model.score(X_test_sc, y_test)*100), 2)}%\
 of variation in total strokes gained.')

The slow model explains 58.41% of variation in total strokes gained.
The fast model explains 61.32% of variation in total strokes gained.


#### Takeaways:
- Although the two models' explanatory power differs by only about three percentage points (58.41% versus 61.32%), that gap is meaningful in the unpredictable game of golf.
- The difference suggests that performance is harder to explain for golfers who swing the club more slowly.
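
For context, and assuming the pickled models are scikit-learn regressors, `score` returns the coefficient of determination, R² = 1 − SS_res / SS_tot. The following is a minimal pure-Python sketch of that calculation on made-up numbers (not the PGA data); the helper name `r_squared` is hypothetical. It also computes the gap between the two scores reported above.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # residual sum of squares: error left over after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# toy targets and predictions, purely illustrative (assumed values)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
toy_r2 = r_squared(y_true, y_pred)
print(f"toy R^2: {toy_r2:.2f}")

# gap between the two reported scores, in percentage points
gap = round(61.32 - 58.41, 2)
print(f"fast minus slow: {gap} percentage points")
```

An R² of 0.61 versus 0.58 means the fast model leaves about three percentage points less of the variance in total strokes gained unexplained.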

### Interpret Coefficients

#### Below Average Club Head Speed Players

In [12]:
slow = pd.DataFrame(slow_model.coef_, columns=['slow_coefs'])
slow['slow_abs_coefs'] = abs(slow_model.coef_)
slow.index = X_train.columns
slow = slow.sort_values('slow_abs_coefs', ascending=False).head(12)

# create standard deviation column (pandas aligns it on the feature index)
slow['std._dev'] = df_pga[slow.index].std()
slow

| feature | slow_coefs | slow_abs_coefs | std._dev |
|---|---|---|---|
| `greens_in_regulation_percentage` | 0.547266 | 0.547266 | 7.533094 |
| `scrambling` | 0.401197 | 0.401197 | 10.716159 |
| `putting_average` | -0.353627 | 0.353627 | 0.078363 |
| `overall_putting_average` | -0.175324 | 0.175324 | 0.074604 |
| `going_for_the_green_-_hit_green_pct.` | 0.100728 | 0.100728 | 18.173266 |
| `going_for_the_green_-_birdie_or_better` | 0.072928 | 0.072928 | 20.238183 |
| `putting_from_-_10-25'` | 0.070319 | 0.070319 | 27.231458 |
| `club_head_speed` | 0.060038 | 0.060038 | 4.235433 |
| `3-putt_avoidance` | -0.048774 | 0.048774 | 2.034111 |
| `sand_save_percentage` | 0.043033 | 0.043033 | 22.914026 |
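
Because the models were fit on standardized features, each coefficient is the change in strokes gained per one-standard-deviation change in that feature. One way to read them in natural units is to divide each coefficient by the feature's standard deviation. The sketch below does this for three rows copied from the slow-model output above; the helper name `per_unit_effect` is hypothetical. The result is only approximate, since the `std._dev` column comes from the full dataset while the scaler was fit on the training split.

```python
# (standardized coefficient, std. dev.) copied from the slow-model output above
slow_top = {
    "greens_in_regulation_percentage": (0.547266, 7.533094),
    "scrambling": (0.401197, 10.716159),
    "putting_average": (-0.353627, 0.078363),
}


def per_unit_effect(coef, std_dev):
    """Approximate change in strokes gained per one raw unit of the feature."""
    return coef / std_dev


per_unit = {name: per_unit_effect(c, s) for name, (c, s) in slow_top.items()}

for name, effect in per_unit.items():
    print(f"{name}: {effect:+.4f} strokes gained per unit")
```

Under these assumptions, one extra percentage point of greens in regulation is worth roughly 0.07 strokes gained for a slow swing speed player.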


#### Above Average Club Head Speed Players

In [11]:
fast = pd.DataFrame(fast_model.coef_, columns=['fast_coefs'])
fast['fast_abs_coefs'] = abs(fast_model.coef_)
fast.index = X_train.columns
fast = fast.sort_values('fast_abs_coefs', ascending=False).head(12)

# create standard deviation column (pandas aligns it on the feature index)
fast['std._dev'] = df_pga[fast.index].std()
fast

| feature | fast_coefs | fast_abs_coefs | std._dev |
|---|---|---|---|
| `greens_in_regulation_percentage` | 0.785991 | 0.785991 | 7.533094 |
| `overall_putting_average` | -0.497467 | 0.497467 | 0.074604 |
| `scrambling` | 0.471522 | 0.471522 | 10.716159 |
| `putting_average` | -0.339779 | 0.339779 | 0.078363 |
| `going_for_the_green_-_hit_green_pct.` | 0.18505 | 0.18505 | 18.173266 |
| `one-putt_percentage` | -0.108674 | 0.108674 | 6.335913 |
| `fairway_proximity` | 0.107791 | 0.107791 | 52.124938 |
| `putts_per_round` | -0.09894 | 0.09894 | 1.342851 |
| `going_for_the_green` | -0.094355 | 0.094355 | 19.28453 |
| `scrambling_from_10-30_yards` | -0.082979 | 0.082979 | 33.867141 |


## Comparison of Slow Swing Golfers to Fast Swing Golfers

- Size of coefficients
    - The first thing to notice in this comparison is the magnitude of the coefficients. The fast swing speed model produced larger coefficients than the slow swing speed model for most of the top features.
    - A one-standard-deviation increase (0.0746 putts) in a player's overall putting average lowers a fast swing speed player's strokes gained by 0.497, but lowers a slow swing speed player's strokes gained by only 0.175.
<br><br>
- Club head speed
    - The strength of club head speed as a predictor is similar for both models. It increases a player's strokes gained by 0.06 to 0.07 for every one-standard-deviation (4.24 mph) increase.
<br><br>
- Interesting takeaways
    - A higher going-for-the-green percentage hurts a golfer with a fast swing speed (coefficient of -0.094). This suggests that there is a sweet spot for going-for-the-green percentage; there are times when a player should not go for the green.
    - Putting features weigh more heavily on golfers with a fast swing speed than on golfers with a slow swing speed, most notably overall putting average (-0.497 versus -0.175).
    - Important putts from 6 feet, 10 feet, and 10-25 feet are stronger predictors for slower swing players. This may be because they have to make up for the strokes they lost before getting to the green.
    - For players with a slow swing speed, the features that account for total strokes gained are spread more evenly across the putting and going-for-the-green statistics. This reinforces the point that their success is more difficult to explain.
    - For players with a fast swing speed, it is clear that their success comes from hitting greens in regulation, getting close to the hole from the fairway, and getting up and down around the green.
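
To make the size comparison above concrete, the sketch below takes the five features that appear in both outputs and computes the ratio of fast-model to slow-model coefficient magnitudes. The values are copied from the outputs above; the variable names are illustrative only.

```python
# standardized coefficients copied from the two outputs above
slow_coefs = {
    "greens_in_regulation_percentage": 0.547266,
    "scrambling": 0.401197,
    "putting_average": -0.353627,
    "overall_putting_average": -0.175324,
    "going_for_the_green_-_hit_green_pct.": 0.100728,
}
fast_coefs = {
    "greens_in_regulation_percentage": 0.785991,
    "scrambling": 0.471522,
    "putting_average": -0.339779,
    "overall_putting_average": -0.497467,
    "going_for_the_green_-_hit_green_pct.": 0.18505,
}

# ratio > 1 means the feature carries more weight in the fast model
shared = [name for name in slow_coefs if name in fast_coefs]
ratios = {name: abs(fast_coefs[name]) / abs(slow_coefs[name]) for name in shared}

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: fast/slow = {ratio:.2f}")
```

On these numbers the largest gap is in overall putting average, while putting average is the one shared feature whose magnitude is slightly smaller in the fast model.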

## Conclusion

The biggest takeaway is that golfers with slower club head speeds make up for it in ways these models do not fully explain. A fast club head speed helps players perform well because the ball travels farther, not only off the tee but from the fairway as well. These players gain enough strokes off the tee and on approach shots that they give themselves room to miss a few putts, whereas golfers with slower swing speeds have to make up for it by scrambling around the green and making long putts.
<br><br>
Increasing your club head speed as a golfer is difficult, and there are physical and biological limits on improving it. So how do you improve strokes gained?
- Clutch putting
- Birdie conversions
- Going-for-the-green accuracy
- Taking risks