# 6 - Tuned Models

In [None]:
import pandas as pd 
from xgboost import XGBRegressor
from sklearn.metrics import r2_score, root_mean_squared_error

I am manually copying the best parameters for each configuration, as the `Hyperparameters` notebook takes a long time to run and I want this notebook to be more usable.<p>
_(Note: XGBRegressor on my machine takes alpha and lambda as `reg_alpha` and `reg_lambda`; otherwise it doesn't recognize the variable.)_<p>
_(Memo to me: the `X_train`s and `X_test`s change depending on what columns got dropped, but the target variable does not. It's always the same `y_train`s and `y_test`s.)_
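
The renaming described in the note can be sketched as a small helper. This is only an illustration: `NATIVE_TO_SKLEARN` and `to_sklearn_params` are hypothetical names that do not come from XGBoost, and the sketch assumes the tuned values are kept in a plain dict under the short names `alpha` and `lambda` used in the parameter lists in this notebook.

```python
# Hypothetical helper (not part of XGBoost): rename the short regularization
# names to the keyword names that XGBRegressor accepts.
NATIVE_TO_SKLEARN = {"alpha": "reg_alpha", "lambda": "reg_lambda"}


def to_sklearn_params(params):
    """Return a copy of `params` with alpha/lambda renamed to reg_alpha/reg_lambda."""
    return {NATIVE_TO_SKLEARN.get(name, name): value for name, value in params.items()}


# DC set, all features: values copied from the parameter list in this notebook.
dc_allfeat_params = {
    "alpha": 0.1, "lambda": 100, "learning_rate": 0.3, "max_depth": 6,
    "n_estimators": 180, "objective": "reg:squarederror", "random_state": 42,
}

sklearn_params = to_sklearn_params(dc_allfeat_params)
print(sklearn_params)
# The renamed dict could then be unpacked as XGBRegressor(**sklearn_params).
```

Keeping each configuration in one dict like this would also avoid retyping the values in both the markdown note and the model call.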

## Tuned with all features

**DC set, all features parameters**:<br>
alpha=0.1, lambda=100, learning_rate=0.3, max_depth=6, n_estimators=180, objective='reg:squarederror', random_state=42

In [2]:
X_train_dc = pd.read_pickle('pickles/split/X_train_dc.pkl')
X_test_dc = pd.read_pickle('pickles/split/X_test_dc.pkl')
y_train_dc = pd.read_pickle('pickles/split/y_train_dc.pkl')
y_test_dc = pd.read_pickle('pickles/split/y_test_dc.pkl')

In [3]:
xgb_dc_allfeat = XGBRegressor(reg_alpha=0.1, reg_lambda=100, learning_rate=0.3, max_depth=6, n_estimators=180, objective='reg:squarederror', random_state=42)
xgb_dc_allfeat.fit(X_train_dc,y_train_dc)

dc_allfeat_pred = xgb_dc_allfeat.predict(X_test_dc)

In [None]:
dc_allfeat_rmse = root_mean_squared_error(y_test_dc, dc_allfeat_pred)
dc_allfeat_r2 = r2_score(y_test_dc, dc_allfeat_pred)

In [None]:
print(f'Root Mean Squared Error: {dc_allfeat_rmse} \nR-Squared: {dc_allfeat_r2}')

Root Mean Squared Error: 68.69202506023652 
Mean Absolute Error: 44.30349447055229 
R-Squared: 0.9029864072799683
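
For reference, the three metrics reported in these outputs can be reproduced by hand. The following is a minimal sketch of the standard definitions in plain Python. The function name `regression_metrics` and the toy values are made up for illustration and are not taken from the DC or London sets.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE and R-squared for two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    # RMSE: square root of the mean squared residual.
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    # MAE: mean absolute residual.
    mae = sum(abs(r) for r in residuals) / n
    # R-squared: 1 minus (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Toy values for illustration only.
rmse, mae, r2 = regression_metrics([3, 5, 7], [2, 5, 9])
print(f"RMSE: {rmse}\nMAE: {mae}\nR-Squared: {r2}")
```

RMSE and MAE are in the same units as the target, while R-squared is unitless, which is why only R-squared can be compared directly between the DC and London sets.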


**London set, all features parameters**:<br>
alpha=0.75, lambda=125, learning_rate=0.3, max_depth=7, n_estimators=70, objective='reg:squarederror', random_state=42

In [6]:
X_train_l = pd.read_pickle('pickles/split/X_train_l.pkl')
X_test_l = pd.read_pickle('pickles/split/X_test_l.pkl')
y_train_l = pd.read_pickle('pickles/split/y_train_l.pkl')
y_test_l = pd.read_pickle('pickles/split/y_test_l.pkl')

In [7]:
xgb_lond_allfeat = XGBRegressor(reg_alpha=0.75, reg_lambda=125, learning_rate=0.3, max_depth=7, n_estimators=70, objective='reg:squarederror', random_state=42)
xgb_lond_allfeat.fit(X_train_l,y_train_l)

l_allfeat_pred = xgb_lond_allfeat.predict(X_test_l)

In [None]:
l_allfeat_rmse = root_mean_squared_error(y_test_l,l_allfeat_pred)
l_allfeat_r2 = r2_score(y_test_l,l_allfeat_pred)

In [None]:
print(f'Root Mean Squared Error: {l_allfeat_rmse} \nR-Squared: {l_allfeat_r2}')

Root Mean Squared Error: 299.93461916669133 
Mean Absolute Error: 177.2631608378057 
R-Squared: 0.929388165473938


## Tuned with Forward/Backward Select Features

**DC set, FwBw features parameters**:<br>
alpha=0.8, lambda=50, learning_rate=0.2, max_depth=6, n_estimators=190, objective='reg:squarederror', random_state=42

In [10]:
X_train_dc_fwbw = pd.read_pickle('pickles/split/fwbw/X_train_dc_fwbw.pkl')
X_test_dc_fwbw = pd.read_pickle('pickles/split/fwbw/X_test_dc_fwbw.pkl')

In [11]:
xgb_dc_fwbw = XGBRegressor(reg_alpha=0.8, reg_lambda=50, learning_rate=0.2, max_depth=6, n_estimators=190, objective='reg:squarederror', random_state=42)
xgb_dc_fwbw.fit(X_train_dc_fwbw,y_train_dc)

dc_fwbw_pred = xgb_dc_fwbw.predict(X_test_dc_fwbw)

In [None]:
dc_fwbw_rmse = root_mean_squared_error(y_test_dc,dc_fwbw_pred)
dc_fwbw_r2 = r2_score(y_test_dc,dc_fwbw_pred)

In [None]:
print(f'Root Mean Squared Error: {dc_fwbw_rmse} \nR-Squared: {dc_fwbw_r2}')

Root Mean Squared Error: 69.8046989290312 
Mean Absolute Error: 45.48694607781405 
R-Squared: 0.8998181223869324


**London set, FwBw features parameters**:<br>
alpha=0.01, lambda=10, learning_rate=0.1, max_depth=6, n_estimators=150, objective='reg:squarederror', random_state=42

In [14]:
X_train_l_fwbw = pd.read_pickle('pickles/split/fwbw/X_train_l_fwbw.pkl')
X_test_l_fwbw = pd.read_pickle('pickles/split/fwbw/X_test_l_fwbw.pkl')

In [15]:
xgb_lond_fwbw = XGBRegressor(reg_alpha=0.01, reg_lambda=10, learning_rate=0.1, max_depth=6, n_estimators=150, objective='reg:squarederror', random_state=42)
xgb_lond_fwbw.fit(X_train_l_fwbw,y_train_l)

l_fwbw_pred = xgb_lond_fwbw.predict(X_test_l_fwbw)

In [None]:
l_fwbw_rmse = root_mean_squared_error(y_test_l,l_fwbw_pred)
l_fwbw_r2 = r2_score(y_test_l,l_fwbw_pred)

In [None]:
print(f'Root Mean Squared Error: {l_fwbw_rmse} \nR-Squared: {l_fwbw_r2}')

Root Mean Squared Error: 287.48854568567185 
Mean Absolute Error: 175.65939068362184 
R-Squared: 0.935126781463623


## Tuned with Lasso Select Features

**DC set, Lasso features parameters**:<br>
alpha=1, lambda=130, learning_rate=0.2, max_depth=6, n_estimators=190, objective='reg:squarederror', random_state=42

In [18]:
X_train_dc_lasso = pd.read_pickle('pickles/split/lasso/X_train_dc_lasso.pkl')
X_test_dc_lasso = pd.read_pickle('pickles/split/lasso/X_test_dc_lasso.pkl')

In [19]:
xgb_dc_lasso = XGBRegressor(reg_alpha=1, reg_lambda=130, learning_rate=0.2, max_depth=6, n_estimators=190, objective='reg:squarederror', random_state=42)
xgb_dc_lasso.fit(X_train_dc_lasso,y_train_dc)

dc_lasso_pred = xgb_dc_lasso.predict(X_test_dc_lasso)

In [None]:
dc_lasso_rmse = root_mean_squared_error(y_test_dc,dc_lasso_pred)
dc_lasso_r2 = r2_score(y_test_dc,dc_lasso_pred)

In [None]:
print(f'Root Mean Squared Error: {dc_lasso_rmse} \nR-Squared: {dc_lasso_r2}')

Root Mean Squared Error: 73.38944653955072 
Mean Absolute Error: 46.943603485819494 
R-Squared: 0.8892644643783569


**London set, Lasso features parameters**:<br>
alpha=0.01, lambda=10, learning_rate=0.1, max_depth=6, n_estimators=160, objective='reg:squarederror', random_state=42

In [22]:
X_train_l_lasso = pd.read_pickle('pickles/split/lasso/X_train_l_lasso.pkl')
X_test_l_lasso = pd.read_pickle('pickles/split/lasso/X_test_l_lasso.pkl')

In [23]:
xgb_lond_lasso = XGBRegressor(reg_alpha=0.01, reg_lambda=10, learning_rate=0.1, max_depth=6, n_estimators=160, objective='reg:squarederror', random_state=42)
xgb_lond_lasso.fit(X_train_l_lasso,y_train_l)

l_lasso_pred = xgb_lond_lasso.predict(X_test_l_lasso)

In [None]:
l_lasso_rmse = root_mean_squared_error(y_test_l,l_lasso_pred)
l_lasso_r2 = r2_score(y_test_l,l_lasso_pred)

In [None]:
print(f'Root Mean Squared Error: {l_lasso_rmse} \nR-Squared: {l_lasso_r2}')

Root Mean Squared Error: 286.0157863908551 
Mean Absolute Error: 172.2289506503164 
R-Squared: 0.9357897639274597


## Compare

In [33]:
print('DC set RMSE values: \n  Basic model: 69.92294989890084 \n  Tuned all features:',dc_allfeat_rmse,'\n  FwBw features:',dc_fwbw_rmse,'\n  Lasso features:',dc_lasso_rmse)
print('\nDC set R2 values: \n  Basic model: 0.8994784355163574 \n  Tuned all features:',dc_allfeat_r2,'\n  FwBw features:',dc_fwbw_r2,'\n  Lasso features:',dc_lasso_r2)
print('\n\nLondon set RMSE values: \n  Basic model: 332.20786503370334 \n  Tuned all features:',l_allfeat_rmse,'\n  FwBw features:',l_fwbw_rmse,'\n  Lasso features:',l_lasso_rmse)
print('\nLondon set R2 values: \n  Basic model: 0.9133748412132263 \n  Tuned all features:',l_allfeat_r2,'\n  FwBw features:',l_fwbw_r2,'\n  Lasso features:',l_lasso_r2)

DC set RMSE values: 
  Basic model: 69.92294989890084 
  Tuned all features: 68.69202506023652 
  FwBw features: 69.8046989290312 
  Lasso features: 73.38944653955072

DC set R2 values: 
  Basic model: 0.8994784355163574 
  Tuned all features: 0.9029864072799683 
  FwBw features: 0.8998181223869324 
  Lasso features: 0.8892644643783569


London set RMSE values: 
  Basic model: 332.20786503370334 
  Tuned all features: 299.93461916669133 
  FwBw features: 287.48854568567185 
  Lasso features: 286.0157863908551

London set R2 values: 
  Basic model: 0.9133748412132263 
  Tuned all features: 0.929388165473938 
  FwBw features: 0.935126781463623 
  Lasso features: 0.9357897639274597
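
To put the comparison output above on a common scale, the change in RMSE relative to the basic model can be expressed as a percentage. This is a rough sketch: `pct_improvement` is a hypothetical helper, and the numbers are copied by hand from the output above rather than recomputed.

```python
def pct_improvement(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse (negative = worse)."""
    return (baseline_rmse - model_rmse) / baseline_rmse * 100


# RMSE values copied from the comparison output.
rmse = {
    "DC": {"basic": 69.92294989890084, "all features": 68.69202506023652,
           "FwBw": 69.8046989290312, "Lasso": 73.38944653955072},
    "London": {"basic": 332.20786503370334, "all features": 299.93461916669133,
               "FwBw": 287.48854568567185, "Lasso": 286.0157863908551},
}

improvements = {}
for city, scores in rmse.items():
    baseline = scores["basic"]
    for name, value in scores.items():
        if name == "basic":
            continue  # the baseline is not compared with itself
        improvements[(city, name)] = pct_improvement(baseline, value)
        print(f"{city} {name}: {improvements[(city, name)]:+.2f}% vs. basic")
```

A positive value means the tuned model has a lower RMSE than the basic model. A negative value means it did worse.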


With both sets, the difference between the three tuned models was slight, and the difference between those and the basic model was _also_ small: for DC every RMSE falls between about 68.7 and 73.4, and for London tuning lowered the RMSE from about 332 to between 286 and 300.<p>
Every model has a high R2 score (about 0.89 to 0.94), so the models appear to do a good job of accounting for the data. To judge whether the RMSE is in a decent range, we need to know the maximum values in each set. (The minimum is 0.)

In [51]:
dc_data = pd.read_pickle('pickles/dc_data.pkl')
london_data = pd.read_pickle('pickles/london_data.pkl')

In [52]:
print('DC count max:',dc_data['count'].max())
print('London count max:',london_data['count'].max())

DC count max: 977
London count max: 7860


With those values, we can see how large the RMSE is relative to the highest value in the data. (I am using the highest, or worst, RMSE for each set here: ~73 for DC and ~332 for London.)

In [67]:
print('DC set error %:',round(73/977*100,2))
print('London set error %:',round(332/7860*100,2))

DC set error %: 7.47
London set error %: 4.22
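
The same calculation can be wrapped in a small function and applied to the best-scoring model of each set as well as the worst. This is a sketch only: `rmse_as_pct_of_range` is a hypothetical name, and the RMSE values and maxima are copied by hand from the outputs in this notebook.

```python
def rmse_as_pct_of_range(rmse, y_min, y_max):
    """Express an RMSE as a percentage of the target's range (y_max - y_min)."""
    return rmse / (y_max - y_min) * 100


# Worst (rounded) RMSE per set, with the 'count' maxima found above; the minimum is 0.
dc_worst_pct = rmse_as_pct_of_range(73, 0, 977)
london_worst_pct = rmse_as_pct_of_range(332, 0, 7860)

# Best RMSE per set: DC tuned with all features, London tuned with Lasso features.
dc_best_pct = rmse_as_pct_of_range(68.69202506023652, 0, 977)
london_best_pct = rmse_as_pct_of_range(286.0157863908551, 0, 7860)

print("DC set error % (worst, best):", round(dc_worst_pct, 2), round(dc_best_pct, 2))
print("London set error % (worst, best):", round(london_worst_pct, 2), round(london_best_pct, 2))
```

Dividing by the range rather than the mean keeps the result comparable between the two sets, whose scales differ by roughly a factor of eight.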


Additionally, we can compare the RMSE with the standard deviation of each set:

In [None]:
print("For DC's 'count':\n",dc_data['count'].describe())
print("DC's worst RMSE: ~73")

For DC's 'count':
 count    17544.000000
mean       187.681202
std        181.456478
min          0.000000
25%         38.000000
50%        140.000000
75%        279.000000
max        977.000000
Name: count, dtype: float64
DC's worst RMSE: ~73


In [69]:
print("For London's 'count':\n",london_data['count'].describe())
print("London's worst RMSE: ~332")

For London's 'count':
 count    17520.000000
mean      1136.185616
std       1085.446245
min          0.000000
25%        245.000000
50%        836.000000
75%       1661.000000
max       7860.000000
Name: count, dtype: float64
London's worst RMSE: ~332


In both cases, the RMSE is well below the standard deviation (about 73 vs. 181 for DC, and about 332 vs. 1085 for London), so the model is accounting for a good portion of the variation occurring.
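
The link between RMSE, standard deviation, and explained variation can be made explicit with the relation R2 ≈ 1 - (RMSE / std)². The sketch below applies it to the worst RMSE of each set. It is only a rough check: the standard deviations come from the full data sets, while the reported R2 scores were computed on the test split, so the two will not match exactly. The function names are hypothetical.

```python
def rmse_to_std_ratio(rmse, std):
    """Ratio of RMSE to the target's standard deviation (lower is better)."""
    return rmse / std


def approx_r2(rmse, std):
    """Rough R-squared implied by an RMSE and the target's standard deviation."""
    return 1 - rmse_to_std_ratio(rmse, std) ** 2


# Worst RMSE per set and the 'std' rows from the describe() outputs above.
dc_ratio = rmse_to_std_ratio(73, 181.456478)
london_ratio = rmse_to_std_ratio(332, 1085.446245)
dc_r2_estimate = approx_r2(73, 181.456478)
london_r2_estimate = approx_r2(332, 1085.446245)

print(f"DC: RMSE/std = {dc_ratio:.3f}, implied R2 ~ {dc_r2_estimate:.3f}")
print(f"London: RMSE/std = {london_ratio:.3f}, implied R2 ~ {london_r2_estimate:.3f}")
```

A model that always predicted the mean would have a ratio close to 1 and an implied R2 close to 0, so ratios well under 0.5 are consistent with the high R2 scores reported earlier.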