# Predict Bike Sharing Demand with AutoGluon Template

## Project: Predict Bike Sharing Demand with AutoGluon
This notebook is a template with each step that you need to complete for the project.

Please fill in your code where there are explicit `?` markers in the notebook. You are welcome to add more cells and code as you see fit.

Once you have completed all the code implementations, please export your notebook as an HTML file so the reviewers can view your code. Make sure all cell outputs are displayed correctly.

`File-> Export Notebook As... -> Export Notebook as HTML`

There is also a writeup to complete after all code implementation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either markdown or PDF.

Completing the code template and writeup template will cover all of the rubric points for this project.

The rubric contains "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. The stand out suggestions are optional. If you decide to pursue the "stand out suggestions", you can include the code in this notebook and also discuss the results in the writeup file.

## Step 1: Create an account with Kaggle

### Create Kaggle Account and download API key
Below is an example of the steps to get the API username and key. Each student will have their own username and key.

1. Open account settings.
![kaggle1.png](attachment:kaggle1.png)
![kaggle2.png](attachment:kaggle2.png)
2. Scroll down to API and click Create New API Token.
![kaggle3.png](attachment:kaggle3.png)
![kaggle4.png](attachment:kaggle4.png)
3. Open up `kaggle.json` and use the username and key.
![kaggle5.png](attachment:kaggle5.png)

## Step 2: Download the Kaggle dataset using the kaggle python library

### Open up SageMaker Studio and use starter template

1. Notebook should be using a `ml.t3.medium` instance (2 vCPU + 4 GiB)
2. Notebook should be using kernel: `Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)`

### Install packages

In [1]:
!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
# Without --no-cache-dir, smaller aws instances may have trouble installing

Collecting numpy>=1.11.3 (from bokeh==2.0.1)
  Using cached numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Using cached numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 2.0.2
    Uninstalling numpy-2.0.2:
      Successfully uninstalled numpy-2.0.2
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
holoviews 1.20.2 requires bokeh>=3.1, but you have bokeh 2.0.1 which is incompatible.
panel 1.7.1 requires bokeh<3.8.0,>=3.5.0, but you have bokeh 2.0.1 which is incompatible.
thinc 8.3.6 requires numpy<3.0.0,>=2.0.0, but you have numpy 1.26.4 which is incompatible.
Successfully installed numpy-1.26.4
Collecting numpy<2.3.0,>=1.25.0 (from autogluon.core==1.3.1->autogluon.core

### Setup Kaggle API Key

In [7]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

In [8]:
!kaggle competitions list


ref                                                                                deadline             category                reward  teamCount  userHasEntered  
---------------------------------------------------------------------------------  -------------------  ---------------  -------------  ---------  --------------  
https://www.kaggle.com/competitions/arc-prize-2025                                 2025-11-03 23:59:00  Featured         1,000,000 Usd        465           False  
https://www.kaggle.com/competitions/openai-to-z-challenge                          2025-06-29 23:59:00  Featured           400,000 Usd          0           False  
https://www.kaggle.com/competitions/waveform-inversion                             2025-06-30 23:59:00  Research            50,000 Usd        960           False  
https://www.kaggle.com/competitions/cmi-detect-behavior-with-sensor-data           2025-09-02 23:59:00  Featured            50,000 Usd        510           False  
https://www.kagg

### Download and explore dataset

### Go to the bike sharing demand competition and agree to the terms
![kaggle6.png](attachment:kaggle6.png)

In [9]:
# Download the dataset, it will be in a .zip file so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you already unzipped it, the -o flag makes unzip overwrite the existing files without prompting
!unzip -o bike-sharing-demand.zip

bike-sharing-demand.zip: Skipping, found more recently modified local copy (use --force to force download)
Archive:  bike-sharing-demand.zip
  inflating: sampleSubmission.csv    
  inflating: test.csv                
  inflating: train.csv               


In [46]:
import pandas as pd
from autogluon.tabular import TabularPredictor

In [10]:
# Create the train dataset in pandas by reading the csv
# Set the parsing of the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv("train.csv", parse_dates=["datetime"])
train.head()

Unnamed: 0,datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
0,2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16
1,2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0.0,8,32,40
2,2011-01-01 02:00:00,1,0,0,1,9.02,13.635,80,0.0,5,27,32
3,2011-01-01 03:00:00,1,0,0,1,9.84,14.395,75,0.0,3,10,13
4,2011-01-01 04:00:00,1,0,0,1,9.84,14.395,75,0.0,0,1,1


In [35]:
train.describe().transpose()

Unnamed: 0,count,mean,min,25%,50%,75%,max,std
datetime,10886.0,2011-12-27 05:56:22.399411968,2011-01-01 00:00:00,2011-07-02 07:15:00,2012-01-01 20:30:00,2012-07-01 12:45:00,2012-12-19 23:00:00,
season,10886.0,2.506614,1.0,2.0,3.0,4.0,4.0,1.116174
holiday,10886.0,0.028569,0.0,0.0,0.0,0.0,1.0,0.166599
workingday,10886.0,0.680875,0.0,0.0,1.0,1.0,1.0,0.466159
weather,10886.0,1.418427,1.0,1.0,1.0,2.0,4.0,0.633839
temp,10886.0,20.23086,0.82,13.94,20.5,26.24,41.0,7.79159
atemp,10886.0,23.655084,0.76,16.665,24.24,31.06,45.455,8.474601
humidity,10886.0,61.88646,0.0,47.0,62.0,77.0,100.0,19.245033
windspeed,10886.0,12.799395,0.0,7.0015,12.998,16.9979,56.9969,8.164537
casual,10886.0,36.021955,0.0,4.0,17.0,49.0,367.0,49.960477


In [36]:
# Create the test dataframe in pandas by reading the csv; remember to parse the datetime!
test = pd.read_csv("test.csv", parse_dates=["datetime"])
test.head()

Unnamed: 0,datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed
0,2011-01-20 00:00:00,1,0,1,1,10.66,11.365,56,26.0027
1,2011-01-20 01:00:00,1,0,1,1,10.66,13.635,56,0.0
2,2011-01-20 02:00:00,1,0,1,1,10.66,13.635,56,0.0
3,2011-01-20 03:00:00,1,0,1,1,10.66,12.88,56,11.0014
4,2011-01-20 04:00:00,1,0,1,1,10.66,12.88,56,11.0014


In [37]:
test.describe().transpose()

Unnamed: 0,count,mean,min,25%,50%,75%,max,std
datetime,6493.0,2012-01-13 09:27:47.765285632,2011-01-20 00:00:00,2011-07-22 15:00:00,2012-01-20 23:00:00,2012-07-20 17:00:00,2012-12-31 23:00:00,
season,6493.0,2.4933,1.0,2.0,3.0,3.0,4.0,1.091258
holiday,6493.0,0.029108,0.0,0.0,0.0,0.0,1.0,0.168123
workingday,6493.0,0.685815,0.0,0.0,1.0,1.0,1.0,0.464226
weather,6493.0,1.436778,1.0,1.0,1.0,2.0,4.0,0.64839
temp,6493.0,20.620607,0.82,13.94,21.32,27.06,40.18,8.059583
atemp,6493.0,24.012865,0.0,16.665,25.0,31.06,50.0,8.782741
humidity,6493.0,64.125212,16.0,49.0,65.0,81.0,100.0,19.293391
windspeed,6493.0,12.631157,0.0,7.0015,11.0014,16.9979,55.9986,8.250151


In [38]:
# Same thing as train and test dataset
submission = pd.read_csv("sampleSubmission.csv", parse_dates=["datetime"])
submission.head()

Unnamed: 0,datetime,count
0,2011-01-20 00:00:00,0
1,2011-01-20 01:00:00,0
2,2011-01-20 02:00:00,0
3,2011-01-20 03:00:00,0
4,2011-01-20 04:00:00,0


## Step 3: Train a model using AutoGluon’s Tabular Prediction

Requirements:
* We are predicting `count`, so it is the label we are setting.
* Ignore the `casual` and `registered` columns, as they are not present in the test dataset.
* Use `root_mean_squared_error` as the evaluation metric.
* Set a time limit of 10 minutes (600 seconds).
* Use the preset `best_quality` to focus on creating the best model.

In [50]:
train_data = train
train_data = train_data.drop(columns=["casual", "registered"])

In [51]:
predictor = TabularPredictor(
    label="count",
    eval_metric="root_mean_squared_error"
).fit(
    train_data,
    time_limit=600,
    presets="best_quality"
)

No path specified. Models will be saved in: "AutogluonModels/ag-20250606_083508"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.3.1
Python Version:     3.11.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Sun Mar 30 16:01:29 UTC 2025
CPU Count:          2
Memory Avail:       8.89 GB / 12.67 GB (70.2%)
Disk Space Avail:   64.35 GB / 107.72 GB (59.7%)
Presets specified: ['best_quality']
Setting dynamic_stacking from 'auto' to True. Reason: Enable dynamic_stacking when use_bag_holdout is disabled. (use_bag_holdout=False)
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=1
DyStack is enabled (dynamic_stacking=True). AutoGluon will try to determine whether the input data is affected by stacked overfitting and enable or disable stacking as a consequence.
	This is used to identify the optimal `num_stack_levels` value. Copies of AutoGluon will be fit on subsets of the data. Then holdout validation 

(_ray_fit pid=18922) [1000]	valid_set's rmse: 77.7724 [repeated 2x across cluster]
(_ray_fit pid=18922) 	Ran out of time, early stopping on iteration 1417. Best iteration is:
(_ray_fit pid=18922) 	[1143]	valid_set's rmse: 77.5289
(_ray_fit pid=18959) 	Ran out of time, early stopping on iteration 1350. Best iteration is:
(_ray_fit pid=18959) 	[1295]	valid_set's rmse: 73.7645
(_ray_fit pid=19045) [1000]	valid_set's rmse: 76.4191 [repeated 2x across cluster]
(_ray_fit pid=19045) 	Ran out of time, early stopping on iteration 1332. Best iteration is:
(_ray_fit pid=19045) 	[1327]	valid_set's rmse: 76.1995
(_ray_fit pid=19082) 	Ran out of time, early stopping on iteration 1343. Best iteration is:
(_ray_fit pid=19082) 	[1082]	valid_set's rmse: 76.0356
(_ray_fit pid=19173) 	Ran out of time, early stopping on iteration 928. Best iteration is:
(_ray_fit pid=19173) 	[851]	valid_set's rmse: 76.8338
(_ray_fit pid=19168) [1000]	valid_set's rmse: 73.0416 [repeated 2x across cluster]
(_ray_fit pid=19168) 	Ran out of time, early stopping on iteration 1156. Best iteration is:
(_ray_fit pid=19168) 	[1155]	valid_set's rmse: 72.9764
(_dystack pid=17642) 	-74.1512	 = Validation score   (-root_mean_squared_error)
(_dystack pid=17642) 	48.76s	 = Training   runtime
(_dystack pid=17642) 	3.24s	 = Validation runtime
(_dystack pid=17642) Fitting model: WeightedEnsemble_L3 ... Training model for up to 149.92s of the -38.79s of remaining time.
(_dystack pid=17642) 	Ensemble Weights: {'LightGBMXT_BAG_L2': 0.96, 'KNeighborsDist_BAG_L1': 0.04}
(_dystack pid=17642) 	-74.1229	 = Validation score   (-root_mean_squared_error)
(_dystack pid=17642) 	0.08s	 = Training   runtime
(_dystack pid=17642) 	0.0s	 = Validation runtime
(_dystack pid=17642) AutoGluon training complete, total runtime = 189.04s ... Best model: WeightedEnsemble_L3 | Estimated inference throughput: 80.1 rows/s (1210 batch 

(_ray_fit pid=19423) [1000]	valid_set's rmse: 129.692
(_ray_fit pid=19422) [1000]	valid_set's rmse: 130.657
(_ray_fit pid=19422) [2000]	valid_set's rmse: 129.849
(_ray_fit pid=19423) [3000]	valid_set's rmse: 128.461 [repeated 2x across cluster]
(_ray_fit pid=19626) [1000]	valid_set's rmse: 132.725
(_ray_fit pid=19679) [1000]	valid_set's rmse: 128.154
(_ray_fit pid=19679) [2000]	valid_set's rmse: 126.702
(_ray_fit pid=19679) [3000]	valid_set's rmse: 126.147
(_ray_fit pid=19679) [4000]	valid_set's rmse: 125.904
(_ray_fit pid=19749) [1000]	valid_set's rmse: 135.845
(_ray_fit pid=19749) [3000]	valid_set's rmse: 133.639 [repeated 4x across cluster]
(_ray_fit pid=19679) 	Ran out of time, early stopping on iteration 9212. Best iteration is:
(_ray_fit pid=19679) 	[7106]	valid_set's rmse: 125.339
(_ray_fit pid=19749) [6000]	valid_set's rmse: 132.628 [repeated 6x across cluster]
(_ray_fit pid=19896) [1000]	valid_set's rmse: 137.712 [repeated 4x across cluster]
(_ray_fit pid=19896) [4000]	valid_set's rmse: 135.344 [repeated 3x across cluster]
(_ray_fit pid=19961) [3000]	valid_set's rmse: 138.261 [repeated 5x across cluster]
(_dystack pid=19289) 	-131.9758	 = Validation score   (-root_mean_squared_error)
(_dystack pid=19289) 	92.39s	 = Training   runtime
(_dystack pid=19289) 	13.1s	 = Validation runtime
(_dystack pid=19289) Fitting model: WeightedEnsemble_L2 ... Training model for up to 149.87s of the 34.50s of remaining time.
(_dystack pid=19289) 	Ensemble Weights: {'KNeighborsDist_BAG_L1': 1.0}
(_dystack pid=19289) 	-89.9469	 = Validation score   (-root_mean_squared_error)
(_dystack pid=19289) 	0.04s	 = Training   runtime
(_dystack pid=19289) 	0.0s	 = Validation runtime
(_dystack pid=19289) Fitting 106 L2 models, fit_strategy="sequential" ...
(_dystack pid=19289) Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 34.45s of the 33.78s of remaining time.
(_dystack pid=19289) 	Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy (2 workers, per: cpus=1, gpus=0, memory

(_ray_fit pid=20107) [1000]	valid_set's rmse: 71.4318 [repeated 5x across cluster]
(_ray_fit pid=20227) [1000]	valid_set's rmse: 77.4878 [repeated 2x across cluster]
(_ray_fit pid=20346) [1000]	valid_set's rmse: 76.4032 [repeated 2x across cluster]
(_ray_fit pid=20466) [1000]	valid_set's rmse: 73.4134 [repeated 2x across cluster]
(_dystack pid=19289) 	-74.3067	 = Validation score   (-root_mean_squared_error)
(_dystack pid=19289) 	49.54s	 = Training   runtime
(_dystack pid=19289) 	3.43s	 = Validation runtime
(_dystack pid=19289) Fitting model: WeightedEnsemble_L3 ... Training model for up to 149.87s of the -22.37s of remaining time.
(_dystack pid=19289) 	Ensemble Weights: {'LightGBMXT_BAG_L2': 0.947, 'KNeighborsDist_BAG_L1': 0.053}
(_dystack pid=19289) 	-74.2555	 = Validation score   (-root_mean_squared_error)
(_dystack pid=19289) 	0.02s	 = Training   runtime
(_dystack pid=19289) 	0.0s	 = Validation runtime
(_dystack pid=19289) AutoGluon training complete, total runtime = 172.4s ... Best model: WeightedEnsemble_L3 | Estimated inference throughput: 73.1 rows/s (1210 batch size)
(_dystack pid=19289) TabularPredictor saved. To load, use: predictor = TabularPredictor.load("/content/AutogluonModels/ag-20250606_083508/ds_sub_fit/

### Review AutoGluon's training run with ranking of models that did the best.

In [54]:
predictor.leaderboard(silent=True)

Unnamed: 0,model,score_val,eval_metric,pred_time_val,fit_time,pred_time_val_marginal,fit_time_marginal,stack_level,can_infer,fit_order
0,WeightedEnsemble_L3,-55.068122,root_mean_squared_error,24.564065,340.786183,0.001014,0.034509,3,True,10
1,LightGBM_BAG_L2,-55.100731,root_mean_squared_error,19.271317,267.162672,0.36243,40.477931,2,True,9
2,LightGBMXT_BAG_L2,-60.807757,root_mean_squared_error,24.200621,300.273743,5.291734,73.589003,2,True,8
3,KNeighborsDist_BAG_L1,-84.125061,root_mean_squared_error,0.102407,0.064509,0.102407,0.064509,1,True,2
4,WeightedEnsemble_L2,-84.125061,root_mean_squared_error,0.103368,0.097187,0.000961,0.032678,2,True,7
5,KNeighborsUnif_BAG_L1,-101.546199,root_mean_squared_error,0.128083,0.066255,0.128083,0.066255,1,True,1
6,RandomForestMSE_BAG_L1,-116.548359,root_mean_squared_error,0.888633,19.401546,0.888633,19.401546,1,True,5
7,LightGBM_BAG_L1,-131.054162,root_mean_squared_error,2.75626,46.56515,2.75626,46.56515,1,True,4
8,CatBoost_BAG_L1,-131.421561,root_mean_squared_error,0.125415,76.094854,0.125415,76.094854,1,True,6
9,LightGBMXT_BAG_L1,-131.460909,root_mean_squared_error,14.90809,84.492427,14.90809,84.492427,1,True,3
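
The `score_val` values are negative because AutoGluon reports scores so that higher is better, which is why the training log labels them `-root_mean_squared_error`. The sketch below shows that relationship using only the standard library; the helper name `rmse` and the sample numbers are made up for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up hourly counts and predictions, for illustration only.
actual = [16, 40, 32, 13]
predicted = [20, 36, 32, 17]

error = rmse(actual, predicted)
score_val = -error  # sign flipped so that a larger score means a better model
print(f"RMSE = {error:.4f}, score_val = {score_val:.4f}")
```

Read this way, the top leaderboard entry of -55.068122 corresponds to a validation RMSE of about 55.07.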


In [52]:
predictor.fit_summary()

*** Summary of fit() ***
Estimated performance of each model:
                    model   score_val              eval_metric  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0     WeightedEnsemble_L3  -55.068122  root_mean_squared_error      24.564065  340.786183                0.001014           0.034509            3       True         10
1         LightGBM_BAG_L2  -55.100731  root_mean_squared_error      19.271317  267.162672                0.362430          40.477931            2       True          9
2       LightGBMXT_BAG_L2  -60.807757  root_mean_squared_error      24.200621  300.273743                5.291734          73.589003            2       True          8
3   KNeighborsDist_BAG_L1  -84.125061  root_mean_squared_error       0.102407    0.064509                0.102407           0.064509            1       True          2
4     WeightedEnsemble_L2  -84.125061  root_mean_squared_error       0.103368    0.097187         

AttributeError: module 'numpy' has no attribute 'bool8'

### Create predictions from test dataset

In [None]:
predictions = ?
predictions.head()

#### NOTE: Kaggle will reject the submission if it contains negative predictions, so every value must be >= 0.

In [None]:
# Describe the `predictions` series to see if there are any negative values
?

In [None]:
# How many negative values do we have?
?

In [None]:
# Set them to zero
?
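
A minimal sketch of the three checks above, using a small hand-made Series in place of real model output. In the notebook, `predictions` would come from the trained predictor (for example `predictor.predict(test)`); the values below are invented for illustration.

```python
import pandas as pd

# Stand-in for model output; the numbers are invented for illustration.
predictions = pd.Series([12.5, -3.2, 0.0, 48.1, -0.7], name="count")

# 1. Summary statistics: a negative `min` signals a problem.
print(predictions.describe())

# 2. Count the negative values.
num_negative = int((predictions < 0).sum())
print(f"negative predictions: {num_negative}")

# 3. Replace negatives with zero; non-negative values are left unchanged.
predictions = predictions.clip(lower=0)
```

`clip(lower=0)` is one option; boolean indexing such as `predictions[predictions < 0] = 0` should give the same result.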

### Set predictions to submission dataframe, save, and submit

In [None]:
submission["count"] = ?
submission.to_csv("submission.csv", index=False)

In [None]:
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "first raw submission"

#### View submission via the command line or in the web browser under the competition's page - `My Submissions`

In [None]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6

#### Initial score of `?`

## Step 4: Exploratory Data Analysis and Creating an additional feature
* Any additional feature will do, but a great suggestion would be to separate out the datetime into hour, day, or month parts.

In [None]:
# Create a histogram of all features to show the distribution of each one relative to the data. This is part of the exploratory data analysis.
train.?

In [None]:
# create a new feature
train[?] = ?
test[?] = ?
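
As one possible way to fill in the cell above, the `.dt` accessor on a parsed datetime column exposes the hour, day, and month. The sketch uses a tiny stand-in frame rather than the real `train.csv`, and the helper name `add_datetime_parts` is hypothetical.

```python
import pandas as pd


def add_datetime_parts(df, column="datetime"):
    """Return a copy of `df` with hour, day, and month columns added."""
    out = df.copy()
    out["hour"] = out[column].dt.hour
    out["day"] = out[column].dt.day
    out["month"] = out[column].dt.month
    return out


# Stand-in rows; in the notebook this would be `train` and `test`.
sample = pd.DataFrame(
    {"datetime": pd.to_datetime(["2011-01-01 00:00:00", "2011-07-15 17:00:00"])}
)
sample = add_datetime_parts(sample)
print(sample)
```

Applying the same function to both `train` and `test` keeps their feature columns aligned.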

## Make `season` and `weather` category types so models know they are not just numbers
* AutoGluon originally sees these as ints, but in reality they are int representations of a category.
* Setting the dtype to category will classify these as categories in AutoGluon.

In [None]:
train["season"] = ?
train["weather"] = ?
test["season"] = ?
test["weather"] = ?
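
A short sketch of the dtype change, assuming pandas' `astype("category")` is the intended conversion. The stand-in frame below only mimics the two columns; in the notebook the same calls would apply to `train` and `test`.

```python
import pandas as pd

# Stand-in rows with the two integer-coded columns.
sample = pd.DataFrame({"season": [1, 2, 3, 4], "weather": [1, 1, 2, 3]})
print(sample.dtypes)  # both columns start as integers

for column in ["season", "weather"]:
    sample[column] = sample[column].astype("category")

print(sample.dtypes)  # both columns are now categorical
```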

In [None]:
# View our new feature
train.head()

In [None]:
# View histogram of all features again now with the hour feature
train.?

## Step 5: Rerun the model with the same settings as before, just with more features

In [None]:
predictor_new_features = TabularPredictor(?).fit(?)

In [None]:
predictor_new_features.fit_summary()

In [None]:
# Remember to set all negative values to zero
?

In [None]:
# Same as before: copy the sample submission, set the predictions, and save
submission_new_features = submission.copy()
submission_new_features["count"] = ?
submission_new_features.to_csv("submission_new_features.csv", index=False)

In [None]:
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features.csv -m "new features"

In [None]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6

#### New Score of `?`

## Step 6: Hyperparameter optimization
* There are many options for hyperparameter optimization.
* You can change either AutoGluon's higher-level parameters or the hyperparameters of the individual models.
* Tuning the individual model hyperparameters requires the `hyperparameters` and `hyperparameter_tune_kwargs` arguments of `fit()`.

In [None]:
predictor_new_hpo = TabularPredictor(?).fit(?)
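
One possible shape for the two tuning arguments is sketched below. The model choice (LightGBM, key `"GBM"`), the ranges, and the trial count are arbitrary assumptions rather than recommended settings, and the names `HPO_TUNE_KWARGS` and `build_hyperparameters` are illustrative. The search-space classes are assumed to come from `autogluon.common.space`, as in AutoGluon 1.x.

```python
# Settings for the tuning loop itself; the values are illustrative.
HPO_TUNE_KWARGS = {
    "num_trials": 5,       # how many configurations to try per model
    "scheduler": "local",  # run the trials on this machine
    "searcher": "auto",    # let AutoGluon pick the search strategy
}


def build_hyperparameters():
    """Return an example search space for LightGBM ("GBM") only.

    The import is deferred so this cell can be defined without
    AutoGluon installed; the ranges are arbitrary starting points.
    """
    from autogluon.common import space

    return {
        "GBM": {
            "num_boost_round": space.Int(lower=50, upper=300, default=100),
            "num_leaves": space.Int(lower=16, upper=96, default=36),
            "learning_rate": space.Real(lower=0.01, upper=0.2, log=True),
        }
    }
```

These would then be passed to `fit()` as `hyperparameters=build_hyperparameters()` and `hyperparameter_tune_kwargs=HPO_TUNE_KWARGS`, alongside the training data and time limit used earlier.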

In [None]:
predictor_new_hpo.fit_summary()

In [None]:
# Remember to set all negative values to zero
?

In [None]:
# Same as before: copy the sample submission, set the predictions, and save
submission_new_hpo = submission.copy()
submission_new_hpo["count"] = ?
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)

In [None]:
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameters"

In [None]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6

#### New Score of `?`

## Step 7: Write a Report
### Refer to the markdown file for the full report
### Creating plots and table for report

In [None]:
# Take the top model score from each training run and create a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. Google Sheets, Excel)
fig = pd.DataFrame(
    {
        "model": ["initial", "add_features", "hpo"],
        "score": [?, ?, ?]
    }
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_train_score.png')

In [None]:
# Take the 3 Kaggle scores and create a line plot to show improvement
fig = pd.DataFrame(
    {
        "test_eval": ["initial", "add_features", "hpo"],
        "score": [?, ?, ?]
    }
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_test_score.png')

### Hyperparameter table

In [None]:
# The 3 hyperparameters we tuned with the kaggle score as the result
pd.DataFrame({
    "model": ["initial", "add_features", "hpo"],
    "hpo1": [?, ?, ?],
    "hpo2": [?, ?, ?],
    "hpo3": [?, ?, ?],
    "score": [?, ?, ?]
})