# Week 4 - Models and Experimentation

## Step 1 Training a model

This demo adapts [this XGBoost tutorial](https://www.datacamp.com/tutorial/xgboost-in-python): we train an XGBoost model, then experiment with it and tune its hyperparameters.


If you run this notebook locally, set up a virtual environment as follows:
- Use Python 3.10 or earlier.
- Create the virtual environment with `venv`:

`python -m venv NAME`

- Try to avoid Anaconda, Poetry, or similar package management platforms.
- Install packages with pip:

`python -m pip install <package-name>`

- Once you are done working in the virtual environment, deactivate it with `deactivate`.
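
The Python version guidance above can be checked from inside the notebook before installing anything. This is a minimal sketch, assuming the 3.10 ceiling stated in the list; the helper name `python_version_ok` is hypothetical and not part of any library.

```python
import sys

# Assumption: the newest supported version is Python 3.10, per the list above.
MAX_SUPPORTED = (3, 10)


def python_version_ok(version_info=None, max_supported=MAX_SUPPORTED):
    """Return True if the (major, minor) version does not exceed max_supported."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) <= max_supported


if not python_version_ok():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} is newer than "
        f"{MAX_SUPPORTED[0]}.{MAX_SUPPORTED[1]}; some packages may fail to install."
    )
```

Running the check first gives a clearer message than a failed `pip install` later on.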

### Install packages

In [1]:
# Mount Google Drive (Colab only) and switch to the working directory
import os
from google.colab import drive

drive.mount('/content/drive/')
path = "/content/drive/MyDrive/Colab Notebooks/HW_实践"
os.chdir(path)
print(os.getcwd())

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/MyDrive/Colab Notebooks/HW_实践


In [2]:
!pip install wandb -qU

In [3]:
import xgboost as xgb
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error


### Import data

We will use the Diamonds dataset, loaded from Seaborn. It is also available on [Kaggle](https://www.kaggle.com/datasets/shivam2503/diamonds).

The Kaggle page describes each feature. Our target is the price of each diamond.

In [4]:
diamonds = sns.load_dataset('diamonds')
diamonds.head()

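|   | carat | cut | color | clarity | depth | table | price | x | y | z |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.23 | Ideal | E | SI2 | 61.5 | 55.0 | 326 | 3.95 | 3.98 | 2.43 |
| 1 | 0.21 | Premium | E | SI1 | 59.8 | 61.0 | 326 | 3.89 | 3.84 | 2.31 |
| 2 | 0.23 | Good | E | VS1 | 56.9 | 65.0 | 327 | 4.05 | 4.07 | 2.31 |
| 3 | 0.29 | Premium | I | VS2 | 62.4 | 58.0 | 334 | 4.2 | 4.23 | 2.63 |
| 4 | 0.31 | Good | J | SI2 | 63.3 | 58.0 | 335 | 4.34 | 4.35 | 2.75 |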


In [5]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype   
---  ------   --------------  -----   
 0   carat    53940 non-null  float64 
 1   cut      53940 non-null  category
 2   color    53940 non-null  category
 3   clarity  53940 non-null  category
 4   depth    53940 non-null  float64 
 5   table    53940 non-null  float64 
 6   price    53940 non-null  int64   
 7   x        53940 non-null  float64 
 8   y        53940 non-null  float64 
 9   z        53940 non-null  float64 
dtypes: category(3), float64(6), int64(1)
memory usage: 3.0 MB


In [6]:
diamonds.shape

(53940, 10)

In [7]:
X,y = diamonds.drop('price', axis=1), diamonds[['price']]

# Cast cut, color and clarity to the pandas category dtype so that XGBoost can handle them as categorical data.

X['cut'] = X['cut'].astype('category')
X['color'] = X['color'].astype('category')
X['clarity'] = X['clarity'].astype('category')

### Split the data and train a model

In [8]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)

In [9]:
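# Define hyperparameters
# "gpu_hist" is deprecated in XGBoost 2.0+; use tree_method="hist" with device="cuda" instead
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}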

n = 100
model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
)





In [10]:
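# Evaluate on the test set with Root Mean Squared Error (RMSE)

predictions = model.predict(dtest)
# Taking np.sqrt of the MSE works across scikit-learn versions
# (the `squared=False` argument is deprecated in recent releases)
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"RMSE: {rmse}")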

RMSE: 532.8838153117543
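
RMSE is the square root of the mean of the squared differences between actual and predicted values, so it is expressed in the same units as the target (here, price). The sketch below spells out that computation in plain Python; the helper `rmse` and the sample numbers are hypothetical and only serve as illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical prices and predictions, for illustration only.
actual = [326, 327, 334]
predicted = [330, 320, 334]
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.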






### Incorporate validation

In [11]:
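# "gpu_hist" is deprecated in XGBoost 2.0+; use tree_method="hist" with device="cuda" instead
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}
n = 100

# Evaluation sets: the training data plus the held-out test split, reused here as the validation set
evals = [(dtrain, "train"), (dtest, "validation")]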

In [12]:
evals = [(dtrain, "train"), (dtest, "validation")]

model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=10,
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630
[10]	train-rmse:550.99470	validation-rmse:571.16640
[20]	train-rmse:491.51435	validation-rmse:544.08058
[30]	train-rmse:464.38845	validation-rmse:537.01895
[40]	train-rmse:445.99106	validation-rmse:533.85127
[50]	train-rmse:430.36010	validation-rmse:532.90320
[60]	train-rmse:418.87898	validation-rmse:533.04629
[70]	train-rmse:409.66247	validation-rmse:533.58046
[80]	train-rmse:397.34048	validation-rmse:534.31963
[90]	train-rmse:389.94294	validation-rmse:532.61946
[99]	train-rmse:377.70831	validation-rmse:532.88383


In [13]:
# Incorporate early stopping
n = 10000


model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=50,
   # Activate early stopping
   early_stopping_rounds=50
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630
[50]	train-rmse:430.36010	validation-rmse:532.90320
[100]	train-rmse:377.56825	validation-rmse:532.79980
[102]	train-rmse:376.20429	validation-rmse:532.59813
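
Early stopping halts training once the validation metric has failed to improve for `early_stopping_rounds` consecutive rounds. The sketch below illustrates that rule in plain Python; it is not XGBoost's implementation, and the function name and the score series are hypothetical.

```python
def early_stopping_point(scores, patience):
    """Return (best_round, stop_round) for a lower-is-better metric.

    Training stops at the first round that is `patience` rounds past the
    best round without any improvement in between.
    """
    best_round, best_score = 0, float("inf")
    for round_idx, score in enumerate(scores):
        if score < best_score:
            best_round, best_score = round_idx, score
        elif round_idx - best_round >= patience:
            return best_round, round_idx
    # The metric kept improving (or patience was never exhausted).
    return best_round, len(scores) - 1


# Hypothetical validation RMSE per boosting round.
validation_rmse = [10.0, 8.0, 7.0, 7.5, 7.2, 7.1, 7.3]
best_round, stop_round = early_stopping_point(validation_rmse, patience=3)
print(f"best round: {best_round}, stopped at round: {stop_round}")
```

Under this rule, the log above (stopping at round 102 with `early_stopping_rounds=50`) would imply that the best validation score was reached around round 52.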


In [14]:
# Cross-validation

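# "gpu_hist" is deprecated in XGBoost 2.0+; use tree_method="hist" with device="cuda" instead
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}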
n = 1000

results = xgb.cv(
   params, dtrain,
   num_boost_round=n,
   nfold=5,
   early_stopping_rounds=20
)






In [15]:
results.head()

|   | train-rmse-mean | train-rmse-std | test-rmse-mean | test-rmse-std |
|---|---|---|---|---|
| 0 | 2861.153015 | 8.266765 | 2861.773555 | 36.937516 |
| 1 | 2081.378004 | 5.534608 | 2084.973481 | 32.064109 |
| 2 | 1545.361682 | 3.287745 | 1553.681211 | 31.059209 |
| 3 | 1182.364236 | 3.585787 | 1192.464771 | 26.157805 |
| 4 | 941.828819 | 2.971779 | 958.467497 | 23.613538 |


In [16]:
best_rmse = results['test-rmse-mean'].min()

best_rmse

549.1039652582465
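
Because `xgb.cv` returns one row per boosting round, the row index of the lowest `test-rmse-mean` also tells you which round performed best. The sketch below shows the lookup on a small stand-in frame built from the five rows displayed by `results.head()`; the variable names are hypothetical, and since only the first five rounds are included, the minimum here is not the 549.10 reported above.

```python
import pandas as pd

# First five rows of the `results` frame shown above, copied by hand for illustration.
cv_history = pd.DataFrame(
    {
        "train-rmse-mean": [2861.153015, 2081.378004, 1545.361682, 1182.364236, 941.828819],
        "test-rmse-mean": [2861.773555, 2084.973481, 1553.681211, 1192.464771, 958.467497],
    }
)

# idxmin() returns the index label of the smallest value, i.e. the best round.
best_round_head = int(cv_history["test-rmse-mean"].idxmin())
best_rmse_head = float(cv_history["test-rmse-mean"].min())
print(f"best round: {best_round_head}, mean test RMSE: {best_rmse_head:.2f}")
```

Applied to the full `results` frame, the same two lines give the round to use as `num_boost_round` when retraining on all of the training data.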

## Start W&B


- Log in to your W&B account using the code below. You can find your API key under "Profile" -> "Settings" in the W&B app.
- Alternatively, set environment variables. Several of them change the behavior of W&B logging; the most important are:
    - WANDB_API_KEY - your API key, found in the "Settings" section of your profile
    - WANDB_BASE_URL - the URL of the W&B server
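
The environment-variable route can be sketched as follows. This is a minimal example that uses only the standard library; the helper `missing_wandb_settings` is hypothetical, and the default URL is an assumption that applies to the public W&B cloud rather than a self-hosted server.

```python
import os

# Assumption: "https://api.wandb.ai" is the public W&B cloud endpoint; replace it
# with your own server URL if you use a self-hosted deployment.
os.environ.setdefault("WANDB_BASE_URL", "https://api.wandb.ai")


def missing_wandb_settings(required=("WANDB_API_KEY", "WANDB_BASE_URL")):
    """Return the names of required W&B variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]


missing = missing_wandb_settings()
if missing:
    print("Set these before calling wandb.login():", ", ".join(missing))
```

Set `WANDB_API_KEY` in your shell or a secrets manager rather than hard-coding it in the notebook, so the key does not end up in version control.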



In [17]:
# Log in to your W&B account
import wandb
wandb.login()

wandb: Currently logged in as: lxinporto (northwesterncsai). Use `wandb login --relogin` to force relogin


True

In [18]:
run = wandb.init(project='xgboost_hyperparameter_tuning', entity='lxinporto')

wandb: Currently logged in as: lxinporto. Use `wandb login --relogin` to force relogin


In [19]:
from wandb.xgboost import WandbCallback

In [20]:
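# "gpu_hist" is deprecated in XGBoost 2.0+; use tree_method="hist" with device="cuda" instead
params = {
    "objective": "reg:squarederror",
    "tree_method": "hist",
    "device": "cuda",
    "eval_metric": "rmse",
}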

n = 1000
early_stopping_rounds = 20

xgbmodel = xgb.XGBRegressor(**params,
                            n_estimators=n,
                            early_stopping_rounds=early_stopping_rounds,
                            callbacks=[WandbCallback()])

X_train = pd.get_dummies(X_train)
X_test = pd.get_dummies(X_test)
xgbmodel.fit(X_train, y_train, eval_set=[(X_test, y_test)])


[0]	validation_0-rmse:2890.49103
[1]	validation_0-rmse:2151.25657
[2]	validation_0-rmse:1646.16591
[3]	validation_0-rmse:1309.68710
[4]	validation_0-rmse:1077.47338
[5]	validation_0-rmse:916.86440
[6]	validation_0-rmse:826.99597
[7]	validation_0-rmse:760.80217
[8]	validation_0-rmse:719.44433
[9]	validation_0-rmse:693.51193
[10]	validation_0-rmse:672.84882
[11]	validation_0-rmse:654.85984
[12]	validation_0-rmse:634.73614
[13]	validation_0-rmse:623.18225
[14]	validation_0-rmse:617.34221
[15]	validation_0-rmse:613.67147
[16]	validation_0-rmse:607.32331
[17]	validation_0-rmse:604.46119
[18]	validation_0-rmse:598.17601
[19]	validation_0-rmse:594.88868
[20]	validation_0-rmse:593.34843
[21]	validation_0-rmse:590.06852
[22]	validation_0-rmse:587.76214
[23]	validation_0-rmse:585.99364
[24]	validation_0-rmse:582.98290
[25]	validation_0-rmse:582.47269
[26]	validation_0-rmse:580.17669
[27]	validation_0-rmse:579.34976
[28]	validation_0-rmse:578.84001
[29]	validation_0-rmse:577.81118
[30]	validation_0-rmse:575.61969
[31]	validation_0-rmse:574.99526
[32]	validation_0-rmse:573.81131
[33]	validation_0-rmse:574.58592
[34]	validation_0-rmse:573.74085
[35]	validation_0-rmse:573.47272
[36]	validation_0-rmse:572.44439
[37]	validation_0-rmse:571.66438
[38]	validat

In [21]:
run.finish()

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
validation_0-rmse  █▄▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  63.0
best_score  562.00163
epoch  82.0


In [22]:
sweep_config = {
  "method" : "random",
  "parameters" : {
    "learning_rate" :{
      "min": 0.001,
      "max": 1.0
    },
    "gamma" :{
      "min": 0.001,
      "max": 1.0
    },
    "min_child_weight" :{
      "min": 10,
      "max": 150
    },
    "early_stopping_rounds" :{
      "values" : [10, 20, 30]
    },
    "n_estimators" :{
      "min": 100,
      "max": 1000
    }
  }
}

In [23]:
sweep_id = wandb.sweep(sweep_config, project='xgboost_hyperparameter_tuning')


Create sweep with ID: klfjir3d
Sweep URL: https://wandb.ai/northwesterncsai/xgboost_hyperparameter_tuning/sweeps/klfjir3d


In [24]:
def train():
    # Reload the data inside the function so each sweep run is self-contained
    diamonds = sns.load_dataset('diamonds')

    X,y = diamonds.drop('price', axis=1), diamonds[['price']]

    # Cast cut, color and clarity to the pandas category dtype so that XGBoost can handle them as categorical data.

    X['cut'] = X['cut'].astype('category')
    X['color'] = X['color'].astype('category')
    X['clarity'] = X['clarity'].astype('category')
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    with wandb.init(job_type="sweep") as run:

        x_params = {
            "objective": "reg:squarederror",
            # "gpu_hist" is deprecated in XGBoost 2.0+
            "tree_method": "hist",
            "device": "cuda",
            "eval_metric": "rmse",
            "learning_rate": run.config['learning_rate'],
            "gamma": run.config['gamma'],
            "min_child_weight": run.config['min_child_weight'],
            "n_estimators": run.config['n_estimators']
        }

        xgbmodel = xgb.XGBRegressor(**x_params,
                                    early_stopping_rounds=run.config['early_stopping_rounds'],
                                    callbacks=[WandbCallback()])

        X_train = pd.get_dummies(X_train)
        X_test = pd.get_dummies(X_test)
        xgbmodel.fit(X_train, y_train, eval_set=[(X_test, y_test)])

In [25]:
count = 5 # number of runs to execute
wandb.agent(sweep_id, function=train, count=count)


wandb: Agent Starting Run: xvag0boe with config:
wandb:   early_stopping_rounds: 30
wandb:   gamma: 0.9850515702568374
wandb:   learning_rate: 0.21992075605190492
wandb:   min_child_weight: 54
wandb:   n_estimators: 331


[0]	validation_0-rmse:3179.72706
[1]	validation_0-rmse:2564.29849
[2]	validation_0-rmse:2092.12348
[3]	validation_0-rmse:1731.31376
[4]	validation_0-rmse:1450.93897
[5]	validation_0-rmse:1244.05794
[6]	validation_0-rmse:1091.54300
[7]	validation_0-rmse:972.73844
[8]	validation_0-rmse:885.88566
[9]	validation_0-rmse:828.34332
[10]	validation_0-rmse:780.91351
[11]	validation_0-rmse:746.53465
[12]	validation_0-rmse:722.37401
[13]	validation_0-rmse:698.31533
[14]	validation_0-rmse:680.98423
[15]	validation_0-rmse:665.12972
[16]	validation_0-rmse:655.42241
[17]	validation_0-rmse:643.11261
[18]	validation_0-rmse:634.30212
[19]	validation_0-rmse:630.23609
[20]	validation_0-rmse:623.44846
[21]	validation_0-rmse:616.19079
[22]	validation_0-rmse:610.58747
[23]	validation_0-rmse:604.66804
[24]	validation_0-rmse:599.18686
[25]	validation_0-rmse:594.41640
[26]	validation_0-rmse:591.20873
[27]	validation_0-rmse:590.03291
[28]	validation_0-rmse:587.32769
[29]	validation_0-rmse:583.66159
[30]	validati

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
validation_0-rmse  █▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  198.0
best_score  548.92132
epoch  227.0


wandb: Agent Starting Run: vin0kayp with config:
wandb:   early_stopping_rounds: 30
wandb:   gamma: 0.400817011235077
wandb:   learning_rate: 0.5110945341356542
wandb:   min_child_weight: 119
wandb:   n_estimators: 928


[0]	validation_0-rmse:2176.97469
[1]	validation_0-rmse:1376.43393
[2]	validation_0-rmse:995.71739
[3]	validation_0-rmse:859.62866
[4]	validation_0-rmse:790.30528
[5]	validation_0-rmse:751.29232
[6]	validation_0-rmse:726.95971
[7]	validation_0-rmse:704.30419
[8]	validation_0-rmse:689.01295
[9]	validation_0-rmse:681.17930
[10]	validation_0-rmse:667.77053
[11]	validation_0-rmse:657.25183
[12]	validation_0-rmse:653.00015
[13]	validation_0-rmse:647.97803
[14]	validation_0-rmse:644.12276
[15]	validation_0-rmse:641.94478
[16]	validation_0-rmse:639.94851
[17]	validation_0-rmse:636.63163
[18]	validation_0-rmse:630.67115
[19]	validation_0-rmse:627.92733
[20]	validation_0-rmse:622.79894
[21]	validation_0-rmse:619.79907
[22]	validation_0-rmse:617.10471
[23]	validation_0-rmse:616.24771
[24]	validation_0-rmse:614.06331
[25]	validation_0-rmse:613.03388
[26]	validation_0-rmse:611.55006
[27]	validation_0-rmse:610.38872
[28]	validation_0-rmse:609.72776
[29]	validation_0-rmse:608.94341
[30]	validation_0-

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███
validation_0-rmse  █▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  102.0
best_score  572.66735
epoch  132.0


wandb: Agent Starting Run: 137771tu with config:
wandb:   early_stopping_rounds: 20
wandb:   gamma: 0.2905244628690686
wandb:   learning_rate: 0.11233077612895163
wandb:   min_child_weight: 50
wandb:   n_estimators: 761


[0]	validation_0-rmse:3572.47817
[1]	validation_0-rmse:3208.47557
[2]	validation_0-rmse:2889.11779
[3]	validation_0-rmse:2604.19887
[4]	validation_0-rmse:2355.23094
[5]	validation_0-rmse:2136.82001
[6]	validation_0-rmse:1943.40793
[7]	validation_0-rmse:1776.18344
[8]	validation_0-rmse:1629.02698
[9]	validation_0-rmse:1496.58176
[10]	validation_0-rmse:1384.91955
[11]	validation_0-rmse:1285.20864
[12]	validation_0-rmse:1197.98557
[13]	validation_0-rmse:1124.42358
[14]	validation_0-rmse:1059.36268
[15]	validation_0-rmse:1004.16255
[16]	validation_0-rmse:955.84698
[17]	validation_0-rmse:910.70437
[18]	validation_0-rmse:874.03366
[19]	validation_0-rmse:844.57972
[20]	validation_0-rmse:818.87813
[21]	validation_0-rmse:795.24068
[22]	validation_0-rmse:772.94624
[23]	validation_0-rmse:753.60978
[24]	validation_0-rmse:739.50233
[25]	validation_0-rmse:727.06845
[26]	validation_0-rmse:714.36077
[27]	validation_0-rmse:703.94237
[28]	validation_0-rmse:694.07926
[29]	validation_0-rmse:683.45351
[30]

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
validation_0-rmse  █▅▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  184.0
best_score  548.68844
epoch  204.0


wandb: Agent Starting Run: ujq5gns3 with config:
wandb:   early_stopping_rounds: 10
wandb:   gamma: 0.042768107082297294
wandb:   learning_rate: 0.3552913019108537
wandb:   min_child_weight: 105
wandb:   n_estimators: 852


[0]	validation_0-rmse:2707.27217
[1]	validation_0-rmse:1905.40213
[2]	validation_0-rmse:1413.17668
[3]	validation_0-rmse:1123.40615
[4]	validation_0-rmse:957.10377
[5]	validation_0-rmse:860.90893
[6]	validation_0-rmse:786.18128
[7]	validation_0-rmse:751.53684
[8]	validation_0-rmse:720.78069
[9]	validation_0-rmse:700.28503
[10]	validation_0-rmse:686.47673
[11]	validation_0-rmse:674.49745
[12]	validation_0-rmse:665.43074
[13]	validation_0-rmse:658.08645
[14]	validation_0-rmse:653.46815
[15]	validation_0-rmse:645.17661
[16]	validation_0-rmse:641.24294
[17]	validation_0-rmse:638.94257
[18]	validation_0-rmse:637.17640
[19]	validation_0-rmse:632.71090
[20]	validation_0-rmse:628.19402
[21]	validation_0-rmse:625.48108
[22]	validation_0-rmse:622.97254
[23]	validation_0-rmse:619.43626
[24]	validation_0-rmse:614.79055
[25]	validation_0-rmse:612.03502
[26]	validation_0-rmse:610.63633
[27]	validation_0-rmse:608.74334
[28]	validation_0-rmse:608.50205
[29]	validation_0-rmse:607.84679
[30]	validation_

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇████
validation_0-rmse  █▅▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  52.0
best_score  585.79636
epoch  62.0


wandb: Agent Starting Run: zfbyrp3s with config:
wandb:   early_stopping_rounds: 10
wandb:   gamma: 0.43928136537699375
wandb:   learning_rate: 0.6875993573829211
wandb:   min_child_weight: 91
wandb:   n_estimators: 786


[0]	validation_0-rmse:1629.62715
[1]	validation_0-rmse:1018.20102
[2]	validation_0-rmse:825.55791
[3]	validation_0-rmse:758.23438
[4]	validation_0-rmse:728.81968
[5]	validation_0-rmse:704.96088
[6]	validation_0-rmse:699.33841
[7]	validation_0-rmse:687.44058
[8]	validation_0-rmse:679.97607
[9]	validation_0-rmse:675.89532
[10]	validation_0-rmse:669.75539
[11]	validation_0-rmse:657.45941
[12]	validation_0-rmse:652.00979
[13]	validation_0-rmse:644.14745
[14]	validation_0-rmse:639.56083
[15]	validation_0-rmse:637.09862
[16]	validation_0-rmse:633.96971
[17]	validation_0-rmse:629.73797
[18]	validation_0-rmse:623.45150
[19]	validation_0-rmse:621.38923
[20]	validation_0-rmse:620.11960
[21]	validation_0-rmse:613.19795
[22]	validation_0-rmse:612.11065
[23]	validation_0-rmse:611.38997
[24]	validation_0-rmse:610.03781
[25]	validation_0-rmse:607.16954
[26]	validation_0-rmse:606.75789
[27]	validation_0-rmse:606.61137
[28]	validation_0-rmse:606.12255
[29]	validation_0-rmse:606.45652
[30]	validation_0-

Run history:
best_iteration  ▁
best_score  ▁
epoch  ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▆▇▇▇▇▇███
validation_0-rmse  █▄▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁

Run summary:
best_iteration  54.0
best_score  591.97718
epoch  64.0


## Experiment Description
This experiment runs 5 hyperparameter configurations for the XGBRegressor model, randomly sampling the learning rate, gamma, min child weight, early stopping rounds, and number of estimators. Looking at the results on W&B (https://wandb.ai/northwesterncsai/xgboost_hyperparameter_tuning/runs/137771tu), sweep run 3 (137771tu) performed best, with 20 early stopping rounds, a gamma of 0.29, a learning rate of 0.11, a min child weight of 50, and 761 estimators. It reached a best_score (validation RMSE) of 548.68844 at iteration 184 and stopped at epoch 204. Sweep run 1 was a close second at 548.92132, while the three runs with learning rates above 0.35 and min child weights above 90 scored between 572 and 592.
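
The comparison behind this description can be reproduced from the run summaries logged above. The sketch below copies each run's `best_score` by hand into a dictionary (the variable names are hypothetical) and ranks the runs from lowest to highest validation RMSE.

```python
# best_score (validation RMSE) per run, copied from the run summaries above.
sweep_results = {
    "xvag0boe": 548.92132,
    "vin0kayp": 572.66735,
    "137771tu": 548.68844,
    "ujq5gns3": 585.79636,
    "zfbyrp3s": 591.97718,
}

# Lower RMSE is better, so rank in ascending order.
ranking = sorted(sweep_results.items(), key=lambda item: item[1])
best_run, best_score = ranking[0]

for run_id, score in ranking:
    print(f"{run_id}: {score:.2f}")
print(f"best run: {best_run}")
```

With only five random samples, the gap between the top two runs (about 0.2 RMSE) is too small to say which configuration is reliably better.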




In [26]:
# TO DO
# Start experiment tracking with W&B
# Do at least 5 experiments with various hyperparameters
# Choose any method for hyperparameter tuning: grid search, random search, bayesian search
# Describe your findings and what you see