# Week 4 - Models and Experimentation

## Step 1 Training a model

For the purposes of this demo, we will be using this [adapted demo](https://www.datacamp.com/tutorial/xgboost-in-python) and training an XGBoost model, and then doing some experimentation and hyperparameter tuning.


If running this notebook locally, use the following steps to create virtual environment:
- Don't use past python 3.10
- To create virtual environment use "venv"

`python -m venv NAME`

- Try to avoid anaconda, poetry or similar package management platforms
- To install a package use pip

`python -m pip install <package-name>`

- once you are done working with this virtual environment, deactivate it with `deactivate`

### Install packages

In [6]:
!pip install wandb -qU

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.2/2.2 MB[0m [31m15.1 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m207.3/207.3 kB[0m [31m17.5 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m267.1/267.1 kB[0m [31m21.7 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m62.7/62.7 kB[0m [31m8.3 MB/s[0m eta [36m0:00:00[0m
[?25h

In [7]:
import xgboost as xgb
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error


### Import data

We will be using Diamonds dataset imported from Seaborn. It is also available on [Kaggle](https://www.kaggle.com/datasets/shivam2503/diamonds).

Read about the features by following the link. We will be predicting the price of diamonds.

In [8]:
diamonds = sns.load_dataset('diamonds')
diamonds.head()

Unnamed: 0,carat,cut,color,clarity,depth,table,price,x,y,z
0,0.23,Ideal,E,SI2,61.5,55.0,326,3.95,3.98,2.43
1,0.21,Premium,E,SI1,59.8,61.0,326,3.89,3.84,2.31
2,0.23,Good,E,VS1,56.9,65.0,327,4.05,4.07,2.31
3,0.29,Premium,I,VS2,62.4,58.0,334,4.2,4.23,2.63
4,0.31,Good,J,SI2,63.3,58.0,335,4.34,4.35,2.75


In [9]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype   
---  ------   --------------  -----   
 0   carat    53940 non-null  float64 
 1   cut      53940 non-null  category
 2   color    53940 non-null  category
 3   clarity  53940 non-null  category
 4   depth    53940 non-null  float64 
 5   table    53940 non-null  float64 
 6   price    53940 non-null  int64   
 7   x        53940 non-null  float64 
 8   y        53940 non-null  float64 
 9   z        53940 non-null  float64 
dtypes: category(3), float64(6), int64(1)
memory usage: 3.0 MB


In [10]:
diamonds.shape

(53940, 10)

In [11]:
X,y = diamonds.drop('price', axis=1), diamonds[['price']]

# For the cut, color and clarity use pandas category to enable XGBoost ability to deal with categorical data.

X['cut'] = X['cut'].astype('category')
X['color'] = X['color'].astype('category')
X['clarity'] = X['clarity'].astype('category')

### Split the data and train a model

In [12]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)

In [13]:
# Define hyperparameters
params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}

n = 100
model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
)


    E.g. tree_method = "hist", device = "cuda"



In [14]:
# Define evaluation metrics - Root Mean Squared Error

predictions = model.predict(dtest)
rmse = mean_squared_error(y_test, predictions, squared=False)
print(f"RMSE: {rmse}")

RMSE: 532.8838153117543



    E.g. tree_method = "hist", device = "cuda"



### Incorporate validation

In [15]:
params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}
n = 100

# Create the validation set
evals = [(dtrain, "train"), (dtest, "validation")]

In [16]:
evals = [(dtrain, "train"), (dtest, "validation")]

model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=10,
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630



    E.g. tree_method = "hist", device = "cuda"



[10]	train-rmse:550.99470	validation-rmse:571.16640
[20]	train-rmse:491.51435	validation-rmse:544.08058
[30]	train-rmse:464.38845	validation-rmse:537.01895
[40]	train-rmse:445.99106	validation-rmse:533.85127
[50]	train-rmse:430.36010	validation-rmse:532.90320
[60]	train-rmse:418.87898	validation-rmse:533.04629
[70]	train-rmse:409.66247	validation-rmse:533.58046
[80]	train-rmse:397.34048	validation-rmse:534.31963
[90]	train-rmse:389.94294	validation-rmse:532.61946
[99]	train-rmse:377.70831	validation-rmse:532.88383


In [17]:
# Incorporate early stopping
n = 10000


model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=50,
   # Activate early stopping
   early_stopping_rounds=50
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630



    E.g. tree_method = "hist", device = "cuda"



[50]	train-rmse:430.36010	validation-rmse:532.90320
[100]	train-rmse:377.56825	validation-rmse:532.79980
[102]	train-rmse:376.20429	validation-rmse:532.59813


In [18]:
# Cross-validation

params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}
n = 1000

results = xgb.cv(
   params, dtrain,
   num_boost_round=n,
   nfold=5,
   early_stopping_rounds=20
)



    E.g. tree_method = "hist", device = "cuda"



In [19]:
results.head()

Unnamed: 0,train-rmse-mean,train-rmse-std,test-rmse-mean,test-rmse-std
0,2861.153015,8.266765,2861.773555,36.937516
1,2081.378004,5.534608,2084.973481,32.064109
2,1545.361682,3.287745,1553.681211,31.059209
3,1182.364236,3.585787,1192.464771,26.157805
4,941.828819,2.971779,958.467497,23.613538


In [20]:
best_rmse = results['test-rmse-mean'].min()

best_rmse

549.1039652582465

## Start W&B


- Login into your W&B profile using the code below
- Alternatively you can set environment variables. There are several env variables which you can set to change the behavior of W&B logging. The most important are:
    - WANDB_API_KEY - find this in your "Settings" section under your profile
    - WANDB_BASE_URL - this is the url of the W&B server

- Find your API Token in "Profile" -> "Setttings" in the W&B App



In [21]:
# Log in to your W&B account
import wandb

wandb.login()

<IPython.core.display.Javascript object>

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

 ··········


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [22]:
# TO DO
# Start experiment tracking with W&B
# Do at least 5 experiments with various hyperparameters
# Choose any method for hyperparameter tuning: grid search, random search, bayesian search
# Describe your findings and what you see

### Grid Search

In [23]:
for learning_rate in [0.01, 0.1, 0.2]:
    for max_depth in [6, 9, 12]:
        params = {
            "objective": "reg:squarederror",
            "tree_method": "gpu_hist",
            "learning_rate": learning_rate,
            "max_depth": max_depth
        }
        # Initialize a W&B run
        with wandb.init(project='diamonds_price_prediction', config=params):
            model = xgb.train(
                params=params,
                dtrain=dtrain,
                num_boost_round=100,
                evals=evals,
                early_stopping_rounds=50
            )
            predictions = model.predict(dtest)
            rmse = mean_squared_error(y_test, predictions, squared=False)
            wandb.log({"rmse": rmse})


[34m[1mwandb[0m: Currently logged in as: [33msaiganeshnellore[0m ([33mnsg[0m). Use [1m`wandb login --relogin`[0m to force relogin


[0]	train-rmse:3951.92462	validation-rmse:3949.02748
[1]	train-rmse:3914.29025	validation-rmse:3911.21998
[2]	train-rmse:3877.10275	validation-rmse:3873.86672
[3]	train-rmse:3840.30732	validation-rmse:3836.90210
[4]	train-rmse:3803.89438	validation-rmse:3800.32256
[5]	train-rmse:3767.86467	validation-rmse:3764.12914
[6]	train-rmse:3732.20828	validation-rmse:3728.30718
[7]	train-rmse:3696.92823	validation-rmse:3692.85775
[8]	train-rmse:3662.02397	validation-rmse:3657.78952
[9]	train-rmse:3627.46912	validation-rmse:3623.08587
[10]	train-rmse:3593.28635	validation-rmse:3588.73717
[11]	train-rmse:3559.44161	validation-rmse:3554.71475
[12]	train-rmse:3525.97874	validation-rmse:3521.06781
[13]	train-rmse:3492.86570	validation-rmse:3487.77529
[14]	train-rmse:3460.11145	validation-rmse:3454.83778
[15]	train-rmse:3427.65383	validation-rmse:3422.21992
[16]	train-rmse:3395.56677	validation-rmse:3389.96690



    E.g. tree_method = "hist", device = "cuda"



[17]	train-rmse:3363.75667	validation-rmse:3358.08757
[18]	train-rmse:3332.31965	validation-rmse:3326.47775
[19]	train-rmse:3301.17180	validation-rmse:3295.26749
[20]	train-rmse:3270.41375	validation-rmse:3264.35438
[21]	train-rmse:3239.92519	validation-rmse:3233.88165
[22]	train-rmse:3209.73054	validation-rmse:3203.59201
[23]	train-rmse:3179.80215	validation-rmse:3173.60675
[24]	train-rmse:3150.18892	validation-rmse:3144.01912
[25]	train-rmse:3120.92669	validation-rmse:3114.64985
[26]	train-rmse:3091.93693	validation-rmse:3085.54884
[27]	train-rmse:3063.14791	validation-rmse:3056.74740
[28]	train-rmse:3034.75572	validation-rmse:3028.31985
[29]	train-rmse:3006.57676	validation-rmse:3000.08035
[30]	train-rmse:2978.78205	validation-rmse:2972.18011
[31]	train-rmse:2951.11499	validation-rmse:2944.49426
[32]	train-rmse:2923.86171	validation-rmse:2917.20940
[33]	train-rmse:2896.92357	validation-rmse:2890.16929
[34]	train-rmse:2870.06897	validation-rmse:2863.29618
[35]	train-rmse:2843.69881	v


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1610.09514


[0]	train-rmse:3951.31161	validation-rmse:3948.51603
[1]	train-rmse:3913.06182	validation-rmse:3910.16519
[2]	train-rmse:3875.20416	validation-rmse:3872.21117
[3]	train-rmse:3837.72759	validation-rmse:3834.64671
[4]	train-rmse:3800.65435	validation-rmse:3797.48598
[5]	train-rmse:3763.95868	validation-rmse:3760.68426
[6]	train-rmse:3727.63526	validation-rmse:3724.29360
[7]	train-rmse:3691.68229	validation-rmse:3688.30457



    E.g. tree_method = "hist", device = "cuda"



[8]	train-rmse:3656.09006	validation-rmse:3652.64018
[9]	train-rmse:3620.86228	validation-rmse:3617.41609
[10]	train-rmse:3585.99076	validation-rmse:3582.49463
[11]	train-rmse:3551.46856	validation-rmse:3547.91563
[12]	train-rmse:3517.29885	validation-rmse:3513.70542
[13]	train-rmse:3483.47667	validation-rmse:3479.86643
[14]	train-rmse:3449.99711	validation-rmse:3446.32743
[15]	train-rmse:3416.86206	validation-rmse:3413.18331
[16]	train-rmse:3384.05851	validation-rmse:3380.31172
[17]	train-rmse:3351.59675	validation-rmse:3347.88861
[18]	train-rmse:3319.46008	validation-rmse:3315.73005
[19]	train-rmse:3287.65596	validation-rmse:3283.93800
[20]	train-rmse:3256.17609	validation-rmse:3252.44695
[21]	train-rmse:3225.00835	validation-rmse:3221.23390
[22]	train-rmse:3194.16374	validation-rmse:3190.42093
[23]	train-rmse:3163.63536	validation-rmse:3159.91866
[24]	train-rmse:3133.41055	validation-rmse:3129.66044
[25]	train-rmse:3103.50265	validation-rmse:3099.75892
[26]	train-rmse:3073.88879	val


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1566.3593


[0]	train-rmse:3951.24816	validation-rmse:3948.46957
[1]	train-rmse:3912.93494	validation-rmse:3910.09797
[2]	train-rmse:3875.01078	validation-rmse:3872.13959



    E.g. tree_method = "hist", device = "cuda"



[3]	train-rmse:3837.47090	validation-rmse:3834.54663
[4]	train-rmse:3800.30990	validation-rmse:3797.36513
[5]	train-rmse:3763.52904	validation-rmse:3760.50873
[6]	train-rmse:3727.11946	validation-rmse:3724.02586
[7]	train-rmse:3691.08401	validation-rmse:3687.98543
[8]	train-rmse:3655.41441	validation-rmse:3652.25947
[9]	train-rmse:3620.10283	validation-rmse:3616.93894
[10]	train-rmse:3585.14767	validation-rmse:3581.96120
[11]	train-rmse:3550.54746	validation-rmse:3547.32872
[12]	train-rmse:3516.29645	validation-rmse:3513.04184
[13]	train-rmse:3482.39393	validation-rmse:3479.17655
[14]	train-rmse:3448.82679	validation-rmse:3445.59950
[15]	train-rmse:3415.60276	validation-rmse:3412.38092
[16]	train-rmse:3382.71827	validation-rmse:3379.50498
[17]	train-rmse:3350.16456	validation-rmse:3346.96400
[18]	train-rmse:3317.93852	validation-rmse:3314.83716
[19]	train-rmse:3286.04289	validation-rmse:3282.94787
[20]	train-rmse:3254.46982	validation-rmse:3251.45466
[21]	train-rmse:3223.21401	validati


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1566.03389


[0]	train-rmse:3610.50768	validation-rmse:3606.11213
[1]	train-rmse:3271.50935	validation-rmse:3265.19719
[2]	train-rmse:2967.13455	validation-rmse:2960.13325
[3]	train-rmse:2692.91239	validation-rmse:2685.88366
[4]	train-rmse:2448.37676	validation-rmse:2441.67887
[5]	train-rmse:2228.50936	validation-rmse:2221.28015
[6]	train-rmse:2031.32603	validation-rmse:2024.25763
[7]	train-rmse:1855.37730	validation-rmse:1849.21702
[8]	train-rmse:1697.65725	validation-rmse:1692.26806
[9]	train-rmse:1556.75129	validation-rmse:1549.67485
[10]	train-rmse:1430.42789	validation-rmse:1423.68486
[11]	train-rmse:1319.23364	validation-rmse:1313.11177
[12]	train-rmse:1219.65530	validation-rmse:1213.57580
[13]	train-rmse:1131.26668	validation-rmse:1125.66809
[14]	train-rmse:1052.85512	validation-rmse:1048.28505
[15]	train-rmse:982.80467	validation-rmse:978.20323
[16]	train-rmse:921.95189	validation-rmse:918.57590



    E.g. tree_method = "hist", device = "cuda"



[17]	train-rmse:868.86000	validation-rmse:866.21695
[18]	train-rmse:821.45477	validation-rmse:819.79389
[19]	train-rmse:780.15752	validation-rmse:780.41317
[20]	train-rmse:744.73249	validation-rmse:746.78021
[21]	train-rmse:712.48331	validation-rmse:716.67222
[22]	train-rmse:685.73700	validation-rmse:691.37735
[23]	train-rmse:662.00527	validation-rmse:669.30421
[24]	train-rmse:641.58231	validation-rmse:651.29228
[25]	train-rmse:622.49311	validation-rmse:634.41151
[26]	train-rmse:607.23098	validation-rmse:620.24614
[27]	train-rmse:593.31149	validation-rmse:608.92380
[28]	train-rmse:581.00149	validation-rmse:598.52109
[29]	train-rmse:571.35028	validation-rmse:590.04102
[30]	train-rmse:562.18272	validation-rmse:582.61892
[31]	train-rmse:554.44091	validation-rmse:576.08962
[32]	train-rmse:547.95331	validation-rmse:570.85828
[33]	train-rmse:542.14337	validation-rmse:566.15872
[34]	train-rmse:536.73555	validation-rmse:562.66761
[35]	train-rmse:532.00657	validation-rmse:559.28528
[36]	train-r


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,524.19663


[0]	train-rmse:3603.99757	validation-rmse:3600.70490
[1]	train-rmse:3257.43184	validation-rmse:3253.84723
[2]	train-rmse:2946.12814	validation-rmse:2942.41856
[3]	train-rmse:2666.59383	validation-rmse:2663.57073
[4]	train-rmse:2415.46982	validation-rmse:2412.66304
[5]	train-rmse:2189.64548	validation-rmse:2187.59034



    E.g. tree_method = "hist", device = "cuda"



[6]	train-rmse:1987.35933	validation-rmse:1986.75644
[7]	train-rmse:1806.19552	validation-rmse:1807.76939
[8]	train-rmse:1643.27805	validation-rmse:1646.28140
[9]	train-rmse:1497.96636	validation-rmse:1502.30642
[10]	train-rmse:1367.57441	validation-rmse:1374.18944
[11]	train-rmse:1251.53223	validation-rmse:1261.17126
[12]	train-rmse:1148.33409	validation-rmse:1160.91190
[13]	train-rmse:1056.23912	validation-rmse:1073.16000
[14]	train-rmse:975.04060	validation-rmse:995.36773
[15]	train-rmse:902.62192	validation-rmse:926.97663
[16]	train-rmse:838.98619	validation-rmse:868.19912
[17]	train-rmse:782.90746	validation-rmse:816.46797
[18]	train-rmse:734.15536	validation-rmse:772.10621
[19]	train-rmse:690.68822	validation-rmse:734.02191
[20]	train-rmse:653.59273	validation-rmse:700.72099
[21]	train-rmse:621.01482	validation-rmse:672.33908
[22]	train-rmse:591.82088	validation-rmse:648.01667
[23]	train-rmse:566.19995	validation-rmse:628.37680
[24]	train-rmse:544.27253	validation-rmse:612.08846



    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,527.30582


[0]	train-rmse:3603.31398	validation-rmse:3600.22336
[1]	train-rmse:3255.87528	validation-rmse:3252.66502
[2]	train-rmse:2943.58334	validation-rmse:2941.17465



    E.g. tree_method = "hist", device = "cuda"



[3]	train-rmse:2663.10188	validation-rmse:2661.58381
[4]	train-rmse:2411.03791	validation-rmse:2410.95913
[5]	train-rmse:2184.58864	validation-rmse:2186.06253
[6]	train-rmse:1981.28099	validation-rmse:1984.67675
[7]	train-rmse:1798.95219	validation-rmse:1805.17636
[8]	train-rmse:1635.02283	validation-rmse:1644.47919
[9]	train-rmse:1488.10139	validation-rmse:1501.34128
[10]	train-rmse:1355.91065	validation-rmse:1374.40075
[11]	train-rmse:1237.73545	validation-rmse:1261.11212
[12]	train-rmse:1131.74288	validation-rmse:1161.65959
[13]	train-rmse:1036.86365	validation-rmse:1074.30024
[14]	train-rmse:952.29719	validation-rmse:997.60953
[15]	train-rmse:876.45020	validation-rmse:929.89679
[16]	train-rmse:808.91817	validation-rmse:871.14262
[17]	train-rmse:748.42169	validation-rmse:821.53338
[18]	train-rmse:694.33909	validation-rmse:777.17978
[19]	train-rmse:646.06692	validation-rmse:739.04015
[20]	train-rmse:603.22361	validation-rmse:706.52017
[21]	train-rmse:565.21893	validation-rmse:678.864


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,543.44322


[0]	train-rmse:3233.38044	validation-rmse:3227.27781
[1]	train-rmse:2635.32347	validation-rmse:2627.44665
[2]	train-rmse:2158.02535	validation-rmse:2148.42569
[3]	train-rmse:1781.80965	validation-rmse:1774.35390
[4]	train-rmse:1484.40305	validation-rmse:1475.02506
[5]	train-rmse:1251.51190	validation-rmse:1243.19153
[6]	train-rmse:1070.04470	validation-rmse:1063.46539
[7]	train-rmse:931.50004	validation-rmse:927.48947
[8]	train-rmse:825.29999	validation-rmse:824.51632
[9]	train-rmse:744.62177	validation-rmse:747.45284
[10]	train-rmse:683.82356	validation-rmse:692.25352
[11]	train-rmse:640.02367	validation-rmse:653.28257
[12]	train-rmse:608.14455	validation-rmse:625.02267
[13]	train-rmse:583.05241	validation-rmse:601.49639
[14]	train-rmse:565.19881	validation-rmse:588.04170
[15]	train-rmse:550.69623	validation-rmse:575.80210
[16]	train-rmse:539.91286	validation-rmse:567.92907
[17]	train-rmse:531.25198	validation-rmse:561.50809



    E.g. tree_method = "hist", device = "cuda"



[18]	train-rmse:524.47897	validation-rmse:556.58120
[19]	train-rmse:519.42462	validation-rmse:553.05511
[20]	train-rmse:514.24715	validation-rmse:550.19953
[21]	train-rmse:510.67820	validation-rmse:547.97200
[22]	train-rmse:506.29592	validation-rmse:546.50881
[23]	train-rmse:503.56289	validation-rmse:544.45359
[24]	train-rmse:499.94904	validation-rmse:542.21078
[25]	train-rmse:497.16952	validation-rmse:541.81032
[26]	train-rmse:494.13797	validation-rmse:539.97113
[27]	train-rmse:492.35133	validation-rmse:539.52006
[28]	train-rmse:490.79416	validation-rmse:539.60194
[29]	train-rmse:488.14693	validation-rmse:538.75695
[30]	train-rmse:485.78736	validation-rmse:537.11338
[31]	train-rmse:483.36163	validation-rmse:535.86078
[32]	train-rmse:480.65328	validation-rmse:535.03029
[33]	train-rmse:477.43378	validation-rmse:534.33605
[34]	train-rmse:475.76858	validation-rmse:533.65049
[35]	train-rmse:474.07508	validation-rmse:533.42376
[36]	train-rmse:471.91065	validation-rmse:532.21912
[37]	train-r


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,528.78762


[0]	train-rmse:3219.33381	validation-rmse:3215.67318
[1]	train-rmse:2605.76001	validation-rmse:2602.50890
[2]	train-rmse:2117.12339	validation-rmse:2116.54980
[3]	train-rmse:1728.37872	validation-rmse:1731.74425
[4]	train-rmse:1421.32717	validation-rmse:1429.46391
[5]	train-rmse:1179.90421	validation-rmse:1192.53953
[6]	train-rmse:991.55946	validation-rmse:1010.88296



    E.g. tree_method = "hist", device = "cuda"



[7]	train-rmse:845.30584	validation-rmse:874.16018
[8]	train-rmse:733.73972	validation-rmse:773.82550
[9]	train-rmse:650.08703	validation-rmse:700.28160
[10]	train-rmse:586.54473	validation-rmse:647.91706
[11]	train-rmse:539.62488	validation-rmse:611.14486
[12]	train-rmse:505.03047	validation-rmse:586.70409
[13]	train-rmse:479.41941	validation-rmse:569.33790
[14]	train-rmse:461.62796	validation-rmse:558.02503
[15]	train-rmse:446.29622	validation-rmse:550.51089
[16]	train-rmse:433.29448	validation-rmse:544.67261
[17]	train-rmse:423.11755	validation-rmse:541.08063
[18]	train-rmse:414.66021	validation-rmse:539.27889
[19]	train-rmse:409.33202	validation-rmse:537.78697
[20]	train-rmse:402.10695	validation-rmse:536.87034
[21]	train-rmse:396.74221	validation-rmse:535.99708
[22]	train-rmse:391.77613	validation-rmse:536.04305
[23]	train-rmse:388.67365	validation-rmse:535.91931
[24]	train-rmse:384.94653	validation-rmse:535.72035
[25]	train-rmse:381.77087	validation-rmse:535.93579
[26]	train-rmse


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,546.29685


[0]	train-rmse:3217.83343	validation-rmse:3214.66438
[1]	train-rmse:2602.13630	validation-rmse:2600.82904
[2]	train-rmse:2111.72985	validation-rmse:2114.16783



    E.g. tree_method = "hist", device = "cuda"



[3]	train-rmse:1721.45750	validation-rmse:1730.05651
[4]	train-rmse:1411.27899	validation-rmse:1428.59849
[5]	train-rmse:1165.15099	validation-rmse:1193.70312
[6]	train-rmse:970.54790	validation-rmse:1012.80740
[7]	train-rmse:817.76823	validation-rmse:877.60149
[8]	train-rmse:697.63489	validation-rmse:777.17345
[9]	train-rmse:603.72056	validation-rmse:706.05451
[10]	train-rmse:531.87223	validation-rmse:653.97026
[11]	train-rmse:474.52661	validation-rmse:619.87326
[12]	train-rmse:430.74522	validation-rmse:593.90690
[13]	train-rmse:394.22816	validation-rmse:576.30086
[14]	train-rmse:366.76698	validation-rmse:566.23577
[15]	train-rmse:345.24781	validation-rmse:558.59428
[16]	train-rmse:324.30737	validation-rmse:553.38875
[17]	train-rmse:306.56395	validation-rmse:550.49999
[18]	train-rmse:296.13579	validation-rmse:548.25369
[19]	train-rmse:284.62208	validation-rmse:547.41576
[20]	train-rmse:273.14734	validation-rmse:546.54901
[21]	train-rmse:264.49993	validation-rmse:545.21005
[22]	train-r


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,552.90571


Observations on Model Performance:
RMSE Trends: Throughout the experiments, the Root Mean Square Error (RMSE) generally decreased as the models continued training, which indicates that the models were learning effectively from the training data.

Validation Set Performance: The RMSE on the validation set showed slight fluctuations, suggesting a need to monitor for overfitting or to adjust model complexity.

**Impact of Hyperparameters:**

Changes in learning rate, max depth, and other parameters significantly influenced model performance. For instance, lower learning rates might have caused slower convergence, whereas higher rates might lead to overshooting the minimum.

Important parameters which had more influence is learning_rate


The number of boosting rounds also played a crucial role. In some runs, increasing the rounds continued to improve the RMSE, suggesting that further training was beneficial up to a point.



rmse:524.1966314359638
*   learning_rate:0.1
*   max_depth:6




### Random Search

In [24]:
import random

for _ in range(5):
    params = {
        "objective": "reg:squarederror",
        "tree_method": "gpu_hist",
        "learning_rate": random.uniform(0.01, 0.2),
        "max_depth": random.randint(6, 12)
    }
    with wandb.init(project='diamonds_price_prediction', config=params):
        model = xgb.train(
            params=params,
            dtrain=dtrain,
            num_boost_round=100,
            evals=evals,
            early_stopping_rounds=50
        )
        predictions = model.predict(dtest)
        rmse = mean_squared_error(y_test, predictions, squared=False)
        wandb.log({"rmse": rmse})


[0]	train-rmse:3630.21272	validation-rmse:3626.49109
[1]	train-rmse:3304.93808	validation-rmse:3300.23857
[2]	train-rmse:3010.81222	validation-rmse:3005.32242
[3]	train-rmse:2744.70338	validation-rmse:2738.95184
[4]	train-rmse:2504.05344	validation-rmse:2498.85603
[5]	train-rmse:2286.42476	validation-rmse:2282.12875
[6]	train-rmse:2089.18432	validation-rmse:2085.81751
[7]	train-rmse:1910.67143	validation-rmse:1907.96545



    E.g. tree_method = "hist", device = "cuda"



[8]	train-rmse:1750.08958	validation-rmse:1748.52848
[9]	train-rmse:1605.48924	validation-rmse:1605.24593
[10]	train-rmse:1475.17870	validation-rmse:1475.51489
[11]	train-rmse:1357.71560	validation-rmse:1359.08053
[12]	train-rmse:1252.60160	validation-rmse:1256.20355
[13]	train-rmse:1158.67783	validation-rmse:1164.19042
[14]	train-rmse:1074.43494	validation-rmse:1082.44751
[15]	train-rmse:999.43192	validation-rmse:1010.14866
[16]	train-rmse:932.55408	validation-rmse:945.63031
[17]	train-rmse:872.81700	validation-rmse:888.78185
[18]	train-rmse:819.99626	validation-rmse:838.31408
[19]	train-rmse:772.96456	validation-rmse:794.27974
[20]	train-rmse:731.20176	validation-rmse:755.09995
[21]	train-rmse:694.75281	validation-rmse:722.18920
[22]	train-rmse:663.05025	validation-rmse:692.88491
[23]	train-rmse:634.65037	validation-rmse:667.92279
[24]	train-rmse:609.87346	validation-rmse:645.90324
[25]	train-rmse:588.06363	validation-rmse:627.48136
[26]	train-rmse:569.56937	validation-rmse:612.01187


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.016 MB of 0.016 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,523.11639


[0]	train-rmse:3864.51969	validation-rmse:3861.58778
[1]	train-rmse:3743.23660	validation-rmse:3740.03062
[2]	train-rmse:3625.99984	validation-rmse:3622.57147
[3]	train-rmse:3512.63409	validation-rmse:3509.14648
[4]	train-rmse:3402.98702	validation-rmse:3399.37519
[5]	train-rmse:3296.99538	validation-rmse:3293.25322
[6]	train-rmse:3194.49128	validation-rmse:3190.89910
[7]	train-rmse:3095.38601	validation-rmse:3091.73688



    E.g. tree_method = "hist", device = "cuda"



[8]	train-rmse:2999.53428	validation-rmse:2995.81877
[9]	train-rmse:2906.89546	validation-rmse:2903.48223
[10]	train-rmse:2817.30473	validation-rmse:2814.09878
[11]	train-rmse:2730.66458	validation-rmse:2727.50474
[12]	train-rmse:2646.92178	validation-rmse:2644.18258
[13]	train-rmse:2565.85054	validation-rmse:2563.25650
[14]	train-rmse:2487.50151	validation-rmse:2485.18811
[15]	train-rmse:2411.80007	validation-rmse:2409.75036
[16]	train-rmse:2338.61272	validation-rmse:2336.81024
[17]	train-rmse:2267.84634	validation-rmse:2266.33221
[18]	train-rmse:2199.45319	validation-rmse:2198.21973
[19]	train-rmse:2133.19797	validation-rmse:2132.48444
[20]	train-rmse:2069.25963	validation-rmse:2068.84842
[21]	train-rmse:2007.41469	validation-rmse:2007.21770
[22]	train-rmse:1947.71484	validation-rmse:1947.95799
[23]	train-rmse:1890.05577	validation-rmse:1890.64357
[24]	train-rmse:1834.15769	validation-rmse:1835.13255
[25]	train-rmse:1780.27638	validation-rmse:1782.10406
[26]	train-rmse:1728.19204	val


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,550.80093


[0]	train-rmse:3793.75550	validation-rmse:3790.47245
[1]	train-rmse:3607.88083	validation-rmse:3603.97220
[2]	train-rmse:3431.64302	validation-rmse:3427.43496
[3]	train-rmse:3264.60725	validation-rmse:3259.71186
[4]	train-rmse:3106.26827	validation-rmse:3101.85485
[5]	train-rmse:2956.23683	validation-rmse:2951.25910
[6]	train-rmse:2813.93377	validation-rmse:2808.93639
[7]	train-rmse:2679.02378	validation-rmse:2674.06203
[8]	train-rmse:2550.82884	validation-rmse:2545.83057
[9]	train-rmse:2429.59448	validation-rmse:2424.80112
[10]	train-rmse:2314.69769	validation-rmse:2309.84208



    E.g. tree_method = "hist", device = "cuda"



[11]	train-rmse:2205.46414	validation-rmse:2200.99475
[12]	train-rmse:2102.43338	validation-rmse:2098.80558
[13]	train-rmse:2004.64198	validation-rmse:2001.46877
[14]	train-rmse:1911.87948	validation-rmse:1909.34512
[15]	train-rmse:1823.94405	validation-rmse:1821.75468
[16]	train-rmse:1741.06350	validation-rmse:1739.48475
[17]	train-rmse:1662.52572	validation-rmse:1661.44820
[18]	train-rmse:1588.22364	validation-rmse:1587.63707
[19]	train-rmse:1517.97988	validation-rmse:1517.71601
[20]	train-rmse:1451.39242	validation-rmse:1452.04051
[21]	train-rmse:1388.53949	validation-rmse:1390.04476
[22]	train-rmse:1329.35243	validation-rmse:1332.01354
[23]	train-rmse:1273.50655	validation-rmse:1276.92611
[24]	train-rmse:1220.82026	validation-rmse:1224.96879
[25]	train-rmse:1171.10883	validation-rmse:1176.42369
[26]	train-rmse:1124.29115	validation-rmse:1130.72881
[27]	train-rmse:1080.10780	validation-rmse:1087.62329
[28]	train-rmse:1038.65475	validation-rmse:1047.11690
[29]	train-rmse:999.58085	va


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,523.59295


[0]	train-rmse:3318.47154	validation-rmse:3315.31479
[1]	train-rmse:2765.36432	validation-rmse:2763.13663
[2]	train-rmse:2310.28168	validation-rmse:2310.20936
[3]	train-rmse:1935.37221	validation-rmse:1938.71132



    E.g. tree_method = "hist", device = "cuda"



[4]	train-rmse:1627.79182	validation-rmse:1635.70200
[5]	train-rmse:1375.53845	validation-rmse:1390.77221
[6]	train-rmse:1168.90419	validation-rmse:1194.43361
[7]	train-rmse:1000.34300	validation-rmse:1036.50376
[8]	train-rmse:863.90446	validation-rmse:911.08930
[9]	train-rmse:753.15991	validation-rmse:815.40817
[10]	train-rmse:663.68561	validation-rmse:740.43956
[11]	train-rmse:594.09521	validation-rmse:683.32393
[12]	train-rmse:536.87422	validation-rmse:642.25237
[13]	train-rmse:490.64790	validation-rmse:611.90287
[14]	train-rmse:453.49511	validation-rmse:591.14911
[15]	train-rmse:423.70226	validation-rmse:575.82492
[16]	train-rmse:399.36363	validation-rmse:564.98100
[17]	train-rmse:380.11878	validation-rmse:557.10064
[18]	train-rmse:364.09269	validation-rmse:552.83922
[19]	train-rmse:350.80314	validation-rmse:548.60245
[20]	train-rmse:338.34780	validation-rmse:546.65115
[21]	train-rmse:330.55674	validation-rmse:544.63450
[22]	train-rmse:318.66562	validation-rmse:543.55723
[23]	train


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,549.5194


[0]	train-rmse:3784.38376	validation-rmse:3780.99540
[1]	train-rmse:3590.34810	validation-rmse:3586.00062
[2]	train-rmse:3407.05510	validation-rmse:3402.04952
[3]	train-rmse:3233.97807	validation-rmse:3228.18009
[4]	train-rmse:3070.14862	validation-rmse:3064.69086
[5]	train-rmse:2915.61763	validation-rmse:2909.89417
[6]	train-rmse:2769.34484	validation-rmse:2763.86727
[7]	train-rmse:2631.29351	validation-rmse:2625.69169
[8]	train-rmse:2499.95997	validation-rmse:2494.82788
[9]	train-rmse:2376.55585	validation-rmse:2371.59413
[10]	train-rmse:2259.61728	validation-rmse:2254.96517
[11]	train-rmse:2149.06048	validation-rmse:2144.16968
[12]	train-rmse:2044.97187	validation-rmse:2040.51982



    E.g. tree_method = "hist", device = "cuda"



[13]	train-rmse:1946.32173	validation-rmse:1941.90833
[14]	train-rmse:1853.72247	validation-rmse:1850.01058
[15]	train-rmse:1766.19805	validation-rmse:1762.85120
[16]	train-rmse:1683.61329	validation-rmse:1680.72754
[17]	train-rmse:1605.84618	validation-rmse:1602.78250
[18]	train-rmse:1532.62072	validation-rmse:1529.74929
[19]	train-rmse:1463.43252	validation-rmse:1461.00849
[20]	train-rmse:1398.25382	validation-rmse:1395.91428
[21]	train-rmse:1337.27054	validation-rmse:1335.29151
[22]	train-rmse:1279.64429	validation-rmse:1278.52445
[23]	train-rmse:1225.41440	validation-rmse:1224.99272
[24]	train-rmse:1174.62224	validation-rmse:1174.83223
[25]	train-rmse:1126.49389	validation-rmse:1127.50385
[26]	train-rmse:1081.72254	validation-rmse:1083.47990
[27]	train-rmse:1039.55611	validation-rmse:1042.31902
[28]	train-rmse:1000.27805	validation-rmse:1003.67902
[29]	train-rmse:963.27748	validation-rmse:967.52703
[30]	train-rmse:928.75300	validation-rmse:934.06362
[31]	train-rmse:896.49488	valida


    E.g. tree_method = "hist", device = "cuda"



VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,527.32834


#### Hyperparameter W&B Sweeps with Bayes Method

In [29]:
import wandb
wandb.login()

sweep_config = {
    'method': 'bayes',
    'metric': {
        'name': 'rmse',
        'goal': 'minimize'
    },
    'parameters': {
        'max_depth': {
            'values': [3, 6, 9, 12]
        },
        'learning_rate': {
            'values': [0.01, 0.1, 0.2]
        },
        'n_estimators': {
            'values': [100, 200, 300]
        },
        'subsample': {
            'values': [0.8, 0.9, 1.0]
        },
        'colsample_bytree': {
            'values': [0.5, 0.75, 1.0]
        }
    }
}

sweep_id = wandb.sweep(sweep_config, project="diamonds_price_prediction")


Create sweep with ID: yhuzk2ul
Sweep URL: https://wandb.ai/nsg/diamonds_price_prediction/sweeps/yhuzk2ul


In [30]:
def train():
    wandb.init()

    # Load dataset and preprocess
    diamonds = sns.load_dataset('diamonds')
    X = diamonds.drop('price', axis=1)
    y = diamonds['price']
    X['cut'] = X['cut'].astype('category')
    X['color'] = X['color'].astype('category')
    X['clarity'] = X['clarity'].astype('category')
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
    dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)

    # Get hyperparameters from W&B
    config = wandb.config

    params = {
        "objective": "reg:squarederror",
        "tree_method": "gpu_hist",
        "max_depth": config.max_depth,
        "learning_rate": config.learning_rate,
        "n_estimators": config.n_estimators,
        "subsample": config.subsample,
        "colsample_bytree": config.colsample_bytree
    }

    # Train model
    model = xgb.train(params, dtrain, num_boost_round=100)

    # Predict and calculate RMSE
    predictions = model.predict(dtest)
    rmse = mean_squared_error(y_test, predictions, squared=False)

    # Log the RMSE to W&B
    wandb.log({"rmse": rmse})

    wandb.finish()


In [32]:
wandb.agent(sweep_id, train, count=25)

[34m[1mwandb[0m: Agent Starting Run: qd8lopzs with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.75
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	max_depth: 12
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.929006453958731, max=1.0)…

0,1
rmse,▁

0,1
rmse,1623.08621


[34m[1mwandb[0m: Agent Starting Run: rvazewkj with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,612.80815


[34m[1mwandb[0m: Agent Starting Run: 6qkxk4l5 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 300
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11229749631811488, max=1.…

0,1
rmse,▁

0,1
rmse,605.06089


[34m[1mwandb[0m: Agent Starting Run: 8zcrpt27 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,546.71962


[34m[1mwandb[0m: Agent Starting Run: 908b7zly with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.75
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 300
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,529.78454


[34m[1mwandb[0m: Agent Starting Run: 99by3d1h with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.75
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,578.58018


[34m[1mwandb[0m: Agent Starting Run: btslqcay with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 300
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11132332878581173, max=1.…

0,1
rmse,▁

0,1
rmse,581.45812


[34m[1mwandb[0m: Agent Starting Run: u9xbtfmn with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 300
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11130308265890697, max=1.…

0,1
rmse,▁

0,1
rmse,609.72705


[34m[1mwandb[0m: Agent Starting Run: 8vuugzcr with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11123237835379718, max=1.…

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Agent Starting Run: 21gdf9uq with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,609.72705


[34m[1mwandb[0m: Agent Starting Run: o5t4ielu with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,610.32823


[34m[1mwandb[0m: Agent Starting Run: 4qlbb3t6 with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,583.92454


[34m[1mwandb[0m: Agent Starting Run: 7mi6mn3e with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,581.45812


[34m[1mwandb[0m: Agent Starting Run: c5wjhd59 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11254256004417043, max=1.…

0,1
rmse,▁

0,1
rmse,604.2648


[34m[1mwandb[0m: Agent Starting Run: glzsvs4l with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 300
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 5utvy1mm with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,604.2648


[34m[1mwandb[0m: Agent Starting Run: ciktpiuq with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 1


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,581.45812


[34m[1mwandb[0m: Agent Starting Run: 4bkmatfv with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,583.92454


[34m[1mwandb[0m: Agent Starting Run: z69u9g2m with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,607.23496


[34m[1mwandb[0m: Agent Starting Run: ni0l31q3 with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11264494754279404, max=1.…

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Agent Starting Run: zjbh7p2z with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Agent Starting Run: 83c1l2kf with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Agent Starting Run: 73saupud with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.75
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,580.22251


[34m[1mwandb[0m: Agent Starting Run: qak5bqbv with config:
[34m[1mwandb[0m: 	colsample_bytree: 1
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.9


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112679366664855, max=1.0…

VBox(children=(Label(value='0.010 MB of 0.010 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,587.47154


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 7wu3jp6g with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.75
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11120203673395163, max=1.…

0,1
rmse,▁

0,1
rmse,580.22251


## Experiment Observation and Analysis

### Objective
The goal of this series of experiments was to optimize the hyperparameters of an XGBoost model trained on the Diamonds dataset. We aimed to minimize the root mean squared error (RMSE) of the model's predictions on the test dataset. We used Weights & Biases (W&B) for experiment tracking and hyperparameter sweeps.

### Methodology
We employed W&B's Bayesian optimization for the hyperparameter tuning process, focusing on several key parameters:
- `max_depth`: The maximum depth of the trees.
- `learning_rate`: The step size shrinkage used to prevent overfitting.
- `n_estimators`: The number of trees in the forest.
- `subsample`: The fraction of samples to be used for fitting the individual base learners.
- `colsample_bytree`: The fraction of features to be used for each tree.

These parameters were varied over defined ranges to observe their effect on the RMSE.

### Results and Insights
The experiments were conducted over multiple runs, each with a different set of hyperparameters. Here are some of the noteworthy observations:

- The lowest RMSE achieved was 529.78, with the hyperparameters set to `colsample_bytree: 0.75`, `learning_rate: 0.2`, `max_depth: 6`, `n_estimators: 300`, and `subsample: 1.0`.
- Increasing the `max_depth` and `n_estimators` generally led to better performance, indicating that more complex models were able to capture the underlying patterns in the data more effectively.
- The `learning_rate` at 0.2 combined with higher values of `max_depth` and `n_estimators` consistently yielded better results, suggesting that a higher learning rate works well with a more complex model in this dataset.
- The optimal `subsample` and `colsample_bytree` values hovered around 0.9 and 0.75, respectively, balancing the model's ability to train on diverse data samples while avoiding overfitting.