# Week 4 - Models and Experimentation

## Step 1 Training a model

This demo adapts a [DataCamp XGBoost tutorial](https://www.datacamp.com/tutorial/xgboost-in-python): we train an XGBoost model, then run experiments and tune its hyperparameters.


If you run this notebook locally, create a virtual environment first:
- Use Python 3.10 or earlier.
- Create the virtual environment with `venv`:

`python -m venv NAME`

- Activate it with `source NAME/bin/activate` (Linux/macOS) or `NAME\Scripts\activate` (Windows).
- Avoid Anaconda, Poetry, or similar package management tools.
- Install packages with pip:

`python -m pip install <package-name>`

- When you are done working in the virtual environment, leave it with `deactivate`.

### Install packages

In [1]:
!pip install wandb -qU


In [2]:
import xgboost as xgb
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error


### Import data

We will use the Diamonds dataset, loaded from Seaborn. It is also available on [Kaggle](https://www.kaggle.com/datasets/shivam2503/diamonds).

Follow the link to read about the features. We will predict the price of each diamond.

In [3]:
diamonds = sns.load_dataset('diamonds')
diamonds.head()

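|   | carat | cut | color | clarity | depth | table | price | x | y | z |
|---|-------|-----|-------|---------|-------|-------|-------|---|---|---|
| 0 | 0.23 | Ideal | E | SI2 | 61.5 | 55.0 | 326 | 3.95 | 3.98 | 2.43 |
| 1 | 0.21 | Premium | E | SI1 | 59.8 | 61.0 | 326 | 3.89 | 3.84 | 2.31 |
| 2 | 0.23 | Good | E | VS1 | 56.9 | 65.0 | 327 | 4.05 | 4.07 | 2.31 |
| 3 | 0.29 | Premium | I | VS2 | 62.4 | 58.0 | 334 | 4.2 | 4.23 | 2.63 |
| 4 | 0.31 | Good | J | SI2 | 63.3 | 58.0 | 335 | 4.34 | 4.35 | 2.75 |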


In [4]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype   
---  ------   --------------  -----   
 0   carat    53940 non-null  float64 
 1   cut      53940 non-null  category
 2   color    53940 non-null  category
 3   clarity  53940 non-null  category
 4   depth    53940 non-null  float64 
 5   table    53940 non-null  float64 
 6   price    53940 non-null  int64   
 7   x        53940 non-null  float64 
 8   y        53940 non-null  float64 
 9   z        53940 non-null  float64 
dtypes: category(3), float64(6), int64(1)
memory usage: 3.0 MB


In [5]:
diamonds.shape

(53940, 10)

In [6]:
X,y = diamonds.drop('price', axis=1), diamonds[['price']]

# Store cut, color, and clarity as pandas categoricals so that XGBoost can handle them natively (see enable_categorical=True below).

X['cut'] = X['cut'].astype('category')
X['color'] = X['color'].astype('category')
X['clarity'] = X['clarity'].astype('category')

### Split the data and train a model

In [7]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)

In [8]:
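# Define hyperparameters
# XGBoost 2.0+ deprecates "gpu_hist"; use tree_method="hist" together with device="cuda"
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}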

n = 100
model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
)





In [9]:
# Evaluate on the test set with Root Mean Squared Error (RMSE)

predictions = model.predict(dtest)
# np.sqrt of the MSE avoids the squared=False argument, deprecated since scikit-learn 1.4
rmse = np.sqrt(mean_squared_error(y_test, predictions))
print(f"RMSE: {rmse}")

RMSE: 532.8838153117543
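For reference, RMSE is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same unit as the target (here, price). The following is a minimal pure-Python sketch of that formula; the function name `rmse` and the toy numbers are illustrative assumptions, not part of the notebook or of scikit-learn.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy prices made up for illustration; these are not model predictions
actual = [326, 327, 334, 335]
predicted = [330, 320, 340, 335]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.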






### Incorporate validation

In [10]:
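# XGBoost 2.0+ deprecates "gpu_hist"; use tree_method="hist" together with device="cuda"
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}
n = 100

# Evaluation sets reported during training (the test set doubles as the validation set here)
evals = [(dtrain, "train"), (dtest, "validation")]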

In [11]:

model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=10,
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630
[10]	train-rmse:550.99470	validation-rmse:571.16640
[20]	train-rmse:491.51435	validation-rmse:544.08058
[30]	train-rmse:464.38845	validation-rmse:537.01895
[40]	train-rmse:445.99106	validation-rmse:533.85127
[50]	train-rmse:430.36010	validation-rmse:532.90320
[60]	train-rmse:418.87898	validation-rmse:533.04629
[70]	train-rmse:409.66247	validation-rmse:533.58046
[80]	train-rmse:397.34048	validation-rmse:534.31963
[90]	train-rmse:389.94294	validation-rmse:532.61946
[99]	train-rmse:377.70831	validation-rmse:532.88383


In [12]:
# Incorporate early stopping
n = 10000


model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=50,
   # Activate early stopping
   early_stopping_rounds=50
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630
[50]	train-rmse:430.36010	validation-rmse:532.90320
[100]	train-rmse:377.56825	validation-rmse:532.79980
[102]	train-rmse:376.20429	validation-rmse:532.59813
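With `early_stopping_rounds=50`, training ends once the validation metric has gone 50 consecutive rounds without improving, which is why the run above stops long before the 10,000 rounds requested. The rule can be sketched in plain Python as follows; `early_stopping_round` and the toy score list are hypothetical names and values used only for illustration, and this is not XGBoost's own implementation.

```python
def early_stopping_round(scores, patience):
    """Return (stop_round, best_round) for validation scores where lower is better.

    Training stops at the first round where the best score is `patience`
    rounds old. If that never happens, stop_round is the final round.
    """
    best_round = 0
    best_score = float("inf")
    for round_idx, score in enumerate(scores):
        if score < best_score:
            # Strict improvement resets the patience counter
            best_score, best_round = score, round_idx
        elif round_idx - best_round >= patience:
            return round_idx, best_round
    return len(scores) - 1, best_round


# Toy validation curve: improves until round 2, then stalls
toy_scores = [10.0, 8.0, 7.0, 7.5, 7.2, 7.1, 7.3]
print(early_stopping_round(toy_scores, patience=3))
```

In the toy curve the best score occurs at round 2, and the third non-improving round is round 5, so the sketch stops there.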


In [13]:
# Cross-validation

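# XGBoost 2.0+ deprecates "gpu_hist"; use tree_method="hist" together with device="cuda"
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}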
n = 1000

results = xgb.cv(
   params, dtrain,
   num_boost_round=n,
   nfold=5,
   early_stopping_rounds=20
)






In [14]:
results.head()

|   | train-rmse-mean | train-rmse-std | test-rmse-mean | test-rmse-std |
|---|-----------------|----------------|----------------|---------------|
| 0 | 2861.153015 | 8.266765 | 2861.773555 | 36.937516 |
| 1 | 2081.378004 | 5.534608 | 2084.973481 | 32.064109 |
| 2 | 1545.361682 | 3.287745 | 1553.681211 | 31.059209 |
| 3 | 1182.364236 | 3.585787 | 1192.464771 | 26.157805 |
| 4 | 941.828819 | 2.971779 | 958.467497 | 23.613538 |


In [15]:
best_rmse = results['test-rmse-mean'].min()

best_rmse

549.1039652582465
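With `nfold=5`, the training data is divided into five folds and each fold is held out once while the model trains on the other four; the `-mean` and `-std` columns summarize the metric across those five fits. The splitting step can be sketched in plain Python as below. The helper `k_fold_indices` is a hypothetical illustration and does not reproduce how `xgb.cv` builds its folds internally.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal groups
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, test_idx


for train_idx, test_idx in k_fold_indices(n_samples=10, n_folds=5):
    print(len(train_idx), len(test_idx))
```

Every sample contributes to the test metric exactly once, which makes the averaged score less dependent on a single lucky or unlucky split.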

## Start W&B


- Log in to your W&B account using the code below.
- Alternatively, set environment variables. Several environment variables change how W&B logging behaves; the most important are:
    - WANDB_API_KEY - your API key, found under "Profile" -> "Settings" in the W&B App
    - WANDB_BASE_URL - the URL of the W&B server
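As a minimal sketch of the environment-variable route, the two variables can be set from Python before `wandb` is used. The helper name `configure_wandb_env` and the placeholder key are assumptions made for illustration; the default URL shown is the public W&B cloud endpoint and would differ for a self-hosted server.

```python
import os


def configure_wandb_env(api_key, base_url="https://api.wandb.ai"):
    """Set the W&B API key and server URL for the current process."""
    os.environ["WANDB_API_KEY"] = api_key
    os.environ["WANDB_BASE_URL"] = base_url


# Placeholder value for illustration only; never commit a real key to a notebook
configure_wandb_env(api_key="YOUR_API_KEY")
print(os.environ["WANDB_BASE_URL"])
```

In practice it is safer to export these variables in the shell or a secrets manager than to write the key into a notebook cell.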



In [17]:
# Log in to your W&B account
import wandb

wandb.login()

[34m[1mwandb[0m: Currently logged in as: [33mmehak-kawatra3[0m ([33mpracticum-msai[0m). Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [None]:
# TO DO
# Start experiment tracking with W&B
# Do at least 5 experiments with various hyperparameters
# Choose any method for hyperparameter tuning: grid search, random search, bayesian search
# Describe your findings and what you see

In [18]:
sweep_config = {
    "method": "random",  # random search method
    "metric": {
      "name": "rmse",
      "goal": "minimize"
    },
    "parameters": {
        'max_depth': {
            'values': [2, 6, 8, 12]
        },
        'subsample': {
            'min': 0.5,
            'max': 0.9
        },
        'colsample_bytree': {
            'min': 0.3,
            'max': 0.8  # fraction of features to use for each tree
        },
        'n_estimators': {
            'values': [50, 100, 150, 200]  # number of trees in the ensemble
        },
        'learning_rate': {
            'values': [0.01, 0.05, 0.1, 0.2]
        }
    }
}
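With `"method": "random"`, every run draws one value per parameter: entries with `values` act as a categorical choice, and entries with float `min`/`max` bounds are sampled from that continuous range. The sketch below mimics this in plain Python to show what kind of configurations to expect. `sample_config` is a hypothetical helper, not W&B's implementation, and it assumes a uniform draw over each range.

```python
import random


def sample_config(parameters, rng):
    """Draw one configuration from a W&B-style `parameters` dictionary."""
    config = {}
    for name, spec in parameters.items():
        if "values" in spec:
            config[name] = rng.choice(spec["values"])  # categorical choice
        else:
            config[name] = rng.uniform(spec["min"], spec["max"])  # continuous range
    return config


# Same search space as the sweep configuration above
parameters = {
    "max_depth": {"values": [2, 6, 8, 12]},
    "subsample": {"min": 0.5, "max": 0.9},
    "colsample_bytree": {"min": 0.3, "max": 0.8},
    "n_estimators": {"values": [50, 100, 150, 200]},
    "learning_rate": {"values": [0.01, 0.05, 0.1, 0.2]},
}

rng = random.Random(42)  # fixed seed so the draws are repeatable
for _ in range(3):
    print(sample_config(parameters, rng))
```

Random search does not guarantee coverage of every combination, so a small number of runs can leave parts of the space unexplored.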


In [19]:
sweep_id = wandb.sweep(sweep_config, project="xgb_experiments", entity='mehak-kawatra3')


Create sweep with ID: rzco21i4
Sweep URL: https://wandb.ai/mehak-kawatra3/xgb_experiments/sweeps/rzco21i4


In [20]:
def train():
    run = wandb.init()

    # Access the hyperparameters chosen by the sweep through wandb.config
    config = wandb.config

    # Define the model parameters. The number of trees is not set here:
    # the native xgb.train API takes it as num_boost_round, not n_estimators.
    params = {
        'objective': 'reg:squarederror',
        'learning_rate': config.learning_rate,
        'max_depth': int(config.max_depth),
        'subsample': config.subsample,
        'colsample_bytree': config.colsample_bytree,
        'eval_metric': 'rmse'
    }

    # Train the model
    model = xgb.train(params, dtrain, num_boost_round=int(config.n_estimators))

    # Evaluate the model on the test set
    predictions = model.predict(dtest)
    rmse = np.sqrt(mean_squared_error(y_test, predictions))

    # Log metrics to W&B
    wandb.log({'rmse': rmse})

    # Finish the W&B run
    run.finish()


In [21]:
wandb.agent(sweep_id, train, count=30)

[34m[1mwandb[0m: Agent Starting Run: jdd0r08t with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.3002910960452374
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8897132078963956
[34m[1mwandb[0m: Currently logged in as: [33mmehak-kawatra3[0m. Use [1m`wandb login --relogin`[0m to force relogin


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,667.06955


[34m[1mwandb[0m: Agent Starting Run: 6e65d1pu with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.46961041618940186
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.6342477273939169


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112978933331559, max=1.0…

VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,549.97421


[34m[1mwandb[0m: Agent Starting Run: hy8hol6e with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7053636040324154
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.7863729207326369


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,529.37551


[34m[1mwandb[0m: Agent Starting Run: 9h85lczm with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6619971012240453
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.8400986637107286


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,559.36382


[34m[1mwandb[0m: Agent Starting Run: qisa52sw with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7466522400581715
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 2
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.5428005740401134


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,693.20463


[34m[1mwandb[0m: Agent Starting Run: f5gq04lb with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7230395907610241
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 12
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.5494111531667127


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,563.53025


[34m[1mwandb[0m: Agent Starting Run: 50zvw9ni with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6823026387915789
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8070739341150763


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,528.3607


[34m[1mwandb[0m: Agent Starting Run: swx4gf7a with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6000507646992786
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 12
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.66030421723821


VBox(children=(Label(value='0.001 MB of 0.010 MB uploaded\r'), FloatProgress(value=0.11443742561567335, max=1.…

0,1
rmse,▁

0,1
rmse,638.83692


[34m[1mwandb[0m: Agent Starting Run: dfp3b9r8 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7650732568837884
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.7899386759319866


VBox(children=(Label(value='0.001 MB of 0.001 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1652.66289


[34m[1mwandb[0m: Agent Starting Run: mw1qdvi6 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.40812135256738186
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 2
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.651017457638056


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,759.5355


[34m[1mwandb[0m: Agent Starting Run: oqkvefat with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.4078755005679941
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.8968903124530123


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,968.3609


[34m[1mwandb[0m: Agent Starting Run: gva1pxm2 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.4996780176441682
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.7466617277691301


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,551.09547


[34m[1mwandb[0m: Agent Starting Run: 9rpblgo1 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.4008508423255483
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.5095343485581585


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,577.20612


[34m[1mwandb[0m: Agent Starting Run: cq2zxqyb with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5697370727122655
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.6700465960246103


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,2542.04155


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: b0j9ebkp with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5215458891663016
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 12
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.5654552945466302


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,650.45834


[34m[1mwandb[0m: Agent Starting Run: v8yobb9s with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.40800454600451547
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 2
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.7596737967729674


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,843.07818


[34m[1mwandb[0m: Agent Starting Run: mvj55yok with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5064582744637207
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.514804701790732


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,574.70888


[34m[1mwandb[0m: Agent Starting Run: k30au3c2 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5502125895656108
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.5888182221645369


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,591.08833


[34m[1mwandb[0m: Agent Starting Run: md18cock with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7072206412526583
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.7707613816650283


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,545.7932


[34m[1mwandb[0m: Agent Starting Run: y81wve0v with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.517138455792121
[34m[1mwandb[0m: 	learning_rate: 0.1
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.8390222247268104


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,603.89691


[34m[1mwandb[0m: Agent Starting Run: zliipoqp with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6286958839227917
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 2
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.784599436993237


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,968.20651


[34m[1mwandb[0m: Agent Starting Run: ug1leskw with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7002913572074307
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.8935177084155159


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,527.02804


[34m[1mwandb[0m: Agent Starting Run: w0lruzhu with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6983040358728058
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 12
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.8587948301800054


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,604.34289


[34m[1mwandb[0m: Agent Starting Run: fmhpsgfq with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.32027085745386874
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8220928662938758


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,644.36105


[34m[1mwandb[0m: Agent Starting Run: 1c45pn75 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.5975836544843653
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.704330323760876


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1159.79623


[34m[1mwandb[0m: Agent Starting Run: i5i103s6 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6438427531401683
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.6919184172939936


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,560.8084


[34m[1mwandb[0m: Agent Starting Run: ob0wnq7p with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.4590914294726232
[34m[1mwandb[0m: 	learning_rate: 0.01
[34m[1mwandb[0m: 	max_depth: 2
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.884515593864458


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,2742.41886


[34m[1mwandb[0m: Agent Starting Run: z1hl16i2 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.3701938667473554
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.7371922345969594


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,569.8021


[34m[1mwandb[0m: Agent Starting Run: n3de1rdv with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.3172474816236297
[34m[1mwandb[0m: 	learning_rate: 0.05
[34m[1mwandb[0m: 	max_depth: 8
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.5455834731789513


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1229.91074


[34m[1mwandb[0m: Agent Starting Run: 7sh5hy7h with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.4915358793106711
[34m[1mwandb[0m: 	learning_rate: 0.2
[34m[1mwandb[0m: 	max_depth: 6
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.5332878227086275


VBox(children=(Label(value='0.011 MB of 0.011 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,559.59784


Based on the sweep data from Weights & Biases, we can analyse how the XGBoost hyperparameters influence RMSE on the test set.

**`colsample_bytree`**: This controls the fraction of features used to build each tree and was sampled between 0.3 and 0.8. Its effect on RMSE was not consistent on its own, which suggests the model is fairly insensitive to feature sampling in this range, although the three best runs all used values near 0.7.

**`learning_rate`**: The learning rate was tested at 0.01, 0.05, 0.1, and 0.2 and has a large effect on convergence and performance. The three lowest RMSE values (527 to 529) came from a rate of 0.05 with 150 or 200 trees, while the best runs at 0.1 and 0.2 reached about 550 and 546. A rate of 0.01 learned too slowly for the tree counts tested (RMSE above 1,100 in every such run). The highest rate combined with deep trees gave worse results, for example 639 at depth 12 with 200 trees, which may reflect overshooting or overfitting.

**`max_depth`**: Maximum depth was tested at 2, 6, 8, and 12 and directly affects how complex a function the model can fit. Depth 8 was the most effective, a trade-off between fitting the data well and avoiding overfitting: depth 2 underfit (RMSE of 693 or higher in every run), and depth 12 did not improve on depth 8 (best RMSE about 564), which points to potential overfitting.

**`n_estimators`**: This is the number of trees in the model, tested at 50, 100, 150, and 200. More trees generally lowered RMSE, especially at low learning rates, but the gains tended to plateau. At a learning rate of 0.05 and depth 8, the runs with 150 and 200 trees all landed between 527 and 530, whereas 100 trees gave 591 and 50 trees gave 1,230 (the other sampled parameters also differed between these runs). Beyond the plateau, more trees mostly add computational cost.

**`subsample`**: The fraction of training rows used to build each tree was sampled between 0.5 and 0.9. Higher values tended to give lower RMSE, which suggests that seeing more of the data helped the model capture the underlying patterns without causing overfitting.

The best-performing configuration reached an RMSE of 527.02804 with balanced values for most parameters: `colsample_bytree` around 0.7, a `learning_rate` of 0.05, a `max_depth` of 8, 150 `n_estimators`, and a `subsample` rate near 0.89. This configuration underscores the importance of balancing learning dynamics and model complexity to achieve good predictive performance.
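The per-hyperparameter comparison can be checked directly from the sweep output. The sketch below copies the learning rate, depth, tree count, and RMSE of the 30 runs from the agent log and reports the best RMSE for each value of one parameter. The helper `best_rmse_by` is a hypothetical name; the same grouping could also be done in the W&B UI or with pandas.

```python
from collections import defaultdict

# (learning_rate, max_depth, n_estimators, rmse) copied from the sweep agent output
runs = [
    (0.1, 8, 200, 667.06955), (0.1, 8, 150, 549.97421), (0.05, 8, 150, 529.37551),
    (0.2, 8, 50, 559.36382), (0.1, 2, 200, 693.20463), (0.1, 12, 100, 563.53025),
    (0.05, 8, 200, 528.3607), (0.2, 12, 200, 638.83692), (0.01, 6, 100, 1652.66289),
    (0.2, 2, 100, 759.5355), (0.05, 6, 50, 968.3609), (0.1, 8, 150, 551.09547),
    (0.1, 6, 200, 577.20612), (0.01, 6, 50, 2542.04155), (0.1, 12, 50, 650.45834),
    (0.2, 2, 50, 843.07818), (0.2, 8, 100, 574.70888), (0.05, 8, 100, 591.08833),
    (0.2, 6, 150, 545.7932), (0.1, 8, 50, 603.89691), (0.05, 2, 100, 968.20651),
    (0.05, 8, 150, 527.02804), (0.2, 12, 150, 604.34289), (0.2, 6, 200, 644.36105),
    (0.01, 8, 150, 1159.79623), (0.2, 6, 50, 560.8084), (0.01, 2, 50, 2742.41886),
    (0.2, 6, 150, 569.8021), (0.05, 8, 50, 1229.91074), (0.2, 6, 150, 559.59784),
]


def best_rmse_by(runs, column):
    """Map each distinct value in `column` to the lowest RMSE seen with it."""
    best = defaultdict(lambda: float("inf"))
    for run in runs:
        best[run[column]] = min(best[run[column]], run[-1])
    return dict(sorted(best.items()))


print("best RMSE per learning_rate:", best_rmse_by(runs, column=0))
print("best RMSE per max_depth:", best_rmse_by(runs, column=1))
```

The best RMSE per value is only one view of the data: with 30 random runs, each value was paired with different settings of the other parameters, so these comparisons are indicative rather than controlled.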