<a href="https://colab.research.google.com/github/DevAmbani/weights-biases/blob/main/Week4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Week 4 - Models and Experimentation

## Step 1 Training a model

For the purposes of this demo, we will be using this [adapted demo](https://www.datacamp.com/tutorial/xgboost-in-python) and training an XGBoost model, and then doing some experimentation and hyperparameter tuning.


If running this notebook locally, use the following steps to create virtual environment:
- Don't use past python 3.10
- To create virtual environment use "venv"

`python -m venv NAME`

- Try to avoid anaconda, poetry or similar package management platforms
- To install a package use pip

`python -m pip install <package-name>`

- once you are done working with this virtual environment, deactivate it with `deactivate`

In [1]:
import sys
print(sys.version)

3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]


In [2]:
!apt install python3.10-venv
!python -m venv Week4HW

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  python3-pip-whl python3-setuptools-whl
The following NEW packages will be installed:
  python3-pip-whl python3-setuptools-whl python3.10-venv
0 upgraded, 3 newly installed, 0 to remove and 45 not upgraded.
Need to get 2,473 kB of archives.
After this operation, 2,884 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3-pip-whl all 22.0.2+dfsg-1ubuntu0.4 [1,680 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3-setuptools-whl all 59.6.0-1.2ubuntu0.22.04.1 [788 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3.10-venv amd64 3.10.12-1~22.04.3 [5,716 B]
Fetched 2,473 kB in 3s (867 kB/s)
Selecting previously unselected package python3-pip-whl.
(Reading database ... 121752 files and directories currently installed.)
Prep

### Install packages

In [3]:
!pip install wandb -qU

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.2/2.2 MB[0m [31m33.3 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m207.3/207.3 kB[0m [31m16.8 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m267.1/267.1 kB[0m [31m21.8 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m62.7/62.7 kB[0m [31m7.7 MB/s[0m eta [36m0:00:00[0m
[?25h

In [4]:
import xgboost as xgb
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

### Import data

We will be using Diamonds dataset imported from Seaborn. It is also available on [Kaggle](https://www.kaggle.com/datasets/shivam2503/diamonds).

Read about the features by following the link. We will be predicting the price of diamonds.

In [5]:
diamonds = sns.load_dataset('diamonds')
diamonds.head()

Unnamed: 0,carat,cut,color,clarity,depth,table,price,x,y,z
0,0.23,Ideal,E,SI2,61.5,55.0,326,3.95,3.98,2.43
1,0.21,Premium,E,SI1,59.8,61.0,326,3.89,3.84,2.31
2,0.23,Good,E,VS1,56.9,65.0,327,4.05,4.07,2.31
3,0.29,Premium,I,VS2,62.4,58.0,334,4.2,4.23,2.63
4,0.31,Good,J,SI2,63.3,58.0,335,4.34,4.35,2.75


In [6]:
diamonds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53940 entries, 0 to 53939
Data columns (total 10 columns):
 #   Column   Non-Null Count  Dtype   
---  ------   --------------  -----   
 0   carat    53940 non-null  float64 
 1   cut      53940 non-null  category
 2   color    53940 non-null  category
 3   clarity  53940 non-null  category
 4   depth    53940 non-null  float64 
 5   table    53940 non-null  float64 
 6   price    53940 non-null  int64   
 7   x        53940 non-null  float64 
 8   y        53940 non-null  float64 
 9   z        53940 non-null  float64 
dtypes: category(3), float64(6), int64(1)
memory usage: 3.0 MB


In [7]:
diamonds.shape

(53940, 10)

In [8]:
X,y = diamonds.drop('price', axis=1), diamonds[['price']]

# For the cut, color and clarity use pandas category to enable XGBoost ability to deal with categorical data.

X['cut'] = X['cut'].astype('category')
X['color'] = X['color'].astype('category')
X['clarity'] = X['clarity'].astype('category')

### Split the data and train a model

In [9]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create DMatrix
dtrain = xgb.DMatrix(X_train, label=y_train, enable_categorical=True)
dtest = xgb.DMatrix(X_test, label=y_test, enable_categorical=True)

In [10]:
# Define hyperparameters
params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}

n = 100
model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
)


    E.g. tree_method = "hist", device = "cuda"



In [11]:
# Define evaluation metrics - Root Mean Squared Error

predictions = model.predict(dtest)
rmse = mean_squared_error(y_test, predictions, squared=False)
print(f"RMSE: {rmse}")

RMSE: 532.8838153117543



    E.g. tree_method = "hist", device = "cuda"



### Incorporate validation

In [12]:
params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}
n = 100

# Create the validation set
evals = [(dtrain, "train"), (dtest, "validation")]

In [13]:
evals = [(dtrain, "train"), (dtest, "validation")]

model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=10,
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630
[10]	train-rmse:550.99470	validation-rmse:571.16640
[20]	train-rmse:491.51435	validation-rmse:544.08058



    E.g. tree_method = "hist", device = "cuda"



[30]	train-rmse:464.38845	validation-rmse:537.01895
[40]	train-rmse:445.99106	validation-rmse:533.85127
[50]	train-rmse:430.36010	validation-rmse:532.90320
[60]	train-rmse:418.87898	validation-rmse:533.04629
[70]	train-rmse:409.66247	validation-rmse:533.58046
[80]	train-rmse:397.34048	validation-rmse:534.31963
[90]	train-rmse:389.94294	validation-rmse:532.61946
[99]	train-rmse:377.70831	validation-rmse:532.88383


In [14]:
# Incorporate early stopping
n = 10000


model = xgb.train(
   params=params,
   dtrain=dtrain,
   num_boost_round=n,
   evals=evals,
   verbose_eval=50,
   # Activate early stopping
   early_stopping_rounds=50
)

[0]	train-rmse:2859.49097	validation-rmse:2851.62630



    E.g. tree_method = "hist", device = "cuda"



[50]	train-rmse:430.36010	validation-rmse:532.90320
[100]	train-rmse:377.56825	validation-rmse:532.79980
[103]	train-rmse:375.44970	validation-rmse:532.50220


In [15]:
# Cross-validation

params = {"objective": "reg:squarederror", "tree_method": "gpu_hist"}
n = 1000

results = xgb.cv(
   params, dtrain,
   num_boost_round=n,
   nfold=5,
   early_stopping_rounds=20
)


In [16]:
results.head()

Unnamed: 0,train-rmse-mean,train-rmse-std,test-rmse-mean,test-rmse-std
0,2861.153015,8.266765,2861.773555,36.937516
1,2081.378004,5.534608,2084.973481,32.064109
2,1545.361682,3.287745,1553.681211,31.059209
3,1182.364236,3.585787,1192.464771,26.157805
4,941.828819,2.971779,958.467497,23.613538


In [17]:
best_rmse = results['test-rmse-mean'].min()

best_rmse

549.1039652582465

## Start W&B


- Login into your W&B profile using the code below
- Alternatively you can set environment variables. There are several env variables which you can set to change the behavior of W&B logging. The most important are:
    - WANDB_API_KEY - find this in your "Settings" section under your profile
    - WANDB_BASE_URL - this is the url of the W&B server

- Find your API Token in "Profile" -> "Setttings" in the W&B App



In [18]:
# Log in to your W&B account
import wandb

wandb.login()

<IPython.core.display.Javascript object>

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

 ··········


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [20]:
sweep_config = {
    'method': 'random',  # Can be 'grid', 'random', 'bayesian'
    'metric': {
        'name': 'rmse',
        'goal': 'minimize'
    },
    'parameters': {
        'learning_rate': {
            'min': 0.01,
            'max': 0.2
        },
        'max_depth': {
            'values': [3, 5, 7, 9]
        },
        'subsample': {
            'min': 0.6,
            'max': 0.9
        },
        'colsample_bytree': {
            'min': 0.6,
            'max': 0.9
        },
        'n_estimators': {
            'values': [50, 100, 150, 200]
        }
    }
}

In [21]:
sweep_id = wandb.sweep(sweep_config, project="xgboost_diamonds_sweep_experiments", entity='devambani')

Create sweep with ID: y3o2ro2k
Sweep URL: https://wandb.ai/devambani/xgboost_diamonds_sweep_experiments/sweeps/y3o2ro2k


In [25]:
def train():
    with wandb.init() as run:
        config = wandb.config

        params = {
            'objective': 'reg:squarederror',
            'learning_rate': config.learning_rate,
            'max_depth': int(config.max_depth),
            'subsample': config.subsample,
            'colsample_bytree': config.colsample_bytree,
            'eval_metric': 'rmse'
        }

        evals = [(dtrain, 'train'), (dtest, 'test')]
        model = xgb.train(params, dtrain, num_boost_round=100,
                          evals=evals, early_stopping_rounds=10)

        predictions = model.predict(dtest)
        rmse = np.sqrt(mean_squared_error(y_test, predictions))
        wandb.log({'rmse': rmse})

In [26]:
wandb.agent(sweep_id, train)

[34m[1mwandb[0m: Agent Starting Run: epsyyfgd with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7470157333200191
[34m[1mwandb[0m: 	learning_rate: 0.0931057403433692
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.6197332048089546


[0]	train-rmse:3663.79090	test-rmse:3659.13685
[1]	train-rmse:3349.38124	test-rmse:3344.38568
[2]	train-rmse:3066.36419	test-rmse:3061.30027
[3]	train-rmse:2816.88964	test-rmse:2811.37631
[4]	train-rmse:2603.99160	test-rmse:2599.50343
[5]	train-rmse:2394.21937	test-rmse:2391.97171
[6]	train-rmse:2223.30560	test-rmse:2222.55851
[7]	train-rmse:2055.83091	test-rmse:2056.82411
[8]	train-rmse:1882.25070	test-rmse:1884.63026
[9]	train-rmse:1741.47085	test-rmse:1746.49423
[10]	train-rmse:1632.98443	test-rmse:1641.26245
[11]	train-rmse:1500.26918	test-rmse:1510.64309
[12]	train-rmse:1382.55979	test-rmse:1395.45621
[13]	train-rmse:1275.92926	test-rmse:1291.39048
[14]	train-rmse:1205.84845	test-rmse:1224.89446
[15]	train-rmse:1116.79660	test-rmse:1138.26182
[16]	train-rmse:1052.52003	test-rmse:1077.10278
[17]	train-rmse:979.19071	test-rmse:1006.86589
[18]	train-rmse:914.34962	test-rmse:944.79823
[19]	train-rmse:856.90899	test-rmse:891.19205
[20]	train-rmse:816.36677	test-rmse:854.34870
[21]	trai

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,544.30118


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 8tkbgeto with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8382655303390566
[34m[1mwandb[0m: 	learning_rate: 0.06780090209968762
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.7989327556372321


[0]	train-rmse:3743.83083	test-rmse:3740.80294
[1]	train-rmse:3507.78901	test-rmse:3504.47497
[2]	train-rmse:3288.45864	test-rmse:3284.89117
[3]	train-rmse:3075.36175	test-rmse:3072.13196
[4]	train-rmse:2885.22407	test-rmse:2882.41024
[5]	train-rmse:2700.20276	test-rmse:2697.89538
[6]	train-rmse:2546.36484	test-rmse:2543.91019
[7]	train-rmse:2384.58762	test-rmse:2382.89670
[8]	train-rmse:2234.30032	test-rmse:2233.26862
[9]	train-rmse:2094.57074	test-rmse:2093.86309
[10]	train-rmse:1974.92406	test-rmse:1975.04059
[11]	train-rmse:1852.66844	test-rmse:1854.40146
[12]	train-rmse:1739.56780	test-rmse:1742.63415
[13]	train-rmse:1634.89070	test-rmse:1639.23659
[14]	train-rmse:1550.35754	test-rmse:1556.49487
[15]	train-rmse:1458.81826	test-rmse:1466.63244
[16]	train-rmse:1374.07394	test-rmse:1382.98343
[17]	train-rmse:1295.63101	test-rmse:1306.71889
[18]	train-rmse:1222.94397	test-rmse:1235.38895
[19]	train-rmse:1155.86146	test-rmse:1170.55865
[20]	train-rmse:1099.33440	test-rmse:1115.81298
[2

VBox(children=(Label(value='0.002 MB of 0.012 MB uploaded\r'), FloatProgress(value=0.18735518040383978, max=1.…

0,1
rmse,▁

0,1
rmse,531.71114


[34m[1mwandb[0m: Agent Starting Run: q9q9svye with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.87326927644894
[34m[1mwandb[0m: 	learning_rate: 0.056230102695253074
[34m[1mwandb[0m: 	max_depth: 7
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.7231991893807205


[0]	train-rmse:3786.68785	test-rmse:3783.14583
[1]	train-rmse:3589.38201	test-rmse:3585.40951
[2]	train-rmse:3403.57109	test-rmse:3398.80344
[3]	train-rmse:3222.01568	test-rmse:3217.25149
[4]	train-rmse:3056.82928	test-rmse:3051.59921
[5]	train-rmse:2895.74957	test-rmse:2890.47795
[6]	train-rmse:2758.27586	test-rmse:2752.57244
[7]	train-rmse:2613.65632	test-rmse:2608.05669
[8]	train-rmse:2478.04940	test-rmse:2472.28289
[9]	train-rmse:2350.19490	test-rmse:2344.37754
[10]	train-rmse:2238.30286	test-rmse:2232.39792
[11]	train-rmse:2124.02319	test-rmse:2118.40969
[12]	train-rmse:2016.82056	test-rmse:2011.74593
[13]	train-rmse:1916.35223	test-rmse:1911.58106
[14]	train-rmse:1832.59954	test-rmse:1827.91960
[15]	train-rmse:1742.53178	test-rmse:1738.83673
[16]	train-rmse:1657.49097	test-rmse:1654.09773
[17]	train-rmse:1578.04461	test-rmse:1575.28751
[18]	train-rmse:1503.08307	test-rmse:1500.62040
[19]	train-rmse:1433.25887	test-rmse:1431.44012
[20]	train-rmse:1372.74672	test-rmse:1371.35064
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,531.06758


[34m[1mwandb[0m: Agent Starting Run: wfa550w7 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8888074102390375
[34m[1mwandb[0m: 	learning_rate: 0.156692177535765
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.816315272711912


[0]	train-rmse:3423.87641	test-rmse:3420.94591
[1]	train-rmse:2930.45349	test-rmse:2927.82860
[2]	train-rmse:2518.55459	test-rmse:2518.06244
[3]	train-rmse:2149.70140	test-rmse:2151.64726
[4]	train-rmse:1862.33775	test-rmse:1866.67896
[5]	train-rmse:1601.08855	test-rmse:1609.15458
[6]	train-rmse:1418.74330	test-rmse:1430.96505
[7]	train-rmse:1230.84817	test-rmse:1246.26610
[8]	train-rmse:1075.31547	test-rmse:1096.83134
[9]	train-rmse:947.95014	test-rmse:974.28463
[10]	train-rmse:856.15971	test-rmse:888.31656
[11]	train-rmse:767.66172	test-rmse:805.51648
[12]	train-rmse:697.79284	test-rmse:741.58578
[13]	train-rmse:642.41274	test-rmse:693.16303
[14]	train-rmse:606.74709	test-rmse:665.55988
[15]	train-rmse:567.50202	test-rmse:632.43192
[16]	train-rmse:536.52620	test-rmse:608.02678
[17]	train-rmse:510.87515	test-rmse:589.03459
[18]	train-rmse:491.39159	test-rmse:574.61090
[19]	train-rmse:475.47650	test-rmse:562.94690
[20]	train-rmse:464.34719	test-rmse:557.45069
[21]	train-rmse:456.40554	

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,536.76099


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: 6jhclqwq with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8002823341802576
[34m[1mwandb[0m: 	learning_rate: 0.1889187482402675
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.6324657021093064


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011113003855555487, max=1.0…

[0]	train-rmse:3311.21504	test-rmse:3306.93598
[1]	train-rmse:2739.41764	test-rmse:2733.74017
[2]	train-rmse:2283.03000	test-rmse:2278.58227
[3]	train-rmse:1887.16652	test-rmse:1885.94043
[4]	train-rmse:1595.83088	test-rmse:1600.99085
[5]	train-rmse:1338.42511	test-rmse:1348.07571
[6]	train-rmse:1171.80155	test-rmse:1187.59703
[7]	train-rmse:999.91724	test-rmse:1021.48659
[8]	train-rmse:865.56575	test-rmse:894.09384
[9]	train-rmse:761.99133	test-rmse:798.92843
[10]	train-rmse:692.91026	test-rmse:739.87574
[11]	train-rmse:627.93062	test-rmse:683.27011
[12]	train-rmse:581.93220	test-rmse:643.29288
[13]	train-rmse:545.07309	test-rmse:614.57072
[14]	train-rmse:524.58300	test-rmse:601.97744
[15]	train-rmse:500.24025	test-rmse:585.39985
[16]	train-rmse:482.78272	test-rmse:574.26930
[17]	train-rmse:468.04651	test-rmse:566.45284
[18]	train-rmse:458.55712	test-rmse:560.39973
[19]	train-rmse:450.10747	test-rmse:557.03910
[20]	train-rmse:443.20057	test-rmse:554.75581
[21]	train-rmse:439.06509	tes

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,549.39399


[34m[1mwandb[0m: Agent Starting Run: 8irqn0kn with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8279543027683408
[34m[1mwandb[0m: 	learning_rate: 0.03089446375288491
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.8897124116552463


[0]	train-rmse:3882.79553	test-rmse:3879.07460
[1]	train-rmse:3777.35007	test-rmse:3772.86438
[2]	train-rmse:3675.19636	test-rmse:3670.04144
[3]	train-rmse:3576.61253	test-rmse:3571.11739
[4]	train-rmse:3481.42675	test-rmse:3475.29953
[5]	train-rmse:3389.54385	test-rmse:3382.85920
[6]	train-rmse:3302.78430	test-rmse:3295.23376
[7]	train-rmse:3216.88183	test-rmse:3209.31309
[8]	train-rmse:3134.35519	test-rmse:3126.41176
[9]	train-rmse:3054.68981	test-rmse:3046.15838
[10]	train-rmse:2979.41884	test-rmse:2970.34769
[11]	train-rmse:2904.66849	test-rmse:2894.70521
[12]	train-rmse:2831.62275	test-rmse:2820.70735
[13]	train-rmse:2762.29825	test-rmse:2751.29161
[14]	train-rmse:2697.23539	test-rmse:2685.63965
[15]	train-rmse:2632.48678	test-rmse:2620.42541
[16]	train-rmse:2569.23130	test-rmse:2557.15569
[17]	train-rmse:2507.83015	test-rmse:2495.20037
[18]	train-rmse:2449.13109	test-rmse:2436.70713
[19]	train-rmse:2393.63314	test-rmse:2380.88427
[20]	train-rmse:2340.61356	test-rmse:2327.86746
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,888.43061


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: ov02ox8b with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.6648167983265665
[34m[1mwandb[0m: 	learning_rate: 0.14643583648122604
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.815814858806591


[0]	train-rmse:3485.99281	test-rmse:3478.44399
[1]	train-rmse:3057.25280	test-rmse:3048.66098
[2]	train-rmse:2695.52382	test-rmse:2685.16514
[3]	train-rmse:2405.84537	test-rmse:2394.55873
[4]	train-rmse:2167.09468	test-rmse:2153.41248
[5]	train-rmse:1951.80149	test-rmse:1936.95346
[6]	train-rmse:1795.06020	test-rmse:1780.04685
[7]	train-rmse:1669.57703	test-rmse:1655.35380
[8]	train-rmse:1547.21516	test-rmse:1533.22869
[9]	train-rmse:1448.05366	test-rmse:1433.09719
[10]	train-rmse:1384.29923	test-rmse:1370.24205
[11]	train-rmse:1290.46904	test-rmse:1276.75673
[12]	train-rmse:1230.63847	test-rmse:1216.25213
[13]	train-rmse:1184.74309	test-rmse:1172.00371
[14]	train-rmse:1159.43426	test-rmse:1147.50191
[15]	train-rmse:1104.93059	test-rmse:1092.10048
[16]	train-rmse:1068.40615	test-rmse:1054.03871
[17]	train-rmse:1037.80176	test-rmse:1023.73407
[18]	train-rmse:1009.43645	test-rmse:993.91476
[19]	train-rmse:973.26366	test-rmse:957.32940
[20]	train-rmse:951.62098	test-rmse:935.88954
[21]	tr

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,603.13775


[34m[1mwandb[0m: Agent Starting Run: 0ehybpry with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.636354731194539
[34m[1mwandb[0m: 	learning_rate: 0.11881161156834188
[34m[1mwandb[0m: 	max_depth: 5
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.839835637920948


[0]	train-rmse:3575.22052	test-rmse:3568.77455
[1]	train-rmse:3193.21227	test-rmse:3185.68877
[2]	train-rmse:2859.21827	test-rmse:2850.54582
[3]	train-rmse:2573.07227	test-rmse:2563.90102
[4]	train-rmse:2337.33303	test-rmse:2325.82142
[5]	train-rmse:2112.15222	test-rmse:2100.90909
[6]	train-rmse:1939.19040	test-rmse:1927.98484
[7]	train-rmse:1791.38372	test-rmse:1780.11613
[8]	train-rmse:1642.21921	test-rmse:1632.51897
[9]	train-rmse:1514.51683	test-rmse:1506.12677
[10]	train-rmse:1425.90893	test-rmse:1417.96848
[11]	train-rmse:1303.35002	test-rmse:1295.12122
[12]	train-rmse:1223.39466	test-rmse:1216.26606
[13]	train-rmse:1156.28976	test-rmse:1151.02944
[14]	train-rmse:1111.87306	test-rmse:1109.21163
[15]	train-rmse:1033.87951	test-rmse:1032.03130
[16]	train-rmse:980.96684	test-rmse:979.36237
[17]	train-rmse:942.21682	test-rmse:942.36917
[18]	train-rmse:900.97369	test-rmse:902.25275
[19]	train-rmse:852.97775	test-rmse:853.91818
[20]	train-rmse:823.62784	test-rmse:824.91564
[21]	train-r

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,546.64836


[34m[1mwandb[0m: Agent Starting Run: 780qruvn with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8031828058099304
[34m[1mwandb[0m: 	learning_rate: 0.016686513865545344
[34m[1mwandb[0m: 	max_depth: 5
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.6794214025867659


[0]	train-rmse:3930.19585	test-rmse:3927.15758
[1]	train-rmse:3869.78604	test-rmse:3866.41331
[2]	train-rmse:3810.38755	test-rmse:3806.75621
[3]	train-rmse:3751.28475	test-rmse:3747.80975
[4]	train-rmse:3693.79238	test-rmse:3690.11506
[5]	train-rmse:3636.71494	test-rmse:3632.97776
[6]	train-rmse:3583.98711	test-rmse:3579.96419
[7]	train-rmse:3528.41618	test-rmse:3524.72245
[8]	train-rmse:3474.00857	test-rmse:3470.30765
[9]	train-rmse:3420.61357	test-rmse:3416.63072
[10]	train-rmse:3370.49371	test-rmse:3366.31288
[11]	train-rmse:3318.80357	test-rmse:3314.52093
[12]	train-rmse:3268.25249	test-rmse:3263.77385
[13]	train-rmse:3218.54743	test-rmse:3214.08077
[14]	train-rmse:3172.98312	test-rmse:3168.21389
[15]	train-rmse:3124.86786	test-rmse:3119.78403
[16]	train-rmse:3077.45568	test-rmse:3072.14888
[17]	train-rmse:3030.90811	test-rmse:3025.45106
[18]	train-rmse:2985.23700	test-rmse:2979.65954
[19]	train-rmse:2940.19354	test-rmse:2934.28541
[20]	train-rmse:2897.82306	test-rmse:2891.78678
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,1052.581


[34m[1mwandb[0m: Sweep Agent: Waiting for job.
[34m[1mwandb[0m: Job received.
[34m[1mwandb[0m: Agent Starting Run: rqggsxcq with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7722824501212442
[34m[1mwandb[0m: 	learning_rate: 0.06274388963310655
[34m[1mwandb[0m: 	max_depth: 3
[34m[1mwandb[0m: 	n_estimators: 50
[34m[1mwandb[0m: 	subsample: 0.7428567950395064


[0]	train-rmse:3772.84706	test-rmse:3768.06582
[1]	train-rmse:3566.85814	test-rmse:3560.61932
[2]	train-rmse:3375.00660	test-rmse:3367.53809
[3]	train-rmse:3201.05854	test-rmse:3193.23097
[4]	train-rmse:3039.28986	test-rmse:3029.93402
[5]	train-rmse:2885.29869	test-rmse:2875.52406
[6]	train-rmse:2746.01772	test-rmse:2735.03697
[7]	train-rmse:2615.93934	test-rmse:2604.89414
[8]	train-rmse:2491.63433	test-rmse:2479.78885
[9]	train-rmse:2374.14115	test-rmse:2361.11733
[10]	train-rmse:2272.18105	test-rmse:2258.49072
[11]	train-rmse:2168.23951	test-rmse:2153.39308
[12]	train-rmse:2076.45631	test-rmse:2062.08367
[13]	train-rmse:1991.01552	test-rmse:1976.50702
[14]	train-rmse:1916.72984	test-rmse:1901.87752
[15]	train-rmse:1842.98222	test-rmse:1828.29790
[16]	train-rmse:1774.16528	test-rmse:1759.33315
[17]	train-rmse:1706.78750	test-rmse:1692.00217
[18]	train-rmse:1648.07819	test-rmse:1633.91852
[19]	train-rmse:1593.58578	test-rmse:1580.23514
[20]	train-rmse:1545.37224	test-rmse:1532.27449
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,671.69553


[34m[1mwandb[0m: Agent Starting Run: cgi44bo8 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8229560195906361
[34m[1mwandb[0m: 	learning_rate: 0.03866340342755223
[34m[1mwandb[0m: 	max_depth: 9
[34m[1mwandb[0m: 	n_estimators: 150
[34m[1mwandb[0m: 	subsample: 0.8197204198311758


[0]	train-rmse:3849.43909	test-rmse:3846.69199
[1]	train-rmse:3710.61997	test-rmse:3707.70953
[2]	train-rmse:3577.34068	test-rmse:3574.02321
[3]	train-rmse:3444.63904	test-rmse:3441.49818
[4]	train-rmse:3321.44281	test-rmse:3318.39494
[5]	train-rmse:3198.76177	test-rmse:3195.69174
[6]	train-rmse:3091.87192	test-rmse:3088.51852
[7]	train-rmse:2977.92942	test-rmse:2974.89596
[8]	train-rmse:2868.59624	test-rmse:2865.74386
[9]	train-rmse:2763.67267	test-rmse:2760.97001
[10]	train-rmse:2669.78757	test-rmse:2667.22788
[11]	train-rmse:2572.40973	test-rmse:2570.25939
[12]	train-rmse:2479.21458	test-rmse:2477.00613
[13]	train-rmse:2389.80522	test-rmse:2388.31889
[14]	train-rmse:2312.99525	test-rmse:2311.78385
[15]	train-rmse:2229.86179	test-rmse:2229.04522
[16]	train-rmse:2150.16735	test-rmse:2149.72384
[17]	train-rmse:2073.80113	test-rmse:2073.96607
[18]	train-rmse:2000.06696	test-rmse:2000.40522
[19]	train-rmse:1929.61073	test-rmse:1930.75403
[20]	train-rmse:1866.90700	test-rmse:1868.34406
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,539.73536


[34m[1mwandb[0m: Agent Starting Run: xt6kzha3 with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8335758643543822
[34m[1mwandb[0m: 	learning_rate: 0.039241697342633275
[34m[1mwandb[0m: 	max_depth: 7
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.8407352799453298


[0]	train-rmse:3847.84663	test-rmse:3844.97378
[1]	train-rmse:3707.60784	test-rmse:3704.43155
[2]	train-rmse:3572.93588	test-rmse:3569.28344
[3]	train-rmse:3439.57777	test-rmse:3436.09836
[4]	train-rmse:3315.25747	test-rmse:3311.60829
[5]	train-rmse:3192.32198	test-rmse:3188.32337
[6]	train-rmse:3084.80818	test-rmse:3080.38848
[7]	train-rmse:2970.71170	test-rmse:2966.25508
[8]	train-rmse:2861.36460	test-rmse:2856.19145
[9]	train-rmse:2756.68798	test-rmse:2751.15717
[10]	train-rmse:2662.53124	test-rmse:2656.57164
[11]	train-rmse:2565.49679	test-rmse:2559.62709
[12]	train-rmse:2472.49759	test-rmse:2466.23174
[13]	train-rmse:2383.82666	test-rmse:2377.83730
[14]	train-rmse:2307.25061	test-rmse:2300.98391
[15]	train-rmse:2224.60388	test-rmse:2218.61436
[16]	train-rmse:2145.38092	test-rmse:2139.67216
[17]	train-rmse:2069.58533	test-rmse:2064.33309
[18]	train-rmse:1996.31029	test-rmse:1991.38115
[19]	train-rmse:1926.64007	test-rmse:1921.76205
[20]	train-rmse:1864.28791	test-rmse:1859.36380
[2

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,544.32912


[34m[1mwandb[0m: Agent Starting Run: gwjmbfim with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.7798838022576992
[34m[1mwandb[0m: 	learning_rate: 0.15033991249902837
[34m[1mwandb[0m: 	max_depth: 5
[34m[1mwandb[0m: 	n_estimators: 100
[34m[1mwandb[0m: 	subsample: 0.6240162434556568


[0]	train-rmse:3456.64791	test-rmse:3451.54173
[1]	train-rmse:2987.99579	test-rmse:2980.37733
[2]	train-rmse:2595.46750	test-rmse:2588.54425
[3]	train-rmse:2256.02512	test-rmse:2250.09851
[4]	train-rmse:1976.84532	test-rmse:1970.30844
[5]	train-rmse:1734.83739	test-rmse:1730.87910
[6]	train-rmse:1558.77486	test-rmse:1553.84641
[7]	train-rmse:1381.94640	test-rmse:1377.47120
[8]	train-rmse:1237.87494	test-rmse:1233.71252
[9]	train-rmse:1116.24643	test-rmse:1111.35454
[10]	train-rmse:1027.23972	test-rmse:1022.99415
[11]	train-rmse:944.30447	test-rmse:941.78208
[12]	train-rmse:873.93470	test-rmse:873.00649
[13]	train-rmse:816.55910	test-rmse:816.42484
[14]	train-rmse:783.32477	test-rmse:785.50380
[15]	train-rmse:740.73107	test-rmse:745.74474
[16]	train-rmse:706.80901	test-rmse:713.26085
[17]	train-rmse:676.90948	test-rmse:683.46823
[18]	train-rmse:654.84702	test-rmse:662.81757
[19]	train-rmse:635.71062	test-rmse:644.44227
[20]	train-rmse:622.77169	test-rmse:633.43227
[21]	train-rmse:611.44

VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
rmse,▁

0,1
rmse,547.47415


[34m[1mwandb[0m: Agent Starting Run: g8n4552t with config:
[34m[1mwandb[0m: 	colsample_bytree: 0.8411822129957723
[34m[1mwandb[0m: 	learning_rate: 0.16713020780320745
[34m[1mwandb[0m: 	max_depth: 7
[34m[1mwandb[0m: 	n_estimators: 200
[34m[1mwandb[0m: 	subsample: 0.6579110944889554
[34m[1mwandb[0m: Ctrl + C detected. Stopping sweep.


##Hyperparameter Variability and Impact


*   colsample_bytree (Feature Sampling Rate): The variability in this parameter, ranging from 0.639 to 0.892, suggests it plays a significant role in defining the model’s capability to generalize. This hyperparameter specifies the proportion of total features used in building each tree within the ensemble. The absence of a consistent pattern in its impact on RMSE underscores the complexity of its interaction with other hyperparameters, indicating a need for tailored adjustments based on specific dataset characteristics and feature correlations.

*   learning_rate (Step Size for Weight Updates): Set between approximately 0.017 and 0.16, the learning rate determines the speed at which a model adjusts its weights to minimize errors. Models with intermediate learning rates around 0.1 generally yielded the best performance, showcasing that moderate rates help balance between adequate learning time and avoiding overshooting the minima. Extremely low rates may result in a prolonged training process with little improvement per iteration, while excessively high rates might cause the learning process to converge prematurely or oscillate around the optimal weights.

*   max_depth (Tree Depth): The depth of the trees, which ranged from 3 to 9, directly impacts the model’s complexity and its ability to capture underlying patterns in the data. Shallow trees (e.g., depth of 3) tend to produce higher bias and underfitting, unable to model the data complexity adequately. In contrast, deeper trees (e.g., depth of 9) are prone to learning noise and specificities of the training data, leading to overfitting. Optimal tree depth thus needs careful calibration to balance model bias and variance, particularly in relation to the dataset size and feature interactions.

*   n_estimators (Number of Trees): The configuration varied with values like 50, 100, 150, and 200 trees. Increasing the number of trees generally enhances the model's accuracy and stability by averaging out predictions, reducing the likelihood of overfitting to some extent. However, beyond a certain threshold, additional trees do not improve performance significantly and might even increase computational complexity without substantial gains, highlighting the importance of setting this parameter in context with other model settings.

*   subsample (Data Sampling Rate for Building Trees): With values ranging from about 0.638 to 0.882, this parameter controls the fraction of data used for constructing each tree. A higher subsample rate typically increases the diversity among the trees in the model, potentially improving robustness by reducing variance. However, too high a subsample rate might lead to diminished model performance due to a decrease in training data randomness.














##Performance Analysis


*   RMSE Assessment: The root mean square error (RMSE) showcased a substantial range from about 533 to 1868. Lower RMSE scores indicate more accurate model predictions. The model configurations leading to the highest RMSE often featured imbalances, such as very low learning rates combined with insufficient numbers of trees, underlining the critical nature of appropriate hyperparameter settings.

*   The best-performing models, with the lowest RMSE scores around 533, employed more balanced configurations. These avoided extremes in any single parameter, demonstrating the effectiveness of moderate settings, particularly in terms of learning rates and the number of estimators used.



##Guidelines for Enhanced Model Tuning:



*   Strategic Depth and Complexity Management: Determining the optimal tree depth is crucial, balancing the need to capture sufficient data complexity with the risk of overfitting. This often requires adjusting based on the specific data being used, possibly through cross-validation techniques to test different depths under various conditions.

*   Optimizing Learning Dynamics: Fine-tuning the learning rate can significantly affect model performance. Incremental adjustments and employing techniques such as learning rate schedules or adaptive learning rate methods can optimize this parameter for better convergence behaviors.

*   Comprehensive Hyperparameter Optimization: Utilizing systematic approaches like grid search or random search can help in uncovering the most effective combinations of parameters. These methods allow for exploring a broad space of parameter settings, ensuring that the interdependencies between hyperparameters are adequately addressed, which is crucial for achieving the best model performance.




