In this notebook, we will use the Weights & Biases (W&B) integration for XGBoost to track experiments, and W&B Sweeps to run a hyperparameter search.

Note that I am using my own [fork](https://github.com/ayulockin/client) of the `wandb/client` repo where I have improved the existing integration for XGBoost. You can find the pending PR [here](https://github.com/wandb/client/pull/2929). It is expected to be merged soon. I hope this notebook shows you the benefits of using this callback. 

# Imports and Setup

In [None]:
!git clone https://github.com/ayulockin/client
%cd client
!pip -qq install .
%cd ..

In [None]:
import os
import json
import time
import numpy as np
import pandas as pd
from datetime import datetime

import xgboost as xgb
from xgboost.callback import EarlyStopping
from sklearn.metrics import mean_squared_error

The existing XGBoost integration (`wandb_callback`) uses an old-style callback, which will be deprecated in favor of `WandbCallback`.

In [None]:
import wandb
from wandb.xgboost import WandbCallback

# Login
wandb.login()

# Load Dataset

If you haven't already, check out the [Tutorial to the G-Research Crypto Competition](https://www.kaggle.com/cstein06/tutorial-to-the-g-research-crypto-competition).

In [None]:
crypto_df = pd.read_csv('../input/g-research-crypto-forecasting/train.csv')
assets = pd.read_csv('../input/g-research-crypto-forecasting/asset_details.csv').sort_values("Asset_ID").reset_index(drop=True)
crypto_df.head()

# Prepare Split

In [None]:
crypto_df['datetime'] = pd.to_datetime(crypto_df['timestamp'], unit='s')
train_df = crypto_df[crypto_df['datetime'] < '2021-06-13 00:00:00']
valid_df = crypto_df[crypto_df['datetime'] >= '2021-06-13 00:00:00']

print("Number of samples in train_df: ", len(train_df))
print("Number of samples in valid_df: ", len(valid_df))
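The split above is purely time-based: every row before the cutoff goes to training, and every row at or after it goes to validation. Here is a minimal sketch on a hypothetical three-row frame (the `toy` names and timestamps are made up for illustration) showing that the cutoff row itself lands in the validation set:

```python
import pandas as pd

# Hypothetical Unix timestamps (seconds) one minute apart, around the cutoff
toy = pd.DataFrame({"timestamp": [1623542340, 1623542400, 1623542460]})
toy["datetime"] = pd.to_datetime(toy["timestamp"], unit="s")

# Same comparison as in the cell above: strictly before vs. at-or-after
toy_train = toy[toy["datetime"] < "2021-06-13 00:00:00"]
toy_valid = toy[toy["datetime"] >= "2021-06-13 00:00:00"]

print("Rows in toy_train:", len(toy_train))
print("Rows in toy_valid:", len(toy_valid))
```

Because the two masks are complementary, no row is dropped or counted twice.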

# Features

In [None]:
# Features
featues_col = ["Count", "Open", "High", "Low", "Close", "Volume", "VWAP"]

def upper_shadow(df):
    return df['High'] - np.maximum(df['Close'], df['Open'])

def lower_shadow(df):
    return np.minimum(df['Close'], df['Open']) - df['Low']

def log_return(series, periods=1):
    return np.log(series).diff(periods=periods)

def fill_nan_inf(df):
    # Fill NaN values
    df = df.fillna(0)
    # Fill Inf values
    df = df.replace([np.inf, -np.inf], 0)
    
    return df

def create_features(df, label=False):
    """
    Create time series features
    """
    # Build features
    up_shadow = upper_shadow(df)
    low_shadow = lower_shadow(df)    
    five_min_log_return = log_return(df.VWAP, periods=5)
    abs_one_min_log_return = log_return(df.VWAP,periods=1).abs()    
    features = df[featues_col]

    # Concat all the features into one dataframe
    X = pd.concat([features, up_shadow, low_shadow, 
                   five_min_log_return, abs_one_min_log_return], 
                  axis=1)
    
    # Rename feature columns
    X.columns = featues_col+["up_shadow", "low_shadow", "five_min_log_return", "abs_one_min_log_return"]
    
    # Fill NaN and Inf
    X = fill_nan_inf(X)
    
    if label:
        y = df.Target
        # Fill NaN and Inf
        y = fill_nan_inf(y)
        
        return X, y
    
    return X
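To get a feel for what these features measure, here is a small sketch that applies the same formulas to three hypothetical candles. The `toy` frame and its values are invented for illustration and are not taken from the competition data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy candles, not taken from the competition data
toy = pd.DataFrame({
    "Open":  [10.0, 11.0, 12.0],
    "High":  [12.0, 13.0, 12.5],
    "Low":   [9.0, 10.5, 11.0],
    "Close": [11.0, 12.0, 11.5],
    "VWAP":  [10.5, 11.5, 12.0],
})

# Upper shadow: distance from the high down to the top of the candle body
toy_upper = toy["High"] - np.maximum(toy["Close"], toy["Open"])
# Lower shadow: distance from the bottom of the candle body down to the low
toy_lower = np.minimum(toy["Close"], toy["Open"]) - toy["Low"]
# One-step log return of VWAP; the first row has no predecessor and is NaN
toy_log_ret = np.log(toy["VWAP"]).diff(periods=1)
# Same clean-up as fill_nan_inf: NaN and +/-inf both become 0
toy_log_ret_filled = toy_log_ret.fillna(0).replace([np.inf, -np.inf], 0)

print(toy_upper.tolist())
print(toy_lower.tolist())
print(toy_log_ret_filled.round(4).tolist())
```

Note that filling the leading NaN with 0 treats the first row as "no change", which is a simplification.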

We will take just one crypto asset and find the best combination of hyperparameters to forecast the target for the validation set. 

There are two reasons to do so:

* The `MultiOutputRegressor` wrapper is limited when used with `XGBRegressor`: we can't perform multi-output prediction and evaluation with it. See the GitHub issue [here](https://github.com/scikit-learn/scikit-learn/issues/15953). 

* We will use Bitcoin, which moves the target the most because it has the highest weight (6.779922). 

In [None]:
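# Get single crypto trading data, indexed by timestamp so that the
# 60-second grid below lines up with one row per minute
btc_train = train_df[train_df.Asset_ID==1].set_index('timestamp')
btc_valid = valid_df[valid_df.Asset_ID==1].set_index('timestamp')

# Fill missing minutes by forward-filling from the previous row
btc_train = btc_train.reindex(range(btc_train.index[0],btc_train.index[-1]+60,60),method='pad')
btc_valid = btc_valid.reindex(range(btc_valid.index[0],btc_valid.index[-1]+60,60),method='pad')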

# Create features
X_train, y_train = create_features(btc_train, label=True)
X_valid, y_valid = create_features(btc_valid, label=True)
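Padding with `reindex(..., method='pad')` fills a gap only when the index values are spaced 60 apart, as Unix timestamps in seconds are for per-minute data. Here is a minimal sketch on a hypothetical frame indexed by timestamp, with one minute deliberately missing:

```python
import pandas as pd

# Hypothetical per-minute series indexed by Unix timestamp (seconds);
# the minute at 120 is missing on purpose
toy = pd.DataFrame(
    {"VWAP": [100.0, 101.0, 103.0]},
    index=[0, 60, 180],
)

# Build the complete 60-second grid and forward-fill ("pad") the gaps
full_index = range(toy.index[0], toy.index[-1] + 60, 60)
toy_filled = toy.reindex(full_index, method="pad")

print(toy_filled)
```

The missing minute is filled with the value from the previous minute, so the series has one row per minute afterwards.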

# Hyperparameter Tuning

In [None]:
def train():
    with wandb.init() as run:
        bst_params = {
            'objective': 'reg:squarederror', 
            'n_estimators': 60,
            'booster': run.config.booster,
            'learning_rate': run.config.learning_rate,     
            'gamma': run.config.gamma,
            'max_depth': run.config.max_depth,
            'min_child_weight': run.config.min_child_weight,  
            'eval_metric': ['rmse'],
            'tree_method': 'gpu_hist',
        }

        # Initialize the XGBRegressor
        xgbmodel = xgb.XGBRegressor(**bst_params)

        # Train the model, using WandbCallback for logging
        xgbmodel.fit(X_train, y_train, 
                     eval_set=[(X_valid, y_valid)],
                     early_stopping_rounds=run.config.early_stopping_rounds,
                     callbacks=[
                         WandbCallback(log_model=True,
                                       log_feature_importance=False,
                                       define_metric=True)
                     ],
                     verbose=False)
        
        preds = xgbmodel.predict(X_valid)
        rmse = np.sqrt(mean_squared_error(y_valid, preds))
        print("RMSE: %f" % (rmse))
        wandb.log({"Valid_RMSE": rmse})
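The `Valid_RMSE` logged above is the square root of the mean squared error, so `np.sqrt(mean_squared_error(y_valid, preds))` can be reproduced with plain NumPy. A small sketch on hypothetical numbers (the arrays below are invented for illustration):

```python
import numpy as np

# Hypothetical targets and predictions
y_true = np.array([0.0, 1.0, 2.0, 3.0])
y_pred = np.array([0.0, 2.0, 2.0, 1.0])

# RMSE: square root of the mean of the squared errors
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print("RMSE: %f" % rmse)
```

Since the errors are squared before averaging, a single large miss weighs more than several small ones.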

In [None]:
sweep_config = {
  "name" : "btc_hyperparam_search",
  "method" : "random",
  "parameters" : {
    "booster": {
        "values": ["gbtree", "gblinear"]
    },
    "learning_rate": {
      "min": 0.001,
      "max": 1.0
    },
    "gamma": {
      "min": 0.001,
      "max": 1.0
    },
    "max_depth": {
        "values": [3, 5, 7]
    },
    "min_child_weight": {
      "min": 1,
      "max": 150
    },
    "early_stopping_rounds": {
      "values" : [10, 20, 40]
    },
  }
}

sweep_id = wandb.sweep(sweep_config, project='btc_hyperparam_search')
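With `"method": "random"`, each run gets one independently sampled value per parameter. The sketch below mimics that idea in plain Python on a trimmed-down search space. It is an illustration only and not W&B's actual implementation; `toy_parameters` and `sample_config` are hypothetical names, and the rule that integer bounds yield integers while float bounds yield floats reflects my understanding of how W&B infers the distribution.

```python
import random

# Hypothetical, trimmed-down search space in the same shape as sweep_config
toy_parameters = {
    "booster": {"values": ["gbtree", "gblinear"]},
    "learning_rate": {"min": 0.001, "max": 1.0},
    "max_depth": {"values": [3, 5, 7]},
    "min_child_weight": {"min": 1, "max": 150},
}

def sample_config(parameters, rng):
    """Draw one configuration; an illustration, not W&B's implementation."""
    config = {}
    for name, spec in parameters.items():
        if "values" in spec:
            # Categorical parameter: pick one of the listed values
            config[name] = rng.choice(spec["values"])
        elif isinstance(spec["min"], int) and isinstance(spec["max"], int):
            # Integer bounds: draw an integer, both ends included
            config[name] = rng.randint(spec["min"], spec["max"])
        else:
            # Float bounds: draw uniformly between min and max
            config[name] = rng.uniform(spec["min"], spec["max"])
    return config

rng = random.Random(0)
samples = [sample_config(toy_parameters, rng) for _ in range(3)]
for s in samples:
    print(s)
```

Each printed dictionary plays the role of `run.config` inside `train()`.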

In [None]:
wandb.agent(sweep_id, project='btc_hyperparam_search', function=train)

With just 2-3 lines of extra code, you can monitor much more and see which hyperparameters matter most for your XGBoost model. 

## [Check out the Sweeps page here $\rightarrow$](https://wandb.ai/ayut/btc_hyperparam_search/sweeps/c1pgsztw?workspace=user-ayut)

<!-- ![img](https://media.giphy.com/media/2ji2l1fsec75l6yoFD/giphy.gif) -->
![sweepdemo_4.gif](https://s10.gifyu.com/images/sweepdemo_4.gif)