## More Gradient Boosting Methods

### Install CatBoost and LightGBM

In [1]:
!pip install catboost

Collecting catboost
  Obtaining dependency information for catboost from https://files.pythonhosted.org/packages/35/7e/35fa1a7cf6925ff438e849cca50c88b8d28e02d9c3486442f2f85b86184a/catboost-1.2.5-cp311-cp311-win_amd64.whl.metadata
  Downloading catboost-1.2.5-cp311-cp311-win_amd64.whl.metadata (1.2 kB)
Downloading catboost-1.2.5-cp311-cp311-win_amd64.whl (101.1 MB)

In [3]:
!pip install lightgbm

Collecting lightgbm
  Obtaining dependency information for lightgbm from https://files.pythonhosted.org/packages/ca/b4/57f3f253721e0a16ea28c49acca92c5b1198eb94fbbb8328d6dabc61d2e0/lightgbm-4.4.0-py3-none-win_amd64.whl.metadata
  Downloading lightgbm-4.4.0-py3-none-win_amd64.whl.metadata (19 kB)
Downloading lightgbm-4.4.0-py3-none-win_amd64.whl (1.4 MB)

In [4]:
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import accuracy_score
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

### Train-Test Split

In [5]:
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
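
The split above is not stratified, so the class proportions in the training and test sets can drift away from the dataset's balanced 50/50/50 layout. A minimal sketch of a stratified variant follows, assuming stratification is wanted; `stratify=y` and the `_s`-suffixed names are additions for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Assumption: stratify=y is an addition, not present in the original split
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count how many samples of each class landed in each subset
train_counts = np.bincount(y_train_s)
test_counts = np.bincount(y_test_s)
print("Train class counts:", train_counts)
print("Test class counts:", test_counts)
```

Switching to a stratified split changes which rows end up in each subset, so the cross-validation numbers reported later would not be reproduced exactly.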

### Define Models

In [9]:
# Define models
catboost_model = CatBoostClassifier(random_state=42, verbose=False)
lgbm_model = LGBMClassifier(random_state=42)
xgb_model = XGBClassifier(random_state=42)

models = {
    'CatBoost': catboost_model,
    'LightGBM': lgbm_model,
    'XGBoost': xgb_model
}

### Evaluate Models with Cross-Validation

In [10]:
# Perform cross-validation and evaluate models
results = {}
for model_name, model in models.items():
    cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring='accuracy')
    results[model_name] = cv_scores

[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000184 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 81
[LightGBM] [Info] Number of data points in the train set: 96, number of used features: 4
[LightGBM] [Info] Start training from score -1.098612
[LightGBM] [Info] Start training from score -1.098612
[LightGBM] [Info] Start training from score -1.098612
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000062 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 83
[LightGBM] [Info] Number of data points in the train set: 96, number of used features: 4
[LightGBM] [Info] Start training from score -1.098612
[LightGBM] [Info] Start training from score -1.067841
[LightGBM] [Info] Start training from score -1.130361
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000095 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 79
[LightGBM] [Info] Number of data points in the train set: 96, number of used features: 4
[LightGBM] [Info] Start training from score -1.098612
[LightGBM] [Info] Start training from score -1.067841
[LightGBM] [Info] Start training from score -1.130361




In [11]:
# Display cross-validation results
print("Cross-validation results (Accuracy):")
for model_name, scores in results.items():
    print(f"{model_name}: Mean accuracy = {np.mean(scores):.4f}, Std = {np.std(scores):.4f}")


Cross-validation results (Accuracy):
CatBoost: Mean accuracy = 0.9417, Std = 0.0333
LightGBM: Mean accuracy = 0.9417, Std = 0.0333
XGBoost: Mean accuracy = 0.9333, Std = 0.0500
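
The notebook imports `accuracy_score` and holds out `X_test` and `y_test`, but stops at cross-validation. A minimal sketch of the held-out evaluation step follows. `evaluate_on_test` is a hypothetical helper, and scikit-learn's `GradientBoostingClassifier` stands in for the three boosted models so the block runs with scikit-learn alone; no test-set results for CatBoost, LightGBM, or XGBoost are implied.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate_on_test(models, X_train, y_train, X_test, y_test):
    """Fit each model on the training set and return its test-set accuracy."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # ravel() flattens column-shaped predictions into a 1-D array
        y_pred = np.ravel(model.predict(X_test))
        scores[name] = accuracy_score(y_test, y_pred)
    return scores


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Stand-in model (assumption); in the notebook, pass the `models` dict instead
demo_models = {
    "GradientBoosting (stand-in)": GradientBoostingClassifier(random_state=42)
}
test_scores = evaluate_on_test(demo_models, X_train, y_train, X_test, y_test)
for name, acc in test_scores.items():
    print(f"{name}: Test accuracy = {acc:.4f}")
```

In the notebook itself, the call would be `evaluate_on_test(models, X_train, y_train, X_test, y_test)`. With only 30 test samples, each misclassification moves accuracy by about 3.3 percentage points, so small differences between models should be read with caution.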
