XGBoost (eXtreme Gradient Boosting) is an open-source machine learning library known for its efficiency and flexibility, and it often achieves state-of-the-art results on structured (tabular) data. It implements gradient boosting, an ensemble method that builds weak learners (typically shallow decision trees) one after another, each correcting the errors of those before it, and sums them into a strong learner. The example below trains a small regression model with XGBoost's native API and evaluates it with RMSE:

In [3]:
import xgboost as xgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Sample data (replace with your data)
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
y = [10, 14, 18, 22]  # Target variable

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the evaluation metric (RMSE) as a helper function
def rmse(y_true, y_pred):
  # Take the square root of the MSE directly; the `squared` argument
  # is no longer available in recent scikit-learn releases
  return mean_squared_error(y_true, y_pred) ** 0.5

# Create DMatrix objects (XGBoost's internal data structure for features and labels)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Set up XGBoost parameters
params = {
  'objective': 'reg:squarederror',  # Regression objective with squared error
  'eval_metric': 'rmse',  # Use RMSE for evaluation during training
  'max_depth': 3,  # Maximum depth of each decision tree
  'learning_rate': 0.1  # Shrinkage applied to each tree's contribution
}

# Train the XGBoost model. The native API takes the number of boosting
# rounds as num_boost_round; n_estimators belongs to the scikit-learn wrapper.
# Pass evals=[(dtest, 'test')] to print the metric during training.
model = xgb.train(params, dtrain, num_boost_round=100)

# Make predictions on the testing set
y_pred = model.predict(dtest)

# Evaluate the model performance using RMSE
rmse_score = rmse(y_test, y_pred)
print("Root Mean Squared Error (RMSE):", rmse_score)


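To make the idea of combining weak learners concrete, here is a minimal sketch of gradient boosting for squared error, written with the standard library only. It is an illustration of the general technique, not XGBoost's implementation: XGBoost additionally uses regularization, second-order gradient information, and far more efficient tree construction. All function names (`fit_stump`, `fit_boosting`, `predict`) are hypothetical helpers introduced for this example, and the weak learner is assumed to be a depth-1 tree (a "stump").

```python
import math


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals by least squares."""
    mean = sum(residuals) / len(residuals)
    # Fallback if no split is possible: predict the mean everywhere
    best = (float("inf"), 0, float("inf"), mean, mean)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_value = sum(left) / len(left)
            right_value = sum(right) / len(right)
            sse = (sum((r - left_value) ** 2 for r in left)
                   + sum((r - right_value) ** 2 for r in right))
            if sse < best[0]:
                best = (sse, j, threshold, left_value, right_value)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_boosting(X, y, n_rounds=100, learning_rate=0.1):
    """Start from the mean of y, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        # Add a shrunken version of the new weak learner to the ensemble
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Same toy data as the XGBoost example above
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
y = [10, 14, 18, 22]

model = fit_boosting(X, y, n_rounds=100, learning_rate=0.1)
rmse_before = rmse(y, [sum(y) / len(y)] * len(y))  # mean-only baseline
rmse_after = rmse(y, [predict(model, row) for row in X])
print("Training RMSE, mean baseline:", rmse_before)
print("Training RMSE, boosted stumps:", rmse_after)
```

Both numbers are training errors on four points, so they only show that each round of boosting reduces the residuals; they say nothing about generalization. The `learning_rate` and `n_rounds` arguments play the same role here as `learning_rate` and the number of boosting rounds in XGBoost: a smaller learning rate takes smaller steps and usually needs more rounds.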
