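# Gradient Boosting Algorithm
Gradient boosting trains base learners iteratively: in each round, the new learner is fit to the residuals of the model built so far. In regression with squared-error loss, the residual is the difference between the true value and the current prediction; in classification, the negative gradient of the log loss is typically used as a pseudo-residual in its place.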
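
The residual-fitting loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how scikit-learn implements it: the helper names `fit_gradient_boosting`, `predict_gradient_boosting`, and `log_loss_negative_gradient` are hypothetical, the loss is assumed to be squared error, and shallow `DecisionTreeRegressor` trees are assumed as base learners.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_estimators=100, learning_rate=0.1, max_depth=3):
    """Fit a squared-error boosting ensemble (hypothetical helper, illustrative only)."""
    init = float(np.mean(y))            # round 0: a constant prediction
    pred = np.full(len(y), init)
    trees = []
    for _ in range(n_estimators):
        # For the loss 0.5 * (y - F)^2, the negative gradient w.r.t. F is y - F,
        # which is exactly the residual of the current model.
        residual = y - pred
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)
        # Take a shrunken step in the direction the new tree predicts.
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return init, trees


def predict_gradient_boosting(X, init, trees, learning_rate=0.1):
    """Sum the constant start value and the shrunken tree predictions."""
    pred = np.full(X.shape[0], init)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred


def log_loss_negative_gradient(y, raw_score):
    """Pseudo-residual for binary log loss: label minus predicted probability."""
    return y - 1.0 / (1.0 + np.exp(-raw_score))


# Small demo on synthetic data (assumed for illustration).
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0])
init, trees = fit_gradient_boosting(X_demo, y_demo, n_estimators=50)
y_fit = predict_gradient_boosting(X_demo, init, trees)
print(f"Training MSE after 50 rounds: {np.mean((y_demo - y_fit) ** 2):.4f}")
```

The same `learning_rate` must be passed to both helpers, since it scales every tree's contribution. For classification, the loop would stay the same except that each tree is fit to `log_loss_negative_gradient(y, raw_score)` rather than to the plain residual.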

## Classification Task

In [1]:
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate example data
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the gradient boosting classifier, then evaluate on the test set
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=42)
gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)
print(f"Gradient boosting classifier accuracy: {accuracy_score(y_test, y_pred):.2f}")

Gradient boosting classifier accuracy: 0.96


## Regression Task

In [2]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate example data
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the gradient boosting regressor, then evaluate on the test set
gb = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)
print(f"Gradient boosting regressor mean squared error: {mean_squared_error(y_test, y_pred):.2f}")

Gradient boosting regressor mean squared error: 89.23
