# Sales Forecasting with XGBoost

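This notebook demonstrates how to use XGBoost for sales forecasting. It trains a regressor to predict sales from historical data, using the month and year of each record as features, and evaluates it with mean squared error on a held-out test set.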

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the dataset
data = pd.read_csv('../data/sales_data.csv')
data.head()

In [None]:
# Data preprocessing
def preprocess_data(data):
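    # Handle missing values by carrying the last observation forward.
    # fillna(method='ffill') is deprecated since pandas 2.1; ffill() replaces it
    # and returns a new DataFrame, so the caller's object is not modified.
    data = data.ffill()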
    
    # Convert date column to datetime
    data['date'] = pd.to_datetime(data['date'])
    
    # Feature engineering
    data['month'] = data['date'].dt.month
    data['year'] = data['date'].dt.year
    return data

data = preprocess_data(data)
data.head()

In [None]:
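# Split the data into training and testing sets.
# For forecasting, split chronologically (no shuffling) so the model is
# evaluated on dates later than those it was trained on.
data = data.sort_values('date')
X = data[['month', 'year']]
y = data['sales']

# shuffle=False keeps row order: the last 20% of rows form the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)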
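
A single hold-out split gives only one estimate of the error. For time-ordered data, one option is rolling-origin cross-validation, where each fold trains on earlier rows and tests on later ones. A minimal sketch using scikit-learn's `TimeSeriesSplit`; the array `X_demo` is a toy stand-in for rows already sorted by date, not the notebook's data:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy stand-in for feature rows already sorted by date (assumption)
X_demo = np.arange(8).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
folds = list(tscv.split(X_demo))

for i, (train_idx, test_idx) in enumerate(folds):
    # In every fold, all training indices precede all test indices
    print(f'Fold {i}: train={train_idx.tolist()}, test={test_idx.tolist()}')
```

The training window grows with each fold, so later folds resemble the situation of forecasting with more history available.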


In [None]:
# Train the XGBoost model
model = xgb.XGBRegressor(objective='reg:squarederror')
model.fit(X_train, y_train)


In [None]:
# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
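
MSE is expressed in squared sales units, which makes it hard to interpret on its own. One possible follow-up is to report the root mean squared error (RMSE) and compare it with a naive baseline that always predicts the training mean. The sketch below uses made-up numbers; the helper `rmse` and the `*_demo` arrays are illustrative assumptions, not part of the notebook's dataset:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical values standing in for y_train, y_test and y_pred
y_train_demo = np.array([100.0, 120.0, 110.0, 130.0])
y_test_demo = np.array([125.0, 135.0])
y_pred_demo = np.array([120.0, 140.0])

# Naive baseline: predict the training mean for every test row
baseline_pred = np.full_like(y_test_demo, y_train_demo.mean())

model_rmse = rmse(y_test_demo, y_pred_demo)
baseline_rmse = rmse(y_test_demo, baseline_pred)
print(f'Model RMSE: {model_rmse:.2f}')
print(f'Baseline RMSE: {baseline_rmse:.2f}')
```

A model that does not clearly beat such a baseline is probably not extracting much signal from its features.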

## Conclusion

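This notebook showed a baseline XGBoost workflow for sales forecasting: load the data, derive month and year features, train a regressor, and measure mean squared error on a held-out test set. The model can likely be improved by tuning hyperparameters (for example `n_estimators`, `max_depth`, and `learning_rate`) and by adding features beyond month and year.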
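
As one example of the additional features mentioned above, lagged and rolling sales values are commonly used in forecasting. A minimal sketch, assuming a DataFrame with `date` and `sales` columns like the one used here; the helper name `add_lag_features` and the toy data are hypothetical:

```python
import pandas as pd

def add_lag_features(df, lags=(1, 2), window=3):
    """Add lagged and rolling-mean sales columns (hypothetical helper)."""
    # sort_values returns a new DataFrame, so the input is left unchanged
    out = df.sort_values('date').reset_index(drop=True)
    for lag in lags:
        # Sales value observed `lag` rows earlier
        out[f'sales_lag_{lag}'] = out['sales'].shift(lag)
    # Rolling mean over past rows only: shift(1) excludes the current row
    out[f'sales_roll_mean_{window}'] = out['sales'].shift(1).rolling(window).mean()
    return out

# Toy data for illustration only
toy = pd.DataFrame({
    'date': pd.date_range('2023-01-01', periods=6, freq='D'),
    'sales': [10.0, 12.0, 11.0, 15.0, 14.0, 16.0],
})
features = add_lag_features(toy)
print(features)
```

The first rows contain NaN because no earlier sales exist for them. XGBoost handles missing values natively, so those rows can be either kept or dropped before training.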