CART regression builds a binary tree in which each internal node tests a single feature against a threshold and each leaf node stores a predicted value (under the squared-error criterion, the mean of the training targets that reach that leaf). It recursively splits the data, choosing at each node the feature and threshold that minimize the sum of squared errors or another regression criterion.
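To make the splitting criterion concrete, here is a minimal sketch of how a single split could be chosen on one feature by minimizing the sum of squared errors. The helper names `sse` and `best_split` and the toy arrays are illustrative assumptions, not part of scikit-learn or of the dataset used below; the real `DecisionTreeRegressor` searches over all features and is far more optimized.

```python
import numpy as np

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty array)."""
    if len(values) == 0:
        return 0.0
    return float(np.sum((values - values.mean()) ** 2))

def best_split(x, y):
    """Return (threshold, total_sse) for the best single split on feature x.

    Hypothetical helper for illustration only.
    """
    best_threshold, best_sse = None, float("inf")
    unique_x = np.unique(x)  # sorted distinct feature values
    # Candidate thresholds: midpoints between consecutive distinct values
    for lo, hi in zip(unique_x[:-1], unique_x[1:]):
        threshold = (lo + hi) / 2.0
        left, right = y[x <= threshold], y[x > threshold]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_threshold, best_sse = threshold, total
    return best_threshold, best_sse

# Toy data (assumed for illustration): two clearly separated groups
x_toy = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y_toy = np.array([5.0, 6.0, 5.0, 20.0, 21.0, 19.0])

threshold, total_sse = best_split(x_toy, y_toy)
print("Best threshold:", threshold)
print("SSE after split:", total_sse)
```

A full tree would apply the same search recursively to the left and right subsets until a stopping rule is reached, then predict the mean target of each leaf.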


In CART regression, the goal is to predict a continuous target variable from the values of one or more predictor variables. The correlation matrix shows the correlation coefficients between pairs of variables.

In [36]:
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score


In [37]:
data = pd.read_excel('data.xlsx')

In [38]:
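# Predict Quantity from Month, Item Price, and Item Total
X = data[['Month', 'Item Price', 'Item Total']]
y = data['Quantity']

# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a regression tree with default hyperparameters (no depth limit)
model = DecisionTreeRegressor()
model.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = model.predict(X_test)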

# Evaluate mean squared error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

# Evaluate mean absolute error
mae = mean_absolute_error(y_test, y_pred)
print("Mean Absolute Error:", mae)

# Evaluate R-squared
r2 = r2_score(y_test, y_pred)
print("R-squared:", r2)


Mean Squared Error: 1.0898583967027262
Mean Absolute Error: 0.017730398767328553
R-squared: 0.991384598760959
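
As a reference for reading the three metrics printed above, the following sketch computes each one by hand with NumPy and checks the result against scikit-learn. The arrays `y_true` and `y_hat` are made-up toy values, not taken from `data.xlsx`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Toy targets and predictions (assumed for illustration only)
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_hat = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_true - y_hat

# MSE: mean of the squared errors
mse_manual = np.mean(errors ** 2)

# MAE: mean of the absolute errors
mae_manual = np.mean(np.abs(errors))

# R-squared: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# The manual values should agree with scikit-learn's implementations
print("MSE:", mse_manual, mean_squared_error(y_true, y_hat))
print("MAE:", mae_manual, mean_absolute_error(y_true, y_hat))
print("R-squared:", r2_manual, r2_score(y_true, y_hat))
```

MSE penalizes large errors more heavily than MAE because the errors are squared, so a small MAE alongside a much larger MSE suggests that a few predictions are far off while most are close.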


In [39]:
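# Predict Month from Item Price and Item Total
X = data[['Item Price', 'Item Total']]
y = data['Month']

# Same 80/20 split and seed as in the previous cell
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a regression tree with default hyperparameters (no depth limit)
model = DecisionTreeRegressor()
model.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = model.predict(X_test)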

# Evaluate mean squared error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)

# Evaluate mean absolute error
mae = mean_absolute_error(y_test, y_pred)
print("Mean Absolute Error:", mae)

# Evaluate R-squared
r2 = r2_score(y_test, y_pred)
print("R-squared:", r2)



Mean Squared Error: 1.696800309588029
Mean Absolute Error: 1.0391838543667806
R-squared: 0.2226341309189328
