# Decision Tree Regressor

### Importing Libraries

- **datasets**: Import scikit-learn's datasets module, which provides loaders for the Iris and California housing datasets.
- **StandardScaler**: Import a class for standardization of dataset features.
- **train_test_split**: Import a function to split the dataset into training and testing sets.
- **DecisionTreeRegressor**: Import the decision tree regressor from scikit-learn.
- **mean_absolute_error** and **mean_squared_error**: Import evaluation metrics for regression tasks.

In [1]:
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

### Loading Datasets

- **datasets.load_iris()**: Load the Iris dataset. Iris is a classification dataset (its targets are three class labels), so it is not used to train the regressor below. **x_c** contains the feature data, and **y_c** contains the class labels.
- **datasets.fetch_california_housing()**: Load the California housing dataset. **x_r** contains the feature data, and **y_r** contains the target values.

In [2]:
x_c, y_c = datasets.load_iris(return_X_y=True)
x_r, y_r = datasets.fetch_california_housing(return_X_y=True)
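As a quick check on what the loaders return, the sketch below inspects the Iris arrays. It uses only `load_iris`, which ships with scikit-learn; `fetch_california_housing` is left out because it downloads the data on first use. The names `n_samples`, `n_features`, and `classes` are introduced here for illustration.

```python
import numpy as np
from sklearn import datasets

# return_X_y=True returns (features, target) as NumPy arrays
x_c, y_c = datasets.load_iris(return_X_y=True)

n_samples, n_features = x_c.shape
classes = np.unique(y_c)

print(n_samples, n_features)  # number of rows and feature columns
print(classes)                # integer class labels, not continuous targets
```

The targets are discrete labels, which is why Iris suits a classifier rather than a regressor.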

### Splitting the Datasets

- Split the Iris dataset into training and testing sets with an 80-20 split, stratified by class label (**stratify=y_c**) so that both sets keep the same class proportions. **x_c_train** and **y_c_train** represent the training data, and **x_c_test** and **y_c_test** represent the testing data.

In [3]:
x_c_train, x_c_test, y_c_train, y_c_test = train_test_split(x_c, y_c, test_size=0.2, random_state=50, stratify=y_c)
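To illustrate what stratification does, the following sketch repeats the split and counts the samples per class in each part. The names `train_counts` and `test_counts` are illustrative and do not appear in the original notebook.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

x_c, y_c = datasets.load_iris(return_X_y=True)

# Same arguments as the split above
x_c_train, x_c_test, y_c_train, y_c_test = train_test_split(
    x_c, y_c, test_size=0.2, random_state=50, stratify=y_c
)

# Count how many samples of each class land in each split
train_counts = np.bincount(y_c_train)
test_counts = np.bincount(y_c_test)
print(train_counts, test_counts)
```

Iris has 50 samples per class, so a stratified 80-20 split places 40 of each class in the training set and 10 in the test set.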

### Standardizing the California Housing Dataset

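- Create a **StandardScaler** object (**ss**) to standardize the features in the California housing dataset.
- Fit the scaler to the feature data (**x_r**) to compute each feature's mean and standard deviation.
- Transform (standardize) the feature data, replacing **x_r** with the standardized values, so that each feature has zero mean and unit variance.
- Note that the scaler is fitted on the full dataset before the split, so test-set statistics influence the scaling. Decision trees split on thresholds and are essentially unaffected by this kind of per-feature rescaling, so the step is optional for the model below.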

In [4]:
ss = StandardScaler()
ss.fit(x_r)

x_r = ss.transform(x_r)
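For models that are sensitive to feature scale, a common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that pattern on a small synthetic feature matrix. The synthetic data and the names `x`, `y`, `x_train_std`, `x_test_std`, and `manual` are assumptions made for illustration, not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (assumption: not the housing data)
rng = np.random.default_rng(0)
x = rng.normal(loc=[5.0, 100.0, -3.0], scale=[2.0, 30.0, 0.5], size=(200, 3))
y = rng.normal(size=200)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=50
)

ss = StandardScaler()
x_train_std = ss.fit_transform(x_train)  # learn mean/std from training rows only
x_test_std = ss.transform(x_test)        # reuse the training statistics

# StandardScaler computes (x - mean) / std using the population std (ddof=0)
manual = (x_train - x_train.mean(axis=0)) / x_train.std(axis=0)
print(np.allclose(x_train_std, manual))
```

The test rows are transformed with the training mean and standard deviation, so no information from the test set reaches the fitted scaler.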

### Splitting the Standardized California Housing Dataset

- Split the standardized California housing dataset into training and testing sets with an 80-20 split. **x_r_train** and **y_r_train** represent the training data, and **x_r_test** and **y_r_test** represent the testing data.

In [5]:
x_r_train, x_r_test, y_r_train, y_r_test = train_test_split(x_r, y_r, test_size=0.2, random_state=50)

### Decision Tree Regression

- Create a decision tree regressor with the "squared_error" criterion (the default in recent scikit-learn versions) and fit it to the California housing training data (**x_r_train**, **y_r_train**).
- The **reg_tree** object now represents the trained decision tree regression model.

### Model Evaluation

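- Calculate and print the mean absolute error and mean squared error of the regression model on the test data. The tree is created without a **random_state**, so the exact values can differ slightly between runs.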

In [6]:
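# Fit a decision tree regressor on the California housing training split
reg_tree = DecisionTreeRegressor(criterion='squared_error')
reg_tree.fit(x_r_train, y_r_train)

# Predict once and reuse the predictions for both metrics
y_r_pred = reg_tree.predict(x_r_test)
print(mean_absolute_error(y_r_test, y_r_pred))
print(mean_squared_error(y_r_test, y_r_pred))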

0.47019379118217053
0.5451669863855862
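To make the two metrics concrete, here is a minimal sketch that computes them by hand with NumPy on a few made-up values and compares the results with scikit-learn's functions. The arrays `y_true` and `y_pred` are illustrative numbers, not outputs of the model above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up targets and predictions (assumption: for illustration only)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mae = np.mean(np.abs(y_true - y_pred))  # average absolute error
mse = np.mean((y_true - y_pred) ** 2)   # average squared error
rmse = np.sqrt(mse)                     # same units as the target

print(mae, mean_absolute_error(y_true, y_pred))
print(mse, mean_squared_error(y_true, y_pred))
print(rmse)
```

The mean squared error penalizes large errors more heavily than the mean absolute error, and its square root is often reported because it is expressed in the units of the target.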
