<a href="https://colab.research.google.com/github/yoseforaz0990/ML-templates/blob/main/regression/decision_tree_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

| Step                                                      | Description                                                                                                          |
|-----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------|
| **Training the Decision Tree Regression model on the whole dataset** | Build a Decision Tree Regression model using the entire dataset.                                                     |
|                                                           | The `DecisionTreeRegressor` class from scikit-learn is used to create the model.                                      |
|                                                           | The `random_state` parameter is set to `0` for reproducibility in model training.                                     |
|                                                           | Train the Decision Tree Regression model on the dataset to learn the relationships between the independent variable (Position level `X`) and the dependent variable (Salary `y`).   |
| **Predicting a new result**                               | Predict the salary for a new position level (e.g., `6.5`) using the trained Decision Tree Regression model.       |
|                                                           | The `predict()` method is used; it expects a 2D array, so the new position level is passed as `[[6.5]]` to get the predicted salary. |
| **Visualising the Decision Tree Regression results**      | Create a fine grid of position levels (`X_grid`) in steps of `0.01`, from the minimum up to (but not including) the maximum of the original position levels `X`. |
|                                                           | The `DecisionTreeRegressor` model is then used to predict salaries for the position levels in `X_grid`.                |
|                                                           | Create a scatter plot of the actual salary (`y`) against the position level (`X`) as red points.        |
|                                                           | Plot the Decision Tree Regression predictions (salary) over `X_grid` as a blue line to show how the model fits the data. |
|                                                           | The blue line is a step function rather than a smooth curve: a decision tree predicts one constant value per leaf, so the prediction is flat within each interval of position levels and jumps at the split points. |
|                                                           | This visualization shows how the model fits the training data. Because the model was trained on the whole dataset, the plot does not measure performance on unseen data. |
| **Difference between Decision Tree Regression, SVR, SVC, and Polynomial Regression** |                                                                                                                  |
|                                                           | **Decision Tree Regression:** It builds a tree-like model that recursively splits the data on the independent variables so that each resulting group is as homogeneous as possible in the dependent variable. Each leaf predicts a single value (with the default criterion, the mean of the training targets in that leaf), so predictions are piecewise constant. Suitable for both linear and non-linear relationships between variables. |
|                                                           |                                                                                                                      |
|                                                           | **Support Vector Regression (SVR):** It is a regression technique based on the Support Vector Machine (SVM) algorithm. With a non-linear kernel (e.g., RBF), the kernel trick implicitly maps the data into a higher-dimensional space, which allows non-linear modeling. SVR is suitable for cases where the relationship between variables is complex and cannot be captured well by linear models. |
|                                                           |                                                                                                                      |
|                                                           | **Support Vector Classification (SVC):** It is a classification technique based on the SVM algorithm, so it predicts class labels rather than continuous values. SVC finds the hyperplane that separates the classes with the maximum margin. It handles binary and multi-class classification problems and, with a non-linear kernel, classes that are not separable by a linear boundary. |
|                                                           |                                                                                                                      |
|                                                           | **Polynomial Regression:** It involves fitting a linear model to the transformed feature matrix `X_poly`, which includes polynomial combinations of the original features. This allows the model to capture nonlinear relationships between the dependent variable and the independent variables. Suitable for cases where data exhibits a curvilinear pattern.   |

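The comparison in the table can be tried out with a minimal sketch such as the one below. It is not part of the original notebook: `X_demo`, `y_demo`, the label threshold, and the polynomial degree are all made-up assumptions chosen only for illustration. SVR is wrapped in scaling helpers here because it is sensitive to the scale of both the feature and the target.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVC, SVR
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data, invented for illustration (not the notebook's dataset).
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)  # position levels 1..10
y_demo = np.array([45, 50, 60, 80, 110, 150, 200, 300, 500, 1000]) * 1000.0

regressors = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    # Standardize the feature and the target before fitting SVR.
    "svr_rbf": TransformedTargetRegressor(
        regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        transformer=StandardScaler(),
    ),
    # Polynomial regression: a linear model fitted on polynomial features.
    "polynomial_deg4": make_pipeline(PolynomialFeatures(degree=4), LinearRegression()),
}

new_level = [[6.5]]  # predict() expects a 2D array
predictions = {
    name: float(model.fit(X_demo, y_demo).predict(new_level)[0])
    for name, model in regressors.items()
}
for name, value in predictions.items():
    print(f"{name}: {value:,.0f}")

# SVC is a classifier, so it needs class labels instead of continuous salaries.
labels = (y_demo > 100_000).astype(int)  # hypothetical label: 1 = salary above 100k
predicted_class = int(SVC(kernel="rbf").fit(X_demo, labels).predict(new_level)[0])
print("svc class:", predicted_class)
```

The three regressors return a salary estimate for the same input, while SVC returns a class label. The decision tree's estimate always equals one of the training salaries, whereas the other two regressors can return values in between.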

In [None]:
# Training the Decision Tree Regression model on the whole dataset
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)

# Predicting a new result
new_position_level = 6.5
predicted_salary = regressor.predict([[new_position_level]])  # predict() expects a 2D array
print(predicted_salary)

# Visualising the Decision Tree Regression results
import numpy as np
X_grid = np.arange(np.min(X), np.max(X), 0.01)  # np.min/np.max return scalars even though X is 2D
X_grid = X_grid.reshape(-1, 1)  # one column, as predict() expects

import matplotlib.pyplot as plt
plt.scatter(X, y, color='red')
plt.plot(X_grid, regressor.predict(X_grid), color='blue')
plt.title('Truth or Bluff (Decision Tree Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
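
A fitted decision tree predicts one constant value per leaf, so the line plotted over `X_grid` consists of flat segments. The sketch below checks this numerically instead of visually. It is an illustration only: `X_demo` and `y_demo` are made-up values, not the notebook's dataset, and `tree` is a separate model fitted just for this check.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data, invented for illustration (not the notebook's dataset).
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)  # position levels 1..10
y_demo = np.array([45, 50, 60, 80, 110, 150, 200, 300, 500, 1000]) * 1000.0

tree = DecisionTreeRegressor(random_state=0).fit(X_demo, y_demo)

# Same kind of fine grid as in the plot: steps of 0.01 between min and max.
grid = np.arange(X_demo.min(), X_demo.max(), 0.01).reshape(-1, 1)
grid_pred = tree.predict(grid)

# The distinct predicted values are the plateaus of the step function.
plateaus = np.unique(grid_pred)
print("grid points:", len(grid))
print("leaves in the tree:", tree.get_n_leaves())
print("distinct predicted values:", len(plateaus))
print("all predictions are training targets:", bool(np.isin(plateaus, y_demo).all()))
```

Although the grid contains hundreds of points, the number of distinct predictions cannot exceed the number of leaves, which is why the plot looks like a staircase.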


