# Decision Tree

* A decision tree is a tree in which each internal node represents a test on a feature (attribute), each link (branch) represents a decision (rule), and each leaf represents an outcome: a category (classification) or a continuous value (regression).
* The idea is to build such a tree from the entire dataset so that every leaf produces a single outcome.
* Decision trees are used for both classification and regression problems.
* Decision tree algorithms are supervised learning algorithms.
* There are several decision tree algorithms, for example:
  * CART (Classification and Regression Trees): uses the Gini index as its splitting metric for classification and a variance-based criterion (squared error) for regression.
  * ID3 (Iterative Dichotomiser 3): uses entropy and information gain as metrics.
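
As a rough illustration of the metrics named above, the following sketch computes Gini impurity, entropy, and information gain for a tiny set of labels. The helper names (`gini`, `entropy`, `information_gain`) and the toy labels are made up for this example; they are not taken from any particular library.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy labels: a 50/50 parent node split into two pure children
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]

print(gini(parent))                        # 0.5
print(entropy(parent))                     # 1.0
print(information_gain(parent, children))  # 1.0
```

A split that separates the classes perfectly, as here, removes all of the parent's entropy, so the information gain equals the parent's entropy.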

**For a clear explanation, see: https://medium.com/deep-math-machine-learning-ai/chapter-4-decision-trees-algorithms-b93975f7a1f1**


## Decision Tree Regression

Predict a continuous value.

    Decision tree regression is not the best model for a single-feature dataset like the one used here. It is better suited to datasets with many features, i.e. high-dimensional datasets. However, the code implemented here can be reused with other datasets that have more features.
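
To give a feel for the multi-feature case, here is a minimal sketch on synthetic data. The data, the three features, and the names `X_train`, `X_test`, and `model` are assumptions made up for illustration; they are not part of this case study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, invented for illustration: 200 rows, 3 features
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(0, 0.1, size=200)

# With more data it is usual to hold out a test set instead of training on everything
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = DecisionTreeRegressor(random_state=0)
model.fit(X_train, y_train)

# A single observation is still a 2D array: one row, one column per feature
single_prediction = model.predict([[6.5, 3.0, 5.0]])
print(single_prediction.shape)

# R^2 score on the held-out rows
print(model.score(X_test, y_test))
```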

In [2]:
#no feature scaling needed in decision tree regression
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv("Position_Salaries.csv")
print(dataset)
print('\n')
x = dataset.iloc[:, 1:-1].values   ##for a higher-dimensional dataset: adjust these column slices, and if any values are missing,
y = dataset.iloc[:, -1].values   ##apply the "Taking care of missing data" step from our Preprocessing section
print(x)
print('\n')
print(y)



##for a higher-dimensional dataset: if there is categorical data with no order relationship, apply the OneHotEncoder
##part of "Encoding categorical data" from our Preprocessing section; if there is an order relationship (e.g. clothing
##sizes, or position levels collected as strings), apply the LabelEncoder part of "Encoding categorical data" instead.
##You do not have to apply feature scaling, because a decision tree splits each feature into ranges using thresholds,
##and those splits do not depend on the scale of the feature

            Position  Level   Salary
0   Business Analyst      1    45000
1  Junior Consultant      2    50000
2  Senior Consultant      3    60000
3            Manager      4    80000
4    Country Manager      5   110000
5     Region Manager      6   150000
6            Partner      7   200000
7     Senior Partner      8   300000
8            C-level      9   500000
9                CEO     10  1000000


[[ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]]


[  45000   50000   60000   80000  110000  150000  200000  300000  500000
 1000000]


In [None]:
#Training the decision tree regression model on whole dataset

#DecisionTreeRegressor will predict a continuous value
#DecisionTreeClassifier will predict a category
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state = 0)   #fix the random seed to 0 so the results are reproducible
regressor.fit(x, y)   #train the DecisionTreeRegressor to learn the relationship between the Level and Salary columns



##for a higher-dimensional dataset: change 'x' to 'x_train' and 'y' to 'y_train'

In [None]:
#Predicting a new result

#we want the predicted salary for position level 6.5
regressor.predict([[6.5]])   #predict expects a 2D array (rows = observations, columns = features), so the single level 6.5 goes inside [[ ]]



##for a higher-dimensional dataset: pass the other feature values alongside the position level, e.g. [[6.5, 30, 5]], where 30 could be age and 5 could be number of viewers
##this is how you predict a single observation when there are several features

    For this case study the predicted salary is 150000.
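
The sketch below tries to show where that number comes from. It rebuilds the Level and Salary columns from the table printed earlier (rather than reading the CSV) and queries a few levels around 6.5. The variable names `levels`, `salaries`, and `tree` are chosen for this example only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Level and Salary columns copied from the dataset printed earlier
levels = np.arange(1, 11).reshape(-1, 1)
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000])

tree = DecisionTreeRegressor(random_state=0).fit(levels, salaries)

# Each query falls into one leaf and receives that leaf's training salary
for level in [6.0, 6.5, 6.6, 7.0]:
    print(level, tree.predict([[level]])[0])
```

Level 6.5 lands in the same leaf as level 6, so it gets level 6's salary of 150000, while 6.6 already falls into the leaf for level 7.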

In [None]:
#Visualising the Decision Tree Regression results(higher resolution)
x_grid = np.arange(x.min(), x.max(), 0.1)   #x.min()/x.max() return scalars; min(x) on a 2D array returns an array
x_grid = x_grid.reshape((len(x_grid), 1))
plt.scatter(x, y, color='red')   #if feature scaling had been applied, you would have to plot the inverse-transformed x and y
plt.plot(x_grid, regressor.predict(x_grid), color='blue')
plt.title("Truth or Bluff (Decision Tree Regression)")
plt.xlabel("Position Level")
plt.ylabel("Salary")
plt.show()



##for a higher-dimensional dataset: several features cannot be plotted this way, so you would have to remove this visualization part

    All the predicted salaries within a particular range are the same, so the curve looks like a staircase throughout. The prediction is not continuous: it stays constant over an interval around each Position Level and then jumps to the next level's salary.
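
One way to see where the steps of the staircase sit is to read the split thresholds out of the fitted tree. This is a sketch that assumes the Level and Salary values from the table printed earlier; `tree_.feature` and `tree_.threshold` are attributes of scikit-learn's fitted tree object, and the other names are chosen for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Level and Salary columns copied from the dataset printed earlier
levels = np.arange(1, 11).reshape(-1, 1)
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000])

tree = DecisionTreeRegressor(random_state=0).fit(levels, salaries)

# Internal (split) nodes have a feature index >= 0; leaves use a negative marker
is_split = tree.tree_.feature >= 0
thresholds = np.sort(tree.tree_.threshold[is_split])

# Each threshold is a level at which the predicted salary jumps
print(thresholds)
```

The thresholds fall halfway between consecutive levels, which is why each step of the curve is centred on a Position Level.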

In [None]:
#curve in low resolution

plt.scatter(x, y, color='red')   #if feature scaling had been applied, you would have to plot the inverse-transformed x and y
plt.plot(x, regressor.predict(x), color='blue')
plt.title("Truth or Bluff (Decision Tree Regression)")
plt.xlabel("Position Level")
plt.ylabel("Salary")
plt.show()

#here the predicted salary for each Position Level is exactly the actual salary, because the model was trained on the whole dataset and the fully grown tree ends up with one leaf per observation
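
The following sketch checks that claim on the Level and Salary values from the table printed earlier, and contrasts it with a deliberately shallow tree. The `max_depth=2` setting is an arbitrary choice for illustration and is not something this case study uses.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Level and Salary columns copied from the dataset printed earlier
levels = np.arange(1, 11).reshape(-1, 1)
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000])

# Default settings: the tree grows until every training point has its own leaf
full_tree = DecisionTreeRegressor(random_state=0).fit(levels, salaries)
print(full_tree.get_n_leaves())
print(np.array_equal(full_tree.predict(levels), salaries))  # True

# Illustrative assumption: limiting the depth forces several levels to share a leaf
shallow_tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(levels, salaries)
print(np.unique(shallow_tree.predict(levels)))
```

With the depth capped at 2 the tree can have at most four leaves, so it can no longer reproduce all ten training salaries exactly.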