# Random Forest Classifier

Random Forest is a supervised machine learning algorithm based on ensemble learning, the practice of combining multiple classifiers to solve a complex problem and improve model performance. A Random Forest classifier trains a number of decision trees on different random subsets of the given dataset and combines their predictions to improve predictive accuracy and reduce overfitting. In the classic formulation the trees take a majority vote; scikit-learn's implementation averages the trees' predicted class probabilities instead.

![Image](https://static.javatpoint.com/tutorial/machine-learning/images/random-forest-algorithm.png)
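The two ingredients described above, random subsets and combined predictions, can be sketched in a few lines of plain Python. This is only an illustration: `bootstrap_sample` and `majority_vote` are hypothetical helper names, and the votes are made-up tree outputs, not results from this notebook.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one tree's training subset."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(tree_predictions):
    """Return the class label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(42)
rows = list(range(10))                 # stand-in for 10 dataset rows
sample = bootstrap_sample(rows, rng)   # some rows repeat, some are left out

votes = [0, 1, 1, 0, 1]                # hypothetical predictions from 5 trees
forest_prediction = majority_vote(votes)
print(forest_prediction)               # 1 (three votes against two)
```

Because each tree sees a slightly different sample, the trees make different mistakes, and combining them tends to cancel those mistakes out.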

## Importing Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Importing Dataset

In [2]:
df = pd.read_csv("./Social_Network_Ads.csv")
df

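| | Age | EstimatedSalary | Purchased |
|---|---|---|---|
| 0 | 19 | 19000 | 0 |
| 1 | 35 | 20000 | 0 |
| 2 | 26 | 43000 | 0 |
| 3 | 27 | 57000 | 0 |
| 4 | 19 | 76000 | 0 |
| ... | ... | ... | ... |
| 395 | 46 | 41000 | 1 |
| 396 | 51 | 23000 | 1 |
| 397 | 50 | 20000 | 1 |
| 398 | 36 | 33000 | 0 |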


In [3]:
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

In [4]:
X[:5]

array([[   19, 19000],
       [   35, 20000],
       [   26, 43000],
       [   27, 57000],
       [   19, 76000]], dtype=int64)

In [5]:
y[:5]

array([0, 0, 0, 0, 0], dtype=int64)

## Split Dataset into Train and Test set

In [6]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
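To make `test_size=0.25` concrete, here is a rough sketch of a shuffled hold-out split in plain Python. It assumes a dataset of 400 rows for illustration, `shuffle_split` is a hypothetical helper, and it will not reproduce the exact rows that `train_test_split` selects, since scikit-learn uses its own random number generator.

```python
import math
import random

def shuffle_split(n_rows, test_size=0.25, seed=42):
    """Shuffle row indices, then hold out a test_size fraction for testing."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_rows * test_size)  # test set size is rounded up
    return indices[n_test:], indices[:n_test]

# 400 rows is an assumed dataset size for this illustration.
train_idx, test_idx = shuffle_split(400)
print(len(train_idx), len(test_idx))  # 300 100
```

Fixing the seed (the role of `random_state`) makes the split repeatable, so later scores can be compared across runs.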

## Feature Scaling

In [7]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
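As a rough illustration of what the scaler does, standardization subtracts the training mean and divides by the training standard deviation, then reuses those same two numbers on the test data. The sketch below uses plain Python; `fit_scaler` and `transform` are hypothetical helpers, and the ages are illustrative values rather than the actual split used in this notebook.

```python
from statistics import fmean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from training data."""
    return fmean(values), pstdev(values)

def transform(values, mean, std):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(v - mean) / std for v in values]

train_ages = [19, 35, 26, 27, 19]  # illustrative values only
test_ages = [46, 51]               # illustrative values only

mean, std = fit_scaler(train_ages)                # fit on training data only
train_scaled = transform(train_ages, mean, std)
test_scaled = transform(test_ages, mean, std)     # reuse training statistics
print([round(z, 3) for z in train_scaled])
```

Fitting on the training set alone keeps information about the test set from leaking into preprocessing.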

In [8]:
X_train_scaled[:5]

array([[ 1.8925893 ,  1.52189404],
       [ 0.1250379 ,  0.03213212],
       [ 0.9106163 , -1.31157471],
       [-1.34792161, -1.48684082],
       [-0.169554  , -0.58129926]])

In [9]:
X_test_scaled[:5]

array([[ 0.812419  , -1.39920777],
       [ 2.0889839 ,  0.52871943],
       [-0.95513241, -0.75656537],
       [ 1.0088136 ,  0.76240757],
       [-0.85693511, -1.22394166]])

## Train Random Forest Classifier

In [17]:
from sklearn.ensemble import RandomForestClassifier

classifier = RandomForestClassifier(n_estimators=100, criterion="entropy", random_state=42)
classifier.fit(X_train_scaled, y_train)

RandomForestClassifier(criterion='entropy', random_state=42)
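The `criterion="entropy"` setting tells each tree to choose splits by information gain. A minimal sketch of that calculation in plain Python follows; `entropy` and `information_gain` are hypothetical helpers and the labels are made up for illustration, so this is not scikit-learn's internal code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = [0, 0, 0, 0, 1, 1, 1, 1]   # made-up labels: perfectly mixed node
left, right = [0, 0, 0, 1], [0, 1, 1, 1]

print(entropy(parent))                                  # 1.0
print(round(information_gain(parent, left, right), 4))  # 0.1887
```

A split that separates the classes completely would have a gain of 1.0 here, so a tree prefers it over the weaker split shown.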

In [18]:
classifier.score(X_test_scaled, y_test)

0.9

## Confusion matrix

In [12]:
from sklearn.metrics import confusion_matrix

y_p = classifier.predict(X_test_scaled)

confusion_matrix(y_test, y_p)

array([[57,  6],
       [ 4, 33]], dtype=int64)
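The confusion matrix can be read as `[[TN, FP], [FN, TP]]`, since scikit-learn puts true labels on the rows and predicted labels on the columns. The short sketch below recomputes the accuracy from the matrix above and derives precision and recall for the positive class; the variable names are chosen here for illustration.

```python
# Values copied from the confusion matrix output above.
cm = [[57, 6],
      [4, 33]]
(tn, fp), (fn, tp) = cm

total = tn + fp + fn + tp
accuracy = (tn + tp) / total     # share of all predictions that are correct
precision = tp / (tp + fp)       # of predicted purchasers, how many bought
recall = tp / (tp + fn)          # of actual purchasers, how many were found

print(f"accuracy={accuracy:.2f} precision={precision:.3f} recall={recall:.3f}")
# accuracy=0.90 precision=0.846 recall=0.892
```

The accuracy matches the `0.9` returned by `classifier.score`, while precision and recall show how the 10 errors are divided between false positives and false negatives.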