# Random Forest Classifier
This notebook builds a Random Forest classifier with the scikit-learn library for Python, using the well-known Iris dataset. Random forest (also called random decision forest) is a supervised machine learning algorithm, built from decision trees, that is used for classification, regression, and other tasks.

A random forest classifier trains a set of decision trees (DTs), each on a randomly selected subset of the training set. To make a prediction, it collects the votes of the individual trees and returns the class with the most votes.
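
The idea described above can be sketched by hand with plain decision trees. This is only a minimal illustration under stated assumptions, not how `RandomForestClassifier` is implemented: the number of trees (25), the seed, and the names `trees`, `votes`, and `forest_pred` are all chosen for this example. Note also that scikit-learn's own forest averages the trees' class probabilities rather than counting hard votes, so its results can differ slightly from a strict majority vote.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)  # fixed seed, chosen only for illustration

# Train each tree on a bootstrap sample (rows drawn with replacement).
trees = []
for i in range(25):  # 25 trees is an arbitrary choice for this sketch
    rows = rng.integers(0, len(X), size=len(X))
    # max_features="sqrt" makes each split consider a random subset of features
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    trees.append(tree.fit(X[rows], y[rows]))

# Collect one vote per tree for every sample, then take the majority class.
votes = np.stack([tree.predict(X) for tree in trees])  # shape: (n_trees, n_samples)
forest_pred = np.array([np.bincount(col, minlength=3).argmax() for col in votes.T])

print("Training accuracy of the hand-built forest:", (forest_pred == y).mean())
```

The accuracy printed here is measured on the training data, so it says little about generalization; it only shows that the vote-combining step works.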

### libraries

In [2]:
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics

### loading the dataset

In [3]:
iris = datasets.load_iris()

### checking the target and feature names in the dataset

In [4]:
print("Target names are : ",iris.target_names)

print("Feature names are : ",iris.feature_names)

Target names are :  ['setosa' 'versicolor' 'virginica']
Feature names are :  ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']


### declaring X and y

In [5]:
X,y = datasets.load_iris(return_X_y=True)

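### splitting the data

In [6]:
# Hold out 30% of the samples for testing. No random_state is set,
# so the split (and the accuracy reported below) changes from run to run.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)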
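
If a repeatable split is wanted, one option is to fix the seed and stratify by class. The sketch below assumes an arbitrary seed of 42; `random_state` and `stratify` are standard `train_test_split` parameters, but neither is used in this notebook, so the numbers reported elsewhere here do not come from this split.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# random_state=42 is an arbitrary seed; stratify=y keeps the class
# proportions the same in the training and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)

print("Train/test sizes:", X_train.shape[0], X_test.shape[0])
print("Test samples per class:", np.bincount(y_test))
```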

### creating a DataFrame

In [7]:
data = pd.DataFrame({'sepallength': iris.data[:, 0], 'sepalwidth': iris.data[:, 1],
                     'petallength': iris.data[:, 2], 'petalwidth': iris.data[:, 3],
                     'species': iris.target})

### viewing the first rows of the dataset

In [8]:
print(data.head(10))

   sepallength  sepalwidth  petallength  petalwidth  species
0          5.1         3.5          1.4         0.2        0
1          4.9         3.0          1.4         0.2        0
2          4.7         3.2          1.3         0.2        0
3          4.6         3.1          1.5         0.2        0
4          5.0         3.6          1.4         0.2        0
5          5.4         3.9          1.7         0.4        0
6          4.6         3.4          1.4         0.3        0
7          5.0         3.4          1.5         0.2        0
8          4.4         2.9          1.4         0.2        0
9          4.9         3.1          1.5         0.1        0
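
The `species` column above holds integer class codes. As an alternative sketch, `load_iris(as_frame=True)` (available in scikit-learn 0.23 and later) returns a ready-made DataFrame, and the codes can be mapped to names with `pd.Categorical.from_codes`. The variable name `df` and the extra `species` column are choices made for this example only.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)  # needs scikit-learn 0.23 or later
df = iris.frame.copy()           # four feature columns plus an integer "target" column

# Map the integer codes 0/1/2 to the species names for readability.
df["species"] = pd.Categorical.from_codes(df["target"], iris.target_names)

print(df.head(10))
print(df["species"].value_counts())
```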


### Random Forest

In [9]:
clf = RandomForestClassifier(n_estimators=100)

### fitting the model

In [10]:
clf.fit(X_train,y_train)
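
Because each tree is fitted on a bootstrap sample, the samples a tree did not see can be used for a built-in validation estimate. The sketch below assumes the `oob_score=True` option and a fixed seed, neither of which is used in this notebook; `clf_oob` is a name chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# oob_score=True scores every sample using only the trees
# whose bootstrap sample did not contain it.
clf_oob = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
clf_oob.fit(X, y)

print("Out-of-bag accuracy estimate:", clf_oob.oob_score_)
```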

### predictions

In [11]:
y_pred = clf.predict(X_test)

### accuracy

In [12]:
print("Accuracy : ",(metrics.accuracy_score(y_test,y_pred))*100,"%")

Accuracy :  93.33333333333333 %
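
With only 45 test samples, a single accuracy figure moves noticeably from one random split to the next. A hedged way to get a steadier estimate is k-fold cross-validation; the sketch below assumes 5 folds and a fixed seed, which are illustrative choices rather than settings used in this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0)

# cv=5 fits the model five times, each time testing on a different fifth of the data.
scores = cross_val_score(model, X, y, cv=5)

print("Accuracy per fold:", scores)
print("Mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```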


### predicting the type of flower

In [14]:
# Input order: sepal length, sepal width, petal length, petal width (cm)
predicted_val = clf.predict([[3,3,3,3]])
# predict returns an array of class indices; look up the species name
print(iris.target_names[predicted_val[0]], predicted_val)

virginica [2]


In [23]:
# Input order: sepal length, sepal width, petal length, petal width (cm)
predicted_val = clf.predict([[3,3,2,2]])
# predict returns an array of class indices; look up the species name
print(iris.target_names[predicted_val[0]], predicted_val)

setosa [0]


In [19]:
# Input order: sepal length, sepal width, petal length, petal width (cm)
predicted_val = clf.predict([[2,2,1,1]])
# predict returns an array of class indices; look up the species name
print(iris.target_names[predicted_val[0]], predicted_val)

versicolor [1]


In [33]:
# Input order: sepal length, sepal width, petal length, petal width (cm)
predicted_val = clf.predict([[4,4,4,4]])
# predict returns an array of class indices; look up the species name
print(iris.target_names[predicted_val[0]], predicted_val)

versicolor [1]
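
The class index returned by `predict` refers to a species (0 = setosa, 1 = versicolor, 2 = virginica), so it can be used directly to index `iris.target_names`. The sketch below wraps that lookup in a hypothetical helper, `predict_species`, and also prints `predict_proba` to show how confident the forest is. The model here is refitted on the full dataset with a fixed seed purely so the example is self-contained; the two sample rows are measurements taken from the Iris data.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(iris.data, iris.target)

def predict_species(model, samples):
    """Hypothetical helper: return species names for rows of
    [sepal length, sepal width, petal length, petal width] in cm."""
    class_ids = model.predict(np.asarray(samples, dtype=float))
    return iris.target_names[class_ids]

samples = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
names = predict_species(clf, samples)
probs = clf.predict_proba(samples)  # one row per sample, one column per class

for sample, name, prob in zip(samples, names, probs):
    print(sample, "->", name, "class probabilities:", prob)
```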


### calculating feature importance

In [16]:
feature_imp = pd.Series(clf.feature_importances_, index=iris.feature_names).sort_values(ascending=False)
feature_imp

petal length (cm)    0.476502
petal width (cm)     0.392191
sepal length (cm)    0.103897
sepal width (cm)     0.027410
dtype: float64
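
The values in `feature_importances_` are impurity-based and computed from the training data. As a cross-check, permutation importance measures how much the score on held-out data drops when one feature's values are shuffled. The following is a minimal sketch assuming a fixed seed, a stratified split, and 10 repeats, none of which come from this notebook; `perm_imp` is a name chosen for the example.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.30, random_state=0, stratify=iris.target
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Shuffle each feature 10 times on the test set and record the drop in accuracy.
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)

perm_imp = pd.Series(result.importances_mean, index=iris.feature_names)
print(perm_imp.sort_values(ascending=False))
```

When two features are strongly correlated, shuffling one of them may change the score very little, so both rankings are best read together rather than either one alone.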