# Random Forest

- Random Forest is an ensemble of decision trees: each tree predicts a class, and the trees' predictions are then combined to select the final winning class.
- It is one of the most popular classification algorithms.
- It runs efficiently on large datasets and typically achieves high accuracy.
- It can handle hundreds of input variables.
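
The voting idea in the list above can be sketched in a few lines. This is only an illustration under simple assumptions: `majority_vote` and the example votes are hypothetical, and ties go to whichever label was seen first. Note that scikit-learn's `RandomForestClassifier` averages the trees' predicted class probabilities rather than counting hard votes.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    counts = Counter(tree_predictions)
    # most_common(1) returns a list with one (label, count) pair
    return counts.most_common(1)[0][0]


# Hypothetical votes from five trees for a single wine sample
votes = ["One", "Two", "One", "One", "Three"]
winner = majority_vote(votes)
print(winner)  # One
```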

    ![image.png](attachment:image.png)


In [101]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [102]:
vinos = pd.read_csv('vino.csv')
vinos.head()

| Unnamed: 0 | Alcohol | Malic acid | Ash | Alcalinity of ash | Magnesium | Total phenols | Flavanoids | Nonflavanoid phenols | Proanthocyanins | Color intensity | Hue | OD280/OD315 of diluted wines | Proline | Wine Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 | One |
| 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050.0 | One |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 | One |
| 3 | 14.37 | 1.95 | 2.5 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480.0 | One |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 | One |


In [103]:
# Features: every column except the target
X = vinos.drop(['Wine Type'], axis=1)
# Target: the wine class label
y = vinos['Wine Type']

In [104]:
from sklearn.model_selection import train_test_split

In [105]:
# Hold out 30% of the rows for testing (the split is random on every run)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
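
Because the split above has no `random_state`, every run yields a different train/test partition and therefore different results. A minimal sketch of a reproducible, class-balanced split follows. The toy `X` and `y` are hypothetical stand-ins for the wine table, and `random_state=42` is an arbitrary choice.

```python
from sklearn.model_selection import train_test_split

# Toy data standing in for the wine table (hypothetical values)
X = [[i] for i in range(20)]
y = ['One'] * 8 + ['Two'] * 8 + ['Three'] * 4

# random_state fixes the shuffle; stratify=y keeps class proportions similar
split_a = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
split_b = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

X_train, X_test, y_train, y_test = split_a
print(len(X_train), len(X_test))  # 14 6
print(split_a == split_b)         # True: same seed, same partition
```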


In [106]:
from sklearn.ensemble import RandomForestClassifier

# Build a forest of 80 decision trees
random_forest = RandomForestClassifier(n_estimators=80)

In [107]:
random_forest.fit(X_train, y_train)
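
To see what fitting produces, the sketch below inspects a fitted forest. It assumes scikit-learn's bundled `load_wine` dataset as a stand-in for `vino.csv` (the two may not be identical), and `random_state=0` is an arbitrary choice. It rebuilds the forest's prediction by averaging the per-tree class probabilities, which is how scikit-learn combines the trees.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

# Assumption: load_wine stands in for vino.csv
X, y = load_wine(return_X_y=True)

forest = RandomForestClassifier(n_estimators=80, random_state=0)
forest.fit(X, y)

# One fitted decision tree per estimator
print(len(forest.estimators_))

# Average the class probabilities predicted by each tree
avg_proba = np.mean([tree.predict_proba(X) for tree in forest.estimators_], axis=0)

# The class with the highest average probability is the forest's prediction
manual = forest.classes_[np.argmax(avg_proba, axis=1)]
agreement = np.mean(manual == forest.predict(X))
print(avg_proba.shape, agreement)
```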

In [108]:
# Predict the wine type of every test sample
predicciones = random_forest.predict(X_test)
predicciones

array(['One', 'One', 'One', 'One', 'One', 'Three', 'Three', 'One', 'Two',
       'Three', 'One', 'Two', 'Two', 'Two', 'Three', 'One', 'Two', 'Two',
       'Two', 'Two', 'One', 'One', 'Three', 'Two', 'Three', 'One', 'Two',
       'Three', 'Three', 'Two', 'One', 'Two', 'Two', 'One', 'One', 'One',
       'Three', 'One', 'Two', 'One', 'Two', 'One', 'Two', 'One', 'Two',
       'Two', 'Two', 'Two', 'One', 'Two', 'Three', 'One', 'Two', 'One'],
      dtype=object)

In [109]:
y_test

3        One
23       One
31       One
56       One
55       One
165    Three
174    Three
4        One
116      Two
167    Three
8        One
94       Two
78       Two
64       Two
171    Three
5        One
124      Two
71       Two
115      Two
80       Two
41       One
45       One
159    Three
103      Two
131    Three
15       One
68       Two
158    Three
137    Three
66       Two
2        One
75       Two
81       Two
43       One
10       One
49       One
172    Three
9        One
110      Two
58       One
101      Two
17       One
122      Two
6        One
63       Two
113      Two
79       Two
111      Two
51       One
100      Two
134    Three
28       One
90       Two
18       One
Name: Wine Type, dtype: object

In [110]:
from sklearn.metrics import confusion_matrix, classification_report

In [111]:
print(confusion_matrix(y_test, predicciones))

[[22  0  0]
 [ 0 10  0]
 [ 0  0 22]]
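
Reading the matrix above: rows are the true classes and columns are the predicted classes, in scikit-learn's sorted label order (`One`, `Three`, `Two`). The diagonal holds the correct predictions, so accuracy is the diagonal sum divided by the total. A short sketch using the numbers printed above:

```python
import numpy as np

# scikit-learn orders the labels alphabetically
labels = sorted(['One', 'Two', 'Three'])

# Confusion matrix copied from the output above
cm = np.array([[22, 0, 0],
               [0, 10, 0],
               [0, 0, 22]])

correct = np.trace(cm)      # sum of the diagonal
total = cm.sum()            # number of test samples
accuracy = correct / total

print(labels)               # ['One', 'Three', 'Two']
print(correct, total, accuracy)
```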


In [112]:
print(classification_report(y_test, predicciones))

              precision    recall  f1-score   support

         One       1.00      1.00      1.00        22
       Three       1.00      1.00      1.00        10
         Two       1.00      1.00      1.00        22

    accuracy                           1.00        54
   macro avg       1.00      1.00      1.00        54
weighted avg       1.00      1.00      1.00        54
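
The per-class figures in the report can be derived from the confusion matrix: precision divides each diagonal entry by its column sum, recall divides it by its row sum, and support is the row sum. A minimal sketch using the matrix from this run:

```python
import numpy as np

# Confusion matrix from this run (rows: true One, Three, Two)
cm = np.array([[22, 0, 0],
               [0, 10, 0],
               [0, 0, 22]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums: predicted counts
recall = true_positives / cm.sum(axis=1)     # row sums: actual counts
f1 = 2 * precision * recall / (precision + recall)
support = cm.sum(axis=1)

for name, p, r, f, s in zip(['One', 'Three', 'Two'], precision, recall, f1, support):
    print(f"{name:>5}  {p:.2f}  {r:.2f}  {f:.2f}  {s}")
```

Since the split is random and the test set has only 54 samples, a perfect score on one run is not guaranteed to repeat on another.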

