# Random Forest Classifiers in Scikit-learn

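This notebook introduces Random Forest classifiers and demonstrates how to implement them using scikit-learn. A Random Forest is an ensemble of decision trees, each trained on a random sample of the data, whose individual predictions are combined into a single prediction.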

First, let's import the necessary libraries. We'll use scikit-learn for the Random Forest classifier and the synthetic dataset, matplotlib for visualization, and NumPy, the array library that scikit-learn builds on.

In [None]:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt

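Now, let's create a synthetic classification dataset with scikit-learn's make_classification function. The call below generates 1,000 samples with four features. Two of the features are informative; because n_redundant=0 and n_repeated keeps its default of 0, the other two are random noise.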

In [None]:
X, y = make_classification(n_samples=1000, n_features=4, n_informative=2, n_redundant=0, random_state=42)
print(f"Shape of X: {X.shape}, Shape of y: {y.shape}")
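Before training, it can help to confirm that the two classes are roughly balanced. The following is a minimal sketch that regenerates the same dataset so it runs on its own; `class_counts` is a name introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification

# Regenerate the same dataset as above so this cell is self-contained.
X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)

# np.bincount counts how many samples carry each integer class label.
class_counts = np.bincount(y)
for label, count in enumerate(class_counts):
    print(f"Class {label}: {count} samples")
```

If one class dominated, plain accuracy would be a misleading metric, so this check is worth doing before any evaluation.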

Let's create an instance of the RandomForestClassifier. We'll set the number of trees (n_estimators) to 100 and use a random state for reproducibility.

In [None]:
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
print(rf_classifier)
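Printing the estimator shows only some of its settings. As a small sketch, `get_params()` returns every hyperparameter as a dictionary, including the defaults we did not set; the selection of names printed below is just an illustrative choice.

```python
from sklearn.ensemble import RandomForestClassifier

rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)

# get_params() returns a dict of all constructor arguments and their values.
params = rf_classifier.get_params()

# Print a few hyperparameters that commonly matter for Random Forests.
for name in ("n_estimators", "max_depth", "max_features", "bootstrap"):
    print(f"{name}: {params[name]}")
```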

Now we'll fit the Random Forest classifier on our dataset. Fitting trains each of the 100 decision trees in the forest on the provided data.

In [None]:
rf_classifier.fit(X, y)
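After fitting, the estimator exposes attributes that describe what it learned. The sketch below repeats the setup so it runs on its own, then inspects a few of them; it assumes a scikit-learn version recent enough to provide `n_features_in_`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X, y)

# estimators_ holds the individual fitted decision trees.
print(f"Number of trees: {len(rf_classifier.estimators_)}")
# classes_ lists the class labels seen during fitting.
print(f"Classes: {rf_classifier.classes_}")
# n_features_in_ records how many features the model was trained on.
print(f"Features seen during fit: {rf_classifier.n_features_in_}")
```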

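Let's make predictions on the same dataset the model was trained on. Because the trees have already seen every one of these samples, the predictions will look better than the model's real performance on new data. In practice, you'd use a separate test set.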

In [None]:
predictions = rf_classifier.predict(X)
print(f"First 10 predictions: {predictions[:10]}")
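Besides hard class labels, the classifier can report class probabilities through `predict_proba`. The following sketch, which rebuilds the model so it is self-contained, shows the probabilities for the first five samples; `proba` and `labels` are names chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X, y)

# Each row holds one probability per class, in the order of classes_.
proba = rf_classifier.predict_proba(X[:5])
labels = rf_classifier.predict(X[:5])

for row, label in zip(proba, labels):
    print(f"P(class 0)={row[0]:.2f}, P(class 1)={row[1]:.2f} -> predicted {label}")
```

The probabilities are useful when you want to apply your own decision threshold or rank samples by confidence rather than accept the default label.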

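We can evaluate the model's accuracy using the score method, which computes the mean accuracy on the given data and labels. Because we are scoring the same data the model was trained on, this is the training accuracy, and it will overstate how well the model generalizes.

In [None]:
accuracy = rf_classifier.score(X, y)
print(f"Training accuracy: {accuracy:.2f}")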
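To estimate performance on unseen data, one common approach is to hold out part of the dataset with `train_test_split`. This is a minimal sketch rather than part of the original workflow: the 25% test size, the stratified split, and the variable names are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)

# Hold out 25% of the samples; stratify keeps class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit on the training part only, so the test part stays unseen.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"Training accuracy: {train_accuracy:.2f}")
print(f"Test accuracy: {test_accuracy:.2f}")
```

The gap between the two numbers gives a rough sense of how much the model has overfit the training data.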

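Random Forests provide feature importance scores through the feature_importances_ attribute. These scores are impurity-based: each one reflects how much the splits on that feature reduce impurity across the trees, and together they sum to 1. Let's visualize them to see which features most influence the model's decisions.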

In [None]:
importances = rf_classifier.feature_importances_
feature_names = [f'Feature {i}' for i in range(X.shape[1])]

plt.bar(feature_names, importances)
plt.title('Feature Importances')
plt.xlabel('Features')
plt.ylabel('Importance')
plt.show()
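Impurity-based importances are computed from the training data, so it can be worth cross-checking them with a different method. The sketch below uses `permutation_importance` from `sklearn.inspection` on a held-out split; the split, `n_repeats=10`, and the variable names are assumptions chosen for illustration, not part of the original analysis.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Impurity-based importances, as plotted above; they are normalized to sum to 1.
impurity = model.feature_importances_

# Permutation importance: shuffle one feature at a time on held-out data
# and measure how much the accuracy drops, averaged over n_repeats shuffles.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=42)

for i in range(X.shape[1]):
    print(f"Feature {i}: impurity={impurity[i]:.3f}, "
          f"permutation={result.importances_mean[i]:.3f} "
          f"(+/- {result.importances_std[i]:.3f})")
```

Features whose permutation importance is close to zero contribute little to held-out accuracy, which is what we would expect for the noise features in this dataset.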

This concludes our introduction to Random Forest classifiers using scikit-learn. We've covered creating a model, fitting it to data, making predictions, evaluating accuracy, and examining feature importances.