## Step 1: Import the Tools (Libraries)

In [1]:
from sklearn.datasets import load_iris  # Loads the famous Iris flower dataset.
from sklearn.model_selection import train_test_split  # Shuffles the data and splits it into training and testing sets.
from sklearn.ensemble import RandomForestClassifier  # The machine learning model itself: the "expert committee" we want to train.
from sklearn.metrics import accuracy_score  # Measures how well the model performed by comparing its predictions to the true answers.

## Step 2: Load the Data

In [2]:
iris = load_iris()
X = iris.data  # The flower measurements (the features). Each flower has four values: sepal length, sepal width, petal length, and petal width.
y = iris.target  # The correct species for each flower (the labels), encoded as numbers: 0 for Setosa, 1 for Versicolor, and 2 for Virginica.
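
Before splitting anything, it can help to look at what was just loaded. The following is a minimal sketch, not part of the original notebook, that prints the shapes of `X` and `y` along with the feature and species names that `load_iris()` provides.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# X has one row per flower and one column per measurement.
print(X.shape)
# y has one label per flower.
print(y.shape)

# Human-readable names for the four columns and the three classes.
print(iris.feature_names)
print(iris.target_names)

# The first flower: its four measurements and its numeric label.
print(X[0], y[0])
```

The numeric labels in `y` index into `iris.target_names`, so `iris.target_names[y[0]]` gives the species name of the first flower.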

## Step 3: Split the Data for Training and Testing

In [3]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

This is the crucial step of dividing our data for a fair evaluation. We pass all our features (`X`) and labels (`y`) to the `train_test_split` function.

- `test_size=0.2`: This tells the function to set aside 20% of the data for the final test. The other 80% will be used for training.
- `random_state=42`: This ensures that the data is shuffled the same way every time you run the code. It makes your experiment reproducible.

The function returns four new variables:

1. `X_train`: 80% of the flower measurements (for studying).
2. `y_train`: The correct species for the `X_train` data (the textbook answers).
3. `X_test`: 20% of the flower measurements (the final exam questions).
4. `y_test`: The correct species for the `X_test` data (the exam's answer key).
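
To make the 80/20 split and the effect of `random_state` concrete, here is a small sketch that checks the sizes of the returned arrays and repeats the split with the same seed. The variable name `X_test_again` is introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Row counts of the training and test sets.
print(X_train.shape, X_test.shape)

# Splitting again with the same random_state reproduces the same test set.
_, X_test_again, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)
print(np.array_equal(X_test, X_test_again))
```

Changing `random_state` to another integer, or leaving it out, would generally produce a different shuffle and therefore a different test set.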

## Step 4: Create and Train the Model

In [4]:
# Create an instance of the Random Forest model. n_estimators=100 means we are building a "forest" of 100 individual decision trees.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
# The .fit() method starts training. The model analyzes X_train (training measurements) and y_train (training answers)
# to learn the patterns that connect a flower's measurements to its species. The model never sees the test data during this phase.
clf.fit(X_train, y_train)
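
Once `.fit()` has run, the trained forest can be inspected. The sketch below is an optional addition to the tutorial: it confirms that the forest holds 100 fitted trees (via the `estimators_` attribute) and prints how much each measurement contributed to the model's decisions (via `feature_importances_`).

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# One fitted decision tree per estimator.
print(len(clf.estimators_))

# Relative importance of each feature; the values sum to 1.
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```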

## Step 5: Make Predictions on Test Data

In [5]:
predictions = clf.predict(X_test)
# Now that the model is trained, it's time for the exam. The .predict() method takes the unseen test features (X_test) and makes its best guess for the species of each flower. The results are stored in the predictions variable.
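
The predictions come back as the numeric labels 0, 1, and 2. As a hedged illustration, the sketch below maps them back to species names and also calls `predict_proba`, which reports how the forest's vote was distributed across the three classes for each flower. The names `predicted_names` and `probabilities` are introduced here for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)

# Use the numeric labels as indices into the array of species names.
predicted_names = iris.target_names[predictions]
print(predicted_names[:5])

# One row per test flower, one column per species; each row sums to 1.
probabilities = clf.predict_proba(X_test)
print(probabilities[:5])
```

A row with one value close to 1 indicates the trees largely agreed, while more evenly spread values indicate a less certain prediction.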

## Step 6: Evaluate the Model's Performance

In [6]:
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy * 100:.2f}%")

Accuracy: 100.00%


`accuracy_score(y_test, predictions)` compares the model's predictions with the true answers (`y_test`).

It returns the fraction of correct guesses as a number between 0 and 1, which we store in the `accuracy` variable. An accuracy of 1.0 means 100% of the predictions were correct.

The `print()` statement then multiplies this value by 100 and displays it as a percentage with two decimal places.
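
To show what the accuracy calculation amounts to, here is a minimal pure-Python sketch. The function `manual_accuracy` and the two short example lists are hypothetical, made up for illustration; they are not scikit-learn's implementation or data from the notebook.

```python
def manual_accuracy(y_true, y_pred):
    """Return the fraction of positions where the prediction matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


# Hypothetical labels: the third prediction is wrong, the other three are right.
example_true = [0, 1, 2, 2]
example_pred = [0, 1, 1, 2]

example_accuracy = manual_accuracy(example_true, example_pred)
print(f"Accuracy: {example_accuracy * 100:.2f}%")  # prints "Accuracy: 75.00%"
```

Three of the four guesses match, so the function returns 0.75, which the formatted string displays as 75.00%.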