
### Explanation of the Code:
1. **Data Loading**: Load the Iris dataset from a CSV file into a pandas DataFrame, then inspect it with `head()`, `info()`, and `describe()`.
2. **Data Preprocessing**:
   - Encode the categorical `species` labels as integers using `LabelEncoder`.
3. **Model Building**:
   - Define features (`X`) and target variable (`y`).
   - Split the data into training (80%) and testing (20%) sets.
   - Train a `RandomForestClassifier` model with 100 trees on the training set.
4. **Model Evaluation**:
   - Make predictions on the test set.
   - Evaluate the model's performance using accuracy, confusion matrix, and classification report.
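
Because the encoding step replaces species names with integers, the confusion matrix and classification report refer to classes by number. The following is a minimal sketch of how those integers relate to the original names; the short `species` list is a hypothetical sample, not the dataset itself, and the variable names are illustrative only.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of labels in the same format as the 'species' column
species = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(species)

# classes_ stores the distinct labels in sorted order;
# each label's position in that array is its integer code
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}

# inverse_transform converts integer codes back to the original labels
decoded = encoder.inverse_transform(encoded)

print(mapping)
print(list(encoded))
print(list(decoded))
```

Keeping the fitted encoder around makes it possible to translate predictions back into species names with `inverse_transform` when reading the evaluation output.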

In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the dataset
file_path = 'IRIS-2.csv'
iris_df = pd.read_csv(file_path)

# Display the first few rows of the dataset
print(iris_df.head())

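# Display basic information about the dataset
# Note: info() prints its summary directly and returns None,
# so wrapping it in print() also prints a trailing "None"
print(iris_df.info())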

# Show summary statistics
print(iris_df.describe(include='all'))

# Data Preprocessing
# Encode the species names as integers (overwrites the 'species' column)
label_encoder = LabelEncoder()
iris_df['species'] = label_encoder.fit_transform(iris_df['species'])

# Define features and target variable
X = iris_df.drop(columns=['species'])
y = iris_df['species']

# Split the data into training (80%) and testing (20%) sets;
# random_state fixes the shuffle so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build a random forest with 100 trees and fit it on the training set
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print('Confusion Matrix:')
print(conf_matrix)
print('Classification Report:')
print(class_report)


   sepal_length  sepal_width  petal_length  petal_width      species
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object 
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
None
        sepal_length  sepal_width  petal_length  petal_width      species
count     150.000000   150.000000    150.000000   