### Importing Required Libraries

The following libraries are imported to facilitate data manipulation, model training, and evaluation:

- `pandas`: For handling data in DataFrame format.
- `numpy`: For numerical computations and array manipulation.
- `matplotlib.pyplot`: For visualizing data through plots.
- `seaborn`: For enhanced data visualization.
- `sklearn.model_selection`: For splitting data into training and testing sets, and performing cross-validation.
- `sklearn.preprocessing`: For scaling numerical features.
- `sklearn.ensemble`: For implementing the Random Forest classifier.
- `sklearn.svm`: For implementing the Support Vector Classifier.
- `sklearn.metrics`: For evaluating model performance with accuracy, confusion matrix, and classification report.
- `kagglehub`: For interacting with Kaggle datasets (if required).
- `os`: For interacting with the operating system (e.g., file handling).

```python
import os

import kagglehub
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
```


### Loading and Exploring the Iris Flower Dataset

In [None]:
path = kagglehub.dataset_download("arshid/iris-flower-dataset")

print("Path to dataset files:", path)

In [None]:
# Use the location returned by kagglehub rather than a hard-coded Kaggle path
directory_path = path
files = os.listdir(directory_path)

print("Files in dataset directory:")
for f in files:
    print(f)


In [None]:
# Build the CSV path from the download location so the cell also runs outside Kaggle
df = pd.read_csv(os.path.join(path, 'IRIS.csv'))
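
If the exact file name inside the download directory is not known in advance, the CSV can be located by walking the directory. The following is a minimal sketch: `find_csv_files` is a hypothetical helper (not part of `kagglehub`), and the demo runs on a temporary directory so it does not need a download.

```python
import os
import tempfile


def find_csv_files(root):
    """Return the sorted paths of all .csv files found under `root`."""
    matches = []
    for folder, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".csv"):
                matches.append(os.path.join(folder, name))
    return sorted(matches)


# Demo on a throwaway directory (stand-in for the downloaded dataset folder).
with tempfile.TemporaryDirectory() as demo_root:
    open(os.path.join(demo_root, "IRIS.csv"), "w").close()
    open(os.path.join(demo_root, "notes.txt"), "w").close()
    found_names = [os.path.basename(p) for p in find_csv_files(demo_root)]

print(found_names)
```

In the notebook, `find_csv_files(path)` could then be passed to `pd.read_csv`, assuming the dataset contains a single CSV file.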

In [None]:
df.head()

In [None]:
df.info()

### Dataset Overview

The Iris dataset consists of 150 entries and 5 columns:

- **sepal_length**, **sepal_width**, **petal_length**, **petal_width**: Numerical features (float64).
- **species**: Categorical target variable (object).

All columns have 150 non-null entries, and the dataset uses approximately 6.0 KB of memory.
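
The figures in this overview come from `df.info()`. As a rough sketch, the same facts can be collected programmatically so they can be checked rather than read off the printout. `summarize` is a hypothetical helper, and the small frame below is an illustrative stand-in, not rows taken from `IRIS.csv`.

```python
import pandas as pd


def summarize(frame):
    """Collect shape, per-column null counts, and dtypes of a DataFrame."""
    return {
        "shape": frame.shape,
        "null_counts": frame.isna().sum().to_dict(),
        "dtypes": frame.dtypes.astype(str).to_dict(),
    }


# Illustrative stand-in with the same kind of columns as the Iris data.
demo = pd.DataFrame({
    "sepal_length": [5.0, 6.0, 7.0],
    "species": ["a", "b", "c"],
})
summary = summarize(demo)
print(summary)
```

Applied to the notebook's `df`, the overview above implies a shape of `(150, 5)` and zero nulls in every column.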


In [None]:
# Get dataset shape
print("Dataset shape:", df.shape)

In [None]:
# Class counts
class_counts = df['species'].value_counts()
print("\nClass distribution:")
print(class_counts)

# Pie chart
plt.figure(figsize=(6, 6))
plt.pie(class_counts, labels=class_counts.index, autopct='%1.1f%%', startangle=140, colors=['lightcoral', 'lightblue', 'lightgreen'])
plt.title("Iris Species Distribution")
plt.axis('equal')  # Equal aspect ratio ensures the pie is circular.
plt.show()
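
The percentages shown in the pie chart are each class's share of the rows. A minimal sketch of that computation in plain Python follows; `class_shares` is a hypothetical helper and the label list is a small illustrative stand-in. With pandas, `df['species'].value_counts(normalize=True)` gives the same shares.

```python
from collections import Counter


def class_shares(labels):
    """Map each label to its fraction of the total number of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Stand-in labels: three samples per class, mimicking a balanced dataset.
demo_labels = (
    ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3 + ["Iris-virginica"] * 3
)
shares = class_shares(demo_labels)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```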


In [None]:
# Per-species distributions of three features (sepal_width is not plotted here)
features_to_plot = [
    ("petal_length", "Petal Length"),
    ("petal_width", "Petal Width"),
    ("sepal_length", "Sepal Length"),
]
for feature, label in features_to_plot:
    sns.FacetGrid(df, hue="species", height=3).map(sns.histplot, feature, kde=True).add_legend()
    plt.title(f"Distribution of {label} by Species")
    plt.show()

In [None]:
sns.set_style("whitegrid")
# `size` was renamed to `height` in seaborn
sns.pairplot(df, hue="species", height=3)
plt.suptitle("Pairwise Relationships Between Iris Features", y=1.02)
plt.show()
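
A correlation matrix is a numeric complement to the pair plot. The sketch below assumes scikit-learn's bundled copy of the Iris data (`load_iris`) rather than the Kaggle CSV, so the column names differ from those in `df`.

```python
from sklearn.datasets import load_iris

# Bundled Iris data as a DataFrame of the four numeric features.
iris = load_iris(as_frame=True)
features = iris.data

# Pearson correlation between every pair of features.
corr = features.corr()
print(corr.round(2))
```

In the notebook itself, the equivalent call would be `df.drop(columns=['species']).corr()`.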

In [None]:
# Separate features (X) and target variable (y)
X = df.drop(columns=['species'])
y = df['species']

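# Standardize the features (zero mean, unit variance per column).
# Note: the scaler is fit on the full dataset before the split, so
# test-set statistics influence the scaling.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)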
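
For reference, standardization subtracts each column's mean and divides by its standard deviation. A minimal NumPy sketch of that formula follows; `standardize` is a hypothetical helper, and unlike `StandardScaler` it does not guard against zero-variance columns.

```python
import numpy as np


def standardize(values):
    """Scale each column to zero mean and unit variance."""
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    std = values.std(axis=0)  # population standard deviation (ddof=0)
    return (values - mean) / std


# Two columns on very different scales end up directly comparable.
demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = standardize(demo)
print(scaled)
```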


In [None]:
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)
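
The cells above fit the scaler on all rows before splitting. One common alternative, sketched here under the assumption that scikit-learn's bundled Iris data stands in for the CSV, is to split first and wrap the scaler and the model in a pipeline, so the scaling statistics come from the training rows only. The variable names are illustrative, and `stratify` is an addition that keeps the class proportions equal in both splits.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = load_iris(return_X_y=True)

# Split the raw features first; stratify keeps 10 test samples per class.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# The pipeline fits the scaler on the training rows only, then the SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear", random_state=42))
pipeline.fit(X_tr, y_tr)

test_accuracy = pipeline.score(X_te, y_te)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```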


In [None]:
# Initialize and train the Random Forest model
rf_model = RandomForestClassifier(random_state=42)
rf_model.fit(X_train, y_train)

# Predict on the test set
rf_pred = rf_model.predict(X_test)


In [None]:
# Random Forest accuracy
print("Random Forest Accuracy:", accuracy_score(y_test, rf_pred))

# Random Forest classification report
print("\nRandom Forest Classification Report:\n", classification_report(y_test, rf_pred))

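# Random Forest confusion matrix (rows: true class, columns: predicted class).
# Tick labels come from rf_model.classes_ so they match the matrix order.
plt.figure(figsize=(6, 5))
sns.heatmap(confusion_matrix(y_test, rf_pred, labels=rf_model.classes_), annot=True, fmt="d", cmap="Blues", xticklabels=rf_model.classes_, yticklabels=rf_model.classes_)
plt.xlabel('Predicted')
plt.ylabel('True')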
plt.title('Random Forest Confusion Matrix')
plt.show()


In [None]:
# Impurity-based feature importances from the Random Forest trained above
importances = rf_model.feature_importances_
feature_names = X.columns
sns.barplot(x=importances, y=feature_names)
plt.title("Feature Importance")
plt.show()
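
Impurity-based importances are computed from the training data. Permutation importance on held-out data is a common cross-check: it measures how much the score drops when one feature's values are shuffled. The sketch below assumes scikit-learn's bundled Iris data in place of the CSV and uses illustrative variable names.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
forest = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

# Shuffle each feature 10 times on the test set and record the accuracy drop.
result = permutation_importance(forest, X_te, y_te, n_repeats=10, random_state=42)

for name, mean, std in zip(
    iris.feature_names, result.importances_mean, result.importances_std
):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```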

In [None]:
# Initialize and train the SVM model
svm_model = SVC(kernel='linear', random_state=42)
svm_model.fit(X_train, y_train)

# Predict on the test set
svm_pred = svm_model.predict(X_test)


In [None]:
# SVM accuracy
print("SVM Accuracy:", accuracy_score(y_test, svm_pred))

# SVM classification report
print("\nSVM Classification Report:\n", classification_report(y_test, svm_pred))

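# SVM confusion matrix (rows: true class, columns: predicted class).
# Tick labels come from svm_model.classes_ so they match the matrix order.
plt.figure(figsize=(6, 5))
sns.heatmap(confusion_matrix(y_test, svm_pred, labels=svm_model.classes_), annot=True, fmt="d", cmap="Blues", xticklabels=svm_model.classes_, yticklabels=svm_model.classes_)
plt.xlabel('Predicted')
plt.ylabel('True')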
plt.title('SVM Confusion Matrix')
plt.show()
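
The numbers in a classification report follow directly from the confusion matrix. The sketch below works through that arithmetic on a hypothetical 3x3 matrix, chosen to be consistent with the SVM figures reported in the summary. It is not the notebook's actual output.

```python
import numpy as np

# Hypothetical confusion matrix: rows are true classes, columns are predictions
# (order: setosa, versicolor, virginica). One versicolor is predicted as virginica.
cm = np.array([
    [10, 0, 0],
    [0, 8, 1],
    [0, 0, 11],
])

true_pos = np.diag(cm)
accuracy = true_pos.sum() / cm.sum()
precision = true_pos / cm.sum(axis=0)  # correct / all predicted as that class
recall = true_pos / cm.sum(axis=1)     # correct / all truly in that class
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy: {accuracy:.4f}")
print("precision:", precision.round(2))
print("recall:   ", recall.round(2))
print("f1:       ", f1.round(2))
```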


In [None]:
# Reuse the linear SVM trained above. For a multiclass SVC, coef_ has one row
# per pair of classes (one-vs-one), so average the absolute weights per feature.
coefficients = np.abs(svm_model.coef_).mean(axis=0)
feature_names = X.columns

# Plot
plt.figure(figsize=(8, 5))
sns.barplot(x=coefficients, y=feature_names, palette="coolwarm")
plt.title("Feature Importance from Linear SVM")
plt.xlabel("Importance Score")
plt.ylabel("Features")
plt.tight_layout()
plt.show()
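
To make the averaging step concrete: with three classes, a linear `SVC` trains one classifier per pair of classes, so `coef_` has three rows and one column per feature. The sketch below assumes scikit-learn's bundled Iris data and illustrative variable names.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = load_iris(return_X_y=True)
X_demo_scaled = StandardScaler().fit_transform(X_demo)

svm = SVC(kernel="linear", random_state=42).fit(X_demo_scaled, y_demo)

# One row per class pair (3 pairs for 3 classes), one column per feature.
print(svm.coef_.shape)

# Mean absolute weight per feature across the pairwise classifiers.
per_feature = np.abs(svm.coef_).mean(axis=0)
print(per_feature.round(3))
```

Weight magnitudes are only comparable across features when the features share a scale, which is why the sketch standardizes them first.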

### Summary of Results:

1. **Random Forest Model:**
   - **Accuracy:** The Random Forest model achieved an accuracy of **1.0**: it predicted the correct class for every sample in the test set (20% of the 150 rows, i.e. 30 samples).
   - **Classification Report:**
     - **Precision**, **recall**, and **f1-score** are **1.00** for all three classes (Iris-setosa, Iris-versicolor, and Iris-virginica), so no test sample was misclassified.
     - The support for each class is relatively balanced.

2. **Support Vector Machine (SVM) Model:**
   - **Accuracy:** The SVM model reached an accuracy of approximately **0.97** (96.67%, or 29 of 30 test samples), slightly below the Random Forest model.
   - **Classification Report:**
     - **Iris-setosa** scores **1.00** on every metric. **Iris-versicolor** has a recall of **0.89**, so not every versicolor sample was recovered.
     - **Iris-virginica** has a slightly lower precision of **0.92** with a perfect recall of **1.00**, giving an **f1-score** of **0.96**.
     - Overall, the SVM model performs very well. Its only room for improvement is in separating Iris-versicolor from Iris-virginica.

### Conclusion:
- **Random Forest** slightly outperforms the SVM on this split, with perfect accuracy and perfect per-class metrics.
- **SVM** is a strong competitor: its accuracy is about 0.97, and its only weaknesses are the recall for **Iris-versicolor** and the precision for **Iris-virginica**.

Both models are highly effective for this classification task. The gap between them amounts to a single test sample, so Random Forest's edge here is small.
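
The notebook imports `cross_val_score` but evaluates on a single 30-sample split. Cross-validation gives a steadier comparison of the two models. The sketch below assumes scikit-learn's bundled Iris data in place of the CSV. The fold count, the shuffling, and the variable names are illustrative choices, not part of the original analysis.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = load_iris(return_X_y=True)

# Five stratified folds; each sample is used for testing exactly once.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

candidates = {
    "Random Forest": RandomForestClassifier(random_state=42),
    # The pipeline refits the scaler inside each training fold.
    "Linear SVM": make_pipeline(
        StandardScaler(), SVC(kernel="linear", random_state=42)
    ),
}

cv_scores = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X_demo, y_demo, cv=cv, scoring="accuracy")
    cv_scores[name] = scores
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```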