<a href="https://www.kaggle.com/code/sachin7yadav2511/iris-flower?scriptVersionId=202005243" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# **Objective:** Build and evaluate a logistic regression model on the Iris dataset that classifies species from their features, with an emphasis on exploration and visualization.

# **Q1: Data Loading and Exploration**

**Import necessary libraries.**

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import seaborn as sns

**Load the Iris dataset from sklearn.**

In [None]:
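iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target variable (integer labels 0, 1, 2)
# Build the DataFrame that the following cells use
df = pd.DataFrame(X, columns=iris.feature_names)
df['species'] = iris.target_names[y]  # Map integer labels to species names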

**Display the first few rows of the dataset.**

In [None]:
# Display the first few rows of the dataset
print(df.head())

**Explore the dataset to understand the features and target variable.**

In [None]:
# Explore the dataset to understand features and target variable
df.info()  # info() prints its summary directly and returns None
print(df.describe())

# Visualize the target variable distribution
sns.countplot(x='species', data=df)
plt.title('Distribution of Species in the Dataset')
plt.show()

# **Q2: Data Preprocessing**

**Split the dataset into training and testing sets.**

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split the dataset into features (X) and target (y)
X = df.drop('species', axis=1)
y = df['species']

# Split into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
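
With only 150 samples, a purely random 80/20 split does not guarantee that each species is equally represented in the test set. The sketch below is not part of the original notebook; it compares the split above with a stratified variant that passes `stratify` to `train_test_split`. The names `X_demo`, `y_demo`, `plain_counts`, and `strat_counts` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Reload the data so this sketch runs on its own
X_demo, y_demo = load_iris(return_X_y=True)

# Same split as in the notebook: random, not stratified
_, _, _, y_test_plain = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Assumed alternative: keep the class proportions in both subsets
_, _, _, y_test_strat = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Count how many test samples fall into each of the three classes
plain_counts = np.bincount(y_test_plain, minlength=3)
strat_counts = np.bincount(y_test_strat, minlength=3)
print("Test samples per class (random split):    ", plain_counts)
print("Test samples per class (stratified split):", strat_counts)
```

Because Iris has 50 samples per class, the stratified 30-sample test set contains 10 samples of each species.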

**Scale the features using StandardScaler from sklearn.**

In [None]:
# Scale the features using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
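
The scaler is fitted on the training set only, and the test set is transformed with the training mean and standard deviation, so no information from the test set leaks into preprocessing. A minimal check of that behaviour, using illustrative names (`X_tr`, `X_te`, `scaler_demo`, `manual`) that are not in the notebook:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Fit on the training data only, then apply the same transform to both sets
scaler_demo = StandardScaler().fit(X_tr)
X_tr_s = scaler_demo.transform(X_tr)
X_te_s = scaler_demo.transform(X_te)

# StandardScaler computes (x - mean) / std with the population std (ddof=0)
manual = (X_te - X_tr.mean(axis=0)) / X_tr.std(axis=0)

print("Train means after scaling:", X_tr_s.mean(axis=0).round(6))
print("Train stds after scaling: ", X_tr_s.std(axis=0).round(6))
print("Test means after scaling: ", X_te_s.mean(axis=0).round(3))
```

The scaled training features have mean 0 and standard deviation 1 by construction; the scaled test features are only close to that, which is expected.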

**Visualize the relationships between features.**

In [None]:
# Pairplot to visualize the relationship between features
sns.pairplot(df, hue='species', markers=["o", "s", "D"])
plt.show()

# **Q3: Model Training**

**Import the Logistic Regression model from sklearn.**

In [None]:
from sklearn.linear_model import LogisticRegression

**Train the model using the training data.**

In [None]:
# Create and train the Logistic Regression model
model = LogisticRegression(max_iter=200)
model.fit(X_train_scaled, y_train)

**Print the coefficients and intercept of the trained model.**

In [None]:
print("Model Coefficients:", model.coef_)
print("Model Intercept:", model.intercept_)
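
For a three-class problem with four features, `coef_` has one row per class and one column per feature, and `intercept_` has one entry per class. The sketch below shows how these parameters turn into class probabilities. It assumes a recent scikit-learn release in which the default solver fits a multinomial (softmax) model; under a one-vs-rest scheme the probabilities would be computed differently. It is fitted on the full dataset for brevity, and the names `clf`, `scores`, and `probs` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_iris(return_X_y=True)
X_s = StandardScaler().fit_transform(X_demo)
clf = LogisticRegression(max_iter=200).fit(X_s, y_demo)

print("coef_ shape:     ", clf.coef_.shape)       # (n_classes, n_features)
print("intercept_ shape:", clf.intercept_.shape)  # (n_classes,)

# Linear score for every class: one column per class
scores = X_s @ clf.coef_.T + clf.intercept_

# Softmax over the class axis (subtract the row maximum for numerical stability)
scores -= scores.max(axis=1, keepdims=True)
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# The predicted class is the one with the highest probability
manual_pred = probs.argmax(axis=1)
print("Matches predict_proba:", np.allclose(probs, clf.predict_proba(X_s)))
print("Matches predict:      ", np.array_equal(manual_pred, clf.predict(X_s)))
```

Since the features are standardized, the magnitude of each coefficient gives a rough indication of how strongly that feature pushes a sample toward or away from a class.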

# **Q4: Model Prediction**

**Predict the target values for the testing set.**

In [None]:
# Predict the target values for the testing set
y_pred = model.predict(X_test_scaled)

**Display the predicted values.**

In [None]:
# Display the predicted values
print("Predicted Species:", y_pred)
print("Actual Species:", np.array(y_test))

# **Q5: Model Evaluation**

**Import evaluation metrics from sklearn.**

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

**Calculate and display the accuracy score.**

In [None]:
# Calculate and display the accuracy score
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy Score: {accuracy * 100:.2f}%")
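
An accuracy figure computed on 30 test samples moves by more than three percentage points for every misclassified flower, so it can vary noticeably with the choice of split. One possible complement, sketched here and not part of the original notebook, is 5-fold cross-validation with the scaler and model wrapped in a pipeline so that scaling is refitted inside each fold. The names `pipe` and `cv_scores` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_iris(return_X_y=True)

# The pipeline refits the scaler on the training part of every fold
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# For classifiers, an integer cv uses stratified folds; default scoring is accuracy
cv_scores = cross_val_score(pipe, X_demo, y_demo, cv=5)

print("Accuracy per fold:", cv_scores.round(3))
print(f"Mean accuracy: {cv_scores.mean() * 100:.2f}%")
```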

**Generate and display the confusion matrix.**

In [None]:
# Generate and display the confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", conf_matrix)

In [None]:
# Visualize the confusion matrix using a heatmap
plt.figure(figsize=(6, 4))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=iris.target_names, yticklabels=iris.target_names)
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.show()

**Generate and display the classification report.**

In [None]:
# Generate and display the classification report
class_report = classification_report(y_test, y_pred, target_names=iris.target_names)
print("Classification Report:\n" + class_report)  # Concatenate to avoid a stray leading space
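
The precision and recall columns of the classification report can be derived directly from the confusion matrix: rows are actual classes and columns are predicted classes, so recall divides the diagonal by the row sums and precision divides it by the column sums. The sketch below uses a small made-up set of labels (`y_true`, `y_hat`) rather than the notebook's results, and cross-checks the arithmetic against scikit-learn.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels, for illustration only
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])
y_hat = np.array([0, 0, 1, 2, 1, 2, 2, 1, 2])

cm = confusion_matrix(y_true, y_hat)  # rows: actual, columns: predicted
correct = np.diag(cm)                 # correctly classified samples per class

accuracy_manual = correct.sum() / cm.sum()
recall_manual = correct / cm.sum(axis=1)     # share of each actual class found
precision_manual = correct / cm.sum(axis=0)  # share of each predicted class that is right

print("Confusion matrix:\n", cm)
print("Accuracy: ", round(accuracy_manual, 3))
print("Recall:   ", recall_manual.round(3))
print("Precision:", precision_manual.round(3))

# Cross-check against scikit-learn's per-class metrics
sk_recall = recall_score(y_true, y_hat, average=None)
sk_precision = precision_score(y_true, y_hat, average=None)
print("Agrees with sklearn:",
      np.allclose(recall_manual, sk_recall) and np.allclose(precision_manual, sk_precision))
```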