# Iris Prediction
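This notebook trains a Logistic Regression model on the Iris dataset and uses it to predict the species of new samples. It then fits K-Nearest Neighbors and SVM classifiers for comparison and tunes the SVM with grid search and randomized search.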

---


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')  # suppress library warnings to keep cell output readable
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [2]:
df = pd.read_csv('iris.csv')  # expects iris.csv in the working directory

## About the Iris Dataset
The Iris dataset contains 150 samples of iris flowers, each described by four features:
- Sepal length (cm)
- Sepal width (cm)
- Petal length (cm)
- Petal width (cm)

The target variable is the species of the iris flower, one of three classes: Setosa, Versicolor, or Virginica. In this file the labels are stored in the `Species` column with an `Iris-` prefix, for example `Iris-setosa`.
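As a quick sanity check on that layout, the sketch below parses the first five rows of the file (the same rows `df.head()` displays) from an in-memory string and confirms that every row has four numeric features and a known species label. The helper name `check_rows` is hypothetical, and the label spellings `Iris-versicolor` and `Iris-virginica` are assumptions, since only `Iris-setosa` is visible in this notebook's output.

```python
import csv
import io
from collections import Counter

FEATURES = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
# Assumed label spellings; only 'Iris-setosa' appears in the notebook output.
SPECIES = {"Iris-setosa", "Iris-versicolor", "Iris-virginica"}

# The five rows displayed by df.head(), written out as CSV text.
SAMPLE = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
3,4.7,3.2,1.3,0.2,Iris-setosa
4,4.6,3.1,1.5,0.2,Iris-setosa
5,5.0,3.6,1.4,0.2,Iris-setosa
"""


def check_rows(csv_text):
    """Validate every row and return the number of rows per species."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # float() raises if a feature is missing or not numeric
        values = [float(row[name]) for name in FEATURES]
        if any(v <= 0 for v in values):
            raise ValueError(f"non-positive measurement in row {row['Id']}")
        if row["Species"] not in SPECIES:
            raise ValueError(f"unexpected species {row['Species']!r}")
        counts[row["Species"]] += 1
    return counts


species_counts = check_rows(SAMPLE)
print(species_counts)
```

Running the same kind of check on the full file would be expected to report three species, but that depends on the actual contents of `iris.csv`.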

In [3]:
df.head()

| | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm | Species |
|---|---|---|---|---|---|---|
| 0 | 1 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 2 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 3 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


In [4]:
df.columns

Index(['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm',
       'Species'],
      dtype='object')

## Basic Statistics
The following cell uses the `describe()` method to summarize each numeric column: count, mean, standard deviation, minimum, quartiles, and maximum.
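To make those summary rows concrete, here is a minimal sketch that recomputes them with the standard library for a handful of values. It uses only the five `SepalLengthCm` values shown by `df.head()`, so the numbers differ from the full-dataset output below. It assumes pandas' defaults: `std` is the sample standard deviation (dividing by n - 1) and percentiles use linear interpolation.

```python
import statistics

# SepalLengthCm values from the five rows shown by df.head()
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0]

summary = {
    "count": len(sepal_length),
    "mean": statistics.fmean(sepal_length),
    # Sample standard deviation (divides by n - 1), matching describe()
    "std": statistics.stdev(sepal_length),
    "min": min(sepal_length),
    "max": max(sepal_length),
}

# method="inclusive" interpolates linearly between the sorted data points
q1, median, q3 = statistics.quantiles(sepal_length, n=4, method="inclusive")
summary.update({"25%": q1, "50%": median, "75%": q3})

for name, value in summary.items():
    print(f"{name:>5}: {value:.4f}")
```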

In [5]:
df.describe()

| | Id | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |
|---|---|---|---|---|---|
| count | 150.0 | 150.0 | 150.0 | 150.0 | 150.0 |
| mean | 75.5 | 5.843333 | 3.054 | 3.758667 | 1.198667 |
| std | 43.445368 | 0.828066 | 0.433594 | 1.76442 | 0.763161 |
| min | 1.0 | 4.3 | 2.0 | 1.0 | 0.1 |
| 25% | 38.25 | 5.1 | 2.8 | 1.6 | 0.3 |
| 50% | 75.5 | 5.8 | 3.0 | 4.35 | 1.3 |
| 75% | 112.75 | 6.4 | 3.3 | 5.1 | 1.8 |
| max | 150.0 | 7.9 | 4.4 | 6.9 | 2.5 |


In [6]:
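# Features: the four measurements ('Id' is only a row identifier, not a predictor)
X = df.drop(columns=['Id', 'Species'])
# Target: the species label
y = df['Species']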

In [7]:
X.head()

| | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [8]:
# Hold out 20% of the rows for testing; random_state fixes the shuffle for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [9]:
scaler = StandardScaler()
# Fit the scaler on the training split only, then reuse its statistics for the test split
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [10]:
model_lr = LogisticRegression()

In [11]:
model_lr.fit(X_train_scaled, y_train)


In [12]:
y_pred = model_lr.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)  # fraction of test samples classified correctly

In [13]:
accuracy

1.0

In [14]:
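# Three new samples; columns follow the training order:
# sepal length, sepal width, petal length, petal width (cm)
new_data = np.array([
    [5.1, 3, 1, 0.5],
    [5.1, 3, 3, 0.5],
    [4.3, 2, 1.5, 0.1],
])
new_data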

array([[5.1, 3. , 1. , 0.5],
       [5.1, 3. , 3. , 0.5],
       [4.3, 2. , 1.5, 0.1]])

## Predicting New Samples
The following cells use the trained model to predict the species of the new iris samples defined above. The inputs are first transformed with the scaler that was fitted on the training data, so that they are on the same scale the model was trained on.
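The reason for reusing the fitted scaler can be sketched without scikit-learn: standardization computes `z = (x - mean) / std` per column, and the mean and standard deviation must come from the training data even when transforming new samples. The helpers `fit_stats` and `standardize` and the four training rows below are hypothetical and purely illustrative; they are not the notebook's actual `X_train`. The population standard deviation is used because that is what `StandardScaler` computes.

```python
from statistics import fmean, pstdev

# Illustrative training rows (hypothetical subset, not the notebook's X_train)
train_rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
]


def fit_stats(rows):
    """Return per-column means and population standard deviations."""
    columns = list(zip(*rows))
    return [fmean(c) for c in columns], [pstdev(c) for c in columns]


def standardize(rows, means, stds):
    """Apply z = (x - mean) / std using statistics learned elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Analogous to scaler.fit(X_train): learn the statistics once
means, stds = fit_stats(train_rows)
train_scaled = standardize(train_rows, means, stds)

# Analogous to scaler.transform(new_data): reuse the training statistics
new_rows = [[5.1, 3.0, 1.0, 0.5]]  # first row of new_data
new_scaled = standardize(new_rows, means, stds)
print(new_scaled)
```

Fitting a fresh scaler on the new samples instead would shift them relative to the training data, and the model's learned coefficients would no longer apply to them.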

In [15]:
new_data_scaled = scaler.transform(new_data)

In [16]:
prediction = model_lr.predict(new_data_scaled)


In [17]:
prediction

array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa'], dtype=object)

In [18]:
from sklearn.neighbors import KNeighborsClassifier

In [23]:
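model_knn = KNeighborsClassifier(n_neighbors=10)

In [24]:
# Note: trained on the unscaled features, unlike the logistic regression above
model_knn.fit(X_train, y_train)

In [25]:
model_knn.score(X_test, y_test)  # mean accuracy on the test split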

1.0

In [26]:
from sklearn.svm import SVC


In [33]:
model_svc = SVC(gamma='auto')

In [28]:
# Note: trained on the unscaled features, like the KNN model
model_svc.fit(X_train, y_train)

In [30]:
model_svc.score(X_test, y_test)  # mean accuracy on the test split

1.0

In [31]:
from sklearn.model_selection import GridSearchCV


In [34]:
# Evaluate every combination of C and kernel with 5-fold cross-validation
classifier = GridSearchCV(model_svc, {
    'C': [1, 20, 30, 40],
    'kernel': ['rbf', 'linear'],
}, cv=5, return_train_score=False)

In [35]:
classifier.fit(X, y)  # cross-validates on the full dataset, not just the training split

In [36]:
classifier.cv_results_

{'mean_fit_time': array([0.01123247, 0.01183124, 0.01629858, 0.008847  , 0.00643034,
        0.00460625, 0.00364776, 0.00309558]),
 'std_fit_time': array([0.009623  , 0.00601888, 0.01566885, 0.00540266, 0.00159015,
        0.0011495 , 0.00087459, 0.00029031]),
 'mean_score_time': array([0.00681577, 0.00608211, 0.01132545, 0.00366035, 0.00585842,
        0.00351753, 0.00242996, 0.00277061]),
 'std_score_time': array([0.00121149, 0.00109129, 0.01062727, 0.0011069 , 0.00154092,
        0.00163409, 0.00012911, 0.00079857]),
 'param_C': masked_array(data=[1, 1, 20, 20, 30, 30, 40, 40],
              mask=[False, False, False, False, False, False, False, False],
        fill_value=999999),
 'param_kernel': masked_array(data=['rbf', 'linear', 'rbf', 'linear', 'rbf', 'linear',
                    'rbf', 'linear'],
              mask=[False, False, False, False, False, False, False, False],
        fill_value=np.str_('?'),
             dtype=object),
 'params': [{'C': 1, 'kernel': 'rbf'},
  {'C

In [37]:
results = pd.DataFrame(classifier.cv_results_)
results

| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_C | param_kernel | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.011232 | 0.009623 | 0.006816 | 0.001211 | 1 | rbf | `{'C': 1, 'kernel': 'rbf'}` | 0.966667 | 1.0 | 0.966667 | 0.966667 | 1.0 | 0.98 | 0.01633 | 1 |
| 1 | 0.011831 | 0.006019 | 0.006082 | 0.001091 | 1 | linear | `{'C': 1, 'kernel': 'linear'}` | 0.966667 | 1.0 | 0.966667 | 0.966667 | 1.0 | 0.98 | 0.01633 | 1 |
| 2 | 0.016299 | 0.015669 | 0.011325 | 0.010627 | 20 | rbf | `{'C': 20, 'kernel': 'rbf'}` | 0.966667 | 1.0 | 0.9 | 0.966667 | 1.0 | 0.966667 | 0.036515 | 3 |
| 3 | 0.008847 | 0.005403 | 0.00366 | 0.001107 | 20 | linear | `{'C': 20, 'kernel': 'linear'}` | 1.0 | 1.0 | 0.9 | 0.933333 | 1.0 | 0.966667 | 0.042164 | 4 |
| 4 | 0.00643 | 0.00159 | 0.005858 | 0.001541 | 30 | rbf | `{'C': 30, 'kernel': 'rbf'}` | 0.966667 | 1.0 | 0.9 | 0.933333 | 1.0 | 0.96 | 0.038873 | 5 |
| 5 | 0.004606 | 0.001149 | 0.003518 | 0.001634 | 30 | linear | `{'C': 30, 'kernel': 'linear'}` | 1.0 | 1.0 | 0.9 | 0.9 | 1.0 | 0.96 | 0.04899 | 5 |
| 6 | 0.003648 | 0.000875 | 0.00243 | 0.000129 | 40 | rbf | `{'C': 40, 'kernel': 'rbf'}` | 1.0 | 0.966667 | 0.9 | 0.933333 | 1.0 | 0.96 | 0.038873 | 5 |
| 7 | 0.003096 | 0.00029 | 0.002771 | 0.000799 | 40 | linear | `{'C': 40, 'kernel': 'linear'}` | 1.0 | 1.0 | 0.9 | 0.9 | 1.0 | 0.96 | 0.04899 | 5 |


In [38]:
from sklearn.model_selection import RandomizedSearchCV


In [48]:
# Sample only n_iter=4 of the 8 possible combinations instead of trying them all
classifier_r = RandomizedSearchCV(model_svc, {
    'C': [1, 10, 20, 30],
    'kernel': ['rbf', 'linear'],
}, n_iter=4, cv=5, return_train_score=False)

In [49]:
classifier_r.fit(X, y)

In [50]:
res = pd.DataFrame(classifier_r.cv_results_)

In [51]:
res

| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_kernel | param_C | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.012207 | 0.000956 | 0.010169 | 0.002055 | linear | 1 | `{'kernel': 'linear', 'C': 1}` | 0.966667 | 1.0 | 0.966667 | 0.966667 | 1.0 | 0.98 | 0.01633 | 1 |
| 1 | 0.009034 | 0.004949 | 0.007087 | 0.002659 | rbf | 1 | `{'kernel': 'rbf', 'C': 1}` | 0.966667 | 1.0 | 0.966667 | 0.966667 | 1.0 | 0.98 | 0.01633 | 1 |
| 2 | 0.004197 | 0.001231 | 0.004186 | 0.002602 | linear | 20 | `{'kernel': 'linear', 'C': 20}` | 1.0 | 1.0 | 0.9 | 0.933333 | 1.0 | 0.966667 | 0.042164 | 3 |
| 3 | 0.003357 | 0.001264 | 0.00253 | 0.001015 | rbf | 30 | `{'kernel': 'rbf', 'C': 30}` | 0.966667 | 1.0 | 0.9 | 0.933333 | 1.0 | 0.96 | 0.038873 | 4 |
