Import the libraries needed to load the Iris dataset for flower classification. The remaining scikit-learn imports appear in the cells that use them.

In [2]:
from sklearn.datasets import load_iris
import pandas as pd


Load the Dataset: 

The Iris dataset ships with scikit-learn. We load it and convert it into a pandas DataFrame for easier handling and manipulation. The dataset contains 150 samples of iris flowers, 50 from each of three species (setosa, versicolor, and virginica). Each sample has four features, all measured in centimeters: sepal length, sepal width, petal length, and petal width. The target column stores the species as an integer label (0, 1, or 2).

In [3]:
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target
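
The integer labels in the target column are easier to read once they are mapped back to species names. The following is a minimal sketch of one way to do that, using the target_names array that load_iris returns. The column name 'species' is chosen here for illustration and is not part of the dataset.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target

# iris.target_names holds the species names, indexed by class label
print(iris.target_names)

# 'species' is an illustrative column name, not something the dataset defines
data['species'] = iris.target_names[iris.target]

# Show one row per (label, name) pair to confirm the mapping
print(data[['target', 'species']].drop_duplicates())
```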

Explore the Dataset:

It's important to understand the structure of the dataset before modeling. Printing the first five rows with head() shows the column names, the units, and a few sample values, and confirms that the data loaded correctly.

In [4]:
print(data.head())


   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                5.1               3.5                1.4               0.2   
1                4.9               3.0                1.4               0.2   
2                4.7               3.2                1.3               0.2   
3                4.6               3.1                1.5               0.2   
4                5.0               3.6                1.4               0.2   

   target  
0       0  
1       0  
2       0  
3       0  
4       0  
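
Beyond the first few rows, a couple of quick checks can confirm the overall structure described earlier. This sketch, which reloads the data so it runs on its own, prints the shape of the DataFrame, the number of samples per class label, and the count of missing values per column.

```python
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Rows and columns: four feature columns plus the target column
print(data.shape)

# Number of samples for each class label
class_counts = data['target'].value_counts().sort_index()
print(class_counts)

# Number of missing values in each column
missing = data.isna().sum()
print(missing)
```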


Split the Dataset:

To evaluate our model, we split the dataset into training and testing sets. The training set is used to fit the model, and the testing set is held out to assess its performance. We use an 80-20 split: with 150 samples, that gives 120 for training and 30 for testing. Setting random_state=42 fixes the shuffle, so the split is the same on every run.

In [5]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)
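
To check what the split produced, we can look at the array shapes and at how many samples of each class ended up in the test set. The sketch below repeats the same call and then shows an optional variant, not used in this notebook, that passes stratify so each class keeps the same proportion in both sets. The name y_test_strat is chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()

# Same 80-20 split as in the cell above
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)

# Samples per class in the test set (not guaranteed to be equal)
print(np.bincount(y_test))

# Optional variant: stratify keeps class proportions equal across the sets
_, _, _, y_test_strat = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42,
    stratify=iris.target
)
print(np.bincount(y_test_strat))
```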


Train a Classifier:

We will use the k-Nearest Neighbors (k-NN) classifier, which is a simple yet effective algorithm. It classifies a data point by majority vote among the closest points in the training set, where closeness is measured by Euclidean distance under scikit-learn's default settings. We set n_neighbors to 3, so each prediction is based on the 3 nearest training samples.

In [6]:
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
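
To make the idea concrete, here is a minimal pure-Python sketch of k-NN voting. It is an illustration only and not how scikit-learn implements KNeighborsClassifier, which uses optimized neighbor searches. The function knn_predict and the toy data are hypothetical, made up for this example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for query by majority vote among its k nearest points."""
    # Pair each training point's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest training points
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among those neighbors is the prediction
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: (petal length, petal width) in cm, two classes
toy_X = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
toy_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(toy_X, toy_y, (1.6, 0.25)))  # close to the class 0 points
print(knn_predict(toy_X, toy_y, (4.6, 1.3)))   # close to the class 1 points
```

With k=3 and well-separated groups like these, all three neighbors share a label, so the vote is unanimous. An odd k is a common choice because it avoids ties when there are two classes.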


Make Predictions:

Once the model is trained, we use it to predict labels for the test set. Comparing these predictions with the true labels, as we do in the next step, shows how well the model generalizes to unseen data.

In [7]:
y_pred = knn.predict(X_test)
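
The same predict method also works for a single new flower. The sketch below rebuilds the model so it runs on its own, then classifies one measurement and maps the predicted label to a species name. The values in new_flower are hypothetical, chosen for illustration. predict_proba reports the fraction of the 3 neighbors that voted for each class.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Hypothetical measurement, in the same order as iris.feature_names:
# sepal length, sepal width, petal length, petal width (cm)
new_flower = [[5.0, 3.4, 1.5, 0.2]]

# predict expects a 2D input, one row per sample
predicted_class = knn.predict(new_flower)[0]
predicted_species = iris.target_names[predicted_class]
print(predicted_species)

# Share of the 3 nearest neighbors belonging to each class
neighbor_votes = knn.predict_proba(new_flower)[0]
print(neighbor_votes)
```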


Evaluate the Model:

Finally, we evaluate the model's performance by comparing the predicted labels with the actual labels of the test set. The accuracy_score function will give us the accuracy, which is the proportion of correctly classified samples.

In [8]:
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')


Accuracy: 1.00
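
To see what accuracy means in practice, here is a minimal sketch that computes the same proportion by hand. The helper manual_accuracy and the example labels are hypothetical, written for this illustration. They are not part of scikit-learn.

```python
def manual_accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count the positions where the predicted label equals the true label
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: the third prediction is wrong, the other four are right
example_true = [0, 1, 2, 2, 1]
example_pred = [0, 1, 1, 2, 1]

print(f'Accuracy: {manual_accuracy(example_true, example_pred):.2f}')  # Accuracy: 0.80
```

Because the test set here has only 30 samples, each misclassified sample would lower the accuracy by about 0.03.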
