This project demonstrates a practical application of the K-Nearest Neighbors (KNN) algorithm to classify iris flowers into their respective species: Setosa, Versicolor, and Virginica.
It uses the classic Iris dataset, which is one of the most well-known datasets in the field of Machine Learning.
This project is aimed at helping beginners understand how KNN works for supervised classification problems.
- Understand the K-Nearest Neighbors (KNN) algorithm and its working.
- Apply KNN to the Iris dataset using Scikit-learn.
- Visualize feature relationships and correlations.
- Evaluate accuracy and find the best `K` value.
- Make predictions for new flower samples.
The dataset used is the Iris Flower Dataset, which contains:
- 150 samples
- 4 features: sepal length, sepal width, petal length, petal width
- 3 classes: Setosa, Versicolor, Virginica
| Feature | Description |
|---|---|
| Sepal Length | Length of the sepal in cm |
| Sepal Width | Width of the sepal in cm |
| Petal Length | Length of the petal in cm |
| Petal Width | Width of the petal in cm |
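
As a quick check of the dataset description above, the following minimal sketch loads the data through Scikit-learn's `load_iris()` and prints its dimensions. The variable names (`iris`, `X`, `y`) are illustrative and may not match those used in the notebook.

```python
from sklearn.datasets import load_iris

# Load the bundled Iris dataset (no download required)
iris = load_iris()
X, y = iris.data, iris.target  # illustrative names, not taken from the notebook

print(X.shape)                   # number of samples and features
print(iris.feature_names)        # the four measurements, in cm
print(list(iris.target_names))   # the three species labels
```
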
| Step | Description |
|---|---|
| 1 | Load the Dataset: Use Scikit-learn's `load_iris()` function. |
| 2 | Data Visualization: Pair plots and heatmaps to see relationships. |
| 3 | Preprocessing: Split data and scale features. |
| 4 | Train Model: Apply `KNeighborsClassifier()` and fit on training data. |
| 5 | Evaluate Model: Compute accuracy and confusion matrix. |
| 6 | Tune K Value: Find optimal K for best accuracy. |
| 7 | Predict New Sample: Classify an unseen flower. |
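
The loading, preprocessing, training, and evaluation steps in the table can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the 80/20 stratified split, `random_state=42`, the use of `StandardScaler`, and `n_neighbors=5` are all assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Step 1: load the dataset
X, y = load_iris(return_X_y=True)

# Step 3: split (assumed 80/20, stratified by class) and scale
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit the scaler on training data only
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

# Step 4: train the model (K=5 is an assumed starting value)
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train_scaled, y_train)

# Step 5: evaluate on the held-out test set
y_pred = model.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print(cm)
```

Fitting the scaler on the training split only keeps information from the test set out of the preprocessing step.
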
The K-Nearest Neighbors algorithm classifies a sample based on the majority label among its k nearest points in the feature space.
Steps:
- Choose a value for `k` (number of neighbors).
- Calculate the distance (usually Euclidean) between the new sample and the training samples.
- Select the `k` closest points.
- Assign the class that appears most frequently among those neighbors.
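
To make these steps concrete, here is a small from-scratch sketch using NumPy. It is for illustration only and is not how Scikit-learn implements KNN internally; the function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training sample
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest points, nearest first
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbors and return the most common one
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Toy 2-D data (hypothetical): two well-separated groups
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

print(knn_predict(X_train, y_train, np.array([1.1, 1.0]), k=3))  # A
print(knn_predict(X_train, y_train, np.array([5.1, 5.0]), k=3))  # B
```

Choosing an odd `k` reduces the chance of a tied vote when there are two classes; with three classes, as in the Iris dataset, ties are still possible.
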
- Model accuracy: 95–100% (depending on K value)
- Best K value found through tuning: around 5–7
- Confusion matrix and classification report included in the notebook.
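
One possible way to search for a good K is sketched below, using 5-fold cross-validation over K = 1 to 15. The notebook may tune K differently (for example, against a single test split), so the search range, the use of `cross_val_score`, and the scaling pipeline are assumptions here, and the best K this sketch reports may differ from the values listed above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Mean cross-validated accuracy for each candidate K (range is an assumption)
scores = {}
for k in range(1, 16):
    # Scaling lives inside the pipeline so it is refit on each training fold
    pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores[k] = cross_val_score(pipeline, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(f"Best K: {best_k} (mean accuracy {scores[best_k]:.3f})")
```
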
Example Prediction:
Input: [5.1, 3.5, 1.4, 0.2]
Predicted Species: setosa
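
A prediction like the one above could be reproduced with a sketch along these lines. For brevity it trains on the full dataset with an assumed `n_neighbors=5` and a scaling pipeline, which may differ from the notebook's setup.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Scale and classify in one pipeline (K=5 is an assumed value)
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(iris.data, iris.target)

# New sample: sepal length, sepal width, petal length, petal width (cm)
sample = np.array([[5.1, 3.5, 1.4, 0.2]])
predicted_index = model.predict(sample)[0]
predicted_species = iris.target_names[predicted_index]
print(f"Predicted Species: {predicted_species}")  # setosa
```

The sample must be passed as a 2-D array (one row per flower), and the pipeline applies the same scaling to it that was learned from the training data.
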
- Python 3.x
- Pandas
- NumPy
- Scikit-learn
- Matplotlib
- Seaborn
- Clone this repository
git clone https://github.com/asimsheikh-coder/iris-flower-classifier-using-knn.git
cd iris-flower-classifier-using-knn
- Install dependencies
pip install pandas numpy scikit-learn matplotlib seaborn
- Run the notebook
jupyter notebook KNN_Iris_Classification.ipynb
Asim Sheikh
12th Grade Student | Aspiring AI Engineer
Email: asimusmansheikh0@gmail.com
GitHub: @asimsheikh-coder
Sheikh, A. Iris Flower Classifier using K-Nearest Neighbors (KNN). 2025. GitHub Repository.
https://github.com/asimsheikh-coder/iris-flower-classifier-using-knn
This project is an excellent introduction to supervised learning and the KNN algorithm.
By analyzing the Iris dataset, students can understand how distance-based algorithms classify new data points based on proximity, a foundational concept for more advanced ML techniques.