This beginner-friendly data science project classifies iris flowers by species (Setosa, Versicolor, or Virginica) from their sepal and petal measurements.
The project uses the famous Iris dataset, which contains 150 samples of iris flowers, each with four features:
- Sepal length (cm)
- Sepal width (cm)
- Petal length (cm)
- Petal width (cm)
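
To make the dataset's shape concrete, here is a minimal sketch that loads it and inspects the four features. It assumes the copy of the Iris dataset bundled with Scikit-learn (`load_iris`); the project itself may read the data from a different source, and the variable names `features` and `species` are illustrative only.

```python
from sklearn.datasets import load_iris

# Assumption: use the Iris copy that ships with scikit-learn, returned as pandas objects.
iris = load_iris(as_frame=True)

features = iris.data  # one row per flower, one column per measurement

# Map the integer class codes (0, 1, 2) to readable species names.
code_to_name = {code: str(name) for code, name in enumerate(iris.target_names)}
species = iris.target.map(code_to_name)

print(features.shape)                     # (number of samples, number of features)
print(list(features.columns))             # the four measurement columns
print(species.value_counts().to_dict())   # samples per species
```
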

The project covers:
- Data exploration and visualization
- Building a K-Nearest Neighbors (KNN) classification model
- Model evaluation and visualization of results
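
The modelling and evaluation steps above can be sketched roughly as follows. This is not the notebook's actual code: the 80/20 stratified split, `random_state=42`, and `n_neighbors=5` are assumptions chosen for illustration, so the accuracy it prints may differ from the figure reported for the project.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Features (four measurements per flower) and integer species labels.
X, y = load_iris(return_X_y=True)

# Assumption: hold out 20% of the samples, keeping species proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Assumption: k = 5 neighbours (scikit-learn's default).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Evaluate on the held-out samples only.
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"Test accuracy: {accuracy:.3f}")
print(cm)  # rows: true species, columns: predicted species
```

The confusion matrix is worth checking alongside the accuracy, since it shows which species, if any, get confused with each other.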

Technologies used:
- Python
- Pandas for data manipulation
- NumPy for numerical operations
- Matplotlib and Seaborn for data visualization
- Scikit-learn for machine learning
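
The KNN model achieved 100% accuracy on the test set, which suggests that the three iris species are well separated by their sepal and petal measurements. With only 150 samples in the whole dataset, any test split is small, so this score is best read as an encouraging result rather than a guarantee of performance on new data.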

To run the project:
- Install the required libraries (Pandas, NumPy, Matplotlib, Seaborn, and Scikit-learn).
- Open the Jupyter notebook.
- Run the cells in the notebook to reproduce the analysis.

Possible future improvements:
- Try other classification algorithms and compare performance
- Perform feature engineering to potentially improve the model
- Collect more data to test the model's robustness
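
As a starting point for the first idea, the sketch below compares KNN with two other Scikit-learn classifiers using 5-fold cross-validation. The choice of alternatives (logistic regression and a decision tree), the feature scaling step, and all hyperparameters are assumptions made for illustration, not part of the original project.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hypothetical candidate models; scaling is applied where distances or
# coefficient magnitudes depend on the feature scale.
candidates = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, model in candidates.items():
    # 5-fold cross-validation: every sample is used for testing exactly once.
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Cross-validation averages over several splits, so it gives a steadier comparison on a small dataset than a single train/test split.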