This repository contains a Jupyter Notebook (iris_analysis.ipynb) that explores and analyzes the Iris dataset, a classic dataset in machine learning and statistics. Using Python (pandas, matplotlib, seaborn, scikit-learn), the notebook demonstrates best practices in data analysis, visualization, and sharing insights.
Inspired by Ubuntu's principle of community, this project emphasizes clarity, simplicity, and shareability.
The Iris dataset includes 150 samples of iris flowers, each described by:
- 4 numerical features: sepal length, sepal width, petal length, petal width
- 1 categorical label: species (setosa, versicolor, virginica)
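
As a minimal sketch of that structure, the snippet below loads the copy of the dataset bundled with scikit-learn and attaches a readable species column. It assumes scikit-learn's `load_iris` is the data source; the notebook itself may read the data differently (for example from a CSV file), and the name `df` is only illustrative.

```python
from sklearn.datasets import load_iris

# Assumption: the dataset comes from scikit-learn's bundled copy.
iris = load_iris(as_frame=True)
df = iris.frame.copy()  # 4 feature columns plus an integer "target" column

# Map the integer target (0, 1, 2) to the species names.
species_names = {i: str(name) for i, name in enumerate(iris.target_names)}
df["species"] = df["target"].map(species_names)

print(df.shape)                      # rows and columns
print(df["species"].value_counts())  # samples per species
```

Note that scikit-learn names the feature columns with units, e.g. `petal length (cm)`.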
The notebook walks through:
- Loading and exploring the dataset
- Computing descriptive statistics and grouping by species
- Creating four visualizations (line chart, bar chart, histogram, scatter plot)
- Summarizing key findings (e.g., petal length is a strong discriminator between species)
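
The steps above can be sketched roughly as follows. This is an illustrative outline, not the notebook's actual code: the choice of which feature goes into which chart, the 2x2 layout, and the variable names (`summary`, `species_means`) are all assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

# Load the data (assumption: scikit-learn's bundled copy of the dataset).
iris = load_iris(as_frame=True)
df = iris.frame.copy()
df["species"] = df["target"].map(
    {i: str(name) for i, name in enumerate(iris.target_names)}
)
features = iris.feature_names

# Descriptive statistics for the whole dataset, then per-species means.
summary = df[features].describe()
species_means = df.groupby("species")[features].mean()
print(species_means.round(2))

# Four charts on one figure; the feature choices here are hypothetical.
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Line chart: petal length in sample order.
df["petal length (cm)"].plot(ax=axes[0, 0], title="Petal length by sample index")

# Bar chart: mean petal length per species.
species_means["petal length (cm)"].plot.bar(
    ax=axes[0, 1], title="Mean petal length per species"
)

# Histogram: distribution of sepal length.
df["sepal length (cm)"].plot.hist(
    bins=20, ax=axes[1, 0], title="Sepal length distribution"
)

# Scatter plot: sepal length vs. petal length, one color per species.
for name, group in df.groupby("species"):
    axes[1, 1].scatter(
        group["sepal length (cm)"], group["petal length (cm)"], label=name
    )
axes[1, 1].set_xlabel("sepal length (cm)")
axes[1, 1].set_ylabel("petal length (cm)")
axes[1, 1].legend()

fig.tight_layout()
```

In a Jupyter notebook the figure renders when the cell finishes. The per-species means make the key finding visible: mean petal length increases from setosa to versicolor to virginica.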
Requirements:
- Python 3.10+
- Libraries: pandas, matplotlib, seaborn, scikit-learn, jupyter