A comprehensive data analysis and visualization project using the classic Iris dataset.
This repository contains an analysis of the Iris dataset built with Python's data analysis and visualization libraries: pandas, NumPy, Matplotlib, seaborn, and scikit-learn.
The assignment demonstrates data loading, exploration, analysis, and visualization.
- Samples: 150 iris flowers
- Species:
  - Setosa
  - Versicolor
  - Virginica
- Features (per flower):
  - Sepal length (cm)
  - Sepal width (cm)
  - Petal length (cm)
  - Petal width (cm)
iris-analysis/
├── Iris_Dataset_Analysis.ipynb # Main Jupyter notebook
├── requirements.txt # Python dependencies
├── iris_visualizations.png # Generated visualizations
└── README.md # Project documentation
- Python 3.6+
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- jupyter
Clone the repository:
git clone <repository-url>
cd iris-analysis
Install dependencies:
pip install -r requirements.txt
Or manually:
pip install pandas numpy matplotlib seaborn scikit-learn jupyter
- Open a terminal/command prompt
- Navigate to the project directory
- Start Jupyter Notebook:
jupyter notebook
- In the web interface, open Iris_Dataset_Analysis.ipynb
- Run the cells in order (Cell > Run All or Shift + Enter)
- Loaded Iris dataset with scikit-learn
- Displayed sample rows
- Explored structure & data types
- Checked for missing values (none found)
- Computed statistics for numerical columns
- Grouped by species and calculated means
- Identified patterns & differences
- Line chart: Measurement trends across species
- Bar chart: Average petal length by species
- Histogram: Sepal length distribution
- Scatter plot: Sepal vs. petal length relationship
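The loading, exploration, analysis, and plotting steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the variable names, the 2x2 figure layout, the bin count, and the reading of "measurement trends" as per-species means are all assumptions. The column names are the ones scikit-learn provides for this dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris

# Load the data and build a DataFrame with a readable species column.
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["species"] = iris.target_names[iris.target]

# Exploration: sample rows, data types, missing values, summary statistics.
print(df.head())
print(df.dtypes)
missing = df.isnull().sum()
print(missing)
print(df.describe())

# Analysis: mean of every measurement per species.
species_means = df.groupby("species").mean()
print(species_means)

# Visualization: four charts in one figure (layout is an assumption).
fig, axes = plt.subplots(2, 2, figsize=(12, 9))

# Line chart: mean of each measurement across species.
ax = axes[0, 0]
for column in species_means.columns:
    ax.plot(species_means.index, species_means[column], marker="o", label=column)
ax.set_title("Mean measurements across species")
ax.set_xlabel("Species")
ax.set_ylabel("Mean (cm)")
ax.legend()

# Bar chart: average petal length by species.
ax = axes[0, 1]
ax.bar(species_means.index, species_means["petal length (cm)"])
ax.set_title("Average petal length by species")
ax.set_xlabel("Species")
ax.set_ylabel("Petal length (cm)")

# Histogram: sepal length distribution.
ax = axes[1, 0]
ax.hist(df["sepal length (cm)"], bins=15)
ax.set_title("Sepal length distribution")
ax.set_xlabel("Sepal length (cm)")
ax.set_ylabel("Count")

# Scatter plot: sepal length vs. petal length, one color per species.
ax = axes[1, 1]
for name, group in df.groupby("species"):
    ax.scatter(group["sepal length (cm)"], group["petal length (cm)"], label=name)
ax.set_title("Sepal length vs. petal length")
ax.set_xlabel("Sepal length (cm)")
ax.set_ylabel("Petal length (cm)")
ax.legend()

fig.tight_layout()
fig.savefig("iris_visualizations.png")
```

Inside the notebook, the `matplotlib.use("Agg")` line can be dropped so the figure renders inline.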
- Species Differences: Setosa (smallest), Virginica (largest), Versicolor (intermediate)
- Measurement Patterns:
  - Strong positive correlation between petal length & width
  - Sepal length and petal length are positively correlated
  - Setosa is easily distinguishable by petal measurements
- Data Quality:
  - No missing values
  - All measurements are numerical and correctly typed
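One way to check the measurement patterns above is to compute Pearson correlations and compare petal-length ranges between species. The sketch below is illustrative only; the variable names are assumptions, and the notebook may reach the same findings differently.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
species = iris.target_names[iris.target]  # array of species names, one per row

# Pearson correlation matrix of the four measurements.
corr = df.corr()
petal_corr = corr.loc["petal length (cm)", "petal width (cm)"]
sepal_petal_corr = corr.loc["sepal length (cm)", "petal length (cm)"]
print(f"petal length vs. petal width:  {petal_corr:.2f}")
print(f"sepal length vs. petal length: {sepal_petal_corr:.2f}")

# Setosa separability: its longest petal is compared with the
# shortest petal found among the other two species.
setosa_max = df.loc[species == "setosa", "petal length (cm)"].max()
others_min = df.loc[species != "setosa", "petal length (cm)"].min()
print(f"setosa max petal length: {setosa_max} cm")
print(f"others min petal length: {others_min} cm")
print("setosa separable by petal length:", setosa_max < others_min)
```

If the last line prints `True`, a single threshold on petal length is enough to separate Setosa from the other two species.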
- Iris_Dataset_Analysis.ipynb: Main analysis notebook
- iris_visualizations.png: Combined visualizations
- requirements.txt: Python dependencies
- README.md: Project documentation
- Includes try-except blocks for robust error handling
- Visualizations have clear titles, labels, and legends
- The analysis follows data exploration and visualization best practices
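The notebook's own try-except blocks are not reproduced here. As a rough idea of what such error handling might look like, the sketch below wraps dataset loading in a hypothetical helper, `load_iris_frame`. The function name, the exceptions caught, and the messages are assumptions for illustration only.

```python
import sys

import pandas as pd


def load_iris_frame():
    """Hypothetical helper: return the Iris data as a DataFrame, or None on failure."""
    try:
        # Imported here so a missing dependency yields a readable message.
        from sklearn.datasets import load_iris
    except ImportError:
        print("scikit-learn is not installed; run: pip install scikit-learn",
              file=sys.stderr)
        return None

    try:
        iris = load_iris()
        df = pd.DataFrame(iris.data, columns=iris.feature_names)
        df["species"] = iris.target_names[iris.target]
    except (ValueError, KeyError, IndexError) as exc:
        # Unexpected shape or missing field in the loaded dataset.
        print(f"Could not build the DataFrame: {exc}", file=sys.stderr)
        return None

    return df


df = load_iris_frame()
if df is not None:
    print(f"Loaded {len(df)} rows and {df.shape[1]} columns")
```

Returning `None` lets the calling cell decide whether to stop or continue. Re-raising the exception would be an equally reasonable choice.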
Happy Analyzing!
For questions or suggestions, feel free to open an issue or submit a pull request.