# CSV Analysis Tutorial

Welcome to this beginner-friendly guide on analyzing CSV files with Python!
In this notebook, we'll learn how to load, inspect, clean, explore, and visualize CSV data using pandas.

Let's get started!

## 📋 CSV Analysis Workflow

Follow this step-by-step workflow to analyze your CSV data:

1. **Load:** Read CSV file with `pd.read_csv()`
2. **Inspect:** Preview rows with `.head()`, check column types and non-null counts with `.info()`, get summary statistics with `.describe()`
3. **Clean:** Handle missing values and duplicates as needed
4. **Explore:** Filter data, group by categories, analyze patterns
5. **Visualize:** Create simple plots to understand data better
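
As a rough illustration of steps 1 to 4, here is a minimal sketch on a tiny inline CSV. The data, the column names (`name`, `category`, `price`), and the 1.50 price threshold are made up for this example and are not part of the Iris dataset used below. Step 5 is left out so the sketch needs nothing beyond pandas.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file; it contains
# one missing price and one exact duplicate row on purpose.
csv_text = """name,category,price
apple,fruit,1.20
banana,fruit,0.50
carrot,vegetable,
banana,fruit,0.50
daikon,vegetable,2.00
"""

# 1. Load: read_csv accepts a path, a URL, or a file-like object
df = pd.read_csv(io.StringIO(csv_text))

# 2. Inspect: preview rows, check dtypes, summarize numeric columns
print(df.head())
df.info()
print(df.describe())

# 3. Clean: drop exact duplicate rows, then rows without a price
clean = df.drop_duplicates().dropna(subset=["price"])

# 4. Explore: filter rows, then compute the mean price per category
cheap = clean[clean["price"] < 1.50]
mean_price = clean.groupby("category")["price"].mean()
print(cheap)
print(mean_price)
```

Whether dropping incomplete rows is the right cleaning choice depends on the data; filling the gap is an equally common alternative.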

## 🌸 Iris Dataset Analysis

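Here's the classic Iris dataset: 150 flowers from three iris species (setosa, versicolor, and virginica), each with four measurements (sepal length, sepal width, petal length, and petal width).
It's small and clean, which makes it perfect for practicing data analysis!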

![Iris flowers showing different species with measurements visualization](images/iris_dataset.png)

Let's see how to analyze this dataset step-by-step.

In [None]:
import pandas as pd
import numpy as np

## 💻 Complete CSV Analysis Example

Run the following code to load and analyze the Iris dataset.

In [None]:
# Load the Iris dataset from URL
iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

In [None]:
# Initial inspection: shape of dataset
print("Dataset shape:", iris.shape)

In [None]:
# Preview first 5 rows
print("\nFirst 5 rows:")
print(iris.head())

In [None]:
# Check for missing values in each column
print("\nMissing values:")
print(iris.isna().sum())
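
The check above only counts missing values. If a dataset does turn out to have gaps or repeated rows, one possible cleanup looks like the sketch below. The small `sample` frame is hypothetical: it borrows the Iris column names, but the gap and the repeated row are invented for illustration. Filling with the median is just one option among several.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with Iris-style columns; the NaN and the
# repeated row are invented for illustration.
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, np.nan, 5.1, 6.3],
    "petal_length": [1.4, 1.4, 4.7, 1.4, 6.0],
    "species": ["setosa", "setosa", "versicolor", "setosa", "virginica"],
})

# Count missing values per column and fully duplicated rows
missing_per_column = sample.isna().sum()
n_duplicates = sample.duplicated().sum()

# Drop exact duplicates (the first occurrence is kept)
deduped = sample.drop_duplicates()

# Fill the numeric gap with the median of the remaining values
median_sepal = deduped["sepal_length"].median()
cleaned = deduped.fillna({"sepal_length": median_sepal})

print(missing_per_column)
print("Duplicate rows:", n_duplicates)
print(cleaned)
```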

In [None]:
# Get basic statistics of numerical columns
print("\nSummary statistics:")
print(iris.describe())
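
By default, `.describe()` skips non-numeric columns such as `species`. The sketch below shows two ways to summarize a text column as well, using a small hypothetical `sample` frame with made-up values rather than the full dataset.

```python
import pandas as pd

# Hypothetical sample with two Iris-style columns
sample = pd.DataFrame({
    "petal_length": [1.4, 1.3, 4.7, 4.5, 6.0],
    "species": ["setosa", "setosa", "versicolor", "versicolor", "virginica"],
})

# Default describe() reports only the numeric column
numeric_summary = sample.describe()

# include="all" adds count, unique, top and freq for text columns
all_summary = sample.describe(include="all")

# value_counts() shows how many rows each species has
species_counts = sample["species"].value_counts()

print(numeric_summary)
print(all_summary)
print(species_counts)
```

Checking the counts per category early is useful because very uneven groups can make the per-group statistics that follow harder to compare.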

In [None]:
# Group data by species and compute mean and std for sepal and petal lengths
species_stats = iris.groupby('species').agg({
    'sepal_length': ['mean', 'std'],
    'petal_length': ['mean', 'std']
})
print("\nSpecies statistics:")
print(species_stats)
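
Passing a dictionary of lists to `.agg()` produces two-level column labels such as `('sepal_length', 'mean')`. The sketch below shows how to read one value from such a result and, as an alternative, how named aggregation gives flat column names. The `sample` frame and the output names (`sepal_mean`, `petal_std`, and so on) are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample with Iris-style columns
sample = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "sepal_length": [5.0, 5.2, 6.4, 6.8],
    "petal_length": [1.4, 1.6, 5.5, 5.9],
})

# Dictionary form: the result has two-level (column, statistic) labels
nested = sample.groupby("species").agg({
    "sepal_length": ["mean", "std"],
    "petal_length": ["mean", "std"],
})

# Select a single statistic with a (column, statistic) tuple
setosa_sepal_mean = nested.loc["setosa", ("sepal_length", "mean")]

# Named aggregation: new_name=(source_column, function) gives flat labels
flat = sample.groupby("species").agg(
    sepal_mean=("sepal_length", "mean"),
    sepal_std=("sepal_length", "std"),
    petal_mean=("petal_length", "mean"),
    petal_std=("petal_length", "std"),
)

print(nested)
print(flat)
```

Flat column names tend to be easier to work with when the result is saved back to CSV or merged with another table.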

In [None]:
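# Create boxplots for sepal length and petal length grouped by species
import matplotlib.pyplot as plt  # pandas plotting is built on matplotlib

iris.boxplot(column=['sepal_length', 'petal_length'], by='species')
plt.show()  # render the figure when running outside a notebook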

## 🎯 CSV Analysis Mastery

"Every data scientist's journey begins with CSV files!"

Remember: always inspect your data thoroughly (shape, column types, missing values, duplicates) before starting your analysis to avoid surprises later on!