# 04_data_analysis_visualization.ipynb

Welcome to Step 4! In this notebook, we will learn to analyze and visualize data using the powerful **pandas** library and **matplotlib**.

This will prepare your AI to work with real-world data, extract insights, and present results visually.

## Installing required packages

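First, make sure pandas and matplotlib are installed. From a terminal:

```bash
pip install pandas matplotlib
```

Inside the notebook, run the cell below instead if you haven't installed them yet. The leading `!` tells Jupyter to pass the command to the shell.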

In [None]:
!pip install pandas matplotlib

## Importing libraries

Import pandas for data analysis and matplotlib.pyplot for plotting.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

## Loading CSV data into a DataFrame

Use the CSV file from the previous step or any CSV file you have. The cells below assume it has `Name` and `Age` columns.

In [None]:
# Loading people.csv from step 3
df = pd.read_csv('people.csv')
df
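If `people.csv` from step 3 is not at hand, a small stand-in can be built in memory. This is only a sketch: the `Name`, `Age`, and `Role` columns and every value below are made up for illustration and may not match the real file.

```python
import io

import pandas as pd

# Hypothetical sample data; the real people.csv may have different columns.
csv_text = """Name,Age,Role
Alice,30,Engineer
Bob,25,Designer
Carol,35,Engineer
"""

# read_csv accepts any file-like object, so StringIO stands in for a file on disk.
sample_df = pd.read_csv(io.StringIO(csv_text))
print(sample_df.shape)  # (number of rows, number of columns)
```

Assigning `sample_df` to `df` lets the remaining cells run unchanged.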

## Basic data exploration

- View top rows
- Get summary statistics
- Check data types

In [None]:
print(df.head())  # First 5 rows
print(df.describe())  # Summary statistics
print(df.dtypes)  # Data types
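By default `describe()` reports only numeric columns. The sketch below shows a few ways to also inspect text columns and missing values. The DataFrame is a hypothetical stand-in with assumed `Name`, `Age`, and `Role` columns.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [30, 25, 35],
    "Role": ["Engineer", "Designer", "Engineer"],
})

# include="all" adds count/unique/top/freq rows for the text columns.
print(people.describe(include="all"))

# value_counts() tallies how often each category appears.
role_counts = people["Role"].value_counts()
print(role_counts)

# isna().sum() counts missing values per column.
missing = people.isna().sum()
print(missing)
```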

## Filtering data

Select the rows where `Age` is greater than 28.

In [None]:
filtered = df[df['Age'] > 28]
filtered
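Filters can also combine several conditions. The following sketch assumes a `Role` column alongside `Age`; the data is invented for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [30, 25, 35],
    "Role": ["Engineer", "Designer", "Engineer"],
})

# Wrap each condition in parentheses; use & for AND and | for OR.
older_engineers = people[(people["Age"] > 28) & (people["Role"] == "Engineer")]

# query() expresses the same filter as a string.
same_result = people.query("Age > 28 and Role == 'Engineer'")

print(older_engineers)
```

Python's `and` and `or` keywords do not work on whole columns in the bracket form, which is why the element-wise `&` and `|` operators are used there.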

## Simple plotting

Plot each person's age as a bar chart, with one bar per `Name`.

In [None]:
plt.bar(df['Name'], df['Age'])
plt.title('Age by Name')
plt.xlabel('Name')
plt.ylabel('Age')
plt.show()
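To see how ages are spread across the whole dataset rather than per person, a histogram is one option. The sketch below also saves the figure to disk. The ages and the file name `age_histogram.png` are assumptions chosen for illustration.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical ages, made up for illustration.
ages = pd.Series([22, 25, 27, 30, 31, 35, 38, 41], name="Age")

fig, ax = plt.subplots()
ax.hist(ages, bins=4)  # 4 equal-width bins between the min and max age
ax.set_title("Age Distribution")
ax.set_xlabel("Age")
ax.set_ylabel("Number of people")

# savefig infers the image format from the file extension.
out_path = Path("age_histogram.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)  # release the figure once it has been saved
```

In a notebook, drop the `plt.close(fig)` line if you also want the figure displayed under the cell.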

## Exercise for your AI lab partner

1. Load a CSV file with numerical and categorical data.
2. Calculate mean, median, and standard deviation for numeric columns.
3. Filter data based on conditions (e.g., Role == 'Engineer').
4. Create at least two types of plots (bar chart, line chart, scatter plot).
5. Save a plot as an image file.

Practice data wrangling and visualization interactively.
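As a starting point for item 2, the statistics can be computed for all numeric columns in one call. This is a sketch on invented data; the `YearsExperience` column is hypothetical.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
data = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Age": [30, 25, 35],
    "YearsExperience": [5, 2, 11],
    "Role": ["Engineer", "Designer", "Engineer"],
})

# Keep only the numeric columns, then compute all three statistics at once.
numeric = data.select_dtypes(include="number")
stats = numeric.agg(["mean", "median", "std"])
print(stats)
```

Note that pandas' `std` is the sample standard deviation (it divides by n - 1), which can differ from other tools that divide by n.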

## Summary

- Pandas simplifies tabular data manipulation.
- Matplotlib enables powerful visualizations.
- Combining them prepares your AI for insightful data work.

Next, we can explore databases, web scraping, or automation!