# 📊 Module 2 — Data Manipulation with NumPy & Pandas

> Your gateway to real data science work: learn to clean, filter, group, and transform tabular data with NumPy and pandas.

## 🚀 How to use this notebook
- Run the cell below to execute the full lesson from `2_data_manipulation.py` (the script must be in the notebook's working directory).
- Use the Playground section to experiment with different datasets.
- Complete the practice tasks to solidify your learning.

```python
# 🔁 Run the original script (source of truth)
%run 2_data_manipulation.py
```

## 🧪 Playground
Try creating your own data and experimenting with filters, groupby operations, and data cleaning.

```python
import pandas as pd
import numpy as np  # not used yet; handy for your own experiments

# Create sample data: one dict key per column
data = {
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'age': [25, 30, 35, 28],
    'city': ['NYC', 'LA', 'Chicago', 'Boston'],
    'salary': [70000, 80000, 90000, 75000]
}
df = pd.DataFrame(data)
print(df)

# Aggregate a single column
print('\nAverage age:', df['age'].mean())

# Boolean indexing: keep rows where salary > 75000, then select the 'name' column
print('High earners (>75k):', df.loc[df['salary'] > 75000, 'name'].tolist())
```
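
The playground cell above only filters rows. As a starting point for the groupby experiments, here is a minimal sketch. The `team` column and its values are hypothetical and added only so there is something to group by; they are not part of the lesson's data.

```python
import pandas as pd

# Hypothetical sample data: 'team' is an illustrative column, not from the lesson
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'team': ['Data', 'Data', 'Web', 'Web'],
    'salary': [70000, 80000, 90000, 75000],
})

# Named aggregation: one output row per team, one output column per keyword
summary = df.groupby('team').agg(
    avg_salary=('salary', 'mean'),
    headcount=('name', 'count'),
)
print(summary)
```

Named aggregation keeps the output column names explicit, which tends to be easier to read than renaming columns after the fact.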

## 🎯 Practice Tasks (Real-World Scenarios)
- Load a messy CSV file and clean it (handle missing values, duplicates).
- Group sales data by month and calculate total revenue per month.
- Create a pivot table showing average performance by department and region.
- Merge two datasets (employees + departments) and analyze combined metrics.
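
If you are unsure where to start, the sketch below shows one possible shape for each of the four tasks above. Everything in it is an assumption made for illustration: the inlined CSV, the column names, using `revenue` as a stand-in for "performance", and the `employees` and `departments` tables. Treat it as a starter, not a reference solution, and swap in your own files.

```python
import io
import pandas as pd

# Hypothetical messy CSV, inlined so the sketch runs without a file on disk
raw = io.StringIO(
    "order_id,date,region,department,revenue\n"
    "1,2024-01-05,East,Toys,100\n"
    "1,2024-01-05,East,Toys,100\n"   # exact duplicate of the previous row
    "2,2024-01-20,West,Toys,\n"      # missing revenue
    "3,2024-02-03,East,Books,250\n"
    "4,2024-02-14,West,Books,150\n"
)
sales = pd.read_csv(raw, parse_dates=['date'])

# Task 1: drop exact duplicates, then fill missing revenue with the median
sales = sales.drop_duplicates()
sales['revenue'] = sales['revenue'].fillna(sales['revenue'].median())

# Task 2: total revenue per calendar month
monthly = sales.groupby(sales['date'].dt.to_period('M'))['revenue'].sum()

# Task 3: average revenue by department (rows) and region (columns)
pivot = sales.pivot_table(
    values='revenue', index='department', columns='region', aggfunc='mean'
)

# Task 4: merge two hypothetical tables on a shared key, then aggregate
employees = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'dept_id': [1, 1, 2],
    'salary': [70000, 80000, 90000],
})
departments = pd.DataFrame({'dept_id': [1, 2], 'dept_name': ['Data', 'Web']})
merged = employees.merge(departments, on='dept_id', how='left')
avg_salary = merged.groupby('dept_name')['salary'].mean()

print(monthly, pivot, avg_salary, sep='\n\n')
```

Filling with the median is only one choice; depending on the data, dropping the row or flagging it for review may be more appropriate.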

## ✅ Before you move on
- [ ] I can load data from CSV/Excel and inspect its structure.
- [ ] I can filter DataFrames using boolean indexing.
- [ ] I can group data and apply aggregate functions (sum, mean, count).
- [ ] I can handle missing values (fillna, dropna, interpolate).
- [ ] I understand when to use NumPy vs Pandas.
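
To self-check the last two items, here is a small sketch comparing the three missing-value strategies and contrasting a plain NumPy computation with a pandas one. The readings, prices, and item names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical readings with two gaps
readings = pd.Series([10.0, np.nan, 14.0, np.nan, 20.0])

dropped = readings.dropna()                 # removes the two missing entries
filled = readings.fillna(readings.mean())   # replaces gaps with the mean of the known values
interpolated = readings.interpolate()       # linear by default: fills gaps from neighbours

print(dropped.tolist())
print(filled.tolist())
print(interpolated.tolist())

# NumPy: homogeneous numeric arrays and element-wise math, no labels
prices = np.array([2.0, 3.5, 5.0])
quantities = np.array([10, 4, 2])
totals = prices * quantities

# pandas: labelled, mixed-type tabular data built on top of NumPy arrays
orders = pd.DataFrame({'item': ['pen', 'book', 'lamp'], 'total': totals})
print(orders.sort_values('total', ascending=False))
```

A rough rule of thumb: reach for NumPy when the data is a purely numeric array, and for pandas when rows and columns carry labels or mixed types.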