# 📘 Day 2: Working with pandas and DataFrames

## 1. Introduction to pandas
- What is pandas?
- Why use DataFrames instead of lists or dictionaries?
- Importing pandas

In [None]:
import pandas as pd
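
To make the "DataFrames instead of lists or dictionaries" question concrete, here is a minimal sketch that answers the same question both ways. The records are invented for illustration and are not taken from the school safety dataset.

```python
import pandas as pd

# Hypothetical records, invented for illustration only
records = [
    {"school_name": "School A", "borough": "Brooklyn", "enrollment": 1200},
    {"school_name": "School B", "borough": "Bronx", "enrollment": 800},
    {"school_name": "School C", "borough": "Brooklyn", "enrollment": 400},
]

# Plain Python: filter the records by hand, then average
brooklyn_values = [r["enrollment"] for r in records if r["borough"] == "Brooklyn"]
mean_plain = sum(brooklyn_values) / len(brooklyn_values)

# pandas: the same question as one column-wise expression
demo = pd.DataFrame(records)
mean_pandas = demo.loc[demo["borough"] == "Brooklyn", "enrollment"].mean()

print(mean_plain, mean_pandas)
```

Both approaches give the same number; the DataFrame version tends to stay readable as the questions get more involved.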

## 2. Loading a Dataset
- Reading data from a CSV file

In [None]:
df = pd.read_csv("school-safety-report.csv")

df.head()

## 3. Cleaning Column Names
- Why clean column names (spaces, casing, consistency)?
- Common methods: `str.lower()` and `str.replace()` on `df.columns`

In [None]:
# Lowercase every column name
df.columns = df.columns.str.lower()
# Replace spaces with underscores so names work well as identifiers
df.columns = df.columns.str.replace(" ", "_")
df.head()
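
Column names in real files often carry leading or trailing spaces as well. The sketch below chains `str.strip()` in front of the two steps above; the messy headers are hypothetical and are not the actual headers of the CSV.

```python
import pandas as pd

# Hypothetical messy headers, invented for illustration
messy = pd.DataFrame(columns=[" School Name", "Borough ", "Total Enrollment"])

# Strip outer whitespace, lowercase, then replace inner spaces with underscores
messy.columns = (
    messy.columns
    .str.strip()
    .str.lower()
    .str.replace(" ", "_", regex=False)
)

print(list(messy.columns))
```

Stripping first matters: without it, `" School Name"` would become `"_school_name"`.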

## 4. Exploring Data
- Number of unique values: `nunique()`
- Count of values: `count()`
- Mean of numerical columns: `mean()`
- Descriptive stats: `describe()`

In [None]:
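# A notebook cell displays only its last expression, so print the earlier results
print(df["school_name"].nunique())  # number of distinct school names
print(df["enrollment"].count())     # number of non-missing enrollment values
print(df["enrollment"].mean())      # average enrollment
df.describe()                       # summary statistics for numeric columns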
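
One detail worth seeing on a small example: `count()` counts only non-missing values, and `mean()` skips missing values by default. The sample below is invented for illustration.

```python
import pandas as pd

# Hypothetical sample, invented for illustration; None marks a missing value
demo = pd.DataFrame({
    "school_name": ["School A", "School B", "School B", "School C"],
    "enrollment": [1200, 800, None, 400],
})

n_schools = demo["school_name"].nunique()  # distinct names: "School B" counts once
n_values = demo["enrollment"].count()      # non-missing entries only
avg = demo["enrollment"].mean()            # the missing value is skipped

print(n_schools, n_values, avg)
```

Here the column has four rows but `count()` reports three, which is a quick way to spot missing data.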

## 5. Selecting Data
- Selecting a single column
- Selecting multiple columns
- Selecting rows by index (`iloc`)
- Selecting rows/columns by labels (`loc`)

In [None]:
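# A notebook cell displays only its last expression, so print the earlier results
print(df["school_name"].head())               # single column (a Series)
print(df[["school_name", "borough"]].head())  # multiple columns (a DataFrame)
print(df.iloc[0])                             # first row, selected by position
df.loc[0, "school_name"]                      # row labelled 0, column "school_name"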
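
With the default index, position 0 and label 0 are the same row, which hides the difference between `iloc` and `loc`. The sketch below uses an invented DataFrame with a non-default index to show it.

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame(
    {
        "school_name": ["School A", "School B", "School C"],
        "enrollment": [1200, 800, 400],
    },
    index=[10, 20, 30],  # labels that differ from positions 0, 1, 2
)

first_by_position = demo.iloc[0]["school_name"]  # row at position 0
by_label = demo.loc[20, "school_name"]           # row labelled 20
label_slice = demo.loc[10:20, "enrollment"]      # label slices include the end label
position_slice = demo.iloc[0:1]["enrollment"]    # position slices exclude the end

print(first_by_position, by_label, len(label_slice), len(position_slice))
```

In this example `demo.loc[0]` would raise a `KeyError`, because no row carries the label 0.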

## 6. Filtering Data
- Filter rows with conditions
- Combine multiple conditions with `&` (and), `|` (or)

In [None]:
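# Rows where borough is Brooklyn (printed, since a cell shows only its last expression)
print(df[df["borough"] == "Brooklyn"].head())
# Each condition needs its own parentheses when combined with & or |
df[(df["enrollment"] > 1000) & (df["borough"] == "Bronx")].head()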
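
The `|` operator from the list above follows the same pattern as `&`. The sketch below also shows `isin()` and `~` as related tools; the data is invented for illustration.

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "borough": ["Brooklyn", "Bronx", "Queens", "Bronx"],
    "enrollment": [1200, 800, 400, 1500],
})

# OR: rows that are in Brooklyn or have more than 1000 students
either = demo[(demo["borough"] == "Brooklyn") | (demo["enrollment"] > 1000)]

# isin: shorthand for several OR-ed equality checks on one column
in_two_boroughs = demo[demo["borough"].isin(["Brooklyn", "Queens"])]

# ~ negates a condition
not_bronx = demo[~(demo["borough"] == "Bronx")]

print(either["school_name"].tolist())
```

The parentheses around each condition are required, because `&` and `|` bind more tightly than `==` and `>` in Python.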

## 7. Creating New DataFrames from Filtering/Grouping
- Save results of a filter as a new DataFrame
- Save results of grouping as a new DataFrame

In [None]:
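# Keep only the Brooklyn rows as their own DataFrame
brooklyn_schools = df[df["borough"] == "Brooklyn"]

# Double brackets keep the result a DataFrame; reset_index turns "borough" back into a column
avg_enrollment_by_borough = df.groupby("borough")[["enrollment"]].mean().reset_index()

print(brooklyn_schools.head())  # printed, since a cell shows only its last expression
avg_enrollment_by_borough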
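
If you plan to add or change columns on a filtered DataFrame, a common precaution is to call `.copy()` on the filter result. Depending on the pandas version, modifying a filtered result directly can trigger a `SettingWithCopyWarning`. A minimal sketch with invented data:

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B", "School C"],
    "borough": ["Brooklyn", "Bronx", "Brooklyn"],
    "enrollment": [1200, 800, 400],
})

# .copy() makes the filtered result independent of the original
brooklyn_demo = demo[demo["borough"] == "Brooklyn"].copy()
brooklyn_demo["is_large"] = brooklyn_demo["enrollment"] > 1000

print(brooklyn_demo)
print("is_large" in demo.columns)  # the original DataFrame is unchanged
```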

## 8. Modifying DataFrames
- Adding new columns
- Dropping columns

In [None]:
# Illustrative projection only: assumes enrollment grows 10% in total over 10 years
df["enrollment_in_10_years"] = df["enrollment"] * 1.1
df.head()
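
The cell above covers adding a column. For the "Dropping columns" bullet, a minimal sketch using `drop(columns=...)` on invented data could look like this:

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B"],
    "enrollment": [1200, 800],
})
demo["enrollment_in_10_years"] = demo["enrollment"] * 1.1

# drop returns a new DataFrame; demo itself keeps the column
trimmed = demo.drop(columns=["enrollment_in_10_years"])

print(list(trimmed.columns))
print(list(demo.columns))
```

To make the change stick, assign the result back, for example `demo = demo.drop(columns=[...])`.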

## 9. Sorting Data
- Sorting by a single column
- Sorting by multiple columns

In [None]:
df.sort_values("enrollment", ascending=False).head()
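
The cell above sorts by a single column. For sorting by multiple columns, `sort_values` accepts a list of column names and a matching list of directions. A sketch on invented data:

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "borough": ["Brooklyn", "Bronx", "Brooklyn", "Bronx"],
    "enrollment": [400, 800, 1200, 1500],
})

# Borough A to Z first, then largest enrollment first within each borough
ordered = demo.sort_values(["borough", "enrollment"], ascending=[True, False])

print(ordered["school_name"].tolist())
```

Like most pandas methods, `sort_values` returns a new DataFrame and leaves the original order untouched.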

## 10. Aggregation & Grouping
- Using `groupby()`
- Applying aggregations like `mean`, `sum`, `count`, `nunique`

In [None]:
df.groupby("borough")["enrollment"].agg(["count","mean","nunique"])
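
When the aggregations span more than one column, or you want descriptive output names, named aggregation is one option. The sketch below uses invented data, and the output column names (`schools`, `total_enrollment`, `avg_enrollment`) are made up for this example.

```python
import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "borough": ["Brooklyn", "Bronx", "Brooklyn", "Bronx"],
    "enrollment": [400, 800, 1200, 1500],
})

# Each keyword is an output column: (input column, aggregation name)
summary = (
    demo.groupby("borough")
    .agg(
        schools=("school_name", "nunique"),
        total_enrollment=("enrollment", "sum"),
        avg_enrollment=("enrollment", "mean"),
    )
    .reset_index()
)

print(summary)
```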

## 11. Saving and Loading Data
- Exporting DataFrame to CSV
- Reading CSV into DataFrame

In [None]:
brooklyn_schools.to_csv("brooklyn_schools.csv", index=False)
pd.read_csv("brooklyn_schools.csv").head()
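
To see why `index=False` is used when exporting, the sketch below writes a small invented DataFrame to CSV text both ways and reads each back. It uses `io.StringIO` instead of a file on disk so that it leaves nothing behind.

```python
import io

import pandas as pd

# Hypothetical sample, invented for illustration
demo = pd.DataFrame({
    "school_name": ["School A", "School B"],
    "enrollment": [1200, 800],
})

# to_csv returns the CSV text as a string when no path is given
with_index = demo.to_csv()
without_index = demo.to_csv(index=False)

reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(list(reloaded_with.columns))     # includes an extra "Unnamed: 0" column
print(list(reloaded_without.columns))  # matches the original columns
```

Without `index=False`, the row index is written as an unnamed first column and comes back as data when the file is reloaded.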

## 12. Summary & Next Steps
- Today we learned: loading data, cleaning column names, exploring and selecting data, filtering and grouping, creating and modifying DataFrames, sorting, saving to CSV, and using summary methods (`count`, `mean`, `nunique`, `describe`).
- Next time (Day 3): deeper into **data cleaning** (missing values, duplicates, data types).