# 📂 Notebook 03: Loading and Exploring Data

Welcome to the real world — where data lives in messy CSVs and your job is to make sense of it.

In this notebook, you’ll:
- Load a CSV file using `pd.read_csv()`
- Use `.head()`, `.tail()`, `.info()`, and `.describe()` to explore your data
- Identify potential issues (missing values, bad types, oddball rows)

Let’s get our hands dirty.

---

In [None]:
import pandas as pd

## 📥 Load a CSV

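To load your own data, pass a CSV path or URL to `pd.read_csv()`. For a quick test, the cell below uses one of seaborn's sample datasets instead. Note that `sns.load_dataset()` downloads the data on first use, so it needs an internet connection.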
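
As a minimal sketch of the `pd.read_csv()` route, the snippet below reads a small made-up CSV from an in-memory string, so it runs without any file on disk. The column names and values are hypothetical; with real data you would pass a path or URL instead of the `io.StringIO` object.

```python
import io
import pandas as pd

# Hypothetical CSV contents, standing in for a file on disk or at a URL.
csv_text = """name,age,city
Ada,36,London
Grace,,New York
Linus,28,Helsinki
"""

# pd.read_csv accepts a path, a URL, or any file-like object.
people = pd.read_csv(io.StringIO(csv_text))
print(people.head())
```

The empty `age` field for the second row is read as `NaN`, which is exactly the kind of gap the later cells help you find.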

In [None]:
# Example with seaborn's Titanic dataset
import seaborn as sns
df = sns.load_dataset("titanic")
df.head()

## 📊 Peek at the Data

In [None]:
# First and last few rows
print("First 5 rows:")
print(df.head())

print("\nLast 5 rows:")
print(df.tail())

## 🧠 Understand Structure

In [None]:
# Dimensions and columns
print("Shape:", df.shape)
print("\nColumns:", df.columns.tolist())

## 🧼 Data Types & Missing Values

In [None]:
# Data types and null counts
print("\nInfo:")
df.info()

# How many missing values per column?
print("\nMissing values per column:")
print(df.isnull().sum())
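
Raw counts are hard to compare across columns, so it can help to look at the share of missing values instead. The sketch below does this on a hypothetical toy frame (the values are invented for illustration); the same one-liner should work on `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are made up for illustration.
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "deck": [None, None, "C", None],
    "fare": [7.25, 71.28, 8.05, 53.10],
})

# isnull() yields booleans; their mean is the fraction of missing values.
missing_pct = toy.isnull().mean() * 100
print(missing_pct.sort_values(ascending=False))
```

Sorting in descending order puts the worst-affected columns at the top, which is usually where cleaning starts.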

## 📈 Summary Statistics

In [None]:
df.describe(include="all")
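
Summary statistics can also hint at bad types: a column you expect to be numeric but that reports `unique` and `top` instead of `mean` is probably stored as text. One possible way to track down the offending rows is sketched below on a hypothetical `fare` column; the data is invented, and `pd.to_numeric(..., errors="coerce")` is just one approach among several.

```python
import pandas as pd

# Hypothetical column where one entry is not a number.
raw = pd.DataFrame({"fare": ["7.25", "71.28", "unknown", "8.05"]})
print(raw.dtypes)  # fare is stored as text, not as a float

# errors="coerce" turns unparseable entries into NaN instead of raising.
fare_numeric = pd.to_numeric(raw["fare"], errors="coerce")

# Rows that failed to parse are the oddball rows worth inspecting.
oddballs = raw[fare_numeric.isnull()]
print(oddballs)
```

This check assumes the column had no missing values to begin with; otherwise genuinely missing entries would show up alongside the unparseable ones.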

## ✏️ Renaming Columns (Optional but fun)

In [None]:
# Rename 'sex' to 'gender' for clarity
df.rename(columns={"sex": "gender"}, inplace=True)
df.head(2)
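
As an alternative to `inplace=True`, `rename` can return a new DataFrame that you assign to a name yourself. The sketch below uses a hypothetical two-column frame (`age_years` is an invented name) to show that the original is left untouched.

```python
import pandas as pd

# Hypothetical toy frame for illustration.
toy = pd.DataFrame({"sex": ["male", "female"], "age": [22, 38]})

# rename returns a new DataFrame; the original keeps its column names.
renamed = toy.rename(columns={"sex": "gender", "age": "age_years"})

print(toy.columns.tolist())
print(renamed.columns.tolist())
```

Keeping the original around makes it easier to re-run a cell without wondering which version of the column names you currently have.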

---
## 🔍 Your Turn

1. Load a dataset of your choice (`pd.read_csv()` or `sns.load_dataset()`).
2. Print the first and last 5 rows.
3. Show `.info()` and `.describe()` results.
4. Print the column names. Rename one of them.

🎯 **Bonus:** What percentage of rows have *any* missing values?

```python
# HINT:
df.isnull().any(axis=1).mean() * 100  # percent of rows with any NaNs
```


In [None]:
# Your exploratory code playground!

---
## 🎓 Why This Matters

Every data science project begins here — importing and inspecting data. If your data is a mess (spoiler: it always is), you need to know how to check it, clean it, and prep it.

Next up: slicing and dicing — using `.loc[]`, `.iloc[]`, and boolean masks to select what you want and ignore what you don't.