<a href="https://colab.research.google.com/github/sornpat/build-and-learn-project/blob/main/build_and_learn_pandas_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# 🐼 Build & Learn: Your First Pandas Notebook

Welcome to your first hands-on data science session!

In this notebook, you'll learn how to:
- Load a CSV dataset using `pandas`
- Explore the data: head, shape, summary
- Clean some messy data
- Do basic analysis like filtering, grouping, and sorting

You don’t need to install anything — just run each cell and follow along.

---

💡 If you ever get stuck, ask ChatGPT or your peers in the Meetup!


In [None]:
import pandas as pd
import numpy as np

## 📥 Step 1: Load a sample dataset

In [None]:
# Load from a URL (Titanic dataset)
url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)

# Show first few rows
df.head()
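
If the URL is ever unreachable, the same `pd.read_csv` call can be tried on CSV text held in memory. The sketch below is only an illustration: the three rows are made up, and only the column names are borrowed from the Titanic file.

```python
import io
import pandas as pd

# Hypothetical three-row sample; values are invented for illustration.
csv_text = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,
"""

# read_csv accepts any file-like object, not just a path or URL
sample = pd.read_csv(io.StringIO(csv_text))

# The empty Age field on the last row is read as a missing value (NaN)
print(sample.head())
```

Notice that pandas turns the empty field into `NaN` on its own, which is why Step 3 has missing values to clean up.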

## 🔍 Step 2: Explore the dataset

In [None]:
print(df.shape)  # (rows, columns); print it, since a cell only shows its last expression

df.info()  # column types and non-null counts

df.describe()  # summary stats for numeric columns
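
A few more exploration calls are often useful alongside the ones above. This is a small sketch on a made-up four-row table (the name `toy` and its values are assumptions, not part of the Titanic data), so you can see exactly what each call returns.

```python
import pandas as pd

# Hypothetical toy table with one missing Age
toy = pd.DataFrame({
    "Pclass": [1, 3, 3, 2],
    "Sex": ["female", "male", "male", "female"],
    "Age": [38.0, 22.0, None, 27.0],
})

print(toy.shape)    # (rows, columns)
print(toy.dtypes)   # data type of each column

# How many rows fall into each passenger class
class_counts = toy["Pclass"].value_counts()
print(class_counts)

# Number of distinct values per column (missing values are not counted)
distinct = toy.nunique()
print(distinct)
```

You can try the same calls on `df`, for example `df['Pclass'].value_counts()`.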

## 🧹 Step 3: Clean the data

In [None]:
# Check for missing values (count of nulls per column)
print(df.isnull().sum())

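# Fill missing Age with the median.
# Assign the result back: calling fillna(..., inplace=True) on a selected
# column triggers a warning in recent pandas and may not update df.
df['Age'] = df['Age'].fillna(df['Age'].median())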

# Drop 'Cabin' column (too many missing values)
df.drop(columns=['Cabin'], inplace=True)

# Drop any remaining rows with nulls
df.dropna(inplace=True)

df.info()
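
To see what each cleaning step does, it can help to run them on a tiny table where you can check the result by eye. The sketch below uses a made-up four-row frame (`toy` and its values are assumptions) and assigns results to new variables instead of using `inplace=True`, so the original stays untouched.

```python
import pandas as pd

# Hypothetical toy table with gaps in every column
toy = pd.DataFrame({
    "Age": [22.0, None, 38.0, 30.0],
    "Cabin": [None, "C85", None, None],
    "Embarked": ["S", "C", None, "S"],
})

# Median of the known ages 22, 30, 38
median_age = toy["Age"].median()

# 1) fill missing Age, 2) drop the mostly-empty Cabin column,
# 3) drop any row that still has a missing value
cleaned = toy.assign(Age=toy["Age"].fillna(median_age))
cleaned = cleaned.drop(columns=["Cabin"])
cleaned = cleaned.dropna()

print(cleaned)
```

Here one row is dropped in the last step because its `Embarked` value is missing, while the row with the missing age is kept and filled.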

## 🔎 Step 4: Filter the data

In [None]:
# People older than 60 (display() shows the table even when it is not the last line of the cell)
display(df[df['Age'] > 60])

# Female passengers
df[df['Sex'] == 'female']
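
Filters can also be combined. The sketch below, on a made-up table (`toy` is an assumption), shows `&` for "and", `isin` for matching several values, and `query` as an alternative spelling. Each condition needs its own parentheses when you combine them with `&` or `|`.

```python
import pandas as pd

# Hypothetical toy table
toy = pd.DataFrame({
    "Sex": ["female", "male", "female", "male"],
    "Age": [65.0, 70.0, 30.0, 25.0],
    "Pclass": [1, 2, 3, 3],
})

# Both conditions must hold: female AND older than 60
older_women = toy[(toy["Sex"] == "female") & (toy["Age"] > 60)]

# Match any of several values
first_or_second = toy[toy["Pclass"].isin([1, 2])]

# The same kind of filter written as a query string
not_third = toy.query("Pclass != 3")

print(older_women)
print(first_or_second)
```

On the Titanic data, the first pattern would read `df[(df['Sex'] == 'female') & (df['Age'] > 60)]`.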

## 📊 Step 5: Analyze by group

In [None]:
# Average age by class (printed, since a cell only shows its last expression)
print(df.groupby('Pclass')['Age'].mean())

# Survival rate by gender
df.groupby('Sex')['Survived'].mean()
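
You can compute several statistics per group in one go and then sort the result. This is a minimal sketch on made-up numbers (`toy`, `mean_age`, `survival_rate`, and `passengers` are names chosen for illustration).

```python
import pandas as pd

# Hypothetical toy table
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 3, 3, 3],
    "Age": [40.0, 50.0, 30.0, 20.0, 22.0, 24.0],
    "Survived": [1, 1, 0, 0, 1, 0],
})

# One row per class, one named column per statistic
summary = toy.groupby("Pclass").agg(
    mean_age=("Age", "mean"),
    survival_rate=("Survived", "mean"),
    passengers=("Survived", "size"),
)

# Sort classes from highest to lowest survival rate
ranked = summary.sort_values("survival_rate", ascending=False)
print(ranked)
```

Because `Survived` holds 0s and 1s, its mean is the share of passengers who survived, which is why `mean()` gives a survival rate.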


---

🎉 That's it! You've:
- Loaded and cleaned a real dataset
- Explored, filtered, and grouped data

Next steps:
- Try plotting with `matplotlib` or `seaborn`
- Come up with a small question of your own and answer it using code

🧠 Bonus Challenge:
What is the survival rate of women vs men in each passenger class?
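
Hint: `groupby` accepts a list of columns. The sketch below shows the idea on a made-up table (`toy` and its values are assumptions, not real Titanic results), so the challenge on `df` is still yours to solve.

```python
import pandas as pd

# Hypothetical toy table
toy = pd.DataFrame({
    "Pclass": [1, 1, 1, 1, 3, 3, 3, 3],
    "Sex": ["female", "female", "male", "male",
            "female", "female", "male", "male"],
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Group by two columns at once: one rate per (class, sex) pair
rates = toy.groupby(["Pclass", "Sex"])["Survived"].mean()

# Optional: pivot Sex into columns for a side-by-side table
table = rates.unstack("Sex")
print(table)
```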
