# 🐼 Pandas Lecture: Core Functions in Action
**Dataset**: Titanic (`sns.load_dataset("titanic")`)

Learn essential Pandas functions used in real-world data work.

## 📥 1. Load the Dataset

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset("titanic")
df.head()

## 🔎 2. DataFrame Exploration

In [1]:
df.head()
df.tail()
df.shape
df.columns
df.dtypes
df.info()
df.describe()
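
A notebook cell echoes only the value of its last expression, so most of the lines above would show nothing when run together (`df.info()` is the exception, because it prints its own summary). A minimal sketch of one way to see every result from a single cell, using a small made-up frame (`toy` and its values are hypothetical stand-ins, not the Titanic data):

```python
import pandas as pd

# Hypothetical toy frame standing in for the Titanic data
toy = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0],
    "fare": [7.25, 71.28, 7.92, 53.10],
    "sex": ["male", "female", "female", "female"],
})

# Print each intermediate result explicitly instead of relying on the cell echo
print(toy.shape)             # (rows, columns) tuple
print(toy.columns.tolist())  # column names as a plain list
print(toy.dtypes)            # dtype of each column
toy.info()                   # prints its summary itself and returns None
summary = toy.describe()     # numeric columns only by default
print(summary)
```

The same pattern should work with `df` once the dataset from section 1 has been loaded.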


🎯 **Student Task**:
- Show first and last 7 rows
- List column names and data types

## 🔍 3. Selecting and Filtering

In [None]:
df['age'].head()
df[['sex', 'age', 'survived']].head()
df.loc[0:5, ['sex', 'age']]
df.iloc[0:5, 1:4]

df[df['age'] > 60]
df[(df['sex'] == 'female') & (df['fare'] > 100)]
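
The two slices above look alike but behave differently: `loc` slices by label and includes the end label, while `iloc` slices by position and excludes the end position. The sketch below illustrates this, plus the boolean-mask pattern, on a small made-up frame (`toy` and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy frame with the default 0..6 integer index
toy = pd.DataFrame({
    "sex": ["male", "female", "female", "male", "female", "male", "female"],
    "age": [22.0, 38.0, 26.0, 65.0, 71.0, 54.0, 19.0],
    "fare": [7.25, 71.28, 7.92, 120.0, 150.5, 51.86, 263.0],
})

by_label = toy.loc[0:5, ["sex", "age"]]  # label slice: end label 5 is included
by_position = toy.iloc[0:5, 1:3]         # position slice: end position 5 is excluded

# Each condition needs its own parentheses; & is the element-wise AND
rich_women = toy[(toy["sex"] == "female") & (toy["fare"] > 100)]

print(len(by_label), len(by_position))
print(rich_women)
```

With the default integer index, `by_label` holds six rows and `by_position` holds five.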

🎯 **Student Task**: Select all males from first class older than 50

## 🧹 4. Data Cleaning

In [None]:
df_clean = df.drop(['deck'], axis=1)
df_clean.rename(columns={'sibsp': 'siblings_spouses_aboard'}, inplace=True)
df_clean['age'] = df_clean['age'].astype(float)
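
As an alternative to `inplace=True`, the same three steps can be chained, since `drop`, `rename`, and `astype` each return a new DataFrame. A rough sketch on a made-up frame (`toy` is hypothetical, and its `age` column is deliberately stored as strings so that the conversion has a visible effect):

```python
import pandas as pd

# Hypothetical toy frame; 'age' is stored as text on purpose
toy = pd.DataFrame({"sibsp": [1, 0], "deck": ["C", None], "age": ["22", "38"]})

cleaned = (
    toy.drop(columns=["deck"])  # equivalent to drop(["deck"], axis=1)
       .rename(columns={"sibsp": "siblings_spouses_aboard"})
       .astype({"age": float})  # convert only the 'age' column
)

print(cleaned.dtypes)
```

Because nothing is modified in place, `toy` keeps its original columns and the chain can be rerun safely.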

🎯 **Student Task**: Drop `embark_town`, rename `parch` to `parents_children`

## ❓ 5. Missing Value Handling

In [None]:
df_clean.isnull().sum()
# Assign the result back; fillna(inplace=True) on a column selection may not update df_clean
df_clean['age'] = df_clean['age'].fillna(df_clean['age'].median())
df_clean.dropna(subset=['embarked'], inplace=True)
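
To make the effect of each step visible, here is a small sketch on made-up data (`toy` and its values are assumptions): count the gaps, fill the numeric column with its median, then drop rows that still lack a category.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one gap in each column
toy = pd.DataFrame({
    "age": [22.0, np.nan, 30.0, 40.0],
    "embarked": ["S", "C", None, "S"],
})

missing_before = toy.isnull().sum()  # missing values per column

# Fill missing ages with the median of the observed ages (30.0 here)
toy["age"] = toy["age"].fillna(toy["age"].median())

# Keep only the rows where 'embarked' is present
toy = toy.dropna(subset=["embarked"])

print(missing_before)
print(toy)
```

The order matters here: the median is computed before any rows are dropped, so the dropped row still contributes its age.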

🎯 **Student Task**: Fill missing `embark_town` values in the original `df` with the column's most frequent value

## 🛠️ 6. Feature Engineering

In [None]:
df_clean['family_size'] = df_clean['siblings_spouses_aboard'] + df_clean['parch']
df_clean['is_child'] = df_clean['age'].apply(lambda x: x < 12)
df_clean['sex_code'] = df_clean['sex'].map({'male': 0, 'female': 1})
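
Two edge cases are worth knowing when building columns like these: a comparison against a missing value yields `False`, and `map` returns `NaN` for any label that is not in the dictionary. A minimal sketch on made-up data (`toy` and the `"unknown"` label are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with a missing age and an unmapped label
toy = pd.DataFrame({
    "sibsp": [1, 0, 3],
    "parch": [0, 2, 1],
    "age": [8.0, np.nan, 40.0],
    "sex": ["male", "female", "unknown"],
})

toy["family_size"] = toy["sibsp"] + toy["parch"]            # element-wise sum
toy["is_child"] = toy["age"] < 12                           # NaN compares as False
toy["sex_code"] = toy["sex"].map({"male": 0, "female": 1})  # unmapped labels become NaN

print(toy)
```

The vectorized comparison `toy["age"] < 12` gives the same result as `apply(lambda x: x < 12)` without calling a Python function per row.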

🎯 **Student Task**: Create column `is_elderly` for age > 65

## 📊 7. Aggregation & Grouping

In [None]:
df_clean['sex'].value_counts()
df_clean.groupby('sex')['survived'].mean()
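# 'class' is categorical; observed=True keeps only the combinations present in the data
df_clean.groupby(['class', 'sex'], observed=True)['age'].agg(['mean', 'count'])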
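
The sketch below walks through the same three calls on a tiny made-up frame, so the results can be checked by hand (`toy` and its values are assumptions; `pclass` stands in for the class column):

```python
import pandas as pd

# Hypothetical toy frame: 2 women, 3 men, one missing age
toy = pd.DataFrame({
    "pclass": [1, 1, 3, 3, 3],
    "sex": ["female", "male", "female", "male", "male"],
    "survived": [1, 0, 1, 0, 0],
    "age": [30.0, 50.0, 20.0, 40.0, None],
})

counts = toy["sex"].value_counts()                     # rows per category
survival_rate = toy.groupby("sex")["survived"].mean()  # mean of 0/1 = share of survivors

# One row per (pclass, sex) pair; 'count' ignores missing ages
age_stats = toy.groupby(["pclass", "sex"])["age"].agg(["mean", "count"])

print(counts)
print(survival_rate)
print(age_stats)
```

Because `survived` is coded as 0/1, its group mean reads directly as a survival rate.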

🎯 **Student Task**: Find average fare paid by each class

## 🔗 8. Combining Data

In [None]:
df1 = df_clean[['sex', 'age']]
df2 = df_clean[['fare', 'survived']]
pd.concat([df1, df2], axis=1).head()
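
The example above lines up cleanly because both subsets share the same index. With `axis=1`, `concat` aligns rows on index labels rather than on position, which the following sketch illustrates with two made-up frames (`left` and `right` are hypothetical):

```python
import pandas as pd

left = pd.DataFrame({"sex": ["male", "female", "male"], "age": [22.0, 38.0, 26.0]})
right = pd.DataFrame({"fare": [71.28, 7.92]}, index=[1, 2])  # no row for label 0

# axis=1 places frames side by side, matching rows by index label
side_by_side = pd.concat([left, right], axis=1)

# The default axis=0 stacks rows; ignore_index=True renumbers them 0..n-1
stacked = pd.concat([left, left], ignore_index=True)

print(side_by_side)
print(stacked)
```

Label 0 has no match in `right`, so its `fare` is `NaN` in `side_by_side`.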

🎯 **Student Task**: Create two column subsets and combine them with `concat()`

## 📊 9. Visualization

In [None]:
df_clean['age'].plot.hist(bins=20, title='Age Distribution')
plt.xlabel('Age')
plt.show()

df_clean['sex'].value_counts().plot.pie(autopct='%1.1f%%')
plt.ylabel('')
plt.show()

🎯 **Student Task**: Plot histogram of fare distribution

## 📘 Final Assignment: Use `tips` Dataset

🎯 **Instructions**:
1. Load the dataset with `sns.load_dataset("tips")` and explore it.
2. Use 10 functions from this lecture.
3. Create new columns:
   - `bill_per_person` = `total_bill` / `size`
   - `is_weekend` = True if `day` is Sat/Sun
4. Group by `day` and analyze the tip and bill amounts.
5. Plot a pie chart of smokers and a histogram of tips.