# 🧠 Lecture 2.4: Recoding and Renaming Variables in Pandas

### Learning Objectives
By the end of this 15-minute lecture, you will be able to:
- Recode values using `.replace()` and `.map()`
- Rename columns for clarity using `.rename()`
- Explain how and why to use `pd.Categorical` for ordered categories


In [None]:
import pandas as pd

# Sample dataset
df = pd.DataFrame({
    'sex': [0, 1, 1, 0],
    'smoker': [1, 0, 0, 1],
    'edu': [1, 2, 3, 4]
})

df.head()


## 🔁 Recoding values using `.replace()`

In [None]:
# 0 = Female, 1 = Male
df['sex'] = df['sex'].replace({0: 'Female', 1: 'Male'})

# 0 = No, 1 = Yes
df['smoker'] = df['smoker'].replace({0: 'No', 1: 'Yes'})

df
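
The two column-by-column calls above can also be written as a single `DataFrame.replace()` call with a nested dict. The following is a minimal sketch, not the lecture's own method; the names `raw` and `recoded` are made up for illustration so that `df` is left untouched.

```python
import pandas as pd

# Hypothetical copy of the coded columns, kept separate from `df`
raw = pd.DataFrame({
    'sex': [0, 1, 1, 0],
    'smoker': [1, 0, 0, 1],
})

# Outer keys are column names; each inner dict maps old value -> new value.
recoded = raw.replace({
    'sex': {0: 'Female', 1: 'Male'},
    'smoker': {0: 'No', 1: 'Yes'},
})

print(recoded)
```

`.replace()` returns a new object, so `raw` still holds the original numeric codes after this call.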


## 🔁 Recoding values using `.map()` (alternative)

In [None]:
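# .map() with a dict is a Series method and recodes values much like .replace().
# Key difference: any value missing from the dict becomes NaN instead of being kept,
# so rerunning this cell (when 'sex' already holds 'F'/'M') would turn the column into NaN.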
df['sex'] = df['sex'].map({'Female': 'F', 'Male': 'M'})
df
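
The practical difference between the two methods shows up when a value has no entry in the dict. Here is a small sketch of that behavior; the code `9` is an invented "unknown" value and the variable names are illustrative only.

```python
import pandas as pd

# 9 is a hypothetical code that the recoding dict does not cover
codes = pd.Series([0, 1, 9])
labels = {0: 'Female', 1: 'Male'}

# .replace() leaves unmatched values as they are
replaced = codes.replace(labels)

# .map() turns unmatched values into NaN
mapped = codes.map(labels)

print(replaced.tolist())
print(mapped.isna().tolist())
```

A reasonable rule of thumb: reach for `.map()` when an unexpected code should surface as missing, and `.replace()` when unlisted values should pass through unchanged.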


## ✏️ Renaming columns

In [None]:
df = df.rename(columns={
    'sex': 'Sex',
    'smoker': 'SmokerStatus',
    'edu': 'EducationLevel'
})
df
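
By default, `.rename()` silently ignores keys that do not match any column, so a typo in the old name goes unnoticed. The sketch below uses a throwaway frame `demo` and a deliberately misspelled key `'sexx'` (both invented for this example) to show the `errors='raise'` option.

```python
import pandas as pd

# Hypothetical frame used only for this demonstration
demo = pd.DataFrame({'sex': [0, 1], 'edu': [1, 2]})

# Default behavior: the misspelled key 'sexx' is ignored, nothing is renamed.
unchanged = demo.rename(columns={'sexx': 'Sex'})
print(list(unchanged.columns))

# With errors='raise', the same typo raises a KeyError instead.
typo_caught = False
try:
    demo.rename(columns={'sexx': 'Sex'}, errors='raise')
except KeyError as exc:
    typo_caught = True
    print('rename failed:', exc)
```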


## 🗂 Using `pd.Categorical` for ordered categories

In [None]:
# Suppose EducationLevel is 1=Primary, 2=Secondary, 3=College, 4=Graduate
edu_order = ['Primary', 'Secondary', 'College', 'Graduate']

df['EducationLevel'] = pd.Categorical(
    df['EducationLevel'].map({1: 'Primary', 2: 'Secondary', 3: 'College', 4: 'Graduate'}),
    categories=edu_order,
    ordered=True
)

df.dtypes
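
Declaring the order is what lets pandas compare and sort the levels by meaning rather than alphabetically. The following sketch builds a standalone Series `edu` (an illustrative name, with values chosen for the example) to show what an ordered categorical allows.

```python
import pandas as pd

edu_order = ['Primary', 'Secondary', 'College', 'Graduate']

# Standalone ordered categorical, separate from the lecture's `df`
edu = pd.Series(pd.Categorical(
    ['Graduate', 'Primary', 'College', 'Secondary'],
    categories=edu_order,
    ordered=True,
))

# Comparisons follow the declared order, not alphabetical order
at_least_college = edu >= 'College'

# Sorting also follows the declared order
sorted_edu = edu.sort_values()

# Integer position of each value within edu_order
codes = edu.cat.codes

print(at_least_college.tolist())
print(sorted_edu.tolist())
print(codes.tolist())
```

With `ordered=False`, a comparison such as `edu >= 'College'` would raise a `TypeError`, and sorting would still follow the category order but carry no ranking meaning.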


## ✅ Summary
- Use `.replace()` and `.map()` to recode variable values; `.replace()` keeps unmatched values, while `.map()` turns them into `NaN`
- Use `.rename()` to clean up column names
- Use `pd.Categorical` to specify ordering and levels of categorical variables

👉 These tools are essential for cleaning and preparing survey or registry data for analysis.
