
# STEP 9: Rename Columns (Standardization) – Complete Pandas Guide

This notebook walks through the practical ways to **rename** and **standardize**
column names in a Pandas DataFrame, from single-column renames to bulk cleanup.

Focus: **readability, consistency, joins, analytics, and best practices**.


In [None]:

import pandas as pd
import numpy as np


## 1. Sample Dataset with Messy Column Names

In [None]:

# The names below mix spaces, a hyphen, special characters, and a trailing space.
df = pd.DataFrame({
    "Order ID": [1001, 1002, 1003],
    "Customer Name": ["Alice", "Bob", "Charlie"],
    "Total Sales($)": [2500, 1800, 2200],
    "Order-Date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "Ship City ": ["Mumbai", "Delhi", "Pune"]  # note the trailing space
})
df


## 2. Rename Columns Using Dictionary

In [None]:

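# rename() returns a new DataFrame. Columns not listed in the mapping
# (here "Ship City ", with its trailing space) are left unchanged.
df_rename_dict = df.rename(columns={
    "Order ID": "order_id",
    "Customer Name": "customer_name",
    "Total Sales($)": "total_sales",
    "Order-Date": "order_date"
})
df_rename_dict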
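
One caveat with dictionary renames: by default, `rename()` silently skips mapping keys that do not match any column, so a typo in a key goes unnoticed. The sketch below, assuming a cut-down version of the sample data, shows how `errors="raise"` turns that into a `KeyError`. The misspelled key `"Order Id"` is a deliberate, hypothetical typo.

```python
import pandas as pd

df = pd.DataFrame({
    "Order ID": [1001, 1002],
    "Customer Name": ["Alice", "Bob"],
})

# Hypothetical typo: "Order Id" does not match "Order ID".
mapping = {"Order Id": "order_id", "Customer Name": "customer_name"}

# Default behaviour: the unmatched key is skipped without any warning.
silent = df.rename(columns=mapping)
print(silent.columns.tolist())  # ['Order ID', 'customer_name']

# errors="raise" makes the mismatch fail loudly instead.
try:
    df.rename(columns=mapping, errors="raise")
    raised = False
except KeyError as exc:
    raised = True
    print(f"KeyError: {exc}")
```

Using `errors="raise"` is a reasonable default in pipelines where the input schema is expected to be fixed.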


## 3. Rename Columns Using inplace=True

In [None]:

df_inplace = df.copy()
# inplace=True modifies df_inplace directly; the call itself returns None.
df_inplace.rename(columns={"Ship City ": "ship_city"}, inplace=True)
df_inplace
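
A common slip with `inplace=True` is assigning the result back to a variable, which stores `None` instead of a DataFrame. The following sketch, using a one-column stand-in frame, contrasts the in-place style with plain reassignment. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Ship City ": ["Mumbai", "Delhi"]})

# inplace=True mutates the object and returns None.
df_inplace = df.copy()
result = df_inplace.rename(columns={"Ship City ": "ship_city"}, inplace=True)
print(result)                       # None
print(df_inplace.columns.tolist())  # ['ship_city']

# Reassignment leaves the original untouched and can be chained.
df_assigned = df.rename(columns={"Ship City ": "ship_city"})
print(df.columns.tolist())          # ['Ship City ']
print(df_assigned.columns.tolist()) # ['ship_city']
```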


## 4. Rename All Columns at Once

In [None]:

df_all = df.copy()
# Assignment is positional: the list must match the existing columns
# in both length and order.
df_all.columns = [
    "order_id",
    "customer_name",
    "total_sales",
    "order_date",
    "ship_city"
]
df_all
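
Because assigning to `.columns` is positional, a list of the wrong length raises a `ValueError`, while a list in the wrong order mislabels data without any error. A minimal sketch of the failure case, plus one possible safeguard (pairing old and new names explicitly), might look like this. The shortened frame and the name lists are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Order ID": [1001, 1002],
    "Customer Name": ["Alice", "Bob"],
    "Ship City ": ["Mumbai", "Delhi"],
})

new_names = ["order_id", "customer_name"]  # one name short on purpose

try:
    broken = df.copy()
    broken.columns = new_names
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
    print(f"ValueError: {exc}")

# Pairing old and new names makes the intended mapping visible in review
# and fails early if the counts differ.
full_names = ["order_id", "customer_name", "ship_city"]
assert len(full_names) == len(df.columns)
df_all = df.rename(columns=dict(zip(df.columns, full_names)))
print(df_all.columns.tolist())  # ['order_id', 'customer_name', 'ship_city']
```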


## 5. Convert Columns to snake_case (Most Used)

In [None]:

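df_snake = df.copy()
df_snake.columns = (
    df_snake.columns
    .str.strip()                             # drop leading/trailing whitespace first
    .str.lower()
    .str.replace(" ", "_", regex=False)      # literal replacements
    .str.replace("-", "_", regex=False)
    .str.replace(r"[()$]", "", regex=True)   # remove parentheses and "$"
)
df_snake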
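
The order of the steps in this chain matters. As a small illustration on two of the sample names, stripping whitespace after spaces have been replaced leaves a trailing underscore behind, because `strip()` only removes whitespace by default.

```python
import pandas as pd

cols = pd.Index(["Order ID", "Ship City "])

# strip() first: the trailing space is removed before spaces become "_".
stripped_first = cols.str.strip().str.lower().str.replace(" ", "_", regex=False)

# strip() last: the trailing space has already been turned into "_",
# so there is no whitespace left for strip() to remove.
stripped_last = cols.str.lower().str.replace(" ", "_", regex=False).str.strip()

print(stripped_first.tolist())  # ['order_id', 'ship_city']
print(stripped_last.tolist())   # ['order_id', 'ship_city_']
```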


## 6. Remove Special Characters from Column Names

In [None]:

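df_special = df.copy()
# Keeps only letters, digits, and underscores. Spaces and hyphens are removed
# as well, so "Order ID" becomes "OrderID" and "Order-Date" becomes "OrderDate".
df_special.columns = df_special.columns.str.replace(r"[^a-zA-Z0-9_]", "", regex=True)
df_special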
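
Stripping characters can make two distinct names collapse into one. The sketch below uses hypothetical columns that differ only in their special characters and flags the resulting collision with `Index.duplicated()`.

```python
import pandas as pd

# Hypothetical columns that differ only in their special characters.
df = pd.DataFrame({"Sales($)": [2500], "Sales(%)": [12.5], "Order ID": [1001]})

cleaned = df.columns.str.replace(r"[^a-zA-Z0-9_]", "", regex=True)
print(cleaned.tolist())  # ['Sales', 'Sales', 'OrderID']

# duplicated() flags every repeat after the first occurrence.
collisions = cleaned[cleaned.duplicated()].tolist()
print(collisions)        # ['Sales']
```

Running a check like this before assigning the cleaned names back avoids ending up with a DataFrame that has duplicate column labels.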


## 7. Prefix or Suffix Columns

In [None]:

# add_prefix() / add_suffix() return new DataFrames, so no copy is needed.
# They are applied to the raw names here, e.g. "ord_Order ID" and "Order ID_col".
df_prefix = df.add_prefix("ord_")
df_suffix = df.add_suffix("_col")

df_prefix, df_suffix
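
`add_prefix()` touches every column, which is often not what is wanted for a join key. One possible approach, sketched here with an illustrative `orders` frame and key name, is to pass a function to `rename()` that skips the key.

```python
import pandas as pd

orders = pd.DataFrame({"order_id": [1001, 1002], "total_sales": [2500, 1800]})

# Prefix every column except the join key (names here are illustrative).
key = "order_id"
prefixed = orders.rename(columns=lambda c: c if c == key else f"ord_{c}")
print(prefixed.columns.tolist())  # ['order_id', 'ord_total_sales']
```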


## 8. Rename Columns for Joins / Merges

In [None]:

# Give the join key the same name on both sides so that merge(on="order_id") works.
df_left = df.rename(columns={"Order ID": "order_id"})
df_right = df.rename(columns={"Order ID": "order_id"})
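
To show why aligned key names help, here is a minimal sketch of an actual merge. The `shipments` table and its `"order-id"` spelling are hypothetical, standing in for a second source whose key is named differently.

```python
import pandas as pd

orders = pd.DataFrame({
    "Order ID": [1001, 1002, 1003],
    "Total Sales($)": [2500, 1800, 2200],
})
# Hypothetical second table whose key is spelled differently.
shipments = pd.DataFrame({
    "order-id": [1001, 1003],
    "Ship City ": ["Mumbai", "Pune"],
})

df_left = orders.rename(columns={"Order ID": "order_id", "Total Sales($)": "total_sales"})
df_right = shipments.rename(columns={"order-id": "order_id", "Ship City ": "ship_city"})

# With a shared key name, `on=` is enough and the key appears only once.
merged = df_left.merge(df_right, on="order_id", how="left")
print(merged)
```

Without the rename, the same join would need `left_on="Order ID", right_on="order-id"` and the result would carry both key columns.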


## 9. Rename Using Function (Advanced)

In [None]:

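import re

def clean_column(col):
    """Trim, lowercase, and collapse runs of non-alphanumerics into "_"."""
    return re.sub(r"[^a-z0-9]+", "_", col.strip().lower()).strip("_")

df_func = df.copy()
df_func.columns = df_func.columns.map(clean_column)
df_func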
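
`rename()` also accepts a function directly, which avoids the copy-then-assign step and makes it easy to apply one rule to several DataFrames. The sketch below additionally splits camelCase names, which is an assumption about the kind of input being cleaned. The helper `to_snake` and the `customerName` column are hypothetical.

```python
import re
import pandas as pd

def to_snake(name: str) -> str:
    """Hypothetical helper: split camelCase, then normalise separators."""
    # Insert "_" between a lowercase letter or digit and a following capital.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Lowercase, turn any run of other characters into "_", trim stray "_".
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

df = pd.DataFrame({
    "Order ID": [1001],
    "customerName": ["Alice"],
    "Ship City ": ["Mumbai"],
})

df_func = df.rename(columns=to_snake)
print(df_func.columns.tolist())  # ['order_id', 'customer_name', 'ship_city']
```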


## 10. Validate Column Names

In [None]:

# Only the last expression in a cell is displayed, so print both checks.
print(df_func.columns.is_unique)
print(df_func.columns.tolist())
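
Beyond uniqueness, it can be useful to check that every name follows the chosen convention. A minimal sketch of such a check follows. The `validate_columns` helper and the exact snake_case pattern are assumptions, not a pandas feature.

```python
import re
import pandas as pd

SNAKE_CASE = re.compile(r"[a-z][a-z0-9_]*")  # assumed naming rule

def validate_columns(frame: pd.DataFrame) -> list:
    """Return human-readable problems; an empty list means the names pass."""
    problems = []
    dupes = frame.columns[frame.columns.duplicated()].unique().tolist()
    if dupes:
        problems.append(f"duplicate names: {dupes}")
    bad = [c for c in frame.columns if not SNAKE_CASE.fullmatch(str(c))]
    if bad:
        problems.append(f"not snake_case: {bad}")
    return problems

good = pd.DataFrame({"order_id": [1], "ship_city": ["Pune"]})
messy = pd.DataFrame({"Order ID": [1], "total_sales($)": [2500]})

print(validate_columns(good))   # []
print(validate_columns(messy))
```

A check like this can sit at the start of a pipeline and raise if the returned list is not empty.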



## ✅ Best Practices & Interview Notes
- Use **snake_case** consistently
- Avoid spaces and special characters
- Rename before joins and aggregations
- Column names should be semantic and stable
- Standardization improves code quality: downstream code can rely on one predictable naming scheme
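
As a brief illustration of why names without spaces or special characters pay off, the sketch below renames two of the sample columns and then uses them in `query()` and in a named aggregation, both of which expect identifier-like names. The data is a cut-down version of the sample set.

```python
import pandas as pd

df = pd.DataFrame({
    "Total Sales($)": [2500, 1800, 2200],
    "Ship City ": ["Mumbai", "Delhi", "Pune"],
})
clean = df.rename(columns={"Total Sales($)": "total_sales", "Ship City ": "ship_city"})

# Clean names can be written directly inside a query expression.
high = clean.query("total_sales > 2000")
print(high["ship_city"].tolist())  # ['Mumbai', 'Pune']

# They also read naturally in a named aggregation.
summary = clean.groupby("ship_city").agg(total=("total_sales", "sum"))
print(summary)
```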



## ✔ Summary
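- `rename()` is safest for partial changes: only the columns named in the mapping are touched
- `.columns` assignment is the most direct way to rename every column, but it is positional, so the list must match the existing order and length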
- String methods make bulk standardization easy
- Clean column names = fewer bugs
