# What is Column Transformation?


# Why is Column Transformation Important?

# Common Types of Column Transformations


| Data Type | Typical Transformation |
|---|---|
| Numerical | Scaling (StandardScaler, MinMaxScaler), imputation of missing values |
| Categorical | Encoding (OneHotEncoder, OrdinalEncoder; LabelEncoder is meant for target labels), imputation (filling missing categories) |
| Text | Text vectorization (CountVectorizer, TfidfVectorizer) |
| Date/Time | Extracting features such as year, month, and day |
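
As a rough illustration of the table above, the sketch below applies one transformation per data type to a tiny hand-made DataFrame. The column names (price, color, review, sold_on) and their values are hypothetical and exist only for this example; the scikit-learn and pandas calls are standard ones.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data, invented purely for illustration
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "color": ["red", "blue", "red"],
    "review": ["good value", "bad value", "good good"],
    "sold_on": pd.to_datetime(["2023-01-15", "2023-06-30", "2024-12-01"]),
})

# Numerical: rescale values to the [0, 1] range
price_scaled = MinMaxScaler().fit_transform(df[["price"]])

# Categorical: one binary column per category (sorted: blue, red)
color_encoder = OneHotEncoder()
color_encoded = color_encoder.fit_transform(df[["color"]]).toarray()

# Text: bag-of-words counts per document (vocabulary: bad, good, value)
vectorizer = CountVectorizer()
review_counts = vectorizer.fit_transform(df["review"]).toarray()

# Date/time: extract calendar components with the .dt accessor
date_parts = pd.DataFrame({
    "year": df["sold_on"].dt.year,
    "month": df["sold_on"].dt.month,
    "day": df["sold_on"].dt.day,
})

print(price_scaled.ravel())
print(color_encoded)
print(review_counts)
print(date_parts)
```

Each transformer here is fitted on a single column by hand, which quickly becomes tedious on real data with many columns of mixed types.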

# How is Column Transformation Done?
Python’s scikit-learn library provides a tool called ColumnTransformer, which lets you specify in one place which transformation to apply to which columns.

In [3]:
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import seaborn as sns

# Load dataset
titanic = sns.load_dataset('titanic')

# Fill missing values by assigning the result back to the column
# (avoids the pandas warning raised by inplace=True on a selected column)
titanic['age'] = titanic['age'].fillna(titanic['age'].mean())
titanic['fare'] = titanic['fare'].fillna(titanic['fare'].mean())
titanic['embarked'] = titanic['embarked'].fillna('S')  # 'S' is the most common port

# Select features after imputation so that X contains no missing values
X = titanic[['age', 'fare', 'sex', 'embarked']]

# Define which columns are numeric and which are categorical
numeric_cols = ['age', 'fare']
categorical_cols = ['sex', 'embarked']

# Define column transformer
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_cols),
        ('cat', OneHotEncoder(), categorical_cols)
    ])

# Apply transformations
X_processed = preprocessor.fit_transform(X)
