### One-Hot Encoding
One-hot encoding represents each unique category as a binary vector that has a single 1 in the position assigned to that category and 0 everywhere else. This method is suitable when categorical variables are nominal, meaning the categories have no inherent order.
**Example:**
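If we have a categorical feature `color` with values `red`, `green`, `blue`, one-hot encoding could map them as follows (which position each category gets depends on the encoder; scikit-learn sorts categories alphabetically, so its output further down uses the order `blue`, `green`, `red`):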
- `red` -> [1, 0, 0]
- `green` -> [0, 1, 0]
- `blue` -> [0, 0, 1]


In [15]:
from sklearn.preprocessing import OneHotEncoder
import numpy as np

# Step 1: Create sample data
data = np.array(['red', 'green', 'blue', 'green', 'red']).reshape(-1, 1)

# Step 2: Create an instance of OneHotEncoder
encoder = OneHotEncoder(sparse_output=False)  # return a dense array; sparse_output replaces the older `sparse` argument (scikit-learn >= 1.2)

# Step 3: Fit and transform the data
encoded_data = encoder.fit_transform(data)

# Display the one-hot encoded data
print(encoded_data)

[[0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [18]:
import pandas as pd

# Sample data: one categorical column (team) and one numeric column (points)
df = pd.DataFrame({
    'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
    'points': [25, 12, 15, 14, 19, 23, 25, 29],
})
df

|   | team | points |
|---|------|--------|
| 0 | A | 25 |
| 1 | A | 12 |
| 2 | B | 15 |
| 3 | B | 14 |
| 4 | B | 19 |
| 5 | B | 23 |
| 6 | C | 25 |
| 7 | C | 29 |


In [19]:
# One-hot encode the 'team' column; the other columns are left unchanged
df_encoded = pd.get_dummies(df, columns=['team'])
df_encoded

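|   | points | team_A | team_B | team_C |
|---|--------|--------|--------|--------|
| 0 | 25 | True | False | False |
| 1 | 12 | True | False | False |
| 2 | 15 | False | True | False |
| 3 | 14 | False | True | False |
| 4 | 19 | False | True | False |
| 5 | 23 | False | True | False |
| 6 | 25 | False | False | True |
| 7 | 29 | False | False | True |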
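
The `get_dummies` result above uses boolean columns. The following is a minimal sketch, under the assumption that 0/1 integers are preferred downstream, showing the `dtype` and `drop_first` parameters of `pd.get_dummies`. Dropping the first category is a common choice for linear models, where a full set of dummy columns is redundant; whether it is appropriate depends on the model.

```python
import pandas as pd

# Same sample data as in the cell above
df = pd.DataFrame({
    'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
    'points': [25, 12, 15, 14, 19, 23, 25, 29],
})

# dtype=int yields 0/1 columns instead of True/False
df_int = pd.get_dummies(df, columns=['team'], dtype=int)
print(df_int)

# drop_first=True omits the first category (team_A), which becomes
# the implicit baseline: a row with team_B == 0 and team_C == 0 is team A
df_reduced = pd.get_dummies(df, columns=['team'], dtype=int, drop_first=True)
print(df_reduced)
```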


### Label Encoding
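Label encoding converts categorical values into integers, assigning one integer to each unique category. Because integers carry an implied order, this method is best suited to ordinal categorical variables, and only when the assigned integers follow the real order of the categories. Note that scikit-learn's `LabelEncoder` numbers categories in sorted (alphabetical) order, not by meaning, and its documentation describes it as intended for target labels rather than input features.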
**Example:**
If we have a categorical feature `color` with values `red`, `green`, `blue`, `LabelEncoder` numbers them in alphabetical order, giving:
- `red` -> 2
- `green` -> 1
- `blue` -> 0


In [24]:

import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Example data
data = pd.DataFrame({'color': ['red', 'green', 'blue', 'green', 'red']})

# Initialize the encoder
encoder = LabelEncoder()

# Fit and transform the data
label_encoded_data = encoder.fit_transform(data['color'])

print("Label Encoded Data:\n", label_encoded_data)

Label Encoded Data:
 [2 1 0 1 2]
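
The integers above come from the alphabetical order of the labels, which `LabelEncoder` exposes through its `classes_` attribute. For a feature whose order does matter, one option is scikit-learn's `OrdinalEncoder` with an explicit category order. The following is a minimal sketch; the `size` feature and its values are hypothetical and are not part of the original data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# LabelEncoder: the integer for each label is its index in classes_
colors = pd.Series(['red', 'green', 'blue', 'green', 'red'])
label_encoder = LabelEncoder()
label_encoder.fit(colors)
classes = label_encoder.classes_.tolist()
print(classes)

# Hypothetical ordinal feature with a meaningful order: small < medium < large
sizes = pd.DataFrame({'size': ['small', 'large', 'medium', 'small']})

# Passing the categories explicitly fixes the integer order,
# instead of relying on alphabetical sorting
ordinal_encoder = OrdinalEncoder(categories=[['small', 'medium', 'large']])
size_codes = ordinal_encoder.fit_transform(sizes).ravel().tolist()
print(size_codes)
```

With the explicit order, `small` maps to 0, `medium` to 1, and `large` to 2, so comparisons between the encoded values reflect the intended ranking.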
