## Target Guided Ordinal Encoding 
Target Guided Ordinal Encoding encodes a categorical variable according to its relationship with the target variable. It is useful when a categorical variable has many unique categories and we want to use it as a feature in a machine learning model.

In Target Guided Ordinal Encoding, we replace each category with a numerical value derived from the mean or median of the target variable for that category. Categories with a higher typical target value receive a larger code, so the encoded feature is ordered consistently with the target, which can improve the predictive power of the model.

In [1]:
import pandas as pd

# create a sample dataframe with a categorical variable and a target variable
df = pd.DataFrame({
    'city': ['New York', 'London', 'Paris', 'Tokyo', 'New York', 'Paris'],
    'price': [200, 150, 300, 250, 180, 320]
})

In [2]:
df

,city,price
0,New York,200
1,London,150
2,Paris,300
3,Tokyo,250
4,New York,180
5,Paris,320


In [3]:
# Mean price per city, stored as a {city: mean price} dictionary
mean_price = df.groupby('city')['price'].mean().to_dict()

In [4]:
mean_price

{'London': 150.0, 'New York': 190.0, 'Paris': 310.0, 'Tokyo': 250.0}

In [5]:
# Replace each city with the mean price observed for that city
df['city_encoded'] = df['city'].map(mean_price)

In [6]:
df[['price','city_encoded']]

,price,city_encoded
0,200,190.0
1,150,150.0
2,300,310.0
3,250,250.0
4,180,190.0
5,320,310.0
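
The mapping above stores the raw category means. A rank-based variant, which is one common reading of "ordinal" in this technique, sorts the categories by their mean target value and assigns consecutive integers instead. The sketch below illustrates that variant on the same city and price data; the names `city_means`, `ordinal_map`, and `city_rank` are introduced here for illustration and are not part of the original notebook.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'city': ['New York', 'London', 'Paris', 'Tokyo', 'New York', 'Paris'],
    'price': [200, 150, 300, 250, 180, 320]
})

# Mean price per city, sorted from lowest to highest
city_means = df.groupby('city')['price'].mean().sort_values()

# Assign ranks 1..k in order of increasing mean price (illustrative choice)
ordinal_map = {city: rank for rank, city in enumerate(city_means.index, start=1)}

# Map each city to its rank
df['city_rank'] = df['city'].map(ordinal_map)
print(ordinal_map)
```

With this data, London has the lowest mean price and Paris the highest, so the ranks run from London (1) to Paris (4). Integer ranks keep the ordering but discard how far apart the category means are, whereas the mean mapping keeps both.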


In [10]:
import seaborn as sns
tips = sns.load_dataset('tips')

In [8]:
# Perform Target Guided Ordinal Encoding on tips dataset

# Select categorical feature and target variable
cat_column = 'day'          # categorical column from tips dataset
target_column = 'tip'       # numeric target column

In [11]:
# Step 1: Compute mean tip for each day
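# observed=True groups only the categories present in the data and avoids the pandas FutureWarning for categorical columns
encoding_dict = tips.groupby(cat_column, observed=True)[target_column].mean().to_dict()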
encoding_dict


{'Thur': 2.7714516129032254,
 'Fri': 2.734736842105263,
 'Sat': 2.993103448275862,
 'Sun': 3.2551315789473683}

In [12]:
# Step 2: Map encoded values back to dataset
tips[cat_column + '_encoded'] = tips[cat_column].map(encoding_dict)

tips.head()

,total_bill,tip,sex,smoker,day,time,size,day_encoded
0,16.99,1.01,Female,No,Sun,Dinner,2,3.255132
1,10.34,1.66,Male,No,Sun,Dinner,3,3.255132
2,21.01,3.5,Male,No,Sun,Dinner,3,3.255132
3,23.68,3.31,Male,No,Sun,Dinner,2,3.255132
4,24.59,3.61,Female,No,Sun,Dinner,4,3.255132


In [13]:
# Repeat the same two steps for the 'sex' column
encoding_dict = tips.groupby('sex', observed=True)['tip'].mean().to_dict()
tips['sex_encoded'] = tips['sex'].map(encoding_dict)
tips.head()



,total_bill,tip,sex,smoker,day,time,size,day_encoded,sex_encoded
0,16.99,1.01,Female,No,Sun,Dinner,2,3.255132,2.833448
1,10.34,1.66,Male,No,Sun,Dinner,3,3.255132,3.089618
2,21.01,3.5,Male,No,Sun,Dinner,3,3.255132,3.089618
3,23.68,3.31,Male,No,Sun,Dinner,2,3.255132,3.089618
4,24.59,3.61,Female,No,Sun,Dinner,4,3.255132,2.833448
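
The two steps used above (group by the category to get the target mean, then map it back) can be wrapped in a pair of helpers. The sketch below assumes the encoding is learned from training rows only and then applied to other rows, with categories not seen during fitting falling back to the overall training mean. The function names, the fallback rule, and the tiny `train` and `test` frames are all hypothetical and not taken from the notebook.

```python
import pandas as pd

def fit_target_mean_encoding(train, cat_column, target_column):
    """Learn {category: mean target} and a fallback value from the training rows."""
    mapping = train.groupby(cat_column, observed=True)[target_column].mean().to_dict()
    fallback = train[target_column].mean()  # assumption: overall mean for unseen categories
    return mapping, fallback

def apply_target_mean_encoding(frame, cat_column, mapping, fallback):
    """Map categories to their learned means; unseen categories get the fallback."""
    return frame[cat_column].map(mapping).astype(float).fillna(fallback)

# Hypothetical toy data, small enough to check by hand
train = pd.DataFrame({'day': ['Thur', 'Thur', 'Fri', 'Sat'],
                      'tip': [2.0, 4.0, 3.0, 5.0]})
test = pd.DataFrame({'day': ['Thur', 'Sun']})  # 'Sun' never appears in train

mapping, fallback = fit_target_mean_encoding(train, 'day', 'tip')
test['day_encoded'] = apply_target_mean_encoding(test, 'day', mapping, fallback)
print(test)
```

Fitting on the training rows alone keeps target values from the evaluation rows out of the feature, and the fallback prevents missing values when a new category appears.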
