## Target Guided Ordinal Encoding 
Target Guided Ordinal Encoding is a technique for encoding a categorical variable based on its relationship with the target variable. It is useful when a categorical variable has a large number of unique categories and we want to use it as a feature in a machine learning model.

In Target Guided Ordinal Encoding, we replace each category with a numerical value derived from the mean or median of the target variable for that category. In the strict ordinal form, the categories are ranked by that statistic and each category is replaced with its rank; the examples in this notebook map each category to the mean itself. Either way, the encoded values increase monotonically with the per-category target statistic, which can improve the predictive power of the model.
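
The ranking step can be sketched as follows. This is a minimal illustration on a small hypothetical city/price table, not a definitive implementation; the names `demo`, `city_means`, `ordinal_map` and `city_ordinal` are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data: one categorical column and one numeric target.
demo = pd.DataFrame({
    "city": ["NewYork", "London", "Paris", "Tokyo", "NewYork", "Paris"],
    "price": [200, 150, 300, 250, 180, 320],
})

# Mean of the target per category, sorted from lowest to highest.
city_means = demo.groupby("city")["price"].mean().sort_values()

# The position in the sorted order becomes the ordinal label:
# 0 for the lowest mean, 1 for the next, and so on.
ordinal_map = {city: rank for rank, city in enumerate(city_means.index)}

# Replace each category with its rank.
demo["city_ordinal"] = demo["city"].map(ordinal_map)
print(ordinal_map)
print(demo)
```

Using ranks instead of raw means keeps the ordering of the categories but discards the distances between them, which may or may not be desirable depending on the model.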

In [3]:
import pandas as pd

# create a sample dataframe with categorical variable and target variable
df = pd.DataFrame({
    'city': ['NewYork', 'London', 'Paris', 'Tokyo', 'NewYork', 'Paris'],
    'price': [200,150,300,250,180,320]
})

In [4]:
df

| | city | price |
|---|---|---|
| 0 | NewYork | 200 |
| 1 | London | 150 |
| 2 | Paris | 300 |
| 3 | Tokyo | 250 |
| 4 | NewYork | 180 |
| 5 | Paris | 320 |


In [7]:
# mean price per city, as a {city: mean} dictionary
mean_price = df.groupby('city')['price'].mean().to_dict()

In [8]:
mean_price

{'London': 150.0, 'NewYork': 190.0, 'Paris': 310.0, 'Tokyo': 250.0}

In [9]:
# replace each city with its mean price
df['city_encoded'] = df['city'].map(mean_price)

In [11]:
df[['price', 'city_encoded']]

| | price | city_encoded |
|---|---|---|
| 0 | 200 | 190.0 |
| 1 | 150 | 150.0 |
| 2 | 300 | 310.0 |
| 3 | 250 | 250.0 |
| 4 | 180 | 190.0 |
| 5 | 320 | 310.0 |
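
In practice the mapping is usually learned on training data and then applied to other rows. The sketch below shows one way to do that, assuming unseen categories should fall back to the global mean of the target. That fallback is an assumption, not something the example above prescribes. The helper names `fit_mean_map` and `apply_mean_map`, and the `Berlin` row, are hypothetical.

```python
import pandas as pd

def fit_mean_map(frame, column, target):
    """Learn {category: mean target} and the global target mean from `frame`."""
    means = frame.groupby(column)[target].mean().to_dict()
    return means, frame[target].mean()

def apply_mean_map(frame, column, means, fallback):
    """Map categories to the learned means; unseen categories get `fallback`."""
    return frame[column].map(means).fillna(fallback)

# Hypothetical training data, same shape as the city/price example.
train = pd.DataFrame({
    "city": ["NewYork", "London", "Paris", "Tokyo", "NewYork", "Paris"],
    "price": [200, 150, 300, 250, 180, 320],
})

# Hypothetical new rows; 'Berlin' never appears in the training data.
new_rows = pd.DataFrame({"city": ["Paris", "Berlin"]})

means, global_mean = fit_mean_map(train, "city", "price")
new_rows["city_encoded"] = apply_mean_map(new_rows, "city", means, global_mean)
print(new_rows)
```

Fitting on the training rows only keeps target values from the evaluation data out of the encoded feature.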


In [14]:
import seaborn as sns

# load the tips dataset; 'time' will be encoded according to total_bill
df1 = sns.load_dataset('tips')
df1


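| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 239 | 29.03 | 5.92 | Male | No | Sat | Dinner | 3 |
| 240 | 27.18 | 2.00 | Female | Yes | Sat | Dinner | 2 |
| 241 | 22.67 | 2.00 | Male | Yes | Sat | Dinner | 2 |
| 242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |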


In [22]:
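# mean total_bill per time category, sorted ascending;
# observed=True makes the handling of the categorical 'time' column explicit
time_mean = df1.groupby('time', observed=True)['total_bill'].mean().sort_values()
time_mean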


time
Lunch     17.168676
Dinner    20.797159
Name: total_bill, dtype: float64

In [24]:
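# map each time category to its mean total_bill; cast to float because
# mapping a categorical column can return a categorical result
df1['time_encoded'] = df1['time'].map(time_mean).astype(float)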

In [36]:
# compare the encoded values with the original categories
final_df1 = df1[['time_encoded', 'time']]
final_df1.head(200)

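| | time_encoded | time |
|---|---|---|
| 0 | 20.797159 | Dinner |
| 1 | 20.797159 | Dinner |
| 2 | 20.797159 | Dinner |
| 3 | 20.797159 | Dinner |
| 4 | 20.797159 | Dinner |
| ... | ... | ... |
| 195 | 17.168676 | Lunch |
| 196 | 17.168676 | Lunch |
| 197 | 17.168676 | Lunch |
| 198 | 17.168676 | Lunch |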
198,17.168676,Lunch
