## Target Guided Ordinal Encoding 
Target guided ordinal encoding is a technique for encoding a categorical variable based on its relationship with the target variable. It is useful when a categorical variable has a large number of unique categories and we want to use it as a feature in a machine learning model.

In target guided ordinal encoding, we replace each category with a numerical value derived from the mean or median of the target variable for that category. Categories with a higher average target receive larger values, so the encoded feature has a monotonic relationship with the target, which can improve the predictive power of the model.

In [5]:
import pandas as pd

# Create a sample dataframe with a categorical variable and a target variable 
df = pd.DataFrame({
    'city':['New York','London','Paris','Tokyo','New York','Paris'],
    'price':[200,150,300,250,180,320]
})
df.head()

       city  price
0  New York    200
1    London    150
2     Paris    300
3     Tokyo    250
4  New York    180


In [8]:
# Calculate the mean price for each city.
# If the target has outliers, consider the median instead of the mean.
df.groupby('city')['price'].mean()

city
London      150.0
New York    190.0
Paris       310.0
Tokyo       250.0
Name: price, dtype: float64
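
The comment in the cell above suggests the median when the target has outliers. As a minimal sketch of why, the frame below (`df_outlier`, hypothetical data that is not part of the original notebook) contains one extreme Paris price. The mean is pulled far upward by it, while the median stays close to the typical value.

```python
import pandas as pd

# Hypothetical data: the last Paris price is an outlier
df_outlier = pd.DataFrame({
    'city': ['London', 'London', 'Paris', 'Paris', 'Paris'],
    'price': [150, 170, 300, 320, 2000]
})

# Per-city mean and median of the target
mean_by_city = df_outlier.groupby('city')['price'].mean().to_dict()
median_by_city = df_outlier.groupby('city')['price'].median().to_dict()

# Encode each city with the median, which is robust to the outlier
df_outlier['city_Encoded'] = df_outlier['city'].map(median_by_city)

print(mean_by_city)
print(median_by_city)
```

Here the Paris median is 320.0, whereas the Paris mean is above 870 because of the single 2000 entry.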

In [12]:
# Build a dictionary mapping each city to its mean price
mean_price=df.groupby('city')['price'].mean().to_dict()
mean_price

{'London': 150.0, 'New York': 190.0, 'Paris': 310.0, 'Tokyo': 250.0}

In [14]:
# Map each city to its encoded value
df['city_Encoded']=df['city'].map(mean_price)
df

       city  price  city_Encoded
0  New York    200         190.0
1    London    150         150.0
2     Paris    300         310.0
3     Tokyo    250         250.0
4  New York    180         190.0
5     Paris    320         310.0
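
The notebook above encodes each city with the mean itself. A common variant, shown here only as a sketch and not as the notebook's own method, takes the "ordinal" part literally: it sorts the categories by their mean target and assigns integer ranks. The names `ordered_cities`, `city_rank`, and `city_Rank` are introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['New York', 'London', 'Paris', 'Tokyo', 'New York', 'Paris'],
    'price': [200, 150, 300, 250, 180, 320]
})

# Order the cities by mean price, lowest first
ordered_cities = df.groupby('city')['price'].mean().sort_values().index

# Assign rank 1 to the city with the lowest mean price, 2 to the next, and so on
city_rank = {city: rank for rank, city in enumerate(ordered_cities, start=1)}

# Replace each city with its rank
df['city_Rank'] = df['city'].map(city_rank)

print(city_rank)
# {'London': 1, 'New York': 2, 'Tokyo': 3, 'Paris': 4}
```

The ranks keep the ordering of the means but discard the distances between them, which may or may not suit a given model.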


In [27]:
import seaborn as sns
df=sns.load_dataset('tips')
print(df.head())
# Apply the same technique: map each day to its mean total_bill
mean_totalbill=df.groupby('day')['total_bill'].mean().to_dict()

# 'day' is loaded as a categorical column, so cast the mapped values to float
df['bill_Encoded']=df['day'].map(mean_totalbill).astype(float)
mean_totalbill

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4


{'Thur': 17.682741935483868,
 'Fri': 17.15157894736842,
 'Sat': 20.44137931034483,
 'Sun': 21.41}

In [23]:
import seaborn as sns

# Load the tips dataset from seaborn
df = sns.load_dataset('tips')

# Group the data by days and calculate the mean of total_bill
mean_totalbill = df.groupby('day')['total_bill'].mean().reset_index()

# Rename the mean_totalbill column to 'bill_Encoded'
mean_totalbill = mean_totalbill.rename(columns={'total_bill': 'bill_Encoded'})

# Merge the mean_totalbill DataFrame with the original DataFrame based on the 'day' column
df = df.merge(mean_totalbill, on='day')

# Print the updated DataFrame
print(df.head())


   total_bill   tip     sex smoker  day    time  size  bill_Encoded
0       16.99  1.01  Female     No  Sun  Dinner     2         21.41
1       10.34  1.66    Male     No  Sun  Dinner     3         21.41
2       21.01  3.50    Male     No  Sun  Dinner     3         21.41
3       23.68  3.31    Male     No  Sun  Dinner     2         21.41
4       24.59  3.61  Female     No  Sun  Dinner     4         21.41
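
The two tips cells take different routes to the same encoding: a dictionary with `map`, and a `merge` against the grouped means. As a rough check that they agree, the sketch below runs both on a small hypothetical frame (`tips_small`, a stand-in used so the example does not need to download the dataset). `how='left'` is passed to `merge` here so the original row order is kept.

```python
import pandas as pd

# Hypothetical stand-in for the tips data
tips_small = pd.DataFrame({
    'day': ['Sun', 'Sat', 'Sun', 'Fri', 'Sat'],
    'total_bill': [20.0, 10.0, 30.0, 15.0, 14.0]
})

# Approach 1: dictionary of means + map
day_means = tips_small.groupby('day')['total_bill'].mean().to_dict()
encoded_by_map = tips_small['day'].map(day_means)

# Approach 2: merge the grouped means back onto the original rows
means_df = (tips_small.groupby('day')['total_bill'].mean()
            .reset_index()
            .rename(columns={'total_bill': 'bill_Encoded'}))
merged = tips_small.merge(means_df, on='day', how='left')

print(encoded_by_map.tolist())
print(merged['bill_Encoded'].tolist())
```

Both lines print `[25.0, 12.0, 25.0, 15.0, 12.0]`, so either approach can be used; `map` avoids creating an intermediate DataFrame.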
