**Mean Encoding**

Mean Encoding is a technique used in machine learning and data preprocessing to convert categorical variables into numerical values.<br>
It involves replacing each category with the mean value of the target variable for that category.

Here's how the mean encoding process works:

1. Calculate the mean of the target variable for each category in the categorical feature.
2. Replace each category with its corresponding mean value.
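
Written as a formula, the two steps above amount to the following. The symbols are notation introduced here for illustration and are not from the original source: $x_i$ is the category of row $i$, $y_i$ is its target value, and $n_c$ is the number of rows that belong to category $c$.

$$
\text{enc}(c) = \frac{1}{n_c} \sum_{i \,:\, x_i = c} y_i
$$

Every row whose category is $c$ is then assigned the value $\text{enc}(c)$.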

***Import the necessary libraries and create a sample dataset:***

In [1]:
import pandas as pd

data = {'Color': ['Red', 'Blue', 'Green','Red','Blue'],
        'Sales': [100, 200, 150,120,180]}

df = pd.DataFrame(data)
df

   Color  Sales
0    Red    100
1   Blue    200
2  Green    150
3    Red    120
4   Blue    180


In [2]:
# First, we calculate the mean sales for each color category:

# Mean sales for Red: (100 + 120) / 2 = 110
# Mean sales for Blue: (200 + 180) / 2 = 190
# Mean sales for Green: 150


In [3]:
# Count the rows in each color category
df.groupby(['Color'])['Sales'].count()

Color
Blue     2
Green    1
Red      2
Name: Sales, dtype: int64

In [4]:
# Find mean
df.groupby(['Color'])['Sales'].mean()

Color
Blue     190.0
Green    150.0
Red      110.0
Name: Sales, dtype: float64

In [5]:
# Finally, build a category-to-mean mapping and apply it to df['Color']

In [6]:
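# Build a {color: mean Sales} dictionary from the group means
mean_encoded_color = df.groupby('Color')['Sales'].mean().to_dict()
# Replace each color label with its mean (this overwrites the original column)
df['Color'] = df['Color'].map(mean_encoded_color)

print(df)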

   Color  Sales
0  110.0    100
1  190.0    200
2  150.0    150
3  110.0    120
4  190.0    180
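
The cell above overwrites `Color`, so the original labels are lost after encoding. As a possible alternative, the minimal sketch below writes the encoded values to a separate column using `groupby(...).transform('mean')`. The column name `Color_mean_encoded` is a hypothetical choice made for this illustration and does not come from the original source.

```python
import pandas as pd

# Same toy data as in the example above
df = pd.DataFrame({
    'Color': ['Red', 'Blue', 'Green', 'Red', 'Blue'],
    'Sales': [100, 200, 150, 120, 180],
})

# transform('mean') returns one value per row, aligned with df's index,
# so the per-category means can be assigned directly to a new column.
# 'Color_mean_encoded' is an illustrative name, not part of the original.
df['Color_mean_encoded'] = df.groupby('Color')['Sales'].transform('mean')

print(df)
```

Keeping the original column makes it easier to check the encoding against the labels. It also allows the mapping to be recomputed later, for example on a different subset of the data.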


In [7]:
# Source: https://www.geeksforgeeks.org/mean-encoding-machine-learning