### What is One Hot Encoding?
- **One Hot Encoding is a method for converting categorical variables into a binary format.**

- It creates a binary column (containing only 0s and 1s) for each category in the original variable.
- Each category in the original column is represented as a separate column, where a value of 1 indicates the presence of that category, and 0 indicates its absence.
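The idea in the bullets above can be sketched in plain Python before bringing in any library. This is only an illustration, not how scikit-learn implements it: the helper name `one_hot` is hypothetical, and the `Gender` values are the ones from the example table used later in this notebook.

```python
def one_hot(values):
    """Return (categories, rows): sorted categories and one 0/1 row per value."""
    categories = sorted(set(values))
    # For each value, put a 1 in the position of its category and 0 elsewhere
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows

# Gender values from the example table used later in this notebook
genders = ["M", "F", "F", "M", "F"]
categories, rows = one_hot(genders)

print(categories)
for gender, row in zip(genders, rows):
    print(gender, row)
```

Each row contains exactly one 1, which is where the name "one hot" comes from.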

In [10]:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

In [94]:
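# `data` was not defined in the original cell; it is reconstructed here from the printed output below
data = {'Employee id': [10, 20, 15, 25, 30], 'Gender': ['M', 'F', 'F', 'M', 'F'], 'Remarks': ['Good', 'Nice', 'Good', 'Great', 'Nice']}
df = pd.DataFrame(data)
print(df)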

   Employee id Gender Remarks
0           10      M    Good
1           20      F    Nice
2           15      F    Good
3           25      M   Great
4           30      F    Nice


In [96]:
encoder = OneHotEncoder(sparse_output=False)

In [98]:
categorical_cols = df.select_dtypes(include=['object']).columns.tolist()

In [100]:
one_hot_encoded = encoder.fit_transform(df[categorical_cols])

In [102]:
one_hot_df = pd.DataFrame(one_hot_encoded)
one_hot_df

     0    1    2    3    4
0  0.0  1.0  1.0  0.0  0.0
1  1.0  0.0  0.0  0.0  1.0
2  1.0  0.0  1.0  0.0  0.0
3  0.0  1.0  0.0  1.0  0.0
4  1.0  0.0  0.0  0.0  1.0


In [104]:
columns = encoder.get_feature_names_out(categorical_cols)
columns

array(['Gender_F', 'Gender_M', 'Remarks_Good', 'Remarks_Great',
       'Remarks_Nice'], dtype=object)
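The names above follow a `<column>_<category>` pattern, with the categories of each column in sorted order. A minimal sketch of that naming rule in plain Python, assuming the default encoder settings (the helper `feature_names` is hypothetical and is not part of scikit-learn):

```python
def feature_names(columns_to_values):
    """Build '<column>_<category>' names, with categories sorted within each column."""
    names = []
    for column, values in columns_to_values.items():
        for category in sorted(set(values)):
            names.append(f"{column}_{category}")
    return names

# Categorical columns from the example DataFrame in this notebook
example = {
    "Gender": ["M", "F", "F", "M", "F"],
    "Remarks": ["Good", "Nice", "Good", "Great", "Nice"],
}
names = feature_names(example)
print(names)
```

The position of each name matches the position of the corresponding column in the encoded array, so these names can be used as column labels.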

In [110]:
# Rebuild the DataFrame, using the encoder's feature names as column labels
one_hot_df = pd.DataFrame(one_hot_encoded, columns=encoder.get_feature_names_out(categorical_cols))
one_hot_df

   Gender_F  Gender_M  Remarks_Good  Remarks_Great  Remarks_Nice
0       0.0       1.0           1.0            0.0           0.0
1       1.0       0.0           0.0            0.0           1.0
2       1.0       0.0           1.0            0.0           0.0
3       0.0       1.0           0.0            1.0           0.0
4       1.0       0.0           0.0            0.0           1.0


In [112]:
# Concatenate the one-hot encoded columns with the original DataFrame
df_sklearn_encoded = pd.concat([df.drop(categorical_cols, axis=1), one_hot_df], axis=1)
print(df_sklearn_encoded)

   Employee id  Gender_F  Gender_M  Remarks_Good  Remarks_Great  Remarks_Nice
0           10       0.0       1.0           1.0            0.0           0.0
1           20       1.0       0.0           0.0            0.0           1.0
2           15       1.0       0.0           1.0            0.0           0.0
3           25       0.0       1.0           0.0            1.0           0.0
4           30       1.0       0.0           0.0            0.0           1.0
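
As a cross-check on the scikit-learn result, a similar table can be produced with `pd.get_dummies`. This is a sketch of an alternative route, not part of the original walkthrough; the DataFrame is rebuilt here from the printed example so the cell runs on its own, and the name `df_pandas_encoded` is an assumption chosen to mirror `df_sklearn_encoded`.

```python
import pandas as pd

# Example data rebuilt from the table printed earlier in this notebook
df = pd.DataFrame({
    "Employee id": [10, 20, 15, 25, 30],
    "Gender": ["M", "F", "F", "M", "F"],
    "Remarks": ["Good", "Nice", "Good", "Great", "Nice"],
})

# dtype=float mirrors the 0.0/1.0 values returned by OneHotEncoder
df_pandas_encoded = pd.get_dummies(df, columns=["Gender", "Remarks"], dtype=float)
print(df_pandas_encoded)
```

`get_dummies` is convenient for quick exploration, while a fitted `OneHotEncoder` can be reused to apply the same encoding to new data.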
