### Label Encoding 
* Label encoding and ordinal encoding are two techniques used to encode categorical data as numerical data.

* Label encoding involves assigning a unique numerical label to each category in the variable. 

* The labels are usually assigned in alphabetical order or based on the frequency of the categories. scikit-learn's `LabelEncoder` sorts the categories and numbers them starting from 0.

* For example, if we have a categorical variable "color" with three possible values (red, green, blue), one possible label encoding is:

1. Red: 1
2. Green: 2
3. Blue: 3

In [5]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Create a dataframe
df = pd.DataFrame({
    'color': ['red', 'green', 'blue', 'green', 'red']
})

# Create the LabelEncoder object
lbl_encoder = LabelEncoder()

# LabelEncoder expects a 1-D array, so pass the column (a Series), not the whole dataframe
lbl_encoder.fit_transform(df['color'])

# Classes are sorted alphabetically: blue -> 0, green -> 1, red -> 2
print(lbl_encoder.transform(['red']))
print(lbl_encoder.transform(['green']))
print(lbl_encoder.transform(['blue']))


[2]
[1]
[0]
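The two assignment rules mentioned above (alphabetical order and frequency) can also be sketched by hand with pandas. This is only an illustration, not part of scikit-learn's API: the names `alpha_mapping` and `freq_mapping` are made up for this example, and the sample data adds one extra `green` so that the frequencies have no ties.

```python
import pandas as pd

# Sample data (an assumption for this sketch): green x3, red x2, blue x1
df_colors = pd.DataFrame({
    "color": ["red", "green", "blue", "green", "red", "green"]
})

# Alphabetical rule: sort the unique categories and number them from 0.
# This reproduces the numbering LabelEncoder gives for these colors.
alpha_mapping = {
    category: code
    for code, category in enumerate(sorted(df_colors["color"].unique()))
}

# Frequency rule: value_counts() orders categories from most to least
# frequent, so the most frequent category receives code 0.
counts = df_colors["color"].value_counts()
freq_mapping = {category: code for code, category in enumerate(counts.index)}

df_colors["color_alpha"] = df_colors["color"].map(alpha_mapping)
df_colors["color_freq"] = df_colors["color"].map(freq_mapping)

print(alpha_mapping)  # {'blue': 0, 'green': 1, 'red': 2}
print(freq_mapping)   # {'green': 0, 'red': 1, 'blue': 2}
print(df_colors)
```

Either way the codes are arbitrary identifiers. The numeric distance between two codes carries no meaning for an unordered variable such as color.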




### Ordinal Encoding
* The problem with `Label Encoding` is that a model may read the assigned numbers as magnitudes and treat a category with a higher number as more important, even when the categories have no such order.
* Ordinal encoding is used to encode categorical data that have an intrinsic order or ranking.
* In this technique, each category is assigned a numerical value based on its position in the order. 
* For example, if we have a categorical variable "education level" with four possible values (high school, college, graduate, post-graduate), we can represent it using ordinal encoding as follows:

    1. High school: 1
    2. College: 2
    3. Graduate: 3
    4. Post-graduate: 4

In [8]:
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# create a sample dataframe with an ordinal variable
df = pd.DataFrame({
    'size': ['small', 'medium', 'large', 'medium', 'small', 'large']
})

# Create the OrdinalEncoder object, listing the categories in their intended order
o_encoder = OrdinalEncoder(categories=[['small', 'medium', 'large']])

o_encoder.fit_transform(df[['size']])

print(o_encoder.transform([['small']]))
print(o_encoder.transform([['medium']]))
print(o_encoder.transform([['large']]))

[[0.]]
[[1.]]
[[2.]]
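The "education level" example described earlier can be sketched the same way. The dataframe contents and the names `df_edu`, `education_order`, and `edu_encoder` are assumptions made for this illustration. Note that `OrdinalEncoder` numbers the categories from 0, so the codes are one less than in the 1-based list above.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical sample data for the education-level example
df_edu = pd.DataFrame({
    "education": ["college", "high school", "post-graduate", "graduate", "college"]
})

# The order is supplied explicitly, lowest level first; without it the
# encoder would fall back to alphabetical order, which is not the ranking.
education_order = ["high school", "college", "graduate", "post-graduate"]
edu_encoder = OrdinalEncoder(categories=[education_order])

# fit_transform expects a 2-D input, hence the double brackets
encoded = edu_encoder.fit_transform(df_edu[["education"]])
df_edu["education_encoded"] = encoded.ravel().astype(int)

# inverse_transform maps the codes back to the original category names
decoded = edu_encoder.inverse_transform(encoded)

print(df_edu)
print(decoded.ravel())
```

Because the codes follow the ranking, comparisons such as "graduate > college" hold for the encoded values as well, which is what distinguishes this from plain label encoding.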


