### Label Encoding 
Label encoding and ordinal encoding are two techniques used to encode categorical data as numerical data.

Label encoding assigns a unique integer label to each category of the variable. The labels are usually assigned in alphabetical order or by category frequency. For example, if we have a categorical variable "color" with three possible values (red, green, blue), one possible label encoding is shown below. Note that scikit-learn's `LabelEncoder`, used in the cells that follow, sorts the categories alphabetically and therefore produces blue = 0, green = 1, red = 2 instead.

1. Red: 0
2. Green: 1
3. Blue: 2
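
The idea can be sketched in a few lines of plain Python. This is only a minimal illustration, not how any particular library implements it: the helper name `label_encode` is hypothetical, and it assumes labels are assigned in sorted (alphabetical) order.

```python
def label_encode(values):
    """Map each distinct category to an integer label, assuming sorted order."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    # Replace every value by the integer label of its category.
    return [mapping[value] for value in values], mapping


colors = ["red", "blue", "green", "green", "red", "blue"]
codes, mapping = label_encode(colors)
print(mapping)  # {'blue': 0, 'green': 1, 'red': 2}
print(codes)    # [2, 0, 1, 1, 2, 0]
```

With sorted order the labels come out as blue = 0, green = 1, red = 2, which differs from the illustrative assignment listed above. Either one is a valid label encoding, because the integers carry no meaning beyond identifying the category.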

In [2]:
# Create a simple DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'color': ['red', 'blue', 'green', 'green', 'red', 'blue']
})

In [3]:
df

|   | color |
|---|-------|
| 0 | red   |
| 1 | blue  |
| 2 | green |
| 3 | green |
| 4 | red   |
| 5 | blue  |


In [8]:
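from sklearn.preprocessing import LabelEncoder

label_encoder = LabelEncoder()

# LabelEncoder expects a 1-D array, so pass the column as a Series (df['color']).
# A one-column DataFrame (df[['color']]) triggers a DataConversionWarning.
encoded_colors = label_encoder.fit_transform(df['color'])
encoded_colors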

array([2, 0, 1, 1, 2, 0])

In [14]:
encoded_data = pd.DataFrame(data=encoded_colors, columns=['Encoded color'])
final_df = pd.concat([df, encoded_data], axis=1)
final_df

|   | color | Encoded color |
|---|-------|---------------|
| 0 | red   | 2             |
| 1 | blue  | 0             |
| 2 | green | 1             |
| 3 | green | 1             |
| 4 | red   | 2             |
| 5 | blue  | 0             |


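#### New value

Once fitted, the encoder can be applied to individual values, as long as they are among the categories seen during fitting.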

In [5]:
label_encoder.transform(['red'])

array([2])

In [6]:
label_encoder.transform(['blue'])

array([0])

In [7]:
label_encoder.transform(['green'])

array([1])
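
The cells above only transform values the encoder has already seen. The sketch below, which refits a fresh encoder on the same six colors so that it is self-contained, shows three related behaviours: the learned categories in `classes_`, decoding with `inverse_transform`, and what happens with a genuinely new value. The value `'yellow'` and the flag `unseen_rejected` are introduced here purely for illustration.

```python
from sklearn.preprocessing import LabelEncoder

label_encoder = LabelEncoder()
label_encoder.fit(["red", "blue", "green", "green", "red", "blue"])

# Categories learned during fit, in sorted order; the position is the label.
print(label_encoder.classes_)

# Map integer labels back to the original categories.
decoded = label_encoder.inverse_transform([2, 0, 1])
print(decoded)

# A category that was not present during fit cannot be encoded.
try:
    label_encoder.transform(["yellow"])
    unseen_rejected = False
except ValueError as error:
    unseen_rejected = True
    print("Unseen label rejected:", error)
```

In practice this means the encoder should be fitted on data that contains every category expected at prediction time, or unseen values need to be handled before calling `transform`.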

### Ordinal Encoding
Ordinal encoding is used for categorical data that has an intrinsic order or ranking. Each category is assigned a numerical value based on its position in that order. For example, if we have a categorical variable "education level" with four possible values (high school, college, graduate, post-graduate), we can represent it using ordinal encoding as follows:

1. High school: 1
2. College: 2
3. Graduate: 3
4. Post-graduate: 4
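
The mapping above can be applied by hand with a dictionary and `Series.map`. This is a minimal sketch rather than the approach used in the cells below: the sample values in `education` are hypothetical, and the names `education_order` and `education_encoded` are chosen only for illustration.

```python
import pandas as pd

# Explicit rank for each level, lowest to highest (values taken from the list above).
education_order = {
    "high school": 1,
    "college": 2,
    "graduate": 3,
    "post-graduate": 4,
}

# Hypothetical sample data for illustration.
education = pd.Series(["college", "high school", "post-graduate", "graduate"])

# Replace each level by its rank.
education_encoded = education.map(education_order)
print(education_encoded.tolist())  # [2, 1, 4, 3]
```

Unlike label encoding, the integers here are meaningful: a larger value corresponds to a higher education level.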

In [15]:
from sklearn.preprocessing import OrdinalEncoder

In [16]:
# create a sample dataframe with an ordinal variable
df = pd.DataFrame({
    'size': ['small', 'medium', 'large', 'medium', 'small', 'large']
})

In [32]:
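# Pass the categories in their intended order: small < medium < large
encoder = OrdinalEncoder(categories=[['small', 'medium', 'large']])
encoded_ranks = encoder.fit_transform(df[['size']])
# fit_transform returns a 2-D array of shape (n_samples, 1); flatten it to 1-D
encoded_ranks = encoded_ranks.flatten()
encoded_ranks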

array([0., 1., 2., 1., 0., 2.])
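
The `categories` argument is what makes this encoding ordinal. As a small sketch of why it matters: without it, scikit-learn infers the categories from the data in sorted order, which for strings is alphabetical and does not reflect small < medium < large. The names `sizes`, `default_encoder`, and `ordered_encoder` are introduced only for this comparison.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

sizes = pd.DataFrame({"size": ["small", "medium", "large", "medium", "small", "large"]})

# No `categories` argument: the order is inferred from the data (sorted).
default_encoder = OrdinalEncoder()
default_codes = default_encoder.fit_transform(sizes[["size"]]).flatten()
print(default_encoder.categories_)  # large, medium, small
print(default_codes)

# Explicit order: small < medium < large.
ordered_encoder = OrdinalEncoder(categories=[["small", "medium", "large"]])
ordered_codes = ordered_encoder.fit_transform(sizes[["size"]]).flatten()
print(ordered_codes)
```

With the default, `large` gets 0 and `small` gets 2, so the numeric order is the reverse of the real size order. Passing the categories explicitly avoids this.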

In [34]:
encoded_df = pd.DataFrame(data=encoded_ranks, columns=['Size_encoded'])

final_df = pd.concat([df, encoded_df], axis=1)
final_df


|   | size   | Size_encoded |
|---|--------|--------------|
| 0 | small  | 0.0          |
| 1 | medium | 1.0          |
| 2 | large  | 2.0          |
| 3 | medium | 1.0          |
| 4 | small  | 0.0          |
| 5 | large  | 2.0          |
