# Ordinal Encoders 
Ordinal encoding is a popular technique for encoding categorical variables with ordered values, such as low, medium, and high. The encoding assigns a unique integer to each category according to its rank. This preserves the ordinal relationship between the categories while converting them into numerical values that can be used as input for machine learning algorithms.

For example, suppose we have a dataset of students' grades, with a variable "performance level" that can take on the values "low", "medium", and "high". We can use ordinal encoding to map these values to the integers 0, 1, and 2, respectively. This lets us compare the values numerically and use the variable as input for models that require numerical data.
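The mapping described above can be sketched with a plain dictionary and `Series.map`. The toy DataFrame and the column names `performance_level` and `performance_encoded` are assumptions made up for illustration; they are not part of the dataset used later.

```python
import pandas as pd

# Hypothetical toy data; the column name and values are illustrative only.
students = pd.DataFrame({"performance_level": ["low", "high", "medium", "low"]})

# Explicit mapping that encodes the ranking low < medium < high.
level_order = {"low": 0, "medium": 1, "high": 2}
students["performance_encoded"] = students["performance_level"].map(level_order)

encoded_values = students["performance_encoded"].tolist()
print(encoded_values)  # [0, 2, 1, 0]
```

Writing the mapping by hand keeps the ranking explicit, which matters when the categories have a real order.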

It is important to note that ordinal encoding assumes an inherent order or ranking among the categories, which may not always be the case. In such situations, other encoding techniques such as one-hot encoding may be more appropriate. Additionally, ordinal encoding may not be suitable for algorithms that assume a linear relationship between the encoded variable and the target variable, as the actual distance between the categories may not reflect the difference in their ordinal values.
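For comparison, here is a minimal sketch of one-hot encoding with `pandas.get_dummies`, for a variable with no natural ranking. The `colour` column is a hypothetical example, not part of the dataset used later.

```python
import pandas as pd

# Hypothetical unordered categories (colours have no natural ranking).
colours = pd.DataFrame({"colour": ["red", "green", "blue", "green"]})

# One indicator column per category; no order is implied between them.
one_hot = pd.get_dummies(colours["colour"], prefix="colour")

one_hot_columns = sorted(one_hot.columns)
print(one_hot_columns)  # ['colour_blue', 'colour_green', 'colour_red']
```

Each category gets its own column, so a model cannot read a ranking or a distance into the encoded values.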

In [11]:
import pandas as pd 
import numpy as np

Let's assume we have this dataset:

In [9]:
data = pd.read_csv("/content/data.csv")

In [10]:
data

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C


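The `Embarked` column seems to be in a good state, so let's focus on it. Note that the code below assigns integers by frequency (most frequent category first), not by any inherent ranking of the ports.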

In [12]:
data["Embarked"].value_counts()

S    644
C    168
Q     77
Name: Embarked, dtype: int64

To find the unique values in this column, ordered from most to least frequent:

In [13]:
data["Embarked"].value_counts().index

Index(['S', 'C', 'Q'], dtype='object')

Now we just need to replace each value in the column with its position in this index:

In [14]:
# Replace each category with its position in the frequency-ordered index
# (most frequent category gets 0); missing values are left untouched.
for j , i in enumerate(data["Embarked"].value_counts().index):
    data["Embarked"] = np.where(data["Embarked"] == i , j , data["Embarked"])
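As an alternative sketch, the same frequency-based codes can be built as a dictionary first and applied with `Series.map`. The small `embarked` Series here is a hypothetical stand-in for the real column, used so the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the Embarked column, including one missing value.
embarked = pd.Series(["S", "C", "S", "Q", "S", "C", None], name="Embarked")

# Categories ordered by frequency, most frequent first (the same order used above).
categories = embarked.value_counts().index
mapping = {category: code for code, category in enumerate(categories)}

# Series.map leaves missing values as NaN instead of assigning them a code.
encoded = embarked.map(mapping)
print(mapping)  # {'S': 0, 'C': 1, 'Q': 2}
```

Keeping the dictionary around is handy because the same mapping can later be applied to new data or inverted to recover the original labels.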

And now, if we look at the dataset:

In [15]:
data

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,0
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,1
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,0
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,0
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,0
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,0
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,0
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,0
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,1


The column has been encoded as expected. Now we just need to add one more piece of functionality to the code:
* What if the user passes a list of columns rather than a single one?

In [16]:
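def ordinal_encoder(dataset , columns):
    # Encode every listed column of `dataset` in place and return it
    for i in columns: 
        # Most frequent category gets 0, the next one 1, and so on
        for k , j in enumerate(dataset[i].value_counts().index):
            dataset[i] = np.where(dataset[i] == j , k , dataset[i])
    return dataset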
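When the categories do have a real ranking, frequency order may not match it. The sketch below is a variant that takes the order explicitly for each column. The function name `encode_with_order`, its `orders` parameter, and the toy data are all assumptions introduced for illustration.

```python
import pandas as pd

def encode_with_order(dataset, orders):
    """Return a copy of `dataset` with each column in `orders` mapped to integer ranks.

    `orders` maps a column name to its categories listed from lowest to highest.
    """
    encoded = dataset.copy()
    for column, ordered_categories in orders.items():
        mapping = {category: rank for rank, category in enumerate(ordered_categories)}
        encoded[column] = encoded[column].map(mapping)
    return encoded

# Hypothetical toy data.
df = pd.DataFrame({
    "performance": ["low", "high", "medium"],
    "size": ["L", "S", "M"],
})
result = encode_with_order(df, {
    "performance": ["low", "medium", "high"],
    "size": ["S", "M", "L"],
})
print(result)
```

Working on a copy leaves the original DataFrame unchanged, so the raw labels stay available for inspection.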