### Ordinal Type of Data

Ordinal data is categorical data whose categories have a natural order. Unlike nominal data, which has no inherent order, ordinal categories can be ranked from lowest to highest. However, the intervals between adjacent categories are not necessarily equal or even known. Examples of ordinal data include:

- Educational levels (e.g., high school, bachelor's, master's, doctorate)
- Customer satisfaction ratings (e.g., very unsatisfied, unsatisfied, neutral, satisfied, very satisfied)
- Likert scales (e.g., strongly disagree, disagree, neutral, agree, strongly agree)
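
To make the idea of "ordered categories" concrete, the following is a minimal sketch using a pandas ordered `Categorical`. The sample ratings are hypothetical and only serve to show that comparisons and `min`/`max` become meaningful once an order is declared.

```python
import pandas as pd

# Categories listed from lowest to highest; the sample values are made up for illustration.
levels = ["very unsatisfied", "unsatisfied", "neutral", "satisfied", "very satisfied"]
ratings = pd.Series(
    pd.Categorical(
        ["neutral", "very satisfied", "unsatisfied"],
        categories=levels,
        ordered=True,
    )
)

# Order-based operations are valid for ordinal data.
lowest = ratings.min()
above_neutral = (ratings > "neutral").tolist()
print(lowest)
print(above_neutral)
```

With a nominal (unordered) categorical, the same comparison would raise an error, which is one practical way to tell the two types apart.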

### Ordinal Encoding

Ordinal encoding is a technique used to convert ordinal data into numerical values. This is useful for machine learning algorithms that require numerical input. In ordinal encoding, each unique category is assigned an integer value based on its order. For example:

- Educational levels: high school (1), bachelor's (2), master's (3), doctorate (4)
- Customer satisfaction ratings: very unsatisfied (1), unsatisfied (2), neutral (3), satisfied (4), very satisfied (5)

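This encoding preserves the order of the categories, which matters for algorithms that can exploit it. The integers only express rank, not distance, and the starting value is a convention: the examples above start at 1, while scikit-learn's `OrdinalEncoder` starts at 0.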
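
As a rough sketch of the idea in plain Python, the codes can be derived from a list of categories written in ascending order. The names `education_levels`, `education_codes`, and `sample` are illustrative assumptions, not part of any library.

```python
# Categories listed from lowest to highest rank.
education_levels = ["high school", "bachelor's", "master's", "doctorate"]

# Assign consecutive integers starting at 1, following the list order.
education_codes = {level: rank for rank, level in enumerate(education_levels, start=1)}

sample = ["master's", "high school", "doctorate"]  # hypothetical observations
encoded = [education_codes[level] for level in sample]
print(encoded)  # [3, 1, 4]
```

Deriving the codes from one ordered list, rather than typing each integer by hand, makes it harder to assign a value that accidentally breaks the order.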

### D


In [25]:
import pandas as pd
# Toy column of clothing sizes
listOfSize = ["s","m","l","xl","s","s","l","s","m","l","xl","s","m","l","xl"]
df = pd.DataFrame({"Size":listOfSize})
data = df.copy()  # keep an untouched copy of the raw data
df.head(3)

Unnamed: 0,Size
0,s
1,m
2,l


In [26]:
# One inner list per column to encode, with categories in ascending order
uniqueSize = [["s","m","l","xl"]]

In [27]:
from sklearn.preprocessing import OrdinalEncoder


In [28]:
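# Codes follow the position in uniqueSize: s=0, m=1, l=2, xl=3
oe = OrdinalEncoder(categories=uniqueSize,dtype=int)
# fit_transform expects a 2-D input and returns an (n, 1) array; ravel() flattens it to one column
df["size_en"] = oe.fit_transform(df[["Size"]]).ravel()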

In [29]:
df

Unnamed: 0,Size,size_en
0,s,0
1,m,1
2,l,2
3,xl,3
4,s,0
5,s,0
6,l,2
7,s,0
8,m,1
9,l,2
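
One way to check an encoding like this is to inspect the fitted encoder and decode the integers again. The sketch below rebuilds a small, hypothetical sample rather than reusing `df`, and relies on the `categories_` attribute and `inverse_transform` method of `OrdinalEncoder`.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

sizes = pd.DataFrame({"Size": ["s", "m", "l", "xl", "s"]})  # small hypothetical sample
encoder = OrdinalEncoder(categories=[["s", "m", "l", "xl"]], dtype=int)
codes = encoder.fit_transform(sizes[["Size"]])

# categories_ holds one array per encoded column, in the order used for the codes.
learned = list(encoder.categories_[0])

# inverse_transform maps the integer codes back to the original labels.
decoded = encoder.inverse_transform(codes).ravel().tolist()
print(codes.ravel().tolist())
print(decoded)
```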


## Ordinal Encoding using Map

In [30]:

# The integers must follow the category order: s < m < l < xl
size_mapping = {'s': 0, 'm': 1, 'l': 2, 'xl': 3}
df['size_en_map'] = data['Size'].map(size_mapping)
df

Unnamed: 0,Size,size_en,size_en_map
0,s,0,0
1,m,1,1
2,l,2,2
3,xl,3,3
4,s,0,0
5,s,0,0
6,l,2,2
7,s,0,0
8,m,1,1
9,l,2,2
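
A caveat with the `map` approach is that any label missing from the dictionary silently becomes `NaN`. The following sketch shows one way to detect that, using a hypothetical `"xxl"` label that is not in the mapping.

```python
import pandas as pd

size_order = {"s": 0, "m": 1, "l": 2, "xl": 3}
sizes = pd.Series(["s", "xxl", "m"])  # "xxl" is a hypothetical unseen label

encoded = sizes.map(size_order)

# Labels missing from the dictionary come back as NaN,
# which also turns the encoded column into floats.
unmapped = sizes[encoded.isna()].tolist()
print(unmapped)
```

Checking for `NaN` right after mapping is a cheap safeguard before the column is passed to a model.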


## Working with a Larger Dataset

In [31]:
dataset = pd.read_csv("loan.csv")
dataset.head(3)

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849,0.0,,360.0,1.0,Urban,Y
1,LP001003,Male,Yes,1,Graduate,No,4583,1508.0,128.0,360.0,1.0,Rural,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000,0.0,66.0,360.0,1.0,Urban,Y


In [32]:
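# Map Property_Area to integer codes, here in order of first appearance.
# Note: if the column is treated as ordinal, Rural < Semiurban < Urban may be a more natural order.
print(dataset["Property_Area"].unique())
map_Property_Area = {"Urban":0,"Rural":1,"Semiurban":2}
en_data_ord = [['Urban', 'Rural', 'Semiurban']]  # same order, in the nested-list form OrdinalEncoder expects
print(map_Property_Area)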

['Urban' 'Rural' 'Semiurban']
{'Urban': 0, 'Rural': 1, 'Semiurban': 2}


In [33]:
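from sklearn.preprocessing import OrdinalEncoder
# Labels not listed in en_data_ord are encoded as -1 at transform time instead of raising an error
oen = OrdinalEncoder(categories=en_data_ord, handle_unknown='use_encoded_value', unknown_value=-1, dtype=int)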
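
The effect of `handle_unknown='use_encoded_value'` only shows up when the encoder meets a label it was not fitted on. The sketch below illustrates this with two tiny, made-up frames (`train` and `new`); `"Suburban"` is a hypothetical label that does not occur in the loan data.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

train = pd.DataFrame({"Property_Area": ["Urban", "Rural", "Semiurban"]})
new = pd.DataFrame({"Property_Area": ["Rural", "Suburban"]})  # "Suburban" is a made-up unseen label

enc = OrdinalEncoder(
    categories=[["Urban", "Rural", "Semiurban"]],
    handle_unknown="use_encoded_value",
    unknown_value=-1,
    dtype=int,
)
enc.fit(train)

# Known labels get their position in the category list; unseen labels get unknown_value.
new_codes = enc.transform(new).ravel().tolist()
print(new_codes)
```

Choosing a sentinel such as -1 keeps unknown labels distinguishable from every valid code, since valid codes start at 0.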


In [34]:
# Overwrite the text column with its integer codes
dataset["Property_Area"] = oen.fit_transform(dataset[["Property_Area"]])
dataset.head(100)

Unnamed: 0,Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
0,LP001002,Male,No,0,Graduate,No,5849,0.0,,360.0,1.0,0,Y
1,LP001003,Male,Yes,1,Graduate,No,4583,1508.0,128.0,360.0,1.0,1,N
2,LP001005,Male,Yes,0,Graduate,Yes,3000,0.0,66.0,360.0,1.0,0,Y
3,LP001006,Male,Yes,0,Not Graduate,No,2583,2358.0,120.0,360.0,1.0,0,Y
4,LP001008,Male,No,0,Graduate,No,6000,0.0,141.0,360.0,1.0,0,Y
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,LP001326,Male,No,0,Graduate,,6782,0.0,,360.0,,0,N
96,LP001327,Female,Yes,0,Graduate,No,2484,2302.0,137.0,360.0,1.0,2,Y
97,LP001333,Male,Yes,0,Graduate,No,1977,997.0,50.0,360.0,1.0,2,Y
98,LP001334,Male,Yes,0,Not Graduate,No,4188,0.0,115.0,180.0,1.0,2,Y
