# Ordinal number encoding
**Ordinal categorical variables**

Categorical variables whose categories can be meaningfully ordered are called ordinal. For example:

- A student's grade in an exam (A, B, C or Fail).
- Days of the week, which can be treated as ordinal with Monday = 1 and Sunday = 7.
- Educational level, with the categories Elementary school, High school, College graduate and PhD ranked from 1 to 4.

When the categorical variable is ordinal, the most straightforward approach is to replace each label with its ordinal number.
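
As a minimal sketch of this replacement, the snippet below maps the educational-level categories from the list above to the integers 1 to 4. The DataFrame `edu`, the column name `education` and the sample rows are made up for illustration and are not part of the data used later.

```python
import pandas as pd

# Hypothetical sample data; the column name and rows are illustrative only
edu = pd.DataFrame({
    'education': ['High school', 'PhD', 'Elementary school', 'College graduate']
})

# Ordered mapping from label to rank (1 = lowest level, 4 = highest)
education_rank = {
    'Elementary school': 1,
    'High school': 2,
    'College graduate': 3,
    'PhD': 4,
}

# Replace each label with its rank
edu['education_ordinal'] = edu['education'].map(education_rank)
print(edu)
```

The order lives entirely in the dictionary, so it has to be written down by hand; pandas cannot infer it from the labels.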

**Advantages**

- Keeps the semantic information of the variable (human-readable content)
- Straightforward

**Disadvantage**

- Does not add information that is valuable for machine learning

I will simulate some data below to demonstrate this encoding.

In [24]:
import pandas as pd
import datetime
import calendar

In [25]:
# create a variable with dates, and from that extract the weekday

In [26]:
df_base  = datetime.datetime.today()
df_base

datetime.datetime(2022, 11, 24, 21, 33, 35, 724044)

In [27]:
d = df_base - datetime.timedelta(days = 2)

d

datetime.datetime(2022, 11, 22, 21, 33, 35, 724044)

In [28]:
# create a list of 20 dates, counting back one day at a time from today
# then transform it into a dataframe

df_date_list = [df_base - datetime.timedelta(days=x) for x in range(20)]
df_date_list

[datetime.datetime(2022, 11, 24, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 23, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 22, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 21, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 20, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 19, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 18, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 17, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 16, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 15, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 14, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 13, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 12, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 11, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 10, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 9, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 8, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 7, 21, 33, 35, 724044),
 datetime.datetime(2022, 11, 6,

In [29]:
df = pd.DataFrame(df_date_list)
df.columns = ['day']
df

|  | day |
|---|---|
| 0 | 2022-11-24 21:33:35.724044 |
| 1 | 2022-11-23 21:33:35.724044 |
| 2 | 2022-11-22 21:33:35.724044 |
| 3 | 2022-11-21 21:33:35.724044 |
| 4 | 2022-11-20 21:33:35.724044 |
| 5 | 2022-11-19 21:33:35.724044 |
| 6 | 2022-11-18 21:33:35.724044 |
| 7 | 2022-11-17 21:33:35.724044 |
| 8 | 2022-11-16 21:33:35.724044 |
| 9 | 2022-11-15 21:33:35.724044 |
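
Because `datetime.datetime.today()` returns a different value on every run, the dates above will not reproduce exactly. One possible variant, sketched here under the assumption that a fixed end date is acceptable, builds the same kind of column with `pd.date_range`. The name `df_fixed` and the end date `2022-11-24` (taken from the output above) are illustrative choices.

```python
import pandas as pd

# Assumed fixed end date so the example gives the same result on every run
dates = pd.date_range(end='2022-11-24', periods=20, freq='D')

# Reverse the index so the most recent date comes first, as in the list above
df_fixed = pd.DataFrame({'day': dates[::-1]})
print(df_fixed.head())
```

The rest of the steps work unchanged on such a DataFrame, since the `day` column is still a datetime column.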


In [30]:
# extract the weekday name
# (alternative: df['day'].dt.strftime('%A'))

df['name_of_day'] = df['day'].dt.day_name()

df.head()

|  | day | name_of_day |
|---|---|---|
| 0 | 2022-11-24 21:33:35.724044 | Thursday |
| 1 | 2022-11-23 21:33:35.724044 | Wednesday |
| 2 | 2022-11-22 21:33:35.724044 | Tuesday |
| 3 | 2022-11-21 21:33:35.724044 | Monday |
| 4 | 2022-11-20 21:33:35.724044 | Sunday |


# Ordinal number encoding

In [32]:
# Encode the categorical variable by replacing each label with its ordinal number

weekday_dict = {
    'Monday': 1, 'Tuesday': 2, 'Wednesday': 3, 'Thursday': 4,
    'Friday': 5, 'Saturday': 6, 'Sunday': 7,
}

df['day_ordinal'] = df['name_of_day'].map(weekday_dict)
df.head(10)

|  | day | name_of_day | day_ordinal |
|---|---|---|---|
| 0 | 2022-11-24 21:33:35.724044 | Thursday | 4 |
| 1 | 2022-11-23 21:33:35.724044 | Wednesday | 3 |
| 2 | 2022-11-22 21:33:35.724044 | Tuesday | 2 |
| 3 | 2022-11-21 21:33:35.724044 | Monday | 1 |
| 4 | 2022-11-20 21:33:35.724044 | Sunday | 7 |
| 5 | 2022-11-19 21:33:35.724044 | Saturday | 6 |
| 6 | 2022-11-18 21:33:35.724044 | Friday | 5 |
| 7 | 2022-11-17 21:33:35.724044 | Thursday | 4 |
| 8 | 2022-11-16 21:33:35.724044 | Wednesday | 3 |
| 9 | 2022-11-15 21:33:35.724044 | Tuesday | 2 |
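
The dictionary mapping can be cross-checked against the weekday number that pandas derives from the date itself. The sketch below is one way to do that; it uses a separate, hypothetical DataFrame `check` with fixed dates so the result is reproducible, and the column name `day_ordinal_builtin` is an illustrative choice.

```python
import pandas as pd

# Fixed dates for reproducibility (the notebook itself starts from today's date)
check = pd.DataFrame({'day': pd.date_range('2022-11-15', periods=10, freq='D')})
check['name_of_day'] = check['day'].dt.day_name()

weekday_dict = {
    'Monday': 1, 'Tuesday': 2, 'Wednesday': 3, 'Thursday': 4,
    'Friday': 5, 'Saturday': 6, 'Sunday': 7,
}
check['day_ordinal'] = check['name_of_day'].map(weekday_dict)

# dt.dayofweek counts from Monday = 0, so adding 1 gives Monday = 1 ... Sunday = 7
check['day_ordinal_builtin'] = check['day'].dt.dayofweek + 1

# Both encodings should agree on every row
print((check['day_ordinal'] == check['day_ordinal_builtin']).all())
```

Note that `map` returns `NaN` for any label missing from the dictionary, so a misspelled key would show up as a missing value rather than raise an error.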
