## Categorical Features: 
Categorical features are of two types:

**1. Nominal:** Categories with no inherent order, so the values cannot be ranked or compared. **Eg.** Colour: red, green, blue

**2. Ordinal:** Categories with a meaningful order, so the values can be ranked. **Eg.** Size: S, L, XL, XXL
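
The difference between the two types can be made concrete with pandas' categorical dtype. The following is a minimal sketch, not part of the original notebook: the series names `color` and `size` are illustrative, and the category order for sizes is assumed from the example above.

```python
import pandas as pd

# Nominal: categories have no order, so only equality checks make sense.
color = pd.Series(pd.Categorical(['red', 'green', 'blue']))

# Ordinal: declare the order explicitly so that comparisons are meaningful.
size = pd.Series(pd.Categorical(['S', 'XL', 'L', 'XXL'],
                                categories=['S', 'L', 'XL', 'XXL'],
                                ordered=True))

is_red = (color == 'red').tolist()
larger_than_L = (size > 'L').tolist()

# The integer codes follow the declared category order.
size_codes = size.cat.codes.tolist()
```

Comparing an unordered categorical with `<` or `>` raises an error, which is a useful guard against treating nominal data as ranked.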

We encode these features as numeric values so that numerical computation and analysis can be applied to them, using:
- map()
- get_dummies()
- LabelEncoder()
- OneHotEncoder()

In [1]:
import pandas as pd

In [2]:
df = pd.DataFrame([['Green', 'S', 200, 'class1'],
                   ['Red', 'M', 250, 'class2'],
                   ['Blue', 'L', 350, 'class3'],
                   ['Red', 'M', 375, 'class3'],
                   ['Green', 'XL', 400, 'class2']])
df.columns = ['color', 'size', 'price', 'target']
df

,color,size,price,target
0,Green,S,200,class1
1,Red,M,250,class2
2,Blue,L,350,class3
3,Red,M,375,class3
4,Green,XL,400,class2


In [3]:
df.dtypes

color     object
size      object
price      int64
target    object
dtype: object

In [4]:
# The 'size' column holds an ordinal feature.

In [5]:
df['size'].unique()

array(['S', 'M', 'L', 'XL'], dtype=object)

### Label Encoding using map():

In [6]:
# Map each size label to an integer that preserves the order S < M < L < XL.
size_mapping = {'S': 1, 'M': 2, 'L': 3, 'XL': 4}
df['size'] = df['size'].map(size_mapping)
df

,color,size,price,target
0,Green,1,200,class1
1,Red,2,250,class2
2,Blue,3,350,class3
3,Red,2,375,class3
4,Green,4,400,class2
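
One thing to watch with map(): any label missing from the dictionary becomes NaN rather than raising an error. The sketch below illustrates this and shows how the mapping could be inverted to recover the labels. The series `sizes` and the unseen label 'XXL' are hypothetical and chosen only for illustration.

```python
import pandas as pd

size_mapping = {'S': 1, 'M': 2, 'L': 3, 'XL': 4}

# 'XXL' is a hypothetical label that is absent from size_mapping.
sizes = pd.Series(['S', 'M', 'XXL'])

# Labels without a dictionary entry are encoded as NaN.
encoded = sizes.map(size_mapping)
n_missing = int(encoded.isna().sum())

# Invert the dictionary to translate codes back into labels.
inverse_mapping = {code: label for label, code in size_mapping.items()}
decoded = encoded.dropna().astype(int).map(inverse_mapping)
```

Checking `n_missing` after encoding is a cheap way to catch labels that were left out of the mapping.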


In [7]:
# The 'color' column holds a nominal feature.

In [8]:
df['color'].unique()

array(['Green', 'Red', 'Blue'], dtype=object)

### One-Hot Encoding using get_dummies():

In [9]:
c_df = pd.get_dummies(df['color'])
c_df

,Blue,Green,Red
0,0,1,0
1,0,0,1
2,1,0,0
3,0,0,1
4,0,1,0


In [10]:
# Drop the target (last column) and keep only the numeric columns: size and price.
ex_o = df.iloc[:, :-1].select_dtypes(exclude=['object'])
ex_o

,size,price
0,1,200
1,2,250
2,3,350
3,2,375
4,4,400


In [11]:
df_new = pd.concat((c_df, ex_o), axis=1)
df_new

,Blue,Green,Red,size,price
0,0,1,0,1,200
1,0,0,1,2,250
2,1,0,0,3,350
3,0,0,1,2,375
4,0,1,0,4,400


### Label Encoding using LabelEncoder():

In [12]:
# The 'target' column holds the class labels (nominal, not ordinal).

In [18]:
df.iloc[:, 3]

0    class1
1    class2
2    class3
3    class3
4    class2
Name: target, dtype: object

In [19]:
from sklearn.preprocessing import LabelEncoder
# LabelEncoder assigns an integer code to each class label, in sorted label order.
encoder_x = LabelEncoder()
df['target'] = encoder_x.fit_transform(df['target'])
df

,color,size,price,target
0,Green,1,200,0
1,Red,2,250,1
2,Blue,3,350,2
3,Red,2,375,2
4,Green,4,400,1
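
A fitted LabelEncoder keeps the label-to-code correspondence, so the integers can be turned back into class names. The following is a minimal sketch that assumes scikit-learn is installed. The `targets` list repeats the target column so that the snippet runs on its own.

```python
from sklearn.preprocessing import LabelEncoder

targets = ['class1', 'class2', 'class3', 'class3', 'class2']

encoder = LabelEncoder()
codes = encoder.fit_transform(targets)

# classes_ lists the labels in sorted order; a label's position is its integer code.
classes = list(encoder.classes_)

# inverse_transform maps the integer codes back to the original labels.
decoded = list(encoder.inverse_transform(codes))
```

Because the codes follow sorted label order rather than any meaningful ranking, LabelEncoder is generally suited to target labels rather than to ordinal input features such as size.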
