# One Hot Encoder
A one-hot encoder converts integer-coded categorical data, such as the output of a label encoder, into binary indicator variables that take the value 0 or 1.

##### It splits a column of integer-coded categories into several columns, one for each distinct category present in that column.
##### Each new column holds “1” in the rows that belong to its category and “0” everywhere else.
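
To make the column splitting concrete, here is a minimal sketch that builds the indicator columns by hand with NumPy. The codes and the country mapping in the comments are illustrative assumptions; this is not how scikit-learn implements its encoder.

```python
import numpy as np

# Integer category codes, as a label encoder might produce them
# (assumed mapping for illustration: 0 = France, 1 = Germany, 2 = Spain).
codes = np.array([0, 2, 0, 1])

# One output column per distinct category.
n_categories = codes.max() + 1

# Row i of the identity matrix is the one-hot vector for category i,
# so indexing the identity matrix with the codes gives one row per sample.
one_hot = np.eye(n_categories, dtype=int)[codes]
print(one_hot)
```

Every row contains exactly one 1, placed in the column of that row's category.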

#### Note :-
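1. The one-hot encoder does not accept a 1-dimensional array or a pandas Series; the input must always be 2-dimensional.
2. Older scikit-learn releases (before 0.20) required the data passed to the encoder to be numeric, which is why the columns are label-encoded first here. Current releases also accept string categories directly.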
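
The 2-dimensional requirement from the first note can be sketched as follows. The sample codes are made up for illustration, and the `ValueError` check reflects the behaviour of recent scikit-learn releases as far as I know.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

codes_1d = np.array([1, 1, 0, 0, 1])  # shape (5,): a flat array
codes_2d = codes_1d.reshape(-1, 1)    # shape (5, 1): five rows, one column

encoder = OneHotEncoder()

# The flat array is expected to be rejected because it is not 2-dimensional.
rejected = False
try:
    encoder.fit_transform(codes_1d)
except ValueError:
    rejected = True

# fit_transform returns a SciPy sparse matrix by default;
# toarray() converts it to a dense NumPy array.
encoded = encoder.fit_transform(codes_2d).toarray()
print(rejected, encoded.shape)
```

For a pandas DataFrame, selecting with double brackets (for example `data[['Gender']]`) keeps the result 2-dimensional, whereas `data['Gender']` returns a 1-dimensional Series.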

In [1]:
import numpy as np
import pandas as pd

In [2]:
data = pd.read_csv('encoder_datasets.csv')

In [3]:
data

Unnamed: 0,Gender,Location
0,M,France
1,M,Spain
2,M,France
3,F,Spain
4,F,France
5,M,Germany
6,F,France
7,F,Spain
8,M,France
9,F,Spain


In [4]:
from sklearn.preprocessing import LabelEncoder

In [5]:
le = LabelEncoder()

In [6]:
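# fit_transform refits the encoder on every call, so after these two lines
# `le` only remembers the Location labels.
data['Gender'] = le.fit_transform(data['Gender'])
data['Location'] = le.fit_transform(data['Location'])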

In [7]:
data

Unnamed: 0,Gender,Location
0,1,0
1,1,2
2,1,0
3,0,2
4,0,0
5,1,1
6,0,0
7,0,2
8,1,0
9,0,2
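
The integer codes above follow the sorted order of the labels. A small sketch of how to inspect that mapping, assuming one encoder is kept per column (the `encoders` dictionary and the shortened sample frame are hypothetical, not part of the original notebook):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# A shortened sample with the same labels as the dataset above.
df = pd.DataFrame({
    "Gender": ["M", "M", "F", "F", "M"],
    "Location": ["France", "Spain", "Spain", "France", "Germany"],
})

# Keep a separate fitted encoder per column so each mapping stays available.
encoders = {}
for column in ["Gender", "Location"]:
    enc = LabelEncoder()
    df[column] = enc.fit_transform(df[column])
    encoders[column] = enc

# classes_ lists the labels in sorted order; the position is the integer code.
print(list(encoders["Location"].classes_))

# inverse_transform maps the codes back to the original labels.
print(list(encoders["Location"].inverse_transform([0, 1, 2])))
```

Keeping one encoder per column makes it possible to decode either column later, which a single reused encoder does not allow.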


In [8]:
from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer 

In [9]:
# Create a column transformer that one-hot encodes column 0 (Gender)
# and passes the remaining column (Location) through unchanged.
columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')

In [10]:
ohe = OneHotEncoder()

In [11]:
# Convert the transformed result to an array of strings.
data = np.array(columnTransformer.fit_transform(data), dtype=str)

In [12]:
data

array([['0.0', '1.0', '0.0'],
       ['0.0', '1.0', '2.0'],
       ['0.0', '1.0', '0.0'],
       ['1.0', '0.0', '2.0'],
       ['1.0', '0.0', '0.0'],
       ['0.0', '1.0', '1.0'],
       ['1.0', '0.0', '0.0'],
       ['1.0', '0.0', '2.0'],
       ['0.0', '1.0', '0.0'],
       ['1.0', '0.0', '2.0'],
       ['0.0', '1.0', '1.0']], dtype='<U32')
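
To see which output column belongs to which category, the fitted encoder can be inspected through the transformer. The following is a minimal sketch on a shortened, already label-encoded sample; the frame `df` and the variable names are illustrative assumptions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Label-encoded sample: Gender (0 = F, 1 = M), Location (0, 1, 2).
df = pd.DataFrame({
    "Gender": [1, 1, 0, 0, 1],
    "Location": [0, 2, 2, 0, 1],
})

ct = ColumnTransformer(
    [("encoder", OneHotEncoder(), [0])],  # one-hot encode column 0 only
    remainder="passthrough",              # keep the other column as-is
)
result = ct.fit_transform(df)

# The transformer may return a sparse matrix when the output is mostly zeros.
if hasattr(result, "toarray"):
    result = result.toarray()

# categories_ holds one array of category values per encoded column;
# its order matches the order of the new indicator columns.
fitted = ct.named_transformers_["encoder"]
print(fitted.categories_)
print(result)
```

The encoded columns come first in the output, followed by the passed-through column, which matches the layout of the array shown above.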