### One Hot Encoding (OHE)
One-hot encoding is a technique used in machine learning and data preprocessing to convert categorical (non-numeric) data into a numerical format that models can understand.

### How it works
- It represents each category as a binary vector where:
  - One position is 1 (hot)
  - All others are 0
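
The rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation; the function name `one_hot` and the colour categories are assumptions made up for the example.

```python
def one_hot(index, size):
    """Return a list of length `size` with a 1 at `index` and 0 everywhere else."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} is outside the range 0..{size - 1}")
    vector = [0] * size
    vector[index] = 1  # the single "hot" position
    return vector


# Hypothetical categories, used only for illustration.
categories = ["red", "green", "blue"]
encoded = {name: one_hot(i, len(categories)) for i, name in enumerate(categories)}

for name, vector in encoded.items():
    print(name, vector)
```

Each category gets a vector as long as the number of categories, which is why the vectors grow with the size of the category set.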

### Example of OHE
- Assume we have to calculate the Vector representation of the following Documents
- 1) Document-1 = The food is good
- 2) Document-2 = The food is bad
- 3) Document-3 = Pizza is amazing


- Step 1) Calculate the Vocabulary (unique words) from these Documents
    - Vocabulary = The food is good bad pizza amazing
    - Words are compared case-insensitively here, so "Pizza" and "pizza" count as the same word.

- Step 2) Calculate the Vector Matrix for the Vocabulary
<table border="1" cellpadding="6" cellspacing="0">
  <thead>
    <tr>
      <th>Token</th>
      <th>The</th>
      <th>food</th>
      <th>is</th>
      <th>good</th>
      <th>bad</th>
      <th>pizza</th>
      <th>amazing</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>The</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td>
    </tr>
    <tr>
      <td>food</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td>
    </tr>
    <tr>
      <td>is</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td>
    </tr>
    <tr>
      <td>good</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td>
    </tr>
    <tr>
      <td>bad</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td>
    </tr>
    <tr>
      <td>pizza</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td>
    </tr>
    <tr>
      <td>amazing</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td>
    </tr>
  </tbody>
</table>
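
Steps 1 and 2 can be reproduced with a short sketch in plain Python. It assumes tokens are obtained by lowercasing and splitting on whitespace, and the helper name `build_vocabulary` is made up for this example.

```python
documents = [
    "The food is good",
    "The food is bad",
    "Pizza is amazing",
]


def build_vocabulary(docs):
    """Collect unique lowercase tokens in order of first appearance."""
    vocabulary = []
    for doc in docs:
        for token in doc.lower().split():
            if token not in vocabulary:
                vocabulary.append(token)
    return vocabulary


vocabulary = build_vocabulary(documents)
size = len(vocabulary)

# Row i is the one-hot vector of vocabulary[i], so the matrix is an identity matrix.
vocab_matrix = [[1 if row == col else 0 for col in range(size)] for row in range(size)]

print(vocabulary)
for token, row in zip(vocabulary, vocab_matrix):
    print(f"{token:8s} {row}")
```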

- Step 3) Calculate the Vector representation of all the Documents one by one
- Step 3.1) Vector representation of Document-1
<table>
  <tr><td>[</td><td></td></tr>
  <tr><td>[1 0 0 0 0 0 0],</td><td>the</td></tr>
  <tr><td>[0 1 0 0 0 0 0],</td><td>food</td></tr>
  <tr><td>[0 0 1 0 0 0 0],</td><td>is</td></tr>
  <tr><td>[0 0 0 1 0 0 0]</td><td>good</td></tr>
  <tr><td>]</td><td></td></tr>
</table>

- Step 3.2) Vector representation of Document-2
<table>
  <tr><td>[</td><td></td></tr>
  <tr><td>[1 0 0 0 0 0 0],</td><td>the</td></tr>
  <tr><td>[0 1 0 0 0 0 0],</td><td>food</td></tr>
  <tr><td>[0 0 1 0 0 0 0],</td><td>is</td></tr>
  <tr><td>[0 0 0 0 1 0 0]</td><td>bad</td></tr>
  <tr><td>]</td><td></td></tr>
</table>

- Step 3.3) Vector representation of Document-3
<table>
  <tr><td>[</td><td></td></tr>
  <tr><td>[0 0 0 0 0 1 0],</td><td>pizza</td></tr>
  <tr><td>[0 0 1 0 0 0 0],</td><td>is</td></tr>
  <tr><td>[0 0 0 0 0 0 1]</td><td>amazing</td></tr>
  <tr><td>]</td><td></td></tr>
</table>

**Note**
- 1) Document-1 Vector Matrix dimensions = 4(rows) X 7(columns)
- 2) Document-2 Vector Matrix dimensions = 4(rows) X 7(columns)
- 3) Document-3 Vector Matrix dimensions = 3(rows) X 7(columns)
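
The per-document matrices and their dimensions can be checked with a small sketch. It assumes the 7-word vocabulary from Step 1 in lowercase and whitespace tokenization; `encode_document` is a hypothetical helper name.

```python
vocabulary = ["the", "food", "is", "good", "bad", "pizza", "amazing"]
token_to_index = {token: i for i, token in enumerate(vocabulary)}


def encode_document(doc):
    """Return one one-hot row per token of `doc` (lowercased, split on whitespace)."""
    rows = []
    for token in doc.lower().split():
        row = [0] * len(vocabulary)
        row[token_to_index[token]] = 1
        rows.append(row)
    return rows


documents = ["The food is good", "The food is bad", "Pizza is amazing"]

shapes = []
for doc in documents:
    matrix = encode_document(doc)
    shapes.append((len(matrix), len(matrix[0])))  # (rows, columns)
    print(doc, "->", shapes[-1])
```

The column count is always the vocabulary size (7), while the row count follows the number of words in each document.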

### Advantages of OHE
- 1) Very easy to implement using Python. The following libraries are commonly used:
    - (1) sklearn - OneHotEncoder class
    - (2) pandas - get_dummies() function
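
As a rough sketch of the sklearn route, the example below fits `OneHotEncoder` on the vocabulary from the earlier example and encodes Document-1. It assumes `numpy` and `scikit-learn` are installed. Note that the encoder sorts its categories, so the column order differs from the hand-built table above.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# OneHotEncoder expects a 2-D array of shape (n_samples, n_features),
# so each token goes in its own row of a single-column array.
vocabulary = ["the", "food", "is", "good", "bad", "pizza", "amazing"]
tokens = np.array(vocabulary).reshape(-1, 1)

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(tokens)

# transform() returns a SciPy sparse matrix by default; toarray() makes it dense.
document = np.array("the food is good".split()).reshape(-1, 1)
doc_matrix = encoder.transform(document).toarray()

print(encoder.categories_[0])  # learned categories, in sorted order
print(doc_matrix.shape)        # (number of tokens, vocabulary size)
```

With `handle_unknown="ignore"`, a word that was not seen during `fit` is encoded as a row of all zeros instead of raising an error.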

### Disadvantages of OHE
- 1) ***Sparse Matrix*** : OHE produces a Sparse Matrix with one column per word in the Vocabulary. As the Vocabulary grows, the number of features grows with it, which wastes memory and can contribute to Overfitting.
- 2) ***Number of features in OHE is not fixed*** :
  - From the previous example 
    - a) Document-1 Vector Matrix dimensions = 4(rows) X 7(columns)
    - b) Document-2 Vector Matrix dimensions = 4(rows) X 7(columns)
    - c) Document-3 Vector Matrix dimensions = 3(rows) X 7(columns)
    - The number of rows depends on the number of words in the document, so different documents produce inputs of different sizes. Most Machine Learning Algorithms expect every input to have the same dimensions, so they cannot be trained directly on these matrices.

- 3) ***No semantic meaning gets captured in OHE*** :
  - Example: 
  - <img src="./resources/images/ohe_disadvantage_1.png">



- 4) ***Out of Vocabulary*** : We cannot create a vector for a new word that does not exist in the Vocabulary.
  - e.g. Vocabulary (Training data) = food pizza amazing
  - New Document (Test Data) = Burger is bad
  - Here Burger is a new word that does not exist in our Vocabulary, so it has no one-hot vector and we cannot perform OHE on this document.
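
Disadvantages 3 and 4 can be checked with a short sketch. It reuses the 7-word vocabulary from the earlier example; the helper names and the choice to return `None` for an unknown word are assumptions made for illustration.

```python
vocabulary = ["the", "food", "is", "good", "bad", "pizza", "amazing"]
token_to_index = {token: i for i, token in enumerate(vocabulary)}


def one_hot(token):
    """Return the one-hot vector for `token`, or None if it is out of vocabulary."""
    index = token_to_index.get(token.lower())
    if index is None:
        return None
    vector = [0] * len(vocabulary)
    vector[index] = 1
    return vector


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# No semantic meaning: every pair of different words has dot product 0,
# so "good" is exactly as far from "amazing" as it is from "bad".
print(dot(one_hot("good"), one_hot("amazing")))
print(dot(one_hot("good"), one_hot("bad")))

# Out of vocabulary: "Burger" was never seen, so there is no vector for it.
print(one_hot("Burger"))
```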


### Definitions
- ***Sparse Matrix***
  - A sparse matrix is a matrix in which most of the elements are zero. Instead of storing every value (including all those zeros), we store only the non-zero elements, which saves memory and computation time.

- ***Overfitting***
  - Overfitting is a machine learning problem where the model learns the training data too closely, including its noise. It performs very well on the training data but poorly on unseen test data.
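
To make the Sparse Matrix definition concrete, here is a minimal sketch that stores only the non-zero entries of the Document-1 matrix as a dictionary keyed by (row, column). The function names `to_sparse` and `to_dense` are made up for this example; real libraries such as SciPy use more compact formats.

```python
def to_sparse(matrix):
    """Return (shape, entries) where entries maps (row, col) to each non-zero value."""
    entries = {
        (r, c): value
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value != 0
    }
    return (len(matrix), len(matrix[0])), entries


def to_dense(shape, entries):
    """Rebuild the full matrix from its shape and non-zero entries."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for (r, c), value in entries.items():
        dense[r][c] = value
    return dense


# One-hot matrix of Document-1 ("The food is good") from the example above.
document_1 = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]

shape, entries = to_sparse(document_1)
print(len(entries), "non-zero values out of", shape[0] * shape[1])
```

Only 4 of the 28 cells need to be stored, and `to_dense(shape, entries)` recovers the original matrix.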
