# One-Hot Encoding in Machine Learning

## What is One-Hot Encoding?
One-hot encoding represents a categorical variable as binary vectors. Each unique category maps to a vector whose length equals the number of categories, with every element set to 0 except the one at that category's position, which is set to 1.

### Example
Given three categories: `["Red", "Green", "Blue"]`, one-hot encoding would represent them as:

- **Red**: `[1, 0, 0]`
- **Green**: `[0, 1, 0]`
- **Blue**: `[0, 0, 1]`
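
The mapping above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the helper name `one_hot` is hypothetical and not part of any library, and the position of the `1` simply follows the order of the category list.

```python
# Hypothetical helper (not a library function): build the one-hot vector
# for `value`, using the order of `categories` to decide where the 1 goes.
def one_hot(value, categories):
    return [1 if value == category else 0 for category in categories]

categories = ["Red", "Green", "Blue"]

# Encode every category once to reproduce the list above.
encoded = {color: one_hot(color, categories) for color in categories}

for color, vector in encoded.items():
    print(color, vector)
```

Changing the order of `categories` changes which position holds the `1`, but each vector still contains exactly one `1`.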

---

## Why Use One-Hot Encoding?
One-hot encoding is crucial because many machine learning algorithms work with numerical data and cannot process categorical text directly. Its benefits include:

### 1. No Ordinality Assumption
Unlike label encoding, which maps categories to integers such as 0, 1, and 2, one-hot encoding does not imply any order or ranking between categories. This prevents a model from treating arbitrary integer codes as meaningful magnitudes.
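
One way to see the difference is to compare distances between encoded categories. The sketch below assumes an arbitrary label assignment (Red=0, Green=1, Blue=2) purely for illustration; the helper `pairwise_distances` is hypothetical.

```python
from itertools import combinations
from math import dist

# Assumed label encoding for illustration: the integer order is arbitrary.
label_encoded = {"Red": (0,), "Green": (1,), "Blue": (2,)}
one_hot_encoded = {"Red": (1, 0, 0), "Green": (0, 1, 0), "Blue": (0, 0, 1)}

# Hypothetical helper: Euclidean distance between every pair of categories.
def pairwise_distances(encoding):
    return {(a, b): dist(encoding[a], encoding[b])
            for a, b in combinations(encoding, 2)}

label_distances = pairwise_distances(label_encoded)
one_hot_distances = pairwise_distances(one_hot_encoded)

print(label_distances)    # Red-Blue is twice as far apart as Red-Green
print(one_hot_distances)  # every pair is the same distance apart
```

With label encoding, Red appears "closer" to Green than to Blue only because of the integers chosen. With one-hot encoding, all categories are equidistant.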

### 2. Compatibility
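Models that compute weighted sums of their inputs, such as linear regression and neural networks, treat every numeric input as a magnitude. With one-hot encoding, each category becomes its own binary feature and therefore gets its own learned weight, instead of all categories sharing one weight applied to an arbitrary integer code.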

---

## Algorithm for One-Hot Encoding
To perform one-hot encoding, follow these steps:

1. **Identify the categorical variable(s)** in the dataset.
2. **Determine the unique categories** in each variable.
3. **Create a binary column** for each unique category.
4. **Assign values**:
   - Assign a value of `1` for the category present in a given observation.
   - Assign a value of `0` for all other categories.
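
The steps above can be sketched in plain Python for a single categorical variable. The function name `one_hot_columns` and the sample values are illustrative assumptions, and sorting the categories is just one way to get a stable column order.

```python
# Hypothetical helper: one-hot encode a list of categorical values
# into a dict that maps each category to its binary column.
def one_hot_columns(values):
    # Step 2: determine the unique categories (sorted for a stable order).
    categories = sorted(set(values))
    # Steps 3 and 4: one binary column per category, with 1 where the
    # observation belongs to that category and 0 everywhere else.
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }

# Step 1: the categorical variable (illustrative values).
observations = ["pass", "fail", "pass", "absent"]

columns = one_hot_columns(observations)
for category, column in columns.items():
    print(category, column)
```

Each observation has a `1` in exactly one column, so the columns sum to 1 across every row.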




In [7]:
import pandas as pd
import numpy as np

In [9]:
data = pd.read_csv('student.csv')

In [11]:
data

| Unnamed: 0 | result |
|---|---|
| 0 | pass |
| 1 | fail |
| 2 | pass |
| 3 | pass |
| 4 | absent |
| 5 | fail |
| 6 | fail |
| 7 | pass |
| 8 | pass |
| 9 | absent |
| 10 | pass |


In [13]:
result_category = data['result']

In [19]:
from sklearn.preprocessing import LabelEncoder
obj = LabelEncoder()
result = obj.fit_transform(result_category)

In [21]:
result

array([2, 1, 2, 2, 0, 1, 1, 2, 2, 0, 2])
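
The integers in this output follow the sorted order of the class names, so `absent` is 0, `fail` is 1, and `pass` is 2. A small sketch for inspecting that mapping through the encoder's `classes_` attribute and reversing it with `inverse_transform` is shown below. It uses a short hand-written list instead of `student.csv`, and the variable names are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

# Illustrative values; the notebook reads them from student.csv instead.
results = ["pass", "fail", "pass", "pass", "absent"]

encoder = LabelEncoder()
codes = encoder.fit_transform(results)

# classes_ holds the sorted class names; a name's position is its integer code.
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)

# inverse_transform turns the integer codes back into the original labels.
decoded = [str(label) for label in encoder.inverse_transform(codes)]
print(decoded)
```

Because the codes are ordered integers, they can suggest a ranking (`absent < fail < pass`) that the data does not contain, which is the motivation for the binarized form in the next cell.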

In [23]:
from sklearn.preprocessing import LabelBinarizer
obj = LabelBinarizer()
result = obj.fit_transform(result_category)


In [25]:
result

array([[0, 0, 1],
       [0, 1, 0],
       [0, 0, 1],
       [0, 0, 1],
       [1, 0, 0],
       [0, 1, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 0, 1],
       [1, 0, 0],
       [0, 0, 1]])
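
Each row of this array contains a single `1`, in the column order `absent`, `fail`, `pass`. As an alternative that keeps the category names as column headers, pandas offers `get_dummies`. The sketch below is one possible approach, not the notebook's method, and it uses a short hand-written Series instead of `student.csv`.

```python
import pandas as pd

# Illustrative values; the notebook reads them from student.csv instead.
results = pd.Series(["pass", "fail", "pass", "pass", "absent"], name="result")

# One binary column per category; dtype=int yields 0/1 instead of booleans.
dummies = pd.get_dummies(results, dtype=int)
print(dummies)
```

The resulting DataFrame has one column per category, named after the category, which makes the encoded features easier to read than a bare array.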