# 🌟 1. What is Label Encoding?

- ##### **Label Encoding** is a **Categorical data preprocessing technique** used to **convert non-numeric labels** (like “Male”, “Female”, “Yes”, “No”, “Red”, “Blue”) **into numeric form** so that machine learning models can understand them.

## 📘 Why?
- ##### Most ML algorithms (like Decision Trees, SVMs, or Neural Networks) work only with numbers, not text. So we need to convert text categories → numbers.
  
<Br>
<Hr>
<br>

# 🧩 2. When to Use Label Encoding

- ##### Use Label Encoding when:

    - ##### The **categorical feature has an ordinal relationship** (meaning the order matters). Example: “Low”, “Medium”, “High” → 0, 1, 2.

    - ##### Or sometimes for **non-ordinal data**, but **only with algorithms that can handle arbitrary numeric codes internally (like Tree-based models)**.

##### 💀 Avoid Label Encoding for **non-ordinal** features in models like **Linear Regression**, because it might introduce **false numeric relationships**.
<Br>
<Hr>
<br>

# 🧠 3. The Real Truth About Label Encoding

- ### At first glance, **LabelEncoder** looks like it is meant for **categorical** features whose categories **have an ordinal relationship**, but in practice this is **`counterintuitive`**: it assigns codes in alphabetical (sorted) order, so low → 1, medium → 2, and high → 0, which breaks the real-world ordering.

- ### **On top of that, LabelEncoder offers no way to specify a custom order; the default alphabetical order is the only one available.**

- ### So in reality it is not meant for ordinal data. **In fact, it is better suited to nominal data used with tree-based algorithms.**
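
The alphabetical ordering described above can be checked directly. The snippet below is a minimal sketch; the `levels` list and the `mapping` dictionary are illustrative names, not part of the original notebook.

```python
from sklearn.preprocessing import LabelEncoder

# Illustrative data: three levels with an obvious real-world order
levels = ["low", "medium", "high"]

le = LabelEncoder()
codes = le.fit_transform(levels)

# classes_ holds the sorted unique labels; a label's position is its code
mapping = {str(label): code for code, label in enumerate(le.classes_)}
print(mapping)  # {'high': 0, 'low': 1, 'medium': 2}
```

The codes follow the sorted label order, not the order low < medium < high that the data implies.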

### 📊 Comparison with Other Encoders

| Encoder Type    | Preserves Order? | Suitable for          | Output Example                  |
| --------------- | ---------------- | --------------------- | ------------------------------- |
| Label Encoder   | ❌ No             | Target/Single Feature | `Red → 0, Blue → 1`             |
| Ordinal Encoder | ✅ Yes            | Ordered Features      | `Low → 0, Medium → 1, High → 2` |
| One-Hot Encoder | ❌ No             | Unordered Features    | `[1,0,0], [0,1,0], [0,0,1]`     |
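
To make the comparison concrete, here is a small sketch of the other two encoders on the same kind of data. The array `X` and the variable names are assumptions chosen for illustration; the `categories` argument of `OrdinalEncoder` is what lets you state the order explicitly.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Illustrative single-feature input; encoders expect a 2D array
X = np.array([["Low"], ["Medium"], ["High"], ["Low"]])

# OrdinalEncoder: the order is given explicitly, one list per feature
ordinal = OrdinalEncoder(categories=[["Low", "Medium", "High"]])
ordinal_codes = ordinal.fit_transform(X)  # Low → 0, Medium → 1, High → 2

# OneHotEncoder: one binary column per category, no order implied
onehot = OneHotEncoder()
onehot_matrix = onehot.fit_transform(X).toarray()

print(ordinal_codes.ravel())
print(onehot.categories_)  # columns follow the sorted categories
print(onehot_matrix)
```

Each row of `onehot_matrix` contains exactly one 1, so no category is numerically "larger" than another.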






# ➡️ 4. Now Comes the Implementation

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# creating data for the DataFrame (this data has an ordinal relationship)
severity = ["Low", "Medium", "High", "Low", "Medium", "High", "Highest"]

# now creating the DataFrame
df = pd.DataFrame({"Severity": severity})
df
```

| Index | Severity |
| ----- | -------- |
| 0     | Low      |
| 1     | Medium   |
| 2     | High     |
| 3     | Low      |
| 4     | Medium   |
| 5     | High     |
| 6     | Highest  |


```python
# Label Encoding
le = LabelEncoder()
df["Severity"] = le.fit_transform(df["Severity"])
df
```

| Index | Severity |
| ----- | -------- |
| 0     | 2        |
| 1     | 3        |
| 2     | 0        |
| 3     | 2        |
| 4     | 3        |
| 5     | 0        |
| 6     | 1        |


```python
df["Severity"].unique()
# array([2, 3, 0, 1])
```

## 🧾 Inverse Transformation
- You can also decode encoded labels back to their original form using:

```python
decoded = le.inverse_transform(df["Severity"].unique())
decoded
# array(['Low', 'Medium', 'High', 'Highest'], dtype=object)
```
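
The same call also works on a whole column, not just on the unique codes. The sketch below rebuilds the example data and checks the round trip; the column names `Severity_code` and `Severity_decoded` are hypothetical, added only for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

severity = ["Low", "Medium", "High", "Low", "Medium", "High", "Highest"]
df = pd.DataFrame({"Severity": severity})

le = LabelEncoder()
# Keep the original column and store the codes next to it (hypothetical names)
df["Severity_code"] = le.fit_transform(df["Severity"])
df["Severity_decoded"] = le.inverse_transform(df["Severity_code"])

# True when every decoded value matches the original label
round_trip_ok = (df["Severity"] == df["Severity_decoded"]).all()
print(round_trip_ok)
```

Keeping the codes in a separate column avoids overwriting the original labels, which makes this kind of check easy.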


## ⚡ Label Encoding Multiple Columns Automatically

If you want to encode all categorical columns at once:

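```python
# Fit one LabelEncoder per column so each mapping can be inverted later
encoders = {col: LabelEncoder() for col in df.select_dtypes(include="object")}
df_encoded = df.copy()
for col, enc in encoders.items():
    df_encoded[col] = enc.fit_transform(df[col])
print(df_encoded)
```
This fits a separate LabelEncoder for every column of type object (string) and keeps each one in `encoders`. Reusing a single encoder for all columns would retain only the last column's mapping, so the earlier columns could no longer be decoded.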

## 🚫 Limitations of Label Encoding


| Limitation                       | Explanation                                                                                         |
| -------------------------------- | --------------------------------------------------------------------------------------------------- |
| ❌ Creates ordinal relationships  | For non-ordinal data (like Country), “France=0”, “Germany=1”, “Spain=2” introduces false hierarchy. |
| ❌ Not suitable for linear models | Algorithms may interpret “Spain” > “France” numerically, which is meaningless.                      |
| ⚠️ Must handle unseen categories | If new data has a category unseen during fitting, `transform` raises a `ValueError`.                |
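
The unseen-category limitation can be reproduced in a few lines. This is a sketch with made-up `train` and `new` lists; the fallback shown uses `OrdinalEncoder` with `handle_unknown="use_encoded_value"` (available in scikit-learn 0.24 and later), which is one possible workaround rather than the only one.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Illustrative data: "Italy" never appears during fitting
train = ["France", "Germany", "Spain"]
new = ["Germany", "Italy"]

le = LabelEncoder().fit(train)
try:
    le.transform(new)
    label_encoder_failed = False
except ValueError:
    # LabelEncoder refuses labels it did not see during fit
    label_encoder_failed = True

# OrdinalEncoder can map unknown categories to a reserved code instead
oe = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
oe.fit(np.array(train).reshape(-1, 1))
new_codes = oe.transform(np.array(new).reshape(-1, 1)).ravel()

print(label_encoder_failed)
print(new_codes)  # Germany keeps its fitted code, Italy becomes -1
```

Whether -1 is a sensible code for unknown categories depends on the model, so treat the reserved value as an assumption to revisit.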


## 🧠 Summary Table

| Feature                | Label Encoding           | One-Hot Encoding       |
| ---------------------- | ------------------------ | ---------------------- |
| Output Type            | Integer (0, 1, 2...)     | Binary columns         |
| Keeps Ordinal Info     | ❌ No (alphabetical codes) | ❌ No                   |
| For Non-Ordinal Data   | ⚠️ Tree-based models only | ✅ Use                  |
| Affected by Model Type | Yes                      | No                     |
| Example Use Case       | Target labels; nominal features with tree-based models | Country, Color, Gender |
