# BINNING

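**Definition:**  
- Groups a range of numerical values into a small number of intervals called **bins**.  
- Converts **numerical data into categorical data**, because each value is replaced by the label of the bin it falls into.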

---

## Example i) Grouping (Numerical to Categorical)

This example shows how numerical **Price** data is converted into categorical "bins" (**HIGH, MEDIUM, LOW**).

| S.No. | Price | → | S.No. | Price | Price Binned |
|-------|-------|---|-------|-------|--------------|
| 1     | 1000  |   | 1     | 1000  | HIGH         |
| 2     | 600   |   | 2     | 600   | MEDIUM       |
| 3     | 500   |   | 3     | 500   | MEDIUM       |
| 4     | 100   |   | 4     | 100   | LOW          |
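
The table above can be reproduced with `pd.cut`. This is a minimal sketch: the notes do not state the cut-off prices, so the edges `300` and `800` are assumptions chosen only so that the labels match the table, and the names `prices`, `bin_edges` and `bin_labels` are illustrative.

```python
import pandas as pd

# Price values taken from the example table
prices = pd.DataFrame({"S.No.": [1, 2, 3, 4], "Price": [1000, 600, 500, 100]})

# Assumed cut-offs (not given in the notes):
# (0, 300] is LOW, (300, 800] is MEDIUM, anything above 800 is HIGH
bin_edges = [0, 300, 800, float("inf")]
bin_labels = ["LOW", "MEDIUM", "HIGH"]

# pd.cut replaces each price with the label of the interval it falls into
prices["Price Binned"] = pd.cut(prices["Price"], bins=bin_edges, labels=bin_labels)
print(prices)
```

With different edges the same prices could land in different bins, so the cut-offs should come from the problem at hand rather than from this sketch.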

---

## Example ii) Converting Categorical into Numerical (One-Hot Encoding)

This example shows how a categorical feature (**FUEL**) is converted into multiple **binary numerical features** (**GAS** and **DIESEL**) using One-Hot Encoding.

| S.No. | FUEL   | → | S.No. | FUEL   | GAS | DIESEL |
|-------|--------|---|-------|--------|-----|--------|
| 1     | GAS    |   | 1     | GAS    | 1   | 0      |
| 2     | DIESEL |   | 2     | DIESEL | 0   | 1      |
| 3     | DIESEL |   | 3     | DIESEL | 0   | 1      |
| 4     | GAS    |   | 4     | GAS    | 1   | 0      |
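
The FUEL table above can be sketched with `pd.get_dummies`. The DataFrame name `cars` is illustrative, and `dtype=int` is passed so that the new columns hold 1/0 as in the table rather than `True`/`False`.

```python
import pandas as pd

# FUEL values taken from the example table
cars = pd.DataFrame({"S.No.": [1, 2, 3, 4], "FUEL": ["GAS", "DIESEL", "DIESEL", "GAS"]})

# One column per category; dtype=int gives 1/0 instead of True/False.
# get_dummies sorts the columns alphabetically, so reorder them to match the table.
fuel_dummies = pd.get_dummies(cars["FUEL"], dtype=int)[["GAS", "DIESEL"]]

# Attach the new binary columns next to the original data
encoded = pd.concat([cars, fuel_dummies], axis=1)
print(encoded)
```

Each row has exactly one 1 across the `GAS` and `DIESEL` columns, which is what makes the encoding "one-hot".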


## Numeric to Categorical

```python
import pandas as pd
path = "D:/tutedude/Data Analysis/CSV/numeric.csv"
df = pd.read_csv(path)
print(df)
```

```
   S.No   A  B   C   D   E  Length
0     1   2  1   5   3   4      45
1     2   4  3  10   6   8      43
2     3   6  5  15   9  12      66
3     4   8  7  20  12  16      78
4     5  10  9  25  15  20      32
```


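```python
import numpy as np
import pandas as pd

# np.select checks the conditions in order and assigns the matching label:
# the row with the smallest Length becomes LOW, the row with the largest becomes HIGH,
# and every other row falls back to the default, MEDIUM.
df["Length_binned"] = np.select(
    [df["Length"] == df["Length"].min(), df["Length"] == df["Length"].max()],
    ["LOW", "HIGH"],
    default="MEDIUM"
)

df
```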


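|   | S.No | A  | B | C  | D  | E  | Length | Length_binned |
|---|------|----|---|----|----|----|--------|---------------|
| 0 | 1    | 2  | 1 | 5  | 3  | 4  | 45     | MEDIUM        |
| 1 | 2    | 4  | 3 | 10 | 6  | 8  | 43     | MEDIUM        |
| 2 | 3    | 6  | 5 | 15 | 9  | 12 | 66     | MEDIUM        |
| 3 | 4    | 8  | 7 | 20 | 12 | 16 | 78     | HIGH          |
| 4 | 5    | 10 | 9 | 25 | 15 | 20 | 32     | LOW           |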
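
The `np.select` call above labels only the single smallest and single largest `Length`, so every other value becomes MEDIUM regardless of its size. As an alternative sketch, assuming equal-width ranges are what is wanted (the notes do not say so), `pd.cut` with `bins=3` splits the span between the minimum and maximum into three intervals. The DataFrame name `lengths` is illustrative.

```python
import pandas as pd

# Length values copied from the DataFrame above
lengths = pd.DataFrame({"Length": [45, 43, 66, 78, 32]})

# bins=3 asks pandas for three equal-width intervals between min (32) and max (78);
# the labels are assigned from the lowest interval to the highest
lengths["Length_binned"] = pd.cut(
    lengths["Length"], bins=3, labels=["LOW", "MEDIUM", "HIGH"]
)
print(lengths)
```

With this data the middle interval happens to be empty, so 66 is labeled HIGH here while the `np.select` version labels it MEDIUM. The two approaches answer different questions and are not interchangeable.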


## Categorical to Numeric
- The NumPy library is not needed for this step; `pd.get_dummies` from pandas is enough.

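```python
# Each distinct value in Length is treated as its own category
# and gets its own True/False column.
pd.get_dummies(df["Length"])
```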

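|   | 32    | 43    | 45    | 66    | 78    |
|---|-------|-------|-------|-------|-------|
| 0 | False | False | True  | False | False |
| 1 | False | True  | False | False | False |
| 2 | False | False | False | True  | False |
| 3 | False | False | False | False | True  |
| 4 | True  | False | False | False | False |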


```python
pd.get_dummies(df["Length_binned"])
```


|   | HIGH  | LOW   | MEDIUM |
|---|-------|-------|--------|
| 0 | False | False | True   |
| 1 | False | False | True   |
| 2 | False | False | True   |
| 3 | True  | False | False  |
| 4 | False | True  | False  |
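
`pd.get_dummies` returns a separate DataFrame, so the encoded columns still have to be attached to the original data. The following is a sketch of one way to do that, with 1/0 values instead of `True`/`False`. The names `binned`, `dummies` and `result`, and the `prefix="Length"` choice, are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same Length and Length_binned values as in the output above
binned = pd.DataFrame({
    "Length": [45, 43, 66, 78, 32],
    "Length_binned": ["MEDIUM", "MEDIUM", "MEDIUM", "HIGH", "LOW"],
})

# dtype=int yields 1/0 columns; prefix puts "Length_" in front of each category name
dummies = pd.get_dummies(binned["Length_binned"], prefix="Length", dtype=int)

# axis=1 places the new columns side by side with the existing ones
result = pd.concat([binned, dummies], axis=1)
print(result)
```

The new columns are named `Length_HIGH`, `Length_LOW` and `Length_MEDIUM`, which keeps them traceable to the column they were derived from.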
