## Classification
#### Classification is a supervised learning task in which a model learns from labelled examples to assign each new input to one of a fixed set of classes. The examples below use the K-Nearest Neighbors (KNN) classifier from scikit-learn.

##### Simple Classification
Dataset: Iris (loaded through seaborn)

Objective: Classify iris flowers as setosa, versicolor, or virginica from their sepal and petal measurements.


K-Nearest Neighbors (KNN) classifies a new input in three steps:

1. Find the k nearest data points to the new input.
2. Look at their classes (majority voting).
3. Predict the most common class.
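
The three steps above can be sketched in plain Python. This is a minimal illustration, not how scikit-learn implements KNN internally; the function name `knn_predict` and the toy points are made up for the example, and Euclidean distance is assumed.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Step 1: Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: keep the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over the labels of those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated groups with two features each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.1, 1.0)))  # A
print(knn_predict(points, labels, (5.1, 5.0)))  # B
```

An odd `k` is a common choice because it reduces the chance of tied votes when there are two classes.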

In [5]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns


In [8]:
# Step 1: Load the Iris dataset bundled with seaborn (the dataset name is lowercase)
df = sns.load_dataset("iris")

In [9]:
df

,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,virginica
146,6.3,2.5,5.0,1.9,virginica
147,6.5,3.0,5.2,2.0,virginica
148,6.2,3.4,5.4,2.3,virginica


In [None]:
# Step 2: Inspect the columns
# (the seaborn Iris dataset has no ID column, so nothing needs to be dropped)
df.columns

Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width',
       'species'],
      dtype='object')

In [None]:
# Step 3: Prepare Data
# Features and labels
X = df.drop('species', axis=1)
y = df['species']

In [16]:
# Step 4: Split into Train/Test Sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [18]:
# Step 5: Feature scaling (important for KNN, which compares distances between points)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
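
To make the scaling step concrete, here is a rough plain-Python sketch of standardization: each column is shifted by the training mean and divided by the training standard deviation, and the test rows reuse those same training statistics. The helper names `fit_standardizer` and `standardize` are hypothetical, and the rows are a handful taken from the table above rather than the real train/test split.

```python
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Return one (mean, std) pair per column, computed from the training rows."""
    columns = zip(*rows)
    # pstdev is the population standard deviation (divides by n).
    # `or 1.0` leaves a constant column unscaled instead of dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def standardize(rows, stats):
    """Apply z = (x - mean) / std to every value, column by column."""
    return [
        [(value - mu) / sigma for value, (mu, sigma) in zip(row, stats)]
        for row in rows
    ]


# A few Iris rows from the table above, used here as a stand-in training set.
train_rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.7, 3.0, 5.2, 2.3],
    [6.3, 2.5, 5.0, 1.9],
]
test_rows = [[6.5, 3.0, 5.2, 2.0]]

stats = fit_standardizer(train_rows)          # fit on training data only
train_scaled = standardize(train_rows, stats)
test_scaled = standardize(test_rows, stats)   # reuse the training statistics

print(test_scaled)
```

Fitting the statistics on the training rows only, as the cell above does with `fit_transform` and `transform`, keeps information about the test set out of the model.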


In [20]:
# Step 6: Train KNN Model
knn = KNeighborsClassifier(n_neighbors=3)  # You can change k
knn.fit(X_train, y_train)
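
Since `k` is a free parameter, one common way to pick it is cross-validation. The sketch below is an illustration rather than part of the original notebook: it loads Iris from `sklearn.datasets` (to stay self-contained), and the candidate list and the names `X_iris`, `y_iris`, and `mean_accuracy` are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_iris, y_iris = load_iris(return_X_y=True)

candidate_ks = [1, 3, 5, 7, 9, 11]  # assumed search range
mean_accuracy = {}
for k in candidate_ks:
    # Scaling lives inside the pipeline so it is re-fitted on each training fold.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(model, X_iris, y_iris, cv=5)
    mean_accuracy[k] = scores.mean()

best_k = max(mean_accuracy, key=mean_accuracy.get)
print(mean_accuracy)
print("Best k:", best_k)
```

On a dataset as small and clean as Iris the differences between candidates tend to be minor, so the result is best read as a rough guide.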


In [22]:
# Step 7: Predict and Evaluate
y_pred = knn.predict(X_test)

print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))


Confusion Matrix:
 [[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]

Classification Report:
               precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        10
  versicolor       1.00      1.00      1.00         9
   virginica       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
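
The numbers in the classification report can be derived from the confusion matrix by hand. The following sketch assumes the scikit-learn layout (rows are true classes, columns are predicted classes) and copies the matrix from the output above; the function name `per_class_metrics` is made up for the example.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class.
class_names = ["setosa", "versicolor", "virginica"]
matrix = [
    [10, 0, 0],
    [0, 9, 0],
    [0, 0, 11],
]


def per_class_metrics(matrix, class_names):
    """Compute precision, recall, F1, and support for each class."""
    metrics = {}
    for i, name in enumerate(class_names):
        true_positive = matrix[i][i]
        predicted_total = sum(row[i] for row in matrix)  # column sum
        support = sum(matrix[i])                         # row sum
        precision = true_positive / predicted_total if predicted_total else 0.0
        recall = true_positive / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return metrics


metrics = per_class_metrics(matrix, class_names)
correct = sum(matrix[i][i] for i in range(len(class_names)))
total = sum(sum(row) for row in matrix)
accuracy = correct / total

for name, values in metrics.items():
    print(name, values)
print("accuracy:", accuracy)  # 1.0
```

Because every off-diagonal entry is zero, all 30 test samples were classified correctly, which is why each score is 1.00.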



In [33]:
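# Step 8: Predict on New Input
# First row of the dataset (a setosa flower). Building a DataFrame with the
# original column names matches what the scaler saw during fitting.
new_sample = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=X.columns)
new_sample_scaled = scaler.transform(new_sample)
new_sample_scaled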




array([[-0.86445224,  0.98006827, -1.33331205, -1.31260282]])

In [34]:
predicted_class = knn.predict(new_sample_scaled)
print("Predicted class:", predicted_class[0])

Predicted class: setosa


# Multiclass Classification with K-Nearest Neighbors (KNN)

This notebook demonstrates how to build a **multiclass classification model** using **K-Nearest Neighbors (KNN)** to predict a person’s **"likeness"** (Biryani, Pakora, or Samosa) based on their **age**, **height**, **weight**, and **gender**.

---

## 📊 Dataset

| age | height  | weight | gender | likeness |
|-----|---------|--------|--------|----------|
| 27  | 170.688 | 76     | Male   | Biryani  |
| 41  | 165     | 70     | Male   | Pakora   |
| 29  | 171     | 80     | Male   | Samosa   |
| 27  | 173     | 102    | Male   | Samosa   |
| 29  | 164     | 67     | Male   | Biryani  |
| 28  | 174     | 46     | Female | Biryani  |
| 27  | 151     | 64.3   | Female | Biryani  |
| 34  | 176.5   | 98     | Male   | Pakora   |
| 32  | 181     | 87.5   | Male   | Biryani  |

---

## Objective

Train a KNN model to predict the **"likeness"** class based on the other columns.

---

## Steps in Code

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# Step 1: Load data
data = pd.read_csv("food_likeness.csv")

# Step 2: Encode categorical features
le_gender = LabelEncoder()
data['gender'] = le_gender.fit_transform(data['gender'])  # Male=1, Female=0
```

In [43]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# Step 1: Load data
data = pd.read_csv("dataset.csv")
data = data.dropna()



In [44]:

# Step 2: Encode categorical column
le_gender = LabelEncoder()
data['gender'] = le_gender.fit_transform(data['gender'])  # Male=1, Female=0

le_likeness = LabelEncoder()
data['likeness'] = le_likeness.fit_transform(data['likeness'])  # Biryani=0, Pakora=1, Samosa=2
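
The integer codes in the comments above follow from how `LabelEncoder` works: it sorts the distinct labels and numbers them from 0. A small plain-Python sketch of that idea, with the hypothetical helper `fit_label_mapping` and a few labels taken from the dataset table:

```python
def fit_label_mapping(values):
    """Map each distinct label to an integer code, in sorted order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}


gender_mapping = fit_label_mapping(["Male", "Male", "Female", "Male"])
likeness_mapping = fit_label_mapping(
    ["Biryani", "Pakora", "Samosa", "Samosa", "Biryani"]
)

# Inverse mapping, the equivalent of decoding a predicted code back to a label.
likeness_inverse = {code: label for label, code in likeness_mapping.items()}

print(gender_mapping)       # {'Female': 0, 'Male': 1}
print(likeness_mapping)     # {'Biryani': 0, 'Pakora': 1, 'Samosa': 2}
print(likeness_inverse[0])  # Biryani
```

The codes for `likeness` are only class identifiers. For a feature such as `gender`, the 0/1 coding also enters the distance calculation that KNN performs.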

In [45]:

# Step 3: Split features and labels
X = data[['age', 'height', 'weight', 'gender']]
y = data['likeness']


In [46]:

# Step 4: Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [None]:
# Step 5: KNN Model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
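
The model above is trained on unscaled features, so columns with large numeric ranges (height in cm, weight in kg) dominate the distance while the 0/1 gender column barely counts. One possible variant, sketched here as an assumption rather than the notebook's method, wraps the scaler and the classifier in a pipeline. It uses only the nine rows shown in the Dataset table, with gender encoded as Female=0 and Male=1; the names `rows`, `likeness`, and `scaled_knn` are illustrative.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Rows from the Dataset table: [age, height, weight, gender].
rows = [
    [27, 170.688, 76, 1],
    [41, 165, 70, 1],
    [29, 171, 80, 1],
    [27, 173, 102, 1],
    [29, 164, 67, 1],
    [28, 174, 46, 0],
    [27, 151, 64.3, 0],
    [34, 176.5, 98, 1],
    [32, 181, 87.5, 1],
]
likeness = [
    "Biryani", "Pakora", "Samosa", "Samosa", "Biryani",
    "Biryani", "Biryani", "Pakora", "Biryani",
]

# The pipeline standardizes the features, then runs KNN on the scaled values.
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
scaled_knn.fit(rows, likeness)

# The same pipeline scales new inputs automatically before predicting.
prediction = scaled_knn.predict([[25, 165, 60, 1]])[0]
print("Predicted food likeness:", prediction)
```

With so few rows the prediction itself carries little weight. The point of the sketch is that scaling and prediction stay bundled in one object.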


In [59]:

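# Step 6: Predict on the test set
y_pred = knn.predict(X_test)

# Quick check on a single new person: 25 years old, 165 cm, 60 kg, gender=0 (Female)
knn.predict([[25, 165, 60, 0]])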




array([0])

In [60]:
# Step 7: Evaluate the predictions on the test set
all_labels = [0, 1, 2]  # Biryani, Pakora, Samosa
all_class_names = le_likeness.classes_
print("Classification Report:\n")
print(classification_report(
    y_test,
    y_pred,
    labels=all_labels,
    target_names=all_class_names,
    zero_division=0
))

Classification Report:

              precision    recall  f1-score   support

     Biryani       0.50      1.00      0.67         1
      Pakora       0.00      0.00      0.00         1
      Samosa       0.00      0.00      0.00         0

    accuracy                           0.50         2
   macro avg       0.17      0.33      0.22         2
weighted avg       0.25      0.50      0.33         2
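
The `macro avg` and `weighted avg` rows can be reproduced from the per-class rows. The sketch below copies precision, recall, and support from the report above; the names `per_class`, `macro`, and `weighted` are arbitrary. The macro average is the plain mean over all three listed classes (including Samosa, which has no test samples), while the weighted average weights each class by its support.

```python
# (precision, recall, support) for each class, read from the report above.
per_class = {
    "Biryani": (0.5, 1.0, 1),
    "Pakora": (0.0, 0.0, 1),
    "Samosa": (0.0, 0.0, 0),
}


def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall, defined as 0 when both are 0."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# One row per class: (precision, recall, f1, support).
table = [(p, r, f1_score_from(p, r), s) for p, r, s in per_class.values()]
total_support = sum(row[3] for row in table)

# Macro average: unweighted mean over the classes.
macro = [sum(row[i] for row in table) / len(table) for i in range(3)]
# Weighted average: each class contributes in proportion to its support.
weighted = [sum(row[i] * row[3] for row in table) / total_support for i in range(3)]

print([round(v, 2) for v in macro])     # [0.17, 0.33, 0.22]
print([round(v, 2) for v in weighted])  # [0.25, 0.5, 0.33]
```

With only two test samples, each prediction moves these scores by a large amount, so they say little about how the model would perform on more data.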



In [61]:
# Example input: [age, height, weight, gender]
# gender = 1 for Male, 0 for Female

new_input = [[25, 165, 60, 1]]  # A 25-year-old male, 165 cm tall, 60 kg

# Make prediction
predicted_class = knn.predict(new_input)[0]

# Decode class back to label
predicted_label = le_likeness.inverse_transform([predicted_class])[0]

print("Predicted food likeness:", predicted_label)


Predicted food likeness: Biryani


