# **Machine Learning Classification: KNN vs Decision Trees**

## **Overview**
Both programs demonstrate **supervised learning** for binary classification: each one distinguishes apples from oranges, using a different pair of fruit features.

---

## **Program 1: K-Nearest Neighbors (KNN)**

### **How it Works:**
- **Algorithm**: Finds the 3 training examples closest to the new input (Euclidean distance is scikit-learn's default metric)
- **Features**: Weight (grams) + Size (cm)
- **Decision Method**: Majority vote among 3 nearest neighbors
- **Training Data**: 10 samples (5 apples, 5 oranges)
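
The neighbor search and majority vote described above can be sketched in plain Python, without scikit-learn. This is a minimal illustration, assuming Euclidean distance on the raw features; the helper name `knn_predict` is hypothetical, and the data is the same 10 samples used in Program 1.

```python
import math
from collections import Counter

# Same training data as Program 1: [weight_in_grams, size_in_cm]
X = [[180, 6], [200, 6.5], [250, 7], [300, 7.5], [350, 8],
     [400, 8.5], [450, 9], [500, 9.5], [550, 10], [600, 10.5]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 0 = Apple, 1 = Orange


def knn_predict(X_train, y_train, point, k=3):
    """Hypothetical helper: majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to `point` with its label
    distances = [(math.dist(row, point), label)
                 for row, label in zip(X_train, y_train)]
    # Keep the labels of the k closest points
    nearest_labels = [label for _, label in sorted(distances)[:k]]
    # Return the most common label among them
    return Counter(nearest_labels).most_common(1)[0][0]


print(knn_predict(X, y, [600, 10]))  # the 3 nearest neighbors are all oranges, so this prints 1
```

Because weight ranges from 180 to 600 while size ranges only from 6 to 10.5, the distance here is dominated by weight. Scaling the features to comparable ranges is a common remedy when both should contribute.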

### **Classification Logic:**
```
Lighter/Smaller → Apple (0)
Heavier/Larger → Orange (1)
```

---

## **Program 2: Decision Tree**

### **How it Works:**
- **Algorithm**: Creates decision rules based on feature splits
- **Features**: Size (cm) + Color intensity (1-10)
- **Decision Method**: Tree-based rules (if-then conditions)
- **Training Data**: 4 samples (2 apples, 2 oranges)
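
To make "feature splits" concrete, the sketch below searches for the single threshold that best separates the two classes, scored by Gini impurity (scikit-learn's default criterion for `DecisionTreeClassifier`). It is a simplified illustration rather than scikit-learn's actual implementation; the helper names `gini` and `best_split` are hypothetical, and the data is the same 4 samples used in Program 2.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # fraction of class 1
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def best_split(X, y):
    """Hypothetical helper: return (feature_index, threshold, weighted_impurity)."""
    best = None
    for feature in range(len(X[0])):
        values = sorted({row[feature] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [label for row, label in zip(X, y) if row[feature] <= threshold]
            right = [label for row, label in zip(X, y) if row[feature] > threshold]
            # Impurity of the two child nodes, weighted by their sizes
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or impurity < best[2]:
                best = (feature, threshold, impurity)
    return best


# Same training data as Program 2: [size_in_cm, color_shade]
X = [[7, 4], [8, 6], [9, 7], [10, 9]]
y = [0, 0, 1, 1]  # 0 = Apple, 1 = Orange

print(best_split(X, y))  # (0, 8.5, 0.0): split on size at 8.5 cm
```

On this data a split on size at 8.5 cm separates the classes perfectly, and a split on color at 6.5 does equally well, so a fitted tree may use either feature for its single rule.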

### **Classification Logic:**
```
Smaller + Lighter Color → Apple (0)
Larger + Darker Color → Orange (1)
```

---

## **Key Differences**

| Aspect | **KNN** | **Decision Tree** |
|--------|---------|-------------------|
| **Method** | Distance-based | Rule-based |
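| **Interpretability** | Moderate (a prediction can be traced to its nearest neighbors, but there are no explicit rules) | High (explicit if-then rules) |
| **Training** | Lazy learning (`fit` mainly stores the data) | Eager learning (builds the tree during `fit`) |
| **Memory** | Stores all training data | Compact tree structure |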
| **Features Used** | Weight + Size | Size + Color |

---

## **Common Workflow**

1. **Import** algorithm from sklearn
2. **Prepare** training data (X = features, y = labels)
3. **Create** model instance
4. **Train** model using `fit(X, y)`
5. **Get** user input for prediction
6. **Predict** using `predict([[feature_1, feature_2]])` (a 2D list with one row per sample)
7. **Display** human-readable result
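
Steps 5 to 7 are the same for both programs and can be factored into small helpers. The sketch below is one possible arrangement, not the notebook's own code: `LABELS`, `parse_features`, `describe`, and the stand-in `SizeThresholdModel` are all hypothetical names, and the stand-in only mimics the `predict()` interface so the example runs without scikit-learn.

```python
LABELS = {0: "Apple", 1: "Orange"}  # same label encoding as both programs


def parse_features(raw_values):
    """Convert raw input strings into one 2D sample: [[f1, f2]]."""
    return [[float(value) for value in raw_values]]


def describe(model, raw_values):
    """Predict with any fitted model exposing predict() and name the result."""
    prediction = model.predict(parse_features(raw_values))[0]
    return f"The fruit is likely an {LABELS[int(prediction)]}"


class SizeThresholdModel:
    """Hypothetical stand-in for a fitted classifier (not scikit-learn)."""

    def predict(self, samples):
        # Label a fruit as Orange (1) when its first feature exceeds 8.5
        return [1 if sample[0] > 8.5 else 0 for sample in samples]


print(describe(SizeThresholdModel(), ["7", "5"]))  # The fruit is likely an Apple
```

A fitted `KNeighborsClassifier` or `DecisionTreeClassifier` could be passed to `describe` in place of the stand-in, since both expose `predict()` with the same 2D input shape.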

---

## **Summary**

Both programs solve the same problem (**fruit classification**) using different approaches:

- **KNN**: "What are the 3 most similar fruits we've seen before?"
- **Decision Tree**: "What rules can separate apples from oranges?"

These represent two fundamental paradigms in machine learning:
- **Instance-based learning** (KNN)
- **Rule-based learning** (Decision Trees)

Both are excellent starting points for understanding classification concepts in machine learning!

```python
# Import the K-Nearest Neighbors classifier from scikit-learn
from sklearn.neighbors import KNeighborsClassifier

# Training data: each row is [weight_in_grams, size_in_cm]
X = [
    [180, 6],     # Small, light fruit
    [200, 6.5],   # Small, light fruit
    [250, 7],     # Small, light fruit
    [300, 7.5],   # Small, light fruit
    [350, 8],     # Small, light fruit
    [400, 8.5],   # Large, heavy fruit
    [450, 9],     # Large, heavy fruit
    [500, 9.5],   # Large, heavy fruit
    [550, 10],    # Large, heavy fruit
    [600, 10.5],  # Large, heavy fruit
]

# Target labels: 0 = Apple, 1 = Orange
# First 5 samples are apples (0), last 5 are oranges (1)
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# Create a KNN classifier that votes among the 3 nearest training points
model = KNeighborsClassifier(n_neighbors=3)

# Train the model with our fruit data
model.fit(X, y)

# Get user input for the new fruit's characteristics
weight = float(input("Enter the Weight "))
size = float(input("Enter the size "))

# predict() expects a 2D array (one row per sample) and returns an array,
# so [0] extracts the single prediction
prediction = model.predict([[weight, size]])[0]

# Display result based on prediction
if prediction == 0:
    print("the fruit is an Apple")   # Prediction: 0 = Apple
else:
    print("the fruit is an orange")  # Prediction: 1 = Orange
```


```
Enter the Weight 600
Enter the size 10
the fruit is an orange
```


```python
# Import the Decision Tree classifier from the scikit-learn library
from sklearn.tree import DecisionTreeClassifier

# Training data: each row is [size_in_cm, color_shade_from_1_to_10]
X = [
    [7, 4],   # Small size, light color
    [8, 6],   # Small size, medium color
    [9, 7],   # Large size, medium color
    [10, 9],  # Large size, dark color
]

# Target labels: 0 = Apple, 1 = Orange
# Corresponds to each data point in X
y = [0, 0, 1, 1]

# Create an instance of the Decision Tree model
model = DecisionTreeClassifier()

# Train the model by fitting it to the feature data (X) and target labels (y)
model.fit(X, y)

# Get user input for the characteristics of a new fruit
size = float(input("Enter the Size in cm "))
color = float(input("Enter the color shade in range (1-10) "))

# Use the trained model to predict the fruit type for the new data
# .predict() expects a 2D array, so we use [[...]]
# [0] extracts the single prediction from the returned array (e.g., from [0] to 0)
prediction = model.predict([[size, color]])[0]

# Display the result based on the numerical prediction
if prediction == 0:
    print("the fruit is likely to be an Apple ")   # Prediction of 0 means Apple
else:
    print("the fruit is likely to be an orange ")  # Prediction of 1 means Orange
```


```
Enter the Size in cm 7
Enter the color shade in range (1-10) 5
the fruit is likely to be an Apple
```
