# 📄 **README: MNIST Classification Assignment**

## **📚 Project Overview**
This project aims to classify handwritten digits from the **MNIST dataset** using traditional machine learning techniques (excluding neural networks and deep learning). The objective is to develop, train, and evaluate models that achieve an **F1-score greater than 0.95** on the testing dataset.

---

## **📂 Dataset Description**
- **Dataset Source:** [Kaggle - MNIST Dataset](https://www.kaggle.com/datasets/hojjatk/mnist-dataset)
- **Content:**
  - **Images:** 70,000 grayscale images of handwritten digits (0-9).
  - **Training Set:** 60,000 images.
  - **Testing Set:** 10,000 images.
- **Image Size:** 28x28 pixels (flattened to 784 features per image).
- **Labels:** Digits 0 through 9 (multi-class classification problem).
- **File Format:** `.idx` binary files for images and labels.
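
The `.idx` files can be read without any extra dependency beyond NumPy. The following is a minimal sketch, not necessarily the loader used in the notebook; the function name `load_idx` is hypothetical. It relies on the IDX layout: a 4-byte header (two zero bytes, a data-type code, the number of dimensions), followed by one big-endian 32-bit size per dimension, then the raw bytes.

```python
import struct

import numpy as np


def load_idx(path):
    """Read an unsigned-byte IDX file (images or labels) into a NumPy array.

    Hypothetical helper for illustration; the notebook may load the data differently.
    """
    with open(path, "rb") as f:
        # Header: two zero bytes, a data-type code, and the number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError(f"{path} is not an unsigned-byte IDX file")
        # Each dimension size is stored as a big-endian 32-bit unsigned integer.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)
```

For example, `load_idx("t10k-images.idx3-ubyte")` should return an array of shape `(10000, 28, 28)`, and `load_idx("t10k-labels.idx1-ubyte")` an array of shape `(10000,)`.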

---

## **🛠️ Project Structure**
```
project_folder/
│
├── Assignment1_applied_ML.ipynb   # Jupyter Notebook with complete code and outputs
├── README.md                      # This README file
├── t10k-images.idx3-ubyte         # Test images
├── t10k-labels.idx1-ubyte         # Test labels
└── report.pdf                     # Final professional report (5-6 pages)
```

---

## **🔄 Preprocessing Steps**
- **Normalization:** Pixel values scaled from [0, 255] to [0, 1].
- **Flattening:** 28x28 images reshaped into 784-dimensional vectors.
- **Standardization:** Applied `StandardScaler` so that each feature has zero mean and unit variance.
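
The three steps above can be sketched as follows. This is an illustration under assumptions: the function name `preprocess` is hypothetical, and fitting the scaler on the training set only (then reusing it for the test set) is a common practice that this README does not explicitly confirm.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def preprocess(train_images, test_images):
    """Flatten, normalize, and standardize image arrays of shape (n, 28, 28).

    Hypothetical helper; assumes uint8 pixel values in [0, 255].
    """
    # Flatten each image to a vector and scale pixel values to [0, 1].
    X_train = train_images.reshape(len(train_images), -1).astype(np.float64) / 255.0
    X_test = test_images.reshape(len(test_images), -1).astype(np.float64) / 255.0

    # Assumption: statistics are learned from the training set only,
    # so no information from the test set leaks into preprocessing.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    return X_train, X_test, scaler
```

Pixels that are constant across the training set (for example, always-black border pixels) have zero variance; `StandardScaler` leaves their scale at 1 instead of dividing by zero.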

---

## **⚡ Model Training & Evaluation**
### ✨ **Models Used:**
1. **K-Nearest Neighbors (KNN):**
   - Hyperparameters: `n_neighbors`, `weights`.
   - Achieved F1-Score: 0.9532

2. **Support Vector Machine (SVM):**
   - Hyperparameters: `C`, `kernel`.
   - Achieved F1-Score: 0.9578

3. **Random Forest Classifier:**
   - Hyperparameters: `n_estimators`, `max_depth`.
   - Achieved F1-Score: 0.9645 (**Best Performing Model**)
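
A minimal sketch of how the three models could be set up and compared with scikit-learn is shown below. The hyperparameter values and the helper names `build_models` and `evaluate_models` are placeholders chosen for illustration; they are not the tuned settings behind the scores reported above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def build_models():
    """Return the three classifiers with placeholder (assumed) hyperparameters."""
    return {
        "KNN": KNeighborsClassifier(n_neighbors=5, weights="distance"),
        "SVM": SVC(C=10, kernel="rbf"),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, max_depth=None, random_state=0
        ),
    }


def evaluate_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its macro-averaged F1-score on the test set."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        scores[name] = f1_score(y_test, y_pred, average="macro")
    return scores
```

On the full 60,000-image training set, the SVM and KNN steps can take noticeably longer than the Random Forest, so it may help to try the code on a subset first.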

### 📊 **Evaluation Metrics:**
- **Primary Metric:** F1-Score (Macro Average)
- **Visualization:**
  - Confusion Matrices for each model.
  - Bar plots comparing F1-scores.
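
To make the primary metric concrete: the macro-averaged F1-score is the unweighted mean of the per-class F1-scores, each of which can be read off the confusion matrix. The sketch below is illustrative only (the helper name `macro_f1_from_confusion` is hypothetical); it uses scikit-learn's convention that rows are true labels and columns are predicted labels.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def macro_f1_from_confusion(y_true, y_pred, labels):
    """Return the confusion matrix and the macro F1-score derived from it."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    tp = np.diag(cm).astype(float)   # correctly predicted samples per class
    fp = cm.sum(axis=0) - tp         # predicted as the class, but belonging to another
    fn = cm.sum(axis=1) - tp         # belonging to the class, but predicted as another
    denom = 2 * tp + fp + fn
    # Per-class F1 = 2*TP / (2*TP + FP + FN); classes with no samples score 0.
    per_class_f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return cm, per_class_f1.mean()
```

The result should agree with `sklearn.metrics.f1_score(y_true, y_pred, average="macro")`. Because every digit contributes equally to the average, a model that performs poorly on a single class is penalized even if that class is rare.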

---

## **🚀 How to Run the Project**
### 1️⃣ **Clone or Download the Repository:**
```bash
git clone https://github.com/Trusha-Rana/MNIST_Dataset_Classification_model
```

### 2️⃣ **Install Required Libraries:**
Ensure you have the following Python libraries:
```bash
pip install numpy matplotlib seaborn scikit-learn notebook
```

### 3️⃣ **Run the Notebook:**
```bash
jupyter notebook Assignment1_applied_ML.ipynb
```

---

## **🎯 Key Results & Findings**
- **Best Model:** Random Forest Classifier with an F1-score of **0.9645**.
- **Conclusion:** All three models exceeded the target macro **F1-score of 0.95** on the testing dataset.

---

## **📝 References**
- [Kaggle - MNIST Dataset](https://www.kaggle.com/datasets/hojjatk/mnist-dataset)
- [Scikit-learn Documentation](https://scikit-learn.org/)
- Python Libraries: `numpy`, `matplotlib`, `seaborn`, `scikit-learn`

