# Deep Learning - Exercise 1

## 🧪 Build and Document an Image Classifier with fastai

### **Goal**
In this part, you will **design, build, and document** a complete image-classification system using the **fastai** library.  
You will **define your own problem**, **collect or create your own dataset**, **train a model**, and **evaluate its performance**.  
Your solution will be submitted as a **GitHub repository**, along with a short **presentation** summarizing your workflow and results.

This exercise simulates a real-world applied deep learning pipeline, from idea to dataset to model to evaluation.

---

## 🎯 Learning Objectives
By completing this part, you will:

- Formulate a supervised learning problem and justify its relevance.
- Collect, construct, clean, and organize an image dataset.
- Build reproducible data pipelines using fastai.
- Train and fine-tune a modern deep-learning model.
- Critically evaluate performance using qualitative and quantitative tools.
- Communicate your process clearly in both code and presentation.

---

## 📦 Deliverables

### **1. GitHub Repository (the main submission)**
Your GitHub project must include:

#### **A. Jupyter Notebook (`project.ipynb`)**
This notebook should contain:

1. **Problem Definition**  
   - Description of the classes, use-case, and motivation  
   - Expected challenges  

2. **Dataset Creation and Preparation**  
   - How you collected the images  
   - Cleaning and filtering steps  
   - Final dataset structure (document with code or screenshots)

3. **Data Loading in fastai**  
   - `ImageDataLoaders` or `DataBlock`  
   - Example batches with `show_batch()`

4. **Model Training**  
   - Transfer learning setup  
   - Fine-tuning  
   - Training curves

5. **Evaluation**  
   - Confusion matrix  
   - `plot_top_losses`  
   - Examples of correct and incorrect predictions  
   - Discussion of model weaknesses (if there are any)
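
Steps 3 to 5 above can be sketched roughly as follows. This is a minimal outline, not a required implementation: it assumes fastai v2 is installed, that images sit in class-named folders under a directory called `data` (a placeholder name), and that ResNet34 with an 80/20 random split is an acceptable starting point. The function names, image size, batch size, and epoch count are illustrative choices.

```python
def build_dataloaders(data_dir, valid_pct=0.2, seed=42, image_size=224, batch_size=32):
    """Step 3: build DataLoaders from class-named folders under data_dir."""
    # Imported inside the function so the cell can be defined before fastai is installed.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        aug_transforms, get_image_files, parent_label,
    )

    block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),        # image input, single-label target
        get_items=get_image_files,                 # recursively collect image files
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),  # reproducible split
        get_y=parent_label,                        # label = name of the parent folder
        item_tfms=Resize(image_size),              # resize each image before batching
        batch_tfms=aug_transforms(),               # default fastai augmentations
    )
    return block.dataloaders(data_dir, bs=batch_size)


def train_classifier(dls, epochs=3):
    """Step 4: transfer learning from a pretrained ResNet34."""
    from fastai.vision.all import accuracy, resnet34, vision_learner

    learn = vision_learner(dls, resnet34, metrics=accuracy)
    learn.fine_tune(epochs)      # trains the new head first, then the unfrozen network
    learn.recorder.plot_loss()   # plots the losses recorded during the most recent fit
    return learn


def evaluate_classifier(learn):
    """Step 5: confusion matrix and highest-loss validation examples."""
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_confusion_matrix(figsize=(6, 6))
    interp.plot_top_losses(9, nrows=3)
    return interp


# Typical notebook usage ("data" is an assumed folder name):
# dls = build_dataloaders("data")
# dls.show_batch(max_n=9)
# learn = train_classifier(dls)
# interp = evaluate_classifier(learn)
```

Fixing the split seed keeps the validation set stable across runs, which makes results easier to compare when you change the architecture or augmentations.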


---

#### **B. Dataset**
Your dataset should be:

- Stored inside the repo **or** downloaded automatically with code  
- Organized into class-named folders  
- At least **100 images per class**
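
A quick way to check the folder layout and the per-class minimum is to count image files in each class folder. The sketch below uses only the standard library; the extension list and the helper names are assumptions made for illustration, so adjust them to your data.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}
MIN_IMAGES_PER_CLASS = 100  # minimum stated in the assignment


def count_images_per_class(data_dir):
    """Return {class_name: image_count} for each class folder under data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        # Skip loose files and hidden folders such as .git
        if class_dir.is_dir() and not class_dir.name.startswith("."):
            counts[class_dir.name] = sum(
                1
                for p in class_dir.rglob("*")
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


def classes_below_minimum(counts, minimum=MIN_IMAGES_PER_CLASS):
    """Return the classes that have fewer than `minimum` images."""
    return {name: n for name, n in counts.items() if n < minimum}
```

Printing the result of `count_images_per_class("data")` in the notebook (where `data` is a placeholder folder name) also documents the final dataset structure and reveals class imbalance at a glance.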

If you use external sources, cite them properly.

Keep the dataset small enough that training the model does not take too long.
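
One cleaning step that also trims the dataset is removing exact duplicate files, which are common when images come from web searches. The following is a minimal sketch, assuming byte-identical copies are the only duplicates you want to catch; resized or re-encoded near-duplicates will not be detected. The function name and extension list are illustrative.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}


def find_exact_duplicates(data_dir):
    """Return lists of image paths under data_dir whose file contents are identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            # Hash the raw bytes; identical files produce identical digests.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Keep only the groups that contain more than one file.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

A duplicate group that spans two class folders is worth inspecting by hand, since it means the same image carries two different labels.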

---

### **2. Presentation (8–12 slides)**
Your presentation must cover:

1. **Problem Definition**  
   - Why you chose this problem  
   - What the classes are  

2. **Data Collection Process**  
   - Sources, steps, difficulties  
   - Dataset size and examples  

3. **Model Pipeline**  
   - fastai approach  
   - Architecture chosen (e.g., ResNet34)  
   - Training details  

4. **Results**  
   - Accuracy, confusion matrix  
   - What the model learned / struggled with  

5. **Challenges Encountered**  
   - Noisy labels, class imbalance, unexpected behavior  

---

## 📤 How to Submit
Submit **only** your presentation, including a link to your GitHub repository, through the course system.

Your repo should contain:
- The notebook  
- The dataset (or download script)  
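
If you go the download-script route, something along these lines may be enough. It is a sketch under stated assumptions: the dataset is published as a single zip archive, `DATASET_URL` is a hypothetical placeholder you must replace, and the archive already contains the class-named folders.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path

# Hypothetical placeholder: replace with the real location of your archive.
DATASET_URL = "https://example.com/my-dataset.zip"


def download_dataset(url=DATASET_URL, dest_dir="data"):
    """Download a zip archive from url and extract it into dest_dir."""
    dest = Path(dest_dir)
    # Skip the download when the folder already holds files.
    if dest.exists() and any(dest.iterdir()):
        return dest
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "dataset.zip"
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    archive.unlink()  # remove the archive once extracted
    return dest
```

Calling `download_dataset()` in the first notebook cell lets anyone who clones the repository reproduce your results without manual setup.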

Apart from the presentation, files submitted outside the GitHub repository will not be accepted.

---

This assignment is designed to reflect the real-world workflow of applied ML research and industry practice.  
Choose a problem you find meaningful, and have fun exploring!
