# Project 2: Supervised Learning

## Total Points: 25

### Overview
In this project, you will **fine-tune an autoregressive language model (e.g., GPT-style) on a domain-specific, labeled dataset** using Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA or adapters. The objective is to gain hands-on experience with preparing supervised data, applying PEFT to minimize training overhead, and evaluating the model's performance on a task like text classification, summarization, or instruction following. You will compare the model's performance before and after fine-tuning using metrics such as accuracy, BLEU, or perplexity, depending on the task.


### 1. Identify the Task and Labeled Dataset
- Choose a supervised learning task such as text classification, summarization, question answering, or instruction following, preferably within a specific domain (e.g., legal, medical, finance, science, etc.).

- Select a labeled dataset appropriate for your task — one that includes both inputs and target outputs (e.g., prompts and responses, questions and answers, or documents and labels).

- Make sure the dataset is:

  * Relevant to your chosen domain and task

  * Small enough to fine-tune efficiently (especially with PEFT), but large enough to produce meaningful improvements

- You may need to experiment with dataset size or sampling to find a good balance between training time and performance.
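
One way to experiment with dataset size is to draw a reproducible random subset and hold out part of it for validation. The sketch below is only an illustration: the helper name `subsample_and_split`, the 10% validation fraction, and the toy `(input, label)` pairs are assumptions, not requirements of the project.

```python
import random

def subsample_and_split(examples, n_total, val_fraction=0.1, seed=0):
    """Draw n_total examples at random, then split them into train/validation."""
    if n_total > len(examples):
        raise ValueError("n_total exceeds dataset size")
    rng = random.Random(seed)               # fixed seed so runs are comparable
    picked = rng.sample(examples, n_total)  # sampling without replacement
    n_val = max(1, int(n_total * val_fraction))
    return picked[n_val:], picked[:n_val]   # (train, validation)

# Toy stand-in for a labeled dataset (hypothetical data).
data = [(f"example {i}", i % 2) for i in range(1000)]
train, val = subsample_and_split(data, n_total=200)
print(len(train), len(val))
```

Rerunning with a different `n_total` (and the same seed) gives a simple way to see how training time and performance change with dataset size.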

### 2. Process Your Data
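- Clean and preprocess the dataset so that it is suitable for supervised learning (SL).
- Convert the dataset into a format compatible with your training pipeline.
- Consider additional preprocessing (e.g., removing duplicates, filtering overly long examples, or using a consistent prompt template) to improve training quality.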
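
As a rough illustration of these steps, the sketch below cleans a list of records and renders each one as a single prompt-plus-response string. The field names `input` and `target`, the `### Input:` / `### Response:` template, and the `max_chars` cutoff are all assumptions made for this example; use whatever matches your dataset and model.

```python
def clean(records, max_chars=2000):
    """Strip whitespace, drop empty or overly long examples, and remove duplicates."""
    seen, out = set(), []
    for r in records:
        text, target = r["input"].strip(), r["target"].strip()
        if not text or not target or len(text) > max_chars:
            continue
        if (text, target) in seen:
            continue
        seen.add((text, target))
        out.append({"input": text, "target": target})
    return out

def format_example(record, include_target=True):
    """Render one record with a fixed template (the template itself is hypothetical)."""
    prompt = f"### Input:\n{record['input']}\n\n### Response:\n"
    return prompt + record["target"] if include_target else prompt

raw = [
    {"input": "  What is 2 + 2? ", "target": "4"},
    {"input": "What is 2 + 2?", "target": "4"},   # duplicate after stripping
    {"input": "", "target": "empty input"},       # dropped
]
cleaned = clean(raw)
print(format_example(cleaned[0]))
```

Calling `format_example(record, include_target=False)` gives the prompt alone, which is the form you would feed to the model at evaluation time.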



### 3. Baseline Measurement
* Before fine-tuning, evaluate the performance of a pretrained autoregressive language model on your supervised task.

* Use your labeled dataset in its original input-output format (e.g., prompt → response, input → label).

* Measure how well the model performs without fine-tuning, using metrics appropriate for your task:

  * Accuracy or F1-score for classification

  * BLEU, ROUGE, or exact match for generation or QA


* This will serve as your baseline for comparing performance after PEFT fine-tuning.
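
The simpler metrics can be computed directly from lists of predictions and gold answers. The following is a minimal sketch with hypothetical helper names and toy labels; for BLEU or ROUGE you would more likely rely on an existing implementation (for example, the Hugging Face `evaluate` library) than write your own.

```python
def accuracy(preds, golds):
    """Fraction of predictions that match the gold label exactly."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def f1_for_label(preds, golds, label):
    """F1-score for a single label treated as the positive class."""
    tp = sum(p == label and g == label for p, g in zip(preds, golds))
    fp = sum(p == label and g != label for p, g in zip(preds, golds))
    fn = sum(p != label and g == label for p, g in zip(preds, golds))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def exact_match(preds, golds):
    """Exact match after lowercasing and collapsing whitespace."""
    norm = lambda s: " ".join(s.lower().split())
    return sum(norm(p) == norm(g) for p, g in zip(preds, golds)) / len(golds)

# Toy predictions and gold labels (made up for illustration).
preds = ["pos", "neg", "pos", "neg"]
golds = ["pos", "pos", "pos", "neg"]
print(accuracy(preds, golds), f1_for_label(preds, golds, "pos"))
```

Keep the evaluation function and the evaluation split fixed so that the same code can be reused unchanged after fine-tuning.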

### 4. Fine-Tune an AR Model Using PEFT
* Fine-tune a pretrained autoregressive language model on your domain-specific, labeled dataset using Parameter-Efficient Fine-Tuning (PEFT) with LoRA (Low-Rank Adaptation).

* Use a PEFT framework such as the Hugging Face `peft` library to integrate LoRA into your training pipeline.

* LoRA freezes the pretrained weights and trains only a small set of added low-rank matrices, allowing for efficient training even on limited hardware.

* Implement a training loop or use tools like `Trainer` or `accelerate` for easier setup.

* After training, save the LoRA adapter weights for reuse or evaluation — you can later merge them with the base model if needed.
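
To make the efficiency argument concrete: for a weight matrix of shape `d_out x d_in`, LoRA with rank `r` adds two small matrices holding `r * (d_in + d_out)` trainable values in total. The sketch below computes that count and shows one possible way to wrap a model with the `peft` library. The model name `gpt2`, the hyperparameter values, and the helper names are assumptions chosen for illustration, and `target_modules` is left to the library default, which may need to be set explicitly for other architectures.

```python
def lora_trainable_params(d_in, d_out, r):
    """Values added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

def build_lora_model(model_name="gpt2", r=8, alpha=16, dropout=0.05):
    """Wrap a causal LM with LoRA adapters (a sketch; settings are assumptions)."""
    # Local imports so the arithmetic above runs without these packages installed.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(model_name)
    config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=r,                   # rank of the low-rank update
        lora_alpha=alpha,      # scaling factor applied to the update
        lora_dropout=dropout,
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # reports trainable vs. total parameters
    return model

# Example: one 768 x 768 projection with rank 8.
full = 768 * 768
added = lora_trainable_params(768, 768, r=8)
print(added, full, round(added / full, 4))
```

After training, calling `model.save_pretrained(...)` on the wrapped model writes the adapter weights rather than a full copy of the base model, which keeps checkpoints small.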


### 5. Remeasure the Fine-Tuned Model
* Evaluate the performance of your fine-tuned autoregressive model (now updated using LoRA) on the same supervised task and dataset used for the baseline.

* Use the same evaluation metric as before (e.g., accuracy, BLEU, ROUGE, or perplexity, depending on the task).

* Compare the results with the baseline measurement to assess the impact of fine-tuning.

* Analyze whether the model has improved in task performance, and consider potential areas for further tuning or data refinement.
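
If you report perplexity, it is the exponential of the mean negative log-likelihood per token, so lower values are better, unlike accuracy or BLEU. The sketch below shows that calculation and a small helper that accounts for the metric's direction when comparing runs; the function names and the toy log-probabilities are assumptions for illustration.

```python
import math

def perplexity(token_log_probs):
    """exp(mean negative log-likelihood), using natural-log token probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def improvement(before, after, higher_is_better=True):
    """Positive result means the fine-tuned model is better on this metric."""
    delta = after - before
    return delta if higher_is_better else -delta

# Toy example: a model that assigns probability 0.25, then 0.5, to every token.
ppl_before = perplexity([math.log(0.25)] * 4)
ppl_after = perplexity([math.log(0.5)] * 4)
print(ppl_before, ppl_after, improvement(ppl_before, ppl_after, higher_is_better=False))
```

In practice the per-token log-probabilities come from the model's loss on the held-out set; the mean cross-entropy loss can be exponentiated directly.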


### 6. Organize & Summarize Results
- Present your results in a structured format.
- Compare the performance of the pre-trained and fine-tuned models.
- Discuss key takeaways from the fine-tuning process.
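
One structured format that works well in a notebook or README is a small before/after table. The sketch below builds a Markdown table from `(metric, baseline, fine-tuned)` tuples; the helper name is hypothetical and the numbers are placeholders, not real results.

```python
def results_table(rows):
    """Render (metric, baseline, fine_tuned) tuples as a Markdown table."""
    lines = [
        "| Metric | Baseline | Fine-tuned | Change |",
        "|---|---|---|---|",
    ]
    for metric, before, after in rows:
        lines.append(f"| {metric} | {before:.3f} | {after:.3f} | {after - before:+.3f} |")
    return "\n".join(lines)

# Placeholder values for illustration only.
table = results_table([("accuracy", 0.5, 0.75)])
print(table)
```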


## Hints & Tips
- **Use Generative Models**: Use tools like ChatGPT or Claude to help you brainstorm ideas.
- **Focus on Learning**: The goal is to understand PEFT fine-tuning of an autoregressive model rather than to achieve the best possible score, so focus on the process, not the results.
- **Experiment & Analyze**: Try different settings and analyze their impact.