### **Finetuning + Distillation + Quantization**

<h4> <b>Data Preparation</b> </h4> 
<h5> Data Sources </h5>
<ol>
  <li> <a href="https://www.kaggle.com/datasets/sunilthite/llm-detect-ai-generated-text-dataset"> LLM - Detect AI Generated Text Dataset </a> </li>
  <li> <a href="https://www.kaggle.com/datasets/thedrcat/daigt-v2-train-dataset"> DAIGT V2 Train Dataset </a> </li>
</ol>

<p>Both CSV files are added as inputs to the following Kaggle notebook, which merges them and converts them into our desired output format.</p>
<a href="https://www.kaggle.com/code/openmihirpatel/aivsog-dataprep"> Kaggle Notebook for data preparation and preprocessing. </a>
<p>Running the notebook generates a new combined CSV file in the output folder. This file is our combined training data.</p>
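
The merge step can be pictured roughly as follows. This is a minimal, dependency-free sketch and not the notebook's actual code: the column names (`text`, `generated`, `label`), the helper names `read_rows` and `merge_datasets`, and the sample rows are assumptions made for illustration, so check them against the real CSV headers before reusing it.

```python
import csv
import io


def read_rows(csv_text, text_col, label_col):
    """Yield (text, label) pairs from CSV text, normalising the label to an int."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Labels may be stored as "0"/"1" or "0.0"/"1.0"; accept both.
        yield row[text_col].strip(), int(float(row[label_col]))


def merge_datasets(sources):
    """Combine (csv_text, text_col, label_col) sources, dropping duplicate texts."""
    seen, merged = set(), []
    for csv_text, text_col, label_col in sources:
        for text, label in read_rows(csv_text, text_col, label_col):
            if text and text not in seen:
                seen.add(text)
                merged.append({"text": text, "label": label})
    return merged


# Tiny in-memory stand-ins for the two Kaggle files (made-up rows).
first = (
    "text,generated\n"
    "An essay written by a student.,0.0\n"
    "An essay produced by a model.,1.0\n"
)
second = (
    "text,label,source\n"
    "An essay produced by a model.,1,example\n"
    "Another human essay.,0,example\n"
)

combined = merge_datasets([(first, "text", "generated"), (second, "text", "label")])
for row in combined:
    print(row["label"], row["text"])
```

Both sources end up with the same two columns, and the essay that appears in both files is kept only once.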

<h4><b>Fine Tuning and Distillation</b></h4>
<h5><b>Fine Tuning</b></h5><p>We fine-tune <b>BERT-base</b>, available on <a href="https://huggingface.co/google-bert/bert-base-uncased"> Hugging Face </a>, on our downstream task (text classification) and then use it as our teacher model.</p>

<h5><b>Distillation</b></h5><p>We use <b>DistilBERT-base</b>, available on <a href="https://huggingface.co/distilbert/distilbert-base-uncased"> Hugging Face </a>, as our student model and distill it with the help of our teacher model.</p>

<p>The following notebook covers the model distillation and quantization part of this project.</p>
<p><b><a href="https://www.kaggle.com/code/openmihirpatel/finetuning-and-distillation">Model Distillation and Quantization</a></b></p>


### **Quantization**

**Quantization** compresses a model to reduce its memory usage and computational cost. It converts the model's parameters from higher precision (e.g., 32-bit floats) to lower precision (e.g., 8-bit integers), which decreases the model size and can speed up inference. Quantization is especially useful when deploying models to edge devices with limited hardware resources.
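
To make the float-to-integer mapping concrete, here is a minimal pure-Python sketch of affine (scale plus zero-point) 8-bit quantization. It illustrates the general idea only; it is not the method used in the notebook, and the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to the int8 range [-128, 127] with a scale and zero-point."""
    # The represented range must contain 0 so that 0.0 maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if every value is zero
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the stored integers."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-0.62, -0.05, 0.0, 0.31, 1.20]  # made-up example weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("max round-trip error:", max_error)
```

Each weight now occupies one byte instead of four, at the cost of a small rounding error that is bounded by about half the scale.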

**Benefits**:
- **Reduces Model Size**: Decreases the storage needed for the model.
- **Speeds Up Inference**: Reduces the time taken to make predictions.
- **Energy Efficiency**: Useful for low-power devices.


### **Distillation**

**Distillation** is a technique to compress a large, complex model (often called the **teacher model**) into a smaller, faster model (known as the **student model**). This process allows the student model to learn from the output predictions of the teacher model, retaining much of the original accuracy while being more efficient.

The teacher model is trained on a specific task, such as text classification, and is then used to train the student model. The student model learns to mimic the teacher, making it possible to achieve similar accuracy with reduced model size and computational needs.
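
One common way to make the student "mimic" the teacher is to combine a softened KL-divergence term against the teacher's predictions with the usual cross-entropy on the true label. The sketch below shows that loss for a single example in plain Python. Whether the notebook uses exactly this formulation is an assumption, and the `temperature` and `alpha` values and the sample logits are illustrative only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term: how far the student's softened distribution is from the teacher's.
    kl = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student) if p > 0)
    # Hard-target term: standard cross-entropy against the ground-truth class.
    ce = -math.log(softmax(student_logits)[true_label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


teacher = [-2.0, 3.0]        # teacher is confident in class 1
close_student = [-1.8, 2.9]  # student that roughly agrees with the teacher
far_student = [1.5, -0.5]    # student that disagrees with the teacher

loss_close = distillation_loss(close_student, teacher, true_label=1)
loss_far = distillation_loss(far_student, teacher, true_label=1)
print("agreeing student:", loss_close)
print("disagreeing student:", loss_far)
```

The student that agrees with the teacher gets the smaller loss, which is the signal that pulls the student towards the teacher during training.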

**Benefits**:
- **Improved Efficiency**: Student models are smaller and faster.
- **Retained Performance**: Often achieves accuracy close to the teacher model.
- **Scalable Deployment**: Useful for real-time applications needing faster response times.


<h5><b>Results discussion</b></h5>

<table>
  <tr>
    <th>Model</th>
    <th>Accuracy</th>
    <th>Loss</th>
    <th>Parameters</th>
    <th>Size</th>
    <th>Time</th>
  </tr>
  <tr>
    <td>BERT-base-uncased</td>
    <td> 0.993515 </td>
    <td> 0.025127 </td>
    <td> 109.48 M </td>
    <td> 438.003 MB </td>
    <td> 495.485 ms </td>
  </tr>
  <tr>
    <td>Distilled-BERT-base-uncased</td>
    <td> 0.990813 </td>
    <td> 0.040144 </td>
    <td> 66.36 M </td>
    <td> 265.490 MB </td>
    <td> 299.864 ms </td>
  </tr>
  <tr>
    <td>Quantized BERT-base-uncased</td>
    <td> 0.997027 </td>
    <td> 0.019512 </td>
    <td> 109.48 M </td>
    <td> 181.483 MB </td>
    <td> 340.033 ms </td>
  </tr>
  <tr>
    <td>Quantized Distilled-BERT-base-uncased</td>
    <td> 0.977033 </td>
    <td> 0.065918 </td>
    <td> 66.36 M </td>
    <td> 138.112 MB </td>
    <td> 191.589 ms </td>
  </tr>
</table>
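
The relative gains are easier to read as ratios against the BERT-base-uncased baseline. The short script below derives them from the figures in the table. The 4-bytes-per-parameter check assumes 32-bit float weights and that MB means 10^6 bytes, neither of which is stated above.

```python
# Figures copied from the results table (parameters in millions, size in MB, time in ms).
results = {
    "BERT-base-uncased": {"params_m": 109.48, "size_mb": 438.003, "time_ms": 495.485},
    "Distilled-BERT-base-uncased": {"params_m": 66.36, "size_mb": 265.490, "time_ms": 299.864},
    "Quantized BERT-base-uncased": {"params_m": 109.48, "size_mb": 181.483, "time_ms": 340.033},
    "Quantized Distilled-BERT-base-uncased": {"params_m": 66.36, "size_mb": 138.112, "time_ms": 191.589},
}

baseline = results["BERT-base-uncased"]

summary = {}
for name, row in results.items():
    summary[name] = {
        # How many times smaller and faster than the baseline.
        "size_reduction": round(baseline["size_mb"] / row["size_mb"], 2),
        "speedup": round(baseline["time_ms"] / row["time_ms"], 2),
        # Size expected if every parameter were a 4-byte float (assumption).
        "fp32_size_mb": round(row["params_m"] * 4, 2),
    }

for name, s in summary.items():
    print(f"{name}: {s['size_reduction']}x smaller, {s['speedup']}x faster, "
          f"fp32 estimate {s['fp32_size_mb']} MB")
```

For the two unquantized models the fp32 estimate lands very close to the reported size, which suggests their weights are stored as 32-bit floats.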