### **Assignment Prompt: Comparative Analysis of BERT, GPT, and BART**  

#### **Objective**  
In this report, you will analyze and compare three major Transformer-based models: **BERT, GPT, and BART**. Your goal is to explore their **architectures, pretraining approaches, fine-tuning strategies, attention mechanisms, and real-world applications** relevant to your field of interest.  
These three models represent the main Transformer architecture families used in NLP today, so you should become familiar with each of them.


This assignment requires **critical thinking and personalized analysis**, so a chatbot alone **will not be able to generate a complete, high-quality response**. You must integrate:  
✅ **Original explanations** of key concepts,  
✅ **Personal insights** based on your professional or academic background, and  
✅ **Real-world examples** that demonstrate practical applications.  

📌 **Using generative AI for help with definitions is allowed, but additional sources must be provided for verification and expansion.**  

---  

### **Sections to Include in Your Report**  

#### **1. Introduction**  
- Introduce **encoder, decoder, and encoder-decoder architectures** and explain their roles in modern AI.  
- Explain why each of these architectures is important for different NLP tasks.  
- You may focus specifically on BERT (encoder-only), GPT (decoder-only), and BART (encoder-decoder) if you prefer.  

#### **2. Model Architectures & Attention Mechanisms**  
- Compare the **encoder, decoder, and encoder-decoder structures** of the three models.  
- Explain **how BERT’s bidirectional encoding** differs from **GPT’s autoregressive approach** and **BART’s hybrid nature**.  
- **Describe the three types of attention mechanisms** used in these models and where they are applied:  
  - **Self-Attention (Encoder-Side, as in BERT and BART's Encoder)** – Each token attends to all other tokens in the input to capture bidirectional context.  
  - **Causal Self-Attention (Decoder-Side, as in GPT and BART's Decoder)** – Each token can only attend to previous tokens, enabling autoregressive text generation.  
  - **Cross-Attention (Encoder-to-Decoder, as in BART)** – The decoder attends to all encoder outputs, allowing for contextualized sequence-to-sequence learning.  
- Include **hand-drawn diagrams** illustrating **how attention works differently in each model**. These must be included as cropped scans or photos in your submission. To include pictures in your Markdown, save them in the same directory as your notebook and embed them with an HTML tag such as `<img src="diagram1.png" alt="Attention Mechanism Diagram" width="600" />`.
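
If it helps to check your diagrams against numbers, here is a minimal sketch of the three attention patterns in plain Python. It is a simplification, not how any of these models is actually implemented: real models apply learned query, key, and value projections and use many heads, while this sketch uses made-up 2-dimensional toy vectors directly as queries and keys. The names `attention_weights`, `source`, and `target` are illustrative only.

```python
import math

def softmax(row):
    """Softmax over one row of scores; a score of -inf gets weight 0."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys, causal=False):
    """Scaled dot-product attention weights, one row per query position."""
    d = len(keys[0])
    weights = []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        if causal:
            # Hide future positions (j > i) so position i only sees the past.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        weights.append(softmax(scores))
    return weights

# Toy vectors (assumed values, chosen only for illustration).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 encoder-side tokens
target = [[0.5, 0.5], [1.0, 0.0]]              # 2 decoder-side tokens

# Self-attention (BERT, BART encoder): every token sees every token.
encoder_self = attention_weights(source, source)

# Causal self-attention (GPT, BART decoder): no weight on future tokens.
decoder_causal = attention_weights(target, target, causal=True)

# Cross-attention (BART): decoder queries attend to all encoder outputs.
cross = attention_weights(target, source)

for name, w in [("self", encoder_self), ("causal", decoder_causal), ("cross", cross)]:
    print(name)
    for row in w:
        print("  ", [round(x, 3) for x in row])
```

The causal matrix is lower-triangular (the first row puts all of its weight on the first token), the encoder matrix is fully dense, and the cross-attention matrix has one row per decoder token and one column per encoder token.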

#### **3. Pretraining Objectives & Key Terminology**  
For each model, explain its **pretraining strategy** and define the following terms **in your own words**:  
- **Masked Language Modeling (MLM)** (BERT)  
- **Causal Language Modeling (CLM)** (GPT)  
- **Autoregression**  
- **Denoising Autoencoder** (BART)  
- **Self-Attention vs. Causal Attention**  
- **Fine-tuning** (How each model is adapted for specific tasks)  
- **Transfer Learning** (Why these models can be used for many NLP applications)  

To ensure originality, you must include **one example per term** that relates to a unique dataset, real-world challenge, or personal experience in your field.  
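
As a starting point for your own definitions, the sketch below shows how the training inputs and targets differ for the three pretraining objectives. It is a simplified illustration under stated assumptions: the sentence is invented, the masked positions are chosen by hand rather than sampled, and the function names (`mlm_example`, `clm_example`, `denoising_example`) are hypothetical. BERT's actual masking procedure and BART's full set of noising functions are more involved than what is shown here.

```python
tokens = ["the", "patient", "reports", "mild", "chest", "pain"]

def mlm_example(tokens, mask_positions):
    """Masked language modeling (BERT-style, simplified).

    Selected tokens are hidden; only those positions are scored.
    """
    inputs, labels = [], []
    for i, tok in enumerate(tokens):
        if i in mask_positions:
            inputs.append("[MASK]")
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)  # None = position is not scored
    return inputs, labels

def clm_example(tokens):
    """Causal language modeling (GPT-style): predict each next token."""
    return tokens[:-1], tokens[1:]

def denoising_example(tokens, span):
    """Denoising (BART-style text infilling, simplified).

    A span is replaced by a single mask token in the encoder input;
    the decoder target is the full original sequence.
    """
    start, end = span
    corrupted = tokens[:start] + ["<mask>"] + tokens[end:]
    return corrupted, list(tokens)

mlm_inputs, mlm_labels = mlm_example(tokens, mask_positions={1, 4})
clm_inputs, clm_labels = clm_example(tokens)
noisy_inputs, clean_target = denoising_example(tokens, span=(2, 4))

print("MLM      :", mlm_inputs, "->", mlm_labels)
print("CLM      :", clm_inputs, "->", clm_labels)
print("Denoising:", noisy_inputs, "->", clean_target)
```

Your report should use examples from your own field rather than this toy sentence.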

#### **4. Fine-Tuning Approaches**  
- Discuss how each model can be fine-tuned for tasks like **text classification, summarization, or question-answering**.   
- Describe how you would **fine-tune one of these models for your own work or in your field of interest**.  

#### **5. Real-World Applications & Your Professional Interests**  
- Identify **three NLP applications** that interest you.  
- Discuss which model (**BERT, GPT, or BART**) would be most effective for each and **justify your choice**.  
- Mention **challenges in your field of expertise or interest** that these models might help solve (e.g., legal document analysis, medical text summarization, automated customer support).  

#### **6. Conclusion**  
- Summarize key takeaways.  
- Reflect on **which model you find most useful** for your own interests and why.  

---  

### **Formatting & Submission Guidelines**  
✅ **Length:** 1500-2000 words (this would be 5-7 pages of double-spaced text).  You can run the `Count_Words.ipynb` notebook to count the words in a notebook or HTML file.  
✅ **Citations:** At least 3 sources (e.g., academic papers, Hugging Face documentation, blog posts). You must provide **one source that is not from OpenAI or Hugging Face**.  
✅ **Figures/Diagrams:** Include **at least three hand-drawn diagrams** of the attention mechanisms. Other hand-drawn figures are welcome. These must be scanned or photographed and cropped for clarity.  
✅ **Use Clear Formatting:** Each section and defined term must have **clear headers and subtitles** for readability. We've included a file, `Homework_09_Report.ipynb`, with all the headers to get you started.  
✅ **Submission Formats:** You can write your report in **Markdown within a Jupyter Notebook** (exported to HTML) or use **Word or a similar program and submit as a PDF**.  
✅ **Code is not required:** If you want, you can include small examples of applications in your area of interest, but this is not a requirement.  
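
If you want a quick check of your word count while drafting, the following is a minimal sketch of one way to count words in the Markdown cells of a notebook. It is not the contents of `Count_Words.ipynb`, and its result may differ from that notebook's; the function names and the definition of a "word" are assumptions made for this illustration.

```python
import json
import re

def count_markdown_words(notebook):
    """Count words in the Markdown cells of a parsed notebook (a dict)."""
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # skip code and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        text = re.sub(r"<[^>]+>", " ", text)  # drop HTML tags such as <img ...>
        # Assumed definition: a word starts with a letter or digit and may
        # contain letters, digits, apostrophes, and hyphens.
        total += len(re.findall(r"[A-Za-z0-9][A-Za-z0-9'\-]*", text))
    return total

def count_words_in_file(path):
    """Load an .ipynb file from disk and count its Markdown words."""
    with open(path, encoding="utf-8") as f:
        return count_markdown_words(json.load(f))
```

For example, `count_words_in_file("Homework_09_Report.ipynb")` would report the Markdown word count of your report notebook, assuming it sits in the current directory.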

---  

### **Tips to Ensure Originality (Avoiding Chatbot Reliance)**  
📌 **Personalization is required** – You must relate the models to your field. A chatbot cannot generate **your professional insights or unique examples**.  
📌 **Diagrams & real-world applications** – Explain concepts visually or with domain-specific examples (e.g., healthcare, finance, law).  
📌 **Compare different sources** – Don't rely on a single AI-generated explanation. Use additional sources beyond chatbots.  
📌 **Critical thinking** – Challenge model limitations. Chatbots often **fail at deep analysis**—you should go beyond surface-level descriptions.  

---  

### **Evaluation Criteria**  
✅ **Completeness, Figures** - 15 points. You addressed all of the prompts and included thoughtful hand-drawn figures.  
✅ **Personalization, Applications** - 15 points. You added significant personalization and discussed applications in your field of interest.  
✅ **Citations, Supporting Evidence** - 10 points. Well-researched with credible sources.  
✅ **AI-Generated Content Penalty** - A generic chatbot-generated document will earn *at most* 20 out of the 40-point total.  

---  

You should begin by copying the headers for each of the numbered sections described above.

I fully expect you to use the text and chatbots to understand each of the ideas and terms,
but I also expect you to think about your answers, personalize them, and provide additional support for them.
Personalization and effort are key to earning full credit for this report.  

I'm generally interested in applications of AI/deep learning in education and healthcare.  To give you an idea of what I'm looking for in your reports, here are some examples of applications related to my own interests that I would discuss in a report:
* **Encoder-only models**
  * Text classification model for classifying patient messages to their doctors as requiring immediate response or not.
  * Vision transformer models for classification are encoder-only models.  I'm interested in testing those in a computer-aided diagnosis system for diagnosing cancer in breast ultrasound images and videos.
* **Decoder-only models**
  * Develop a fine-tuned generative model for producing tailored impressions, and possibly even structured reports, as part of a computer-aided diagnostic system for cancer diagnosis.  This would help reduce dictation time and fatigue.
  * Develop a fine-tuned reasoning LLM for math tutoring that drives a collaborative tutoring system for helping students learn mathematics.
* **Encoder-Decoder models**
  * Question answering models that can be fine-tuned or used in conjunction with a retrieval system that can ingest a corpus of scientific papers and answer questions about them.  (LLMs can already do this to some extent.)
  * Vision transformer models for segmentation of lesions in breast ultrasound images and videos.

