# 📚 Table of Contents

- [🌍 Introduction to Multilingual NLP](#introduction-to-multilingual-nlp)
  - [🌐 Why multilingual models matter](#why-multilingual-models-matter)
  - [🔤 Challenges in multilingual NLP](#challenges-in-multilingual-nlp)
- [🧠 XLM (Cross-lingual Language Model)](#xlm-cross-lingual-language-model)
  - [🌍 XLM’s approach to multilingual representation](#xlms-approach-to-multilingual-representation)
  - [🌐 Using XLM for translation and classification](#using-xlm-for-translation-and-classification)
  - [🧪 Example: Fine-tuning XLM](#example-fine-tuning-xlm)
- [📚 RoBERTa and mT5 for Multilingual Tasks](#roberta-and-mt5-for-multilingual-tasks)
  - [🚀 RoBERTa’s improvements over BERT](#robertas-improvements-over-bert)
  - [📤 Using mT5 for multilingual generation and translation](#using-mt5-for-multilingual-generation-and-translation)
  - [🧪 Example: Fine-tuning mT5](#example-fine-tuning-mt5)


### **1. Multilingual NLP Overview**
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
flowchart TD
    subgraph Challenges["Key Challenges"]
        direction LR
        A[Tokenization] --> B[Language Diversity]
        B --> C[Data Scarcity]
        C --> D[Script Variations]
    end
    
    subgraph Importance["Why Multilingual?"]
        direction LR
        E[🌍 Global Apps] --> F[🔄 Cross-Lingual Transfer]
        F --> G[📈 Low-Resource Languages]
    end
    
    Challenges -->|Solve With| Models[Multilingual Models]
    Importance --> Models
```
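
The tokenization and script-variation challenges in the diagram can be seen with a very small experiment. The sketch below is illustrative only: the sample sentences and the helper names `whitespace_tokens` and `char_tokens` are assumptions made for this example, and real multilingual models rely on learned subword tokenizers rather than either of these baselines.

```python
# Naive whitespace splitting works for space-delimited scripts,
# but returns a single "token" for unsegmented Chinese text.
samples = {
    "en": "Multilingual models matter",
    "es": "Texto en español",
    "zh": "中文文本",
}


def whitespace_tokens(text):
    """Split on whitespace only; a deliberately naive baseline."""
    return text.split()


def char_tokens(text):
    """Fallback: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


for lang, text in samples.items():
    print(lang, whitespace_tokens(text), len(char_tokens(text)))
```

Neither baseline is adequate on its own: whitespace splitting fails on Chinese, while character splitting produces very long sequences for Latin-script text.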

---

### **2. XLM Architecture**
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
flowchart LR
    subgraph XLM["XLM Workflow"]
        direction TB
        LangA[English Text] --> SharedEmb[Shared Embeddings]
        LangB[中文文本] --> SharedEmb
        LangC[Texto español] --> SharedEmb
        SharedEmb --> Transformer[Transformer Encoder]
        Transformer --> Task[Cross-lingual Tasks]
    end
    
    Task --> CLS[Classification]
    Task --> MT[Translation]
    Task --> QA[Question Answering]
```
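
The "Shared Embeddings" step in the diagram can be sketched as a single vocabulary and a single embedding table used by every language. This is a toy illustration under stated assumptions: `tokenize` is a hypothetical stand-in (lowercased characters) for a learned subword tokenizer, the vectors are random rather than trained, and XLM itself is not loaded here.

```python
import random

sentences = {
    "en": "English text",
    "zh": "中文文本",
    "es": "Texto español",
}


def tokenize(text):
    # Hypothetical stand-in for a subword tokenizer: lowercased characters.
    return [ch for ch in text.lower() if not ch.isspace()]


# One vocabulary built over all languages together.
shared_vocab = {}
for text in sentences.values():
    for tok in tokenize(text):
        shared_vocab.setdefault(tok, len(shared_vocab))

EMBED_DIM = 4
rng = random.Random(0)
# One embedding row per vocabulary entry, shared by every language.
embedding_table = [
    [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in shared_vocab
]


def embed(text):
    """Look up each token's row in the shared table."""
    return [embedding_table[shared_vocab[tok]] for tok in tokenize(text)]


# The token "e" occurs in both the English and the Spanish sentence,
# so both lookups return the same row of the table.
print(embed("English text")[0] is embed("Texto español")[1])
```

Because overlapping tokens map to the same row, anything the encoder learns about that row from one language is available to the others, which is the intuition behind cross-lingual transfer.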

---

### **3. RoBERTa Improvements**
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
flowchart LR
    BERT -->|Enhancements| RoBERTa
    subgraph RoBERTa["RoBERTa Optimizations"]
        direction TB
        A[Larger Dataset] --> B[Longer Training]
        B --> C[No NSP Objective]
        C --> D[Dynamic Masking]
    end
    
    RoBERTa --> Apps["Stronger Baseline for Multilingual Variants (XLM-R)"]
```
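
The "Dynamic Masking" box can be contrasted with static masking in a few lines. The following is a minimal sketch, not RoBERTa's actual preprocessing: the `mask_tokens` helper is hypothetical, the masking probability is raised to 0.3 so the effect is visible on a short sentence, and every selected token is replaced by `[MASK]` without any random-replacement or keep-as-is cases.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, rng, mask_prob=0.15):
    """Return a copy of tokens with roughly mask_prob of them replaced by MASK."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


tokens = "the quick brown fox jumps over the lazy dog".split()

# Static masking: mask once during preprocessing and reuse it every epoch.
static_view = mask_tokens(tokens, random.Random(0), mask_prob=0.3)
static_epochs = [static_view for _ in range(3)]

# Dynamic masking: draw a new mask each time the sequence is served.
rng = random.Random(0)
dynamic_epochs = [mask_tokens(tokens, rng, mask_prob=0.3) for _ in range(3)]

for epoch, view in enumerate(dynamic_epochs):
    print(epoch, " ".join(view))
```

With static masking the model sees the same corrupted sentence every epoch, whereas dynamic masking lets it predict different positions of the same sentence over the course of training.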

---

### **4. mT5 for Multilingual Generation**
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
flowchart LR
    subgraph mT5["mT5 Architecture"]
        direction LR
        Input["Multilingual Input"] --> Encoder
        Encoder --> Decoder
        Decoder --> Output["Multilingual Output"]
    end
    
    Input -.->|"English: 'Hello'"| Output1["French: 'Bonjour'"]
    Input -.->|"中文: '你好'"| Output2["Spanish: 'Hola'"]
```
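
Since the diagram treats both input and output as plain text, preparing data for an encoder-decoder model of this kind mostly means formatting string pairs. The sketch below shows one possible formatting; the `to_text_pair` helper and the `translate X to Y:` prefix wording are assumptions for illustration, not something the model requires, and no model is loaded.

```python
def to_text_pair(source_lang, target_lang, source_text, target_text):
    """Format one translation example as an (input, target) pair of strings."""
    # The prefix is a hypothetical convention that tells the model which task to run.
    prefix = f"translate {source_lang} to {target_lang}: "
    return {"input": prefix + source_text, "target": target_text}


# The two pairs shown in the diagram above.
examples = [
    to_text_pair("English", "French", "Hello", "Bonjour"),
    to_text_pair("Chinese", "Spanish", "你好", "Hola"),
]

for example in examples:
    print(example["input"], "->", example["target"])
```

Whatever prefix is chosen, it should be used consistently during fine-tuning and inference so the model learns to associate it with the task.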

---

### **5. Fine-tuning Workflow**
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'fontSize': '14px'}}}%%
flowchart TD
    Model[Pretrained Model] --> FT[Fine-tuning]
    FT -->|Multilingual Dataset| Train[Training]
    Train --> Eval[Evaluation]
    
    subgraph Dataset["Example Data"]
        direction LR
        D1["English: Positive"] --> Mix
        D2["中文: 正面"] --> Mix
        D3["Español: Positivo"] --> Mix
    end
    
    Dataset --> FT
```
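
The "Example Data" subgraph feeds three languages into one mixed training set. A minimal sketch of that mixing step follows, assuming a sentiment task: the example sentences, the `LABEL_IDS` mapping, and the `mix` and `split` helpers are all hypothetical and exist only to make the workflow concrete.

```python
import random

# Hypothetical per-language data; labels are written in each language.
datasets = {
    "en": [("Great product", "Positive"), ("Terrible service", "Negative")],
    "zh": [("非常好", "正面"), ("很糟糕", "负面")],
    "es": [("Muy bueno", "Positivo"), ("Muy malo", "Negativo")],
}

# Map language-specific label strings onto one shared label space.
LABEL_IDS = {
    "Positive": 1, "正面": 1, "Positivo": 1,
    "Negative": 0, "负面": 0, "Negativo": 0,
}


def mix(datasets, seed=0):
    """Flatten all languages into one list and shuffle it reproducibly."""
    rows = [
        {"lang": lang, "text": text, "label": LABEL_IDS[label]}
        for lang, pairs in datasets.items()
        for text, label in pairs
    ]
    random.Random(seed).shuffle(rows)
    return rows


def split(rows, eval_fraction=0.25):
    """Hold out the first rows for evaluation and train on the rest."""
    n_eval = max(1, int(len(rows) * eval_fraction))
    return rows[n_eval:], rows[:n_eval]


mixed = mix(datasets)
train_rows, eval_rows = split(mixed)
print(len(train_rows), "training rows,", len(eval_rows), "evaluation rows")
```

Shuffling across languages keeps each training batch multilingual; in practice one might also report evaluation metrics per language rather than only in aggregate.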

---



# <a id="introduction-to-multilingual-nlp"></a>🌍 Introduction to Multilingual NLP

## <a id="why-multilingual-models-matter"></a>🌐 Why multilingual models matter

## <a id="challenges-in-multilingual-nlp"></a>🔤 Challenges in multilingual NLP

---

# <a id="xlm-cross-lingual-language-model"></a>🧠 XLM (Cross-lingual Language Model)

## <a id="xlms-approach-to-multilingual-representation"></a>🌍 XLM’s approach to multilingual representation

## <a id="using-xlm-for-translation-and-classification"></a>🌐 Using XLM for translation and classification

## <a id="example-fine-tuning-xlm"></a>🧪 Example: Fine-tuning XLM

---

# <a id="roberta-and-mt5-for-multilingual-tasks"></a>📚 RoBERTa and mT5 for Multilingual Tasks

## <a id="robertas-improvements-over-bert"></a>🚀 RoBERTa’s improvements over BERT

## <a id="using-mt5-for-multilingual-generation-and-translation"></a>📤 Using mT5 for multilingual generation and translation

## <a id="example-fine-tuning-mt5"></a>🧪 Example: Fine-tuning mT5
