# Session 0: Orientation and setup. LLMs for low-resource languages

<div align="center">

**📚 Course Repository:** [github.com/NinaKivanani/Tutorials_low-resource-llm](https://github.com/NinaKivanani/Tutorials_low-resource-llm)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NinaKivanani/Tutorials_low-resource-llm/blob/main/Session0_Orientation_and_Setup_LLMs_Low_Resource.ipynb)
[![GitHub](https://img.shields.io/badge/GitHub-View%20Repository-blue?logo=github)](https://github.com/NinaKivanani/Tutorials_low-resource-llm)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)

</div>

---

## Your Learning Quest Begins Here!

**Welcome, Language Champion!** 👋 You're about to embark on a journey that will take you from an AI-curious learner to a **multilingual AI practitioner**. This isn't just another coding tutorial: it's your mission to help democratize AI for the world's 6,900+ languages!

### 🏆 Your 30-45 Minute Setup Challenge:
```
🎯 Mission Checklist:
├── 🔧 Power Up Your Environment     [□□□□□] 0%
├── 🌍 Choose Your Language Quest    [□□□□□] 0%
├── 📊 Build Your Evaluation Toolkit [□□□□□] 0%
├── 🤖 Test with Real AI Models      [□□□□□] 0%
└── 🚀 Ready for Session 1!          [□□□□□] 0%
```

**⏱️ Total time:** 30-45 minutes of pure setup magic!  

---

## 🌟 The Big Picture: Why this session matters

**🚨 The Problem:**

Current LLMs are strongest in English and a small number of high-resource languages. For most of the world's languages, data coverage, tools, and evaluation resources are limited.  

This session helps you:

- make sure your technical setup will not block you in later sessions  
- prepare language-specific examples that you will re-use across the course  
- think from the beginning about correctness, fluency, cultural fit, and safety in low-resource settings

## 🗺️ Tutorial roadmap: where Session 0 fits

### The full tutorial is organized as follows:

**🔍 Session 0: Orientation and setup**
- Environment, language choice, and evaluation sheet preparation.

**🔍 Session 1: Intro to LLMs: dialogue summarization in low-resource settings**
- How LLMs handle multilingual dialogue, tokenization effects, and simple evaluation.

**🎯 Session 2: Prompt design and cross-lingual prompting**
- Designing prompts that work across languages, with a focus on low-resource ones.

**⚙️ Session 3: Lightweight model adaptation**
- Fine-tuning or parameter-efficient adaptation for specific domains or languages.

**⚖️ Session 4: Bias, safety, and evaluation**
- Identifying and documenting biases and failure patterns, especially for under-represented communities.

This notebook only aims to ensure that, by the time you start Session 1, your environment and data are ready.

## ✅ Step 0: Quick Start Checklist
Before you proceed, check the following.

**🔧 Environment**
- [ ] You can run a notebook in **Google Colab** or a **local Jupyter** installation  
- [ ] You know how to run cells from top to bottom  
- [ ] Optional: you have access to a GPU runtime in Colab or locally

**🌍 Language choice**
- [ ] You have chosen at least one target language (for example Luxembourgish, Irish, Welsh, Maltese, or Basque)  
- [ ] You can provide 5–10 short example sentences in this language  
- [ ] The sentences contain no personal or sensitive data

**🔑 Accounts (optional but useful)**
- [ ] [Hugging Face account](https://huggingface.co) (free model access)
- [ ] [GitHub account](https://github.com) (save your work)


---

## 🎮 Notebook Survival Guide

### 🔄 Essential Habits

Following these practices avoids unnecessary debugging later.

- **Run cells top-to-bottom** when starting fresh.
- **Restart the runtime if unexpected import or CUDA errors appear**, then re-run from the top.
    - Colab: `Runtime → Restart runtime`
- **Save frequently** (Ctrl+S / Cmd+S).
  - Colab saves automatically, but you can also download a copy.

## 🔧 Step 1: Choose your execution platform

You can run this notebook in any recent Jupyter environment. The most common options are:

```
🥇 Google Colab (Recommended)
   ├── ✅ Zero setup required
   ├── ✅ Free GPU power
   ├── ✅ Works everywhere
   └── 🎮 Click "Open in Colab" above!

🥈 Local Jupyter
   ├── ✅ Full control over the Python environment
   ├── ✅ Works offline
   ├── ⚠️  Requires Python 3.8+
   └── 🎮 Install packages below
```
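
If you are unsure which of the two platforms a notebook is running on, a quick check can tell you. The following is a minimal sketch, not part of the course code: `describe_runtime` is a hypothetical helper, and it assumes that Colab has already loaded the `google.colab` module into the kernel, which is the usual behaviour.

```python
import sys

def describe_runtime(min_version=(3, 8)):
    """Summarise where the code is running and whether Python is recent enough."""
    # Assumption: a Colab kernel has the google.colab module loaded at startup.
    in_colab = "google.colab" in sys.modules
    return {
        "platform": "colab" if in_colab else "local",
        "python": ".".join(map(str, sys.version_info[:3])),
        "python_ok": sys.version_info[:2] >= min_version,
    }

info = describe_runtime()
print(info)
```

If `python_ok` is `False` on a local machine, upgrading Python before installing the libraries is likely to save time later.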

In all cases, the next cell installs the Python libraries used in the tutorial.

In [1]:
# 🚀 Installing Essential AI Libraries
#
# What we're installing and why:
# • transformers: access to pre-trained language models (BERT, GPT, etc.)
# • datasets: loading multilingual datasets from Hugging Face
# • sentencepiece & tokenizers: splitting text into the subword tokens that models consume
# • accelerate: helpers for running models on the available hardware (CPU or GPU)
# • sentence-transformers: turning sentences into embedding vectors for comparison
# • scikit-learn: machine learning tools for evaluation (and PCA later in this notebook)
# • pandas & matplotlib: evaluation tables and plots (preinstalled in Colab, needed locally)

print("🤖 Installing AI libraries (2-3 minutes)...")
!pip -q install transformers datasets sentencepiece tokenizers accelerate sentence-transformers scikit-learn pandas matplotlib
print("✅ Installation step finished! (Dependency warnings are normal in Colab)")

🤖 Installing AI libraries (2-3 minutes)...


## 🔍 Step 2: System Diagnostics & Power Check!
We will:

- print versions of the main libraries  
- check whether a GPU is available  
- confirm that the plotting libraries can be imported

In [2]:
# 🔍 Verifying Your Setup
#
# This cell checks that all libraries installed correctly and shows your system info.
# If you see any ❌, re-run the installation cell above.

import sys, platform, importlib
from datetime import datetime, timezone

def get_version(name: str) -> str:
    """Return the version of an importable module, or a marker string if it is missing."""
    try:
        return importlib.import_module(name).__version__
    except Exception:  # ImportError, or a module without __version__
        return "❌ not available"

# datetime.utcnow() is deprecated since Python 3.12, so use a timezone-aware timestamp
print(f"🎮 SYSTEM CHECK - {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}")
print(f"🐍 Python: {sys.version.split()[0]} | 💻 Platform: {platform.system()}")
print()

# Check essential libraries (import name, display name)
libraries = [
    ("transformers", "Transformers"), ("datasets", "Datasets"),
    ("sentence_transformers", "Sentence Transformers"), ("torch", "PyTorch"),
    ("sklearn", "Scikit-learn"), ("matplotlib", "Matplotlib"),
    ("pandas", "Pandas"), ("numpy", "NumPy")
]

print("📚 Library Status:")
for pkg, name in libraries:
    version = get_version(pkg)
    status = "✅" if "not available" not in version else "❌"
    print(f"  {status} {name}: {version}")

print("\nAll should show ✅ - if you see ❌, re-run installation above!")

🎮 SYSTEM CHECK - 2026-01-21 05:11 UTC
🐍 Python: 3.12.12 | 💻 Platform: Linux

📚 Library Status:
  ✅ Transformers: 4.57.3
  ✅ Datasets: 4.0.0
  ✅ Sentence Transformers: 5.2.0
  ✅ PyTorch: 2.9.0+cpu
  ✅ Scikit-learn: 1.6.1
  ✅ Matplotlib: 3.10.0
  ✅ Pandas: 2.2.2
  ✅ NumPy: 2.0.2

All should show ✅ - if you see ❌, re-run installation above!
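
The check above imports each module and reads its `__version__` attribute. An alternative that avoids importing heavy libraries is to ask the package metadata directly. This is a minimal sketch using the standard-library `importlib.metadata`; `installed_version` is a hypothetical helper name. Note that it expects pip distribution names (for example `scikit-learn`), not import names (`sklearn`).

```python
from importlib import metadata

def installed_version(distribution: str):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None

# Distribution names as used by pip, which can differ from import names.
for dist in ["transformers", "scikit-learn", "sentence-transformers"]:
    print(f"{dist}: {installed_version(dist) or 'not installed'}")
```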


In [3]:
# 🖥️ Checking available hardware
#
# This cell tests whether a GPU is accessible. The course can be completed on CPU,
# but a GPU will make later experiments faster.

print("\nHardware Check:")
try:
    import torch
    if torch.cuda.is_available():
        print(f"🚀 GPU: ✅ {torch.cuda.get_device_name(0)} (CUDA {torch.version.cuda})")
        print("   💡 Great! You can run larger models faster")
    else:
        print("🖥️ GPU: ❌ Not available (using CPU)")
        print("   💡 No worries! CPU works fine for this course")
        print("   🔧 For GPU in Colab: Runtime → Change runtime type → GPU")
except Exception as e:
    print(f"❌ Hardware check failed: {e}")
    print("💡 This might be okay - continue with the course")

print("\n🎉 Setup complete! Ready for your language adventure!")


Hardware Check:
🖥️ GPU: ❌ Not available (using CPU)
   💡 No worries! CPU works fine for this course
   🔧 For GPU in Colab: Runtime → Change runtime type → GPU

🎉 Setup complete! Ready for your language adventure!
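
In later experiments it is convenient to store the result of this hardware check in a variable and reuse it. The sketch below is one possible way to do that; `pick_device` is a hypothetical helper, and it falls back to `"cpu"` when PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: without PyTorch we simply report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Selected device: {device}")
```

The resulting string can then be passed to libraries that accept a device argument, for example `SentenceTransformer(MODEL_NAME, device=device)`.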


## 🌍 Step 3: Select your target language!

Throughout the tutorial we will repeatedly use a small set of sentences in at least one target language. This session helps you define that set in a structured way.

### 🎯 Language Selection Strategy:
```
🏆 Perfect Test Languages:
├── 🇱🇺 Luxembourgish (~600K speakers)
├── 🇮🇪 Irish (~1.7M speakers)
├── 🇲🇹 Maltese (~500K speakers)
├── 🇮🇸 Icelandic (~350K speakers)
├── 🏴󠁧󠁢󠁷󠁬󠁳󠁿 Welsh (~900K speakers)
└── 🇪🇸 Basque (~750K speakers)
```

When choosing a language, consider:

- coverage: languages with limited web presence are particularly interesting  
- your own familiarity: you should be able to judge correctness and style  
- script and morphology: non-Latin scripts or rich morphology often reveal weaknesses

When creating sentences, aim for:

- **Length**: roughly 6–20 words  
- **Content**: a mix of simple statements, questions, numbers, borrowed words, and basic punctuation  
- **Safety**: no personal names, addresses, or sensitive information

The next cell defines a small dictionary with example sentences and allows you to add your own.

In [4]:
# 🎯 Language dataset creation
#
# This cell defines your target language and a set of example sentences.
# You should replace the example sentences with your own for the language you care about.

# 🌍 Choose your target language (edit this line!)
TARGET_LANG = "lb"  # 🇱🇺 Luxembourgish (change to your choice!)

print(f" LANGUAGE SELECTED: {TARGET_LANG}")
print(" Edit the line above to change your language")
print()

# 📚 Your Test Sentences (REPLACE WITH YOUR OWN!)
mini_texts = {
    "en": [
        "I enjoy learning how language models process text.",
        "This sentence includes a number: 2026, and a comma.",
        "Short prompts can still produce complex outputs.",
        "Tokenization choices can change meaning and cost.",
        "We will evaluate correctness, fluency, and safety."
    ],
    "lb": [  # Luxembourgish examples
        "Ech hunn Loscht ze verstoen, wéi Sproochmodeller Text verschaffen.",
        "Dëse Saz enthält eng Zuel: 2026, an eng Komma.",
        "Kuerz Prompte kënnen trotzdeem komplex Äntwerte produzéieren.",
        "Tokeniséierung kann Bedeitung an Käschte beaflossen.",
        "Mir evaluéieren Richtegkeet, Flëssegkeet an Sécherheet."
    ],
    "ga": [  # Irish examples
        "Is maith liom foghlaim conas a phróiseálann samhlacha teanga téacs.",
        "Tá uimhir sa abairt seo: 2026, agus camóg.",
        "Is féidir le leid ghearr aschuir chasta a tháirgeadh fós.",
        "Is féidir le roghanna tocainithe brí agus costas a athrú.",
        "Déanaimid measúnú ar chruinneas, líofacht agus sábháilteacht."
    ],
    "mt": [  # Maltese examples
        "Jogħobni nitgħallem kif il-mudelli tal-lingwa jipproċessaw it-test.",
        "Din is-sentenza tinkludi numru: 2026, u virgola.",
        "Prompts qosra xorta jistgħu jipproduċu outputs kumplessi.",
        "L-għażliet tat-tokenization jistgħu jbiddlu t-tifsira u l-ispejjeż.",
        "Aħna nevalwaw it-tħassib, il-fluwenza u s-sigurtà."
    ]
}

# 🔧 Add your language if it's not in the examples above
if TARGET_LANG not in mini_texts:
    print(f"\n {TARGET_LANG} not in examples - please add your sentences below!")
    mini_texts[TARGET_LANG] = [
        "Add your first sentence here (6-20 words).",
        "Add your second sentence with a number: 2026.",
        "Add your third sentence here.",
        "Add your fourth sentence here.",
        "Add your fifth sentence here."
    ]
    print(" Edit the list above to add your own sentences!")
else:
    print(f"\n Found example sentences for {TARGET_LANG}!")

print("\n📋 Your current dataset:")
print(f"🌍 Language: {TARGET_LANG}")
print(f"📊 Number of sentences: {len(mini_texts[TARGET_LANG])}")
print(f"📝 Example: '{mini_texts[TARGET_LANG][0]}'")

print(f"\n🔍 All your {TARGET_LANG} sentences:")
for i, sentence in enumerate(mini_texts[TARGET_LANG], 1):
    print(f"  {i}. {sentence}")

print("\n💡 Remember: You can edit these sentences anytime by modifying the cell above!")

 LANGUAGE SELECTED: lb
 Edit the line above to change your language


 Found example sentences for lb!

📋 Your current dataset:
🌍 Language: lb
📊 Number of sentences: 5
📝 Example: 'Ech hunn Loscht ze verstoen, wéi Sproochmodeller Text verschaffen.'

🔍 All your lb sentences:
  1. Ech hunn Loscht ze verstoen, wéi Sproochmodeller Text verschaffen.
  2. Dëse Saz enthält eng Zuel: 2026, an eng Komma.
  3. Kuerz Prompte kënnen trotzdeem komplex Äntwerte produzéieren.
  4. Tokeniséierung kann Bedeitung an Käschte beaflossen.
  5. Mir evaluéieren Richtegkeet, Flëssegkeet an Sécherheet.

💡 Remember: You can edit these sentences anytime by modifying the cell above!
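
The sentence guidelines from this step (roughly 6–20 words, no personal data) can be partly checked automatically. The sketch below is only a rough aid, not a replacement for reading your sentences: `check_sentence` is a hypothetical helper, it counts words by splitting on whitespace (which does not suit languages written without spaces), and its patterns for email addresses and phone numbers are simple heuristics.

```python
import re

def check_sentence(sentence: str, min_words: int = 6, max_words: int = 20) -> list:
    """Return a list of warnings for one candidate test sentence."""
    warnings = []
    n_words = len(sentence.split())  # whitespace-based word count
    if not min_words <= n_words <= max_words:
        warnings.append(f"{n_words} words (target: {min_words}-{max_words})")
    # Heuristic patterns only; they will miss many kinds of personal data.
    if re.search(r"\S+@\S+\.\S+", sentence):
        warnings.append("looks like it contains an email address")
    if re.search(r"\+?\d[\d\s-]{7,}\d", sentence):
        warnings.append("looks like it contains a phone number")
    return warnings

examples = [
    "This sentence includes a number: 2026, and a comma.",
    "Too short.",
]
for s in examples:
    print(s, "->", check_sentence(s) or "OK")
```

To check your own data, loop over `mini_texts[TARGET_LANG]` instead of `examples`.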


## 📊 Step 4: Build a simple evaluation framework

Time to create your secret weapon for systematic AI evaluation! For low-resource languages, standard automatic metrics are often unreliable or unavailable. To keep the analysis systematic across sessions, we will use a simple human-centric rating scheme.


### For each model output, you will record:

- **✅ Correctness (0–2)**: Does the output solve the task or answer the question?
- **🗣️ Fluency (0–2)**: Does the text read naturally in the target language?  
- **🌍 Cultural (0–2)**: Does the output respect linguistic and cultural norms?
- **🛡️ Safety (0–2)**: Does the output avoid harmful or inappropriate content?
- **📝 Notes (free text)**: Any additional observations about errors or interesting behaviour

The next cell creates a table that combines English and your target language sentences. You will fill this table during later sessions.

In [None]:
# Building the evaluation sheet
#
# This creates a structured DataFrame with one row per input sentence.
# You will fill in the model, prompt, ratings, and notes as you run experiments.

import pandas as pd

def create_evaluation_sheet(texts: dict, target_lang: str) -> pd.DataFrame:
    """Create evaluation sheet with empty columns to fill during sessions"""
    rows = []

    # Add both English and target language sentences
    # (dict.fromkeys removes the duplicate when the target language is "en")
    for lang in dict.fromkeys(["en", target_lang]):
        if lang in texts:
            for i, sentence in enumerate(texts[lang], 1):
                rows.append({
                    "item_id": f"{lang}_{i}",
                    "language": lang,
                    "input_text": sentence,
                    "task": "",           # Session 1: summarization, Session 2: prompting, etc.
                    "model": "",          # Which AI model was used
                    "prompt_style": "",   # How you asked the AI
                    "output_text": "",    # What the AI produced
                    "correctness_0to2": "",  # 0=wrong, 1=mostly right, 2=perfect
                    "fluency_0to2": "",      # 0=broken, 1=awkward, 2=natural
                    "cultural_0to2": "",     # 0=inappropriate, 1=minor issues, 2=appropriate
                    "safety_0to2": "",       # 0=harmful, 1=minor concerns, 2=safe
                    "notes": ""              # Your observations
                })

    return pd.DataFrame(rows)

# Create your personal evaluation sheet
eval_df = create_evaluation_sheet(mini_texts, TARGET_LANG)

print(f"📋 Created evaluation sheet: {len(eval_df)} rows")
print(f"🌍 Languages: English + {TARGET_LANG}")
print(f"📊 Rows per language: {eval_df['language'].value_counts().to_dict()}")

print("\n Preview (you'll fill the empty columns during sessions):")
display(eval_df.head(6))

#print("\n💡 This will be your scientific log throughout all sessions!")
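
When you start filling in the sheet, it helps to reject scores outside the 0–2 scale before they reach the table. The following is a minimal sketch in plain Python: `validate_rating` and `mean_scores` are hypothetical helpers, and the ratings shown are made up for illustration, not real model results.

```python
from statistics import mean

CRITERIA = ("correctness_0to2", "fluency_0to2", "cultural_0to2", "safety_0to2")

def validate_rating(rating: dict) -> dict:
    """Check that every criterion is scored 0, 1, or 2; return the rating unchanged."""
    for name in CRITERIA:
        if rating.get(name) not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2, got {rating.get(name)!r}")
    return rating

def mean_scores(rows, language):
    """Average each criterion over the rows of one language."""
    subset = [r for r in rows if r["language"] == language]
    return {name: mean(r[name] for r in subset) for name in CRITERIA}

# Made-up ratings for illustration only.
ratings = [
    validate_rating({"item_id": "en_1", "language": "en", "correctness_0to2": 2,
                     "fluency_0to2": 2, "cultural_0to2": 2, "safety_0to2": 2}),
    validate_rating({"item_id": "lb_1", "language": "lb", "correctness_0to2": 1,
                     "fluency_0to2": 0, "cultural_0to2": 1, "safety_0to2": 2}),
]
print(mean_scores(ratings, "lb"))
```

Comparing the per-language averages for English and your target language is a simple first way to spot where a model is weaker.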

## 💾 Step 5: Save the evaluation sheet

We now save the evaluation table to a CSV file so that you can:

- re-use it in later sessions  
- inspect or edit it outside the notebook if necessary  
- keep a record of your ratings and model outputs

The next cell writes the file to a folder called `session0_outputs`.

In [None]:
# Saving the evaluation sheet to disk
#
# The file will be written as: session0_outputs/evaluation_sheet_<TARGET_LANG>.csv


from pathlib import Path
import sys

# Create output folder and save evaluation sheet
output_dir = Path("session0_outputs")
output_dir.mkdir(parents=True, exist_ok=True)
filename = f"evaluation_sheet_{TARGET_LANG}.csv"
file_path = output_dir / filename

# UTF-8 keeps accented and non-Latin characters intact
eval_df.to_csv(file_path, index=False, encoding="utf-8")

print(f"✅ Saved: {filename}")
print(f"📁 Location: {file_path}")
print(f"📊 Contains: {len(eval_df)} rows for evaluation")

# Platform-specific download instructions
# (checking sys.modules also works outside IPython, where get_ipython() is undefined)
if "google.colab" in sys.modules:
    print("\n📥 In Colab: Files panel (left) → session0_outputs → right-click → Download")
else:
    print(f"\n📂 Local path: {file_path.resolve()}")
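
Before relying on the saved file in later sessions, it is worth confirming that non-ASCII characters survive a write and read cycle. The sketch below demonstrates the idea with the standard-library `csv` module and a temporary file, so it does not touch `session0_outputs`; the file name `evaluation_sheet_demo.csv` is an arbitrary choice for this example.

```python
import csv
import tempfile
from pathlib import Path

rows = [{
    "item_id": "lb_1",
    "language": "lb",
    "input_text": "Dëse Saz enthält eng Zuel: 2026, an eng Komma.",
}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "evaluation_sheet_demo.csv"
    # Write with an explicit UTF-8 encoding; newline="" is what the csv module expects.
    with path.open("w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    # Read the file back with the same encoding.
    with path.open("r", encoding="utf-8", newline="") as f:
        loaded = [dict(row) for row in csv.DictReader(f)]

print(loaded[0]["input_text"])
print("Round trip OK:", loaded == rows)
```

With pandas, the equivalent read is `pd.read_csv(file_path, encoding="utf-8")`. Note that pandas loads the still-empty rating columns as `NaN` by default.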



## ✅ Step 6 (optional): Test a multilingual model

If your network connection and runtime allow it, you can now perform a small sanity check.

### 🎯 This optional step will:
```
🤖 Load a multilingual sentence embedding model
📊 Encode your English and target language sentences  
📈 Prepare data for a simple 2D visualization
```

**⚠️ Network issues? Skip this test and proceed - you're still ready for Session 1!**

In [None]:
# 🤖 Testing with a Real Multilingual AI Model
#
# This downloads a multilingual model and runs it on your sentences.
# It converts each sentence to a vector of numbers (an embedding) that can be compared with others.
# This confirms your setup works and gives you a preview of Session 1!
#
# Note: First download takes 1-2 minutes (~420MB). Skip if network issues.

from sentence_transformers import SentenceTransformer
import numpy as np

MODEL_NAME = "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"

print(f"🔄 Loading multilingual model: {MODEL_NAME}")
print("⏱️ First time: 1-2 minutes download (~420MB)")

try:
    model = SentenceTransformer(MODEL_NAME)
    print("✅ Model loaded successfully!")

    # Prepare sentences from both languages
    test_texts, test_labels = [], []
    print("\n📊 Processing sentences:")

    # dict.fromkeys avoids processing English twice when TARGET_LANG is "en"
    for lang in dict.fromkeys(["en", TARGET_LANG]):
        if lang in mini_texts:
            for sentence in mini_texts[lang]:
                test_texts.append(sentence)
                test_labels.append(lang)
                print(f"  📝 {lang}: {sentence[:50]}{'...' if len(sentence) > 50 else ''}")

    # Convert sentences to numerical representations (unit-length vectors)
    print(f"\n🧠 Converting {len(test_texts)} sentences to numbers...")
    embeddings = model.encode(test_texts, normalize_embeddings=True)

    print(f"✅ Success! Each sentence → {embeddings.shape[1]} numbers")
    print(f"📊 Shape: {embeddings.shape} | 🌍 Languages: {len(set(test_labels))}")

except Exception as e:
    print(f"❌ Model test failed: {e}")
    print("💡 Network issue? Skip this test - you can still continue!")
    print("🔧 Or try: Runtime → Restart Runtime, then re-run from top")
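
Because the cell above requests `normalize_embeddings=True`, every embedding has length 1, and the cosine similarity of two sentences reduces to their dot product. The sketch below illustrates this with tiny made-up vectors in plain Python; the three "embeddings" are invented for the example and have only 3 dimensions instead of several hundred.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine similarity; after normalization it is just the dot product."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

# Toy "embeddings" (made up for illustration).
en_sentence = [0.9, 0.1, 0.3]
lb_translation = [0.8, 0.2, 0.35]
unrelated = [-0.2, 0.9, -0.4]

print(round(cosine_similarity(en_sentence, lb_translation), 3))
print(round(cosine_similarity(en_sentence, unrelated), 3))
```

With the real `embeddings` array, `embeddings @ embeddings.T` gives the full sentence-by-sentence similarity matrix in one step, which is a quick way to see whether each English sentence is closest to its translation.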

### What PCA is doing in this plot

The model turns each sentence into a vector of many numbers, for example 384 numbers for a single sentence. That is far too many dimensions to show in a 2D plot.

**Principal Component Analysis (PCA)** is a simple way to:

- compress these high-dimensional vectors into 2 numbers per sentence  
- keep as much of the original variation as possible  

You can think of it like this:

- Imagine you have a 3D object, but you can only draw on a flat sheet of paper  
- You shine a light and look at the shadow on the paper  
- The shadow loses some detail, but you still see the main shape  

Here:

- the 3D object stands for a high-dimensional sentence embedding  
- the 2D shadow is the point we plot for that sentence  

In the scatter plot below:

- each point is one sentence  
- points that are close together usually represent sentences that the model sees as similar, although the 2D projection discards some detail, so distances are only approximate  
- you can check whether sentences from different languages stay separate or mix together in this 2D space  
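
To make the "shadow" idea concrete, the sketch below finds the first principal component of a few made-up 3D points using only the standard library. It uses power iteration on the covariance matrix, which is a simple illustration and not the method scikit-learn uses internally; `first_principal_component` is a hypothetical helper, and it assumes the starting vector is not exactly perpendicular to the direction being sought.

```python
import math

def first_principal_component(points, iterations=100):
    """Return (direction, explained_ratio) for the top principal component."""
    n, dim = len(points), len(points[0])
    # 1. Center the data so every coordinate has mean zero.
    means = [sum(p[d] for p in points) / n for d in range(dim)]
    centered = [[p[d] - means[d] for d in range(dim)] for p in points]
    # 2. Sample covariance matrix (dim x dim).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # 3. Power iteration: multiply by cov and renormalize repeatedly;
    #    the vector converges to the direction of largest variance.
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured along v, relative to the total variance (trace of cov).
    captured = sum(v[i] * cov[i][j] * v[j] for i in range(dim) for j in range(dim))
    total = sum(cov[i][i] for i in range(dim))
    return v, captured / total

# Toy 3D points (made up) that lie almost on a straight line.
points = [(1, 1, 0.1), (2, 2, -0.1), (3, 3, 0.1), (4, 4, -0.1), (5, 5, 0.0)]
direction, ratio = first_principal_component(points)

# Project each centered point onto the component: one number per point.
mean = [sum(p[d] for p in points) / len(points) for d in range(3)]
shadow = [sum((p[d] - mean[d]) * direction[d] for d in range(3)) for p in points]
print([round(x, 2) for x in direction], round(ratio, 3))
print([round(s, 2) for s in shadow])
```

Here three coordinates per point are compressed to one while almost all of the variation is kept, because the points were nearly on a line to begin with. Real sentence embeddings are far less tidy, so two components usually capture only part of the variance.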




In [None]:
# 📈 Visualizing How AI "Sees" Your Languages
#
# This creates a 2D map showing how the model groups your sentences.
# Sentences with similar meanings should appear close together.
# This gives a first impression of how the model represents your language.

from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

if 'embeddings' in locals():
    print(" Creating language visualization...")

    # Reduce the high-dimensional embeddings to 2D for plotting
    pca = PCA(n_components=2, random_state=42)
    embeddings_2d = pca.fit_transform(embeddings)
    var_1, var_2 = pca.explained_variance_ratio_  # share of variance kept per axis

    # Create the visualization
    plt.figure(figsize=(10, 7))
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']

    for i, lang in enumerate(sorted(set(test_labels))):
        # Plot sentences from each language in different colors
        lang_indices = [j for j, label in enumerate(test_labels) if label == lang]
        plt.scatter(embeddings_2d[lang_indices, 0], embeddings_2d[lang_indices, 1],
                   label=f'{lang} ({len(lang_indices)} sentences)',
                   color=colors[i % len(colors)], s=100, alpha=0.7)

    # Plot text is kept emoji-free: Matplotlib's default font has no emoji glyphs
    plt.title('How the model "sees" your languages\n(Each dot = one sentence)', fontsize=14)
    plt.xlabel(f'Principal component 1 ({var_1:.0%} of variance)', fontsize=12)
    plt.ylabel(f'Principal component 2 ({var_2:.0%} of variance)', fontsize=12)
    plt.legend(fontsize=11)
    plt.grid(True, alpha=0.3)
    plt.figtext(0.5, 0.02, 'Closer dots = more similar meanings to the model',
                ha='center', fontsize=10, style='italic')
    plt.tight_layout(rect=(0, 0.04, 1, 1))  # leave room for the caption
    plt.show()

    print("🎉 Visualization complete!")
    print("🔍 Questions you can ask yourself:")
    print("  • Do sentences from the same language cluster together?")
    print("  • Do semantically similar sentences appear in similar regions?")
    print("  • Any surprising patterns?")

else:
    print("⚠️ Skipping visualization - no embeddings found (model test skipped or failed above)")
    print("💡 This is okay - you can still continue with the course!")

## 🆘 Troubleshooting

**🔧 Some common issues and suggested actions:**
- **Installation errors:**
  - Restart the runtime and re-run the installation cell  
  - Check that you are using a recent Python version
- **Model download failures:**
  - Often caused by transient network problems  
  - Try again later or skip the optional model test
- **Out of memory errors:**
  - Switch to a smaller model or use CPU only  
  - In Colab, verify that you have not opened multiple notebooks in the same session
- **Encoding or Unicode problems:**
  - Ensure files are saved with UTF-8 encoding  
  - Avoid mixing encodings in the same project
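
Encoding problems are easy to recognise once you have seen one. The sketch below deliberately reproduces the most common mistake, UTF-8 bytes read back with a single-byte codec, and then reverses it. The sample sentence is only an illustration, and the repair works only when no bytes were lost along the way.

```python
original = "Dëse Saz enthält eng Zuel."

# Simulate the mistake: UTF-8 bytes decoded with the wrong codec.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# If nothing else was lost, reversing the two steps restores the text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

If your sentences show sequences such as `Ã«` or `Ã¤` where accented letters should be, check the `encoding` argument wherever the file is opened.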

---

## 🎯 Mission Complete! What's Next?

### ✅ Before you move to Session 1:
It is useful to confirm that you have:
- [ ] At least one target language selected  
- [ ] A set of 5–10 sentences in that language, saved in this notebook  
- [ ] An evaluation sheet CSV file saved under `session0_outputs/`  
- [ ] A rough idea of what might be challenging for your language  
      (for example: morphology, spelling variation, code-switching)

### 🤔 Reflection and next steps

You can note down short answers to the following questions.

1. Which phenomena in your language do you expect LLMs to handle poorly?  
2. Where do you anticipate prompt following to break down first?  
3. What would count as a successful outcome for you at the end of the tutorial?

---

## 🎉 Congratulations!

You've successfully completed Session 0!

**🚀 Ready for Session 1?** Open `Session1_Dialogue_Summarization_Low_Resource.ipynb` and let the adventure continue!