<a href="https://colab.research.google.com/github/ngusadeep/Qwen3-Swahili-Cognition/blob/main/finetuning-notebooks/Qwen3_14B_Swahili_Text.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Fine-tuning Qwen3 LLM for Enhanced Swahili Cognitive Capabilities**
## Pushing the frontier of Swahili language understanding across Text, Speech, and Vision

This notebook fine-tunes Qwen3-14B, a model from the Qwen3 LLM suite, to enhance its Swahili cognitive capabilities for Swahili speakers across African countries and globally. We use the **Nadhari/Swahili-Thinking** dataset to improve the model's reasoning and conversational abilities in Swahili.

## Framework: Unsloth for Efficient Fine-tuning
<div class="align-center">
<a href="https://unsloth.ai/"><img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="115"></a>
</div>



### Installation

In [None]:
%%capture

import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    import torch; v = re.match(r"[0-9]{1,}\.[0-9]{1,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.33.post1" if v=="2.9" else "0.0.32.post2" if v=="2.8" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.56.2
!pip install --no-deps trl==0.22.2

### Load Qwen3 Model with Unsloth

We'll load the Qwen3-14B model using Unsloth's FastLanguageModel for efficient fine-tuning. The model will be loaded in 4-bit quantization to optimize memory usage in Google Colab.

In [None]:
from unsloth import FastLanguageModel
import torch

fourbit_models = [
    "unsloth/Qwen3-1.7B-unsloth-bnb-4bit", # Qwen3 1.7B (pre-quantized 4-bit)
    "unsloth/Qwen3-4B-unsloth-bnb-4bit",
    "unsloth/Qwen3-8B-unsloth-bnb-4bit",
    "unsloth/Qwen3-14B-unsloth-bnb-4bit",
    "unsloth/Qwen3-32B-unsloth-bnb-4bit",

    # 4bit dynamic quants for superior accuracy and low memory use
    "unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    "unsloth/Phi-4",
    "unsloth/Llama-3.1-8B",
    "unsloth/Llama-3.2-3B",
    "unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit" # [NEW] We support TTS models!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-14B",
    max_seq_length = 2048,   # Context length - can be longer, but uses more memory
    load_in_4bit = True,     # 4bit uses much less memory
    load_in_8bit = False,    # A bit more accurate, uses 2x memory
    full_finetuning = False, # We have full finetuning now!
    # token = "hf_...",      # use one if using gated models
)
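
To see why 4-bit loading matters on a Colab GPU, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The parameter count is an assumed round figure for a "14B" model, and `weight_memory_gib` is a hypothetical helper, not part of Unsloth. The estimate ignores quantization metadata, any layers kept at higher precision, activations, optimizer state, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3

NUM_PARAMS = 14e9  # assumption: round figure for a "14B" model

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the 16-bit footprint, and 8-bit takes about twice the 4-bit footprint, which matches the `load_in_8bit` comment above.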

### Add LoRA Adapters for Efficient Fine-tuning

We now add LoRA (Low-Rank Adaptation) adapters so we only need to update 1 to 10% of all parameters! This makes fine-tuning much more memory-efficient while maintaining model performance.

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,           # Choose any number > 0! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 32,  # Best to choose alpha = rank or rank*2
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,   # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)
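
As a rough illustration of why LoRA trains so few parameters: an adapter of rank `r` on a weight matrix of shape `d_out x d_in` adds two small matrices with `r * (d_in + d_out)` parameters in total. The sketch below counts them for one transformer layer. The layer dimensions are hypothetical round numbers chosen for illustration, not the actual Qwen3-14B shapes, and `lora_param_count` is an illustrative helper.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

# Hypothetical layer shapes (NOT the real Qwen3-14B dimensions).
HIDDEN = 4096
INTERMEDIATE = 11008
RANK = 32  # same rank as in the cell above

# (d_in, d_out) for each targeted projection in one layer.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer_lora = sum(lora_param_count(d_in, d_out, RANK) for d_in, d_out in shapes.values())
per_layer_full = sum(d_in * d_out for d_in, d_out in shapes.values())

print(f"LoRA params per layer: {per_layer_lora:,}")
print(f"Frozen params per layer: {per_layer_full:,}")
print(f"Trainable fraction: {per_layer_lora / per_layer_full:.2%}")
```

With these illustrative shapes the trainable fraction comes out a little above 1%, at the low end of the 1 to 10% range mentioned above. Raising `r` increases it proportionally.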

<a name="Test"></a>
### Test Model with Swahili Prompt (Before Fine-tuning)

Let's first test the base model's Swahili capabilities before fine-tuning to establish a baseline.

In [None]:
# Test the base model with a Swahili prompt
# (English: "Briefly explain the history of Kiswahili and its importance in East Africa.")
messages = [
    {"role" : "user", "content" : "Eleza kwa ufupi kuhusu historia ya Kiswahili na umuhimu wake katika Afrika Mashariki."}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True,
    enable_thinking = False,
)

from transformers import TextStreamer
print("=== Base Model Response (Before Fine-tuning) ===")
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 256,
    temperature = 0.7, top_p = 0.8, top_k = 20,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)
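
For intuition about what `apply_chat_template` produces, the sketch below renders messages in a simplified ChatML-style layout, the general format Qwen chat models use. This is only an approximation for illustration: `render_chatml` is a hypothetical helper, and the real template shipped with the tokenizer also handles system prompts, tools, and thinking blocks. Always use `tokenizer.apply_chat_template` for actual training and inference.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a simplified ChatML-style layout (illustrative only)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

demo = [{"role": "user", "content": "Habari"}]
print(render_chatml(demo))
```

The trailing open assistant turn is what `add_generation_prompt = True` contributes: without it, the model has no cue that it should start answering.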

<a name="Data"></a>
### Data Preparation for Swahili Fine-tuning

We'll use the **Nadhari/Swahili-Thinking** datasets which contain Swahili reasoning and conversational data. Qwen3 supports both reasoning and non-reasoning modes, so we'll prepare the datasets accordingly:

1. **Swahili-Thinking dataset**: Contains Swahili reasoning traces and problem-solving examples
2. **Swahili conversational data**: Contains multi-turn Swahili conversations in ShareGPT format

Let's load the Swahili datasets:

In [None]:
from datasets import load_dataset

# Load Swahili-Thinking dataset for reasoning capabilities
try:
    reasoning_dataset = load_dataset("Nadhari/Swahili-Thinking", split = "train")
    print("✓ Loaded Nadhari/Swahili-Thinking dataset")
except Exception:
    # Fallback if dataset structure is different
    reasoning_dataset = load_dataset("Nadhari/Swahili-Thinking")
    if isinstance(reasoning_dataset, dict):
        reasoning_dataset = reasoning_dataset.get("train", list(reasoning_dataset.values())[0])
    print("✓ Loaded Nadhari/Swahili-Thinking dataset (alternative format)")

# Check if there's a conversational split or use the same dataset
try:
    non_reasoning_dataset = load_dataset("Nadhari/Swahili-Thinking", split = "chat")
    print("✓ Loaded conversational split")
except Exception:
    # If no separate chat split, we'll use a subset of the main dataset
    non_reasoning_dataset = None
    print("ℹ Using same dataset for both reasoning and conversational data")

Let's examine the structure of the Swahili datasets:


In [None]:
print("Reasoning Dataset Structure:")
print(reasoning_dataset)
print(f"\nNumber of examples: {len(reasoning_dataset)}")
if len(reasoning_dataset) > 0:
    print("\nFirst example keys:", reasoning_dataset[0].keys())
    print("\nFirst example preview:")
    for key, value in list(reasoning_dataset[0].items())[:3]:
        print(f"  {key}: {str(value)[:200]}...")

### Convert Swahili Datasets to Conversational Format

We need to convert the Swahili-Thinking dataset into a conversational format that Qwen3 can understand. The exact field names may vary, so we'll handle different dataset structures:

In [None]:
def generate_conversation(examples):
    """Convert Swahili dataset examples to conversational format"""
    conversations = []

    # Get the batch size - determine from the first key that has a list
    batch_size = None
    for key in examples.keys():
        if isinstance(examples[key], list):
            batch_size = len(examples[key])
            break

    if batch_size is None:
        # Single example case
        batch_size = 1

    # Try different possible field names for Swahili-Thinking dataset
    # Common patterns: problem/solution, question/answer, instruction/response, etc.
    for i in range(batch_size):
        # Extract single example from batch
        example = {}
        for key in examples.keys():
            if isinstance(examples[key], list):
                example[key] = examples[key][i] if i < len(examples[key]) else None
            else:
                example[key] = examples[key]

        # Try to find question/problem/instruction and answer/solution/response
        user_content = None
        assistant_content = None

        # Check common field name patterns
        for user_key in ["problem", "question", "instruction", "input", "prompt", "user", "query"]:
            if user_key in example and example[user_key] is not None:
                user_content = example[user_key]
                break

        for assistant_key in ["generated_solution", "solution", "answer", "response", "output", "assistant", "completion"]:
            if assistant_key in example and example[assistant_key] is not None:
                assistant_content = example[assistant_key]
                break

        # If we have conversations already formatted
        if "conversations" in example and example["conversations"] is not None:
            conversations.append(example["conversations"])
        elif user_content and assistant_content:
            conversations.append([
                {"role" : "user", "content" : str(user_content)},
                {"role" : "assistant", "content" : str(assistant_content)},
            ])
        elif user_content:
            # If we only have user content, create a minimal conversation
            conversations.append([
                {"role": "user", "content": str(user_content)},
                {"role": "assistant", "content": ""}  # Empty assistant response
            ])
        elif "text" in example and example["text"] is not None:
            # If it's already formatted text, we'll handle it separately
            conversations.append([{"role": "user", "content": str(example.get("text", ""))}])
        else:
            # Fallback: create a minimal conversation with available data
            # Use the first non-None value as content
            content = None
            for key, value in example.items():
                if value is not None and key not in ["id", "index"]:
                    content = str(value)
                    break
            if content:
                conversations.append([
                    {"role": "user", "content": content},
                    {"role": "assistant", "content": ""}
                ])
            else:
                # Last resort: empty conversation
                conversations.append([
                    {"role": "user", "content": ""},
                    {"role": "assistant", "content": ""}
                ])

    return {"conversations": conversations}

# Convert reasoning dataset
reasoning_mapped = reasoning_dataset.map(generate_conversation, batched = True, remove_columns=reasoning_dataset.column_names)

# Verify the column exists before accessing it
print(f"Columns in reasoning_mapped: {reasoning_mapped.column_names}")
print(f"Number of examples: {len(reasoning_mapped)}")

# Check if conversations column exists
if "conversations" in reasoning_mapped.column_names:
    reasoning_conversations = tokenizer.apply_chat_template(
        list(reasoning_mapped["conversations"]),
        tokenize = False,
    )
else:
    raise ValueError(f"Expected 'conversations' column but found: {reasoning_mapped.column_names}")
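
The field-name lookup at the heart of `generate_conversation` can be checked in isolation on toy rows before running it over the full dataset. The sketch below is a stricter, simplified variant: it returns `None` when either side of the pair is missing, whereas the function above emits an empty assistant turn in that case. The helper names and the toy rows are hypothetical.

```python
USER_KEYS = ["problem", "question", "instruction", "input", "prompt", "user", "query"]
ASSISTANT_KEYS = ["generated_solution", "solution", "answer", "response",
                  "output", "assistant", "completion"]

def first_present(example, keys):
    """Return the first non-None value among the candidate keys, else None."""
    for key in keys:
        if example.get(key) is not None:
            return example[key]
    return None

def to_conversation(example):
    """Build a user/assistant pair, or None if either side is missing."""
    user = first_present(example, USER_KEYS)
    assistant = first_present(example, ASSISTANT_KEYS)
    if user is None or assistant is None:
        return None
    return [
        {"role": "user", "content": str(user)},
        {"role": "assistant", "content": str(assistant)},
    ]

# Toy rows with different (hypothetical) schemas.
rows = [
    {"question": "2 + 2?", "answer": "4"},
    {"instruction": "Taja rangi moja.", "output": "Nyekundu"},
    {"question": "Row without an answer"},
]
for row in rows:
    print(to_conversation(row))
```

Skipping incomplete rows avoids training on empty assistant replies; whether that is preferable depends on how many rows in the real dataset lack an answer field.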

Let's preview the first transformed Swahili conversation:

In [None]:
reasoning_conversations[0]

### Prepare Non-Reasoning Swahili Conversational Data

Next, we'll prepare the non-reasoning conversational dataset. If we have a separate conversational split, we'll use Unsloth's `standardize_sharegpt` function to format it properly.

In [None]:
from unsloth.chat_templates import standardize_sharegpt

if non_reasoning_dataset is not None:
    # Standardize the conversational dataset
    dataset = standardize_sharegpt(non_reasoning_dataset)
    non_reasoning_conversations = tokenizer.apply_chat_template(
        list(dataset["conversations"]),
        tokenize = False,
    )
else:
    # If no separate conversational dataset, use a subset of reasoning data
    # or create empty list (we'll adjust the ratio accordingly)
    print("ℹ No separate conversational dataset found. Using reasoning dataset for both.")
    non_reasoning_conversations = []

Preview the first non-reasoning conversation (if available):

In [None]:
if len(non_reasoning_conversations) > 0:
    print("First non-reasoning conversation:")
    print(non_reasoning_conversations[0][:500] + "..." if len(non_reasoning_conversations[0]) > 500 else non_reasoning_conversations[0])
else:
    print("No separate non-reasoning conversations available.")


In [None]:
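# Guarded: the list is empty when no separate chat split was found
non_reasoning_conversations[0] if non_reasoning_conversations else None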

### Dataset Statistics

Let's check the size of our Swahili datasets:

In [None]:
print(len(reasoning_conversations))
print(len(non_reasoning_conversations))

### Combine Swahili Datasets with Optimal Ratio

We want to balance reasoning capabilities with conversational abilities for Swahili. Let's define a ratio that maintains strong reasoning while ensuring good conversational performance. We'll use 75% reasoning data and 25% conversational data:

In [None]:
chat_percentage = 0.25
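
To hit a chat share of `p` in the combined dataset while keeping every reasoning example, the number of chat examples `c` must satisfy `c / (r + c) = p`, which gives `c = r * p / (1 - p)`. Here is a minimal sketch of that arithmetic; `chat_target_size` is a hypothetical helper and the dataset size is an example figure, not the real one.

```python
def chat_target_size(n_reasoning: int, chat_percentage: float) -> int:
    """Chat examples needed so they make up `chat_percentage` of the combined set."""
    return round(n_reasoning * chat_percentage / (1 - chat_percentage))

n_reasoning = 300  # example figure only
n_chat = chat_target_size(n_reasoning, 0.25)
share = n_chat / (n_reasoning + n_chat)
print(f"{n_chat} chat examples -> {share:.0%} of the combined dataset")
```

Note that simply taking 25% of the reasoning count (75 here) would give a chat share of only 20%, because the denominator grows as chat examples are added.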

### Sample and Balance Swahili Datasets

We'll sample the datasets to achieve the desired ratio of reasoning to conversational data:

In [None]:
import pandas as pd

if len(non_reasoning_conversations) > 0:
    non_reasoning_subset = pd.Series(non_reasoning_conversations)
    target_size = int(len(reasoning_conversations) * (chat_percentage / (1 - chat_percentage)))

    if len(non_reasoning_subset) > target_size:
        non_reasoning_subset = non_reasoning_subset.sample(
            target_size,
            random_state = 2407,
        )
    else:
        print(f"ℹ Conversational dataset smaller than target. Using all {len(non_reasoning_subset)} examples.")
else:
    # If no conversational data, use a subset of reasoning data
    print("ℹ No separate conversational data. Using subset of reasoning data for diversity.")
    non_reasoning_subset = pd.Series(reasoning_conversations).sample(
        min(int(len(reasoning_conversations) * chat_percentage), len(reasoning_conversations)),
        random_state = 2407,
    )

print(f"Reasoning conversations: {len(reasoning_conversations)}")
print(f"Conversational subset: {len(non_reasoning_subset)}")
if len(non_reasoning_subset) + len(reasoning_conversations) > 0:
    print(f"Conversational ratio: {len(non_reasoning_subset) / (len(non_reasoning_subset) + len(reasoning_conversations)):.2%}")

### Final Dataset Combination

Now let's combine both Swahili datasets and shuffle them for training:

In [None]:
data = pd.concat([
    pd.Series(reasoning_conversations),
    pd.Series(non_reasoning_subset)
])
data.name = "text"

from datasets import Dataset
combined_dataset = Dataset.from_pandas(pd.DataFrame(data))
combined_dataset = combined_dataset.shuffle(seed = 3407)

<a name="Train"></a>
### Fine-tune Qwen3 for Swahili

Now let's train our model on the Swahili datasets. We'll do 30 steps initially as a quick test. For a full training run, set `num_train_epochs=1` and remove `max_steps` (or set it to `-1`, the default, which disables the step limit).

In [None]:
from trl import SFTTrainer, SFTConfig
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = combined_dataset,
    eval_dataset = None, # Can set up evaluation!
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4, # Use GA to mimic batch size!
        warmup_steps = 5,
        # num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 30,
        learning_rate = 2e-4, # Reduce to 2e-5 for long training runs
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.001,
        lr_scheduler_type = "linear",
        seed = 3407,
        report_to = "none", # Use TrackIO/WandB etc
    ),
)
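
A quick sketch of what these settings mean in terms of data seen. With gradient accumulation, each optimizer step consumes `per_device_train_batch_size * gradient_accumulation_steps` sequences per device. The single-GPU assumption below is ours (it matches a typical Colab runtime).

```python
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 30
num_devices = 1  # assumption: a single GPU

# Sequences contributing to each optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
# Total sequences processed over the whole 30-step test run.
sequences_seen = effective_batch_size * max_steps

print(f"Effective batch size: {effective_batch_size}")
print(f"Sequences seen in {max_steps} steps: {sequences_seen}")
```

So the 30-step test run touches only a few hundred examples. Compare that with `len(combined_dataset)` to judge how far it is from a full epoch.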

In [None]:
# @title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

### Start Swahili Fine-tuning

Let's begin training the model on Swahili data! To resume a training run, call `trainer.train(resume_from_checkpoint = True)`.

In [None]:
trainer_stats = trainer.train()

In [None]:
# @title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory / max_memory * 100, 3)
lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(
    f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training."
)
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

<a name="Inference"></a>
### Test Fine-tuned Model with Swahili Prompts

Let's test our fine-tuned model with Swahili prompts! According to the Qwen3 team, the recommended settings for reasoning inference are `temperature = 0.6, top_p = 0.95, top_k = 20`. For normal chat-based inference, use `temperature = 0.7, top_p = 0.8, top_k = 20`.
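
To avoid retyping these values, one option is to keep the two presets in a small lookup keyed by whether thinking is enabled. This is a convenience sketch only: `SAMPLING_PRESETS` and `sampling_kwargs` are hypothetical names, and the numbers are the ones quoted above.

```python
SAMPLING_PRESETS = {
    True:  {"temperature": 0.6, "top_p": 0.95, "top_k": 20},  # thinking mode
    False: {"temperature": 0.7, "top_p": 0.8,  "top_k": 20},  # normal chat
}

def sampling_kwargs(enable_thinking: bool) -> dict:
    """Return a fresh copy of the sampling settings for the chosen mode."""
    return dict(SAMPLING_PRESETS[enable_thinking])

print(sampling_kwargs(enable_thinking=True))
print(sampling_kwargs(enable_thinking=False))
```

The returned dict can be unpacked into a generate call, for example `model.generate(**inputs, **sampling_kwargs(False), max_new_tokens = 256)`, so the `enable_thinking` flag passed to the chat template and the sampling settings stay in sync.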

In [None]:
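# Test with Swahili conversational prompt (non-thinking mode)
# (English: "Briefly explain the importance of the Swahili language in East Africa
#  and how it is used in everyday life.")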
messages = [
    {"role" : "user", "content" : "Eleza kwa ufupi kuhusu umuhimu wa lugha ya Kiswahili katika Afrika Mashariki na jinsi inavyotumika katika maisha ya kila siku."}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, # Must add for generation
    enable_thinking = False, # Disable thinking
)

from transformers import TextStreamer
print("=== Fine-tuned Model Response (Swahili - Non-Thinking Mode) ===")
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 256, # Increase for longer outputs!
    temperature = 0.7, top_p = 0.8, top_k = 20, # For non thinking
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)

In [None]:
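# Test with Swahili reasoning prompt (thinking mode enabled)
# (English: "If you have 5000 shillings and want to buy two books, one costing 2000
#  shillings and the other 3500 shillings, how much money will you have left?")
# Note: the books total 5500 shillings, more than the 5000 available, so a correct
# answer should report a 500-shilling shortfall rather than a remainder.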
messages = [
    {"role" : "user", "content" : "Kama una shilingi 5000 na unataka kununua vitabu viwili: moja ni shilingi 2000 na nyingine ni shilingi 3500. Je, utabaki na pesa ngapi?"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True, # Must add for generation
    enable_thinking = True, # Enable thinking for reasoning
)

from transformers import TextStreamer
print("\n=== Fine-tuned Model Response (Swahili - Thinking Mode) ===")
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 1024, # Increase for longer outputs!
    temperature = 0.6, top_p = 0.95, top_k = 20, # For thinking
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)
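
In thinking mode the response contains a reasoning trace followed by the final answer. If you collect the generated text as a string instead of streaming it, a small parser can separate the two. The sketch below assumes the decoded text still contains literal `<think>` and `</think>` tags; `split_thinking` is a hypothetical helper and the sample string is made up for illustration.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thinking(decoded: str):
    """Split decoded output into (reasoning, answer).

    If no complete <think>...</think> block is found (for example when
    generation stopped before the closing tag), the whole text is
    returned as the answer with an empty reasoning string.
    """
    match = THINK_RE.search(decoded)
    if match is None:
        return "", decoded.strip()
    reasoning = match.group(1).strip()
    answer = decoded[match.end():].strip()
    return reasoning, answer

sample = "<think>2000 + 3500 = 5500; 5000 - 5500 = -500</think>\n\nFinal answer goes here."
reasoning, answer = split_thinking(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If the reasoning comes back empty on long problems, the trace was probably cut off; increasing `max_new_tokens` is the first thing to try.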

<a name="Save"></a>
### Save Fine-tuned Swahili Model

To save the final fine-tuned Swahili model as LoRA adapters, either use Hugging Face's `push_to_hub` for an online save or `save_pretrained` for a local save.

**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!

In [None]:
model.save_pretrained("lora_model")  # Local saving
tokenizer.save_pretrained("lora_model")
# model.push_to_hub("your_name/lora_model", token = "...") # Online saving
# tokenizer.push_to_hub("your_name/lora_model", token = "...") # Online saving


### Load Saved Swahili Model

To load the LoRA adapters we just saved for inference, change `False` to `True`:

In [None]:
if False:
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "lora_model", # YOUR MODEL YOU USED FOR TRAINING
        max_seq_length = 2048,
        load_in_4bit = True,
    )

### Save Swahili Model in Different Formats

We also support saving to `float16` directly for vLLM deployment. Select `merged_16bit` for float16 or `merged_4bit` for int4; saving only the `lora` adapters is available as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account. You can create a personal access token at https://huggingface.co/settings/tokens.

In [None]:
# Merge to 16bit
if False:
    model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit",)
if False: # Pushing to HF Hub
    model.push_to_hub_merged("hf/model", tokenizer, save_method = "merged_16bit", token = "")

# Merge to 4bit
if False:
    model.save_pretrained_merged("model", tokenizer, save_method = "merged_4bit",)
if False: # Pushing to HF Hub
    model.push_to_hub_merged("hf/model", tokenizer, save_method = "merged_4bit", token = "")

# Just LoRA adapters
if False:
    model.save_pretrained("model")
    tokenizer.save_pretrained("model")
if False: # Pushing to HF Hub
    model.push_to_hub("hf/model", token = "")
    tokenizer.push_to_hub("hf/model", token = "")


### Export Swahili Model to GGUF / llama.cpp Format

Saving your fine-tuned Swahili model to `GGUF` / `llama.cpp` format is supported natively. We clone `llama.cpp` and save to `q8_0` by default; other methods such as `q4_k_m` are also available. Use `save_pretrained_gguf` for local saving and `push_to_hub_gguf` for uploading to HF.

Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):
* `q8_0` - Fast conversion. High resource use, but generally acceptable.
* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.
* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.

[**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)

In [None]:
# Save to 8bit Q8_0
if False:
    model.save_pretrained_gguf("model", tokenizer,)
# Remember to go to https://huggingface.co/settings/tokens for a token!
# And change hf to your username!
if False:
    model.push_to_hub_gguf("hf/model", tokenizer, token = "")

# Save to 16bit GGUF
if False:
    model.save_pretrained_gguf("model", tokenizer, quantization_method = "f16")
if False: # Pushing to HF Hub
    model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "")

# Save to q4_k_m GGUF
if False:
    model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
if False: # Pushing to HF Hub
    model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "q4_k_m", token = "")

# Save to multiple GGUF options - much faster if you want multiple!
if False:
    model.push_to_hub_gguf(
        "hf/model", # Change hf to your username!
        tokenizer,
        quantization_method = ["q4_k_m", "q8_0", "q5_k_m",],
        token = "", # Get a token at https://huggingface.co/settings/tokens
    )

Now, use the `model-unsloth.gguf` file or `model-unsloth-Q4_K_M.gguf` file in llama.cpp.

## Summary

We've fine-tuned the Qwen3-14B model on the Nadhari/Swahili-Thinking dataset using LoRA adapters. With a longer run than the 30-step test configured above, the fine-tuning aims to improve:
- **Text understanding** in Swahili
- **Reasoning capabilities** for Swahili problem-solving
- **Conversational abilities** in Swahili

The fine-tuned model can be deployed for various Swahili language applications across African countries and globally.

---

## Resources and Support

If you have any questions about Unsloth, find a bug, want to keep up with the latest LLM developments, or are looking for help or projects to join, feel free to join our [Discord](https://discord.gg/unsloth)!

Some other useful links:
1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)
2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)
3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)
4. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!

<div class="align-center">
  <a href="https://unsloth.ai"><img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="115"></a>
  <a href="https://discord.gg/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/Discord.png" width="145"></a>
  <a href="https://docs.unsloth.ai/"><img src="https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true" width="125"></a>

  Join Discord if you need help + ⭐️ <i>Star us on <a href="https://github.com/unslothai/unsloth">Github</a> </i> ⭐️
</div>

  This notebook and all Unsloth notebooks are licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme).
