To run this, press "*Runtime*" and press "*Run all*" on a **free** Tesla T4 Google Colab instance!
<div class="align-center">
<a href="https://unsloth.ai/"><img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="115"></a>
<a href="https://discord.gg/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/Discord button.png" width="145"></a>
<a href="https://docs.unsloth.ai/"><img src="https://github.com/unslothai/unsloth/blob/main/images/documentation%20green%20button.png?raw=true" width="125"></a> Join Discord if you need help + ⭐ <i>Star us on <a href="https://github.com/unslothai/unsloth">GitHub</a> </i> ⭐
</div>

To install Unsloth on your local device, follow [our guide](https://docs.unsloth.ai/get-started/install-and-update). This notebook is licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme).

You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), and [how to save it](#Save).


### News


Introducing FP8 precision training for faster RL inference. [Read Blog](https://docs.unsloth.ai/new/fp8-reinforcement-learning).

Unsloth's [Docker image](https://hub.docker.com/r/unsloth/unsloth) is here! Start training with no setup & environment issues. [Read our Guide](https://docs.unsloth.ai/new/how-to-train-llms-with-unsloth-and-docker).

[gpt-oss RL](https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning) is now supported with the fastest inference & lowest VRAM. Try our [new notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-GRPO.ipynb) which creates kernels!

Introducing [Vision](https://docs.unsloth.ai/new/vision-reinforcement-learning-vlm-rl) and [Standby](https://docs.unsloth.ai/basics/memory-efficient-rl) for RL! Train Qwen, Gemma etc. VLMs with GSPO - even faster with less VRAM.

Visit our docs for all our [model uploads](https://docs.unsloth.ai/get-started/all-our-models) and [notebooks](https://docs.unsloth.ai/get-started/unsloth-notebooks).


### Installation

In [1]:
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    import torch; v = re.match(r"[0-9]{1,}\.[0-9]{1,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.33.post1" if v=="2.9" else "0.0.32.post2" if v=="2.8" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
#!pip install transformers==4.57.3
#!pip install --no-deps trl==0.22.2
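
The version check in the install cell can be read as a small lookup from the installed torch `major.minor` version to an xformers pin. Below is a minimal sketch of that logic as a standalone function. The name `pick_xformers_pin` is hypothetical and only for illustration; the pins themselves are copied from the cell above.

```python
import re

# Pins copied from the install cell; any other torch version uses the fallback.
XFORMERS_PINS = {"2.9": "0.0.33.post1", "2.8": "0.0.32.post2"}
FALLBACK_PIN = "0.0.29.post3"

def pick_xformers_pin(torch_version: str) -> str:
    """Hypothetical helper: map a torch version string to an xformers requirement."""
    # Keep only "major.minor", e.g. "2.9.0+cu126" -> "2.9"
    major_minor = re.match(r"[0-9]+\.[0-9]+", torch_version).group(0)
    return "xformers==" + XFORMERS_PINS.get(major_minor, FALLBACK_PIN)

print(pick_xformers_pin("2.9.0+cu126"))
print(pick_xformers_pin("2.6.0+cu124"))
```

Any torch version that is not 2.8 or 2.9 ends up on the oldest pin, which is the same behaviour as the chained conditional in the cell.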

### Unsloth

In [3]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 4096 # Can choose any sequence length!
fourbit_models = [
    # 4bit Gemma 3 dynamic quants for superior accuracy and low memory use
    "unsloth/gemma-3-270m-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-1b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-4b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-12b-it-unsloth-bnb-4bit",
    "unsloth/gemma-3-27b-it-unsloth-bnb-4bit",
    # Function Gemma models
    "unsloth/functiongemma-270m-it",
    "unsloth/functiongemma-270m-it-unsloth-bnb-4bit",
    "unsloth/functiongemma-270m-it-bnb-4bit",
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    # FunctionGemma is required here: the data prep below relies on its chat template
    model_name = "unsloth/functiongemma-270m-it",
    max_seq_length = max_seq_length, # Choose any for long context!
    load_in_4bit = False,  # 4 bit quantization to reduce memory
    load_in_8bit = False, # [NEW!] A bit more accurate, uses 2x memory
    load_in_16bit = True, # [NEW!] Enables 16bit LoRA
    full_finetuning = False, # [NEW!] We have full finetuning now!
    # token = "hf_...", # use one if using gated models
)


Please restructure your imports with 'import unsloth' at the top of your file.
  from unsloth import FastLanguageModel


🦥 Unsloth Zoo will now patch everything to make training faster!


NameError: name 'SamplingParams' is not defined

We now add LoRA adapters so we only need to update 1 to 10% of all parameters!

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

Unsloth: Making `model.base_model.model.model` require gradients
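
To see where the "1 to 10%" figure comes from, here is a rough sketch of the LoRA parameter count. Each adapted linear layer with `in_features` inputs and `out_features` outputs gains two small matrices, adding `r * (in_features + out_features)` trainable weights. The layer dimensions below are an assumption about the Gemma 3 270M architecture and are not stated anywhere in this notebook; with them, the total matches the `Trainable parameters` line in the training log further down.

```python
# ASSUMED Gemma 3 270M dimensions (not stated in this notebook).
HIDDEN = 640
NUM_LAYERS = 18
NUM_HEADS, NUM_KV_HEADS, HEAD_DIM = 4, 1, 256
INTERMEDIATE = 2048
RANK = 16  # r passed to get_peft_model above

# (in_features, out_features) for every module listed in target_modules
target_shapes = {
    "q_proj": (HIDDEN, NUM_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_HEADS * HEAD_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA adds A (rank x in_features) and B (out_features x rank)
    return rank * (in_features + out_features)

per_layer = sum(lora_params(i, o, RANK) for i, o in target_shapes.values())
trainable = per_layer * NUM_LAYERS
total_reported = 271_895_168  # total taken from the training log in this notebook

print(f"{trainable:,} trainable ({100 * trainable / total_reported:.2f}% of total)")
```

Doubling `r` doubles the adapter size, which is why the suggested ranks trade memory for capacity.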


<a name="Data"></a>
### Data Prep
We now use Google's Mobile Actions dataset, which contains a list of tools a phone could call, such as creating a calendar invite or setting an alarm.

In [None]:
from datasets import load_dataset
dataset = load_dataset("google/mobile-actions", split = "train")

Let's take a look at the first row for messages and the tools present.

We see the person is asking `Please set a reminder for a "Team Sync Meeting" this Friday, June 6th, 2025, at 2 PM.`, so we expect the calendar or reminder tool to be used.

In [None]:
dataset[0]["messages"]

[{'role': 'developer',
  'content': 'Current date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-06-04T15:29:23\nDay of week is Wednesday\nYou are a model that can do function calling with the following functions\n',
  'tool_calls': None},
 {'role': 'user',
  'content': 'Please set a reminder for a "Team Sync Meeting" this Friday, June 6th, 2025, at 2 PM.',
  'tool_calls': None},
 {'role': 'assistant',
  'content': None,
  'tool_calls': [{'function': {'name': 'create_calendar_event',
     'arguments': {'datetime': datetime.datetime(2025, 6, 6, 14, 0),
      'title': 'Team Sync Meeting',
      'email': None,
      'last_name': None,
      'first_name': None,
      'phone_number': None,
      'to': None,
      'body': None,
      'subject': None,
      'query': None}}}]}]
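
Most argument slots in a row are `None` because every row shares one schema across all tools. As a small illustration, the sketch below strips `None` values recursively so that only the arguments actually set remain. The helper name `drop_none` is hypothetical, and the sample dict is copied from the output above.

```python
import datetime

def drop_none(value):
    """Hypothetical helper: recursively remove None-valued keys from dicts."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none(v) for v in value]
    return value

# The tool call from dataset[0]["messages"][2], copied from the output above
tool_call = {
    "function": {
        "name": "create_calendar_event",
        "arguments": {
            "datetime": datetime.datetime(2025, 6, 6, 14, 0),
            "title": "Team Sync Meeting",
            "email": None, "last_name": None, "first_name": None,
            "phone_number": None, "to": None, "body": None,
            "subject": None, "query": None,
        },
    }
}

compact = drop_none(tool_call)
print(compact["function"]["name"], sorted(compact["function"]["arguments"]))
```

Only `datetime` and `title` survive, which is what we would expect the finetuned model to emit for this request.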

In [None]:
dataset[0]["tools"]

[{'function': {'name': 'turn_off_flashlight',
   'description': 'Turns the flashlight off.',
   'parameters': {'type': 'OBJECT',
    'properties': {'subject': None,
     'body': None,
     'to': None,
     'datetime': None,
     'title': None,
     'query': None,
     'phone_number': None,
     'last_name': None,
     'first_name': None,
     'email': None},
    'required': None}}},
 {'function': {'name': 'open_wifi_settings',
   'description': 'Opens the Wi-Fi settings.',
   'parameters': {'type': 'OBJECT',
    'properties': {'subject': None,
     'body': None,
     'to': None,
     'datetime': None,
     'title': None,
     'query': None,
     'phone_number': None,
     'last_name': None,
     'first_name': None,
     'email': None},
    'required': None}}},
 {'function': {'name': 'create_calendar_event',
   'description': 'Creates a new calendar event.',
   'parameters': {'type': 'OBJECT',
    'properties': {'subject': None,
     'body': None,
     'to': None,
     'datetime': {'typ

We can then apply FunctionGemma's chat template:

In [None]:
tokenizer.apply_chat_template(
    dataset[0]["messages"],
    tools = dataset[0]["tools"],
    tokenize = False,
    add_generation_prompt = False,
)

'<bos><start_of_turn>developer\nCurrent date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-06-04T15:29:23\nDay of week is Wednesday\nYou are a model that can do function calling with the following functions<start_function_declaration>declaration:turn_off_flashlight{description:<escape>Turns the flashlight off.<escape>,parameters:{properties:{body:{description:<escape><escape>,type:<escape><escape>},datetime:{description:<escape><escape>,type:<escape><escape>},email:{description:<escape><escape>,type:<escape><escape>},first_name:{description:<escape><escape>,type:<escape><escape>},last_name:{description:<escape><escape>,type:<escape><escape>},phone_number:{description:<escape><escape>,type:<escape><escape>},query:{description:<escape><escape>,type:<escape><escape>},subject:{description:<escape><escape>,type:<escape><escape>},title:{description:<escape><escape>,type:<escape><escape>},to:{description:<escape><escape>,type:<escape><escape>}},type:<escape>OBJECT<escape>}}<end_function_

We then create a dataset of just these mapped sequences.

In [None]:
def process_dataset(row, tokenizer):
    text = tokenizer.apply_chat_template(
        row["messages"],
        tools = row["tools"],
        tokenize = False,
        add_generation_prompt = False,
    )
    return {"text" : text}
dataset = dataset.map(process_dataset, fn_kwargs = {"tokenizer" : tokenizer})

Let's look at the first row:

In [None]:
dataset[0]["text"]

'<bos><start_of_turn>developer\nCurrent date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-06-04T15:29:23\nDay of week is Wednesday\nYou are a model that can do function calling with the following functions<start_function_declaration>declaration:turn_off_flashlight{description:<escape>Turns the flashlight off.<escape>,parameters:{properties:{body:{description:<escape><escape>,type:<escape><escape>},datetime:{description:<escape><escape>,type:<escape><escape>},email:{description:<escape><escape>,type:<escape><escape>},first_name:{description:<escape><escape>,type:<escape><escape>},last_name:{description:<escape><escape>,type:<escape><escape>},phone_number:{description:<escape><escape>,type:<escape><escape>},query:{description:<escape><escape>,type:<escape><escape>},subject:{description:<escape><escape>,type:<escape><escape>},title:{description:<escape><escape>,type:<escape><escape>},to:{description:<escape><escape>,type:<escape><escape>}},type:<escape>OBJECT<escape>}}<end_function_

<a name="Train"></a>
### Train the model
Now let's train our model. We do 100 steps to speed things up, but for a full run you can set `num_train_epochs=1` and remove the `max_steps` setting.
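
To get a feel for how much of the data 100 steps cover, here is a quick back-of-the-envelope calculation. It uses the batch settings from the training cell below and the 9,604 training rows reported in the training log; the steps-per-epoch figure is only an approximation, since the trainer's exact rounding is not shown in this notebook.

```python
train_rows = 9604        # "Num examples" in the training log
per_device_batch = 4     # per_device_train_batch_size
grad_accum = 2           # gradient_accumulation_steps
num_gpus = 1
max_steps = 100

# Rows that contribute to a single optimizer step
effective_batch = per_device_batch * grad_accum * num_gpus
rows_seen = max_steps * effective_batch
fraction_of_epoch = rows_seen / train_rows
approx_steps_per_epoch = train_rows / effective_batch

print(f"effective batch size: {effective_batch}")
print(f"rows seen in {max_steps} steps: {rows_seen} ({fraction_of_epoch:.1%} of one epoch)")
print(f"approximate steps for one epoch: {approx_steps_per_epoch:.0f}")
```

So the short run sees under a tenth of the training split, and a full epoch would take roughly twelve times as many steps.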

In [None]:
# Hold out 50 rows for evaluation; the rest is used for training
split_dataset = dataset.train_test_split(test_size = 50, shuffle = True, seed = 3407)

from trl import SFTTrainer, SFTConfig
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = split_dataset["train"],
    eval_dataset = split_dataset["test"], # For evaluation!
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 4,
        gradient_accumulation_steps = 2, # Use GA to mimic batch size!
        warmup_steps = 5,
        # num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 100,
        eval_steps = 2,
        eval_strategy = "steps",
        learning_rate = 2e-4, # Reduce to 2e-5 for long training runs
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.001,
        lr_scheduler_type = "linear",
        seed = 3407,
        report_to = "none", # Use TrackIO/WandB etc
    ),
)

Unsloth: Switching to float32 training since model cannot work with float16


The model is already on multiple devices. Skipping the move to device specified in `args`.


🦥 Unsloth: Padding-free auto-enabled, enabling faster training.


We also mask all system and user instructions so that the loss is computed only on the model's responses. This forces the finetune to actually produce the correct function calls. Without masking, the finetune might just learn the format and nothing else.

In [None]:
from unsloth.chat_templates import train_on_responses_only
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<start_of_turn>user\n",
    response_part = "<start_of_turn>model\n",
)

We print the full un-masked string:

In [None]:
tokenizer.decode(trainer.train_dataset[100]["input_ids"])

'<bos><bos><start_of_turn>developer\nCurrent date and time given in YYYY-MM-DDTHH:MM:SS format: 2026-07-28T11:21:51\nDay of week is Tuesday\nYou are a model that can do function calling with the following functions<start_function_declaration>declaration:turn_on_flashlight{description:<escape>Turns the flashlight on.<escape>,parameters:{properties:{body:{description:<escape><escape>,type:<escape><escape>},datetime:{description:<escape><escape>,type:<escape><escape>},email:{description:<escape><escape>,type:<escape><escape>},first_name:{description:<escape><escape>,type:<escape><escape>},last_name:{description:<escape><escape>,type:<escape><escape>},phone_number:{description:<escape><escape>,type:<escape><escape>},query:{description:<escape><escape>,type:<escape><escape>},subject:{description:<escape><escape>,type:<escape><escape>},title:{description:<escape><escape>,type:<escape><escape>},to:{description:<escape><escape>,type:<escape><escape>}},type:<escape>OBJECT<escape>}}<end_function

Let's check that the masking is correct. Each "-" marks a masked token:

In [None]:
[tokenizer.decode([tokenizer.pad_token_id if x == -100 else x for x in trainer.train_dataset[100]["labels"]]).replace(tokenizer.pad_token, "-")]

['--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
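
The masking checked above can be sketched on plain token ID lists. This is a simplified illustration and not Unsloth's actual implementation of `train_on_responses_only`: labels start fully masked with `-100` (the value PyTorch's cross-entropy loss ignores by default), and only tokens after a response marker, up to the next instruction marker, keep their IDs. The token IDs and the helper names `find_all` and `mask_non_responses` are made up for the example.

```python
IGNORE_INDEX = -100  # ignored by PyTorch's cross-entropy loss by default

def find_all(seq, pattern):
    """Return every start index where `pattern` occurs in `seq`."""
    n = len(pattern)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == pattern]

def mask_non_responses(input_ids, instruction_ids, response_ids):
    """Keep labels only for tokens that follow a response marker."""
    labels = [IGNORE_INDEX] * len(input_ids)
    instruction_starts = find_all(input_ids, instruction_ids)
    for start in find_all(input_ids, response_ids):
        begin = start + len(response_ids)
        # The response runs until the next instruction marker, or to the end
        end = next((i for i in instruction_starts if i >= begin), len(input_ids))
        labels[begin:end] = input_ids[begin:end]
    return labels

# Toy IDs: [1, 10] stands in for "<start_of_turn>user\n",
#          [1, 20] stands in for "<start_of_turn>model\n"
USER, MODEL = [1, 10], [1, 20]
single_turn = [0, 1, 10, 5, 6, 1, 20, 7, 8, 9]
multi_turn = [1, 10, 5, 1, 20, 7, 1, 10, 6, 1, 20, 8]

print(mask_non_responses(single_turn, USER, MODEL))
print(mask_non_responses(multi_turn, USER, MODEL))
```

In the single-turn case everything before the response, including the marker itself, is masked, which mirrors the long run of "-" characters in the output above.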

In [None]:
# @title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

GPU = Tesla T4. Max memory = 14.741 GB.
0.832 GB of memory reserved.


Let's train the model! To resume a training run, call `trainer.train(resume_from_checkpoint = True)`.

In [None]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 9,604 | Num Epochs = 1 | Total steps = 100
O^O/ \_/ \    Batch size per device = 4 | Gradient accumulation steps = 2
\        /    Data Parallel GPUs = 1 | Total batch size (4 x 2 x 1) = 8
 "-____-"     Trainable parameters = 3,796,992 of 271,895,168 (1.40% trained)


Unsloth: Will smartly offload gradients to save VRAM!


| Step | Training Loss | Validation Loss |
|---|---|---|
| 2 | 4.2396 | 3.647664 |
| 4 | 2.3946 | 1.27669 |
| 6 | 0.7793 | 0.748503 |
| 8 | 0.7423 | 0.582926 |
| 10 | 0.5159 | 0.360648 |
| 12 | 0.2552 | 0.188659 |
| 14 | 0.147 | 0.118053 |
| 16 | 0.1034 | 0.09442 |
| 18 | 0.0938 | 0.077847 |
| 20 | 0.0867 | 0.065308 |


Unsloth: Not an error, but Gemma3ForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient


In [None]:
# @title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory / max_memory * 100, 3)
lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(
    f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training."
)
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

1004.1125 seconds used for training.
16.74 minutes used for training.
Peak reserved memory = 10.459 GB.
Peak reserved memory for training = 9.627 GB.
Peak reserved memory % of max memory = 70.952 %.
Peak reserved memory for training % of max memory = 65.308 %.


<a name="Inference"></a>
### Inference
Let's run the model via Unsloth native inference on the first message, which should create a calendar reminder.

In [None]:
dataset[0]["messages"][:2]

[{'role': 'developer',
  'content': 'Current date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-06-04T15:29:23\nDay of week is Wednesday\nYou are a model that can do function calling with the following functions\n',
  'tool_calls': None},
 {'role': 'user',
  'content': 'Please set a reminder for a "Team Sync Meeting" this Friday, June 6th, 2025, at 2 PM.',
  'tool_calls': None}]

Let's try inference:

In [None]:
text = tokenizer.apply_chat_template(
    dataset[0]["messages"][:2],
    tools = dataset[0]["tools"],
    tokenize = False,
    add_generation_prompt = True, # Must add for generation
).removeprefix('<bos>')

from transformers import TextStreamer
_ = model.generate(
    **tokenizer(text, return_tensors = "pt").to("cuda"),
    max_new_tokens = 1024,
    streamer = TextStreamer(tokenizer, skip_prompt = False),
    top_p = 0.95, top_k = 64, temperature = 1.0,
)

<bos><start_of_turn>developer
Current date and time given in YYYY-MM-DDTHH:MM:SS format: 2025-06-04T15:29:23
Day of week is Wednesday
You are a model that can do function calling with the following functions<start_function_declaration>declaration:turn_off_flashlight{description:<escape>Turns the flashlight off.<escape>,parameters:{properties:{body:{description:<escape><escape>,type:<escape><escape>},datetime:{description:<escape><escape>,type:<escape><escape>},email:{description:<escape><escape>,type:<escape><escape>},first_name:{description:<escape><escape>,type:<escape><escape>},last_name:{description:<escape><escape>,type:<escape><escape>},phone_number:{description:<escape><escape>,type:<escape><escape>},query:{description:<escape><escape>,type:<escape><escape>},subject:{description:<escape><escape>,type:<escape><escape>},title:{description:<escape><escape>,type:<escape><escape>},to:{description:<escape><escape>,type:<escape><escape>}},type:<escape>OBJECT<escape>}}<end_function_decl

It looks correct!
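
To act on a generated call in an app, the text has to be turned back into a function name and arguments. The sketch below does this with regular expressions. The call format is an assumption: the generated part of the output is cut off above, so the `<start_function_call>call:name{...}<end_function_call>` shape is inferred by analogy with the `<start_function_declaration>` blocks in the prompt, and the sample string is hand-written rather than real model output. The helper name `parse_function_calls` is hypothetical.

```python
import re

# ASSUMED call format, inferred from the declaration format in the prompt
CALL_PATTERN = re.compile(
    r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.DOTALL
)
# String arguments are wrapped as name:<escape>value<escape>
ARG_PATTERN = re.compile(r"(\w+):<escape>(.*?)<escape>", re.DOTALL)

def parse_function_calls(generated_text: str) -> list:
    """Hypothetical helper: extract function names and string arguments."""
    calls = []
    for name, raw_args in CALL_PATTERN.findall(generated_text):
        calls.append({"name": name, "arguments": dict(ARG_PATTERN.findall(raw_args))})
    return calls

# Hand-written sample, not actual model output
sample = (
    "<start_function_call>call:create_calendar_event{"
    "datetime:<escape>2025-06-06T14:00:00<escape>,"
    "title:<escape>Team Sync Meeting<escape>}<end_function_call>"
)
calls = parse_function_calls(sample)
print(calls)
```

This only handles string-valued arguments wrapped in `<escape>` markers; numbers, nested objects, or a different call syntax would need a fuller parser.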

<a name="Save"></a>
### Saving, loading finetuned models
To save the final model as LoRA adapters, either use Hugging Face's `push_to_hub` for an online save or `save_pretrained` for a local save.

**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!

In [None]:
model.save_pretrained("functiongemma")  # Local saving
tokenizer.save_pretrained("functiongemma")
# model.push_to_hub("your_name/functiongemma", token = "...") # Online saving
# tokenizer.push_to_hub("your_name/functiongemma", token = "...") # Online saving

('functiongemma/tokenizer_config.json',
 'functiongemma/special_tokens_map.json',
 'functiongemma/chat_template.jinja',
 'functiongemma/tokenizer.model',
 'functiongemma/added_tokens.json',
 'functiongemma/tokenizer.json')

Now if you want to load the LoRA adapters we just saved for inference, change `False` to `True`:

In [None]:
if False:
    from unsloth import FastLanguageModel
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "functiongemma", # YOUR MODEL YOU USED FOR TRAINING
        max_seq_length = 2048,
        load_in_4bit = False,
    )

### Saving to float16 for vLLM

We also support saving to `float16` directly. Select `merged_16bit` for float16 or `merged_4bit` for int4. We also allow `lora` adapters as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens.

In [None]:
# Merge to 16bit
if False:
    model.save_pretrained_merged("functiongemma-finetune", tokenizer, save_method = "merged_16bit")
if False: # Pushing to HF Hub
    model.push_to_hub_merged("hf/functiongemma-finetune", tokenizer, save_method = "merged_16bit", token = "")

# Merge to 4bit
if False:
    model.save_pretrained_merged("functiongemma-finetune", tokenizer, save_method = "merged_4bit",)
if False: # Pushing to HF Hub
    model.push_to_hub_merged("hf/functiongemma-finetune", tokenizer, save_method = "merged_4bit", token = "")

# Just LoRA adapters
if False:
    model.save_pretrained("functiongemma-finetune")
    tokenizer.save_pretrained("functiongemma-finetune")
if False: # Pushing to HF Hub
    model.push_to_hub("hf/functiongemma-finetune", token = "")
    tokenizer.push_to_hub("hf/functiongemma-finetune", token = "")

### GGUF / llama.cpp Conversion
We now natively support saving to `GGUF` / `llama.cpp` for all models! For now, you can convert easily to `Q8_0`, `F16` or `BF16` precision. `Q4_K_M` for 4bit will come later!

In [None]:
if False: # Change to True to save to GGUF
    model.save_pretrained_gguf(
        "functiongemma-finetune",
        tokenizer,
        quantization_method = "Q8_0", # For now only Q8_0, BF16, F16 supported
    )
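
As a rough guide to choosing between the supported precisions, the sketch below estimates the weight payload of the exported file. It assumes every weight is stored at the chosen precision and ignores GGUF metadata and any tensors kept at higher precision, so treat the numbers as ballpark figures only. `Q8_0` stores blocks of 32 int8 weights plus one 16-bit scale, which works out to 34 bytes per 32 weights. The helper name `estimate_gguf_bytes` is hypothetical.

```python
PARAMS = 271_895_168  # total parameter count reported in the training log

# Approximate bytes per weight for each supported method
BYTES_PER_WEIGHT = {
    "F16": 2.0,
    "BF16": 2.0,
    "Q8_0": (32 * 1 + 2) / 32,  # 32 int8 values + one 16-bit scale per block
}

def estimate_gguf_bytes(num_params: int, method: str) -> int:
    """Hypothetical helper: ballpark size of the weights, ignoring metadata."""
    return int(num_params * BYTES_PER_WEIGHT[method])

for method in ("F16", "BF16", "Q8_0"):
    size_mb = estimate_gguf_bytes(PARAMS, method) / 1e6
    print(f"{method}: about {size_mb:.0f} MB")
```

For a model this small, `Q8_0` roughly halves the file compared with the 16-bit options.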

Likewise, if you want to push the GGUF to your Hugging Face account instead, change `if False` to `if True` and add your Hugging Face token and upload location!

In [None]:
if False: # Change to True to upload GGUF
    model.push_to_hub_gguf(
        "HF_ACCOUNT/functiongemma-gguf",
        tokenizer,
        quantization_method = "Q8_0", # Only Q8_0, BF16, F16 supported
        token = "hf_...",
    )

Now, use the exported `functiongemma-finetune.gguf` file in llama.cpp. A `Q4_K_M` file is not produced yet, since only `Q8_0`, `BF16` and `F16` are supported for now.

And we're done! If you have any questions on Unsloth, find any bugs, need help, or want to keep up with the latest LLM news and join projects, feel free to join our [Discord](https://discord.gg/unsloth)!

Some other links:
1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)
2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)
3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)
4. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!


  This notebook and all Unsloth notebooks are licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme).
