# Fine-Tuning an LLM to Better Understand LTL Semantics

## Install Dependencies

In [None]:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers" trl peft accelerate bitsandbytes
!pip install protobuf==3.20.3 --force-reinstall --no-deps
!pip install llama-cpp-python --upgrade --no-cache-dir

Collecting unsloth@ git+https://github.com/unslothai/unsloth.git (from unsloth[colab-new]@ git+https://github.com/unslothai/unsloth.git)
  Cloning https://github.com/unslothai/unsloth.git to /tmp/pip-install-1ve7dggc/unsloth_3e3034a5fb21409d830bf724da09f863
  Running command git clone --filter=blob:none --quiet https://github.com/unslothai/unsloth.git /tmp/pip-install-1ve7dggc/unsloth_3e3034a5fb21409d830bf724da09f863
  Resolved https://github.com/unslothai/unsloth.git to commit 8c155a2c07fc6521c869e44fb5f43b93fc0b81ff
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting unsloth_zoo>=2025.10.1 (from unsloth@ git+https://github.com/unslothai/unsloth.git->unsloth[colab-new]@ git+https://github.com/unslothai/unsloth.git)
  Downloading unsloth_zoo-2025.10.1-py3-none-any.whl.metadata (31 kB)
Collecting tyro (from unsloth@ git+https://github.com/unslothai/unsloth.gi

Collecting llama-cpp-python
  Downloading llama_cpp_python-0.3.16.tar.gz (50.7 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting diskcache>=5.6.1 (from llama-cpp-python)
  Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.3.16-cp312-cp312-linux_x86_64.whl size=4503264 sha256=6b774e8f4ae0a6b3

## Prepare Dataset

In [None]:
from datasets import load_dataset
import torch

# Load the prepared dataset from ltl_training_dataset.jsonl
dataset = load_dataset("json", data_files="ltl_training_dataset.jsonl", split="train")

print(f"Total samples: {len(dataset)}")

# Format the dataset for the model's chat template
def format_instruction(example):
    """Format using comprehensive system prompt matching actual repo implementation"""
    instruction = example['instruction']
    output = example['output']
    
    # Enhanced system prompt matching the actual repo's translate_to_ltl function
    system_context = """You are a specialized translator that converts natural language drone commands into Linear Temporal Logic (LTL) formulas.

Available LTL operators:
- F(φ) = eventually φ will be true
- G(φ) = φ is always true  
- X(φ) = φ is true in next step
- φ & ψ = both φ and ψ are true
- φ | ψ = either φ or ψ is true
- !φ = φ is not true
- φ U ψ = φ until ψ

Available predicates and actions:
- at(location): drone at specific location
- near(location, radius): drone within radius of location
- above(altitude): drone above altitude threshold
- below(altitude): drone below altitude threshold
- move_to(location): navigate to location
- hover(duration): maintain position for seconds
- scan(area): perform sensor sweep
- emergency_return(): return to start position
- land(): controlled landing
- clear_of(obstacle): maintain distance from obstacle
- in_bounds(): stay within flight zone
- battery_level(threshold): battery above threshold
- moving(): drone is in motion
- stationary(): drone is not moving

Available waypoints: landing_pad, waypoint_a, waypoint_b, waypoint_c, area_1, obstacle_1, obstacle_2

Respond only with the LTL formula."""
    
    return {
        "text": f"[INST] {system_context}\n\nTranslate this drone command to LTL: {instruction} [/INST] {output}"
    }

# Apply the formatting to the entire dataset
dataset = dataset.map(format_instruction)

# Create a train/validation split (e.g., 80/20) for evaluation
dataset = dataset.train_test_split(test_size=0.2, seed=42)

train_dataset = dataset["train"]
eval_dataset = dataset["test"]

print(f"Training samples: {len(train_dataset)}")
print(f"Validation samples: {len(eval_dataset)}")

Total samples: 274


Training samples: 219
Validation samples: 55
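
The training cell above and the test cell later on each embed the same system prompt and `[INST] ... [/INST]` template, so any drift between the two copies would silently change what the model sees at inference time. One way to guard against that is a single shared helper. The sketch below is only an illustration: `build_prompt` and `SYSTEM_CONTEXT` are hypothetical names that do not appear in the notebook, and the system prompt is shortened to one line.

```python
# Hypothetical helper (not part of the original notebook) that keeps the
# training and inference prompts in sync.
SYSTEM_CONTEXT = "Respond only with the LTL formula."  # stand-in for the full system prompt


def build_prompt(instruction, output=None, system_context=SYSTEM_CONTEXT):
    """Return the [INST] prompt; append the target formula when one is given."""
    prompt = (
        f"[INST] {system_context}\n\n"
        f"Translate this drone command to LTL: {instruction} [/INST]"
    )
    # Training examples carry the expected formula after the closing tag.
    return prompt if output is None else f"{prompt} {output}"


train_text = build_prompt("fly to waypoint_c", "F(move_to(waypoint_c))")
infer_text = build_prompt("fly to waypoint_c")

# The inference prompt should be an exact prefix of the training text.
print(train_text.startswith(infer_text))
```

With a helper like this, `format_instruction` would return `{"text": build_prompt(instruction, output)}` and the test loop would call `build_prompt(query)`.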


## Load Model & Tokenizer

In [None]:
from unsloth import FastLanguageModel

model_name = "unsloth/mistral-7b-instruct-v0.2-bnb-4bit"
max_seq_length = 512 # Reduced for shorter LTL outputs

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=None,
    load_in_4bit=True,
)

# LoRA configuration: rank-16 adapters on every attention and MLP projection
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    lora_dropout=0, # No dropout for a deterministic translation task
    bias="none",
    use_gradient_checkpointing="unsloth",
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.10.1: Fast Mistral patching. Transformers: 4.56.2.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.32.post2. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Unsloth 2025.10.1 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
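
The training banner further down reports `Trainable parameters = 41,943,040`. That figure can be reproduced by hand from the LoRA settings: each adapted linear layer gains two low-rank matrices, costing `r * (in_features + out_features)` parameters. The sketch below assumes the standard Mistral-7B dimensions (hidden size 4096, MLP size 14336, 32 layers, key/value projection width 1024), which the notebook does not state.

```python
# Assumed Mistral-7B dimensions (not stated in the notebook).
HIDDEN = 4096          # model width
INTERMEDIATE = 14336   # MLP width
KV_DIM = 1024          # 8 key/value heads x 128 dims (grouped-query attention)
NUM_LAYERS = 32
RANK = 16              # r from the LoRA config above

# (input features, output features) of each adapted projection.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(shapes, rank, num_layers):
    """Each adapter adds A (rank x in) and B (out x rank) matrices."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers


total = lora_params(SHAPES, RANK, NUM_LAYERS)
print(f"{total:,}")  # 41,943,040
```

Doubling `r` doubles this count, while dropping the three MLP projections from `target_modules` would remove roughly two thirds of it.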


## Setup & Run SFTTrainer

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments, EarlyStoppingCallback

# Make sure a padding token exists; fall back to EOS if the tokenizer has none
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = tokenizer.eos_token_id

# Pad on the left side of each sequence
tokenizer.padding_side = "left"

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,  # reuse the value set when loading the model
    packing=False,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
    args=TrainingArguments(
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=16,
        learning_rate=2e-5,
        num_train_epochs=12,
        warmup_steps=10,
        eval_strategy="steps",
        eval_steps=20,
        save_strategy="steps",
        save_steps=20,
        logging_steps=1,
        output_dir="outputs",
        optim="adamw_8bit",
        seed=3407,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        save_total_limit=1,
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        dataloader_drop_last=True,
        remove_unused_columns=True,
        group_by_length=True,  # Group samples of similar length (limits padding when batch size > 1)
    ),
)

print("Starting LTL translation training...")
trainer_stats = trainer.train()

print("Training complete!")

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/219 [00:00<?, ? examples/s]

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/55 [00:00<?, ? examples/s]

Starting LTL translation training...


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 219 | Num Epochs = 6 | Total steps = 84
O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 16
\        /    Data Parallel GPUs = 1 | Total batch size (1 x 16 x 1) = 16
 "-____-"     Trainable parameters = 41,943,040 of 7,283,675,136 (0.58% trained)

Unsloth: Will smartly offload gradients to save VRAM!


| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 20 | 0.117 | 0.124178 |
| 40 | 0.0676 | 0.08362 |
| 60 | 0.053 | 0.079467 |
| 80 | 0.0468 | 0.084798 |
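
Because the trainer is configured with `load_best_model_at_end=True` and `metric_for_best_model="eval_loss"`, the checkpoint it keeps should be the one with the lowest validation loss, not the last one. A minimal sketch of that selection, using the values logged above (the `eval_log` list is just a hand-copied illustration, not something the trainer exposes under that name):

```python
# Evaluation log copied from the output above: (step, training loss, validation loss).
eval_log = [
    (20, 0.117, 0.124178),
    (40, 0.0676, 0.08362),
    (60, 0.053, 0.079467),
    (80, 0.0468, 0.084798),
]

# Pick the row with the smallest validation loss (greater_is_better=False).
best_step, best_train, best_val = min(eval_log, key=lambda row: row[2])
print(best_step, best_val)  # 60 0.079467
```

Validation loss rises again at step 80 while training loss keeps falling, which is the usual sign that the model has started to overfit the 219 training samples.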


Unsloth: Not an error, but MistralForCausalLM does not accept `num_items_in_batch`.
Using gradient accumulation will be very slightly less accurate.
Read more on gradient accumulation issues here: https://unsloth.ai/blog/gradient


Training complete!
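
The step count in the Unsloth banner follows from the dataset size and the batch settings. The arithmetic below is a sketch that assumes the number of optimizer steps per epoch is rounded up; that assumption matches the banner for this run, but the exact rounding may differ between Trainer versions. Note that the banner reports 6 epochs, so that value is used here.

```python
import math

num_examples = 219         # training samples reported above
per_device_batch = 1
grad_accum_steps = 16
num_epochs = 6             # value shown in the Unsloth banner

# One optimizer step consumes per_device_batch * grad_accum_steps samples.
effective_batch = per_device_batch * grad_accum_steps
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * num_epochs
print(effective_batch, steps_per_epoch, total_steps)  # 16 14 84
```

With `eval_steps=20`, a run of 84 steps produces evaluations at steps 20, 40, 60 and 80, which is exactly what the loss table shows.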


## Test Model

In [None]:
import torch

def quick_test(model, tokenizer):
    """Enhanced test for LTL translation matching actual repo capabilities"""

    # Updated tests to match actual repo predicates and actions
    tests = [
        ("fly to waypoint_c", "F(move_to(waypoint_c))"),
        ("maintain altitude above 2 meters until reaching waypoint_a", "above(2.0) U at(waypoint_a)"),
        ("go to waypoint_a then waypoint_b", "F(move_to(waypoint_a) & X(F(move_to(waypoint_b))))"),
        ("scan area_1 while avoiding obstacle_1", "F(scan(area_1)) & G(clear_of(obstacle_1))"),
        ("return home", "F(move_to(landing_pad))"),
        ("hover for 5 seconds then scan area_1", "F(hover(5) & X(F(scan(area_1))))"),
        ("stay above 1.5 meters until landing", "above(1.5) U land()"),
        ("keep moving until you reach waypoint_b", "moving() U at(waypoint_b)"),
        ("maintain battery above 40% until returning to base", "battery_level(40) U at(landing_pad)"),
        # battery_level(t) holds while the battery is above t, so "drops below" is its negation
        ("stay in bounds until battery level drops below 25%", "in_bounds() U !battery_level(25)"),
    ]

    passed = 0

    print("\nTesting LTL Translation Model (Enhanced for Repo Compatibility)...")
    print("="*80)

    # Enhanced system prompt matching the actual repo's translate_to_ltl function
    system_context = """You are a specialized translator that converts natural language drone commands into Linear Temporal Logic (LTL) formulas.

Available LTL operators:
- F(φ) = eventually φ will be true
- G(φ) = φ is always true  
- X(φ) = φ is true in next step
- φ & ψ = both φ and ψ are true
- φ | ψ = either φ or ψ is true
- !φ = φ is not true
- φ U ψ = φ until ψ

Available predicates and actions:
- at(location): drone at specific location
- near(location, radius): drone within radius of location
- above(altitude): drone above altitude threshold
- below(altitude): drone below altitude threshold
- move_to(location): navigate to location
- hover(duration): maintain position for seconds
- scan(area): perform sensor sweep
- emergency_return(): return to start position
- land(): controlled landing
- clear_of(obstacle): maintain distance from obstacle
- in_bounds(): stay within flight zone
- battery_level(threshold): battery above threshold
- moving(): drone is in motion
- stationary(): drone is not moving

Available waypoints: landing_pad, waypoint_a, waypoint_b, waypoint_c, area_1, obstacle_1, obstacle_2

Respond only with the LTL formula."""

    for i, (query, expected) in enumerate(tests, 1):
        # Use the same comprehensive format as training
        formatted = f"[INST] {system_context}\n\nTranslate this drone command to LTL: {query} [/INST]"
        inputs = tokenizer(formatted, return_tensors="pt").to(model.device)

        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=128,
                do_sample=False,  # greedy decoding; temperature is not used
                pad_token_id=tokenizer.eos_token_id,
                eos_token_id=tokenizer.eos_token_id
            )

        # Extract response after [/INST]
        full_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
        response = full_output.split("[/INST]")[-1].strip()
        response = response.split('\n')[0].strip()

        # More flexible matching for LTL formulas
        match = (response == expected or 
                set(response.split(' & ')) == set(expected.split(' & ')) or
                response.replace(' ', '') == expected.replace(' ', ''))
        passed += match

        status = '✓' if match else '✗'
        print(f"\n{status} Test {i}/{len(tests)}")
        print(f"Query:    {query}")
        print(f"Expected: {expected}")
        print(f"Got:      {response}")

    print("\n" + "="*80)
    print(f"Score: {passed}/{len(tests)} ({100*passed/len(tests):.0f}%)")
    print("Status: " + ("Ready!" if passed >= 7 else "Needs more training"))

# Run it
quick_test(model, tokenizer)


Testing LTL Translation Model...

✓ Test 1/8
Query:    fly to waypoint_c
Expected: F(at(waypoint_c))
Got:      F(at(waypoint_c))

✓ Test 2/8
Query:    fly to waypoint_a while staying above 2 meters
Expected: F(at(waypoint_a)) & G(above(2.0))
Got:      F(at(waypoint_a)) & G(above(2.0))

✓ Test 3/8
Query:    go to waypoint_a then waypoint_b
Expected: F(at(waypoint_a) & X(F(at(waypoint_b))))
Got:      F(at(waypoint_a) & X(F(at(waypoint_b))))

✓ Test 4/8
Query:    scan area_1 while avoiding obstacle_1
Expected: F(scan(area_1)) & G(clear_of(obstacle_1))
Got:      F(scan(area_1)) & G(clear_of(obstacle_1))

✓ Test 5/8
Query:    return home
Expected: F(at(landing_pad))
Got:      F(at(landing_pad))

✗ Test 6/8
Query:    hover for 10 seconds then scan area_2
Expected: F(hover(10) & X(F(scan(area_2))))
Got:      F(hover(10) & X(scan(area_2)))

✓ Test 7/8
Query:    fly to waypoint_b then waypoint_c while avoiding obstacle_2
Expected: F(at(waypoint_b) & X(F(at(waypoint_c)))) & G(clear_of(obstacle_
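
The comparison in `quick_test` splits on the literal string `' & '`, which also cuts through conjunctions nested inside `F(...)` or `X(...)`. A slightly more careful check splits only at parenthesis depth zero. The helpers below (`split_top_level`, `normalize_ltl`, `formulas_match`) are hypothetical names for a minimal sketch; they handle whitespace and the order of top-level `&` operands only, not operator precedence or `|`.

```python
def split_top_level(formula, op="&"):
    """Split on `op` only where parentheses are balanced (depth 0)."""
    parts, depth, current = [], 0, []
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == op and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def normalize_ltl(formula):
    """Drop whitespace and sort top-level conjuncts so A & B equals B & A."""
    compact = "".join(formula.split())
    return "&".join(sorted(split_top_level(compact)))


def formulas_match(response, expected):
    return normalize_ltl(response) == normalize_ltl(expected)


# Same conjuncts in a different order: treated as equal.
print(formulas_match("G(clear_of(obstacle_1)) & F(scan(area_1))",
                     "F(scan(area_1)) & G(clear_of(obstacle_1))"))  # True
# Missing inner F, as in the failed test above: still reported as a mismatch.
print(formulas_match("F(hover(10) & X(scan(area_2)))",
                     "F(hover(10) & X(F(scan(area_2))))"))           # False
```

This is still a syntactic comparison. Two formulas that are logically equivalent but written differently would be reported as a mismatch.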

## Export Fine-Tuned Model

In [None]:
# First, merge LoRA adapters into base model (this creates a standard HF model)
print("Merging LoRA adapters into base model...")
model.save_pretrained_merged(
    "Llama3_Crazyflie_merged",  # Output directory (the base model is Mistral-7B despite the name)
    tokenizer,
    save_method="merged_16bit",  # Merge to 16-bit
)

print("LoRA merge complete. Now converting to GGUF...")

# save_pretrained_gguf merges the LoRA adapters itself before converting,
# so this call repeats the merge and then writes the GGUF file
model.save_pretrained_gguf(
    "Llama3_Crazyflie_merged",  # Reuse the same output directory
    tokenizer,
    quantization_method="q4_k_m",  # 4-bit quantization
)

print("GGUF conversion complete!")

Merging LoRA adapters into base model...


Found HuggingFace hub cache directory: /root/.cache/huggingface/hub


Checking cache directory for required files...
Cache check failed: model-00001-of-00003.safetensors not found in local cache.
Not all required files found in cache. Will proceed with downloading.


Unsloth: Preparing safetensor model files: 100%|██████████| 3/3 [08:45<00:00, 175.31s/it]
Unsloth: Merging weights into 16bit: 100%|██████████| 3/3 [07:05<00:00, 141.81s/it]


Unsloth: Merge process complete. Saved to `/content/Llama3_Crazyflie_merged`
LoRA merge complete. Now converting to GGUF...


Unsloth: You have 1 CPUs. Using `safe_serialization` is 10x slower.
We shall switch to Pytorch saving, which might take 3 minutes and not 30 minutes.
To force `safe_serialization`, set it to `None` instead.
Unsloth: Kaggle/Colab has limited disk space. We need to delete the downloaded
model which will save 4-16GB of disk space, allowing you to save on Kaggle/Colab.
Unsloth: Will remove a cached repo with size 4.1G


Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 2.41 out of 12.67 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...


 59%|█████▉    | 19/32 [00:01<00:00, 13.73it/s]
We will save to Disk and not RAM now.
 78%|███████▊  | 25/32 [01:23<01:06,  9.53s/it]

In [None]:
# Just download the merged model!
!apt-get install -y zip # Installs the zip utility (often needed in Colab)
!zip -r Llama3_Crazyflie_merged.zip Llama3_Crazyflie_merged # Zips the merged model directory
# Download to your computer
from google.colab import files
files.download('Llama3_Crazyflie_merged.zip') # Initiates the download in Colab
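
The first cell installs `llama-cpp-python`, but the notebook never uses it. The sketch below shows one way the exported GGUF file could be loaded for a quick local check. The exact GGUF filename depends on the Unsloth version, so the code searches the output directory rather than assuming a name; `find_gguf` and `translate` are hypothetical helpers, and the prompt is shortened (the full system prompt used in training should be included in practice).

```python
from pathlib import Path


def find_gguf(directory):
    """Return the first .gguf file under `directory`, or None if there is none."""
    matches = sorted(Path(directory).glob("**/*.gguf"))
    return matches[0] if matches else None


def translate(gguf_path, prompt, max_tokens=128):
    """Greedy-decode one formula with llama-cpp-python (imported lazily)."""
    from llama_cpp import Llama  # installed in the first cell

    llm = Llama(model_path=str(gguf_path), n_ctx=512, verbose=False)
    # temperature=0.0 selects greedy decoding; stop at the first newline.
    result = llm(prompt, max_tokens=max_tokens, temperature=0.0, stop=["\n"])
    return result["choices"][0]["text"].strip()


gguf_file = find_gguf("Llama3_Crazyflie_merged")
if gguf_file is None:
    print("No GGUF file found; run the export cell first.")
else:
    prompt = "[INST] Translate this drone command to LTL: return home [/INST]"
    print(translate(gguf_file, prompt))
```

Running the same test queries against the quantized file is a cheap way to confirm that `q4_k_m` quantization has not changed the model's answers.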