# 🐸 Bombina Fine-Tuning - 5K Dataset

**Professional Pentest AI Fine-Tuning with Unsloth + QLoRA**

Dataset: 5,004 reasoning-based samples
- Attack decision paths
- Failure analysis
- Detection evasion
- Blue team perspective

In [None]:
# Install dependencies
!pip install unsloth
!pip install --no-deps trl peft accelerate bitsandbytes

In [None]:
from unsloth import FastLanguageModel
import torch

# Configuration
max_seq_length = 4096
dtype = None  # Auto-detect
load_in_4bit = True

# Load base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-7B",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

print(f"✅ Model loaded: {model.config._name_or_path}")

In [None]:
# Apply LoRA configuration (optimized for pentest reasoning)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ],
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=42,
    use_rslora=False,
    loftq_config=None,
)

print("✅ LoRA applied")
model.print_trainable_parameters()
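
The figure printed by `print_trainable_parameters()` can be sanity-checked by hand: each adapted linear layer with shape `(in, out)` gains `r * (in + out)` LoRA weights. The sketch below does that arithmetic. The layer dimensions are an assumption (values believed to match Qwen2.5-7B, not read from this notebook), so confirm them against `model.config` before relying on the total.

```python
def lora_param_count(r, shapes, num_layers):
    """Count trainable LoRA weights.

    Each adapted linear layer of shape (d_in, d_out) adds an r x d_in matrix A
    and a d_out x r matrix B, i.e. r * (d_in + d_out) weights.
    """
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


# ASSUMPTION: dimensions believed to match Qwen2.5-7B; verify with model.config
# (hidden_size, intermediate_size, num_key_value_heads * head_dim, num_hidden_layers).
HIDDEN, INTERMEDIATE, KV_DIM, NUM_LAYERS = 3584, 18944, 512, 28

SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

total = lora_param_count(r=16, shapes=SHAPES, num_layers=NUM_LAYERS)
scaling = 32 / 16  # lora_alpha / r, the standard LoRA scaling when use_rslora=False

print(f"LoRA trainable parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

If the printed total differs from what the notebook reports, the assumed dimensions are the first thing to check.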

In [None]:
# Upload your dataset
from google.colab import files
print("📁 Upload train_split.jsonl from scripts/data/processed/")
uploaded = files.upload()

In [None]:
# Load and format dataset
from datasets import load_dataset

# Alpaca-style prompt template for pentest reasoning
alpaca_prompt = """Below is an instruction that describes a penetration testing task, paired with an input that provides further context. Write a response that appropriately completes the request with expert-level reasoning.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

EOS_TOKEN = tokenizer.eos_token

def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    for instruction, input_text, output in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(
            instruction=instruction,
            input=input_text if input_text else "",
            output=output
        ) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

# Load dataset
dataset = load_dataset("json", data_files="train_split.jsonl", split="train")
dataset = dataset.map(formatting_prompts_func, batched=True)

print(f"✅ Dataset loaded: {len(dataset)} samples")
print(f"\n📝 Sample entry:\n{dataset[0]['text'][:500]}...")
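
`formatting_prompts_func` assumes every record carries `instruction`, `input`, and `output` fields. A quick pre-flight check on the JSONL can catch malformed records before they reach the trainer. The helper below is a minimal sketch using only the standard library; `find_bad_records` and the sample lines are hypothetical names and data introduced for illustration.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def find_bad_records(lines):
    """Return (line_number, reason) pairs for records the formatter cannot use."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((lineno, f"missing keys: {missing}"))
        elif not record["instruction"] or not record["output"]:
            # An empty "input" is fine; the prompt template tolerates it.
            problems.append((lineno, "empty instruction or output"))
    return problems


# Hypothetical sample lines standing in for train_split.jsonl
sample_lines = [
    '{"instruction": "Assess the scope.", "input": "", "output": "Start with recon."}',
    '{"instruction": "No output field", "input": ""}',
    "not json at all",
]
print(find_bad_records(sample_lines))
```

In the notebook, the same function can be pointed at the uploaded file with `with open("train_split.jsonl", encoding="utf-8") as fh: print(find_bad_records(fh))`.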

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        warmup_ratio=0.05,
        num_train_epochs=3,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="cosine",
        seed=42,
        output_dir="outputs",
        save_strategy="epoch",
    ),
)

print("✅ Trainer configured")
print(f"   Batch size: 2 x 8 = 16 effective")
print(f"   Epochs: 3")
print(f"   Learning rate: 2e-4")
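
The batch settings above determine how many optimizer steps the run takes and how long warmup lasts. The sketch below estimates both. It assumes a single GPU and uses the 5,004-sample figure from the introduction; `train_split.jsonl` may contain fewer rows, and the trainer's own step count can differ slightly depending on how the library version handles a partial final accumulation batch.

```python
import math


def estimate_schedule(num_samples, per_device_batch, grad_accum, epochs,
                      warmup_ratio, num_devices=1):
    """Approximate optimizer steps and warmup steps for a training run."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# ASSUMPTION: 5,004 samples (full dataset size); substitute len(dataset) in the notebook.
schedule = estimate_schedule(
    num_samples=5004,
    per_device_batch=2,
    grad_accum=8,
    epochs=3,
    warmup_ratio=0.05,
)
print(schedule)
```

With `logging_steps=10`, the total step count also indicates roughly how many loss lines to expect in the training log.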

In [None]:
# 🚀 START TRAINING
print("🔥 Starting fine-tuning...")
print("   Expected time: ~2-3 hours (varies by GPU; a T4 is considerably slower than an A100)")

trainer_stats = trainer.train()

print("\n✅ Training complete!")
print(f"   Total steps: {trainer_stats.global_step}")
print(f"   Final loss: {trainer_stats.training_loss:.4f}")

In [None]:
# Test the model
FastLanguageModel.for_inference(model)

test_prompt = alpaca_prompt.format(
    instruction="You are conducting a penetration test against a Windows domain.",
    input="You have local admin on a workstation. EDR is deployed. What's your next move?",
    output=""
)

inputs = tokenizer([test_prompt], return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # temperature and top_p only take effect when sampling is enabled
    temperature=0.7,
    top_p=0.9,
    use_cache=True
)

# skip_special_tokens drops the trailing EOS marker from the printed text
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print("🧪 Test Response:")
print(response.split("### Response:")[1] if "### Response:" in response else response)
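
The inline split on `### Response:` keeps only the segment between the first and second marker, so an answer that happens to repeat the marker would be cut short. A small helper can make the extraction more robust. This is a sketch only; `extract_response` is a hypothetical name, and the EOS string in the usage example is an assumed value (use `tokenizer.eos_token` in the notebook).

```python
def extract_response(decoded, marker="### Response:", eos_token=None):
    """Return the text after the first response marker, without any EOS suffix."""
    _, found, tail = decoded.partition(marker)
    text = tail if found else decoded  # fall back to the full text if no marker
    if eos_token:
        # Keep only what precedes the first EOS token
        text = text.split(eos_token, 1)[0]
    return text.strip()


# Hypothetical decoded output for illustration
decoded_example = (
    "### Instruction:\nDescribe the first step.\n\n"
    "### Response:\nConfirm the rules of engagement.<|endoftext|>"
)
print(extract_response(decoded_example, eos_token="<|endoftext|>"))
```

Using `partition` means everything after the first marker is preserved, even if the generated text contains the marker again.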

In [None]:
# Save LoRA adapter
model.save_pretrained("bombina-lora-5k")
tokenizer.save_pretrained("bombina-lora-5k")
print("✅ LoRA adapter saved to bombina-lora-5k/")

In [None]:
# Export to GGUF for Ollama
print("📦 Exporting to GGUF format...")

model.save_pretrained_gguf(
    "bombina-5k",
    tokenizer,
    quantization_method="q4_k_m"  # Good balance of size/quality
)

print("✅ GGUF exported to bombina-5k/ (the exact file name may vary by Unsloth version)")

In [None]:
# Download the model
from google.colab import files
import os

# Find the GGUF file
for f in os.listdir("bombina-5k"):
    if f.endswith(".gguf"):
        print(f"📥 Downloading {f}...")
        files.download(f"bombina-5k/{f}")
        break

print("\n📋 To use with Ollama:")
print("1. Copy .gguf to your machine")
print("2. Create Modelfile:")
print('   FROM ./bombina-5k-q4_k_m.gguf')
print('   PARAMETER num_ctx 4096')
print('   PARAMETER temperature 0.7')
print("3. ollama create bombina-5k -f Modelfile")
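
The Modelfile from step 2 can also be generated programmatically, so the `FROM` line always matches the file that was actually downloaded. The sketch below writes the same three directives listed above; `build_modelfile` is a hypothetical helper, and the GGUF file name is taken from the instructions and may need adjusting to the real file name.

```python
from pathlib import Path


def build_modelfile(gguf_name, num_ctx=4096, temperature=0.7):
    """Return Modelfile text pointing Ollama at a local GGUF file."""
    return (
        f"FROM ./{gguf_name}\n"
        f"PARAMETER num_ctx {num_ctx}\n"
        f"PARAMETER temperature {temperature}\n"
    )


# ASSUMPTION: file name as given in the instructions; replace with the downloaded file's name.
content = build_modelfile("bombina-5k-q4_k_m.gguf")
Path("Modelfile").write_text(content, encoding="utf-8")
print(content)
```

Run this in the directory that holds the `.gguf` file, then continue with `ollama create bombina-5k -f Modelfile` as in step 3.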

## 🎯 Training Summary

**What was trained:**
- 5,004 pentest reasoning samples
- Attack decision trees
- Failure scenarios
- Detection awareness
- Blue team perspective

**Model outputs:**
- `bombina-lora-5k/` - LoRA adapter (~300MB)
- `bombina-5k-q4_k_m.gguf` - Quantized for Ollama (~4GB)

**Next steps:**
1. Download GGUF
2. Create Ollama model
3. Test with Bombina agent