# Nimo's Coder Agent v3 - Security Enhanced Training

**Optimized for Google Colab Free Tier (T4 GPU, <4 hours)**

## Improvements over v2:
- Security vulnerability detection
- Proper error handling in generated code
- SQL injection awareness

## Training Strategy:
- **1 epoch** on combined dataset (fits in ~3.5 hours)
- Mixed data: about 80% general (CodeAlpaca), 20% security (Security DPO and CrossVul), and 12 hand-written error-handling examples
- Auto-saves to HuggingFace Hub every 300 steps
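
The realized data mix can be checked from the example counts this notebook prints when it combines the datasets (20,022 CodeAlpaca, 5,000 Security DPO, 128 CrossVul, 12 error handling). A minimal sketch of that arithmetic; the `counts` dictionary and the `mix_shares` helper are illustrative names, not part of the notebook:

```python
# Example counts as reported in the "Dataset sizes" output of this notebook.
counts = {
    "general (CodeAlpaca)": 20022,
    "security (Security DPO + CrossVul)": 5000 + 128,
    "error handling (hand-written)": 12,
}

def mix_shares(counts):
    """Return each category's share of the combined dataset as a fraction."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

for name, share in mix_shares(counts).items():
    print(f"{name:<38} {share:6.2%}")
```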

## Step 1: Setup

In [1]:
!pip install -q transformers datasets peft bitsandbytes accelerate trl huggingface_hub

In [2]:
import torch
import time
from datasets import load_dataset, Dataset, concatenate_datasets
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from trl import SFTTrainer, SFTConfig
from huggingface_hub import login

# Check GPU
print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")
print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB" if torch.cuda.is_available() else "")

# Track time
start_time = time.time()

GPU: Tesla T4
Memory: 15.8 GB


In [3]:
# Login to HuggingFace
login()


In [5]:
# Configuration
HF_USERNAME = ""   # your Hugging Face username
OUTPUT_MODEL = ""  # name of the Hub repository to push to
BASE_MODEL = "Qwen/Qwen2.5-Coder-0.5B-Instruct"

## Step 2: Load Datasets

In [6]:
# Loading CodeAlpaca (general coding)
print("Loading CodeAlpaca-20k...")
code_alpaca = load_dataset("sahil2801/CodeAlpaca-20k", split="train")
print(f"  Size: {len(code_alpaca)}")

Loading CodeAlpaca-20k...


  Size: 20022


In [7]:
# Loading Security DPO dataset
print("Loading Security DPO...")
try:
    security_dpo = load_dataset("CyberNative/Code_Vulnerability_Security_DPO", split="train")
    print(f"  Size: {len(security_dpo)}")
    print(f"  Columns: {security_dpo.column_names}")
except Exception as e:
    print(f"  Failed: {e}")
    security_dpo = None

Loading Security DPO...


  Size: 4656
  Columns: ['lang', 'vulnerability', 'system', 'question', 'chosen', 'rejected']


In [8]:
# Loading CrossVul dataset
print("Loading CrossVul...")
try:
    crossvul = load_dataset("hitoshura25/crossvul", split="train")
    print(f"  Size: {len(crossvul)}")
    print(f"  Columns: {crossvul.column_names}")
except Exception as e:
    print(f"  Failed: {e}")
    crossvul = None

Loading CrossVul...


  Size: 9313
  Columns: ['cwe_id', 'cwe_description', 'language', 'vulnerable_code', 'fixed_code', 'file_pair_id', 'source', 'language_dir']


## Step 3: Convert to Training Format

In [9]:
def format_alpaca(instruction, input_text="", output=""):
    """Format into Alpaca-style prompt."""
    if input_text and input_text.strip():
        return f"""### Instruction:
{instruction}

### Input:
{input_text}

### Response:
{output}"""
    else:
        return f"""### Instruction:
{instruction}

### Response:
{output}"""

In [10]:
def convert_security_dpo(dataset, max_samples=2000):
    """Convert Security DPO to training format."""
    if dataset is None:
        return []

    examples = []
    cols = dataset.column_names

    for i, item in enumerate(dataset):
        if i >= max_samples:
            break

        vuln = item.get('rejected', item.get('vulnerable', item.get('insecure', '')))
        secure = item.get('chosen', item.get('secure', item.get('safe', '')))
        prompt = item.get('prompt', item.get('question', item.get('instruction', '')))

        if vuln and secure:
            # Vulnerability detection
            examples.append({
                "text": format_alpaca(
                    "Review this code for security vulnerabilities. Is it safe?",
                    vuln,
                    f"""NO - SECURITY VULNERABILITY DETECTED!

This code contains security issues that could be exploited.

**Secure Version:**
```
{secure}
```

Always validate inputs and follow secure coding practices."""
                )
            })

            # Fix request
            examples.append({
                "text": format_alpaca(
                    "Fix the security vulnerabilities in this code",
                    vuln,
                    secure
                )
            })

    print(f"  Converted {len(examples)} security examples")
    return examples
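
The doubled code fence in the Test 1 output further down suggests that some `chosen` samples already arrive wrapped in a Markdown fence, so wrapping them again nests the fences. One possible way to handle this is to unwrap a single outer fence before formatting. This is a sketch under that assumption; `strip_outer_fence` is a hypothetical helper, not part of the notebook or of any library:

```python
import re

# Matches text that consists of one fenced block, with an optional
# language tag after the opening fence.
_FENCE_RE = re.compile(r"^```[\w+-]*\n(.*?)\n?```\s*$", re.DOTALL)

def strip_outer_fence(text):
    """Remove one surrounding Markdown code fence, if the text is a single fenced block."""
    stripped = text.strip()
    match = _FENCE_RE.match(stripped)
    # Leave the text alone when the inner part contains further fences:
    # it is then more than one simple block and unwrapping could corrupt it.
    if match and "```" not in match.group(1):
        return match.group(1)
    return stripped

print(strip_outer_fence("```python\nprint('hi')\n```"))
```

The converter could then call `strip_outer_fence(secure)` before inserting the code into its own fence.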

In [11]:
def convert_crossvul(dataset, max_samples=2000):
    """Convert CrossVul to training format."""
    if dataset is None:
        return []

    examples = []
    cols = dataset.column_names
    print(f"  CrossVul columns: {cols}")

    for i, item in enumerate(dataset):
        if i >= max_samples:
            break

        vuln = None
        fixed = None

        for v_col in ['vulnerable_code', 'before', 'code_before', 'old_code', 'buggy']:
            if v_col in cols:
                vuln = item.get(v_col, '')
                break

        for f_col in ['fixed_code', 'after', 'code_after', 'new_code', 'fixed']:
            if f_col in cols:
                fixed = item.get(f_col, '')
                break

        cwe = item.get('cwe', item.get('cwe_id', item.get('vulnerability_type', 'Security Issue')))
        lang = item.get('language', item.get('lang', item.get('programming_language', '')))

        if vuln and fixed and len(vuln) < 2000 and len(fixed) < 2000:
            examples.append({
                "text": format_alpaca(
                    f"Review this {lang} code for security vulnerabilities",
                    vuln,
                    f"""SECURITY VULNERABILITY: {cwe}

This code contains a security flaw. Here is the fixed version:

```
{fixed}
```"""
                )
            })

    print(f"  Converted {len(examples)} CrossVul examples")
    return examples

In [12]:
# Error handling examples
error_handling_examples = [
    # Python file operations
    {
        "text": format_alpaca(
            "Write a Python function to read a JSON file",
            "",
            '''```python
import json

def read_json_file(filepath):
    """
    Read and parse a JSON file safely.

    Args:
        filepath: Path to JSON file
    Returns:
        Parsed JSON data
    Raises:
        FileNotFoundError, json.JSONDecodeError
    """
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            return json.load(f)
    except FileNotFoundError:
        raise FileNotFoundError(f"File not found: {filepath}")
    except json.JSONDecodeError as e:
        raise json.JSONDecodeError(f"Invalid JSON: {e.msg}", e.doc, e.pos)
    except PermissionError:
        raise PermissionError(f"Cannot read: {filepath}")
```'''
        )
    },
    # Python division
    {
        "text": format_alpaca(
            "Write a Python function to divide two numbers",
            "",
            '''```python
def divide(a, b):
    """
    Safely divide two numbers.

    Args:
        a: Dividend
        b: Divisor
    Returns:
        Result of a / b
    Raises:
        ZeroDivisionError, TypeError
    """
    try:
        if b == 0:
            raise ZeroDivisionError("Cannot divide by zero")
        return a / b
    except TypeError:
        raise TypeError("Arguments must be numbers")
```'''
        )
    },
    # Python HTTP request
    {
        "text": format_alpaca(
            "Write a Python function to fetch data from an API",
            "",
            '''```python
import requests

def fetch_data(url, timeout=30):
    """
    Fetch JSON data from URL with error handling.

    Args:
        url: API endpoint
        timeout: Request timeout in seconds
    Returns:
        JSON response data
    Raises:
        requests.RequestException
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except requests.Timeout:
        raise requests.Timeout(f"Request timed out: {url}")
    except requests.HTTPError as e:
        raise requests.HTTPError(f"HTTP {response.status_code}: {e}")
    except requests.ConnectionError:
        raise requests.ConnectionError(f"Connection failed: {url}")
```'''
        )
    },
    # Python database
    {
        "text": format_alpaca(
            "Write a Python function to query a SQLite database",
            "",
            '''```python
import sqlite3

def query_users(db_path, user_id=None):
    """
    Query users from database safely.

    SECURITY: Uses parameterized queries to prevent SQL injection.
    """
    conn = None
    try:
        conn = sqlite3.connect(db_path)
        cursor = conn.cursor()

        if user_id is not None:
            # Parameterized query - SAFE from SQL injection
            cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
        else:
            cursor.execute("SELECT * FROM users")

        return cursor.fetchall()
    except sqlite3.Error as e:
        raise sqlite3.Error(f"Database error: {e}")
    finally:
        if conn:
            conn.close()
```'''
        )
    },
    # JavaScript fetch
    {
        "text": format_alpaca(
            "Write a JavaScript function to fetch data from an API",
            "",
            '''```javascript
async function fetchData(url) {
    try {
        const response = await fetch(url);

        if (!response.ok) {
            throw new Error(`HTTP ${response.status}: ${response.statusText}`);
        }

        return await response.json();

    } catch (error) {
        if (error instanceof TypeError) {
            throw new Error('Network error - check connection');
        }
        throw error;
    }
}

// Usage:
try {
    const data = await fetchData('https://api.example.com/users');
    console.log(data);
} catch (error) {
    console.error('Failed:', error.message);
}
```'''
        )
    },
    # JavaScript file read
    {
        "text": format_alpaca(
            "Write a JavaScript function to read a file",
            "",
            '''```javascript
const fs = require('fs').promises;

async function readFile(filePath) {
    try {
        const content = await fs.readFile(filePath, 'utf-8');
        return content;
    } catch (error) {
        if (error.code === 'ENOENT') {
            throw new Error(`File not found: ${filePath}`);
        }
        if (error.code === 'EACCES') {
            throw new Error(`Permission denied: ${filePath}`);
        }
        throw error;
    }
}
```'''
        )
    },
    # Add error handling to existing code
    {
        "text": format_alpaca(
            "Add error handling to this function",
            "def divide(a, b):\n    return a / b",
            '''```python
def divide(a, b):
    """
    Divide a by b with proper error handling.
    """
    try:
        if b == 0:
            raise ZeroDivisionError("Cannot divide by zero")
        return a / b
    except TypeError:
        raise TypeError("Both arguments must be numbers")
```'''
        )
    },
    # Security-aware shell command
    {
        "text": format_alpaca(
            "Write a Python function to run a shell command",
            "",
            '''```python
import subprocess
import shlex

def run_command(command, timeout=30):
    """
    Safely run a shell command.

    SECURITY: Uses shell=False to prevent command injection.
    Never pass untrusted user input directly!
    """
    try:
        # Parse command safely
        args = shlex.split(command)

        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout,
            shell=False  # IMPORTANT: Prevents shell injection
        )

        return result.stdout, result.stderr, result.returncode

    except subprocess.TimeoutExpired:
        raise TimeoutError(f"Command timed out after {timeout}s")
    except FileNotFoundError:
        raise FileNotFoundError(f"Command not found: {args[0]}")
```'''
        )
    },
    # Security review - command injection
    {
        "text": format_alpaca(
            "Review this code for security vulnerabilities. Is it safe?",
            "import os\nuser_input = input('Enter filename: ')\nos.system(f'cat {user_input}')",
            '''NO - CRITICAL SECURITY VULNERABILITY!

**Issue:** Command Injection
**Severity:** Critical
**Location:** os.system(f'cat {user_input}')

**Problem:** User input is passed directly to os.system() without sanitization.
An attacker could input: `; rm -rf /` or `| cat /etc/passwd`

**Secure Version:**
```python
import subprocess

user_input = input('Enter filename: ')

# Validate input
if not user_input.replace('_', '').replace('-', '').replace('.', '').isalnum():
    raise ValueError("Invalid filename")

# Use subprocess with shell=False
subprocess.run(['cat', user_input], check=True)
```'''
        )
    },
    # Security review - SQL injection
    {
        "text": format_alpaca(
            "Is this code secure?",
            'query = f"SELECT * FROM users WHERE id = {user_id}"',
            '''NO - SQL INJECTION VULNERABILITY!

**Issue:** SQL Injection
**Severity:** Critical

**Problem:** User input is directly concatenated into SQL query.
Attacker could input: `1 OR 1=1` to dump all users, or `1; DROP TABLE users;--` to destroy data.

**Secure Version:**
```python
# Use parameterized queries
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
```

NEVER use string formatting for SQL queries!'''
        )
    },
    # Security review - safe code
    {
        "text": format_alpaca(
            "Is this code secure?",
            'cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))',
            '''Yes, this code is secure.

It uses parameterized queries (placeholders with ?), which pass user input to the database separately from the SQL text, so the input is never interpreted as SQL. This prevents SQL injection attacks.

This is the correct way to handle user input in SQL queries.'''
        )
    },
    # Python list with edge cases
    {
        "text": format_alpaca(
            "Write a Python function to find the maximum in a list",
            "",
            '''```python
def find_max(numbers):
    """
    Find maximum value in a list.

    Args:
        numbers: List of numbers
    Returns:
        Maximum value or None if empty
    """
    if not numbers:
        return None

    if len(numbers) == 1:
        return numbers[0]

    try:
        return max(numbers)
    except TypeError:
        raise TypeError("List must contain comparable elements")
```'''
        )
    }
]

print(f"Created {len(error_handling_examples)} error handling examples")

Created 12 error handling examples


In [13]:
# Convert all datasets
print("\nConverting datasets...")

# Convert CodeAlpaca
print("\n1. CodeAlpaca:")
code_alpaca_formatted = []
for item in code_alpaca:
    code_alpaca_formatted.append({
        "text": format_alpaca(
            item.get('instruction', ''),
            item.get('input', ''),
            item.get('output', '')
        )
    })
print(f"  Formatted {len(code_alpaca_formatted)} examples")

# Convert security datasets
print("\n2. Security DPO:")
security_examples = convert_security_dpo(security_dpo, max_samples=2500)

print("\n3. CrossVul:")
crossvul_examples = convert_crossvul(crossvul, max_samples=2500)


Converting datasets...

1. CodeAlpaca:
  Formatted 20022 examples

2. Security DPO:
  Converted 5000 security examples

3. CrossVul:
  CrossVul columns: ['cwe_id', 'cwe_description', 'language', 'vulnerable_code', 'fixed_code', 'file_pair_id', 'source', 'language_dir']
  Converted 128 CrossVul examples
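
Only 128 CrossVul examples survive because `convert_crossvul` stops after inspecting the first `max_samples` rows and applies the 2,000-character filter inside that window. If the intent is to collect up to `max_samples` examples that pass the filter, one option is to filter first and cap afterwards. A minimal sketch on toy data; `take_matching`, `short_enough`, and the toy rows are illustrative, not part of the notebook:

```python
from itertools import islice

def take_matching(rows, predicate, max_samples):
    """Return up to max_samples rows that satisfy predicate, scanning rows lazily."""
    return list(islice((row for row in rows if predicate(row)), max_samples))

def short_enough(row, limit=2000):
    """Mirror the notebook's filter: both code fields present and under `limit` characters."""
    vuln, fixed = row.get("vulnerable_code"), row.get("fixed_code")
    return bool(vuln and fixed) and len(vuln) < limit and len(fixed) < limit

# Toy rows: every second pair is too long to pass the filter.
toy_rows = [
    {"vulnerable_code": "x" * (10 if i % 2 else 5000), "fixed_code": "y" * 10}
    for i in range(20)
]

kept = take_matching(toy_rows, short_enough, max_samples=5)
print(len(kept))  # 5: the cap applies to rows that passed the filter
```

Collecting more examples this way also lengthens training, so the cap would need to be weighed against the time budget.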


In [14]:
# Combine all datasets
print("\nCombining datasets...")

all_examples = (
    code_alpaca_formatted +
    security_examples +
    crossvul_examples +
    error_handling_examples
)

print(f"\nDataset sizes:")
print(f"  CodeAlpaca:      {len(code_alpaca_formatted)}")
print(f"  Security DPO:    {len(security_examples)}")
print(f"  CrossVul:        {len(crossvul_examples)}")
print(f"  Error Handling:  {len(error_handling_examples)}")
print(f"  ─────────────────────────")
print(f"  TOTAL:           {len(all_examples)}")

# Create dataset
combined_dataset = Dataset.from_list(all_examples)
combined_dataset = combined_dataset.shuffle(seed=42)

print(f"\nSample:")
print(combined_dataset[0]['text'][:300] + "...")


Combining datasets...

Dataset sizes:
  CodeAlpaca:      20022
  Security DPO:    5000
  CrossVul:        128
  Error Handling:  12
  ─────────────────────────
  TOTAL:           25162

Sample:
### Instruction:
Write a CSS rule to set a transparent border for all buttons on a page.

### Response:
button {
  border: 2px solid transparent;
}...


In [15]:
# Estimate training time
total_samples = len(combined_dataset)
batch_size = 2
grad_accum = 4
effective_batch = batch_size * grad_accum
steps_per_epoch = total_samples // effective_batch

# v2 took ~5 hours for 3381 steps
time_per_step = 5 * 3600 / 3381  # seconds
estimated_time = steps_per_epoch * time_per_step / 3600

print(f"\nTraining estimate:")
print(f"  Steps per epoch: {steps_per_epoch}")
print(f"  Estimated time: {estimated_time:.1f} hours")

if estimated_time > 3.5:
    print(f"\n  Consider reducing dataset size or increasing batch size.")
else:
    print(f"\n  Fits within the Colab free tier limit.")


Training estimate:
  Steps per epoch: 3145
  Estimated time: 4.7 hours

  Consider reducing dataset size or increasing batch size.
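
The estimate above reuses v2's time per step. Two details are worth noting: rounding the step count up matches the `global_step=3146` that the trainer reports later, and the `train_runtime` of 9891.8452 seconds from this run gives a measured time per step that could replace the v2 figure next time. A small sketch of that arithmetic; the variable names are illustrative:

```python
import math

total_samples = 25162        # combined dataset size from this notebook
effective_batch = 2 * 4      # per-device batch size x gradient accumulation steps

# Round up so the final, partially filled batch is counted as a step.
steps_per_epoch = math.ceil(total_samples / effective_batch)

# Measured value copied from the TrainOutput printed by trainer.train().
train_runtime_s = 9891.8452
measured_s_per_step = train_runtime_s / steps_per_epoch

# The v2-based figure used in the estimate cell (5 hours for 3381 steps).
v2_s_per_step = 5 * 3600 / 3381

print(f"Steps per epoch:   {steps_per_epoch}")
print(f"v2 estimate:       {v2_s_per_step:.2f} s/step")
print(f"Measured this run: {measured_s_per_step:.2f} s/step "
      f"({train_runtime_s / 3600:.2f} hours total)")
```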


## Step 4: Load Model

In [16]:
# QLoRA configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # MUST match bf16 training
    bnb_4bit_use_double_quant=True,
)

# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# Load model
print("Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

model = prepare_model_for_kbit_training(model)
print(f"Model loaded. Memory: {model.get_memory_footprint() / 1e9:.2f} GB")

Loading tokenizer...


Loading model...


Model loaded. Memory: 0.72 GB


In [17]:
# LoRA configuration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
)

model = get_peft_model(model, lora_config)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} ({100*trainable/total:.2f}%)")

Trainable: 8,798,208 / 323,917,696 (2.72%)
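
The trainable-parameter count can be reproduced by hand. Each LoRA-adapted linear layer with input size `d_in` and output size `d_out` adds `r * (d_in + d_out)` parameters. The sketch below assumes the architecture values from the published Qwen2.5-0.5B configuration (hidden size 896, MLP size 4864, 24 layers, 14 attention heads, 2 key/value heads); these numbers are not stated in this notebook, so treat them as assumptions:

```python
# Assumed architecture values (from the model's published config, not from this notebook).
hidden = 896
intermediate = 4864
num_layers = 24
num_heads = 14
num_kv_heads = 2
head_dim = hidden // num_heads      # 64
kv_dim = num_kv_heads * head_dim    # 128, output size of k_proj and v_proj

r = 16  # LoRA rank used in lora_config

# (d_in, d_out) for every target module in one transformer layer.
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# LoRA adds a matrix A (r x d_in) and a matrix B (d_out x r) per adapted layer.
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
trainable = per_layer * num_layers
print(f"{trainable:,}")  # 8,798,208
```

Under these assumptions the result matches the trainable count printed above.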


## Step 5: Train

In [19]:
# Training config - optimized for 4 hour limit
training_args = SFTConfig(
    output_dir=f"./{OUTPUT_MODEL}",

    # Fewer epochs to keep training under 4 hours
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,

    learning_rate=2e-4,
    weight_decay=0.01,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",

    # Memory optimization
    gradient_checkpointing=True,
    bf16=True,
    fp16=False,

    # Save (survives disconnects)
    logging_steps=25,
    save_strategy="steps",
    save_steps=300,
    save_total_limit=3,

    # Push to Hub
    push_to_hub=True,
    hub_model_id=f"{HF_USERNAME}/{OUTPUT_MODEL}",
    hub_strategy="every_save",

    optim="paged_adamw_8bit",
    max_grad_norm=0.3,
    dataset_text_field="text",
    max_length=1024,
)

print(f"Will save to: huggingface.co/{HF_USERNAME}/{OUTPUT_MODEL}")

Will save to: huggingface.co/CaptainNimo/nimos-coder-agent-v3
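
Because checkpoints are written every `save_steps` steps, an interrupted run can be resumed from the newest one, provided the contents of `output_dir` are still available (for example on a mounted drive). `Trainer` names checkpoint folders `checkpoint-<step>` inside `output_dir`, and `trainer.train()` accepts a `resume_from_checkpoint` argument. A minimal sketch for locating the newest folder; `latest_checkpoint` is a hypothetical helper and the directory layout in the demo is made up:

```python
import re
import tempfile
from pathlib import Path

def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> folder with the highest step, or None if there is none."""
    best_step, best_path = -1, None
    for path in Path(output_dir).glob("checkpoint-*"):
        match = re.fullmatch(r"checkpoint-(\d+)", path.name)
        if path.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path

# Demo on a temporary directory with made-up checkpoint folders.
with tempfile.TemporaryDirectory() as tmp:
    for step in (300, 600, 900):
        (Path(tmp) / f"checkpoint-{step}").mkdir()
    newest = latest_checkpoint(tmp)
    print(newest.name)  # checkpoint-900
    # In the notebook this could then be used as (not executed here):
    # trainer.train(resume_from_checkpoint=str(newest))
```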


In [20]:
# Initialize trainer
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=combined_dataset,
    processing_class=tokenizer,
)

print(f"\nStarting training...")
print(f"  Dataset: {len(combined_dataset)} examples")
print(f"  Time elapsed: {(time.time() - start_time) / 60:.1f} minutes")


Starting training...
  Dataset: 25162 examples
  Time elapsed: 3.6 minutes


In [21]:
trainer.train()

The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151645}.


| Step | Training Loss |
|------|---------------|
| 25   | 1.1233 |
| 50   | 0.8242 |
| 75   | 0.6949 |
| 100  | 0.6983 |
| 125  | 0.6893 |
| 150  | 0.6448 |
| 175  | 0.6654 |
| 200  | 0.6845 |
| 225  | 0.5801 |
| 250  | 0.6165 |




TrainOutput(global_step=3146, training_loss=0.6127634863904988, metrics={'train_runtime': 9891.8452, 'train_samples_per_second': 2.544, 'train_steps_per_second': 0.318, 'total_flos': 9775302491443200.0, 'train_loss': 0.6127634863904988, 'entropy': 0.6725581222110324, 'num_tokens': 3163625.0, 'mean_token_accuracy': 0.8126406132439037, 'epoch': 1.0})

In [22]:
# Save final model
print("Saving final model...")
trainer.save_model()
trainer.push_to_hub()

total_time = (time.time() - start_time) / 3600
print(f"\n✓ Training complete!")
print(f"  Total time: {total_time:.2f} hours")
print(f"  Model: huggingface.co/{HF_USERNAME}/{OUTPUT_MODEL}")

Saving final model...





✓ Training complete!
  Total time: 2.81 hours
  Model: huggingface.co/CaptainNimo/nimos-coder-agent-v3


## Step 6: Test

In [23]:
# Prepare for inference
model.eval()
model.config.use_cache = True
if hasattr(model, 'gradient_checkpointing_disable'):
    model.gradient_checkpointing_disable()

def generate(instruction, input_text="", max_tokens=512):
    if input_text:
        prompt = f"### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
    else:
        prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_tokens,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    if "### Response:" in response:
        response = response.split("### Response:")[-1].strip()
    return response

In [24]:
# TEST 1: Security Review (was failing in v2!)
print("=" * 60)
print("TEST 1: Command Injection Detection")
print("=" * 60)

vuln_code = '''import os
user_input = input("Enter filename: ")
os.system(f"cat {user_input}")'''

response = generate("Review this code for security vulnerabilities. Is it safe?", vuln_code)
print(response)

# Check if vulnerability detected
# ('no' and 'security' are left out: as substrings they match almost any response)
detected = any(word in response.lower() for word in ['vulnerability', 'unsafe', 'injection', 'dangerous'])
print(f"\n{'✓' if detected else '✗'} Vulnerability detected: {detected}")

TEST 1: Command Injection Detection
NO - SECURITY VULNERABILITY DETECTED!

This code contains security issues that could be exploited.

**Secure Version:**
```
```python
import os
import subprocess

def execute_command(user_input):
    command = ["cat", user_input]
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output, error = process.communicate()
    if process.returncode != 0:
        print(f"Error occurred: {error.decode('utf-8')}")
    else:
        print(output.decode('utf-8'))

user_input = input("Enter filename: ")
execute_command(user_input)
```
```

Always validate inputs and follow secure coding practices.

✓ Vulnerability detected: True
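
Plain substring checks over-match easily: a short keyword such as `no` is also found inside `not`, `know`, or `another`, so almost any response would pass. A sketch of a stricter check that matches whole words or phrases only; `mentions_any` and the sample strings are illustrative:

```python
import re

def mentions_any(text, keywords):
    """Return True if any keyword occurs in text as a whole word or phrase (case-insensitive)."""
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False

keywords = ["vulnerability", "unsafe", "injection", "dangerous"]

print(mentions_any("NO - SECURITY VULNERABILITY DETECTED!", keywords))       # True
print(mentions_any("Nothing unusual here, the code looks fine.", keywords))  # False
print("no" in "Nothing unusual here, the code looks fine.".lower())          # True (substring match)
```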


In [25]:
# TEST 2: SQL Injection
print("=" * 60)
print("TEST 2: SQL Injection Detection")
print("=" * 60)

sql_code = 'query = f"SELECT * FROM users WHERE id = {user_id}"'
response = generate("Is this code secure?", sql_code)
print(response)

detected = any(word in response.lower() for word in ['sql injection', 'parameterized', 'unsafe'])
print(f"\n{'✓' if detected else '✗'} SQL injection detected: {detected}")

TEST 2: SQL Injection Detection
No, this code is not secure. It uses SQL injection, which is a vulnerability that can be exploited by attackers to inject malicious code into the database. This can cause data loss, unauthorized access, and even complete system compromise.

✓ SQL injection detected: True


In [26]:
# TEST 3: Error Handling
print("=" * 60)
print("TEST 3: Error Handling in Generated Code")
print("=" * 60)

response = generate("Write a Python function to read a file")
print(response)

has_try = 'try' in response.lower()
has_except = 'except' in response.lower()
print(f"\n{'✓' if has_try and has_except else '✗'} Has error handling: {has_try and has_except}")

TEST 3: Error Handling in Generated Code
def read_file(filename):
    with open(filename, 'r') as f:
        return f.read()

✗ Has error handling: False
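
Checking for the substrings `try` and `except` can misfire on identifiers or comments such as `entry` or `retry`. For Python responses, one alternative is to parse the code and look for an actual `try` statement. A sketch using the standard `ast` module; `has_try_except` is a hypothetical helper, and responses that are not valid Python are simply reported as having no error handling:

```python
import ast

def has_try_except(code):
    """Return True if code parses as Python and contains a try statement with an except handler."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    return any(isinstance(node, ast.Try) and node.handlers for node in ast.walk(tree))

without_handling = '''
def read_file(filename):
    with open(filename, 'r') as f:
        return f.read()
'''

with_handling = '''
def read_file(filename):
    try:
        with open(filename, 'r') as f:
            return f.read()
    except FileNotFoundError:
        return None
'''

print(has_try_except(without_handling))  # False
print(has_try_except(with_handling))     # True
```

A response wrapped in a Markdown fence would need the fence removed before parsing.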


In [27]:
# TEST 4: JavaScript async
print("=" * 60)
print("TEST 4: JavaScript Async Error Handling")
print("=" * 60)

response = generate("Write a JavaScript function to fetch data from an API")
print(response)

has_try_catch = 'try' in response and 'catch' in response
checks_response = 'response.ok' in response
print(f"\n{'✓' if has_try_catch else '✗'} Has try-catch: {has_try_catch}")
print(f"{'✓' if checks_response else '✗'} Checks response.ok: {checks_response}")

TEST 4: JavaScript Async Error Handling
function fetchData(url) {
  return fetch(url)
    .then(response => response.json())
    .then(data => data)
    .catch(error => console.error('Error fetching data:', error));
}

✗ Has try-catch: False
✗ Checks response.ok: False


In [28]:
# TEST 5: General coding (should still work)
print("=" * 60)
print("TEST 5: General Coding (Prime Check)")
print("=" * 60)

response = generate("Write a Python function to check if a number is prime")
print(response)

TEST 5: General Coding (Prime Check)
def is_prime(num):
    if num <= 1: 
        return False
    for i in range(2, int(num ** 0.5) + 1): 
        if num % i == 0: 
            return False
    return True


## Summary

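Training completed in 2.81 hours. In the spot checks above, the model flagged both the command injection and the SQL injection examples (Tests 1 and 2), but generated no error handling in Tests 3 and 4. The general coding check (Test 5) returned a working prime function.

**Expected improvements over v2 (not yet measured):**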
- Security review: 40% → 75%+
- Error handling: <10% → 60%+
- General coding: Maintained

**Next steps:**
1. Run full evaluation suite
2. Update HuggingFace Space
3. Compare v2 vs v3