# Qwen2.5-Coder-7B to ONNX Converter

This notebook converts Qwen2.5-Coder-7B-Instruct to ONNX format for use in Code Tutor.

**Steps:**
1. Install required packages
2. Download and convert the model
3. Upload to HuggingFace

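**Estimated time:** 30-45 minutes
**Output size:** ~15GB in float16, since the export below does not quantize the weights (~4-5GB would require 4-bit quantization)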
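
The size figures follow from simple arithmetic on the parameter count. Here is a rough sketch, assuming about 7.6 billion parameters (an approximate figure, not taken from this notebook) and ignoring tokenizer files and any per-format overhead:

```python
# Rough on-disk size of the model weights at different precisions.
# ASSUMPTION: ~7.6e9 parameters; the exact count and format overhead differ slightly.
NUM_PARAMS = 7.6e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_size_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight size in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{estimate_size_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions float16 comes to about 15 GB, which matches the download size reported during conversion, and 4-bit weights come to about 4 GB before overhead.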

## Step 1: Install Packages

**IMPORTANT:** After running this cell, restart the runtime (Runtime → Restart runtime)

In [None]:
# Install required packages
# accelerate is needed for device_map='auto' when loading the model in Step 2
!pip install -q transformers torch onnx onnxruntime accelerate
!pip install -q --upgrade optimum

print("✓ Packages installed")
print("")
print("NOW RESTART THE RUNTIME:")
print("Runtime → Restart runtime")

## Step 2: Convert Model to ONNX

Run this cell AFTER restarting the runtime
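
Before starting the long download, it may be worth confirming that the packages from Step 1 are visible after the restart. This is a minimal sketch using only the standard library. The `REQUIRED` list is an assumption that mirrors the `pip install` lines above:

```python
from importlib import metadata

# ASSUMPTION: the same packages that Step 1 installs.
REQUIRED = ["transformers", "torch", "onnx", "onnxruntime", "optimum"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package shows as not installed, rerun Step 1 and restart the runtime again.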

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from pathlib import Path
import os

# Disable warnings
import warnings
warnings.filterwarnings('ignore')

model_id = 'Qwen/Qwen2.5-Coder-7B-Instruct'
output_dir = Path('/content/qwen2.5-coder-7b-onnx')
output_dir.mkdir(exist_ok=True)

print("Loading model...")
print("This will download ~15GB")

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map='auto'
)

print("✓ Model loaded")
print()
print("Converting to ONNX...")

# Create dummy input
dummy_input = "Hello, I am a coding tutor."
inputs = tokenizer(dummy_input, return_tensors="pt").to(model.device)

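# Return logits only. By default the model also returns the KV cache,
# which would add outputs that output_names below does not name.
model.config.use_cache = False
model.config.return_dict = False
model.eval()

# Export to ONNX. For models over 2 GB the exporter typically stores weights
# as external data files next to model.onnx, so keep the directory together.
onnx_path = output_dir / "model.onnx"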

torch.onnx.export(
    model,
    (inputs['input_ids'],),
    str(onnx_path),
    input_names=['input_ids'],
    output_names=['logits'],
    dynamic_axes={
        'input_ids': {0: 'batch_size', 1: 'sequence_length'},
        'logits': {0: 'batch_size', 1: 'sequence_length'}
    },
    opset_version=14,
    do_constant_folding=True
)

# Save tokenizer files
tokenizer.save_pretrained(output_dir)

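# Create genai_config.json
# Architecture values are read from the loaded model config rather than
# hard-coded, so they always match the checkpoint that was exported.
import json
config = model.config
genai_config = {
    "model": {
        "type": "qwen2",
        "context_length": 32768,
        "vocab_size": config.vocab_size,
        "decoder": {
            "session_options": {
                "provider_options": []
            },
            "filename": "model.onnx",
            "hidden_size": config.hidden_size,
            "num_attention_heads": config.num_attention_heads,
            "num_hidden_layers": config.num_hidden_layers,
            "num_key_value_heads": config.num_key_value_heads
        }
    },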
    "search": {
        "max_length": 32768,
        "temperature": 1.0,
        "top_p": 1.0
    }
}

with open(output_dir / "genai_config.json", "w") as f:
    json.dump(genai_config, f, indent=2)

print()
print("✓ Conversion complete!")
print(f"Files saved to: {output_dir}")
print()
print("Files created:")
for f in output_dir.iterdir():
    size_mb = f.stat().st_size / (1024 * 1024)
    print(f'  - {f.name}: {size_mb:.1f} MB')
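
The listing above reports each file separately. A large export is often split into `model.onnx` plus many external weight files, so the combined size is usually the more useful number. This is a small sketch, assuming the same output path as the conversion cell. `total_size_gb` is a hypothetical helper, not part of any library:

```python
from pathlib import Path


def total_size_gb(directory) -> float:
    """Sum the sizes of all regular files under `directory`, in GB (10^9 bytes)."""
    files = (p for p in Path(directory).rglob("*") if p.is_file())
    return sum(p.stat().st_size for p in files) / 1e9


# ASSUMPTION: same output path as the conversion cell.
export_dir = Path("/content/qwen2.5-coder-7b-onnx")
if export_dir.exists():
    print(f"Total export size: {total_size_gb(export_dir):.2f} GB")
```

Comparing this total against the free disk space and the expected upload time can help catch an incomplete export before Step 3.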

## Step 3: Upload to HuggingFace

1. Get your HuggingFace token from https://huggingface.co/settings/tokens
2. Run the cell below and enter your token when prompted

In [None]:
!pip install -q huggingface-hub

from huggingface_hub import HfApi, login
from getpass import getpass

# Get token
token = getpass("Enter your HuggingFace token: ")
login(token=token)

# Set your username and repo name
username = "iminurdetails"  # Your username
repo_name = "Qwen2.5-Coder-7B-Instruct-ONNX"
repo_id = f"{username}/{repo_name}"

print(f"Uploading to {repo_id}...")
print("This may take 10-20 minutes depending on connection")

# Upload
api = HfApi()
api.create_repo(repo_id=repo_id, exist_ok=True)
api.upload_folder(
    folder_path='/content/qwen2.5-coder-7b-onnx',
    repo_id=repo_id,
    repo_type="model"
)

print()
print("✓ Upload complete!")
print(f"Model available at: https://huggingface.co/{repo_id}")
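
Typing the token on every run can be avoided by reading it from the `HF_TOKEN` environment variable when it is set. This is a minimal sketch. `resolve_token` is a hypothetical helper, and falling back to the prompt is an assumption about the preferred behaviour:

```python
import os
from getpass import getpass


def resolve_token(env=os.environ, prompt=getpass):
    """Return HF_TOKEN from the environment, or prompt for it if unset or empty."""
    token = env.get("HF_TOKEN")
    if token:
        return token.strip()
    return prompt("Enter your HuggingFace token: ").strip()
```

In the upload cell this would be used as `login(token=resolve_token())`.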

## Done!

Your model is now available on HuggingFace. The Code Tutor app will be updated to use:

```
https://huggingface.co/iminurdetails/Qwen2.5-Coder-7B-Instruct-ONNX
```
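
How Code Tutor fetches the files is not specified here. If it downloads them directly over HTTPS, individual files in a HuggingFace repo are served under a `resolve/<revision>/<filename>` path. The sketch below builds such URLs. `file_url` is a hypothetical helper, and the `main` revision is an assumption:

```python
def file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for one file in a HuggingFace model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


# ASSUMPTION: repo name and file names as produced by the cells above.
repo_id = "iminurdetails/Qwen2.5-Coder-7B-Instruct-ONNX"
for name in ("model.onnx", "genai_config.json"):
    print(file_url(repo_id, name))
```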