# SQLTemple-0.5B-Beta Training Pipeline

This notebook implements the complete end-to-end training pipeline for SQLTemple-0.5B-Beta using knowledge distillation from TinyLlama-1.1B-Chat-v1.0.

- **Teacher Model**: TinyLlama-1.1B-Chat-v1.0
- **Student Model**: 0.5B parameter architecture
- **Dataset**: Full Spider dataset (7,000 examples)
- **Method**: Knowledge Distillation + LoRA fine-tuning
- **Epochs**: Up to 25 (early stopping may end training sooner)
- **Output**: GGUF-ready model for C++ runtime

---

## Setup and Imports

In [None]:
%pip install llama-cpp-python transformers datasets peft torch accelerate deepspeed

In [None]:
from datasets import load_dataset
from transformers import (
    AutoTokenizer, AutoModelForCausalLM, AutoConfig,
    TrainingArguments, Trainer
)
from peft import LoraConfig, get_peft_model
import torch
import torch.nn.functional as F
import json
import os
from datetime import datetime

os.environ["WANDB_DISABLED"] = "true"

print("Starting SQLTemple-0.5B-Beta Training Pipeline")
print(f"Started at: {datetime.now()}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

## Dataset Loading

Loading the Spider training split (7,000 examples) for SQL training.

In [None]:
print("Loading Spider dataset...")
spider = load_dataset("xlangai/spider", split="train")

print(f"Spider: {len(spider):,} examples")
print("Sample Spider example:")
print(json.dumps(spider[0], indent=2)[:500] + "...")

## Tokenizer Setup

Loading TinyLlama tokenizer and configuring for chat format.

In [None]:
print("Loading tokenizer...")
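tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0", use_fast=True)
# Reuse EOS as the padding token, and pad on the right so the prompt always
# occupies the first positions (the label masking below relies on this).
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"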

print("Tokenizer loaded")
print(f"Vocab size: {tokenizer.vocab_size:,}")
print(f"EOS token: '{tokenizer.eos_token}' (ID: {tokenizer.eos_token_id})")
print(f"PAD token: '{tokenizer.pad_token}' (ID: {tokenizer.pad_token_id})")

## Data Preprocessing

Converting Spider dataset into instruction format for chat-style training.

In [None]:
def preprocess_spider(example):
    """Convert Spider example to instruction format"""
    question = example["question"]
    sql = example["query"]
    db_id = example.get("db_id", "")

    if db_id:
        user_prompt = f"Database: {db_id}\nQuestion: {question}"
    else:
        user_prompt = f"Question: {question}"

    prompt = f"<|system|>You are an SQL assistant. Answer in valid SQL.\n<|user|>{user_prompt}\n<|assistant|>"
    return {"question": question, "sql": sql, "prompt": prompt, "db_id": db_id}

print("Testing preprocessing function...")
sample_spider = preprocess_spider(spider[0])

print("Spider formatted example:")
print(f"Prompt: {sample_spider['prompt'][:200]}...")
print(f"SQL: {sample_spider['sql']}")
print(f"DB ID: {sample_spider['db_id']}")

In [None]:
def preprocess_example(example):
    """Tokenize one Spider example and mask the prompt so only the SQL is supervised."""
    processed = preprocess_spider(example)

    prompt = processed["prompt"]
    sql = processed["sql"]

    # Full training sequence: prompt, target SQL, then EOS so the model learns to stop.
    full_text = prompt + sql + tokenizer.eos_token

    tokenized = tokenizer(
        full_text,
        truncation=True,
        max_length=512,
        padding="max_length",
        return_tensors=None
    )

    input_ids = tokenized["input_ids"]
    attention_mask = tokenized["attention_mask"]

    # Tokenize the prompt without padding and count its tokens directly.
    # Counting non-pad IDs in a padded sequence is fragile because pad == EOS.
    prompt_length = len(tokenizer(prompt, truncation=True, max_length=512)["input_ids"])

    labels = list(input_ids)
    for i in range(len(labels)):
        # Exclude prompt tokens and padding positions from the loss.
        if i < prompt_length or attention_mask[i] == 0:
            labels[i] = -100

    return {
        "input_ids": input_ids,
        "labels": labels,
        "attention_mask": attention_mask
    }

print("Preprocessing full Spider dataset...")

tokenized_ds = spider.map(
    preprocess_example,
    remove_columns=spider.column_names,
    desc="Processing Spider"
)

print(f"Processed dataset: {len(tokenized_ds):,} examples")

example = tokenized_ds[0]
print(f"All sequences are 512 tokens: {len(example['input_ids']) == 512}")
print(f"Dataset features: {list(tokenized_ds.features.keys())}")
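
The masking rule used in `preprocess_example` can be illustrated on a toy sequence. The sketch below is standalone and uses made-up token IDs; `mask_labels` and `IGNORE_INDEX` are hypothetical names, not part of the notebook. It only shows which positions end up contributing to the loss.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_labels(input_ids, attention_mask, prompt_length):
    """Return labels that supervise only the non-padded tokens after the prompt."""
    labels = list(input_ids)
    for i in range(len(labels)):
        if i < prompt_length or attention_mask[i] == 0:
            labels[i] = IGNORE_INDEX
    return labels


# Toy sequence: 3 prompt tokens, 2 SQL tokens, 1 real EOS (id 2), 2 padding positions.
# Padding reuses the EOS id, so only the attention mask tells them apart.
input_ids = [1, 501, 502, 900, 901, 2, 2, 2]
attention_mask = [1, 1, 1, 1, 1, 1, 0, 0]

labels = mask_labels(input_ids, attention_mask, prompt_length=3)
print(labels)  # [-100, -100, -100, 900, 901, 2, -100, -100]
```

The real EOS stays supervised because its attention mask is 1, while the padding copies of the same ID are ignored.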

## Teacher Model Loading

Loading TinyLlama-1.1B as teacher model for knowledge distillation.

In [None]:
print("Loading teacher model (TinyLlama-1.1B)...")
teacher_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

teacher_params = sum(p.numel() for p in teacher_model.parameters())
print(f"Teacher model loaded: {teacher_params:,} parameters")
print(f"Teacher model size: ~{teacher_params * 2 / 1e9:.2f} GB (bf16)")

# Set teacher model to eval mode
teacher_model.eval()
for param in teacher_model.parameters():
    param.requires_grad = False

print("Teacher model set to evaluation mode")

## Student Model Architecture

Creating 0.5B parameter student model by scaling down TinyLlama architecture.

In [None]:
print("Creating student model architecture...")

teacher_config = AutoConfig.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

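# Build the student config from the teacher's: copy it to a dict,
# override the size fields, then instantiate a new config.
config_dict = teacher_config.to_dict()
config_dict.update({
    "num_hidden_layers": 12,
    "hidden_size": 1440,
    "intermediate_size": 5760,
    "num_attention_heads": 12,
    "num_key_value_heads": 4,
    # Recent Transformers releases store head_dim in the config dict; reset it
    # so the student does not inherit the teacher's value (1440 / 12 = 120).
    "head_dim": 1440 // 12
})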

student_config = teacher_config.__class__(**config_dict)

print(f"Student config:")
print(f"  Layers: {student_config.num_hidden_layers} (vs {teacher_config.num_hidden_layers})")
print(f"  Hidden size: {student_config.hidden_size} (vs {teacher_config.hidden_size})")
print(f"  Attention heads: {student_config.num_attention_heads} (vs {teacher_config.num_attention_heads})")

print("Initializing student model...")
student_model = AutoModelForCausalLM.from_config(
    student_config,
    torch_dtype=torch.bfloat16
)

student_params = sum(p.numel() for p in student_model.parameters())
print(f"Student model created: {student_params:,} parameters")
print(f"Student model size: ~{student_params * 2 / 1e9:.2f} GB (bf16)")
print(f"Size reduction: {(1 - student_params / teacher_params) * 100:.1f}%")
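
As a cross-check on the printed parameter count, the size of a Llama-style decoder can be estimated from the config values alone. This is a rough sketch under stated assumptions: a vocabulary of 32,000 tokens (assumed for the TinyLlama tokenizer), untied input and output embeddings, no bias terms, and a head dimension of `hidden_size / num_attention_heads`. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab_size, hidden_size, intermediate_size,
                      num_layers, num_heads, num_kv_heads, tied_embeddings=False):
    """Estimate the parameters of a Llama-style decoder (no biases, RMSNorm weights only)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim
    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # gate_proj, up_proj, and down_proj each hold hidden x intermediate weights.
    mlp = 3 * hidden_size * intermediate_size
    norms = 2 * hidden_size  # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = vocab_size * hidden_size
    lm_head = 0 if tied_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return embeddings + num_layers * per_layer + final_norm + lm_head


student_estimate = llama_param_count(
    vocab_size=32_000,  # assumption: not stated in the notebook
    hidden_size=1440,
    intermediate_size=5760,
    num_layers=12,
    num_heads=12,
    num_kv_heads=4,
)
print(f"Estimated student parameters: {student_estimate:,}")
```

If the `student_params` value reported by the notebook is noticeably smaller than this estimate, one thing worth checking is whether the attention head dimension is really 120.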

## LoRA Configuration for Student

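Wrapping the student with LoRA adapters (rank 8, alpha 64) on every attention and MLP projection. `get_peft_model` freezes the base weights, so only the adapter matrices are trained; because the student was built with `from_config`, those frozen base weights remain at their random initialization.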

In [None]:
print("Applying LoRA configuration to student model...")
lora_config = LoraConfig(
    r=8,
    lora_alpha=64,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

student_model = get_peft_model(student_model, lora_config)

print("LoRA Parameter Summary:")
student_model.print_trainable_parameters()
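
The trainable-parameter summary can be sanity-checked by hand: each adapted linear layer gains two low-rank matrices, which together hold `r * (in_features + out_features)` parameters. The sketch below applies that formula to the student dimensions from the config above, assuming a head dimension of 120 (so the key and value projections map 1440 to 480). The constant names are illustrative only.

```python
LORA_RANK = 8
LORA_ALPHA = 64
HIDDEN, INTERMEDIATE, NUM_LAYERS = 1440, 5760, 12
KV_DIM = 4 * (HIDDEN // 12)  # 4 key-value heads, each of dimension 120 (assumed)

# (in_features, out_features) of each targeted projection in one decoder layer.
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each adapter adds A (r x in) and B (out x r): r * (in + out) parameters.
per_layer = sum(LORA_RANK * (n_in + n_out) for n_in, n_out in target_shapes.values())
lora_params = NUM_LAYERS * per_layer
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the adapter output

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {lora_params:,}")
print(f"Adapter scaling (alpha / r): {scaling}")
```

Under these assumptions the adapters add roughly 3 million trainable parameters, well under 1% of the student's total.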

## Knowledge Distillation Loss

Custom trainer whose loss combines soft targets from the teacher (temperature-scaled KL divergence) with the student's hard-label cross-entropy loss.

In [None]:
class DistillationTrainer(Trainer):
    def __init__(self, teacher_model, temperature=4.0, alpha=0.7, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.teacher_model = teacher_model
        self.temperature = temperature
        self.alpha = alpha

    def compute_loss(self, model, inputs, return_outputs=False, num_items_in_batch=None):
        """
        Compute the combined distillation and cross-entropy loss.
        Accepts num_items_in_batch, which newer Transformers versions pass in.
        """
        student_outputs = model(**inputs)
        student_loss = student_outputs.loss

        # The teacher only needs the token inputs; omitting labels skips its own loss.
        with torch.no_grad():
            teacher_logits = self.teacher_model(
                input_ids=inputs["input_ids"],
                attention_mask=inputs["attention_mask"]
            ).logits.to(student_outputs.logits.device)

        # Logits at position t predict token t + 1, so keep only positions whose
        # next-token label is supervised. This drops prompt and padding positions.
        mask = inputs["labels"][:, 1:] != -100
        student_logits = student_outputs.logits[:, :-1][mask].float()
        teacher_logits = teacher_logits[:, :-1][mask].float()

        # KL(teacher || student), averaged over supervised tokens.
        # The T^2 factor keeps gradient magnitudes comparable across temperatures.
        distillation_loss = F.kl_div(
            F.log_softmax(student_logits / self.temperature, dim=-1),
            F.softmax(teacher_logits / self.temperature, dim=-1),
            reduction="batchmean"
        ) * (self.temperature ** 2)

        # Combined loss: alpha weights the soft targets, (1 - alpha) the hard labels.
        loss = self.alpha * distillation_loss + (1 - self.alpha) * student_loss

        return (loss, student_outputs) if return_outputs else loss

print("Distillation trainer class defined")
print("Temperature: 4.0")
print("Alpha (distillation weight): 0.7")
print("✅ Updated to handle newer Transformers API")
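
To make the loss concrete, here is a dependency-free sketch of the same computation for a single token position, using plain Python lists instead of tensors. The helper names (`softmax`, `kd_loss`, `combined_loss`) and the example logits are illustrative assumptions, not part of the trainer.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def combined_loss(distillation, cross_entropy, alpha):
    """Weighted sum used by the trainer: alpha on soft targets, the rest on hard labels."""
    return alpha * distillation + (1 - alpha) * cross_entropy


teacher = [4.0, 1.0, 0.0]
student = [2.0, 2.0, 0.5]

sharp_top = softmax(teacher, temperature=1.0)[0]
soft_top = softmax(teacher, temperature=4.0)[0]
print(f"Top-class probability at T=1: {sharp_top:.3f}, at T=4: {soft_top:.3f}")
print(f"Distillation loss at T=4: {kd_loss(student, teacher, 4.0):.4f}")
```

A higher temperature flattens the teacher distribution, so the student also receives signal about the relative ranking of the less likely tokens.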

## Training Configuration

Setting up training arguments for up to 25 epochs, then splitting the tokenized dataset 80/15/5 into train, evaluation, and test sets.

In [None]:
print("Setting up training arguments...")
args = TrainingArguments(
    output_dir="./sqltemple_0_5b_beta",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=25, 
    learning_rate=2e-4,
    weight_decay=0.01,
    max_grad_norm=1.0,
    warmup_steps=100,
    lr_scheduler_type="cosine",
    logging_steps=50,
    eval_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    bf16=True,
    dataloader_pin_memory=True,
    dataloader_num_workers=4,
    remove_unused_columns=False,
    report_to=[]
)

print("Training arguments configured")
print(f"Effective batch size: {args.per_device_train_batch_size * args.gradient_accumulation_steps}")
print(f"Epochs: {args.num_train_epochs}")
print(f"Learning rate: {args.learning_rate}")
print(f"Scheduler: {args.lr_scheduler_type}")
print(f"Mixed precision: {args.bf16}")

print("Preparing train/eval/test splits...")
dataset_size = len(tokenized_ds)
train_size = int(dataset_size * 0.8)
eval_size = int(dataset_size * 0.15)
test_size = dataset_size - train_size - eval_size

shuffled_ds = tokenized_ds.shuffle(seed=42)
train_dataset = shuffled_ds.select(range(train_size))
eval_dataset = shuffled_ds.select(range(train_size, train_size + eval_size))
test_dataset = shuffled_ds.select(range(train_size + eval_size, dataset_size))

print(f"Training examples: {len(train_dataset):,}")
print(f"Evaluation examples: {len(eval_dataset):,}")
print(f"Test examples: {len(test_dataset):,}")
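
The arguments and split sizes above determine how many optimizer steps the run can take and how the learning rate evolves. The sketch below works through that arithmetic under a few assumptions: a single device, 7,000 examples as stated in this notebook, and a schedule of linear warmup followed by a half-cosine decay to zero. `lr_at` is a hypothetical helper, and integer arithmetic is used for the split sizes to sidestep float rounding.

```python
import math

DATASET_SIZE = 7_000  # Spider training split size stated in this notebook
train_size = DATASET_SIZE * 80 // 100
eval_size = DATASET_SIZE * 15 // 100
test_size = DATASET_SIZE - train_size - eval_size

PER_DEVICE_BATCH, GRAD_ACCUM, EPOCHS = 2, 8, 25
WARMUP_STEPS, BASE_LR = 100, 2e-4

# Assumption: one device, so one optimizer step per GRAD_ACCUM micro-batches.
micro_batches = math.ceil(train_size / PER_DEVICE_BATCH)
steps_per_epoch = micro_batches // GRAD_ACCUM
total_steps = steps_per_epoch * EPOCHS


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (total_steps - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"Splits: train={train_size}, eval={eval_size}, test={test_size}")
print(f"Optimizer steps per epoch: {steps_per_epoch}, total: {total_steps}")
for step in (0, 50, 100, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```

With evaluation every 500 steps, a full run under these assumptions would perform 17 evaluations.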

## Training

Run knowledge distillation training, stopping early if the evaluation loss fails to improve for 5 consecutive evaluations.

In [None]:
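from transformers import default_data_collator, EarlyStoppingCallback

print("Setting up distillation training...")

# Examples are already padded to 512 tokens and carry prompt-masked labels,
# so they can be batched as-is. DataCollatorForLanguageModeling(mlm=False)
# would rebuild labels from input_ids and discard the prompt masking.
data_collator = default_data_collator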

trainer = DistillationTrainer(
    teacher_model=teacher_model,
    temperature=4.0,
    alpha=0.7,
    model=student_model,
    args=args,
    data_collator=data_collator,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    processing_class=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)]
)

print(f"Training started at: {datetime.now()}")
print("Training with knowledge distillation...")

train_result = trainer.train()

print(f"Training completed at: {datetime.now()}")
print(f"Training metrics: {train_result.metrics}")
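
The early stopping behaviour configured above can be pictured with a simplified model of patience-based stopping. This is a sketch, not the library's implementation: the real `EarlyStoppingCallback` also supports an improvement threshold, which is ignored here, and the loss values are invented for illustration.

```python
def early_stop_index(eval_losses, patience=5):
    """Return the index of the evaluation that triggers stopping, or None if none does."""
    best = float("inf")
    evaluations_without_improvement = 0
    for index, loss in enumerate(eval_losses):
        if loss < best:
            best = loss
            evaluations_without_improvement = 0
        else:
            evaluations_without_improvement += 1
            if evaluations_without_improvement >= patience:
                return index
    return None


# Invented evaluation losses: the best value occurs at index 2, then five misses follow.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6]
stop_at = early_stop_index(losses, patience=5)
print(f"Training would stop after evaluation #{stop_at}")
```

In this example the final value (0.6) is never reached, which is the usual trade-off of a patience setting: it saves compute but can miss a late improvement.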

## Model Saving

Saving the distilled student model and tokenizer.

In [None]:
print("Saving model...")

print("Merging LoRA adapters...")
merged_model = student_model.merge_and_unload()

print("Saving HuggingFace format...")
merged_model.save_pretrained(
    "./sqltemple-0.5b-beta-hf",
    safe_serialization=True,
    push_to_hub=False
)
tokenizer.save_pretrained("./sqltemple-0.5b-beta-hf")

print("Model saved successfully")

## Model Testing

Testing the distilled student model with sample SQL queries.

In [None]:
print("Testing distilled model...")
merged_model.eval()

test_prompts = [
    "<|system|>You are an SQL assistant. Answer in valid SQL.\n<|user|>Question: Get all users from the users table\n<|assistant|>",
    "<|system|>You are an SQL assistant. Answer in valid SQL.\n<|user|>Schema: products(id, name, price)\nQuestion: Find products with price greater than 100\n<|assistant|>",
    "<|system|>You are an SQL assistant. Answer in valid SQL.\n<|user|>Schema: employees(id, name, salary)\nQuestion: List employees with salary less than 50000\n<|assistant|>"
]

print("Testing with sample prompts:")
for i, prompt in enumerate(test_prompts, 1):
    print(f"\n--- Test {i} ---")
    print(f"Prompt: {prompt.split('<|assistant|>')[0]}")

    # Place the inputs on the same device as the model rather than assuming CUDA.
    inputs = tokenizer(prompt, return_tensors="pt").to(merged_model.device)

    with torch.no_grad():
        outputs = merged_model.generate(
            **inputs,
            max_new_tokens=100,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    sql_response = response.split("<|assistant|>")[-1].strip()
    print(f"Generated SQL: {sql_response}")

print("Model testing completed")
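
Because generation uses sampling with up to 100 new tokens, the decoded text can run past the first SQL statement. One possible post-processing step is sketched below; `extract_sql` is a hypothetical helper, and the decoded string is hand-written for illustration rather than real model output.

```python
def extract_sql(decoded_text, marker="<|assistant|>"):
    """Return the first SQL statement that follows the last assistant marker."""
    completion = decoded_text.split(marker)[-1].strip()
    # Keep everything up to and including the first semicolon, if there is one.
    statement, separator, _ = completion.partition(";")
    return (statement + separator).strip()


decoded = (
    "<|system|>You are an SQL assistant. Answer in valid SQL.\n"
    "<|user|>Question: Get all users from the users table\n"
    "<|assistant|> SELECT * FROM users; SELECT name FROM users"
)
print(extract_sql(decoded))  # SELECT * FROM users;
```

Cutting at the first semicolon is a heuristic: it would truncate a query that contains a semicolon inside a string literal.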

## Summary

Final summary of the knowledge distillation training process.

In [None]:
print("=" * 60)
print("SQLTEMPLE-0.5B-BETA TRAINING COMPLETED")
print("=" * 60)

print(f"\nTraining Summary:")
print(f"Teacher Model: TinyLlama-1.1B-Chat-v1.0")
print(f"Student Model: ~0.5B parameters")
print(f"Dataset: Spider (7,000 examples)")
print(f"Training Examples: {len(train_dataset):,}")
print(f"Method: Knowledge Distillation + LoRA")
print(f"Epochs: {args.num_train_epochs}")

print(f"\nOptimizations Applied:")
print(f"- Knowledge Distillation (T=4.0, α=0.7)")
print(f"- LoRA (r=8, α=64)")
print(f"- Mixed Precision Training (bf16)")
print(f"- Cosine Learning Rate Schedule")
print(f"- Gradient Clipping")
print(f"- Early Stopping")

print("=" * 60)

## Model Card Generation

Creating comprehensive model card with technical specifications and training details.

In [None]:
# Generate README.md for Hugging Face deployment
readme_content = f"""---
license: apache-2.0
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- sql
- text-to-sql
- code-generation
- knowledge-distillation
- peft
- lora
- spider
- database
- query-generation
language:
- en
datasets:
- xlangai/spider
pipeline_tag: text-generation
library_name: transformers
model_type: llama
quantized: false
inference:
  parameters:
    temperature: 0.7
    max_new_tokens: 100
    do_sample: true
widget:
- example_title: "Simple SELECT"
  text: "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Question: Get all users from the users table\\n<|assistant|>"
- example_title: "Conditional Query"
  text: "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Schema: products(id, name, price)\\nQuestion: Find products with price greater than 100\\n<|assistant|>"
- example_title: "Employee Query"
  text: "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Schema: employees(id, name, salary)\\nQuestion: List employees with salary less than 50000\\n<|assistant|>"
model-index:
- name: SQLTemple-0.5B-Beta
  results:
  - task:
      type: text-generation
      name: SQL Generation
    dataset:
      type: xlangai/spider
      name: Spider
      split: test
    metrics:
    - type: loss
      name: Training Loss
      value: "TBD"
    - type: loss
      name: Evaluation Loss
      value: "TBD"
---

# SQLTemple-0.5B-Beta

<div align="center">

![SQLTemple Logo](https://img.shields.io/badge/SQLTemple-0.5B--Beta-blue)
![License](https://img.shields.io/badge/license-Apache%202.0-green)
![Model Size](https://img.shields.io/badge/parameters-~{student_params//1000000}M-orange)
![Base Model](https://img.shields.io/badge/base-TinyLlama--1.1B-purple)

</div>

SQLTemple-0.5B-Beta is a compact SQL generation model created through knowledge distillation from TinyLlama-1.1B-Chat-v1.0. This model is specifically designed for SQL query generation from natural language descriptions.

## 🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("sqltemple/sqltemple-0.5b-beta")
model = AutoModelForCausalLM.from_pretrained(
    "sqltemple/sqltemple-0.5b-beta", 
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Generate SQL query
prompt = "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Question: Get all users from the users table\\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
sql_query = response.split("<|assistant|>")[-1].strip()
print(sql_query)  # Expected: SELECT * FROM users;
```

## 📋 Model Details

| **Attribute** | **Value** |
|---------------|-----------|
| **Model Name** | SQLTemple-0.5B-Beta |
| **Model Type** | Causal Language Model (Transformer Decoder) |
| **Base Model** | TinyLlama/TinyLlama-1.1B-Chat-v1.0 |
| **Parameters** | ~{student_params:,} parameters |
| **Size Reduction** | {(1 - student_params / teacher_params) * 100:.1f}% smaller than teacher model |
| **License** | Apache 2.0 |
| **Language** | SQL, English |
| **Training Method** | Knowledge Distillation + LoRA |

## 🏗️ Architecture

| **Component** | **Value** |
|---------------|-----------|
| **Layers** | {student_config.num_hidden_layers} |
| **Hidden Size** | {student_config.hidden_size} |
| **Attention Heads** | {student_config.num_attention_heads} |
| **Key-Value Heads** | {student_config.num_key_value_heads} (Grouped Query Attention) |
| **Intermediate Size** | {student_config.intermediate_size} |
| **Vocabulary Size** | {student_config.vocab_size:,} |
| **Max Position Embeddings** | {student_config.max_position_embeddings:,} |
| **Activation Function** | SiLU |

## 🎯 Training

### Method
Knowledge Distillation + LoRA (Low-Rank Adaptation) Fine-tuning

### Dataset
- **Name**: Spider
- **Source**: [xlangai/spider](https://huggingface.co/datasets/xlangai/spider)
- **Total Examples**: {len(spider):,}
- **Training Split**: {len(train_dataset):,} examples (80%)
- **Evaluation Split**: {len(eval_dataset):,} examples (15%)
- **Test Split**: {len(test_dataset):,} examples (5%)

### Training Configuration
| **Parameter** | **Value** |
|---------------|-----------|
| **Epochs** | {args.num_train_epochs} |
| **Batch Size** | {args.per_device_train_batch_size} (per device) |
| **Gradient Accumulation** | {args.gradient_accumulation_steps} steps |
| **Effective Batch Size** | {args.per_device_train_batch_size * args.gradient_accumulation_steps} |
| **Learning Rate** | {args.learning_rate} |
| **Scheduler** | {args.lr_scheduler_type} |
| **Precision** | bfloat16 |
| **Max Sequence Length** | 512 tokens |

### Knowledge Distillation
| **Parameter** | **Value** |
|---------------|-----------|
| **Temperature** | 4.0 |
| **Alpha (distillation weight)** | 0.7 |
| **Loss Function** | KL Divergence + Cross Entropy |
| **Soft Target Weight** | 70% |
| **Hard Target Weight** | 30% |

### LoRA Configuration
| **Parameter** | **Value** |
|---------------|-----------|
| **Rank (r)** | {lora_config.r} |
| **Alpha** | {lora_config.lora_alpha} |
| **Target Modules** | {', '.join(lora_config.target_modules)} |
| **Dropout** | {lora_config.lora_dropout} |
| **Bias** | {lora_config.bias} |

## 💻 Usage

### Input Format
The model uses a chat template with system, user, and assistant roles:

```
<|system|>You are an SQL assistant. Answer in valid SQL.
<|user|>{{user_prompt}}
<|assistant|>
```

### Example Queries

#### Basic Query
```python
prompt = "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Question: Show all customers\\n<|assistant|>"
# Expected output: SELECT * FROM customers;
```

#### Conditional Query
```python
prompt = "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Schema: orders(id, customer_id, total)\\nQuestion: Find orders with total greater than 100\\n<|assistant|>"
# Expected output: SELECT * FROM orders WHERE total > 100;
```

#### Aggregation Query
```python
prompt = "<|system|>You are an SQL assistant. Answer in valid SQL.\\n<|user|>Schema: sales(id, product_id, amount)\\nQuestion: What is the total sales amount?\\n<|assistant|>"
# Expected output: SELECT SUM(amount) FROM sales;
```

### Generation Parameters
| **Parameter** | **Recommended Value** |
|---------------|-----------------------|
| **max_new_tokens** | 100 |
| **temperature** | 0.7 |
| **do_sample** | True |
| **pad_token_id** | eos_token_id |

## 🎯 Intended Use

- ✅ **SQL query generation** from natural language
- ✅ **SQL code completion** and assistance
- ✅ **Educational SQL learning** tool
- ✅ **Integration into SQL IDEs** and development tools
- ✅ **Rapid prototyping** of database queries
- ✅ **SQL documentation** generation

## ⚠️ Limitations

- ❌ **Optimized primarily for SQL generation**, not general language tasks
- ❌ **Training focused on Spider dataset patterns**
- ❌ **May require fine-tuning** for domain-specific SQL dialects
- ❌ **Limited to 512 token context window**
- ❌ **No support for stored procedures** or advanced database features
- ❌ **May not handle very complex queries** with multiple joins

## 🔧 Performance

### Hardware Requirements
| **Component** | **Minimum** | **Recommended** |
|---------------|-------------|-----------------|
| **GPU Memory** | 8GB | 16GB+ |
| **System RAM** | 16GB | 32GB+ |
| **Inference Memory** | ~1GB (bfloat16) | ~2GB (float32) |
| **Storage** | 1GB | 2GB |

### Evaluation Metrics
- **Training Loss**: TBD (logged during training)
- **Evaluation Loss**: TBD (logged during training)
- **Perplexity**: TBD (calculated post-training)
- **Evaluation Strategy**: Steps-based evaluation every 500 steps
- **Best Model Selection**: Lowest evaluation loss
- **Early Stopping**: Patience of 5 evaluations

## 🔍 Technical Details

### Tokenizer
- **Type**: LlamaTokenizer (from TinyLlama)
- **EOS Token**: `</s>`
- **PAD Token**: `</s>`
- **Chat Tokens**: `<|system|>`, `<|user|>`, `<|assistant|>`

### Data Preprocessing
- **Instruction Format**: Chat-style with role markers
- **Label Masking**: System and user prompts masked with -100
- **Padding**: Right padding to max_length
- **Truncation**: Enabled at 512 tokens

### Model Outputs
- 📦 **HuggingFace format** (.safetensors)
- 📦 **GGUF format** (planned for C++ runtime)
- 📦 **LoRA adapters** (separate weights)

## 🔄 Reproducibility

| **Component** | **Value** |
|---------------|-----------|
| **Seed** | 42 |
| **Framework** | PyTorch, Transformers, PEFT, Datasets |
| **Training Script** | Jupyter notebook included |
| **Data Shuffle** | Enabled with fixed seed |
| **Hardware** | CUDA-compatible GPU |

## 📚 Citation

```bibtex
@misc{{sqltemple-0.5b-beta,
  title={{SQLTemple-0.5B-Beta: Knowledge Distilled SQL Generation Model}},
  author={{SQLTemple Development Team}},
  year={{2024}},
  note={{Model trained using knowledge distillation from TinyLlama-1.1B-Chat-v1.0}},
  url={{https://huggingface.co/sqltemple/sqltemple-0.5b-beta}}
}}
```

## 📄 Model Card

For detailed technical specifications, training metrics, and additional information, see the complete model card at `model_card.json`.

## 🤝 Contributing

We welcome contributions! Please see our [contributing guidelines](https://github.com/sqltemple/sqltemple) for more information.

## 📞 Support

- **GitHub Issues**: [Report bugs and feature requests](https://github.com/sqltemple/sqltemple/issues)
- **Discussions**: [Community discussions](https://github.com/sqltemple/sqltemple/discussions)
- **Documentation**: [Full documentation](https://sqltemple.github.io/docs)

---

<div align="center">

**Created by**: SQLTemple Project  
**Date**: {datetime.now().strftime('%Y-%m-%d')}  
**Version**: 1.0

[![GitHub](https://img.shields.io/badge/GitHub-sqltemple-blue)](https://github.com/sqltemple/sqltemple)
[![Hugging Face](https://img.shields.io/badge/🤗-Hugging%20Face-yellow)](https://huggingface.co/sqltemple)

</div>
"""

# Save README.md with proper HF frontmatter
readme_path = "./sqltemple-0.5b-beta-hf/README.md"
with open(readme_path, "w", encoding="utf-8") as f:
    f.write(readme_content)

print(f"✅ Hugging Face README.md saved to: {readme_path}")
print(f"📊 README.md length: {len(readme_content):,} characters")
print("\n🚀 Hugging Face Deployment Features Added:")
print("  ✅ YAML frontmatter with model metadata")
print("  ✅ License, tags, and pipeline information")
print("  ✅ Widget examples for HF interface")
print("  ✅ Model index with evaluation metrics")
print("  ✅ Proper formatting with badges and tables")
print("  ✅ HF-specific sections and links")