# ModelSignature Embedding - Simple Guide

Embed your ModelSignature URL into any HuggingFace model in three steps, followed by an optional test of the result.

**What you'll need:**
1. Your ModelSignature URL (from modelsignature.com after registering your model)
2. Your ModelSignature API Key (from your dashboard)
3. Your HuggingFace token (from huggingface.co/settings/tokens)

**Time required:** ~40-50 minutes on a free T4 GPU

## Step 1: Install Dependencies

In [None]:
# Install ModelSignature SDK with embedding support
!pip install 'modelsignature[embedding] @ git+https://github.com/ModelSignature/python-sdk.git' -q
!pip install accelerate bitsandbytes -q

print("✅ Installation complete!")
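Before moving on, it can help to confirm that the packages actually import in the current runtime. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is illustrative, and the list assumes the import names used elsewhere in this guide.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names assumed from the install commands and the later cells
required = ["modelsignature", "torch", "transformers", "peft", "accelerate", "bitsandbytes"]
missing = missing_packages(required)
if missing:
    print(f"Missing packages: {', '.join(missing)} (re-run the install cell)")
else:
    print("All required packages are importable.")
```

If anything is reported missing, restarting the runtime after re-running the install cell is usually worth trying first.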

## Step 2: Configuration

⚠️ **Update these values with your credentials**

In [None]:
# ═══════════════════════════════════════════════════════════
# ⚠️  REQUIRED: Your ModelSignature Information
# ═══════════════════════════════════════════════════════════

SIGNATURE_URL = "https://modelsignature.com/models/model_YOUR_ID_HERE"
API_KEY = "ms_YOUR_API_KEY_HERE"

# ═══════════════════════════════════════════════════════════
# ⚠️  REQUIRED: Model Selection 
# ═══════════════════════════════════════════════════════════

MODEL_NAME = "microsoft/DialoGPT-medium"  # HF model ID or local path (e.g., "./my-model")
HF_TOKEN = "hf_YOUR_TOKEN_HERE"  # Your HF token (used to download HF models and to push results)

# ═══════════════════════════════════════════════════════════
# 📤 OPTIONAL: Push to HuggingFace Hub
# ═══════════════════════════════════════════════════════════

PUSH_TO_HF = True  # True uploads the result; set to False to keep it local only
HF_REPO_ID = "your-username/your-model-name"  # Where to upload

# ═══════════════════════════════════════════════════════════
# ⚙️  ADVANCED: Training Configuration (usually don't need to change)
# ═══════════════════════════════════════════════════════════

RANK = 24  # LoRA rank (16-32, higher = better but slower)
EPOCHS = 6  # Training epochs (more = better but longer)
DATASET_SIZE = 300  # Training examples (more = better but longer)

print("✅ Configuration loaded!")
print(f"   Model: {MODEL_NAME}")
print(f"   Signature: {SIGNATURE_URL}")
print(f"   Push to HF: {PUSH_TO_HF}")
if PUSH_TO_HF:
    print(f"   HF Repo: {HF_REPO_ID}")
print(f"\n⏱️  Rough time estimate (assuming ~1 s per example per epoch): ~{EPOCHS * DATASET_SIZE / 60:.0f} minutes")
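Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. One alternative is to read them from environment variables and fall back to an interactive prompt. This is a sketch, not part of the ModelSignature SDK: the helper names `read_secret` and `looks_like_placeholder` are chosen here for illustration, and so is any environment variable name you pass in.

```python
import os
from getpass import getpass

def read_secret(env_var, prompt, env=os.environ, ask=getpass):
    """Return the secret stored in `env_var` if set, otherwise prompt for it."""
    value = env.get(env_var, "").strip()
    return value if value else ask(prompt).strip()

def looks_like_placeholder(value):
    """True if the value still contains the template text from this guide."""
    return "YOUR_" in value
```

For example, `API_KEY = read_secret("MODELSIGNATURE_API_KEY", "ModelSignature API key: ")` could replace the hard-coded assignment (the variable name is an assumption), and `looks_like_placeholder(SIGNATURE_URL)` can flag a value that was never filled in before the long training step starts.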

## Step 3: Embed ModelSignature

This will:
1. Validate that you own the ModelSignature URL
2. Download and prepare the model
3. Train the model to recognize feedback queries
4. Test the embedded model
5. (Optional) Upload to HuggingFace Hub

In [None]:
import modelsignature as msig
import torch

# Check GPU
if not torch.cuda.is_available():
    # Raise instead of calling exit(), which can shut down the notebook kernel
    raise RuntimeError("❌ No GPU detected! Please enable GPU: Runtime → Change runtime type → T4 GPU")

print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
print(f"   VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB\n")

# Start embedding
print("🚀 Starting ModelSignature embedding...")
print("="*60)

try:
    result = msig.embed_signature_link(
        model=MODEL_NAME,
        link=SIGNATURE_URL,
        api_key=API_KEY,  # Validates ownership
        mode="adapter",  # Save as LoRA adapter (smaller, faster)
        fp="4bit",  # 4-bit quantization for memory efficiency
        rank=RANK,
        epochs=EPOCHS,
        dataset_size=DATASET_SIZE,
        learning_rate=2e-4,
        batch_size=1,
        gradient_accumulation_steps=8,
        hf_token=HF_TOKEN,
        push_to_hf=PUSH_TO_HF,
        hf_repo_id=HF_REPO_ID if PUSH_TO_HF else None,
        evaluate=True,  # Test the model after training
        debug=False
    )

    if result.get('success'):
        print("\n" + "="*60)
        print("✅ SUCCESS! ModelSignature embedding complete!")
        print("="*60)
        print(f"📁 Saved locally to: {result['output_directory']}")
        
        if PUSH_TO_HF and 'huggingface_repo' in result:
            print(f"🤗 Uploaded to: {result['huggingface_repo']}")
        
        if 'evaluation' in result:
            metrics = result['evaluation']['metrics']
            print(f"\n📊 Performance Metrics:")
            print(f"   Overall Accuracy: {metrics.get('overall_accuracy', 0):.1%}")
            print(f"   Precision: {metrics.get('precision', 0):.1%}")
            print(f"   Recall: {metrics.get('recall', 0):.1%}")
            print(f"   F1 Score: {metrics.get('f1_score', 0):.1%}")
    else:
        print("\n❌ Embedding failed!")
        print(f"Error: {result.get('error', 'Unknown error')}")

except Exception as e:
    print(f"\n❌ Error: {e}")
    import traceback
    traceback.print_exc()
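
The `batch_size` and `gradient_accumulation_steps` arguments together determine how many examples contribute to each weight update. The arithmetic below is a rough sketch. It assumes every one of the `dataset_size` examples is used for training in each epoch and that a partial accumulation window at the end of an epoch still triggers an update. The SDK may hold out evaluation data or schedule steps differently.

```python
import math

def training_schedule(dataset_size, epochs, batch_size, grad_accum_steps):
    """Estimate forward passes and optimizer steps for a training run."""
    effective_batch = batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(dataset_size / effective_batch)
    return {
        "effective_batch_size": effective_batch,
        "forward_passes": math.ceil(dataset_size / batch_size) * epochs,
        "optimizer_steps": steps_per_epoch * epochs,
    }

# Values used in the call above
print(training_schedule(dataset_size=300, epochs=6, batch_size=1, grad_accum_steps=8))
# {'effective_batch_size': 8, 'forward_passes': 1800, 'optimizer_steps': 228}
```

Under these assumptions, raising `EPOCHS` or `DATASET_SIZE` increases the number of forward passes in direct proportion, so training time grows with it.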

## Step 4: Test Your Embedded Model

Let's verify the model responds correctly to feedback queries.

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from modelsignature.embedding.utils import format_chat_prompt

print("🧪 Testing embedded model...\n")

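# Load model
print("Loading model...")
# Step 3 reports the save location under 'output_directory'; fall back to it
# if the result has no 'final_model_path' entry
output_dir = result.get('final_model_path') or result['output_directory']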

tokenizer = AutoTokenizer.from_pretrained(
    output_dir,
    token=HF_TOKEN, 
    trust_remote_code=True
)

if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.float16,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True
)

model = PeftModel.from_pretrained(base_model, output_dir)
model.eval()

print("✅ Model loaded!\n")

# Test queries, each paired with whether the response should contain the URL
test_queries = [
    ("I would like to report a bug", True),
    ("Where can I give feedback?", True),
    ("How do I report issues?", True),
    ("What's the weather like?", False),  # Should NOT include URL
]

for query, expect_url in test_queries:
    print(f"❓ Query: {query}")

    # Format with chat template
    formatted = format_chat_prompt(
        tokenizer,
        user_message=query,
        add_generation_prompt=True
    )

    inputs = tokenizer(formatted, return_tensors="pt")
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs.get("attention_mask"),  # Avoids padding ambiguity
            max_new_tokens=150,
            do_sample=True,
            temperature=0.3,
            top_p=0.9,
            repetition_penalty=1.2,  # Prevent repetition
            no_repeat_ngram_size=3,  # Don't repeat 3-grams
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id
        )

    # Decode only the newly generated tokens. Matching the prompt as a string
    # is unreliable because skip_special_tokens strips template tokens from it.
    prompt_length = inputs["input_ids"].shape[1]
    generated = tokenizer.decode(
        outputs[0][prompt_length:], skip_special_tokens=True
    ).strip()

    has_url = SIGNATURE_URL.lower() in generated.lower()
    # Pass when the URL appears exactly where it is expected
    status = "✅" if has_url == expect_url else "❌"

    print(f"   {status} Response: {generated[:150]}...")
    print(f"   Contains URL: {has_url} (expected: {expect_url})\n")

print("✅ Testing complete!")

## ✅ Next Steps

### If you pushed to HuggingFace:
1. Visit your HuggingFace repository
2. The adapter weights are stored there and ready to use (publicly, unless the repository is private)
3. Deploy using HF Inference Endpoints, Replicate, or your own infrastructure

### If you saved locally:
1. Download the files from Colab (check the Files panel on the left)
2. The adapter is in the output directory
3. Deploy on your own infrastructure
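
Downloading a directory file by file from the Colab Files panel is tedious, so it can be convenient to bundle the adapter into a single archive first. The sketch below uses only the standard library. `archive_adapter` is an illustrative helper name, and the directory you pass in is whatever path the embedding step reported.

```python
import shutil
from pathlib import Path

def archive_adapter(output_dir, archive_name="modelsignature_adapter"):
    """Zip the contents of `output_dir` and return the archive's path."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"No such directory: {output_dir}")
    # make_archive appends the '.zip' extension to `archive_name`
    return shutil.make_archive(str(archive_name), "zip", root_dir=output_dir)
```

In Colab, the resulting file can then be fetched with `from google.colab import files` followed by `files.download(path)`, where `path` is the value returned by the helper.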

### How to use the embedded model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/DialoGPT-medium",  # Your base model
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load adapter
model = PeftModel.from_pretrained(
    base_model,
    "your-username/your-model-name"  # Your HF repo or local path
)

# Use normally - model will respond with ModelSignature URL
# when users ask about reporting issues or giving feedback
```

---

**Need help?** Visit https://modelsignature.com/docs