# AwesomeContext Qwen3-4B GPU Compilation

This notebook compiles all 122 modules using **Qwen3-4B** on a **T4 GPU** (free Colab tier).

> **Runtime Setup**: Go to `Runtime → Change runtime type → T4 GPU`

## Step 1: Verify GPU

In [None]:
!nvidia-smi
import torch
assert torch.cuda.is_available(), "No GPU detected! Change runtime to T4 GPU."
print(f"\n✅ GPU: {torch.cuda.get_device_name(0)}")
print(f"✅ VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
print(f"✅ PyTorch: {torch.__version__}")

## Step 2: Clone Repository & Install Dependencies

In [None]:
!git clone --recurse-submodules https://github.com/everest-an/AwesomeContext.git
%cd AwesomeContext
!pip install -e '.[dev]' -q
print("\n✅ Dependencies installed")

## Step 3: Dry Run — Preview Modules

In [None]:
!python -m src.compiler.cli \
    --model-name Qwen/Qwen3-4B \
    --dry-run

## Step 4: Full Compilation with Qwen3-4B

Compiles all 122 modules. Estimated time: **~2 minutes** on a T4 GPU (~1 s/module).
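
The estimate follows directly from the module count and the per-module rate quoted above. A minimal sketch of the arithmetic, where `estimate_minutes` is a hypothetical helper (not part of the repository) and the ~1 s/module figure is the notebook's own rough estimate rather than a measured benchmark:

```python
def estimate_minutes(num_modules: int, seconds_per_module: float) -> float:
    """Return the estimated wall-clock time in minutes."""
    return num_modules * seconds_per_module / 60


# 122 modules at roughly 1 second each (both figures taken from the text above)
print(f"Estimated compile time: {estimate_minutes(122, 1.0):.1f} min")
```

If the observed rate on your runtime differs, plugging it in gives a quick revised estimate before committing to the full run.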

In [None]:
!python -m src.compiler.cli \
    --model-name Qwen/Qwen3-4B \
    --output data/tensors-qwen3-4b \
    --index-dir data/index-qwen3-4b \
    --cache-dir data/cache-qwen3-4b

## Step 5: Verify Tensors

In [None]:
!python scripts/verify_tensors.py --tensor-dir data/tensors-qwen3-4b --index-dir data/index-qwen3-4b

## Step 6: Quality Analysis

In [None]:
import numpy as np
import json
from collections import Counter

embeddings = np.load('data/index-qwen3-4b/embeddings.npy')
with open('data/index-qwen3-4b/manifest.json', encoding='utf-8') as f:
    manifest = json.load(f)

entries = manifest['entries']
print(f'Modules: {len(entries)}')
print(f'Embedding dim: {embeddings.shape[1]}')
print(f'Expected dim: 2560 (Qwen3-4B hidden_size)')
print()

# Type distribution
types = Counter(e['module_type'] for e in entries)
for t, c in sorted(types.items()):
    print(f'  {t}: {c}')

# L2 norms
norms = np.linalg.norm(embeddings, axis=1)
print(f'\nL2 norms: min={norms.min():.6f}, max={norms.max():.6f}, all~1.0={np.allclose(norms, 1.0, atol=0.01)}')

# Semantic queries
def find_idx(mid):
    for i, e in enumerate(entries):
        if e['module_id'] == mid:
            return i
    return None

print('\n--- Security nearest neighbors ---')
sec = find_idx('rules/common--security')
if sec is not None:
    sims = embeddings @ embeddings[sec]
    for idx in np.argsort(sims)[::-1][:6]:
        if idx != sec:
            print(f'  {entries[idx]["module_id"]:45s} {sims[idx]:.4f}')

print('\n--- Testing nearest neighbors ---')
tst = find_idx('rules/common--testing')
if tst is not None:
    sims = embeddings @ embeddings[tst]
    for idx in np.argsort(sims)[::-1][:6]:
        if idx != tst:
            print(f'  {entries[idx]["module_id"]:45s} {sims[idx]:.4f}')

# Global discrimination
all_sims = embeddings @ embeddings.T
mask = ~np.eye(len(entries), dtype=bool)
print(f'\nGlobal avg sim: {all_sims[mask].mean():.4f}')
print(f'Min sim: {all_sims[mask].min():.4f}')
print(f'Max sim: {all_sims[mask].max():.4f}')
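
The nearest-neighbor queries above use a plain dot product as the similarity score. That equals cosine similarity only when every row has unit L2 norm, which is what the norm check is for. A small self-contained sketch on synthetic data illustrates this; the `top_k_neighbors` helper and the random vectors are illustrative assumptions, not part of the repository:

```python
import numpy as np


def top_k_neighbors(embeddings: np.ndarray, query_idx: int, k: int = 5):
    """Return (index, similarity) pairs for the k rows most similar to row query_idx.

    Assumes rows are L2-normalized, so the dot product is cosine similarity.
    """
    sims = embeddings @ embeddings[query_idx]
    order = np.argsort(sims)[::-1]          # highest similarity first
    order = order[order != query_idx][:k]   # drop the query row itself
    return [(int(i), float(sims[i])) for i in order]


# Synthetic stand-in for embeddings.npy: 10 random vectors, scaled to unit length
rng = np.random.default_rng(0)
raw = rng.normal(size=(10, 8))
unit = raw / np.linalg.norm(raw, axis=1, keepdims=True)

# For unit vectors, the dot product matches the explicit cosine formula
cosine = (raw @ raw[0]) / (np.linalg.norm(raw, axis=1) * np.linalg.norm(raw[0]))
assert np.allclose(unit @ unit[0], cosine)

for idx, sim in top_k_neighbors(unit, query_idx=0, k=3):
    print(f"  row {idx}: {sim:.4f}")
```

If the norm check reports that rows are not close to 1.0, the raw dot products are no longer comparable across modules and the embeddings should be normalized first.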

## Step 7: Test Decode Quality

Latent-to-text decoding is where Qwen3-4B should outperform the 1.5B model.

In [None]:
import torch
from src.adapter.model_wrapper import AdaptedModelWrapper
from src.compiler.persistence import load_module_tensor

wrapper = AdaptedModelWrapper(model_name="Qwen/Qwen3-4B")

test_modules = [
    'rules/common--security',
    'skills/python-testing',
    'agents/architect',
]

for mid in test_modules:
    print(f'\n{"="*60}')
    print(f'Decoding: {mid}')
    print('='*60)
    embedding = load_module_tensor('data/tensors-qwen3-4b', mid, 'mean_embedding')
    if isinstance(embedding, torch.Tensor):
        embedding = embedding.to(wrapper.device)
    text = wrapper.decode_from_latent(embedding, intent='Summarize this module')
    print(text[:500])

wrapper.cleanup()

## Step 8: Package & Download

Download the compiled tensors, index, and cache for use on your local machine.

In [None]:
import shutil

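# Package the tensors into a zip, with paths relative to data/ so the
# archive unpacks cleanly into a local data/ directory
shutil.make_archive(
    'awesome-context-qwen3-4b',
    'zip',
    root_dir='data',
    base_dir='tensors-qwen3-4b'
)

# Add the index and cache to the same archive (also relative to data/)
!cd data && zip -r ../awesome-context-qwen3-4b.zip index-qwen3-4b/ cache-qwen3-4b/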

# Download
from google.colab import files
files.download('awesome-context-qwen3-4b.zip')
print(f'\n✅ Download started. Unzip into your local data/ directory.')
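
Because the archive is meant to be unzipped into a local `data/` directory, it can be worth checking its top-level folders before relying on it. A minimal sketch using the standard-library `zipfile` module; `top_level_dirs` is a hypothetical helper, and the demo builds a tiny in-memory archive with placeholder file names rather than reading the real one:

```python
import io
import zipfile


def top_level_dirs(zip_source) -> set:
    """Return the set of top-level path components inside a zip archive."""
    with zipfile.ZipFile(zip_source) as zf:
        return {name.split('/', 1)[0] for name in zf.namelist() if name}


# Demo archive with the layout the unzip step expects (placeholder file names)
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('tensors-qwen3-4b/example.bin', b'')
    zf.writestr('index-qwen3-4b/manifest.json', '{}')
buf.seek(0)

print(sorted(top_level_dirs(buf)))
```

In the notebook, calling `top_level_dirs('awesome-context-qwen3-4b.zip')` should list folders such as `tensors-qwen3-4b` and `index-qwen3-4b`. A top-level `data` entry would mean part of the archive unpacks one level too deep.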

## Local Usage After Download

```bash
# On your local machine:
cd AwesomeContext
unzip awesome-context-qwen3-4b.zip -d data/

# Point the server to the Qwen3-4B tensors:
export AC_TENSOR_DIR=data/tensors-qwen3-4b
export AC_INDEX_DIR=data/index-qwen3-4b
python scripts/serve.py
```
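
After unzipping, a quick consistency check can confirm the index arrived intact. A minimal sketch, assuming the layout used in Step 6 (`embeddings.npy` plus a `manifest.json` with an `entries` list, one entry per embedding row); `check_index` is a hypothetical helper, not part of the repository:

```python
import json
from pathlib import Path

import numpy as np


def check_index(index_dir: str) -> int:
    """Load the index, verify embeddings and manifest agree, return the module count."""
    index_path = Path(index_dir)
    embeddings = np.load(index_path / 'embeddings.npy')
    with open(index_path / 'manifest.json', encoding='utf-8') as f:
        entries = json.load(f)['entries']
    if embeddings.shape[0] != len(entries):
        raise ValueError(
            f'{embeddings.shape[0]} embedding rows but {len(entries)} manifest entries'
        )
    return len(entries)


if __name__ == '__main__':
    index_dir = Path('data/index-qwen3-4b')
    if index_dir.exists():
        print(f'Index OK: {check_index(str(index_dir))} modules')
    else:
        print(f'{index_dir} not found; unzip the archive first')
```

Run it from the repository root so the relative `data/index-qwen3-4b` path resolves.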