# vLoRA Quickstart

**Shared low-rank subspaces for efficient LoRA adapter management.**

This notebook demonstrates vlora's core workflow: compress multiple LoRA adapters into a shared subspace, reconstruct them on demand, merge adapters, and analyze their similarity. Everything here runs on CPU; no GPU is needed.

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/tveseli/vlora/blob/main/examples/quickstart.ipynb)

In [None]:
!pip install -q vlora

## 1. Create Synthetic Adapters

In practice you'd load real PEFT adapters from disk or the Hugging Face Hub. Here we create synthetic ones that share a low-rank structure, simulating adapters fine-tuned on related tasks.

In [None]:
import torch
from vlora import LoRAWeights, SharedSubspace

# Simulate 8 adapters sharing a 3D subspace
layers = ["model.layers.0.self_attn.q_proj", "model.layers.0.self_attn.v_proj"]
rank, dim = 16, 512

# Shared structure (unknown to vlora — it discovers this via SVD)
shared_a = {l: torch.randn(3, rank * dim) for l in layers}
shared_b = {l: torch.randn(3, dim * rank) for l in layers}

adapters = []
for i in range(8):
    lora_a = {l: (torch.randn(3) @ shared_a[l] + torch.randn(rank * dim) * 0.01).reshape(rank, dim) for l in layers}
    lora_b = {l: (torch.randn(3) @ shared_b[l] + torch.randn(dim * rank) * 0.01).reshape(dim, rank) for l in layers}
    adapters.append(LoRAWeights(layer_names=layers, lora_a=lora_a, lora_b=lora_b, rank=rank))

print(f"Created {len(adapters)} adapters: rank={rank}, dim={dim}, layers={len(layers)}")

## 2. Build Shared Subspace

SVD discovers a basis shared across all adapters. Each adapter is then stored as a small loadings vector (its coefficients over that basis) instead of full weight matrices.
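
To see where the savings come from, the parameter counts can be worked out by hand. The sketch below is plain Python and does not call vlora. The helper `subspace_param_counts` is hypothetical, and its accounting (one set of `k` basis vectors per A/B matrix slot, plus `k` loadings per adapter per slot) is an assumption that may differ from what `compression_stats()` reports.

```python
def subspace_param_counts(num_adapters, num_layers, rank, dim, k):
    """Count stored floats before and after compression (assumed accounting)."""
    per_matrix = rank * dim                  # one LoRA A or B matrix, flattened
    slots = num_layers * 2                   # an A and a B matrix per layer
    original = num_adapters * slots * per_matrix
    basis = slots * k * per_matrix           # k shared basis vectors per slot
    loadings = num_adapters * slots * k      # k coefficients per adapter per slot
    compressed = basis + loadings
    return original, compressed, original / compressed


# Same sizes as this notebook: 8 adapters, 2 layers, rank 16, dim 512, k = 4.
original, compressed, ratio = subspace_param_counts(8, 2, 16, 512, 4)
print(f"original={original:,} compressed={compressed:,} ratio={ratio:.2f}x")
```

Under this accounting the basis is a fixed cost, so the ratio is only about 2x for 8 adapters but grows as more adapters share the same basis (roughly 24.7x for 100 adapters of the same shape).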

In [None]:
# The synthetic data has 3 shared directions, so k=4 keeps one spare component
subspace = SharedSubspace.from_adapters(adapters, num_components=4)

print(f"Components (k): {subspace.num_components}")
print(f"Tasks: {len(subspace.tasks)}")
print(f"Layers: {len(subspace.layer_names)}")

# Compression stats
stats = subspace.compression_stats()
print(f"\nCompression ratio: {stats['compression_ratio']:.1f}×")
print(f"Original params:   {stats['total_params_original']:,}")
print(f"Compressed params:  {stats['total_params_compressed']:,}")

## 3. Reconstruct & Measure Error

Reconstruct any task back to full LoRA weights and measure the relative error against the original adapter.
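
The two quantities used in this section can be sketched in plain Python. Both helpers are hypothetical illustrations, not vlora functions. In particular, `cumulative_explained_variance` assumes the ratio is cumulative over squared singular values, which is one common definition and may not match `explained_variance_ratio` exactly.

```python
import math


def relative_error(recon, original):
    """Norm of the difference divided by the norm of the original (flattened weights)."""
    diff = math.sqrt(sum((r - o) ** 2 for r, o in zip(recon, original)))
    return diff / math.sqrt(sum(o ** 2 for o in original))


def cumulative_explained_variance(singular_values):
    """Share of total squared singular-value mass captured by the first k components."""
    squares = [s ** 2 for s in singular_values]
    total = sum(squares)
    ratios, running = [], 0.0
    for sq in squares:
        running += sq
        ratios.append(running / total)
    return ratios


original = [3.0, 4.0, 0.0]
recon = [3.0, 4.0, 0.5]
print(f"relative error: {relative_error(recon, original):.3f}")

ratios = cumulative_explained_variance([4.0, 2.0, 1.0, 0.5])
print([f"{r:.1%}" for r in ratios])
```

A relative error near zero together with a cumulative ratio near 100% at index `k - 1` both indicate that `k` components capture almost all of the structure.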

In [None]:
from vlora import explained_variance_ratio

# Relative reconstruction error per task (task IDs follow the task_<index> pattern)
for tid in ["task_0", "task_3", "task_7"]:
    recon = subspace.reconstruct(tid)
    idx = int(tid.split("_")[1])
    original = adapters[idx]
    
    errors = []
    for l in layers:
        err_a = (recon.lora_a[l] - original.lora_a[l]).norm() / original.lora_a[l].norm()
        err_b = (recon.lora_b[l] - original.lora_b[l]).norm() / original.lora_b[l].norm()
        errors.extend([err_a.item(), err_b.item()])
    
    print(f"{tid}: mean_error={sum(errors)/len(errors):.4f}, max_error={max(errors):.4f}")

# Variance explained
first_layer = subspace.layer_names[0]
var_a = explained_variance_ratio(subspace.singular_values_a[first_layer])
var_b = explained_variance_ratio(subspace.singular_values_b[first_layer])
k = subspace.num_components
print(f"\nVariance explained at k={k}: A={var_a[k-1]:.1%}, B={var_b[k-1]:.1%}")

## 4. Project & Absorb New Adapters

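Add new adapters without rebuilding the subspace from scratch. `project` maps an adapter onto the existing basis with no SVD recompute, while `absorb_incremental` also updates the basis with new directions.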
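
The trade-off between the two paths is easier to see on a toy case. The following plain-Python sketch assumes an orthonormal basis (as SVD produces) and uses a hypothetical `project` helper that is unrelated to vlora's method of the same name. It shows that projection is lossless only when the new vector already lies in the span of the basis.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(basis, vec):
    """Return (loadings, reconstruction) of vec in an orthonormal basis."""
    loadings = [dot(b, vec) for b in basis]
    recon = [sum(c * b[i] for c, b in zip(loadings, basis)) for i in range(len(vec))]
    return loadings, recon


# Toy orthonormal basis spanning the first two axes of a 3-D space.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

inside = [2.0, -1.0, 0.0]   # lies in the span: projection loses nothing
outside = [2.0, -1.0, 0.5]  # has a component the basis cannot express

for name, vec in [("inside", inside), ("outside", outside)]:
    loadings, recon = project(basis, vec)
    residual = math.sqrt(sum((v - r) ** 2 for v, r in zip(vec, recon)))
    print(f"{name}: loadings={loadings}, residual={residual:.2f}")
```

The nonzero residual for `outside` is the part a fixed basis drops. Updating the basis is the way to recover it, at the cost of extra computation.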

In [None]:
# Create a new adapter
new_a = {l: (torch.randn(3) @ shared_a[l]).reshape(rank, dim) for l in layers}
new_b = {l: (torch.randn(3) @ shared_b[l]).reshape(dim, rank) for l in layers}
new_adapter = LoRAWeights(layer_names=layers, lora_a=new_a, lora_b=new_b, rank=rank)

# Project (fast, no SVD recompute)
proj = subspace.project(new_adapter, "new_fast")
subspace.add_task(proj)
print(f"After project: {len(subspace.tasks)} tasks")

# Absorb incremental (updates basis with new directions)
another = LoRAWeights(
    layer_names=layers,
    lora_a={l: (torch.randn(3) @ shared_a[l]).reshape(rank, dim) for l in layers},
    lora_b={l: (torch.randn(3) @ shared_b[l]).reshape(dim, rank) for l in layers},
    rank=rank,
)
subspace.absorb_incremental(another, "new_absorbed")
print(f"After absorb:  {len(subspace.tasks)} tasks")

## 5. Adapter Merging

Merge multiple adapters into one using task arithmetic, TIES, or DARE.
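
For intuition, here is a minimal sketch of the three ideas on flat lists of numbers. The `*_flat` helpers are hypothetical and written for illustration only. vlora's functions operate on `LoRAWeights` objects and may differ in details such as tie handling or how the DARE results are combined.

```python
import random


def task_arithmetic_flat(vectors, weights):
    """Weighted sum of task vectors, position by position."""
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(len(vectors[0]))]


def ties_flat(vectors, density):
    """Trim small entries, elect a sign per position, average the agreeing entries."""
    trimmed = []
    for v in vectors:
        keep = max(1, round(density * len(v)))
        cutoff = sorted((abs(x) for x in v), reverse=True)[keep - 1]
        trimmed.append([x if abs(x) >= cutoff else 0.0 for x in v])
    merged = []
    for column in zip(*trimmed):
        sign = 1.0 if sum(column) >= 0 else -1.0   # elected sign: sign of the summed mass
        agreeing = [x for x in column if x * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def dare_flat(vector, drop_rate, rng):
    """Randomly zero entries and rescale the survivors by 1 / (1 - drop_rate)."""
    scale = 1.0 / (1.0 - drop_rate)
    return [0.0 if rng.random() < drop_rate else x * scale for x in vector]


vectors = [
    [0.9, -0.1, 0.4, 0.0],
    [0.8, 0.1, -0.6, 0.05],
    [-0.2, 0.7, 0.3, 0.05],
]

merged_avg_flat = task_arithmetic_flat(vectors, [0.5, 0.3, 0.2])
merged_ties_flat = ties_flat(vectors, density=0.5)
dropped = dare_flat(vectors[0], drop_rate=0.5, rng=random.Random(42))

print("task arithmetic:", [round(x, 3) for x in merged_avg_flat])
print("TIES:           ", [round(x, 3) for x in merged_ties_flat])
print("DARE (1 vector):", [round(x, 3) for x in dropped])
```

In the toy TIES run, the `-0.6` entry at index 2 loses the sign vote and is left out of the average, which is how TIES resolves conflicting updates. DARE's rescaling keeps the expected value of each entry unchanged despite the random drops.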

In [None]:
from vlora import task_arithmetic, ties_merge, dare_merge

subset = adapters[:3]

# Task arithmetic (weighted average)
merged_avg = task_arithmetic(subset, weights=[0.5, 0.3, 0.2])
print(f"Task arithmetic: rank={merged_avg.rank}, layers={len(merged_avg.layer_names)}")

# TIES (trim + elect sign + merge)
merged_ties = ties_merge(subset, density=0.5)
print(f"TIES merge:      rank={merged_ties.rank}, layers={len(merged_ties.layer_names)}")

# DARE (drop and rescale)
merged_dare = dare_merge(subset, drop_rate=0.5, seed=42)
print(f"DARE merge:      rank={merged_dare.rank}, layers={len(merged_dare.layer_names)}")

## 6. Adapter Analysis

Analyze similarity between adapters and find redundant clusters.
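
A rough picture of what such an analysis involves: flatten each adapter to a vector, compare pairs by cosine similarity, then group adapters whose similarity clears a threshold. The sketch below is plain Python with hypothetical helpers. Cosine similarity and connected-component grouping are assumptions made for illustration and may not be how `compute_similarity_matrix` and `find_clusters` work internally.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def similarity_matrix(vectors):
    return [[cosine(u, v) for v in vectors] for u in vectors]


def threshold_clusters(sim, threshold):
    """Group indices linked by any chain of pairs with similarity >= threshold."""
    n = len(sim)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and sim[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(group))
    return clusters


# Four toy "adapters": 0 and 1 point the same way, as do 2 and 3.
flat = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [0.0, 1.0, 0.0], [0.1, 0.95, 0.0]]
toy_sim = similarity_matrix(flat)
toy_clusters = threshold_clusters(toy_sim, threshold=0.9)
print(toy_clusters)  # [[0, 1], [2, 3]]
```

Adapters that land in the same cluster are candidates for merging or deduplication, since they encode nearly the same update direction.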

In [None]:
from vlora import compute_similarity_matrix, find_clusters

sim = compute_similarity_matrix(adapters)
print("Similarity matrix (first 4):")
for i in range(4):
    row = " ".join(f"{sim[i,j]:.3f}" for j in range(4))
    print(f"  task_{i}: {row}")

clusters = find_clusters(sim, threshold=0.9)
print(f"\nClusters at threshold=0.9: {len(clusters)}")
for i, c in enumerate(clusters):
    print(f"  Cluster {i}: {['task_' + str(j) for j in c]}")

## 7. Save & Load

Subspaces serialize to a directory containing safetensors weights and JSON metadata.

In [None]:
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    subspace.save(os.path.join(tmp, "subspace"))
    files = os.listdir(os.path.join(tmp, "subspace"))
    print(f"Saved files: {sorted(files)}")
    
    loaded = SharedSubspace.load(os.path.join(tmp, "subspace"))
    print(f"Loaded: k={loaded.num_components}, tasks={len(loaded.tasks)}, layers={len(loaded.layer_names)}")
    
    # Verify roundtrip
    recon_before = subspace.reconstruct("task_0")
    recon_after = loaded.reconstruct("task_0")
    err = (recon_before.lora_a[layers[0]] - recon_after.lora_a[layers[0]]).norm().item()
    print(f"Roundtrip error: {err:.2e}")

## Next Steps

- **Real adapters**: `pip install "vlora[hub]"` (the quotes keep shells such as zsh from expanding the brackets) and use `load_adapter_from_hub("repo/name")`
- **CLI**: `vlora compress`, `vlora merge`, `vlora analyze` from the command line
- **Training**: Use `SubspaceTrainer` or `VLoRACallback` with HF Trainer
- **Inference**: Wrap your model with `VLoRAModel` for instant adapter switching
- **Docs**: [github.com/tveseli/vlora](https://github.com/tveseli/vlora)