# Lovli Validation Run (Colab GPU)

This notebook validates the `lovli_laws_v2` reindex on Colab (H100/T4 compatible).

It runs:
- `scripts/validate_reindex.py`
- `scripts/bench_editorial_precision.py`
- `scripts/sweep_retrieval_thresholds.py`

The setup avoids two common config pitfalls (a missing `OPENROUTER_API_KEY`, and `SWEEP_SAMPLE_SIZE` set to the literal string `'None'`) and applies conservative editorial defaults that keep editorial notes out of non-history queries.
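The two pitfalls named above can be caught before any script runs. The following is a minimal sketch, not the project's own validation: `parse_sample_size` and `missing_env` are hypothetical helper names, and how the Lovli scripts actually parse `SWEEP_SAMPLE_SIZE` is an assumption here.

```python
import os

# Strings that should be treated as "no sample size set" (assumed convention).
_NULLISH = {"", "none", "null"}


def parse_sample_size(raw):
    """Return a positive int, or None when the value means 'run everything'."""
    if raw is None or raw.strip().lower() in _NULLISH:
        return None
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"sample size must be positive, got {value}")
    return value


def missing_env(names, env=None):
    """List the variable names that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

Calling `missing_env(['QDRANT_URL', 'QDRANT_API_KEY', 'OPENROUTER_API_KEY'])` after the configuration cell should return an empty list if everything required in this notebook is set.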

## 1. Runtime and Repository Setup

Use a **GPU runtime** before running this notebook (H100 preferred, T4 supported).

In [None]:
%cd /content
!rm -rf lovli
!git clone https://github.com/AndreasRamsli/lovli.git
%cd /content/lovli

# Install project with dependencies required by validation scripts.
%pip install -q -U pip
%pip install -q -e .

# Safety net in case the editable install is not yet visible on sys.path in this kernel session.
import sys
from pathlib import Path
src_path = str(Path('/content/lovli/src'))
if src_path not in sys.path:
    sys.path.insert(0, src_path)

print('Setup complete')
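
To confirm that the install (or the `sys.path` fallback) took effect, the import can be probed without actually importing the package. This is a minimal sketch; `is_importable` is a hypothetical helper, and the package name `lovli` used in the note below is an assumption based on the repository name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located on sys.path."""
    return importlib.util.find_spec(module_name) is not None
```

If `is_importable('lovli')` returns `False` right after the setup cell, restarting the runtime and rerunning the cell is a reasonable first step.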

In [None]:
import torch
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    name = torch.cuda.get_device_name(0)
    props = torch.cuda.get_device_properties(0)
    print(f'GPU: {name}')
    print(f'VRAM: {props.total_memory / (1024**3):.1f} GB')

## 2. Environment Configuration

In [None]:
import os
import getpass

# Required Qdrant settings
os.environ['QDRANT_URL'] = 'https://acc5c492-7d2c-4b95-b0c5-2931ff2ecebd.eu-west-1-0.aws.cloud.qdrant.io'
os.environ['QDRANT_API_KEY'] = getpass.getpass('Qdrant API key: ')
os.environ['QDRANT_COLLECTION_NAME'] = 'lovli_laws_v2'

# Required by Settings model even for validation-only scripts.
# Use a real key if you intend to run answer-generation paths.
os.environ['OPENROUTER_API_KEY'] = os.environ.get('OPENROUTER_API_KEY', 'dummy')

# Keep tracing off for faster runs and cleaner logs.
os.environ['LANGCHAIN_TRACING_V2'] = 'false'
os.environ['LANGSMITH_TRACING'] = 'false'

# Skip extra index scan in sweep preflight (optional optimization).
os.environ['SWEEP_SKIP_INDEX_SCAN'] = 'true'

# Editorial tuning profile for validation:
# - Keep editorial out of non-history queries
# - Still allow editorial for history-intent prompts
os.environ['EDITORIAL_BASE_MAX_NOTES'] = '0'
os.environ['EDITORIAL_CONTEXT_BUDGET_RATIO'] = '0.0'
os.environ['EDITORIAL_MAX_NOTES'] = '3'
os.environ['EDITORIAL_HISTORY_INTENT_BOOST'] = '2'

# Guard against accidental string values like 'None'.
raw = os.environ.get('SWEEP_SAMPLE_SIZE')
if raw is not None and raw.strip().lower() in {'', 'none', 'null'}:
    os.environ.pop('SWEEP_SAMPLE_SIZE', None)

print('QDRANT_COLLECTION_NAME=', os.environ['QDRANT_COLLECTION_NAME'])
print('SWEEP_SAMPLE_SIZE=', os.environ.get('SWEEP_SAMPLE_SIZE'))
print('EDITORIAL_BASE_MAX_NOTES=', os.environ['EDITORIAL_BASE_MAX_NOTES'])
print('EDITORIAL_CONTEXT_BUDGET_RATIO=', os.environ['EDITORIAL_CONTEXT_BUDGET_RATIO'])
print('EDITORIAL_MAX_NOTES=', os.environ['EDITORIAL_MAX_NOTES'])
print('EDITORIAL_HISTORY_INTENT_BOOST=', os.environ['EDITORIAL_HISTORY_INTENT_BOOST'])

In [None]:
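# Optional quick mode before the full run.
# Set QUICK_SAMPLE_SIZE to e.g. 100 to run a small sample first;
# leave it as None to run the full sweep (the default).
QUICK_SAMPLE_SIZE = None

if QUICK_SAMPLE_SIZE is None:
    # Ensure a full run by removing any inherited value.
    os.environ.pop('SWEEP_SAMPLE_SIZE', None)
else:
    os.environ['SWEEP_SAMPLE_SIZE'] = str(QUICK_SAMPLE_SIZE)
print('SWEEP_SAMPLE_SIZE now:', os.environ.get('SWEEP_SAMPLE_SIZE'))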

## 3. Validate Reindex Metadata and Smoke Checks

In [None]:
%cd /content/lovli
!python scripts/validate_reindex.py --collection lovli_laws_v2 --with-smoke

## 4. Editorial Precision Benchmark

In [None]:
%cd /content/lovli
!python -u scripts/bench_editorial_precision.py

In [None]:
# Optional: fast reruns using cached candidates.
%cd /content/lovli
!python -u scripts/bench_editorial_precision.py --cache eval/editorial_precision_candidates.json

## 5. Retrieval Sweep

In [None]:
%cd /content/lovli
!python -u scripts/sweep_retrieval_thresholds.py
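
The `!python` cells above do not raise when a script exits with a non-zero status, so "Run all" continues past a failed step. A minimal sketch of a fail-fast alternative follows; `run_step` is a hypothetical helper, not part of the Lovli repository.

```python
import subprocess
import sys


def run_step(args, cwd=None):
    """Run a Python script with the current interpreter, stream its output,
    and raise CalledProcessError if it exits with a non-zero status."""
    cmd = [sys.executable, "-u", *args]
    print("$", " ".join(cmd))
    with subprocess.Popen(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    ) as proc:
        for line in proc.stdout:
            print(line, end="")
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return proc.returncode
```

For example, `run_step(['scripts/sweep_retrieval_thresholds.py'], cwd='/content/lovli')` would run the sweep and stop the notebook if the script fails.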

## 6. Artifact Overview

In [None]:
%cd /content/lovli
!ls -lah eval

from pathlib import Path
artifacts = [
    Path('eval/editorial_precision_candidates.json'),
    Path('eval/retrieval_sweep_results.json'),
]
for p in artifacts:
    print(f'{p}:', 'exists' if p.exists() else 'missing')
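
Beyond checking that the files exist, a quick look at their top-level structure can catch empty or truncated artifacts. This is a minimal sketch that assumes only that the artifacts are JSON files; their schema is not assumed, and `describe_value` and `describe_json` are hypothetical helper names.

```python
import json
from pathlib import Path


def describe_value(data):
    """Summarize the top-level shape of a parsed JSON value."""
    if isinstance(data, dict):
        return f"object with {len(data)} keys"
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return type(data).__name__


def describe_json(path):
    """Return a one-line summary for a JSON file, or note why it is unreadable."""
    path = Path(path)
    if not path.exists():
        return f"{path}: missing"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return f"{path}: not valid JSON ({exc.msg})"
    return f"{path}: {describe_value(data)}, {path.stat().st_size} bytes"
```

Running `print(describe_json(p))` for each path in `artifacts` gives a one-line summary per file.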