## Debug: Check Available Indexing Options

List the indexing options that Pyserini's Lucene indexer accepts.

In [None]:
# Check available indexing options
import subprocess
result = subprocess.run(
    ['python', '-m', 'pyserini.index.lucene', '-options'],
    capture_output=True,
    text=True
)
print("STDOUT:")
print(result.stdout)
print("\nSTDERR:")
print(result.stderr)

# Paper Replication: Dense vs Sparse Retrieval on BEIR

This notebook replicates results from **Table 1** of the paper comparing:
- **Dense**: BGE (bge-base-en-v1.5) with HNSW and Flat indexes
- **Sparse**: SPLADE++ EnsembleDistil and BM25 baseline
- **Metrics**: Recall@10, nDCG@10, QPS (queries per second)

## Key Implementation Details

**Exact Paper Parameters:**
- Library: Lucene 9.9.1 via Pyserini/Anserini
- HNSW: M=16, efConstruction=100, efSearch=1000
- Threads: 16 (indexing and search)
- Retrieval: k=1000 hits
- Evaluation: Recall@10, nDCG@10
- QPS: Measured with 16 threads in the paper (the search loops in this notebook issue queries sequentially)

**Datasets:** The paper evaluates 29 BEIR datasets. Change `dataset_name` below to run on different datasets.

## 1. Install Dependencies

Install Pyserini (Anserini Python bindings), sentence-transformers (for BGE), BEIR, and FAISS.

In [None]:
!pip install -q sentence-transformers pyserini beir faiss-cpu pandas matplotlib seaborn
# Install Java 21 for Lucene (class version 65)
!apt-get -y install -qq openjdk-21-jdk-headless || true
print("✅ Dependencies installed")

## 2. Setup and Imports

⚠️ **IMPORTANT**: After installing dependencies, restart the runtime/kernel before proceeding.

In [None]:
import os
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import time
import subprocess
from tqdm.auto import tqdm

# Configure Java 21 for Lucene
java_home = "/usr/lib/jvm/java-21-openjdk-amd64"
if os.path.exists(java_home):
    os.environ["JAVA_HOME"] = java_home
    os.environ["PATH"] = f"{java_home}/bin:" + os.environ.get("PATH", "")

from sentence_transformers import SentenceTransformer
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from pyserini.search.lucene import LuceneSearcher

sns.set_style('whitegrid')
print("✅ Libraries imported")

## 3. Dataset Selection

Select a BEIR dataset. The paper evaluates 29 datasets; this notebook runs one at a time, chosen from the `dataset_urls` map below.

In [None]:
# Select dataset from BEIR
dataset_name = 'scifact'  # Change to: fiqa, trec-covid, nfcorpus, etc.

dataset_urls = {
    'scifact': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip',
    'nfcorpus': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nfcorpus.zip',
    'fiqa': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/fiqa.zip',
    'trec-covid': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/trec-covid.zip',
    'arguana': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/arguana.zip',
    'webis-touche2020': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/webis-touche2020.zip',
    'quora': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/quora.zip',
    'dbpedia-entity': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/dbpedia-entity.zip',
    'scidocs': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scidocs.zip',
    'fever': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/fever.zip',
    'climate-fever': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/climate-fever.zip',
    'nq': 'https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nq.zip',
}

print(f"Downloading {dataset_name} dataset...")
url = dataset_urls[dataset_name]
data_path = util.download_and_unzip(url, "datasets")

print("Loading dataset...")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

doc_ids = list(corpus.keys())
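# Some corpus entries may lack a title; fall back to an empty string and trim the join
doc_texts = [(corpus[did].get('title', '') + ' ' + corpus[did]['text']).strip() for did in doc_ids]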
query_ids = list(queries.keys())
query_texts = [queries[qid] for qid in query_ids]

print(f"\n✅ Dataset: {dataset_name}")
print(f"   Documents: {len(corpus):,}")
print(f"   Queries: {len(queries):,}")
print(f"   Queries with relevance judgments: {len(qrels):,}")
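
`len(qrels)` counts the queries that have judgments, not the individual judgments. A minimal sketch of a fuller summary, assuming the BEIR layout `qrels[query_id][doc_id] = label`. The helper name `summarize_qrels` and the toy data are illustrative and not part of BEIR:

```python
def summarize_qrels(qrels):
    """Count judged queries, judged (query, doc) pairs, and positively labeled pairs."""
    n_pairs = sum(len(docs) for docs in qrels.values())
    n_positive = sum(
        1 for docs in qrels.values() for rel in docs.values() if rel > 0
    )
    return {
        "judged_queries": len(qrels),
        "judged_pairs": n_pairs,
        "positive_pairs": n_positive,
    }


# Toy data in the assumed layout: {query_id: {doc_id: label}}
toy_qrels = {"q1": {"d1": 1, "d2": 0}, "q2": {"d3": 2}}
print(summarize_qrels(toy_qrels))
```

In the notebook, the same call on the loaded `qrels` would report how many of the judged pairs are actually positive.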

## 4. Dense Retrieval: BGE Model

Load BGE (bge-base-en-v1.5) and encode documents and queries.

In [None]:
# Load BGE model (bge-base-en-v1.5)
model_name = 'BAAI/bge-base-en-v1.5'
print(f"Loading BGE model: {model_name}")
model = SentenceTransformer(model_name)
dimension = model.get_sentence_embedding_dimension()
print(f"✅ Model loaded (dimension={dimension})")

In [None]:
# Encode documents
batch_size = 32 if len(doc_texts) <= 100_000 else 16
print(f"Encoding {len(doc_texts):,} documents (batch_size={batch_size})...")

doc_embeddings = model.encode(
    doc_texts,
    batch_size=batch_size,
    show_progress_bar=True,
    convert_to_numpy=True,
    normalize_embeddings=True
)

print(f"✅ Documents encoded: {doc_embeddings.shape}")

In [None]:
# Encode queries
print(f"Encoding {len(query_texts):,} queries...")
query_embeddings = model.encode(
    query_texts,
    batch_size=32,
    show_progress_bar=True,
    convert_to_numpy=True,
    normalize_embeddings=True
)

print(f"✅ Queries encoded: {query_embeddings.shape}")
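
Both encoding cells pass `normalize_embeddings=True`, which is why the inner-product indexes built later behave like cosine-similarity indexes. A small pure-Python sketch of that equivalence, using made-up three-dimensional vectors rather than real BGE embeddings:

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity computed directly from the raw vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Illustrative vectors only
query = [3.0, 4.0, 0.0]
doc = [1.0, 2.0, 2.0]

ip_normalized = dot(l2_normalize(query), l2_normalize(doc))
print(math.isclose(ip_normalized, cosine(query, doc)))
```

If normalization were skipped, inner product would also reward longer vectors, and the ranking could differ from a cosine ranking.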

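## 5. Build Indexes

Build one index per retrieval method:
1. **BM25**: Lucene inverted index
2. **SPLADE++ ED**: Lucene impact-based inverted index
3. **BGE HNSW**: FAISS HNSW vector index (M=16, efC=100, efSearch=1000)
4. **BGE Flat**: FAISS flat vector index (brute-force search)

The paper's setup uses Lucene throughout; the dense indexes here are built with FAISS instead, so dense QPS is not directly comparable to the paper's numbers.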

In [None]:
# Paper parameters
M = 16  # HNSW M parameter
ef_construction = 100  # HNSW efC
ef_search = 1000  # HNSW efSearch
threads = '16'  # 16 threads as per paper
k_retrieve = 1000  # Retrieve 1000 hits
k_eval = 10  # Cutoff for Recall@k and nDCG@k

print(f"Parameters: M={M}, efC={ef_construction}, efSearch={ef_search}, threads={threads}")
print(f"Retrieval: k={k_retrieve}, evaluation@{k_eval}")

In [None]:
# Prepare directory structure
base_dir = f'indexes_{dataset_name}'
os.makedirs(base_dir, exist_ok=True)

# 1. BM25 Index
bm25_docs_dir = os.path.join(base_dir, 'bm25_docs')
bm25_index_dir = os.path.join(base_dir, 'bm25_index')
os.makedirs(bm25_docs_dir, exist_ok=True)

print("Writing BM25 documents...")
bm25_jsonl = os.path.join(bm25_docs_dir, 'docs.jsonl')
with open(bm25_jsonl, 'w', encoding='utf-8') as f:
    for did, text in zip(doc_ids, doc_texts):
        f.write(json.dumps({'id': did, 'contents': text}) + "\n")

print("Building BM25 index...")
subprocess.run([
    'python', '-m', 'pyserini.index.lucene',
    '--collection', 'JsonCollection',
    '--input', bm25_docs_dir,
    '--index', bm25_index_dir,
    '--generator', 'DefaultLuceneDocumentGenerator',
    '--threads', threads,
    '--storePositions',
    '--storeDocvectors',
    '--storeRaw'
], check=True)
print("✅ BM25 index ready")

In [None]:
# 2. SPLADE++ ED Index
splade_docs_dir = os.path.join(base_dir, 'splade_docs')
splade_encoded_dir = os.path.join(base_dir, 'splade_encoded')
splade_index_dir = os.path.join(base_dir, 'splade_index')
os.makedirs(splade_docs_dir, exist_ok=True)
os.makedirs(splade_encoded_dir, exist_ok=True)

print("Writing SPLADE documents...")
splade_jsonl = os.path.join(splade_docs_dir, 'docs.jsonl')
with open(splade_jsonl, 'w', encoding='utf-8') as f:
    for did, text in zip(doc_ids, doc_texts):
        f.write(json.dumps({'id': did, 'text': text}) + "\n")

print("Encoding with SPLADE++ EnsembleDistil (using GPU)...")
subprocess.run([
    'python', '-m', 'pyserini.encode',
    'input', '--corpus', splade_docs_dir,
    '--fields', 'text',
    'output', '--embeddings', splade_encoded_dir,
    'encoder', '--encoder', 'naver/splade-cocondenser-ensembledistil',
    '--device', 'cuda',
    '--batch', '32'
], check=True)

print("Building SPLADE impact index...")
subprocess.run([
    'python', '-m', 'pyserini.index.lucene',
    '--collection', 'JsonVectorCollection',
    '--input', splade_encoded_dir,
    '--index', splade_index_dir,
    '--generator', 'DefaultLuceneDocumentGenerator',
    '--impact', '--pretokenized',  # SPLADE tokens are already final; skip Lucene's analyzer
    '--threads', threads,
    '--storeRaw'
], check=True)
print("✅ SPLADE++ ED index ready")

In [None]:
# 3. BGE HNSW Index (using FAISS)
import faiss

hnsw_index_path = os.path.join(base_dir, 'hnsw_index.faiss')

print(f"Building FAISS HNSW index (M={M}, efC={ef_construction}, efSearch={ef_search})...")

# Create HNSW index; inner product equals cosine similarity on normalized vectors
hnsw_index = faiss.IndexHNSWFlat(dimension, M, faiss.METRIC_INNER_PRODUCT)
hnsw_index.hnsw.efConstruction = ef_construction
hnsw_index.hnsw.efSearch = ef_search

# Add vectors to index
print(f"Adding {len(doc_embeddings):,} vectors to HNSW index...")
hnsw_index.add(doc_embeddings)

# Save index
faiss.write_index(hnsw_index, hnsw_index_path)
print(f"✅ HNSW index saved ({hnsw_index.ntotal:,} vectors)")


In [None]:
# 4. BGE Flat Index (using FAISS, brute-force search)
flat_index_path = os.path.join(base_dir, 'flat_index.faiss')

print("Building FAISS Flat index (brute-force)...")

# Create flat index for exact search
flat_index = faiss.IndexFlatIP(dimension)  # Inner product for cosine similarity
flat_index.add(doc_embeddings)

# Save index
faiss.write_index(flat_index, flat_index_path)
print(f"✅ Flat index saved ({flat_index.ntotal:,} vectors)")
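
Because the Flat index is exact, it can serve as a reference for how much the HNSW approximation loses. A minimal sketch of that comparison, assuming each method yields one list of integer document indices per query (with `-1` as padding, as FAISS does). The function name `overlap_at_k` and the toy rankings are illustrative:

```python
def overlap_at_k(approx_ids, exact_ids, k=10):
    """Mean fraction of the exact top-k that the approximate top-k also returns."""
    fractions = []
    for approx, exact in zip(approx_ids, exact_ids):
        exact_top = {i for i in exact[:k] if i >= 0}  # drop -1 padding
        if not exact_top:
            continue  # nothing to compare against for this query
        approx_top = {i for i in approx[:k] if i >= 0}
        fractions.append(len(approx_top & exact_top) / len(exact_top))
    return sum(fractions) / len(fractions) if fractions else 0.0


# Toy rankings for two queries
approx = [[0, 1, 2], [5, 4, -1]]
exact = [[0, 2, 3], [4, 5, 6]]
print(overlap_at_k(approx, exact, k=3))
```

Once the searches in section 8 have run, `results_hnsw['indices']` and `results_flat['indices']` could be passed in the same way.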


## 6. Initialize Searchers

In [None]:
# BM25 searcher
bm25_searcher = LuceneSearcher(bm25_index_dir)
bm25_searcher.set_bm25(k1=0.9, b=0.4)

# SPLADE searcher (built-in SPLADE query encoding)
from pyserini.search.lucene import LuceneImpactSearcher

# Initialize with encoder string - searcher will load and use the model internally
splade_searcher = LuceneImpactSearcher(
    splade_index_dir,
    'naver/splade-cocondenser-ensembledistil',  # Model name as string
    encoder_type='pytorch',  # Use PyTorch model
    #device='cuda'
)

# Load FAISS indexes for dense retrieval
import faiss
hnsw_index_path = os.path.join(base_dir, 'hnsw_index.faiss')
flat_index_path = os.path.join(base_dir, 'flat_index.faiss')

hnsw_index = faiss.read_index(hnsw_index_path)
flat_index = faiss.read_index(flat_index_path)

print(f"✅ All searchers initialized")
print(f"   HNSW index: {hnsw_index.ntotal:,} vectors")
print(f"   Flat index: {flat_index.ntotal:,} vectors")
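
The `k1` and `b` values passed to `set_bm25` control term-frequency saturation and document-length normalization. The sketch below uses the textbook BM25 per-term formula to show their effect; Lucene's implementation differs in details, so these scores are not expected to match `LuceneSearcher` output. The function name and all input numbers are illustrative:

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=0.9, b=0.4):
    """Textbook BM25 contribution of one query term to one document."""
    # Rare terms (small doc_freq) get a larger idf weight
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # b interpolates between no length normalization (0) and full (1)
    length_norm = 1.0 - b + b * doc_len / avg_doc_len
    # k1 controls how quickly repeated occurrences stop adding score
    return idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)


# Same term frequency, different document lengths
short = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_ = bm25_term_score(tf=2, doc_len=200, avg_doc_len=100, n_docs=1000, doc_freq=10)
print(short > long_)
```

With equal term frequency, the shorter document scores higher, and a larger `b` would widen that gap.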

## 7. Search Functions

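Implement search with QPS measurement. Each function issues queries sequentially, one at a time, so the measured QPS is not the 16-thread throughput the paper reports.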

In [None]:
doc_id_to_idx = {did: i for i, did in enumerate(doc_ids)}

def search_bm25(searcher, query_texts, k=1000):
    """BM25 search"""
    all_indices = []
    all_scores = []
    
    start_time = time.time()
    for q in tqdm(query_texts, desc="BM25 search"):
        hits = searcher.search(q, k)
        docids = [h.docid for h in hits]
        scores = [h.score for h in hits]
        all_indices.append([doc_id_to_idx[d] for d in docids])
        all_scores.append(scores)
    
    elapsed = time.time() - start_time
    qps = len(query_texts) / elapsed
    
    return {
        'name': 'BM25',
        'indices': np.array(all_indices, dtype=object),
        'scores': np.array(all_scores, dtype=object),
        'qps': qps
    }

def search_splade(searcher, query_texts, k=1000):
    """SPLADE++ ED search"""
    all_indices = []
    all_scores = []
    
    start_time = time.time()
    for q in tqdm(query_texts, desc="SPLADE++ ED search"):
        hits = searcher.search(q, k)
        docids = [h.docid for h in hits]
        scores = [h.score for h in hits]
        all_indices.append([doc_id_to_idx[d] for d in docids])
        all_scores.append(scores)
    
    elapsed = time.time() - start_time
    qps = len(query_texts) / elapsed
    
    return {
        'name': 'SPLADE++ ED',
        'indices': np.array(all_indices, dtype=object),
        'scores': np.array(all_scores, dtype=object),
        'qps': qps
    }

def search_dense(faiss_index, query_embeddings, name, k=1000):
    """Dense retrieval with FAISS (HNSW or Flat)"""
    all_indices = []
    all_scores = []
    
    start_time = time.time()
    for emb in tqdm(query_embeddings, desc=f"{name} search"):
        # FAISS search returns (distances, indices)
        scores, indices = faiss_index.search(emb.reshape(1, -1), k)
        all_indices.append(indices[0].tolist())
        all_scores.append(scores[0].tolist())
    
    elapsed = time.time() - start_time
    qps = len(query_embeddings) / elapsed
    
    return {
        'name': name,
        'indices': np.array(all_indices, dtype=object),
        'scores': np.array(all_scores, dtype=object),
        'qps': qps
    }

print("✅ Search functions defined")
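
The loops above send one query at a time. A rough sketch of measuring throughput with several queries in flight, using a standard-library thread pool. `measure_qps` and `fake_search` are hypothetical names, and `fake_search` only sleeps to stand in for a real searcher call. Whether a thread pool actually speeds up a given backend depends on that backend releasing the GIL; for the Lucene searchers, Pyserini's own `batch_search` with its `threads` argument is probably the more reliable route.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(search_fn, queries, threads=16):
    """Run search_fn over queries on a thread pool; return (results, QPS)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(search_fn, queries))  # map preserves query order
    elapsed = time.perf_counter() - start
    return results, len(queries) / elapsed


def fake_search(query):
    # Stand-in for a real call such as searcher.search(query, k)
    time.sleep(0.01)
    return [query.upper()]


results, qps = measure_qps(fake_search, ["a", "b", "c", "d"], threads=4)
print(f"{len(results)} queries at {qps:.1f} QPS")
```

Timing the whole batch, rather than summing per-query latencies, is what makes the figure a throughput number.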

## 8. Run All Searches

Retrieve 1000 hits per query, issuing queries sequentially.

In [None]:
# Run all searches
results_bm25 = search_bm25(bm25_searcher, query_texts, k=k_retrieve)
results_splade = search_splade(splade_searcher, query_texts, k=k_retrieve)
results_hnsw = search_dense(hnsw_index, query_embeddings, 'BGE-HNSW', k=k_retrieve)
results_flat = search_dense(flat_index, query_embeddings, 'BGE-Flat', k=k_retrieve)

print("\n✅ All searches complete")
print(f"   BM25: {results_bm25['qps']:.2f} QPS")
print(f"   SPLADE++ ED: {results_splade['qps']:.2f} QPS")
print(f"   BGE-HNSW: {results_hnsw['qps']:.2f} QPS")
print(f"   BGE-Flat: {results_flat['qps']:.2f} QPS")

## 9. Evaluation: Recall@10 and nDCG@10

Evaluate retrieval quality with Recall@10 and nDCG@10 (nDCG@10 is BEIR's primary metric).

In [None]:
def calculate_recall_at_k(retrieved_indices, qrels, query_ids, doc_ids, k=10):
    """Calculate Recall@k following BEIR guidelines"""
    recalls = []
    
    for i, qid in enumerate(query_ids):
        if qid not in qrels:
            continue
        
        # Only documents with a positive label count as relevant
        relevant_docs = {d for d, rel in qrels[qid].items() if rel > 0}
        retrieved_docs = {doc_ids[idx] for idx in retrieved_indices[i][:k] if idx >= 0}
        
        if len(relevant_docs) > 0:
            recalls.append(len(relevant_docs & retrieved_docs) / len(relevant_docs))
    
    return np.mean(recalls) if recalls else 0.0

def calculate_ndcg_at_k(retrieved_indices, qrels, query_ids, doc_ids, k=10):
    """Calculate nDCG@k following BEIR guidelines"""
    ndcgs = []
    
    for i, qid in enumerate(query_ids):
        if qid not in qrels:
            continue
        
        relevant_docs = qrels[qid]
        retrieved_docs = [doc_ids[idx] for idx in retrieved_indices[i][:k] if idx >= 0]
        
        # DCG with linear gain, the convention of trec_eval/pytrec_eval (which BEIR uses)
        dcg = 0.0
        for rank, doc_id in enumerate(retrieved_docs, 1):
            rel = relevant_docs.get(doc_id, 0)
            dcg += rel / np.log2(rank + 1)
        
        # IDCG: the same gain over the ideal ordering of the judged documents
        ideal = sorted(relevant_docs.values(), reverse=True)[:k]
        idcg = sum(r / np.log2(rank + 2) for rank, r in enumerate(ideal))
        
        ndcgs.append(dcg / idcg if idcg > 0 else 0)
    
    return np.mean(ndcgs) if ndcgs else 0.0

# Evaluate all methods with both metrics
for results in [results_bm25, results_splade, results_hnsw, results_flat]:
    results['recall@10'] = calculate_recall_at_k(
        results['indices'], qrels, query_ids, doc_ids, k=k_eval
    )
    results['ndcg@10'] = calculate_ndcg_at_k(
        results['indices'], qrels, query_ids, doc_ids, k=k_eval
    )

print("✅ Evaluation complete (Recall@10 and nDCG@10)")
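
A worked example helps check the metric code by hand. The sketch below is a standalone, single-query version with toy binary judgments, where linear and exponential gain give the same result. For the ranking `d1, d3, d2` with `d1` and `d2` relevant, DCG is 1/log2(2) + 1/log2(4) = 1.5, IDCG is 1 + 1/log2(3) ≈ 1.631, so nDCG@3 ≈ 0.920. The function names and data are illustrative:

```python
import math


def ndcg_at_k(ranking, judgments, k=10):
    """nDCG@k for one query, with linear gain and log2 rank discount."""
    dcg = sum(
        judgments.get(doc, 0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranking[:k], start=1)
    )
    ideal = sorted(judgments.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1) for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


def recall_at_k(ranking, judgments, k=10):
    """Fraction of positively judged documents found in the top k."""
    relevant = {doc for doc, rel in judgments.items() if rel > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranking[:k])) / len(relevant)


# Toy query: two relevant documents, one irrelevant document ranked second
judgments = {"d1": 1, "d2": 1}
ranking = ["d1", "d3", "d2"]
print(round(ndcg_at_k(ranking, judgments, k=3), 3))
print(recall_at_k(ranking, judgments, k=2))
```

Cutting the ranking at `k=2` drops `d2`, which is why recall falls to one half while nDCG@3 stays high.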

## 10. Results Summary

Display results in a table matching the paper format.

In [None]:
# Create results dataframe matching paper table format
results_df = pd.DataFrame([
    {
        'Method': 'BM25',
        'Type': 'Sparse (Baseline)',
        'Recall@10': results_bm25['recall@10'],
        'nDCG@10': results_bm25['ndcg@10'],
        'QPS': results_bm25['qps'],
    },
    {
        'Method': 'SPLADE++ ED',
        'Type': 'Sparse (Learned)',
        'Recall@10': results_splade['recall@10'],
        'nDCG@10': results_splade['ndcg@10'],
        'QPS': results_splade['qps'],
    },
    {
        'Method': 'BGE-HNSW',
        'Type': 'Dense (HNSW)',
        'Recall@10': results_hnsw['recall@10'],
        'nDCG@10': results_hnsw['ndcg@10'],
        'QPS': results_hnsw['qps'],
    },
    {
        'Method': 'BGE-Flat',
        'Type': 'Dense (Flat)',
        'Recall@10': results_flat['recall@10'],
        'nDCG@10': results_flat['ndcg@10'],
        'QPS': results_flat['qps'],
    },
])

print(f"\n{'='*90}")
print(f"RESULTS: {dataset_name.upper()}")
print(f"{'='*90}")
print(results_df.to_string(index=False))
print(f"{'='*90}")
print(f"\nDataset Statistics:")
print(f"  Name: {dataset_name}")
print(f"  Documents (|C|): {len(corpus):,}")
print(f"  Queries (|Q|): {len(queries):,}")
print(f"  Queries with relevance judgments: {len(qrels):,}")
print(f"\nIndexing Parameters:")
print(f"  HNSW: M={M}, efC={ef_construction}, efSearch={ef_search}")
print(f"  Threads: {threads}")
print(f"\nRetrieval & Evaluation:")
print(f"  Retrieved: k={k_retrieve}")
print(f"  Evaluated: Recall@{k_eval}, nDCG@{k_eval}")
print("  QPS measured sequentially (one query at a time)")
print(f"{'='*90}")

## 11. Visualization

In [None]:
# Create visualizations matching paper analysis
output_dir = f'results_{dataset_name}'
os.makedirs(output_dir, exist_ok=True)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

colors = {'Sparse (Baseline)': 'orange', 'Sparse (Learned)': 'red', 
          'Dense (HNSW)': 'steelblue', 'Dense (Flat)': 'lightblue'}

# Plot 1: Speed (QPS) vs Quality (nDCG@10)
for _, row in results_df.iterrows():
    ax1.scatter(row['QPS'], row['nDCG@10'], 
              s=200, alpha=0.7, color=colors[row['Type']], 
              edgecolors='black', linewidth=1.5)
    ax1.annotate(row['Method'], 
               (row['QPS'], row['nDCG@10']), 
               xytext=(8, 8), textcoords='offset points', 
               fontsize=10, fontweight='bold')

ax1.set_xlabel('QPS (queries per second, sequential)', fontsize=11)
ax1.set_ylabel('nDCG@10', fontsize=11)
ax1.set_title(f'Speed vs Quality (nDCG@10) — {dataset_name}', fontsize=12, fontweight='bold')
ax1.grid(True, alpha=0.3)

# Plot 2: Speed (QPS) vs Quality (Recall@10)
for _, row in results_df.iterrows():
    ax2.scatter(row['QPS'], row['Recall@10'], 
              s=200, alpha=0.7, color=colors[row['Type']], 
              edgecolors='black', linewidth=1.5)
    ax2.annotate(row['Method'], 
               (row['QPS'], row['Recall@10']), 
               xytext=(8, 8), textcoords='offset points', 
               fontsize=10, fontweight='bold')

ax2.set_xlabel('QPS (queries per second, sequential)', fontsize=11)
ax2.set_ylabel('Recall@10', fontsize=11)
ax2.set_title(f'Speed vs Quality (Recall@10) — {dataset_name}', fontsize=12, fontweight='bold')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(f'{output_dir}/speed_vs_quality.pdf', dpi=300, bbox_inches='tight')
plt.show()

# Bar chart comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Quality metrics
x = np.arange(len(results_df))
width = 0.35

bars1 = ax1.bar(x - width/2, results_df['Recall@10'], width, label='Recall@10', alpha=0.8)
bars2 = ax1.bar(x + width/2, results_df['nDCG@10'], width, label='nDCG@10', alpha=0.8)

ax1.set_xlabel('Method', fontsize=11)
ax1.set_ylabel('Score', fontsize=11)
ax1.set_title(f'Quality Metrics Comparison — {dataset_name}', fontsize=12, fontweight='bold')
ax1.set_xticks(x)
ax1.set_xticklabels(results_df['Method'], rotation=15, ha='right')
ax1.legend()
ax1.grid(True, alpha=0.3, axis='y')

# QPS comparison
bars = ax2.bar(results_df['Method'], results_df['QPS'], alpha=0.8, 
               color=[colors[t] for t in results_df['Type']], edgecolor='black')
ax2.set_xlabel('Method', fontsize=11)
ax2.set_ylabel('QPS (sequential)', fontsize=11)
ax2.set_title(f'Query Performance — {dataset_name}', fontsize=12, fontweight='bold')
# Rotate the existing tick labels; set_xticklabels without set_xticks triggers a warning
plt.setp(ax2.get_xticklabels(), rotation=15, ha='right')
ax2.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig(f'{output_dir}/metrics_comparison.pdf', dpi=300, bbox_inches='tight')
plt.show()

print(f"✅ Visualizations complete and saved to {output_dir}/")

## 12. Save Results

In [None]:
# Save results to CSV
output_dir = f'results_{dataset_name}'
os.makedirs(output_dir, exist_ok=True)

results_path = os.path.join(output_dir, f'{dataset_name}_results.csv')
results_df.to_csv(results_path, index=False)

# Save detailed results with metadata
metadata = {
    'dataset': dataset_name,
    'num_documents': len(corpus),
    'num_queries': len(queries),
    'num_qrels': len(qrels),
    'hnsw_M': M,
    'hnsw_efC': ef_construction,
    'hnsw_efSearch': ef_search,
    'threads': threads,
    'k_retrieve': k_retrieve,
    'k_eval': k_eval,
}

metadata_path = os.path.join(output_dir, f'{dataset_name}_metadata.json')
with open(metadata_path, 'w') as f:
    json.dump(metadata, f, indent=2)

print(f"✅ Results saved:")
print(f"   - {results_path}")
print(f"   - {metadata_path}")
print(f"   - {output_dir}/speed_vs_quality.pdf")
print(f"   - {output_dir}/metrics_comparison.pdf")
print(f"\n{'='*80}")
print(f"COMPARISON WITH PAPER TABLE (table-main.tex):")
print(f"{'='*80}")
print(f"Your results can now be compared with the paper's Table 1.")
print(f"\nTo replicate full paper results across all 29 BEIR datasets:")
print(f"  1. Change 'dataset_name' in the Dataset Selection cell (section 3), adding its URL to 'dataset_urls' if needed")
print(f"  2. Run all cells for each dataset")
print(f"  3. Compile results from all datasets")
print(f"\nNote: QPS values may differ from paper due to:")
print(f"  - Hardware differences (paper used Mac Studio M1 Ultra)")
print(f"  - ONNX optimizations (paper reports 'QPS (ONNX)')")
print(f"  - Caching effects (paper reports 'QPS (cached)')")
print(f"{'='*80}")
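
Step 3 above, compiling results from all datasets, can be sketched with the standard library alone. This assumes the file layout used by the save cell, `results_<dataset>/<dataset>_results.csv`, and the column names of `results_df`. The function name `compile_results` is illustrative:

```python
import csv
import glob
import os


def compile_results(pattern="results_*/*_results.csv"):
    """Merge per-dataset result CSVs into one list of rows tagged with the dataset."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        # File names follow <dataset>_results.csv
        dataset = os.path.basename(path)[: -len("_results.csv")]
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                rows.append({"Dataset": dataset, **row})
    return rows


for row in compile_results():
    print(row["Dataset"], row["Method"], row["nDCG@10"], row["QPS"])
```

`csv.DictReader` returns every value as a string, so numeric columns would need converting with `float` before averaging across datasets.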