# Sparse HRT 3D - N-Gram Layered Architecture (v0.6.0)

This notebook explores the **3D Sparse HRT** with n-gram separated layers:

1. **The Problem** - Hash collisions and n-gram mixing in 2D AM
2. **The Solution** - 3D AM[n, row, col] separates n-gram orders
3. **Edge3D** - (n, row, col, value) tuples
4. **SparseAM3D** - N-gram layered adjacency matrix
5. **SparseLattice3D** - Per-layer and aggregated connections
6. **BasicHLLSet3D** - Includes n-gram order
7. **Sliding Window** - 1-gram → 2-gram → 3-gram → shift → repeat

**Key Insight**: AM[n, row, col] = context covariance at n-gram layer n

---

## 1. Import 3D Architecture

In [16]:
import time
import warnings
import os

# Suppress GPU warnings
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")
warnings.filterwarnings("ignore", message=".*cuda capability.*")
warnings.filterwarnings("ignore", message=".*Quadro.*")

import torch

# 3D Sparse Architecture (v0.6.0)
from core import (
    # 3D Sparse Components
    SparseHRT3D,
    Sparse3DConfig,
    SparseAM3D,
    SparseLattice3D,
    ImmutableSparseTensor3D,
    BasicHLLSet3D,
    Edge3D,
    create_sparse_hrt_3d,
    
    # Utilities
    get_device,
    __version__
)

print(f"Fractal Manifold Core v{__version__}")
print(f"Device: {get_device()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

Fractal Manifold Core v0.6.0
Device: cuda
GPU: NVIDIA GeForce RTX 3060


## 2. The Problem: N-Gram Mixing in 2D AM

With 2D AM[row, col]:
- All n-grams share the same matrix
- A 1-gram such as "hello" and a 2-gram such as "hello world" can map to the same (reg, zeros) pair
- Self-loops appear when row == col (hash collision)
- Disambiguation then requires a LUT lookup

```
Token           Hash → (reg, zeros) → Index
"hello"         0x... → (42, 5)     → 947
"hello world"   0x... → (42, 5)     → 947  ← COLLISION!
```

## 3. The Solution: 3D AM[n, row, col]

With 3D AM:
- Layer 0: 1-grams only
- Layer 1: 2-grams only  
- Layer 2: 3-grams only
- Same (reg, zeros) but different n → different layer

```
Token           n  (reg, zeros) → (Layer, Index)
"hello"         0  (42, 5)      → (0, 947)
"hello world"   1  (42, 5)      → (1, 947)  ← DIFFERENT LAYER!
```
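
The difference between the two layouts can be illustrated with plain dictionaries. This is a toy sketch, not the library's storage: index 947 is the illustrative value from the tables above, and `cell_2d` / `cell_3d` are hypothetical names.

```python
# Toy illustration only: plain dicts stand in for the 2D and 3D adjacency matrices.
# Both n-grams are assumed to land on the same index (947), as in the tables above.
observations = [
    ("hello", 0, 947),        # (text, 0-indexed n-gram layer, index)
    ("hello world", 1, 947),
]

cell_2d = {}  # keyed by index only
cell_3d = {}  # keyed by (layer, index)

for text, n, index in observations:
    cell_2d.setdefault(index, []).append(text)
    cell_3d.setdefault((n, index), []).append(text)

print(cell_2d)  # both n-grams share one slot
print(cell_3d)  # each n-gram has its own slot
```

In the 2D dict the two texts end up in one slot and have to be told apart afterwards; in the 3D dict the layer key keeps them apart from the start.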

In [2]:
# Create 3D config
config = Sparse3DConfig(p_bits=10, h_bits=32, max_n=3)

print("=== Sparse3DConfig ===")
print(f"p_bits: {config.p_bits}")
print(f"h_bits: {config.h_bits}")
print(f"max_n: {config.max_n} (layers for 1,2,3-grams)")
print(f"dimension: {config.dimension} (per layer)")
print(f"shape: {config.shape} (3D AM shape)")
print(f"device: {config.device}")

=== Sparse3DConfig ===
p_bits: 10
h_bits: 32
max_n: 3 (layers for 1,2,3-grams)
dimension: 32770 (per layer)
shape: (3, 32770, 32770) (3D AM shape)
device: cuda


## 4. Edge3D - N-Gram Aware Edges

Each edge now includes the n-gram layer:
- `n`: 0-indexed layer (0=1-gram, 1=2-gram, 2=3-gram)
- `row`: Row index from token hash
- `col`: Column index from context token hash
- `value`: Edge weight (typically intersection cardinality)

In [3]:
# Create Edge3D instances
edges = [
    Edge3D(n=0, row=100, col=200, value=1.0),  # 1-gram edge
    Edge3D(n=1, row=100, col=201, value=2.0),  # 2-gram edge
    Edge3D(n=2, row=100, col=202, value=3.0),  # 3-gram edge
    Edge3D(n=0, row=300, col=400, value=1.0),  # Another 1-gram
]

print("=== Edge3D Examples ===")
for e in edges:
    ngram = e.n + 1  # Convert to human-readable
    print(f"  {ngram}-gram: ({e.row}, {e.col}) = {e.value}")

=== Edge3D Examples ===
  1-gram: (100, 200) = 1.0
  2-gram: (100, 201) = 2.0
  3-gram: (100, 202) = 3.0
  1-gram: (300, 400) = 1.0


## 5. ImmutableSparseTensor3D

The foundation: 3D sparse COO tensor on GPU.

- Shape: `(max_n, rows, cols)`
- COO: `indices[3, nnz]` for (n, row, col) plus `values[nnz]`

In [4]:
# Create empty 3D tensor
t1 = ImmutableSparseTensor3D.empty(3, 1000, 1000, str(get_device()))

print("=== Empty 3D Tensor ===")
print(f"Shape: {t1.shape}")
print(f"max_n: {t1.max_n}")
print(f"nnz: {t1.nnz}")
print(f"Device: {t1.device}")
print()

# Add edges at different layers
t2 = t1.with_edge(0, 10, 20, 1.0)   # Layer 0 (1-gram)
t3 = t2.with_edge(1, 10, 21, 2.0)   # Layer 1 (2-gram)
t4 = t3.with_edge(2, 10, 22, 3.0)   # Layer 2 (3-gram)

print("=== With 3 Edges ===")
print(f"nnz: {t4.nnz}")
print(f"Edges: {t4.edges()}")
print()

# Query specific layer
print("=== Query by Layer ===")
print(f"get(0, 10, 20) = {t4.get(0, 10, 20)} (1-gram)")
print(f"get(1, 10, 21) = {t4.get(1, 10, 21)} (2-gram)")
print(f"get(2, 10, 22) = {t4.get(2, 10, 22)} (3-gram)")
print(f"get(0, 10, 21) = {t4.get(0, 10, 21)} (wrong layer → 0.0)")

=== Empty 3D Tensor ===
Shape: (3, 1000, 1000)
max_n: 3
nnz: 0
Device: cuda:0

=== With 3 Edges ===
nnz: 3
Edges: [Edge3D(n=0, row=10, col=20, value=1.0), Edge3D(n=1, row=10, col=21, value=2.0), Edge3D(n=2, row=10, col=22, value=3.0)]

=== Query by Layer ===
get(0, 10, 20) = 1.0 (1-gram)
get(1, 10, 21) = 2.0 (2-gram)
get(2, 10, 22) = 3.0 (3-gram)
get(0, 10, 21) = 0.0 (wrong layer → 0.0)
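
To make the immutable update pattern concrete, here is a minimal pure-Python sketch of a COO-style 3D container. `TinySparse3D` is a hypothetical stand-in, not `ImmutableSparseTensor3D`: it runs on the CPU, and summing duplicate coordinates in `get` is an assumption borrowed from how uncoalesced COO tensors are usually read.

```python
from typing import Tuple


class TinySparse3D:
    """Hypothetical immutable COO-style container: parallel tuples of coordinates and values."""

    def __init__(self, shape: Tuple[int, int, int],
                 indices: Tuple[Tuple[int, int, int], ...] = (),
                 values: Tuple[float, ...] = ()):
        self.shape = shape
        self._indices = indices
        self._values = values

    @property
    def nnz(self) -> int:
        return len(self._values)

    def with_edge(self, n: int, row: int, col: int, value: float) -> "TinySparse3D":
        """Return a new container with one more entry; the receiver is left untouched."""
        max_n, rows, cols = self.shape
        if not (0 <= n < max_n and 0 <= row < rows and 0 <= col < cols):
            raise IndexError(f"({n}, {row}, {col}) is outside shape {self.shape}")
        return TinySparse3D(self.shape,
                            self._indices + ((n, row, col),),
                            self._values + (float(value),))

    def get(self, n: int, row: int, col: int) -> float:
        """Sum the values stored at a coordinate (assumption); 0.0 if nothing is stored."""
        return sum((v for idx, v in zip(self._indices, self._values)
                    if idx == (n, row, col)), 0.0)


t1 = TinySparse3D((3, 1000, 1000))
t4 = t1.with_edge(0, 10, 20, 1.0).with_edge(1, 10, 21, 2.0).with_edge(2, 10, 22, 3.0)
print(t1.nnz, t4.nnz)
print(t4.get(1, 10, 21), t4.get(0, 10, 21))
```

Because every `with_edge` call builds a new object, earlier versions such as `t1` stay valid, which is the property the cell above relies on when it keeps `t1` through `t4` side by side.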


## 6. SparseAM3D - 3D Adjacency Matrix

Wraps ImmutableSparseTensor3D with HRT semantics:
- Tracks active indices per layer
- Tracks aggregated indices across all layers
- Supports n-gram specific queries

In [5]:
# Create AM from edges
am = SparseAM3D.from_edges(config, edges)

print("=== SparseAM3D ===")
print(f"{am}")
print()

# Per-layer stats
print("=== Per-Layer Stats ===")
for n in range(config.max_n):
    rows, cols = am.layer_active(n)
    nnz = am.layer_nnz(n)
    print(f"Layer {n} ({n+1}-grams): {nnz} edges, {len(rows)} rows, {len(cols)} cols")
print()

# Aggregated (for BasicHLLSet)
print("=== Aggregated (All Layers) ===")
print(f"All active rows: {am.all_active_rows}")
print(f"All active cols: {am.all_active_cols}")

=== SparseAM3D ===
SparseAM3D(shape=(3, 32770, 32770), nnz=4, active_rows=2, active_cols=4)

=== Per-Layer Stats ===
Layer 0 (1-grams): 2 edges, 2 rows, 2 cols
Layer 1 (2-grams): 1 edges, 1 rows, 1 cols
Layer 2 (3-grams): 1 edges, 1 rows, 1 cols

=== Aggregated (All Layers) ===
All active rows: frozenset({100, 300})
All active cols: frozenset({200, 201, 202, 400})
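
The per-layer and aggregated index tracking can be reproduced with a few sets. This is a sketch over plain `(n, row, col, value)` tuples that mirror the `Edge3D` examples; `layer_rows`, `layer_cols`, `all_rows` and `all_cols` are hypothetical names, and the library may track these differently.

```python
from collections import defaultdict

# (n, row, col, value) tuples mirroring the Edge3D examples above
edge_tuples = [
    (0, 100, 200, 1.0),
    (1, 100, 201, 2.0),
    (2, 100, 202, 3.0),
    (0, 300, 400, 1.0),
]

layer_rows = defaultdict(set)  # layer -> active row indices
layer_cols = defaultdict(set)  # layer -> active column indices
for n, row, col, _value in edge_tuples:
    layer_rows[n].add(row)
    layer_cols[n].add(col)

# Aggregated view: union over all layers
all_rows = frozenset().union(*layer_rows.values())
all_cols = frozenset().union(*layer_cols.values())

for n in sorted(layer_rows):
    print(f"Layer {n}: rows={sorted(layer_rows[n])}, cols={sorted(layer_cols[n])}")
print(sorted(all_rows), sorted(all_cols))
```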


## 7. SparseLattice3D - Connection Tracking

Two views:
1. **Per-layer**: Connections within each n-gram layer
2. **Aggregated**: Connections across ALL layers (for BasicHLLSet)

```
Row 100 in Layer 0: connects to {200}
Row 100 in Layer 1: connects to {201}
Row 100 in Layer 2: connects to {202}
Row 100 aggregated: connects to {200, 201, 202}
```

In [6]:
# Build lattice from AM
lattice = SparseLattice3D.from_sparse_am(am)

print("=== SparseLattice3D ===")
print(f"{lattice}")
print()

# Per-layer connections for row 100
print("=== Row 100 Per-Layer ===")
for n in range(config.max_n):
    conns = lattice.layer_row_connections(n, 100)
    print(f"  Layer {n} ({n+1}-grams): {conns}")
print()

# Aggregated connections
print("=== Row 100 Aggregated ===")
print(f"All layers: {lattice.all_row_connections(100)}")
print(f"Cardinality: {lattice.row_cardinality(100)}")

=== SparseLattice3D ===
SparseLattice3D(layers=3, rows=4, cols=4)

=== Row 100 Per-Layer ===
  Layer 0 (1-grams): frozenset({200})
  Layer 1 (2-grams): frozenset({201})
  Layer 2 (3-grams): frozenset({202})

=== Row 100 Aggregated ===
All layers: frozenset({200, 201, 202})
Cardinality: 3
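
The two lattice views amount to a nested mapping plus a union. The functions below are standalone sketches that borrow the method names for readability; they are not the `SparseLattice3D` implementation.

```python
from collections import defaultdict

edge_tuples = [
    (0, 100, 200, 1.0),
    (1, 100, 201, 2.0),
    (2, 100, 202, 3.0),
    (0, 300, 400, 1.0),
]

# connections[n][row] -> set of columns reachable from row in layer n
connections = defaultdict(lambda: defaultdict(set))
for n, row, col, _value in edge_tuples:
    connections[n][row].add(col)


def layer_row_connections(n: int, row: int) -> frozenset:
    """Per-layer view; .get avoids creating empty entries on lookup."""
    return frozenset(connections.get(n, {}).get(row, ()))


def all_row_connections(row: int) -> frozenset:
    """Aggregated view: union of the row's connections over every layer."""
    return frozenset().union(*(layer_row_connections(n, row) for n in connections))


print(layer_row_connections(1, 100))
print(sorted(all_row_connections(100)), len(all_row_connections(100)))
```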


## 8. BasicHLLSet3D - N-Gram Aware

Now includes the n-gram order:
- `n`: N-gram layer (0-indexed)
- `reg`: Register from hash
- `zeros`: Leading zeros from hash

This enables reconstruction of n-gram context!

In [7]:
# Create BasicHLLSet3D from hash
from core import BasicHLLSet3D

b1 = BasicHLLSet3D.from_hash(0x12345678, n=0, p_bits=10, h_bits=32)
b2 = BasicHLLSet3D.from_hash(0x12345678, n=1, p_bits=10, h_bits=32)
b3 = BasicHLLSet3D.from_hash(0x12345678, n=2, p_bits=10, h_bits=32)

print("=== BasicHLLSet3D from Same Hash ===")
print(f"Same hash 0x12345678 at different n-gram layers:")
print(f"  1-gram: {b1}")
print(f"  2-gram: {b2}")
print(f"  3-gram: {b3}")
print()
print("Same (reg, zeros) but DIFFERENT n → DIFFERENT positions in 3D AM!")

=== BasicHLLSet3D from Same Hash ===
Same hash 0x12345678 at different n-gram layers:
  1-gram: BasicHLLSet3D(n=0, reg=632, zeros=4)
  2-gram: BasicHLLSet3D(n=1, reg=632, zeros=4)
  3-gram: BasicHLLSet3D(n=2, reg=632, zeros=4)

Same (reg, zeros) but DIFFERENT n → DIFFERENT positions in 3D AM!
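
For intuition, one bit layout that reproduces the printed `reg=632, zeros=4` is sketched below. The layout is an assumption (low `p_bits` select the register, `zeros` is the 1-based position of the first set bit in the remaining bits); it has only been checked against this single example, and `split_hash` is a hypothetical helper rather than the `from_hash` implementation.

```python
def split_hash(h: int, p_bits: int = 10, h_bits: int = 32) -> tuple:
    """Split an h_bits-wide hash into (reg, zeros) under the ASSUMED layout."""
    h &= (1 << h_bits) - 1                  # keep only h_bits bits
    reg = h & ((1 << p_bits) - 1)           # assumption: low p_bits select the register
    rest = h >> p_bits                      # remaining h_bits - p_bits bits
    width = h_bits - p_bits
    leading_zeros = width - rest.bit_length()
    zeros = min(leading_zeros + 1, width)   # 1-based rank, clamped when rest == 0
    return reg, zeros


reg, zeros = split_hash(0x12345678)
print(reg, zeros)  # 632 4, the values printed above
```

The n-gram order `n` plays no part in this split, which is why all three objects above share `(reg, zeros)` and differ only in their layer.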


## 9. SparseHRT3D - Complete 3D HRT

In [8]:
# Create empty HRT
hrt = create_sparse_hrt_3d(p_bits=10, h_bits=32, max_n=3)

print("=== Empty SparseHRT3D ===")
print(f"{hrt}")
print(f"Name: {hrt.name[:32]}...")
print()

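# Add edges at different n-gram layers.
# Note: with_ngram_edge takes the 1-indexed n-gram order (1, 2, 3); the layer
# stats printed below report these as 0-indexed layers (0, 1, 2), as Edge3D.n does.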
hrt1 = hrt.with_ngram_edge(1, 100, 200, 1.0)   # 1-gram
hrt2 = hrt1.with_ngram_edge(2, 100, 201, 2.0)  # 2-gram  
hrt3 = hrt2.with_ngram_edge(3, 100, 202, 3.0)  # 3-gram
hrt4 = hrt3.with_ngram_edge(1, 300, 400, 1.0)  # Another 1-gram

print("=== After Adding Edges ===")
print(f"{hrt4}")
print(f"Layer stats: {hrt4.layer_stats()}")

=== Empty SparseHRT3D ===
SparseHRT3D(shape=(3, 32770, 32770), nnz=0, step=0, layers={0: 0, 1: 0, 2: 0})
Name: 3239ddc89f142e4e26d363cb0a5e2a86...

=== After Adding Edges ===
SparseHRT3D(shape=(3, 32770, 32770), nnz=4, step=0, layers={0: 2, 1: 1, 2: 1})
Layer stats: {0: 2, 1: 1, 2: 1}


## 10. Sliding Window Algorithm

The token processing rule:
```
1-gram → 2-gram → 3-gram → shift window → 1-gram → ...
```

Each n-gram goes to its own layer:
- "The" → Layer 0
- "The quick" → Layer 1
- "The quick brown" → Layer 2
- (shift) "quick" → Layer 0
- "quick brown" → Layer 1
- ...

In [9]:
import hashlib

def token_to_basic(token: str, n: int, config: Sparse3DConfig) -> BasicHLLSet3D:
    """Convert token to BasicHLLSet3D."""
    h = int(hashlib.sha1(token.encode()).hexdigest()[:8], 16)
    return BasicHLLSet3D.from_hash(h, n=n, p_bits=config.p_bits, h_bits=config.h_bits)

def sliding_window_edges(tokens: list, config: Sparse3DConfig) -> list:
    """
    Generate edges using sliding window:
    1-gram → 2-gram → 3-gram → shift → repeat
    """
    edges = []
    
    for i in range(len(tokens)):
        for n in range(min(config.max_n, len(tokens) - i)):
            ngram = " ".join(tokens[i:i+n+1])
            
            # Source: current n-gram
            src = token_to_basic(ngram, n, config)
            src_idx = src.to_index(config)
            
            # Target: next token (if exists)
            if i + n + 1 < len(tokens):
                next_token = tokens[i + n + 1]
                tgt = token_to_basic(next_token, 0, config)  # Next is always 1-gram
                tgt_idx = tgt.to_index(config)
                
                edges.append(Edge3D(n=n, row=src_idx, col=tgt_idx, value=1.0))
    
    return edges

# Example: Process a sentence
sentence = "The quick brown fox jumps"
tokens = sentence.split()

print(f"=== Sliding Window: '{sentence}' ===")
print()

sw_edges = sliding_window_edges(tokens, config)
print(f"Generated {len(sw_edges)} edges:")
for e in sw_edges[:12]:  # Show at most the first 12 edges
    print(f"  Layer {e.n} ({e.n+1}-gram): row={e.row}, col={e.col}")
print("  ...")

# Build HRT from sliding window
sw_am = SparseAM3D.from_edges(config, sw_edges)
print()
print(f"=== Resulting AM ===")
print(f"Layer stats: {', '.join(f'{n+1}-gram={sw_am.layer_nnz(n)}' for n in range(config.max_n))}")

=== Sliding Window: 'The quick brown fox jumps' ===

Generated 9 edges:
  Layer 0 (1-gram): row=10385, col=10319
  Layer 1 (2-gram): row=21255, col=16875
  Layer 2 (3-gram): row=3130, col=14323
  Layer 0 (1-gram): row=10319, col=16875
  Layer 1 (2-gram): row=8846, col=14323
  Layer 2 (3-gram): row=11397, col=11661
  Layer 0 (1-gram): row=16875, col=14323
  Layer 1 (2-gram): row=9813, col=11661
  Layer 0 (1-gram): row=14323, col=11661
  ...

=== Resulting AM ===
Layer stats: 1-gram=4, 2-gram=3, 3-gram=2
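
The rows and columns above are hash indices, so the text behind each edge is not visible. The sketch below enumerates the same window positions without hashing and yields the n-gram text and its successor; `sliding_window_ngrams` is a hypothetical helper that follows the loop structure of `sliding_window_edges`.

```python
def sliding_window_ngrams(tokens, max_n=3):
    """Yield (layer, ngram_text, next_token) for every n-gram that has a successor."""
    for i in range(len(tokens)):
        for n in range(min(max_n, len(tokens) - i)):
            end = i + n + 1
            if end < len(tokens):  # an edge needs a following token as its target
                yield n, " ".join(tokens[i:end]), tokens[end]


triples = list(sliding_window_ngrams("The quick brown fox jumps".split()))
for n, ngram, nxt in triples:
    print(f"Layer {n} ({n + 1}-gram): {ngram!r} -> {nxt!r}")
```

For the five-token sentence this yields 9 triples (4 in layer 0, 3 in layer 1, 2 in layer 2), in line with the layer stats above.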


## 11. BasicHLLSet Aggregation

For BasicHLLSet construction, we aggregate across all n-gram layers:
- Row's BasicHLLSet = all columns connected across all layers
- This preserves the full context regardless of n-gram order

In [10]:
# Build HRT with sliding window edges
sw_hrt = SparseHRT3D(
    am=sw_am,
    lattice=SparseLattice3D.from_sparse_am(sw_am),
    config=config
)

# Pick one active row (set order is arbitrary); compare with None so row 0 is not skipped
active_row = next(iter(sw_hrt.am.all_active_rows), None)

if active_row is not None:
    print(f"=== BasicHLLSets for Row {active_row} ===")
    basics = sw_hrt.basic_hllsets_for_row(active_row)
    
    # Group by layer
    for n in range(config.max_n):
        layer_basics = [b for b in basics if b.n == n]
        if layer_basics:
            print(f"  Layer {n} ({n+1}-grams): {len(layer_basics)} connections")
            for b in layer_basics[:3]:
                print(f"    {b}")
            if len(layer_basics) > 3:
                print(f"    ... ({len(layer_basics) - 3} more)")

=== BasicHLLSets for Row 11397 ===
  Layer 2 (3-grams): 1 connections
    BasicHLLSet3D(n=2, reg=530, zeros=1)


## 12. Memory Comparison

Dense 3D vs Sparse 3D:
- Dense: max_n × dim × dim × 4 bytes = **12 GB** (for 3 layers)
- Sparse: nnz × 28 bytes = **~3 MB** (for 100K edges)

In [11]:
print("=== Memory Comparison ===")
print()

# Dense 3D
max_n = config.max_n
dim = config.dimension
dense_bytes = max_n * dim * dim * 4
dense_gb = dense_bytes / (1024**3)
print(f"Dense 3D AM ({max_n} × {dim:,} × {dim:,} float32):")
print(f"  {dense_gb:.1f} GB per HRT")
print(f"  {dense_gb * 3:.1f} GB for 3 HRTs (impossible!)")
print()

# Sparse 3D (100K edges)
n_edges = 100_000
sparse_bytes = n_edges * 28  # 3 int64 + 1 float32
sparse_mb = sparse_bytes / (1024**2)
print(f"Sparse 3D AM ({n_edges:,} edges):")
print(f"  {sparse_mb:.1f} MB per HRT")
print(f"  {sparse_mb * 3:.1f} MB for 3 HRTs (easy!)")
print()

# Savings
ratio = dense_bytes / sparse_bytes
print(f"Memory savings: {ratio/1000:.0f}K× smaller!")

=== Memory Comparison ===

Dense 3D AM (3 × 32,770 × 32,770 float32):
  12.0 GB per HRT
  36.0 GB for 3 HRTs (impossible!)

Sparse 3D AM (100,000 edges):
  2.7 MB per HRT
  8.0 MB for 3 HRTs (easy!)

Memory savings: 5K× smaller!


## 13. Large Scale Test - 100K Edges

In [12]:
print("=== Large Scale Test ===")
print()

# Generate 100K edges distributed across layers
n_edges = 100_000
large_edges = []
for i in range(n_edges):
    n = i % config.max_n  # Distribute: 0, 1, 2, 0, 1, 2, ...
    row = i % config.dimension
    col = (i + 1) % config.dimension
    large_edges.append(Edge3D(n, row, col, float(i % 100 + 1)))

# Create AM
start = time.time()
large_am = SparseAM3D.from_edges(config, large_edges)
am_time = time.time() - start
print(f"Created AM: {large_am.nnz:,} edges in {am_time*1000:.1f}ms")

# Create lattice
start = time.time()
large_lattice = SparseLattice3D.from_sparse_am(large_am)
lattice_time = time.time() - start
print(f"Built lattice in {lattice_time*1000:.1f}ms")

# Complete HRT
large_hrt = SparseHRT3D(
    am=large_am,
    lattice=large_lattice,
    config=config
)

print()
print(f"=== Result ===")
print(f"{large_hrt}")
print(f"Layer distribution: {large_hrt.layer_stats()}")
print(f"Memory: {large_hrt.memory_mb():.2f} MB")

=== Large Scale Test ===

Created AM: 100,000 edges in 110.7ms
Built lattice in 783.1ms

=== Result ===
SparseHRT3D(shape=(3, 32770, 32770), nnz=100000, step=0, layers={0: 33334, 1: 33333, 2: 33333})
Layer distribution: {0: 33334, 1: 33333, 2: 33333}
Memory: 2.67 MB


## 14. Merge Test

In [13]:
# Create another HRT with overlapping edges
large_edges2 = []
for i in range(n_edges):
    n = (i + 1) % config.max_n  # Offset layer distribution
    row = (i + 1000) % config.dimension
    col = (i + 1001) % config.dimension
    large_edges2.append(Edge3D(n, row, col, float(i % 50 + 1)))

large_am2 = SparseAM3D.from_edges(config, large_edges2)
large_hrt2 = SparseHRT3D(
    am=large_am2,
    lattice=SparseLattice3D.from_sparse_am(large_am2),
    config=config
)

print(f"HRT 1: {large_hrt.nnz:,} edges")
print(f"HRT 2: {large_hrt2.nnz:,} edges")
print()

# Merge
start = time.time()
merged = large_hrt.merge(large_hrt2)
merge_time = time.time() - start

print(f"=== Merged ===")
print(f"Edges: {merged.nnz:,}")
print(f"Time: {merge_time*1000:.1f}ms")
print(f"Memory: {merged.memory_mb():.2f} MB")

HRT 1: 100,000 edges
HRT 2: 100,000 edges

=== Merged ===
Edges: 98,310
Time: 1214.1ms
Memory: 2.63 MB
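
The merged count of 98,310 rather than 200,000 can be checked by counting distinct coordinates. The sketch assumes that `merge` keeps one entry per `(n, row, col)`; it says nothing about how values at shared coordinates are combined.

```python
DIM = 32770       # config.dimension from the output above
MAX_N = 3
N_EDGES = 100_000

# Coordinates generated by the two edge loops above, values ignored
coords_1 = {(i % MAX_N, i % DIM, (i + 1) % DIM) for i in range(N_EDGES)}
coords_2 = {((i + 1) % MAX_N, (i + 1000) % DIM, (i + 1001) % DIM) for i in range(N_EDGES)}

print(len(coords_1), len(coords_2), len(coords_1 | coords_2))
```

Both generators only ever produce edges of the form `(n, row, row + 1)`, and there are 3 × 32,770 = 98,310 such coordinates, which each loop exhausts before reaching 100,000 iterations. The reported `nnz=100000` for the individual HRTs suggests that `from_edges` stores repeated coordinates as separate entries, although that is an inference from the printed numbers.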


Separating the layers by n also makes it possible to disambiguate each n-gram order in parallel:



```text
┌─────────────────────────────────────────────────────────────────────────┐
│                    PARALLEL DISAMBIGUATION BY N                         │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Input: Random HLLSet to disambiguate                                   │
│                                                                         │
│  Step 1: Decompose → List[BasicHLLSet]                                  │
│                                                                         │
│  Step 2: PARALLEL split by n-gram:                                      │
│                                                                         │
│    ┌──────────────┐   ┌──────────────┐   ┌──────────────┐               │
│    │  AM[0,:,:]   │   │  AM[1,:,:]   │   │  AM[2,:,:]   │               │
│    │  1-grams     │   │  2-grams     │   │  3-grams     │               │
│    │  GPU Core 0  │   │  GPU Core 1  │   │  GPU Core 2  │               │
│    └──────────────┘   └──────────────┘   └──────────────┘               │
│          ↓                  ↓                  ↓                        │
│    Candidates_1        Candidates_2       Candidates_3                  │
│                                                                         │
│  Step 3: Intersect candidates → Final disambiguation                    │
│                                                                         │
│  BEFORE: Serial O(n × disambiguate)                                     │
│  NOW:    Parallel O(disambiguate) with n GPU cores!                     │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```



A layered transition matrix **W[n, i, j]** is the natural next step:



```text
┌─────────────────────────────────────────────────────────────────────────┐
│                         FUTURE: 3D W MATRIX                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  Current 2D W:  W[i,j] = P(j | i) = transition probability              │
│                                                                         │
│  Future 3D W:   W[n,i,j] = P(j | i, n-gram context)                     │
│                                                                         │
│    W[0,:,:] = 1-gram transitions                                        │
│    W[1,:,:] = 2-gram transitions (more context → better P)              │
│    W[2,:,:] = 3-gram transitions (most context → best P)                │
│                                                                         │
│  Even without 3D W, the 3D AM already enables:                          │
│    • N-gram aware BasicHLLSet retrieval                                 │
│    • Parallel disambiguation per layer                                  │
│    • Context covariance at each n-gram level                            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```
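
One way such a layered W could be derived is to normalize each row of each AM layer so that its weights sum to 1. The sketch below is speculative: the 3D W is described only as future work, row normalization is an assumption about how `P(j | i, n)` would be estimated, and `row_normalize` is a hypothetical helper working on plain tuples.

```python
from collections import defaultdict


def row_normalize(edge_tuples):
    """Map (n, row, col, weight) tuples to W[(n, row)][col] = weight / row total."""
    totals = defaultdict(float)
    for n, row, _col, weight in edge_tuples:
        totals[(n, row)] += weight

    w = defaultdict(dict)
    for n, row, col, weight in edge_tuples:
        if totals[(n, row)] > 0:  # skip rows with no positive mass
            w[(n, row)][col] = w[(n, row)].get(col, 0.0) + weight / totals[(n, row)]
    return dict(w)


# Illustrative weights only
example = [(0, 100, 200, 3.0), (0, 100, 201, 2.0), (1, 100, 200, 5.0), (1, 100, 202, 1.0)]
W = row_normalize(example)
for key in sorted(W):
    print(key, W[key])
```

Each `(n, row)` entry is then a probability distribution over successor columns for that layer, which is the shape the diagram gives to `W[n, i, j]`.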



The 3D AM delivers **three capabilities** in one architectural change:

| Capability | Mechanism |
|------------|-----------|
| **Decomposition** | HLLSet → BasicHLLSets with n-gram tags |
| **Parallel split** | Each layer `AM[n,:,:]` independent |
| **Parallel disambiguation** | GPU cores process layers simultaneously |

This is the Karoubi envelope in action: the explicit splits (n-gram layers) enable **parallel retraction** onto each image object!

## Summary: 3D Sparse HRT Architecture

### 2D vs 3D AM

| Aspect | 2D AM | 3D AM |
|--------|-------|-------|
| Shape | (dim, dim) | (max_n, dim, dim) |
| N-grams | Mixed | Separated by layer |
| Self-loops | Problematic | Within-layer only |
| Disambiguation | Serial | **Parallel per layer** |
| BasicHLLSet | Direct | Aggregate across layers |
| Memory (dense) | 4 GB | 12 GB (impossible) |
| Memory (sparse 100K) | ~2 MB | ~3 MB |

### Sheaf-Based Retrieval

| Step | Operation | Output |
|------|-----------|--------|
| 1 | Extract sub-lattice | Active rows from query |
| 2 | Extract sub-AM | Edges per layer (parallel) |
| 3 | Project to clouds | Token sets per n-gram |
| 4 | Intersect clouds | Global section (candidates) |
| 5 | Rank by weights | Ordered results |

### Karoubi Envelope Connection

| Karoubi Concept | 3D HRT Implementation |
|-----------------|----------------------|
| Idempotent e | merge(A, A) = A |
| Split objects | N-gram layers AM[n,:,:] |
| Projection p | layer_edges(n) |
| Inclusion i | with_ngram_edge(n, ...) |
| Image objects | BasicHLLSet3D |
| Pullback | Edge weight = \|row ∩ col\| |

**Version**: Fractal Manifold Core v0.6.0

**Key Insights**:
1. `AM[n, row, col]` separates n-gram orders, reducing ambiguity
2. Token clouds form a **sheaf** over the sub-lattice
3. Global section = intersection of all clouds = disambiguated tokens
4. Parallel processing by n-gram layer on GPU

## 15. Sheaf-Based Retrieval Pipeline

The 3D architecture enables a powerful **sheaf-based retrieval**:

```text
┌─────────────────────────────────────────────────────────────────────────┐
│              GENERALIZED HLLSet PROJECTION / RETRIEVAL                  │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  INPUT: Query HLLSet Q                                                  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ STEP 1: Extract Sub-Lattice from W                              │    │
│  │   Q → decompose → List[BasicHLLSet]                             │    │
│  │   W[active_rows, :] → Sub-Lattice                               │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                         ↓                                               │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ STEP 2: Extract Sub-Tensor from 3D AM                           │    │
│  │   AM[n, active_rows, :] for n ∈ {0,1,2}  (PARALLEL)             │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                         ↓                                               │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ STEP 3: Project Sub-AM to Token Clouds (Sheaves)                │    │
│  │   Layer 0 → Cloud₀ (1-gram context)                             │    │
│  │   Layer 1 → Cloud₁ (2-gram context)                             │    │
│  │   Layer 2 → Cloud₂ (3-gram context)                             │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                         ↓                                               │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │ STEP 4: Intersect Clouds → Global Section (Candidates)          │    │
│  │   Candidates = Cloud₀ ∩ Cloud₁ ∩ Cloud₂                         │    │
│  │   Ordering from Sub-AM edge weights                             │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                         ↓                                               │
│  OUTPUT: Ranked token candidates                                        │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

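**Sheaf Property**: With consistent data, Cloud₂ ⊆ Cloud₁ ⊆ Cloud₀ (more context → smaller cloud). Nothing enforces this nesting, though: in the demo below each layer contributes one column the others lack, so the global section is computed as an explicit intersection.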

In [14]:
# Build HRT with edges where col 200 appears in ALL layers
retrieval_edges = [
    # Row 100: col 200 appears in all 3 layers (strong candidate!)
    Edge3D(n=0, row=100, col=200, value=3.0),  # 1-gram
    Edge3D(n=0, row=100, col=201, value=2.0),  # 1-gram only
    Edge3D(n=1, row=100, col=200, value=5.0),  # 2-gram
    Edge3D(n=1, row=100, col=202, value=1.0),  # 2-gram only
    Edge3D(n=2, row=100, col=200, value=7.0),  # 3-gram
    Edge3D(n=2, row=100, col=203, value=2.0),  # 3-gram only
]

ret_am = SparseAM3D.from_edges(config, retrieval_edges)
ret_hrt = SparseHRT3D(
    am=ret_am,
    lattice=SparseLattice3D.from_sparse_am(ret_am),
    config=config
)

print("=== Sheaf Retrieval Demo ===")
print(f"HRT: {ret_hrt}")
print()

# Query using row 100
active_rows = frozenset([100])

# STEP 2: Extract sub-AM (can be parallel per layer!)
sub_am = ret_hrt.extract_sub_am(active_rows)
print("Step 2: Extract Sub-AM")
for n, edges in sub_am.items():
    print(f"  Layer {n} ({n+1}-gram): {[e.col for e in edges]}")
print()

# STEP 3: Project to token clouds (sheaves)
clouds = ret_hrt.project_to_clouds(sub_am)
print("Step 3: Token Clouds (Sheaves)")
for n, cloud in clouds.items():
    print(f"  Cloud_{n} ({n+1}-gram): {sorted(cloud)}")
print()

# STEP 4: Global section (intersection)
global_section = ret_hrt.intersect_clouds(clouds)
print("Step 4: Global Section (∩ all clouds)")
print(f"  Candidates appearing in ALL n-gram layers: {sorted(global_section)}")
print()

# STEP 5: Rank by aggregated weight
ranked = ret_hrt.rank_candidates(global_section, sub_am)
print("Step 5: Ranked Candidates")
for col, score in ranked:
    print(f"  col={col}: score={score} (sum of all layer weights)")

=== Sheaf Retrieval Demo ===
HRT: SparseHRT3D(shape=(3, 32770, 32770), nnz=6, step=0, layers={0: 2, 1: 2, 2: 2})

Step 2: Extract Sub-AM
  Layer 0 (1-gram): [200, 201]
  Layer 1 (2-gram): [200, 202]
  Layer 2 (3-gram): [200, 203]

Step 3: Token Clouds (Sheaves)
  Cloud_0 (1-gram): [200, 201]
  Cloud_1 (2-gram): [200, 202]
  Cloud_2 (3-gram): [200, 203]

Step 4: Global Section (∩ all clouds)
  Candidates appearing in ALL n-gram layers: [200]

Step 5: Ranked Candidates
  col=200: score=15.0 (sum of all layer weights)


## 16. retrieve() Method - Full Pipeline

The `retrieve()` method combines all steps with two modes:
- **Union mode**: Return tokens appearing in ANY layer
- **Intersection mode**: Return tokens appearing in ALL layers (global section)
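
The two modes differ only in a filter on how many layers a column appears in. The sketch below works on plain tuples with the demo weights from section 15; `retrieve_sketch` is a hypothetical function, and breaking score ties by column index is an assumption.

```python
from collections import defaultdict


def retrieve_sketch(edge_tuples, active_rows, max_n, require_all_layers):
    """Return (col, total, {layer: weight}) tuples, highest total first."""
    per_col = defaultdict(dict)  # col -> {layer: accumulated weight}
    for n, row, col, weight in edge_tuples:
        if row in active_rows:
            per_col[col][n] = per_col[col].get(n, 0.0) + weight

    results = [
        (col, sum(layers.values()), layers)
        for col, layers in per_col.items()
        # Intersection mode keeps only columns seen in every layer
        if not require_all_layers or len(layers) == max_n
    ]
    return sorted(results, key=lambda r: (-r[1], r[0]))


demo = [
    (0, 100, 200, 3.0), (0, 100, 201, 2.0),
    (1, 100, 200, 5.0), (1, 100, 202, 1.0),
    (2, 100, 200, 7.0), (2, 100, 203, 2.0),
]
union = retrieve_sketch(demo, {100}, 3, require_all_layers=False)
inter = retrieve_sketch(demo, {100}, 3, require_all_layers=True)
print([col for col, _, _ in union])
print([(col, total) for col, total, _ in inter])
```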

In [15]:
# Convert row index 100 to BasicHLLSet3D
# Index formula: idx = 1 + reg * max_zeros + (zeros - 1)
# So: 100 = 1 + reg * 22 + (zeros - 1)  where max_zeros = 32 - 10 = 22
idx = 100
max_zeros = config.h_bits - config.p_bits
reg = (idx - 1) // max_zeros
zeros = (idx - 1) % max_zeros + 1
print(f"Index {idx} → reg={reg}, zeros={zeros}")

# Create query basics for all n-gram layers
query_basics = [BasicHLLSet3D(n=n, reg=reg, zeros=zeros) for n in range(config.max_n)]
print(f"Query basics: {query_basics}")
print()

# UNION MODE: Return tokens from ANY layer
print("=== UNION MODE (any layer) ===")
results_union = ret_hrt.retrieve(query_basics, require_all_layers=False)
print(f"Found {len(results_union)} candidates:")
for col, total, layers in results_union:
    layer_str = ", ".join(f"L{n}={v:.1f}" for n, v in sorted(layers.items()))
    print(f"  col={col}: total={total:.1f} ({layer_str})")
print()

# INTERSECTION MODE: Return tokens from ALL layers (global section)
print("=== INTERSECTION MODE (all layers) ===")
results_inter = ret_hrt.retrieve(query_basics, require_all_layers=True)
print(f"Found {len(results_inter)} candidates (global section):")
for col, total, layers in results_inter:
    layer_str = ", ".join(f"L{n}={v:.1f}" for n, v in sorted(layers.items()))
    print(f"  col={col}: total={total:.1f} ({layer_str})")

Index 100 → reg=4, zeros=12
Query basics: [BasicHLLSet3D(n=0, reg=4, zeros=12), BasicHLLSet3D(n=1, reg=4, zeros=12), BasicHLLSet3D(n=2, reg=4, zeros=12)]

=== UNION MODE (any layer) ===
Found 4 candidates:
  col=200: total=15.0 (L0=3.0, L1=5.0, L2=7.0)
  col=201: total=2.0 (L0=2.0)
  col=203: total=2.0 (L2=2.0)
  col=202: total=1.0 (L1=1.0)

=== INTERSECTION MODE (all layers) ===
Found 1 candidates (global section):
  col=200: total=15.0 (L0=3.0, L1=5.0, L2=7.0)
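
The index formula from the cell comments can be written as a pair of inverse functions. The helper names are hypothetical; the formula itself is the one stated above, with `max_zeros = h_bits - p_bits`.

```python
def index_to_reg_zeros(idx: int, p_bits: int = 10, h_bits: int = 32) -> tuple:
    """Invert idx = 1 + reg * max_zeros + (zeros - 1)."""
    if idx < 1:
        raise ValueError("indices below 1 do not map to a (reg, zeros) pair under this formula")
    max_zeros = h_bits - p_bits
    return (idx - 1) // max_zeros, (idx - 1) % max_zeros + 1


def reg_zeros_to_index(reg: int, zeros: int, p_bits: int = 10, h_bits: int = 32) -> int:
    """Apply idx = 1 + reg * max_zeros + (zeros - 1)."""
    max_zeros = h_bits - p_bits
    return 1 + reg * max_zeros + (zeros - 1)


print(index_to_reg_zeros(100))      # (4, 12), as printed above
print(reg_zeros_to_index(530, 1))   # 11661
```

The second call ties back to section 11: column 11661 from the sliding-window output corresponds to the `BasicHLLSet3D(n=2, reg=530, zeros=1)` reported for row 11397.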


## 17. Sheaf Interpretation

The token clouds form a **sheaf** over the sub-lattice:

```text
┌─────────────────────────────────────────────────────────────────────────┐
│                    TOKEN CLOUDS AS SHEAVES                              │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  A SHEAF assigns data to each "open set" consistently                   │
│                                                                         │
│  In our case:                                                           │
│    Base space     = Sub-Lattice (connectivity from W)                   │
│    Open sets      = N-gram layers (AM[n,:,:])                           │
│    Stalk at point = Token cloud for that (row, layer)                   │
│    Restriction    = More context → smaller cloud                        │
│                                                                         │
│       Cloud₂ (3-grams) ⊆ Cloud₁ (2-grams) ⊆ Cloud₀ (1-grams)            │
│                                                                         │
│       Global Section = ∩ all clouds = DISAMBIGUATED TOKENS              │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

**Key Insight**: The global section (intersection of all clouds) gives us the most context-consistent tokens!
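
Whether the restriction chain actually holds depends on the data, so it is worth checking rather than assuming. The sketch below uses the clouds printed in the section 15 demo; the variable names are hypothetical.

```python
# Clouds as printed in Step 3 of the section 15 demo
clouds = {0: {200, 201}, 1: {200, 202}, 2: {200, 203}}

layers = sorted(clouds)
# Nested means every higher-order cloud is a subset of the one below it
nested = all(clouds[hi] <= clouds[lo] for lo, hi in zip(layers, layers[1:]))
global_section = set.intersection(*(clouds[n] for n in layers))

print(nested, global_section)
```

For the demo data the clouds are not nested, yet the intersection still yields the single column present in every layer. The nesting is therefore better read as the ideal case than as a guarantee.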