# RLM Tutorial: Progressive Disclosure Over RDF Graphs

This notebook demonstrates the RLM (Recursive Language Model) architecture
with working examples. All cells were executed with real Claude API calls.

In [1]:
import sys
from pathlib import Path

## 1. Core RLM Loop

The `llm_query()` function delegates a question to Claude and stores the result in the namespace dict `ns` under the given `name`.

In [2]:
from rlm.core import llm_query

ns = {}
result = llm_query("What is 2+2? Answer with just the number.", ns, name='math')
print(f"Result: {result}")
print(f"Stored as: ns['math']")

Result: 4
Stored as: ns['math']


The `rlm_run()` function runs the full RLM loop: the model emits code,
the code is executed in a REPL, and the loop repeats until the model
produces an answer or `max_iters` is reached.

In [3]:
from rlm.core import rlm_run

ns = {}
answer, iterations, ns = rlm_run(
    "Calculate the sum of squares of 1, 2, and 3.",
    "You can use Python to calculate.",
    ns=ns,
    max_iters=3
)
print(f"Answer: {answer}")
print(f"Iterations: {len(iterations)}")

Answer: 14
Iterations: 1
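
The emit, execute, iterate cycle described above can be sketched without any API call. This is a toy illustration only, not the implementation of `rlm_run()`: `toy_model`, `toy_rlm_loop`, and the `FINAL` variable convention are hypothetical names introduced here, and a hard-coded stub stands in for the LLM.

```python
import contextlib
import io


def toy_model(query, history):
    """Stand-in for an LLM call: returns Python source for the next step."""
    if not history:
        # First turn: compute an intermediate value and print it.
        return "squares = [n * n for n in (1, 2, 3)]\nprint(squares)"
    # Second turn: use the stored intermediate value to produce the answer.
    return "FINAL = sum(squares)"


def toy_rlm_loop(query, ns, max_iters=3):
    """Run emitted code in a shared namespace until FINAL is set."""
    history = []
    for _ in range(max_iters):
        code = toy_model(query, history)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, ns)  # state persists in ns between iterations
        history.append((code, buf.getvalue()))
        if "FINAL" in ns:
            return ns["FINAL"], history
    return None, history  # iteration budget exhausted


answer, history = toy_rlm_loop("Sum of squares of 1, 2, 3", {})
print(answer, len(history))
```

The point of the sketch is the shared namespace: the second snippet can use `squares` only because the first snippet's variables survive between iterations.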


## 2. Ontology Loading

Load RDF ontologies and explore them with bounded view functions.
The key insight: we never dump the full graph into context.

In [4]:
from rlm.ontology import setup_ontology_context

ns = {}
setup_ontology_context('ontology/prov.ttl', ns, name='prov')
print(ns['prov_meta'].summary())

Graph 'prov': 1,664 triples
Classes: 59
Properties: 89
Individuals: 1
Namespaces: brick, csvw, dc, dcat, dcmitype, dcterms, dcam, doap, foaf, geo, odrl, org, prof, qb, schema, sh, skos, sosa, ssn, time, vann, void, wgs, owl, rdf, rdfs, xsd, xml, prov


In [5]:
# Search for classes related to "Activity"
results = ns['prov_search_by_label']('Activity', limit=5)
for uri, label in results:
    print(f"{label}: {uri}")

Activity: http://www.w3.org/ns/prov#Activity
ActivityInfluence: http://www.w3.org/ns/prov#ActivityInfluence
activity: http://www.w3.org/ns/prov#activity
hadActivity: http://www.w3.org/ns/prov#hadActivity
activityOfInfluence: http://www.w3.org/ns/prov#activityOfInfluence


In [6]:
# Get bounded description of Activity class
desc = ns['prov_describe_entity']('http://www.w3.org/ns/prov#Activity', limit=10)
print(f"Label: {desc['label']}")
print(f"Types: {desc['types']}")
print(f"Comment: {desc['comment'][:100] if desc['comment'] else 'None'}...")
print(f"Outgoing triples (sample): {len(desc['outgoing_sample'])}")

Label: Activity
Types: ['http://www.w3.org/2002/07/owl#Class']
Comment: None...
Outgoing triples (sample): 10
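
To make the idea of a bounded view concrete, here is a minimal sketch over a plain list of triples. It is an assumption-laden stand-in, not the code behind `prov_search_by_label` or `prov_describe_entity`: the `TRIPLES` sample and both function bodies are made up for illustration. What it shows is that every view takes a `limit` and stops early, so the caller never receives the whole graph.

```python
from itertools import islice

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
PROV = "http://www.w3.org/ns/prov#"

# Tiny hand-written sample; a real graph would hold thousands of triples.
TRIPLES = [
    (PROV + "Activity", RDFS_LABEL, "Activity"),
    (PROV + "ActivityInfluence", RDFS_LABEL, "ActivityInfluence"),
    (PROV + "Agent", RDFS_LABEL, "Agent"),
    (PROV + "hadActivity", RDFS_LABEL, "hadActivity"),
]


def search_by_label(triples, text, limit=5):
    """Return at most `limit` (uri, label) pairs whose label contains `text`."""
    needle = text.lower()
    matches = (
        (s, o) for s, p, o in triples
        if p == RDFS_LABEL and needle in o.lower()
    )
    return list(islice(matches, limit))


def outgoing_sample(triples, uri, limit=10):
    """Return at most `limit` (predicate, object) pairs for one subject."""
    return list(islice(((p, o) for s, p, o in triples if s == uri), limit))


for uri, label in search_by_label(TRIPLES, "activity", limit=2):
    print(f"{label}: {uri}")
```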


## 3. RLM with Ontology Exploration

Combine the RLM loop with the ontology tools. The model calls the bounded
view functions from its generated code and discovers information step by step.

In [7]:
from rlm.core import rlm_run
from rlm.ontology import setup_ontology_context

ns = {}
setup_ontology_context('ontology/prov.ttl', ns, name='prov')

query = "What is prov:Activity? Use search_by_label and describe_entity."
context = ns['prov_meta'].summary()

answer, iterations, ns = rlm_run(
    query,
    context,
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer: {answer[:500] if answer else 'No answer'}...")
print(f"Iterations: {len(iterations)}")

Answer: [Max iterations] Last output: Description of prov:Activity:
{'uri': 'http://www.w3.org/ns/prov#Activity', 'label': 'Activity', 'types': ['http://www.w3.org/2002/07/owl#Class'], 'comment': None, 'outgoing_sample': [('http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.w3.org/2002/07/owl#Class'), ('http://www.w3.org/2000/01/rdf-schema#isDefinedBy', 'http://www.w3.org/ns/prov-o#'), ('http://www.w3.org/2000/01/rdf-schema#label', 'Activity'), ('http://www.w3.org/2002/07/owl#disjointWith', '...
Iterations: 3


In [8]:
# Show what code the LLM executed
for i, it in enumerate(iterations):
    if it.code_blocks:
        print(f"Iteration {i}:")
        for cb in it.code_blocks:
            print(f"  Code: {cb.code[:100]}...")

Iteration 0:
  Code: print("Context type:", type(context))
print("Context length:", len(context))
print("Context content:...
Iteration 1:
  Code: # First, search for "Activity" using search_by_label
activity_search = search_by_label("Activity")
p...
Iteration 2:
  Code: # Now describe the prov:Activity entity
activity_description = describe_entity("http://www.w3.org/ns...


## 4. Dataset Memory

Store discovered facts in an RDF Dataset with provenance tracking.
Facts persist across queries.

In [9]:
from rlm.dataset import setup_dataset_context

ns = {}
setup_dataset_context(ns)
print(ns['dataset_stats']())

Dataset 'ds' (session: d1ab8fce)
mem: 0 triples
prov: 0 events
work graphs: 0
onto graphs: 0


In [10]:
# Add a fact we discovered
ns['mem_add'](
    'http://example.org/myAnalysis',
    'http://www.w3.org/ns/prov#wasGeneratedBy',
    'http://example.org/rlmSession1'
)

# Check stats
print(ns['dataset_stats']())

Failed to convert Literal lexical form to value. Datatype=http://www.w3.org/2001/XMLSchema#dateTime, Converter=<built-in method fromisoformat of type object at 0x106d5b400>
Traceback (most recent call last):
  File "/Users/cvardema/uvws/.venv/lib/python3.12/site-packages/rdflib/term.py", line 2262, in _castLexicalToPython
    return conv_func(lexical)  # type: ignore[arg-type]
           ^^^^^^^^^^^^^^^^^^
ValueError: Invalid isoformat string: '2026-01-19T17:01:28.071489+00:00Z'


Dataset 'ds' (session: d1ab8fce)
mem: 1 triples
prov: 7 events
work graphs: 0
onto graphs: 0
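
The warning in the output above reports a timestamp ending in `+00:00Z`, which carries both a UTC offset and a `Z` suffix. One plausible cause, stated here as an assumption since the library code is not shown, is that `"Z"` was appended to the result of `isoformat()` on a timezone-aware datetime. The sketch below reproduces the malformed string and shows two well-formed alternatives.

```python
from datetime import datetime, timezone

# Fixed instant matching the timestamp in the warning above.
now = datetime(2026, 1, 19, 17, 1, 28, 71489, tzinfo=timezone.utc)

bad = now.isoformat() + "Z"                      # offset plus "Z": malformed
good_offset = now.isoformat()                    # explicit +00:00 offset
good_z = now.isoformat().replace("+00:00", "Z")  # "Z" instead of the offset

try:
    datetime.fromisoformat(bad)
    bad_parses = True
except ValueError:
    bad_parses = False

print(bad, "parses:", bad_parses)
print(good_offset)
print(good_z)
```

Either well-formed variant is a valid `xsd:dateTime` lexical form. The triple was still stored, as the `mem: 1 triples` line shows, so the warning concerns the provenance timestamp rather than the fact itself.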


In [11]:
# Query the memory graph
results = ns['mem_query']("""
    SELECT ?s ?p ?o WHERE { ?s ?p ?o }
""")
for r in results:
    print(r)

{'s': 'http://example.org/myAnalysis', 'p': 'http://www.w3.org/ns/prov#wasGeneratedBy', 'o': 'http://example.org/rlmSession1'}


## 5. SPARQL Result Handles

Query results return handles with metadata, not raw data.
This enables pagination and filtering without loading everything.

In [12]:
from rlm.sparql_handles import SPARQLResultHandle

# Simulating a large result set
handle = SPARQLResultHandle(
    rows=[{'name': f'Item{i}', 'value': i} for i in range(100)],
    result_type='select',
    query='SELECT ?name ?value WHERE { ... }',
    endpoint='local',
    columns=['name', 'value'],
    total_rows=100
)

print(handle.summary())
print(f"First 3 rows: {handle.rows[:3]}")

SELECT: 100 rows, columns=['name', 'value']
First 3 rows: [{'name': 'Item0', 'value': 0}, {'name': 'Item1', 'value': 1}, {'name': 'Item2', 'value': 2}]
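
The handle pattern can be sketched as a small wrapper that exposes a summary first and rows only on request. `ToyResultHandle` and its `page()` and `where()` methods are hypothetical names for illustration; the notebook only shows `summary()` and `rows` on `SPARQLResultHandle`, so nothing here should be read as that class's API.

```python
from dataclasses import dataclass


@dataclass
class ToyResultHandle:
    rows: list
    columns: list

    def summary(self):
        """Describe the result without exposing any rows."""
        return f"SELECT: {len(self.rows)} rows, columns={self.columns}"

    def page(self, offset=0, limit=10):
        """Return one bounded slice of rows."""
        return self.rows[offset:offset + limit]

    def where(self, predicate):
        """Return a new handle holding only the rows that match."""
        kept = [row for row in self.rows if predicate(row)]
        return ToyResultHandle(kept, self.columns)


toy = ToyResultHandle(
    rows=[{"name": f"Item{i}", "value": i} for i in range(100)],
    columns=["name", "value"],
)
print(toy.summary())
print(toy.page(offset=10, limit=2))
print(toy.where(lambda row: row["value"] >= 95).summary())
```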


## 6. Procedural Memory

Store and retrieve methods learned from past trajectories.
Retrieval ranks stored items by their BM25 similarity to the new task.

In [13]:
from rlm.procedural_memory import MemoryStore, MemoryItem, retrieve_memories
from datetime import datetime, timezone
import uuid

store = MemoryStore()

# Add a learned procedure
item = MemoryItem(
    id=str(uuid.uuid4()),
    title='Find Activity classes in PROV',
    description='How to discover Activity-related classes',
    content='1. Use search_by_label("Activity")\n2. Use describe_entity() on results',
    source_type='success',
    task_query='find activities in PROV',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['prov', 'ontology', 'exploration']
)
store.add(item)

print(f"Store has {len(store.memories)} memories")

Store has 1 memories


In [14]:
# Retrieve relevant memories for a new task
retrieved = retrieve_memories(store, 'how to explore PROV ontology activities', k=1)
for mem in retrieved:
    print(f"Title: {mem.title}")
    print(f"Content:\n{mem.content}")

Title: Find Activity classes in PROV
Content:
1. Use search_by_label("Activity")
2. Use describe_entity() on results
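
For readers unfamiliar with BM25, the ranking can be sketched in a few lines of plain Python. This is a generic textbook formulation with whitespace tokenization and common default parameters (`k1=1.5`, `b=0.75`); it is not the code inside `retrieve_memories`, and the sample documents are invented.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization penalizes long documents.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "find activity classes in prov",
    "validate dataset shapes with shacl",
    "paginate large sparql results",
]
scores = bm25_scores("explore prov activity", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Because the match is lexical, a query term such as "activities" would not match "activity" in this sketch without stemming or some other normalization.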


## 7. SHACL Shape Indexing

Index SHACL shapes for schema discovery and constraint validation.

In [15]:
from rlm.shacl_examples import detect_shacl, build_shacl_index, search_shapes
from rdflib import Graph

# Load DCAT-AP shapes
g = Graph()
g.parse('ontology/dcat-ap/dcat-ap-SHACL.ttl')

# Detect SHACL content
detection = detect_shacl(g)
print(f"Node shapes: {detection['node_shapes']}")
print(f"Property shapes: {detection['property_shapes']}")

Node shapes: 42
Property shapes: 0


In [16]:
# Build index and search
index = build_shacl_index(g)
results = search_shapes(index, 'dataset', limit=3)

for r in results:
    print(f"{r['uri'].split('#')[-1]}: targets {r['targets']}")

dcat:CatalogShape: targets ['http://www.w3.org/ns/dcat#Catalog']
dcat:DatasetShape: targets ['http://www.w3.org/ns/dcat#Dataset']
dcat:DataServiceShape: targets ['http://www.w3.org/ns/dcat#DataService']
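
A keyword index over shapes can be sketched as an inverted index from words to shape positions. The records, the `example.org` shape URIs, and the functions `keywords`, `build_index`, and `search` are all hypothetical; the internals of `build_shacl_index` and `search_shapes` are not shown in this notebook.

```python
import re
from collections import defaultdict

# Hypothetical shape records; the URIs under example.org are placeholders.
SHAPES = [
    {"uri": "http://example.org/shapes#CatalogShape",
     "targets": ["http://www.w3.org/ns/dcat#Catalog"]},
    {"uri": "http://example.org/shapes#DatasetShape",
     "targets": ["http://www.w3.org/ns/dcat#Dataset"]},
    {"uri": "http://example.org/shapes#DataServiceShape",
     "targets": ["http://www.w3.org/ns/dcat#DataService"]},
]


def keywords(uri):
    """Split the local name of a URI into lowercase camelCase words."""
    local = re.split(r"[#/]", uri)[-1]
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", local)
    return {word.lower() for word in words}


def build_index(shapes):
    """Map each keyword to the positions of the shapes that mention it."""
    index = defaultdict(set)
    for pos, shape in enumerate(shapes):
        for uri in [shape["uri"], *shape["targets"]]:
            for word in keywords(uri):
                index[word].add(pos)
    return index


def search(index, shapes, term, limit=3):
    """Return up to `limit` shapes whose keywords include `term`."""
    hits = sorted(index.get(term.lower(), ()))
    return [shapes[pos] for pos in hits[:limit]]


index = build_index(SHAPES)
for shape in search(index, SHAPES, "dataset"):
    print(shape["uri"], shape["targets"])
```

This sketch matches whole keywords only, so `"dataset"` returns a single shape. The output above also lists the Catalog and DataService shapes, which suggests that `search_shapes` matches more loosely, for example against descriptions or property paths.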


## 8. Full Integration: Multi-Ontology Comparison

Putting it all together: load multiple ontologies, build sense documents,
and use RLM to answer complex questions.

In [17]:
from rlm.ontology import build_sense

ns = {}
build_sense('ontology/prov.ttl', name='prov_sense', ns=ns)
print("PROV sense document built")
print(f"Summary length: {len(ns['prov_sense'].summary)} chars")

PROV sense document built
Summary length: 5417 chars


In [18]:
from rlm.core import rlm_run

# Build sense for SIO
build_sense('ontology/sio/sio-release.owl', name='sio_sense', ns=ns)

# Context is both sense documents
context = {
    'prov': ns['prov_sense'].summary[:2000],  # Truncate for demo
    'sio': ns['sio_sense'].summary[:2000]
}

query = "What are the key differences between PROV and SIO ontologies?"

answer, iterations, ns = rlm_run(
    query,
    str(context),
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer:\n{answer[:800] if answer else 'No answer'}...")
print(f"\nIterations: {len(iterations)}")

Answer:
# Key Differences Between PROV and SIO Ontologies

## 1. **Primary Purpose and Scope**

### PROV (Provenance Interchange)
- **Focus**: Capturing the **history and lineage** of resources
- **Core Question**: "How did this entity come to exist and who was responsible?"
- **Domain**: Provenance information, data lineage, and accountability tracking

### SIO (Semanticscience Integrated Ontology)
- **Focus**: Describing **scientific knowledge and entities** themselves
- **Core Question**: "What is this thing and what are its properties?"
- **Domain**: General scientific knowledge representation across domains

## 2. **Structural Architecture**

### PROV - Temporal/Process-Centric (3 Pillars)
```
Entity (what) ←→ Activity (how) ←→ Agent (who)
```
- Built around **relationships between these thre...

Iterations: 1


## Summary

This tutorial demonstrated:

1. **Core RLM loop**: `llm_query()` and `rlm_run()` for LLM-driven exploration
2. **Ontology loading**: Bounded views prevent context overflow
3. **Progressive disclosure**: Start small, explore as needed
4. **Dataset memory**: Persist discovered facts with provenance
5. **SPARQL handles**: Metadata-first result handling
6. **Procedural memory**: Learn and reuse exploration strategies
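7. **SHACL indexing**: Schema discovery through shape search
8. **Multi-ontology comparison**: Sense documents as compact context for cross-ontology questions

All cells were executed with real Claude API calls. Two runs deserve a caveat: the ontology exploration in section 3 reached `max_iters` before returning a final answer, and the `mem_add` call in section 4 logged an rdflib warning about a malformed `xsd:dateTime` literal ending in `+00:00Z`.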