# linkml-runtime-rust Showcase

This notebook demonstrates the core capabilities of `linkml-runtime-rust`, a high-performance
Rust-backed LinkML runtime for Python:

1. **SchemaView** — Loading schemas from strings, multi-schema namespace resolution, prefix/CURIE handling
2. **Instance loading** — Parsing YAML/JSON data against a schema with validation
3. **Diffing & Patching** — Computing and applying structured deltas between instances
4. **Turtle export** — Converting LinkML instances to RDF Turtle

In [None]:
!pip install linkml-runtime-rust

In [None]:
from linkml_runtime_rust import (
    make_schema_view, load_yaml, load_json,
    diff, patch, to_turtle,
    Delta,
)

def pp(v, indent=""):
    """Simple pretty-printer for LinkMLInstance trees."""
    k = v.kind
    if k == "scalar":
        return repr(v.as_python())
    if k == "null":
        return "null"
    if k in ("object", "map"):
        tag = f"[{v.class_name}] " if v.class_name else ""
        lines = [f"{tag}{{"]
        for key in v.keys():
            lines.append(f"{indent}  {key}: {pp(v[key], indent + '  ')}")
        lines.append(f"{indent}}}")
        return "\n".join(lines)
    if k == "list":
        if len(v) == 0:
            return "[]"
        lines = ["["]
        for i in range(len(v)):
            lines.append(f"{indent}  - {pp(v[i], indent + '    ')}")
        lines.append(f"{indent}]")
        return "\n".join(lines)
    return f"<{k}>"

## 1. Schema loading — no I/O required

Schemas are added as plain strings. No filesystem or network access needed.

In [None]:
SCHEMA = """
id: https://example.org/personinfo
name: personinfo
prefixes:
  personinfo: https://example.org/personinfo/
  schema: http://schema.org/
  famrel: https://example.org/FamilialRelations#
  P: http://example.org/P/
  ORG: http://example.org/ORG/
default_prefix: personinfo
default_range: string

classes:
  Container:
    tree_root: true
    slots:
      - persons
      - organizations

  NamedThing:
    slots:
      - id
      - name

  Person:
    is_a: NamedThing
    class_uri: schema:Person
    slots:
      - email
      - age
      - friends
      - address

  Organization:
    is_a: NamedThing
    class_uri: schema:Organization
    slots:
      - mission

  Address:
    class_uri: schema:PostalAddress
    slots:
      - street
      - city

slots:
  id:
    identifier: true
    slot_uri: schema:identifier
  name:
    slot_uri: schema:name
  email:
    slot_uri: schema:email
  age:
    range: integer
    minimum_value: 0
    maximum_value: 200
  friends:
    range: Person
    multivalued: true
  address:
    range: Address
    inlined: true
  street:
  city:
  mission:
  persons:
    range: Person
    multivalued: true
    inlined_as_list: true
  organizations:
    range: Organization
    multivalued: true
    inlined_as_list: true
"""

sv = make_schema_view()
sv.add_schema_str(SCHEMA)
print("Loaded schemas:", sv.schema_ids())
print("Classes:", sv.get_class_ids())
print("Slots:", sv.get_slot_ids())

## 2. Namespace & prefix resolution

Classes and slots can be looked up by plain name, CURIE, or full URI.
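
To make the CURIE-to-URI mapping concrete, here is a minimal pure-Python sketch of prefix expansion. It does not call `linkml-runtime-rust`: `PREFIXES` copies two entries from the schema's `prefixes` block, and `expand_curie` is a hypothetical helper written for illustration, not a library function.

```python
# Hypothetical helper illustrating CURIE expansion; not part of linkml-runtime-rust.
PREFIXES = {
    "personinfo": "https://example.org/personinfo/",
    "schema": "http://schema.org/",
}

def expand_curie(curie, prefixes):
    """Expand 'prefix:local' to a full URI; return the input unchanged if the prefix is unknown."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return curie

print(expand_curie("schema:Person", PREFIXES))
print(expand_curie("personinfo:Container", PREFIXES))
print(expand_curie("http://schema.org/Person", PREFIXES))  # "http" is not a declared prefix
```

Returning unknown prefixes unchanged is a design choice of this sketch: it lets full URIs pass through untouched.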

In [None]:
# Lookup by name
person = sv.get_class_view("Person")
print(f"By name:  {person.name} -> {person.canonical_uri()}")

# Lookup by full URI
person2 = sv.get_class_view_by_uri("http://schema.org/Person")
print(f"By URI:   {person2.name} -> {person2.canonical_uri()}")

# Default prefix expansion
print(f"\nDefault prefix (expanded): {sv.get_default_prefix_for_schema('https://example.org/personinfo', expand=True)}")
print(f"Default prefix (raw):      {sv.get_default_prefix_for_schema('https://example.org/personinfo', expand=False)}")

## 3. Exploring the class hierarchy

In [None]:
# Class inheritance
person = sv.get_class_view("Person")
print(f"Person's parent: {person.parent_class().name}")
print(f"Person's identifier slot: {person.identifier_slot().name}")

# All effective slots (inherited + own)
print("\nPerson's effective slots:")
for slot in person.slots():
    range_cls = slot.range_class()
    range_info = range_cls.name if range_cls else (slot.definition.range or "string")
    print(f"  {slot.name:20s} range={range_info:15s} container={slot.container_mode():12s} inline={slot.inline_mode()}")

# Descendants
named_thing = sv.get_class_view("NamedThing")
print(f"\nDescendants of NamedThing: {[c.name for c in named_thing.get_descendants(recurse=True, include_mixins=False)]}")

## 4. Slot URIs and range introspection

In [None]:
# Slot URI resolution
email_slot = sv.get_slot_view("email")
print(f"email slot URI: {email_slot.canonical_uri()}")
print(f"email is scalar: {email_slot.is_range_scalar()}")

# Slot with class range
address_slot = sv.get_slot_view("address")
print(f"\naddress range class: {address_slot.range_class().name}")
print(f"address inline mode: {address_slot.inline_mode()}")
print(f"address container mode: {address_slot.container_mode()}")

## 5. Multi-schema namespace support

Multiple schemas can coexist with proper namespace isolation. Unlike Python's
`linkml-runtime`, where same-named elements silently overwrite each other,
the Rust runtime keeps them separate.

Below, two schemas both define a slot called `status`: one is a string label,
the other a numeric health score. A class can reference the **local** `status`
(`mon:status`) or the **imported** one (`core:status`) by CURIE, and each CURIE
resolves to a different slot.
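
The difference between name-keyed and URI-keyed lookup can be illustrated with plain dictionaries. This is only an analogy: `flat` and `namespaced` are hypothetical stand-ins, and neither is claimed to reflect the internal data structures of either runtime.

```python
# Two slot definitions that share the name "status" but belong to different schemas.
core_status = {"name": "status", "range": "string", "uri": "https://example.org/core/status"}
mon_status = {"name": "status", "range": "float", "uri": "https://example.org/monitoring/status"}

# Keyed by bare name: the second definition replaces the first.
flat = {}
for slot in (core_status, mon_status):
    flat[slot["name"]] = slot

# Keyed by full URI: both definitions survive side by side.
namespaced = {}
for slot in (core_status, mon_status):
    namespaced[slot["uri"]] = slot

print(len(flat), flat["status"]["range"])
print(len(namespaced), sorted(s["range"] for s in namespaced.values()))
```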

In [None]:
# Schema A: "core" — defines status as a human-readable string label
CORE_SCHEMA = """
id: https://example.org/core
name: core
prefixes:
  core: https://example.org/core/
default_prefix: core

slots:
  status:
    range: string
    slot_uri: core:status
    description: Human-readable status label
  code:
    range: string
    identifier: true

classes:
  Entity:
    slots:
      - code
      - status
"""

# Schema B: "monitoring" — imports core, but ALSO defines its own "status" (a float score).
# Since both are in scope, we need CURIEs to disambiguate:
#   "mon:status"   -> local monitoring:status (float)
#   "core:status"  -> imported core:status (string)
MONITORING_SCHEMA = """
id: https://example.org/monitoring
name: monitoring
prefixes:
  mon: https://example.org/monitoring/
  core: https://example.org/core/
default_prefix: mon
imports:
  - https://example.org/core

slots:
  status:
    range: float
    slot_uri: mon:status
    description: Numeric health score (0.0 to 1.0)

classes:
  HealthCheck:
    description: Uses only the local float score
    slots:
      - "mon:status"

  DetailedCheck:
    description: Uses both — local score (float) AND imported label (string)
    slots:
      - "mon:status"
      - "core:status"
"""

nsv = make_schema_view()
# Add monitoring first (it has an unresolved import to core)
nsv.add_schema_str(MONITORING_SCHEMA)
print("Unresolved imports:", nsv.get_unresolved_schema_refs())

# Satisfy the import by providing the core schema
nsv.add_schema_str_with_import_ref(
    CORE_SCHEMA,
    schema_id="https://example.org/monitoring",
    uri="https://example.org/core",
)
print("Unresolved after:", nsv.get_unresolved_schema_refs())
print("All schemas:", nsv.schema_ids())

# HealthCheck: only mon:status (float)
hc = nsv.get_class_view("HealthCheck")
print("\nHealthCheck slots:")
for s in hc.slots():
    print(f"  {s.name:20s} range={s.definition.range or 'string':10s} URI={s.canonical_uri()}")

# DetailedCheck: has BOTH — mon:status (float) AND core:status (string)
dc = nsv.get_class_view("DetailedCheck")
print("\nDetailedCheck slots:")
for s in dc.slots():
    print(f"  {s.name:20s} range={s.definition.range or 'string':10s} URI={s.canonical_uri()}")

# In Python linkml-runtime, the second "status" would have silently replaced the first.
# Here each lives in its own namespace — same name, different range, different URI.

## 6. Loading instances from YAML

Instances are loaded against a class view and validated on the fly.

In [None]:
container_view = sv.get_class_view("Container")

DATA_V1 = """
persons:
  - id: P:001
    name: Alice
    email: alice@example.com
    age: 30
    address:
      street: 123 Main St
      city: Springfield
  - id: P:002
    name: Bob
    email: bob@example.com
    age: 25
    friends:
      - P:001
organizations:
  - id: ORG:1
    name: Acme Corp
    mission: Building better widgets
"""

v1, issues = load_yaml(DATA_V1, sv, container_view)
print("Validation issues:", len(issues))
for issue in issues:
    print(f"  [{issue.severity}] {issue.detail}")

print("\nLoaded instance:")
print(pp(v1))

In [None]:
# Navigate into the instance
alice = v1["persons"][0]
print(f"First person: {alice['name'].as_python()} (class: {alice.class_name})")
print(f"Address city: {alice['address']['city'].as_python()}")
print(f"Navigate by path: {v1.navigate(['persons', 'P:001', 'email']).as_python()}")

## 7. Validation

Let's load some invalid data to see validation in action: `age: 999` exceeds the
schema's `maximum_value` of 200, and `unknown_field` is not a slot of `Person`.
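
As a rough mental model of these two checks, the following pure-Python sketch validates a plain dictionary against hand-written rules. `SLOT_RULES` and `check_record` are hypothetical names invented for this illustration; how the library itself detects, words, and ranks issues is not shown here.

```python
# Hand-written rules mirroring the Person slots from the schema (an assumption of this sketch).
SLOT_RULES = {
    "id": {},
    "name": {},
    "email": {},
    "age": {"minimum_value": 0, "maximum_value": 200},
}

def check_record(record, rules):
    """Collect messages for undeclared keys and out-of-range numeric values."""
    issues = []
    for key, value in record.items():
        if key not in rules:
            issues.append(f"{key}: not declared in the schema")
            continue
        low = rules[key].get("minimum_value")
        high = rules[key].get("maximum_value")
        if low is not None and value < low:
            issues.append(f"{key}: {value} is below minimum_value {low}")
        if high is not None and value > high:
            issues.append(f"{key}: {value} is above maximum_value {high}")
    return issues

charlie = {"id": "P:099", "name": "Charlie", "age": 999, "unknown_field": "oops"}
for issue in check_record(charlie, SLOT_RULES):
    print(issue)
```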

In [None]:
INVALID_DATA = """
persons:
  - id: P:099
    name: Charlie
    age: 999
    unknown_field: oops
"""

invalid_instance, issues = load_yaml(INVALID_DATA, sv, container_view)
print(f"{len(issues)} validation issue(s):")
for issue in issues:
    print(f"  [{issue.severity}] {issue.detail}")

# The loaded instance also exposes its unknown fields directly
if invalid_instance is not None:
    print(f"\nUnknown fields on Charlie: {invalid_instance['persons'][0].unknown_fields()}")

## 8. Diffing instances

Compute a structured diff between two versions of the same data.
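
For intuition about what a structured diff contains, here is a small recursive sketch over plain nested dictionaries. It is an assumption-laden illustration, not the library's algorithm: `diff_values` and the op names `add`, `remove`, and `update` are hypothetical, and lists are compared as whole values for brevity.

```python
def diff_values(old, new, path=None):
    """Return change records between two nested dict/scalar values."""
    path = path or []
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        for key in sorted(set(old) | set(new)):
            child = path + [key]
            if key not in new:
                changes.append({"op": "remove", "path": child, "old": old[key], "new": None})
            elif key not in old:
                changes.append({"op": "add", "path": child, "old": None, "new": new[key]})
            else:
                # Key exists on both sides: recurse to find nested changes.
                changes.extend(diff_values(old[key], new[key], child))
        return changes
    if old != new:
        return [{"op": "update", "path": path, "old": old, "new": new}]
    return []

before = {"name": "Alice", "age": 30, "address": {"city": "Springfield"}}
after = {
    "name": "Alice Wonderland",
    "age": 31,
    "address": {"city": "Shelbyville"},
    "email": "alice.w@newdomain.com",
}
changes = diff_values(before, after)
for change in changes:
    print(change["op"], "/".join(change["path"]))
```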

In [None]:
DATA_V2 = """
persons:
  - id: P:001
    name: Alice Wonderland
    email: alice.w@newdomain.com
    age: 31
    address:
      street: 456 Oak Ave
      city: Shelbyville
  - id: P:002
    name: Bob
    email: bob@example.com
    age: 26
    friends:
      - P:001
  - id: P:003
    name: Carol
    email: carol@example.com
    age: 28
organizations:
  - id: ORG:1
    name: Acme Corp
    mission: Building even better widgets
"""

v2, _ = load_yaml(DATA_V2, sv, container_view)

deltas = diff(v1, v2)
print(f"{len(deltas)} change(s) detected:\n")
for d in deltas:
    print(f"  {d.op:7s} path={d.path}")
    if d.old is not None:
        print(f"          old={d.old}")
    if d.new is not None:
        print(f"          new={d.new}")
    print()

## 9. Patching instances

Apply the deltas to v1 and verify we get v2.
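
Conceptually, patching walks each delta's path and applies the change to a copy of the input. The sketch below shows that idea on plain dictionaries. `apply_deltas`, the dict-shaped deltas, and the op names are all hypothetical; the library's `patch` works on instances and returns a richer trace than this list of failed paths.

```python
import copy

def apply_deltas(value, deltas):
    """Return (patched deep copy, failed paths). Assumes every path is non-empty."""
    result = copy.deepcopy(value)  # leave the caller's data untouched
    failed = []
    for delta in deltas:
        *parents, last = delta["path"]
        target = result
        try:
            for key in parents:
                target = target[key]
            if delta["op"] == "remove":
                del target[last]
            else:  # "add" and "update" both assign the new value
                target[last] = delta["new"]
        except (KeyError, IndexError, TypeError):
            failed.append(delta["path"])
    return result, failed

data_v1 = {"persons": {"P:001": {"name": "Alice", "age": 30}}}
sketch_deltas = [
    {"op": "update", "path": ["persons", "P:001", "age"], "old": 30, "new": 31},
    {"op": "add", "path": ["persons", "P:003"], "old": None, "new": {"name": "Carol"}},
    {"op": "remove", "path": ["persons", "P:999"], "old": None, "new": None},
]
patched_data, failed_paths = apply_deltas(data_v1, sketch_deltas)
print(patched_data)
print("failed:", failed_paths)
```

Working on a deep copy keeps the original value reusable, so the same input can be patched more than once.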

In [None]:
result = patch(v1, deltas)
patched = result.value
trace = result.trace

print("Patch trace:")
print(f"  added nodes:   {len(trace.added)}")
print(f"  deleted nodes: {len(trace.deleted)}")
print(f"  updated nodes: {len(trace.updated)}")
print(f"  failed paths:  {trace.failed}")

# Verify round-trip
print(f"\nPatched equals v2: {patched.equals(v2)}")
print("\nPatched instance:")
print(pp(patched))

## 10. Serializing deltas

Deltas can be serialized to dicts (for JSON storage) and reconstructed.
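
The same round-trip idea can be sketched with the standard library alone. `SketchDelta` is a hypothetical dataclass standing in for the library's `Delta`; only its four field names are taken from the usage in this notebook.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, List, Optional

@dataclass
class SketchDelta:
    """Hypothetical stand-in for a delta record; not the library's Delta class."""
    path: List[str]
    op: str
    old: Optional[Any] = None
    new: Optional[Any] = None

original = [
    SketchDelta(["persons", "P:001", "age"], "update", old=30, new=31),
    SketchDelta(["persons", "P:003"], "add", new={"name": "Carol"}),
]

# Serialize to JSON text, then rebuild the objects from the parsed dicts.
text = json.dumps([asdict(d) for d in original])
rebuilt = [SketchDelta(**item) for item in json.loads(text)]
print(rebuilt == original)
```

The comparison works because dataclasses define field-wise equality and every value used here survives a JSON round trip unchanged.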

In [None]:
import json

# Serialize
serialized = [d.to_dict() for d in deltas]
print("Serialized deltas (JSON):")
print(json.dumps(serialized, indent=2)[:1500], "...")

# Reconstruct and re-apply
rebuilt_deltas = [Delta(d["path"], d["op"], old=d.get("old"), new=d.get("new")) for d in serialized]
result2 = patch(v1, rebuilt_deltas)
print(f"\nRound-trip patch equals v2: {result2.value.equals(v2)}")

## 11. Turtle (RDF) export

Instances can be exported to RDF Turtle format, with prefixes automatically
collected from all loaded schemas.

In [None]:
ttl = to_turtle(v1)
print(ttl)

In [None]:
# Also available as a method on instances
alice_ttl = v1["persons"][0].as_turtle(skolem=False)
print("Alice as Turtle:")
print(alice_ttl)

## 12. Turtle with skolemized URIs

By default, objects without an identifier are emitted as blank nodes. With
`skolem=True`, they get deterministic URIs derived from the class and field values.
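
One way to picture a deterministic URI is a hash over the class name and the sorted field values. The sketch below is an assumed scheme for illustration only: `skolem_uri`, `BASE`, and the use of SHA-256 are hypothetical, and the URIs the library actually generates may look quite different.

```python
import hashlib
from urllib.parse import quote

# Hypothetical base; RDF 1.1 suggests the /.well-known/genid/ path for skolem IRIs.
BASE = "https://example.org/.well-known/genid/"

def skolem_uri(base, class_name, fields):
    """Build a URI that depends only on the class name and the field values."""
    canonical = "|".join(f"{key}={fields[key]}" for key in sorted(fields))
    digest = hashlib.sha256(f"{class_name}|{canonical}".encode("utf-8")).hexdigest()[:16]
    return f"{base}{quote(class_name)}/{digest}"

u1 = skolem_uri(BASE, "Address", {"street": "123 Main St", "city": "Springfield"})
u2 = skolem_uri(BASE, "Address", {"city": "Springfield", "street": "123 Main St"})
u3 = skolem_uri(BASE, "Address", {"street": "456 Oak Ave", "city": "Shelbyville"})
print(u1 == u2)  # key order does not matter
print(u1 == u3)  # different values give a different URI
```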

In [None]:
ttl_skolem = to_turtle(v1, skolem=True)
print(ttl_skolem)

## 13. Loading from JSON

JSON strings work the same way as YAML.

In [None]:
person_view = sv.get_class_view("Person")

json_data = '{"id": "P:010", "name": "Diana", "email": "diana@example.com", "age": 35}'
diana, issues = load_json(json_data, sv, person_view)
print(f"Loaded from JSON: {diana.as_python()}")
print(f"Validation issues: {len(issues)}")
print("\nAs Turtle:")
print(diana.as_turtle(skolem=False))

## 14. Instance equality semantics

LinkML instance equality is structural, not referential.
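
The plain-Python sketch below shows what structural comparison means and one plausible reading of a "missing equals null" option. `structurally_equal` and `strip_nulls` are hypothetical helpers; the assumption that the flag makes an absent key equivalent to an explicit null is this sketch's own, not a statement about the library.

```python
def strip_nulls(value):
    """Drop dict entries whose value is None, recursively."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]
    return value

def structurally_equal(a, b, treat_missing_as_null=False):
    """Compare by content; optionally treat an absent key like an explicit null."""
    if treat_missing_as_null:
        a, b = strip_nulls(a), strip_nulls(b)
    return a == b

with_null = {"id": "P:X", "name": "X", "age": None}
without_key = {"id": "P:X", "name": "X"}
copy_of_without = dict(without_key)

print(copy_of_without is without_key)                   # distinct objects
print(structurally_equal(copy_of_without, without_key)) # same content
print(structurally_equal(with_null, without_key))
print(structurally_equal(with_null, without_key, treat_missing_as_null=True))
```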

In [None]:
# Load the same data twice
a, _ = load_json('{"id": "P:010", "name": "Diana", "email": "diana@example.com", "age": 35}', sv, person_view)
b, _ = load_json('{"id": "P:010", "name": "Diana", "email": "diana@example.com", "age": 35}', sv, person_view)

print(f"Same data, different loads: equals={a.equals(b)}")

# Missing vs null: one record sets age to null, the other omits it
with_null, _ = load_json('{"id": "P:X", "name": "X", "age": null}', sv, person_view)
without_field, _ = load_json('{"id": "P:X", "name": "X"}', sv, person_view)

print(f"treat_missing_as_null=False: equals={with_null.equals(without_field, treat_missing_as_null=False)}")
print(f"treat_missing_as_null=True:  equals={with_null.equals(without_field, treat_missing_as_null=True)}")

## 15. Snapshot serialization

A `SchemaView`, including a multi-schema one, can be serialized to a YAML snapshot and
restored without re-resolving imports. The cell below uses the single-schema `sv` from section 1.

In [None]:
# Save and restore
snapshot = sv.to_snapshot_yaml()
print(f"Snapshot size: {len(snapshot)} chars")
print(f"First 300 chars:\n{snapshot[:300]}...")

# Restore from snapshot
sv2 = make_schema_view()
sv2 = sv2.from_snapshot_yaml(snapshot)
print(f"\nRestored schemas: {sv2.schema_ids()}")
print(f"Restored classes: {sv2.get_class_ids()}")
print(f"Same view: {sv.is_same(sv2)}")