Example usage of the OntologyRecommender service.

This script demonstrates:
1. Analyzing text to automatically recommend an ontology
2. Using the recommended ontology to extract triples
3. Combining recommendation and extraction in one step



In [1]:
import os
import json
from dotenv import load_dotenv
from spindle import (
    OntologyRecommender,
    SpindleExtractor,
    recommendation_to_dict,
    triples_to_dict
)

# Load environment variables (API keys)
load_dotenv()

# Check if API key is set
if not os.getenv("ANTHROPIC_API_KEY"):
    print("Error: ANTHROPIC_API_KEY environment variable not set.")
    print("Please set it in a .env file or as an environment variable.")
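The check above only prints a message, so the remaining cells would still run and would likely fail at the first API call. A minimal sketch of a stricter variant that stops early is shown here. The helper names `missing_env_vars` and `check_keys` are hypothetical and not part of spindle. `OPENAI_API_KEY` is an assumed variable name: the logged runs further down use a gpt-5-mini client as well as a Claude client, so a second key may be needed in your setup.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# ANTHROPIC_API_KEY comes from the cell above; OPENAI_API_KEY is an assumption.
REQUIRED_KEYS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]


def check_keys(environ=None):
    """Raise RuntimeError listing every required key that is missing."""
    missing = missing_env_vars(REQUIRED_KEYS, environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `check_keys()` right after `load_dotenv()` would halt the notebook at this cell instead of partway through a later one.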

In [2]:
print("=" * 70)
print("Ontology Recommender Example")
print("=" * 70)
print()

# Example text: Medical research abstract
medical_text = """
A recent clinical trial evaluated the efficacy of Medication A in treating 
patients with chronic migraines. The study, conducted at Massachusetts General 
Hospital, enrolled 250 patients aged 18-65 who experienced at least 8 migraine 
days per month. Dr. Sarah Chen, the principal investigator, led a team of 
neurologists who administered the drug over a 12-week period.

Results showed that Medication A reduced migraine frequency by an average of 
50% compared to the placebo group. Common side effects included nausea and 
dizziness, which affected approximately 15% of participants. The medication 
works by inhibiting CGRP receptors, which are known to play a role in migraine 
pathophysiology.

Dr. Chen reported the findings at the American Academy of Neurology conference 
in Seattle, where the research was well-received by the medical community. The 
FDA is expected to review the data for potential approval in the coming year. 
Massachusetts General Hospital has been a leading research institution in 
neurology for decades and continues to conduct groundbreaking studies in 
headache disorders.
"""

print("Step 1: Recommend ontology from medical text")
print("-" * 70)
print(f"Input text:\n{medical_text}\n")

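======================================================================
Ontology Recommender Example
======================================================================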

Step 1: Recommend ontology from medical text
----------------------------------------------------------------------
Input text:

A recent clinical trial evaluated the efficacy of Medication A in treating 
patients with chronic migraines. The study, conducted at Massachusetts General 
Hospital, enrolled 250 patients aged 18-65 who experienced at least 8 migraine 
days per month. Dr. Sarah Chen, the principal investigator, led a team of 
neurologists who administered the drug over a 12-week period.

Results showed that Medication A reduced migraine frequency by an average of 
50% compared to the placebo group. Common side effects included nausea and 
dizziness, which affected approximately 15% of participants. The medication 
works by inhibiting CGRP receptors, which are known to play a role in migraine 
pathophysiology.

Dr. Chen reported the findings at the American Academy of Neurology conference 
in Seattle, where the research was well-received by the me

In [3]:
# Create the recommender
recommender = OntologyRecommender()

# Get ontology recommendation
recommendation = recommender.recommend(
    text=medical_text,
    scope="balanced"  # "minimal", "balanced", or "comprehensive"
)

print("Text Purpose:")
print(f"  {recommendation.text_purpose}\n")

print("Recommended Entity Types:")
for i, entity_type in enumerate(recommendation.ontology.entity_types, 1):
    print(f"  {i}. {entity_type.name}: {entity_type.description}")
print()

print("Recommended Relation Types:")
for i, relation_type in enumerate(recommendation.ontology.relation_types, 1):
    print(f"  {i}. {relation_type.name}: {relation_type.description}")
    print(f"     ({relation_type.domain} → {relation_type.range})")
print()

print("Reasoning:")
print(f"  {recommendation.reasoning}\n")

2025-11-03T16:59:08.544 [BAML INFO] Function RecommendOntology:
    Client: CustomGPT5Mini (gpt-5-mini-2025-08-07) - 74827ms. StopReason: completed. Tokens(in/out): 2469/5483
    ---PROMPT---
    system: You are a knowledge graph ontology design expert. Your task is to analyze the provided text, understand its overarching purpose and domain, and recommend an appropriate ontology (entity types and relation types) that would be suitable for extracting knowledge from this and similar texts.
    user: TEXT TO ANALYZE:
    
    A recent clinical trial evaluated the efficacy of Medication A in treating 
    patients with chronic migraines. The study, conducted at Massachusetts General 
    Hospital, enrolled 250 patients aged 18-65 who experienced at least 8 migraine 
    days per month. Dr. Sarah Chen, the principal investigator, led a team of 
    neurologists who administered the drug over a 12-week period.
    
    Results
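The loops in the cell above rely on a small set of attributes: `entity_types` and `relation_types` on the ontology, and `name`, `description`, `domain`, and `range` on each type. The sketch below mirrors that shape with stand-in dataclasses so the formatting logic can be tried without an API call. The classes are hypothetical and are not spindle's real classes; only the attribute names are taken from the cell above. The `works_at` description is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityType:  # stand-in, not spindle's class
    name: str
    description: str


@dataclass
class RelationType:  # stand-in, not spindle's class
    name: str
    description: str
    domain: str
    range: str


@dataclass
class Ontology:  # stand-in, not spindle's class
    entity_types: List[EntityType] = field(default_factory=list)
    relation_types: List[RelationType] = field(default_factory=list)


def format_ontology(ontology):
    """Return printable lines listing entity types, then relation types."""
    lines = ["Entity types:"]
    for i, et in enumerate(ontology.entity_types, 1):
        lines.append(f"  {i}. {et.name}: {et.description}")
    lines.append("Relation types:")
    for i, rt in enumerate(ontology.relation_types, 1):
        lines.append(f"  {i}. {rt.name} ({rt.domain} -> {rt.range}): {rt.description}")
    return lines


demo = Ontology(
    entity_types=[
        EntityType("Person", "An individual mentioned in the text."),
        EntityType("Organization", "An institution mentioned in the text."),
    ],
    relation_types=[
        # Placeholder description, for illustration only.
        RelationType("works_at", "Person is affiliated with an organization.",
                     "Person", "Organization"),
    ],
)
print("\n".join(format_ontology(demo)))
```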

In [5]:
# Step 2: Use the recommended ontology to extract triples
print("Step 2: Extract triples using recommended ontology")
print("-" * 70)

extractor = SpindleExtractor(recommendation.ontology)
extraction_result = extractor.extract(
    text=medical_text,
    source_name="Medical Research Abstract 2024",
    source_url="https://example.com/research/abstract-001"
)

print(f"Extracted {len(extraction_result.triples)} triples:\n")


Step 2: Extract triples using recommended ontology
----------------------------------------------------------------------
2025-11-03T16:59:32.623 [BAML INFO] Function ExtractTriples:
    Client: CustomHaiku (claude-3-5-haiku-20241022) - 19790ms. StopReason: end_turn. Tokens(in/out): 3491/1501
    ---PROMPT---
    user: You are a knowledge graph extraction expert. Your task is to extract structured triples (subject-predicate-object) from the provided text, with rich entity metadata, custom attributes, and supporting evidence.ONTOLOGY:
    You must extract triples that conform to the following ontology:
    
    Valid Entity Types:
    - Person: An individual mentioned in the text (e.g., investigators, clinicians, presenters, patients when named).
      Custom Attributes:
        * name (string): Full name of the person.
        * role (string): Role or function in the context (e.g., principal investigator, neurologist, presenter).
   

In [6]:

# Print each triple with its supporting evidence spans.
# Span text is cut to its first 50 characters to keep the output readable.
for i, triple in enumerate(extraction_result.triples, 1):
    print(f"  {i}. ({triple.subject}) --[{triple.predicate}]--> ({triple.object})")
    print(f"     Evidence: {len(triple.supporting_spans)} span(s)")
    print(f"     Extraction datetime: {triple.extraction_datetime}")
    for j, span in enumerate(triple.supporting_spans, 1):
        print(f"       Span {j}: start={span.start}, end={span.end}, text=\"{span.text[:50]}...\"")
print()


  1. (name='Sarah Chen' type='Person' description='Principal investigator of a clinical trial studying medication for chronic migraines' custom_atts={'name': AttributeValue(value='Sarah Chen', type='string'), 'role': AttributeValue(value='Principal Investigator', type='string'), 'specialty': AttributeValue(value='Neurologist', type='string'), 'affiliation_name': AttributeValue(value='Massachusetts General Hospital', type='string')}) --[works_at]--> (name='Massachusetts General Hospital' type='Organization' description='A leading research institution in neurology that conducted a clinical trial on migraine medication' custom_atts={'name': AttributeValue(value='Massachusetts General Hospital', type='string'), 'type': AttributeValue(value='hospital', type='string'), 'location_name': AttributeValue(value=None, type='string'), 'research_focus': AttributeValue(value='Neurology and headache disorders', type='string')})
     Evidence: 1 span(s)
     Extraction datetime: 2025-11-03T22:59:32Z
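Each supporting span carries `start`, `end`, and `text`, so a quick sanity check is to slice the source text and compare. The sketch below assumes `start` and `end` are character offsets into the original text with `end` exclusive; the notebook does not confirm that convention. `Span` is a stand-in dataclass and `span_matches` is a hypothetical helper, neither is part of spindle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:  # stand-in for a supporting span, not spindle's class
    text: str
    start: Optional[int]
    end: Optional[int]


def span_matches(source_text, span):
    """True if slicing source_text by the span offsets reproduces span.text.

    Assumes character offsets with an exclusive end index.
    """
    if span.start is None or span.end is None:
        return False
    return source_text[span.start:span.end] == span.text


source = "Dr. Sarah Chen, the principal investigator, led a team of neurologists."
good = Span(text="Dr. Sarah Chen", start=0, end=14)
bad = Span(text="Dr. Sarah Chen", start=4, end=18)  # offsets shifted by four

print(span_matches(source, good))
print(span_matches(source, bad))
```

With real results, the same check could be run over `triple.supporting_spans` against `medical_text` to flag spans whose offsets have drifted.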
  

In [8]:
# Step 3: Display extraction result with detailed span information
print("Step 3: Detailed extraction result with span indices")
print("-" * 70)

triples_dict = triples_to_dict(extraction_result.triples)
print(json.dumps(triples_dict, indent=2))

Step 3: Detailed extraction result with span indices
----------------------------------------------------------------------
[
  {
    "subject": {
      "name": "Sarah Chen",
      "type": "Person",
      "description": "Principal investigator of a clinical trial studying medication for chronic migraines",
      "custom_atts": {
        "name": {
          "value": "Sarah Chen",
          "type": "string"
        },
        "role": {
          "value": "Principal Investigator",
          "type": "string"
        },
        "specialty": {
          "value": "Neurologist",
          "type": "string"
        },
        "affiliation_name": {
          "value": "Massachusetts General Hospital",
          "type": "string"
        }
      }
    },
    "predicate": "works_at",
    "object": {
      "name": "Massachusetts General Hospital",
      "type": "Organization",
      "description": "A leading research institution in neurology that conducted a clinical trial on migraine medication",
   

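The JSON printed in Step 3 nests each entity under `subject` and `object`, with the relation name under `predicate`. For graph loading or quick inspection, a flat list of (subject, predicate, object) names is often easier to work with. The sketch below relies only on the keys visible in the output above. `triple_edges` is a hypothetical helper, and the sample list is a trimmed copy of the first triple.

```python
def triple_edges(triples):
    """Flatten serialized triples into (subject name, predicate, object name) tuples."""
    return [
        (t["subject"]["name"], t["predicate"], t["object"]["name"])
        for t in triples
    ]


# Trimmed copy of the first triple from the Step 3 output.
sample = [
    {
        "subject": {"name": "Sarah Chen", "type": "Person"},
        "predicate": "works_at",
        "object": {"name": "Massachusetts General Hospital", "type": "Organization"},
    }
]

for subject, predicate, obj in triple_edges(sample):
    print(f"{subject} --[{predicate}]--> {obj}")
```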
In [9]:
# Step 4: Demonstrate serialization
print("Step 4: Serialize recommendation to JSON")
print("-" * 70)

recommendation_dict = recommendation_to_dict(recommendation)
print(json.dumps(recommendation_dict, indent=2))


Step 4: Serialize recommendation to JSON
----------------------------------------------------------------------
{
  "ontology": {
    "entity_types": [
      {
        "name": "Person",
        "description": "An individual mentioned in the text (e.g., investigators, clinicians, presenters, patients when named).",
        "attributes": [
          {
            "name": "name",
            "type": "string",
            "description": "Full name of the person."
          },
          {
            "name": "role",
            "type": "string",
            "description": "Role or function in the context (e.g., principal investigator, neurologist, presenter)."
          },
          {
            "name": "specialty",
            "type": "string",
            "description": "Professional specialty or occupation (e.g., neurologist)."
          },
          {
            "name": "affiliation_name",
            "type": "string",
            "description": "Name of the primary organization with 