[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/use_cases/healthcare/07_Patient_Records_Temporal.ipynb)

# Patient Records Temporal Pipeline

## Overview

This notebook demonstrates a complete temporal analysis pipeline for patient records: ingest the records, extract medical entities, build a temporal knowledge graph, generate an ontology, query a patient's medical history, and export and visualize the results.


**Documentation**: [API Reference](https://semantica.readthedocs.io/use-cases/)

### Modules Used (20+)

- **Ingestion**: FileIngestor, DBIngestor, StreamIngestor
- **Parsing**: DocumentParser, StructuredDataParser, CSVParser
- **Extraction**: NERExtractor, RelationExtractor, CoreferenceResolver
- **KG**: GraphBuilder, TemporalGraphQuery, GraphValidator, EntityResolver
- **Ontology**: OntologyGenerator, ClassInferrer, PropertyGenerator, OntologyValidator
- **Triplet Store**: TripletStore, TripletManager, QueryEngine
- **Export**: RDFExporter, OWLExporter, JSONExporter
- **Visualization**: KGVisualizer, OntologyVisualizer, TemporalVisualizer

### Pipeline

**Patient Records → Parse → Extract Medical Entities → Build Temporal KG → Generate Ontology → Store in Triplet Store → Query History → Export → Visualize**

## Installation

Install Semantica from PyPI:

```bash
pip install semantica
# Or with all optional dependencies (quoted so shells such as zsh do not expand the brackets):
pip install "semantica[all]"
```
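
To confirm that the install succeeded before running the rest of the notebook, a minimal check using only the standard library could look like the following. The helper name `installed_version` is hypothetical and is not part of Semantica.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


semantica_version = installed_version("semantica")
if semantica_version is None:
    print("semantica is not installed in this environment")
else:
    print(f"semantica {semantica_version} is installed")
```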

---

## Step 1: Process Patient Records

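Ingest and parse patient records. The cell below defines placeholder database, FHIR API, and feed sources for illustration, but only a small sample CSV written to a temporary directory is actually ingested and parsed.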


In [None]:
from semantica.ingest import FileIngestor, DBIngestor, StreamIngestor
from semantica.parse import DocumentParser, StructuredDataParser, CSVParser
from semantica.semantic_extract import NERExtractor, RelationExtractor, CoreferenceResolver
from semantica.kg import GraphBuilder, TemporalGraphQuery, GraphValidator, EntityResolver
from semantica.ontology import OntologyGenerator, ClassInferrer, PropertyGenerator, OntologyValidator
from semantica.triplet_store import TripletStore, TripletManager, QueryEngine
from semantica.export import RDFExporter, OWLExporter, JSONExporter
from semantica.visualization import KGVisualizer, OntologyVisualizer, TemporalVisualizer
import tempfile
import os
import json
from datetime import datetime, timedelta

file_ingestor = FileIngestor()
db_ingestor = DBIngestor()
stream_ingestor = StreamIngestor()
document_parser = DocumentParser()
structured_parser = StructuredDataParser()
csv_parser = CSVParser()

# Example database connection string with placeholder credentials (not used below).
# Real patient data needs credentials loaded from a secure store, never hardcoded.
db_connection_string = "postgresql://user:password@localhost:5432/patient_records_db"
db_query = "SELECT patient_id, visit_date, diagnosis, medication, doctor FROM patient_visits WHERE visit_date > CURRENT_DATE - INTERVAL '1 year' ORDER BY visit_date DESC"

# Example FHIR R4 endpoints (not queried below)
healthcare_apis = [
    "https://api.logicahealth.org/fhir/R4/Patient",  # Logica Health FHIR API
    "https://hapi.fhir.org/baseR4/Patient"  # HAPI FHIR Server
]

# Example medical feed URLs (not fetched below)
medical_feeds = [
    "https://www.cdc.gov/rss.xml",  # CDC Health Alerts
    "https://www.who.int/rss-feeds/news-english.xml"  # WHO News
]

temp_dir = tempfile.mkdtemp()

patient_records_file = os.path.join(temp_dir, "patient_records.csv")
patient_data = """patient_id,visit_date,diagnosis,medication,doctor
P001,2024-01-15,Hypertension,Lisinopril,Dr. Smith
P001,2024-02-20,Diabetes,Metformin,Dr. Jones
P002,2024-01-10,Fever,Acetaminophen,Dr. Smith"""

with open(patient_records_file, 'w', encoding='utf-8') as f:
    f.write(patient_data)

file_objects = file_ingestor.ingest_file(patient_records_file, read_content=True)
parsed_csv = csv_parser.parse(patient_records_file)

# A single path was ingested, so the count is 1 on success (len([x]) is always 1)
print(f"Ingested {1 if file_objects else 0} patient record file(s)")
print(f"Parsed {len(parsed_csv.rows) if parsed_csv else 0} patient records")
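
As a sanity check that does not depend on Semantica, the same sample data can be parsed with the standard library to confirm the header and the date format. This is a minimal sketch; the names `sample_csv`, `expected_columns`, and `rows` are illustrative and not part of the pipeline above.

```python
import csv
import io
from datetime import date

# Same sample data as the notebook cell above
sample_csv = """patient_id,visit_date,diagnosis,medication,doctor
P001,2024-01-15,Hypertension,Lisinopril,Dr. Smith
P001,2024-02-20,Diabetes,Metformin,Dr. Jones
P002,2024-01-10,Fever,Acetaminophen,Dr. Smith"""

expected_columns = ["patient_id", "visit_date", "diagnosis", "medication", "doctor"]

reader = csv.DictReader(io.StringIO(sample_csv))
if reader.fieldnames != expected_columns:
    raise ValueError(f"Unexpected header: {reader.fieldnames}")

rows = list(reader)
for row in rows:
    # date.fromisoformat raises ValueError if visit_date is not YYYY-MM-DD
    date.fromisoformat(row["visit_date"])

print(f"Validated {len(rows)} rows")
```

Validating dates at ingestion time matters here because the later temporal query relies on every relationship carrying a well-formed timestamp.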


## Step 2: Extract Medical Entities

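Build medical entities and relationships from the parsed rows. Because the CSV is already structured, each column maps directly to an entity type (Patient, Diagnosis, Medication, Doctor), and each relationship stores the visit date as its `timestamp`. The NER, relation, and coreference extractors are instantiated here but are not needed for this structured input.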


In [None]:
ner_extractor = NERExtractor()
relation_extractor = RelationExtractor()
coreference_resolver = CoreferenceResolver()

patient_entities = []
relationships = []

if parsed_csv and parsed_csv.rows:
    for row in parsed_csv.rows:
        patient_id = row.get("patient_id", "")
        diagnosis = row.get("diagnosis", "")
        medication = row.get("medication", "")
        doctor = row.get("doctor", "")
        visit_date = row.get("visit_date", "")

        patient_entities.append({
            "id": patient_id,
            "type": "Patient",
            "name": patient_id,
            "properties": {}
        })

        patient_entities.append({
            "id": diagnosis,
            "type": "Diagnosis",
            "name": diagnosis,
            "properties": {}
        })

        patient_entities.append({
            "id": medication,
            "type": "Medication",
            "name": medication,
            "properties": {}
        })

        patient_entities.append({
            "id": doctor,
            "type": "Doctor",
            "name": doctor,
            "properties": {}
        })

        relationships.append({
            "source": patient_id,
            "target": diagnosis,
            "type": "has_diagnosis",
            "properties": {"timestamp": visit_date}
        })

        relationships.append({
            "source": patient_id,
            "target": medication,
            "type": "prescribed",
            "properties": {"timestamp": visit_date}
        })

        relationships.append({
            "source": doctor,
            "target": patient_id,
            "type": "treats",
            "properties": {"timestamp": visit_date}
        })

print(f"Extracted {len(patient_entities)} medical entities")
print(f"Extracted {len(relationships)} relationships")
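
The loop above appends four entities per row, so a patient or doctor who appears in several visits is appended more than once. Whether `EntityResolver` merges these later is not shown here, so the following is a hedged, standard-library sketch of deduplicating by `(type, id)` up front. The names `column_types` and `unique_entities` are illustrative.

```python
rows = [
    {"patient_id": "P001", "visit_date": "2024-01-15", "diagnosis": "Hypertension",
     "medication": "Lisinopril", "doctor": "Dr. Smith"},
    {"patient_id": "P001", "visit_date": "2024-02-20", "diagnosis": "Diabetes",
     "medication": "Metformin", "doctor": "Dr. Jones"},
    {"patient_id": "P002", "visit_date": "2024-01-10", "diagnosis": "Fever",
     "medication": "Acetaminophen", "doctor": "Dr. Smith"},
]

# Which CSV column produces which entity type
column_types = {
    "patient_id": "Patient",
    "diagnosis": "Diagnosis",
    "medication": "Medication",
    "doctor": "Doctor",
}

# Keyed by (type, id) so repeated values collapse into one entity
unique_entities = {}
for row in rows:
    for column, entity_type in column_types.items():
        value = row[column]
        unique_entities.setdefault(
            (entity_type, value),
            {"id": value, "type": entity_type, "name": value, "properties": {}},
        )

entities = list(unique_entities.values())
raw_count = len(rows) * len(column_types)
print(f"{raw_count} raw entities reduced to {len(entities)} unique entities")
```

Keying on the type as well as the id keeps a diagnosis and a medication with the same label from being merged by accident.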


## Step 3: Build Temporal Patient Knowledge Graph

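Resolve duplicate entities, build the knowledge graph, and validate it. The graph is temporal because each relationship carries the visit date as its `timestamp` property.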


In [None]:
builder = GraphBuilder()
entity_resolver = EntityResolver()
graph_validator = GraphValidator()

resolved_entities = entity_resolver.resolve(patient_entities)

patient_kg = builder.build(resolved_entities, relationships)

validation_result = graph_validator.validate(patient_kg)

print("Built temporal patient knowledge graph")
print(f"  Entities: {len(patient_kg.get('entities', []))}")
print(f"  Relationships: {len(patient_kg.get('relationships', []))}")
print(f"  Graph valid: {validation_result.get('valid', False)}")
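
One property worth checking in any graph built this way is referential integrity: every relationship should point at entities that exist. The sketch below illustrates that idea on plain dictionaries. It is not a description of what `GraphValidator` checks, and `find_dangling_relationships` is a hypothetical helper.

```python
def find_dangling_relationships(entities, relationships):
    """Return relationships whose source or target is not a known entity id."""
    known_ids = {entity["id"] for entity in entities}
    return [
        rel for rel in relationships
        if rel["source"] not in known_ids or rel["target"] not in known_ids
    ]


entities = [
    {"id": "P001", "type": "Patient"},
    {"id": "Hypertension", "type": "Diagnosis"},
]
relationships = [
    {"source": "P001", "target": "Hypertension", "type": "has_diagnosis",
     "properties": {"timestamp": "2024-01-15"}},
    # "Dr. Smith" is deliberately missing from the entity list
    {"source": "Dr. Smith", "target": "P001", "type": "treats",
     "properties": {"timestamp": "2024-01-15"}},
]

dangling = find_dangling_relationships(entities, relationships)
print(f"{len(dangling)} dangling relationship(s) found")
```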


## Step 4: Generate Medical Ontology

Generate an ontology from the resolved medical entities and their relationships, then validate it.


In [None]:
ontology_generator = OntologyGenerator()
class_inferrer = ClassInferrer()
property_generator = PropertyGenerator()
ontology_validator = OntologyValidator()

ontology = ontology_generator.generate_ontology({
    "entities": resolved_entities,
    "relationships": relationships
}, entities=resolved_entities, relationships=relationships)

classes = class_inferrer.infer_classes(resolved_entities)
properties = property_generator.infer_properties(resolved_entities, relationships, classes)

# Use a distinct name so the graph validation result from Step 3 is not overwritten
ontology_validation = ontology_validator.validate_ontology(ontology)

print("Generated medical ontology")
print(f"  Classes: {len(ontology.get('classes', []))}")
print(f"  Properties: {len(ontology.get('properties', []))}")
print(f"  Ontology valid: {ontology_validation.valid}")
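
To make the idea of ontology generation concrete, the sketch below derives classes from entity types and derives each property's domain and range from the types at either end of a relationship. This is an illustration of the general approach under simple assumptions, not the algorithm used by `OntologyGenerator`, `ClassInferrer`, or `PropertyGenerator`.

```python
entities = [
    {"id": "P001", "type": "Patient"},
    {"id": "Hypertension", "type": "Diagnosis"},
    {"id": "Lisinopril", "type": "Medication"},
    {"id": "Dr. Smith", "type": "Doctor"},
]
relationships = [
    {"source": "P001", "target": "Hypertension", "type": "has_diagnosis"},
    {"source": "P001", "target": "Lisinopril", "type": "prescribed"},
    {"source": "Dr. Smith", "target": "P001", "type": "treats"},
]

# Classes are the distinct entity types
type_by_id = {entity["id"]: entity["type"] for entity in entities}
classes = sorted(set(type_by_id.values()))

# Each property collects the (domain, range) pairs observed in the data
properties = {}
for rel in relationships:
    domain = type_by_id[rel["source"]]
    range_ = type_by_id[rel["target"]]
    properties.setdefault(rel["type"], set()).add((domain, range_))

print("Classes:", classes)
for name, signatures in sorted(properties.items()):
    for domain, range_ in sorted(signatures):
        print(f"  {name}: {domain} -> {range_}")
```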


## Step 5: Store in Triplet Store and Query

Store the knowledge graph in the triplet store, then query one patient's medical history within a date range.


In [None]:
triplet_store = TripletStore()
triplet_manager = TripletManager()
query_engine = QueryEngine()
temporal_query = TemporalGraphQuery()

triplet_store.store_knowledge_graph(patient_kg)

patient_id = "P001"
start_time = "2024-01-01"
end_time = "2024-12-31"

medical_history = temporal_query.query_time_range(
    graph=patient_kg,
    query=f"Find medical history for patient {patient_id}",
    start_time=start_time,
    end_time=end_time
)

print("Stored patient knowledge graph in triplet store")
print(f"Retrieved {len(medical_history.get('entities', []))} medical events for patient {patient_id}")
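
The time-range query above can be understood as a filter over relationship timestamps. The following standard-library sketch shows that filter on the sample data. It assumes ISO `YYYY-MM-DD` timestamps and an inclusive range, and `patient_history` is a hypothetical helper rather than part of `TemporalGraphQuery`.

```python
from datetime import date

visits = [
    ("P001", "2024-01-15", "Hypertension", "Lisinopril", "Dr. Smith"),
    ("P001", "2024-02-20", "Diabetes", "Metformin", "Dr. Jones"),
    ("P002", "2024-01-10", "Fever", "Acetaminophen", "Dr. Smith"),
]

# Rebuild the three timestamped relationships per visit, as in Step 2
relationships = []
for patient, visit_date, diagnosis, medication, doctor in visits:
    stamp = {"timestamp": visit_date}
    relationships += [
        {"source": patient, "target": diagnosis, "type": "has_diagnosis", "properties": stamp},
        {"source": patient, "target": medication, "type": "prescribed", "properties": stamp},
        {"source": doctor, "target": patient, "type": "treats", "properties": stamp},
    ]


def patient_history(relationships, patient_id, start, end):
    """Return a patient's relationships with start <= timestamp <= end, oldest first."""
    start_date, end_date = date.fromisoformat(start), date.fromisoformat(end)
    events = [
        rel for rel in relationships
        if patient_id in (rel["source"], rel["target"])
        and start_date <= date.fromisoformat(rel["properties"]["timestamp"]) <= end_date
    ]
    return sorted(events, key=lambda rel: rel["properties"]["timestamp"])


history = patient_history(relationships, "P001", "2024-01-01", "2024-12-31")
for rel in history:
    print(rel["properties"]["timestamp"], rel["source"], rel["type"], rel["target"])
```

Matching on both `source` and `target` matters because the `treats` relationship points from the doctor to the patient.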


## Step 6: Export and Visualize

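Export the knowledge graph and ontology to RDF, OWL, and JSON files, then render interactive visualizations of the graph, the ontology hierarchy, and the timeline.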


In [None]:
rdf_exporter = RDFExporter()
owl_exporter = OWLExporter()
json_exporter = JSONExporter()

rdf_exporter.export_knowledge_graph(patient_kg, os.path.join(temp_dir, "patient_kg.rdf"))
owl_exporter.export(ontology, os.path.join(temp_dir, "medical_ontology.owl"))
json_exporter.export_knowledge_graph(patient_kg, os.path.join(temp_dir, "patient_kg.json"))

kg_visualizer = KGVisualizer()
ontology_visualizer = OntologyVisualizer()
temporal_visualizer = TemporalVisualizer()

kg_viz = kg_visualizer.visualize_network(patient_kg, output="interactive")
ontology_viz = ontology_visualizer.visualize_hierarchy(ontology, output="interactive")
temporal_viz = temporal_visualizer.visualize_timeline(patient_kg, output="interactive")

print("Total modules used: 20+")
print(f"Pipeline complete: Patient Records → Parse → Extract → Temporal KG → Ontology → Triplet Store → Query → Export → Visualize")
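
To show roughly what an RDF export of the relationships involves, the sketch below serializes them as N-Triples using only the standard library. It is not the output format of `RDFExporter`. The namespace `BASE_IRI` and the helpers `to_iri` and `to_ntriples` are assumptions made for illustration, and the edge timestamps are dropped because a plain triple has no slot for them.

```python
from urllib.parse import quote

BASE_IRI = "http://example.org/patient-kg/"  # hypothetical namespace


def to_iri(local_name):
    """Percent-encode a local name and wrap it as an N-Triples IRI."""
    return f"<{BASE_IRI}{quote(local_name, safe='')}>"


def to_ntriples(relationships):
    """Serialize each relationship as one 'subject predicate object .' line."""
    lines = [
        f"{to_iri(rel['source'])} {to_iri(rel['type'])} {to_iri(rel['target'])} ."
        for rel in relationships
    ]
    return "\n".join(lines) + "\n"


relationships = [
    {"source": "P001", "target": "Hypertension", "type": "has_diagnosis"},
    {"source": "Dr. Smith", "target": "P001", "type": "treats"},
]

ntriples = to_ntriples(relationships)
print(ntriples, end="")
```

Names such as `Dr. Smith` contain a space, which is not allowed inside an IRI, so each name is percent-encoded before it is written.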
