# 🧠 MCODE Translator - CORE Memory Integration Demo



Interactive demonstration of CORE Memory operations and persistent knowledge management.



---



## 📋 What This Notebook Demonstrates



1. **🏗️ Space Management** - Creating and organizing memory spaces

2. **💾 Data Ingestion** - Storing clinical data in CORE Memory

3. **🔍 Semantic Search** - Finding information across memory spaces

4. **📊 Knowledge Graph** - Exploring relationships and connections

5. **🔄 Cross-Space Operations** - Working across multiple memory spaces

6. **📈 Memory Analytics** - Understanding memory utilization and patterns



## 🎯 Learning Objectives



- ✅ Master CORE Memory space management

- ✅ Understand persistent data storage patterns

- ✅ Learn semantic search across memory spaces

- ✅ Apply knowledge graph exploration techniques

- ✅ Use cross-space operations for complex queries

- ✅ Generate memory analytics and insights



## 🏥 Clinical Memory Use Cases



### Knowledge Management

- **Clinical Knowledge Base**: Centralized storage of medical knowledge

- **Patient History**: Longitudinal patient data and treatment history

- **Research Repository**: Clinical trial data and research findings

- **Treatment Protocols**: Standardized treatment guidelines and protocols



### Decision Support

- **Clinical Reasoning**: Evidence-based decision support

- **Patient Matching**: Finding similar cases and outcomes

- **Treatment Planning**: Historical treatment success patterns

- **Research Discovery**: Connecting clinical data with research findings

## 🔧 Setup and Configuration



### 📦 Import Required Libraries



**What this does:**

- Loads environment variables from `.env` file

- Imports MCODE Translator components

- Sets up path for local imports

- Fails fast with an install hint if the components cannot be imported



**Why it's useful:**

- Ensures all dependencies are available

- Provides secure credential management

- Enables local development and testing

- Prevents runtime import errors
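
The path setup described above can be made safe to re-run, which matters in a notebook where cells are often executed more than once. The following is a minimal sketch, not the project's own helper: `add_to_sys_path` is a hypothetical name, and the `src` directory next to the notebook is an assumed layout.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory: Path, position: int = 0) -> bool:
    """Insert a directory into sys.path once; return True if it was added."""
    entry = str(directory.resolve())
    if entry in sys.path:
        # Already importable from this location, so leave sys.path untouched.
        return False
    sys.path.insert(position, entry)
    return True


# Assumed layout: a local "src" directory next to the notebook.
src_dir = Path.cwd() / "src"
added_first_time = add_to_sys_path(src_dir)
added_second_time = add_to_sys_path(src_dir)
print(f"added on first call: {added_first_time}, on second call: {added_second_time}")
```

Running the cell repeatedly leaves a single entry for the directory in `sys.path`, so import resolution order stays stable across re-runs.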

In [None]:
# Import required modules
import os
import sys
from pathlib import Path

from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Add the local heysol_api_client checkout (sibling directory) to the import path
heysol_client_path = Path.cwd().parent / "heysol_api_client" / "src"
if str(heysol_client_path) not in sys.path:
    sys.path.insert(0, str(heysol_client_path))

# Add this project's src directory to the import path
src_path = Path.cwd() / "src"
if str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))

# Import MCODE Translator components
try:
    from heysol import HeySolClient
    
    from config.heysol_config import get_config
    
    print("✅ MCODE Translator components imported successfully!")
    print("   🧠 CORE Memory integration capabilities")
    print("   🏗️ Space management and organization")
    print("   🔍 Semantic search and knowledge discovery")
    
except ImportError as e:
    print("❌ Failed to import MCODE Translator components.")
    print("💡 Install with: pip install -e .")
    print(f"   Error: {e}")
    raise

### 🔑 API Key Validation



**What this does:**

- Checks for valid HeySol API key in environment

- Validates API key format and accessibility

- Initializes HeySol client for memory operations

- Sets up configuration for CORE Memory integration



**Why it's useful:**

- Ensures secure access to CORE Memory services

- Prevents failed operations due to authentication issues

- Provides clear feedback about connection status

- Enables proper error handling and recovery
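
The key check described above can be sketched as two small functions, one that reads the key and one that masks it for display. This is an illustrative sketch only: `read_api_key` and `mask_secret` are hypothetical helpers, not part of the HeySol client, and the environment is passed in as a mapping so the logic can be exercised without touching real credentials.

```python
from typing import Mapping


def read_api_key(env: Mapping[str, str], name: str = "HEYSOL_API_KEY") -> str:
    """Return the key stored under `name`, or raise if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise ValueError(f"{name} is not set")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret."""
    if len(secret) <= visible:
        # Too short to reveal anything safely, so mask everything.
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Made-up value for illustration; in the notebook the mapping would be os.environ.
demo_env = {"HEYSOL_API_KEY": "abcd1234efgh"}
print(mask_secret(read_api_key(demo_env)))
```

Masking short keys entirely avoids printing a whole secret when a placeholder or truncated value has been configured by mistake.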

In [None]:
# Check and validate API key
print("🔑 Checking API key configuration...")

api_key = os.getenv("HEYSOL_API_KEY")
if not api_key:
    print("❌ No API key found!")
    print("\n📝 To get started:")
    print("1. Visit: https://core.heysol.ai/settings/api")
    print("2. Generate an API key")
    print("3. Set environment variable:")
    print("   export HEYSOL_API_KEY='your-api-key-here'")
    print("4. Or create a .env file with:")
    print("   HEYSOL_API_KEY=your-api-key-here")
    print("\nThen restart this notebook!")
    raise ValueError("API key not configured")

print(f"✅ API key found (ends with: ...{api_key[-4:]})")
print("🔍 Validating API key...")

# Initialize HeySol client
try:
    client = HeySolClient(api_key=api_key)
    config = get_config()
    
    print("✅ Client initialized successfully")
    print(f"   🎯 Base URL: {config.get_base_url()}")
    print(f"   📧 Source: {config.get_heysol_config().source}")
    
except Exception as e:
    print(f"❌ Failed to initialize client: {e}")
    raise

## 🏗️ Memory Space Management



### 🏗️ Create Multiple Memory Spaces



**What this does:**

- Creates dedicated memory spaces for different data types

- Organizes clinical knowledge into logical categories

- Sets up isolated environments for different use cases

- Enables efficient data organization and retrieval



**Why it's useful:**

- Provides structured approach to knowledge management

- Enables focused search within specific domains

- Supports concurrent operations across different data types

- Facilitates data lifecycle management and access control
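
The reuse-or-create pattern used for spaces can be sketched in isolation. The example below assumes only what the notebook itself relies on, namely a client with `get_spaces()` returning a list of dicts and `create_space(name, description)` returning an ID. `InMemorySpaces` is a hypothetical stand-in so the sketch runs without network access; it is not the real HeySol client.

```python
import uuid
from typing import Dict, List


class InMemorySpaces:
    """Hypothetical stand-in exposing get_spaces() and create_space()."""

    def __init__(self) -> None:
        self._spaces: List[Dict[str, str]] = []

    def get_spaces(self) -> List[Dict[str, str]]:
        return list(self._spaces)

    def create_space(self, name: str, description: str = "") -> str:
        space_id = uuid.uuid4().hex
        self._spaces.append(
            {"id": space_id, "name": name, "description": description}
        )
        return space_id


def get_or_create_space(client, name: str, description: str = "") -> str:
    """Return the ID of the space called `name`, creating it only if absent."""
    for space in client.get_spaces():
        if isinstance(space, dict) and space.get("name") == name:
            return str(space.get("id"))
    return str(client.create_space(name, description))


demo_client = InMemorySpaces()
first_id = get_or_create_space(demo_client, "Patient Records", "Patient data")
second_id = get_or_create_space(demo_client, "Patient Records", "Patient data")
print(first_id == second_id, len(demo_client.get_spaces()))
```

Looking up by name before creating keeps the cell idempotent, so re-running the notebook does not pile up duplicate spaces.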

In [None]:
# Create multiple memory spaces for different clinical domains
print("🏗️ Creating Multiple Memory Spaces")
print("=" * 40)

spaces_config = [
    {
        "name": "Clinical Knowledge Base",
        "description": "General clinical knowledge and medical information",
        "purpose": "clinical_knowledge",
    },
    {
        "name": "Patient Records",
        "description": "Patient data and treatment histories",
        "purpose": "patient_data",
    },
    {
        "name": "Research Findings",
        "description": "Clinical research data and study results",
        "purpose": "research_data",
    },
    {
        "name": "Treatment Protocols",
        "description": "Standardized treatment guidelines and protocols",
        "purpose": "treatment_protocols",
    },
]

created_spaces = {}

# Check existing spaces and create new ones
existing_spaces = client.get_spaces()
existing_space_names = [
    space.get("name") for space in existing_spaces 
    if isinstance(space, dict)
]

for space_config in spaces_config:
    space_name = space_config["name"]
    
    if space_name in existing_space_names:
        print(f"   ✅ Found existing space: {space_name}")
        # Find the space ID
        for space in existing_spaces:
            if isinstance(space, dict) and space.get("name") == space_name:
                created_spaces[space_config["purpose"]] = {
                    "id": space.get("id"),
                    "name": space_name,
                    "description": space_config["description"],
                }
                break
    else:
        print(f"   🆕 Creating space: {space_name}")
        try:
            space_id = client.create_space(
                space_name, space_config["description"]
            )
            created_spaces[space_config["purpose"]] = {
                "id": space_id,
                "name": space_name,
                "description": space_config["description"],
            }
            print(f"      ✅ Created with ID: {str(space_id)[:16]}...")
        except Exception as e:
            print(f"      ❌ Failed to create: {e}")

print(f"\n✅ Created or reused {len(created_spaces)} memory spaces")
for purpose, space_info in created_spaces.items():
    print(f"   {purpose}: {space_info['name']} ({str(space_info['id'])[:16]}...)")

## 💾 Data Ingestion into Memory Spaces



### 📥 Ingest Clinical Knowledge



**What this does:**

- Adds clinical knowledge to appropriate memory spaces

- Organizes data by domain and purpose

- Applies rich metadata for enhanced searchability

- Ensures data persistence in CORE Memory



**Why it's useful:**

- Creates comprehensive clinical knowledge base

- Enables domain-specific information retrieval

- Supports evidence-based clinical decision making

- Facilitates knowledge sharing and collaboration
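
Per-space bookkeeping for ingestion can be separated from the API calls, which makes the summary easy to check. The sketch below is one possible way to tally outcomes; `summarize_ingestion` is a hypothetical helper that takes `(space, succeeded)` pairs rather than talking to CORE Memory.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize_ingestion(
    outcomes: Iterable[Tuple[str, bool]]
) -> Dict[str, Dict[str, float]]:
    """Aggregate (space, succeeded) pairs into per-space counts and rates."""
    successes: Counter = Counter()
    totals: Counter = Counter()
    for space, succeeded in outcomes:
        totals[space] += 1
        if succeeded:
            successes[space] += 1
    # Only spaces with at least one attempt appear, so the division is safe.
    return {
        space: {
            "successful": successes[space],
            "failed": total - successes[space],
            "success_rate": 100.0 * successes[space] / total,
        }
        for space, total in totals.items()
    }


summary = summarize_ingestion(
    [("patient_data", True), ("patient_data", False), ("research_data", True)]
)
for space, stats in summary.items():
    print(f"{space}: {stats['success_rate']:.1f}% success")
```

Because a space only enters the summary after at least one attempt, the success rate never divides by zero, and an empty run simply yields an empty summary.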

In [None]:
# Ingest clinical knowledge into appropriate memory spaces
print("💾 Ingesting Clinical Knowledge into Memory Spaces")
print("=" * 60)

# Sample clinical knowledge for different domains
clinical_knowledge = [
    {
        "space": "clinical_knowledge",
        "content": "Immunotherapy represents a revolutionary approach to cancer treatment, harnessing the body's immune system to recognize and destroy cancer cells. Checkpoint inhibitors like PD-1/PD-L1 and CTLA-4 antibodies have demonstrated remarkable efficacy across multiple cancer types including melanoma, lung cancer, and renal cell carcinoma.",
        "metadata": {
            "domain": "immunotherapy",
            "topic": "checkpoint_inhibitors",
            "cancer_types": ["melanoma", "lung_cancer", "renal_cell"],
            "treatment_modality": "immunotherapy",
            "evidence_level": "high",
        },
    },
    {
        "space": "patient_data",
        "content": "Patient P001: 58-year-old male with stage IV adenocarcinoma of the lung, EGFR exon 19 deletion positive. Initiated osimertinib therapy with excellent initial response showing 70% reduction in tumor burden after 8 weeks. Currently maintaining stable disease with good quality of life.",
        "metadata": {
            "patient_id": "P001",
            "domain": "patient_case",
            "cancer_type": "lung_adenocarcinoma",
            "stage": "IV",
            "mutation": "EGFR_exon_19",
            "treatment": "osimertinib",
            "response": "stable_disease",
        },
    },
    {
        "space": "research_data",
        "content": "Phase III KEYNOTE-189 trial demonstrated significant overall survival benefit with pembrolizumab plus chemotherapy versus chemotherapy alone in patients with metastatic non-squamous NSCLC. Median OS improved from 11.3 to 22.0 months (HR 0.56, p<0.001). Benefit was consistent across PD-L1 expression levels.",
        "metadata": {
            "trial_id": "KEYNOTE-189",
            "domain": "clinical_research",
            "phase": "III",
            "cancer_type": "NSCLC",
            "treatment": "pembrolizumab_chemotherapy",
            "endpoint": "overall_survival",
            "outcome": "positive",
        },
    },
    {
        "space": "treatment_protocols",
        "content": "Standard first-line treatment for metastatic EGFR-mutant NSCLC: Oral EGFR TKI therapy (osimertinib preferred for exon 19 deletions and L858R mutations). Continue treatment until disease progression or unacceptable toxicity. Regular monitoring with CT imaging every 6-8 weeks. Consider brain MRI for patients with neurological symptoms.",
        "metadata": {
            "domain": "treatment_guideline",
            "cancer_type": "NSCLC",
            "biomarker": "EGFR_mutation",
            "line": "first_line",
            "treatment": "EGFR_TKI",
            "monitoring": "CT_every_6_8_weeks",
        },
    },
]

ingestion_results = {}

for i, knowledge in enumerate(clinical_knowledge, 1):
    space_purpose = knowledge["space"]
    space_info = created_spaces.get(space_purpose)
    
    if not space_info:
        print(f"⚠️ Space not found for {space_purpose}, skipping...")
        continue
    
    print(f"\n📥 Item {i}/{len(clinical_knowledge)}: {space_info['name']}")
    
    try:
        result = client.ingest(
            message=knowledge["content"],
            space_id=space_info["id"],
            metadata=knowledge["metadata"],
        )
        
        print("   ✅ Ingested successfully")
        print("   💾 Saved to CORE Memory: Persistent storage enabled")
        print(f"   📋 Domain: {knowledge['metadata']['domain']}")
        
        # Track results by space
        ingestion_results.setdefault(space_purpose, {"successful": 0, "failed": 0})
        ingestion_results[space_purpose]["successful"] += 1
        
    except Exception as e:
        print(f"   ❌ Ingestion failed: {e}")
        ingestion_results.setdefault(space_purpose, {"successful": 0, "failed": 0})
        ingestion_results[space_purpose]["failed"] += 1

print("\n📊 Ingestion Summary by Space:")
for space_purpose, results in ingestion_results.items():
    space_name = created_spaces[space_purpose]["name"]
    success_rate = (results["successful"] / (results["successful"] + results["failed"])) * 100
    print(f"   {space_name}: {results['successful']} successful, {results['failed']} failed ({success_rate:.1f}% success)")

## 🔍 Semantic Search Across Memory Spaces



### 🔎 Cross-Space Search Operations



**What this does:**

- Performs semantic search across multiple memory spaces

- Demonstrates knowledge discovery and connection finding

- Shows relevance ranking and result prioritization

- Enables comprehensive information retrieval



**Why it's useful:**

- Provides holistic view of clinical knowledge

- Enables evidence-based clinical decision making

- Supports research and knowledge discovery

- Facilitates interdisciplinary information access
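
Search results are easier to scan when each hit is reduced to one line. The sketch below assumes the response shape this notebook reads, a dict with an `episodes` list whose items may carry `content`, `score`, and `metadata`; whether every field is present in a real response is an assumption. `preview` and `summarize_episodes` are hypothetical helpers.

```python
from typing import Any, Dict, List


def preview(text: str, width: int = 100) -> str:
    """Truncate text to `width` characters, adding '...' only when cut."""
    return text if len(text) <= width else text[:width] + "..."


def summarize_episodes(results: Dict[str, Any], width: int = 100) -> List[str]:
    """Build one display line per episode, tolerating missing fields."""
    lines = []
    for rank, episode in enumerate(results.get("episodes", []), 1):
        metadata = episode.get("metadata") or {}
        score = episode.get("score", "N/A")
        domain = metadata.get("domain", "N/A")
        snippet = preview(episode.get("content", ""), width)
        lines.append(f"{rank}. [{domain}] score={score}: {snippet}")
    return lines


# Made-up response for illustration only.
sample = {
    "episodes": [
        {"content": "EGFR TKI therapy", "score": 0.9,
         "metadata": {"domain": "treatment_guideline"}},
        {"content": "x" * 150},
    ]
}
for line in summarize_episodes(sample, width=20):
    print(line)
```

Comparing the full length against the width, rather than the length of the already truncated string, avoids appending an ellipsis to content that happens to be exactly the preview width.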

In [None]:
# Perform semantic search across memory spaces
print("🔍 Semantic Search Across Memory Spaces")
print("=" * 50)

search_scenarios = [
    {
        "query": "immunotherapy lung cancer",
        "description": "Find immunotherapy information for lung cancer",
        "spaces": ["clinical_knowledge", "research_data", "treatment_protocols"],
    },
    {
        "query": "EGFR mutation treatment",
        "description": "Find EGFR mutation treatment information",
        "spaces": ["clinical_knowledge", "patient_data", "treatment_protocols"],
    },
    {
        "query": "checkpoint inhibitors",
        "description": "Find checkpoint inhibitor information",
        "spaces": ["clinical_knowledge", "research_data"],
    },
]

for scenario in search_scenarios:
    print(f"\n🔎 {scenario['description']}")
    print(f"   Query: '{scenario['query']}'")
    
    # Get space IDs for the search
    space_ids = [
        created_spaces[space_purpose]["id"] 
        for space_purpose in scenario["spaces"]
        if space_purpose in created_spaces
    ]
    
    space_names = [
        created_spaces[space_purpose]["name"] 
        for space_purpose in scenario["spaces"]
        if space_purpose in created_spaces
    ]
    
    print(f"   Searching in: {', '.join(space_names)}")
    if not space_ids:
        print("   ⚠️ None of the requested spaces are available, skipping...")
        continue
    
    try:
        results = client.search(
            query=scenario["query"], 
            space_ids=space_ids, 
            limit=5
        )
        
        episodes = results.get("episodes", [])
        print(f"   ✅ Found {len(episodes)} relevant results")
        
        if episodes:
            print("\n   📋 Top Results:")
            for i, episode in enumerate(episodes, 1):
                full_content = episode.get("content", "")
                content = full_content[:100]
                score = episode.get("score", "N/A")
                metadata = episode.get("metadata", {})
                
                print(f"\n   {i}. Score: {score}")
                print(f"      Domain: {metadata.get('domain', 'N/A')}")
                print(f"      Content: {content}{'...' if len(full_content) > 100 else ''}")
                
    except Exception as e:
        print(f"   ❌ Search failed: {e}")

print("\n✅ Cross-space search operations completed!")

## 🎯 CORE Memory Integration Summary



### 📊 Results Summary



**Space Management:**

- **Spaces Created**: Four memory spaces created or reused, one per clinical domain

- **Organization**: Logical separation of clinical knowledge domains

- **Isolation**: Each domain kept in its own space so searches can be scoped to it

- **Scalability**: Support for growing knowledge repositories



**Data Ingestion:**

- **Knowledge Items**: Four sample clinical items stored in CORE Memory, one per space

- **Metadata Richness**: Comprehensive tagging and categorization

- **Persistence**: Long-term storage and accessibility

- **Searchability**: Enhanced discovery and retrieval capabilities



**Search Operations:**

- **Cross-Space Queries**: Information retrieval across multiple domains

- **Semantic Matching**: Understanding of clinical concepts and relationships

- **Relevance Ranking**: Prioritization of most relevant information

- **Knowledge Discovery**: Uncovering connections and patterns



### 🔍 Verification and Testing



**Verify Operations:**

- Test space creation and management

- Validate data ingestion and persistence

- Check cross-space search functionality

- Assess metadata preservation and accuracy



**Quality Assurance:**

- Data integrity and consistency validation

- Search result relevance and accuracy

- Performance benchmarking and optimization

- Memory utilization and efficiency analysis
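
The metadata preservation check listed above can be expressed as a comparison between what was ingested and what a search returns. This is a minimal sketch under the assumption that search results echo the stored metadata; `metadata_mismatches` is a hypothetical helper and the sample values are made up.

```python
from typing import Any, Dict, Tuple


def metadata_mismatches(
    expected: Dict[str, Any], actual: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing key to (expected value, value found or None)."""
    return {
        key: (value, actual.get(key))
        for key, value in expected.items()
        if actual.get(key) != value
    }


ingested = {"domain": "patient_case", "stage": "IV", "treatment": "osimertinib"}
# Made-up retrieval result: one value changed and one key dropped.
retrieved = {"domain": "patient_case", "stage": "III", "source": "demo"}

problems = metadata_mismatches(ingested, retrieved)
if problems:
    for key, (want, got) in problems.items():
        print(f"{key}: expected {want!r}, found {got!r}")
else:
    print("metadata preserved")
```

Extra keys added by the storage layer are ignored on purpose, so the check only flags fields that were supplied at ingestion time and came back missing or changed.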

In [None]:
# Quick verification and cleanup
print("🔍 Verifying CORE Memory Integration")
print("=" * 40)

try:
    # Test space listing
    all_spaces = client.get_spaces()
    print(f"✅ Found {len(all_spaces)} total spaces in CORE Memory")
    
    # Test cross-space search
    verification_search = client.search(
        query="clinical knowledge", 
        limit=3
    )
    
    episodes = verification_search.get("episodes", [])
    print(f"✅ Cross-space search returned {len(episodes)} results")
    
    if episodes:
        print("\n📋 Sample Results:")
        for i, episode in enumerate(episodes[:2], 1):
            metadata = episode.get("metadata", {})
            print(f"   {i}. Domain: {metadata.get('domain', 'N/A')}")
    
except Exception as e:
    print(f"⚠️ Verification failed: {e}")

# Cleanup
print("\n🧹 Cleaning up...")
try:
    client.close()
    print("✅ Client connection closed successfully")
except Exception as e:
    print(f"⚠️ Cleanup warning: {e}")

print("\n🎉 CORE Memory integration demo finished!")
print("💡 Review any ⚠️ or ❌ messages above before relying on the stored clinical knowledge.")