# Health Check Functionality Demo

This notebook demonstrates the health check functionality of the RAG Engine Mini.

## Learning Objectives

By the end of this notebook, you will understand:
1. How the health check functionality works in the RAG Engine
2. The different health check endpoints available
3. How to use the health check API endpoints
4. The architecture of the health check service
5. How health checks fit into production monitoring

In [None]:
import sys
import os
from pathlib import Path
import asyncio
import json
from datetime import datetime

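# Add the project root to the path (resolved to an absolute path so imports
# keep working if the working directory changes later)
project_root = Path("../").resolve()
sys.path.insert(0, str(project_root))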

print(f"Project root: {project_root}")
print("Environment set up successfully")

## Understanding the Health Check Architecture

The health check functionality follows the same architectural patterns as the rest of the RAG Engine:

1. **Port/Adapter Pattern**: The `HealthCheckServicePort` defines the interface
2. **Dependency Injection**: Services are injected through the container
3. **Separation of Concerns**: Health logic is separate from API logic
4. **Component Monitoring**: Individual component health checks
5. **System-wide Monitoring**: Comprehensive system health reports
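
The first three patterns in the list above can be illustrated with a small, self-contained sketch. Every name in it (`DemoResult`, `DemoHealthPort`, `DemoHealthAdapter`, `health_endpoint`) is hypothetical and chosen for illustration; the real `HealthCheckServicePort` and its methods may look different, and the `"healthy"` status string is an assumption.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class DemoResult:
    """Hypothetical result type; the real HealthCheckResult may differ."""
    component: str
    status: str


class DemoHealthPort(Protocol):
    """Hypothetical port: the interface the API layer depends on."""

    def check(self, component: str) -> DemoResult: ...


class DemoHealthAdapter:
    """Hypothetical adapter: one concrete implementation of the port."""

    def __init__(self, known_components: set[str]) -> None:
        self._known = known_components

    def check(self, component: str) -> DemoResult:
        # "healthy" is an assumed status name, used only in this sketch.
        status = "healthy" if component in self._known else "error"
        return DemoResult(component=component, status=status)


def health_endpoint(service: DemoHealthPort, component: str) -> dict:
    """API-layer function: receives the service by injection, knows only the port."""
    result = service.check(component)
    return {"component": result.component, "status": result.status}


# Wiring happens in one place (a container, in a real project).
demo_service = DemoHealthAdapter(known_components={"database", "cache"})
demo_response = health_endpoint(demo_service, "cache")
```

Because `health_endpoint` depends only on the port, a test can pass in any object with a matching `check` method without touching the API code.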

In [None]:
# Let's look at the health check service definition
from src.application.services.health_check_service import HealthCheckService, HealthCheckResult, SystemHealthReport

print("Health Check Service Components:")
print(f"- Health Check Service: {HealthCheckService.__name__}")
print(f"- Health Check Result: {HealthCheckResult.__name__}")
print(f"- System Health Report: {SystemHealthReport.__name__}")

print(f"\nHealth check service methods: {[method for method in dir(HealthCheckService) if not method.startswith('_') and callable(getattr(HealthCheckService, method, None))]}\n")

## Using the Health Check Service

Let's see how to use the health check service to monitor different system components:

In [None]:
# Import only what this demo needs
from src.application.services.health_check_service import HealthCheckService
from src.adapters.persistence.placeholder import PlaceholderDocumentRepo

# The production adapters and settings are listed for reference only; this demo
# substitutes mocks, so importing them (and their client libraries) is not required:
# from src.adapters.cache.redis_cache import RedisCache
# from src.adapters.vector.qdrant_store import QdrantVectorStore
# from src.adapters.llm.openai_llm import OpenAILLM
# from src.core.config import settings

# For this demo, we'll use placeholder implementations
doc_repo = PlaceholderDocumentRepo()

# Create mock implementations for other services
# In a real scenario, these would be actual service implementations
class MockCache:
    async def set(self, key, value, ttl):
        return True
    async def get(self, key):
        return "health_check"
    async def delete(self, key):
        return True

class MockVectorStore:
    pass

class MockLLM:
    pass

cache = MockCache()
vector_store = MockVectorStore()
llm = MockLLM()

# Create the health check service
health_service = HealthCheckService(
    document_repo=doc_repo,
    cache=cache,
    vector_store=vector_store,
    llm=llm
)

print("Health check service initialized successfully")
print(f"Document repository: {type(doc_repo).__name__}")
print(f"Cache: {type(cache).__name__}")
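
The `MockCache` above returns `"health_check"` from `get`, which suggests the service may verify the cache with a write/read/delete round trip. The sketch below shows that technique under that assumption; `InMemoryCache`, `cache_round_trip_ok`, and the key `"health:probe"` are hypothetical names, not part of the RAG Engine.

```python
import asyncio


class InMemoryCache:
    """Hypothetical stand-in with the same set/get/delete shape as MockCache."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    async def set(self, key: str, value: str, ttl: int) -> bool:
        self._data[key] = value  # ttl is ignored in this sketch
        return True

    async def get(self, key: str):
        return self._data.get(key)

    async def delete(self, key: str) -> bool:
        return self._data.pop(key, None) is not None


async def cache_round_trip_ok(cache, key: str = "health:probe") -> bool:
    """Write a sentinel, read it back, then clean up; True if the value survived."""
    sentinel = "health_check"
    await cache.set(key, sentinel, 10)
    try:
        return await cache.get(key) == sentinel
    finally:
        # Always remove the probe key, even if the read failed.
        await cache.delete(key)


# In a notebook cell: ok = await cache_round_trip_ok(InMemoryCache())
# In a plain script:  ok = asyncio.run(cache_round_trip_ok(InMemoryCache()))
```

A round trip exercises both the write and read paths, so it catches failures that a simple connection ping would miss.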

## Testing Individual Component Health

Let's test the health of different system components:

In [None]:
print("Testing health of individual components:\n")

components = ["database", "cache", "vector_store", "llm", "api"]

for component in components:
    try:
        # Jupyter already runs an event loop, so await directly instead of asyncio.run()
        result = await health_service.check_component_health(component)
        print(f"✅ {component.upper()} health check completed")
        print(f"   Status: {result.status}")
        print(f"   Response time: {result.response_time_ms} ms")
        print(f"   Details: {result.details[:60]}..." if len(result.details) > 60 else f"   Details: {result.details}")
        print()
    except Exception as e:
        print(f"❌ {component.upper()} health check failed: {e}\n")
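
Each result printed above carries a status, a response time, and details. One plausible way to produce such a result is to wrap a probe in a timer and map exceptions to an error status, as sketched here. `TimedResult`, `timed_check`, and the two probes are hypothetical, and `"healthy"` is an assumed status name; the real `HealthCheckResult` may be built differently.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class TimedResult:
    """Hypothetical result shape mirroring the fields printed above."""
    component: str
    status: str
    response_time_ms: float
    details: str


async def timed_check(component: str, probe) -> TimedResult:
    """Run an async probe, timing it and mapping exceptions to an 'error' status."""
    start = time.perf_counter()
    try:
        details = await probe()
        status = "healthy"  # assumed name for the non-failing status
    except Exception as exc:
        details = f"{type(exc).__name__}: {exc}"
        status = "error"
    elapsed_ms = round((time.perf_counter() - start) * 1000, 2)
    return TimedResult(component, status, elapsed_ms, details)


async def ok_probe() -> str:
    await asyncio.sleep(0.01)  # simulate a short round trip
    return "connection ok"


async def failing_probe() -> str:
    raise ConnectionError("connection refused")
```

In a notebook cell this could be called as `await timed_check("cache", ok_probe)`; in a plain script, wrap the call in `asyncio.run(...)`.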

## Running a Comprehensive System Health Check

Now let's run a comprehensive health check of the entire system:

In [None]:
try:
    # Jupyter already runs an event loop, so await directly instead of asyncio.run()
    report = await health_service.check_system_health()
    
    print("System Health Report:")
    print(f"- Overall Status: {report.overall_status}")
    print(f"- Generated at: {report.timestamp}")
    print(f"- Components Checked: {len(report.components)}")
    print(f"- Dependencies Monitored: {len(report.dependencies)}")
    
    print(f"\nMetrics:")
    for key, value in report.metrics.items():
        print(f"  {key}: {value}")
    
    print(f"\nComponent Details:")
    for comp in report.components:
        status_icon = "🔴" if comp.status == "error" else "🟡" if comp.status == "degraded" else "🟢"
        print(f"  {status_icon} {comp.component}: {comp.status} ({comp.response_time_ms}ms)")
        
    print(f"\nDependency Status:")
    for dep, info in report.dependencies.items():
        status_icon = "🔴" if info['status'] in ['error', 'disconnected'] else "🟢"
        print(f"  {status_icon} {dep}: {info['status']}")
        
except Exception as e:
    print(f"Failed to run system health check: {e}")
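
The report's overall status has to be derived from the individual component statuses somehow. A common approach, shown here as an assumption rather than the engine's actual rule, is "worst status wins". The `SEVERITY` table and `overall_status` function are hypothetical, and `"healthy"` is a guess at the non-failing status name.

```python
# Assumed severity order; "healthy" is a guess at the non-failing status name.
SEVERITY = {"healthy": 0, "degraded": 1, "error": 2}


def overall_status(component_statuses: list[str]) -> str:
    """Return the most severe status present; unknown statuses count as 'error'."""
    if not component_statuses:
        return "error"  # nothing was checked, so do not claim health
    worst = max(SEVERITY.get(s, SEVERITY["error"]) for s in component_statuses)
    # Map the numeric rank back to its status name.
    return next(name for name, rank in SEVERITY.items() if rank == worst)
```

Treating an empty or unrecognized input as `"error"` is a deliberately conservative choice: a monitoring system should not report health it has not verified.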

## Testing Dependency Health Checks

Let's check the health of specific dependencies:

In [None]:
dependencies = ["postgresql", "redis", "qdrant", "llm_provider"]

print("Testing dependency health checks:\n")

for dep in dependencies:
    try:
        # Jupyter already runs an event loop, so await directly instead of asyncio.run()
        result = await health_service.check_dependency_health(dep)
        status_icon = "🔴" if result.status == "error" else "🟡" if result.status == "degraded" else "🟢"
        print(f"{status_icon} {dep}: {result.status} ({result.response_time_ms}ms)")
        print(f"   Details: {result.details[:60]}..." if len(result.details) > 60 else f"   Details: {result.details}")
        print()
    except Exception as e:
        print(f"❌ {dep} health check failed: {e}\n")
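
Checks against external dependencies should not hang when a service is unreachable. The sketch below bounds a probe with `asyncio.wait_for`. Mapping a timeout to `"degraded"` is an assumption made for illustration, as are the names `probe_with_timeout`, `fast_probe`, `slow_probe`, and `broken_probe`; the engine's own dependency checks may classify timeouts differently.

```python
import asyncio


async def probe_with_timeout(probe, timeout_s: float = 1.0) -> str:
    """Return 'healthy', 'degraded' on timeout, or 'error' if the probe raises."""
    try:
        await asyncio.wait_for(probe(), timeout=timeout_s)
        return "healthy"  # assumed name for the non-failing status
    except asyncio.TimeoutError:
        return "degraded"  # assumption: slow but not proven dead
    except Exception:
        return "error"


async def fast_probe() -> None:
    await asyncio.sleep(0)


async def slow_probe() -> None:
    await asyncio.sleep(1)


async def broken_probe() -> None:
    raise ConnectionRefusedError("port closed")
```

In a notebook cell, `await probe_with_timeout(slow_probe, timeout_s=0.05)` returns after roughly 50 ms instead of waiting the full second.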

## API Endpoints

The health check functionality is also available through API endpoints. Let's examine the routes:

In [None]:
# Import the API router to see available endpoints
from src.api.v1.routes_health import router

print("Health Check API routes:")
for route in router.routes:
    if hasattr(route, 'methods') and hasattr(route, 'path'):
        print(f"- {list(route.methods)}: {route.path}")

print(f"\nTotal health check API routes: {len([r for r in router.routes if hasattr(r, 'methods')])}")

## Health Check Importance in Production

Orchestrators and monitoring tools rely on health checks to decide when to restart a service and when to route traffic to it. The common types are:

In [None]:
print("Types of Health Checks in Production Systems:\n")

health_types = {
    "Liveness": {
        "purpose": "Is the application alive and responding?",
        "endpoint": "/health/live",
        "use_case": "If dead, restart the container/pod"
    },
    "Readiness": {
        "purpose": "Is the application ready to receive traffic?",
        "endpoint": "/health/ready", 
        "use_case": "If not ready, don't send traffic to it"
    },
    "Detailed": {
        "purpose": "Comprehensive system status",
        "endpoint": "/health/detailed",
        "use_case": "Monitor specific components and dependencies"
    },
    "Dependencies": {
        "purpose": "Check external service connections",
        "endpoint": "/health/dependencies",
        "use_case": "Verify connectivity to DB, cache, etc."
    }
}

for health_type, details in health_types.items():
    print(f"{health_type} Health Check:")
    print(f"  Purpose: {details['purpose']}")
    print(f"  Endpoint: {details['endpoint']}")
    print(f"  Use Case: {details['use_case']}\n")
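
The difference between liveness and readiness is easiest to see in code. In this sketch, liveness never consults dependencies, while readiness fails when any dependency reports `error` or `disconnected` (the two failing states used in the dependency report earlier). The function names, response bodies, and the choice of HTTP 503 are illustrative assumptions, not the engine's actual handlers.

```python
def liveness() -> tuple[int, dict]:
    """Liveness only proves the process can answer; it checks no dependencies."""
    return 200, {"status": "alive"}


def readiness(dependency_statuses: dict[str, str]) -> tuple[int, dict]:
    """Readiness returns 503 while any dependency is 'error' or 'disconnected'."""
    failing = sorted(
        name
        for name, status in dependency_statuses.items()
        if status in ("error", "disconnected")
    )
    if failing:
        return 503, {"status": "not_ready", "failing": failing}
    return 200, {"status": "ready"}
```

Keeping liveness free of dependency checks avoids restart loops: if the database goes down, the service stops receiving traffic (readiness fails) but is not killed and restarted for a problem a restart cannot fix.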

## Summary

In this notebook, we explored the health check functionality of the RAG Engine Mini:

1. **Architecture**: The health check service follows the same architectural patterns as the rest of the system
2. **Component Monitoring**: Individual checks for database, cache, vector store, LLM, and API
3. **System Reporting**: Comprehensive reports combining all component statuses
4. **API Access**: Multiple endpoints for different health check needs
5. **Production Value**: Critical for monitoring, alerting, and system reliability

Health checks let production systems automate monitoring, alerting, and remediation. The RAG Engine's health check implementation reports both overall system status and per-component health, which helps keep RAG services reliable in production.