# Neo4j Lab 11: Python Driver & Service Architecture
## Part 1: Python Driver Setup and Basics

**Duration:** 10 minutes  
**Objective:** Set up the Neo4j Python driver, establish connections, and implement enterprise-grade connection management

---

## Overview

This notebook covers:
- Environment setup and dependency installation
- Neo4j Python driver configuration
- Connection verification and health checks
- Enterprise connection manager implementation
- Connection pooling and retry logic

## Cell 1: Install and Verify Dependencies

First, we'll check that each required package is importable and install any that are missing.

In [None]:
# Cell 1: Install and verify dependencies
import subprocess
import sys
import importlib

def install_package(package):
    """Install package using pip"""
    try:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✓ Successfully installed {package}")
    except subprocess.CalledProcessError as e:
        print(f"✗ Failed to install {package}: {e}")

def verify_package(package_name, import_name=None):
    """Verify package is installed and importable"""
    if import_name is None:
        import_name = package_name.replace('-', '_')
    
    try:
        importlib.import_module(import_name)
        print(f"✓ {package_name} is installed and importable")
        return True
    except ImportError:
        print(f"✗ {package_name} is not available")
        return False

# Required packages for this lab
required_packages = [
    ("neo4j", "neo4j"),
    ("python-dotenv", "dotenv"),
    ("pytest", "pytest"),
    ("pydantic", "pydantic"),
    ("typing-extensions", "typing_extensions")
]

print("DEPENDENCY VERIFICATION:")
print("=" * 50)

all_available = True
for package_name, import_name in required_packages:
    if not verify_package(package_name, import_name):
        print(f"Installing {package_name}...")
        install_package(package_name)
        all_available = False

if all_available:
    print("\n✓ All dependencies are ready!")
else:
    print("\n⚠ Some dependencies were installed. Restart kernel and re-run this cell.")

print("=" * 50)
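
Once the packages import cleanly, it can help to record which versions are installed, since driver behavior differs between major releases. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of the lab code.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# These are distribution names (as used by pip), not import names
for dist in ["neo4j", "python-dotenv", "pytest", "pydantic", "typing-extensions"]:
    found = installed_version(dist)
    print(f"{dist}: {found if found else 'not installed'}")
```

Note that the lookup uses the pip distribution name (`python-dotenv`), whereas `verify_package` above uses the import name (`dotenv`).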

## Cell 2: Environment Setup and Configuration

Configure Neo4j connection parameters using environment variables for security and flexibility.

In [None]:
# Cell 2: Environment configuration and connection setup
import os
from dotenv import load_dotenv
from neo4j import GraphDatabase
import logging
from typing import Optional, Dict, Any, List
import time
import json

# Load environment configuration
load_dotenv()

# Neo4j connection configuration from environment
NEO4J_URI = os.getenv("NEO4J_URI", "bolt://localhost:7687")
NEO4J_USERNAME = os.getenv("NEO4J_USERNAME", "neo4j")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD", "password")
NEO4J_DATABASE = os.getenv("NEO4J_DATABASE", "neo4j")

# Additional connection configuration from environment
NEO4J_MAX_CONNECTION_LIFETIME = int(os.getenv("NEO4J_MAX_CONNECTION_LIFETIME", 1800))
NEO4J_MAX_POOL_SIZE = int(os.getenv("NEO4J_MAX_POOL_SIZE", 50))
NEO4J_ACQUISITION_TIMEOUT = int(os.getenv("NEO4J_ACQUISITION_TIMEOUT", 60))
NEO4J_MAX_RETRY_TIME = int(os.getenv("NEO4J_MAX_RETRY_TIME", 30))
NEO4J_ENCRYPTED = os.getenv("NEO4J_ENCRYPTED", "false").lower() == "true"
NEO4J_TRUST = os.getenv("NEO4J_TRUST", "TRUST_ALL_CERTIFICATES")

# Configure logging for debugging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

print("🔧 ENVIRONMENT CONFIGURATION:")
print("=" * 50)
print(f"Neo4j URI: {NEO4J_URI}")
print(f"Database: {NEO4J_DATABASE}")
print(f"Username: {NEO4J_USERNAME}")
print("Password: ********")  # Fixed-width mask so the password length is not revealed
print(f"Max Connection Lifetime: {NEO4J_MAX_CONNECTION_LIFETIME}s")
print(f"Max Pool Size: {NEO4J_MAX_POOL_SIZE}")
print(f"Acquisition Timeout: {NEO4J_ACQUISITION_TIMEOUT}s")
print(f"Max Retry Time: {NEO4J_MAX_RETRY_TIME}s")
print(f"Encrypted: {NEO4J_ENCRYPTED}")
print(f"Trust: {NEO4J_TRUST}")
print("=" * 50)

# Test basic connection using loaded configuration
try:
    test_driver = GraphDatabase.driver(
        NEO4J_URI, 
        auth=(NEO4J_USERNAME, NEO4J_PASSWORD),
        max_connection_lifetime=NEO4J_MAX_CONNECTION_LIFETIME,
        max_connection_pool_size=NEO4J_MAX_POOL_SIZE,
        connection_acquisition_timeout=NEO4J_ACQUISITION_TIMEOUT,
        encrypted=NEO4J_ENCRYPTED,
        trust=NEO4J_TRUST
    )
    with test_driver.session(database=NEO4J_DATABASE) as session:
        result = session.run("RETURN 'Connection successful' as message")
        message = result.single()["message"]
        print(f"✓ Connection Test: {message}")
    test_driver.close()
except Exception as e:
    print(f"✗ Connection Failed: {e}")
    print("\n🔍 TROUBLESHOOTING STEPS:")
    print("1. Verify Docker container 'neo4j' is running: docker ps")
    print("2. Check container logs: docker logs neo4j")
    print("3. Restart container if needed: docker restart neo4j")
    print("4. Verify port 7687 is not blocked by firewall")
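
The `int(os.getenv(...))` calls in this cell raise `ValueError` if a variable is set to a non-numeric string. One possible hardening is sketched below with two hypothetical helpers, `env_int` and `env_bool`, which are not part of the lab code. The `DEMO_*` variable names are also invented for illustration so that running the example does not disturb the real `NEO4J_*` settings.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer environment variable; fall back to default if unset or invalid."""
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        print(f"⚠ {name}={raw!r} is not an integer; using default {default}")
        return default


def env_bool(name: str, default: bool = False) -> bool:
    """Read a boolean environment variable; accepts true/1/yes/on (case-insensitive)."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes", "on"}


# Demonstration values only (hypothetical variable names)
os.environ["DEMO_POOL_SIZE"] = "not-a-number"
os.environ["DEMO_ENCRYPTED"] = "TRUE"

pool_size = env_int("DEMO_POOL_SIZE", 50)   # invalid value, so the default is used
encrypted = env_bool("DEMO_ENCRYPTED")      # "TRUE" is accepted case-insensitively
print(pool_size, encrypted)
```

Whether to fall back silently, warn, or fail fast on a malformed value is a design choice; failing fast may be preferable in production, where a typo in configuration should not go unnoticed.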

## Cell 3: Enterprise Connection Manager Implementation

Build a production-grade connection manager that configures the driver's built-in connection pool and adds retry logic and health monitoring.
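
The retry-with-exponential-backoff pattern used by the connection manager can be sketched in isolation before reading the full class. The helper `retry_with_backoff` and the `flaky` function below are hypothetical illustrations, not part of the lab code; the `sleep` parameter is injectable so the example runs instantly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == max_attempts:
                raise RuntimeError(f"Failed after {max_attempts} attempts") from exc
            sleep(delay)
            delay *= 2  # Exponential backoff: 2s, 4s, 8s, ...


# Simulated operation that fails twice and then succeeds
calls = {"count": 0}
delays = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure")
    return "ok"


# Record the requested delays instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # prints: ok [2.0, 4.0]
```

This sketch retries on any exception for brevity. In practice it is usually better to retry only transient failures (for example `neo4j.exceptions.ServiceUnavailable`), since retrying a Cypher syntax error only delays the inevitable failure.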

In [None]:
# Cell 3: Production-grade connection manager
from typing import Optional, Dict, Any, List, Callable
import time
import threading
from contextlib import contextmanager

class Neo4jConnectionManager:
    """
    Enterprise-grade Neo4j connection manager with:
    - Connection pooling and retry logic
    - Health monitoring and metrics
    - Thread-safe operations
    - Graceful error handling
    """
    
    def __init__(self, uri: str, username: str, password: str, database: str = "neo4j"):
        self.uri = uri
        self.username = username
        self.password = password
        self.database = database
        self._driver = None
        self._connection_attempts = 0
        self._successful_queries = 0
        self._failed_queries = 0
        self._lock = threading.Lock()
        
        # Load connection configuration from environment or use defaults
        self.config = {
            "max_connection_lifetime": int(os.getenv("NEO4J_MAX_CONNECTION_LIFETIME", 30 * 60)),
            "max_connection_pool_size": int(os.getenv("NEO4J_MAX_POOL_SIZE", 50)),
            "connection_acquisition_timeout": int(os.getenv("NEO4J_ACQUISITION_TIMEOUT", 60)),
            # The driver option is named max_transaction_retry_time; unknown keys are rejected
            "max_transaction_retry_time": int(os.getenv("NEO4J_MAX_RETRY_TIME", 30)),
            "encrypted": os.getenv("NEO4J_ENCRYPTED", "false").lower() == "true",
            "trust": os.getenv("NEO4J_TRUST", "TRUST_ALL_CERTIFICATES")
        }
        
        self._initialize_driver()
    
    def _initialize_driver(self):
        """Initialize Neo4j driver with retry logic"""
        max_attempts = 3
        retry_delay = 2
        
        for attempt in range(1, max_attempts + 1):
            try:
                with self._lock:  # Guard shared counters against concurrent updates
                    self._connection_attempts += 1
                logger.info(f"Initializing driver (attempt {attempt}/{max_attempts})")
                
                self._driver = GraphDatabase.driver(
                    self.uri,
                    auth=(self.username, self.password),
                    **self.config
                )
                
                # Verify connection
                with self._driver.session(database=self.database) as session:
                    session.run("RETURN 1").single()
                
                logger.info("✓ Driver initialized successfully")
                return
                
            except Exception as e:
                logger.error(f"Connection attempt {attempt} failed: {e}")
                if attempt < max_attempts:
                    time.sleep(retry_delay)
                    retry_delay *= 2  # Exponential backoff
                else:
                    raise Exception(f"Failed to connect after {max_attempts} attempts: {e}") from e
    
    @contextmanager
    def get_session(self):
        """Context manager for database sessions"""
        session = None
        try:
            if not self._driver:
                self._initialize_driver()
            
            session = self._driver.session(database=self.database)
            yield session
            
        except Exception as e:
            logger.error(f"Session error: {e}")
            raise
        finally:
            if session:
                session.close()
    
    def execute_query(self, query: str, parameters: Optional[Dict[str, Any]] = None, retry_count: int = 3):
        """Execute query with retry logic and error handling"""
        parameters = parameters or {}
        
        for attempt in range(1, retry_count + 1):
            try:
                with self.get_session() as session:
                    start_time = time.time()
                    result = session.run(query, parameters)
                    records = list(result)  # Consume all records before the session closes
                    execution_time = time.time() - start_time
                    
                    with self._lock:  # Guard shared counters against concurrent updates
                        self._successful_queries += 1
                    logger.debug(f"Query executed successfully in {execution_time:.3f}s")
                    return records
                    
            except Exception as e:
                with self._lock:  # Guard shared counters against concurrent updates
                    self._failed_queries += 1
                logger.error(f"Query attempt {attempt} failed: {e}")
                
                if attempt < retry_count:
                    wait_time = 2 ** attempt  # Exponential backoff
                    logger.info(f"Retrying in {wait_time} seconds...")
                    time.sleep(wait_time)
                else:
                    raise Exception(f"Query failed after {retry_count} attempts: {e}") from e
    
    def execute_write_transaction(self, transaction_function: Callable, **kwargs):
        """Execute write transaction with proper error handling"""
        try:
            with self.get_session() as session:
                return session.execute_write(transaction_function, **kwargs)
        except Exception as e:
            logger.error(f"Write transaction failed: {e}")
            raise
    
    def execute_read_transaction(self, transaction_function: Callable, **kwargs):
        """Execute read transaction with proper error handling"""
        try:
            with self.get_session() as session:
                return session.execute_read(transaction_function, **kwargs)
        except Exception as e:
            logger.error(f"Read transaction failed: {e}")
            raise
    
    def health_check(self) -> Dict[str, Any]:
        """Comprehensive health check with metrics"""
        try:
            start_time = time.time()
            
            with self.get_session() as session:
                # Basic connectivity test
                result = session.run("RETURN datetime() as server_time, 'healthy' as status")
                record = result.single()
                
                # Get database info
                db_info = session.run("""
                    CALL dbms.components() YIELD name, versions, edition
                    RETURN name, versions[0] as version, edition
                """).single()
                
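                # Get basic statistics. Two separate queries are used because every
                # branch of a Cypher UNION must return the same column names.
                stats = (
                    session.run("MATCH (n) RETURN count(n) AS node_count").data()
                    + session.run("MATCH ()-[r]->() RETURN count(r) AS relationship_count").data()
                )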
                
                response_time = time.time() - start_time
                
                return {
                    "status": "healthy",
                    "server_time": str(record["server_time"]),
                    "database": {
                        "name": db_info["name"],
                        "version": db_info["version"],
                        "edition": db_info["edition"]
                    },
                    "statistics": {
                        "nodes": stats[0]["node_count"] if stats else 0,
                        "relationships": stats[1]["relationship_count"] if len(stats) > 1 else 0
                    },
                    "connection_metrics": {
                        "connection_attempts": self._connection_attempts,
                        "successful_queries": self._successful_queries,
                        "failed_queries": self._failed_queries,
                        "response_time_ms": round(response_time * 1000, 2)
                    }
                }
                
        except Exception as e:
            return {
                "status": "unhealthy",
                "error": str(e),
                "connection_metrics": {
                    "connection_attempts": self._connection_attempts,
                    "successful_queries": self._successful_queries,
                    "failed_queries": self._failed_queries
                }
            }
    
    def close(self):
        """Close driver connection"""
        if self._driver:
            self._driver.close()
            logger.info("Driver connection closed")

# Initialize connection manager
print("🚀 INITIALIZING CONNECTION MANAGER:")
print("=" * 50)

try:
    connection_manager = Neo4jConnectionManager(
        uri=NEO4J_URI,
        username=NEO4J_USERNAME,
        password=NEO4J_PASSWORD,
        database=NEO4J_DATABASE
    )
    
    print("✓ Connection manager initialized successfully")
    
    # Perform health check
    health_status = connection_manager.health_check()
    print(f"✓ Health check status: {health_status['status']}")
    
    if health_status['status'] == 'healthy':
        print(f"✓ Database: {health_status['database']['name']} {health_status['database']['version']}")
        print(f"✓ Nodes: {health_status['statistics']['nodes']}")
        print(f"✓ Relationships: {health_status['statistics']['relationships']}")
        print(f"✓ Response time: {health_status['connection_metrics']['response_time_ms']}ms")
    else:
        print(f"✗ Health check failed: {health_status.get('error', 'Unknown error')}")
    
except Exception as e:
    print(f"✗ Failed to initialize connection manager: {e}")
    print("\n🔍 TROUBLESHOOTING:")
    print("1. Ensure Neo4j Docker container is running")
    print("2. Check network connectivity to port 7687")
    print("3. Verify credentials are correct")
    print("4. Check Docker container logs for errors")

print("=" * 50)
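
`execute_write_transaction` and `execute_read_transaction` expect a transaction function: a callable whose first argument is the transaction object, with any further values passed as keyword arguments. A minimal sketch follows. The `Person` label, the `create_person` function, and the `FakeResult` / `FakeTransaction` stand-ins are all hypothetical and exist only so the example runs without a database.

```python
class FakeResult:
    """Stand-in for a driver result; returns one pre-built record."""

    def __init__(self, record):
        self._record = record

    def single(self):
        return self._record


class FakeTransaction:
    """Stand-in for the transaction object the driver passes to the function."""

    def __init__(self):
        self.queries = []

    def run(self, query, **parameters):
        # Record the call and echo the name back, as the real query would
        self.queries.append((query, parameters))
        return FakeResult({"name": parameters["name"]})


def create_person(tx, name):
    """Transaction function: MERGE a Person node and return its name."""
    result = tx.run(
        "MERGE (p:Person {name: $name}) RETURN p.name AS name",
        name=name,
    )
    return result.single()["name"]


fake_tx = FakeTransaction()
created = create_person(fake_tx, name="Alice")
print(created)

# With a live database, the same function would be handed to the manager:
# connection_manager.execute_write_transaction(create_person, name="Alice")
```

Passing values as query parameters (`$name`) rather than formatting them into the Cypher string avoids injection problems and lets the server reuse the query plan.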

## Summary

In this notebook, you've:

1. ✅ Installed and verified all required dependencies
2. ✅ Configured Neo4j connection parameters using environment variables
3. ✅ Implemented an enterprise-grade connection manager with:
   - Connection pooling
   - Retry logic with exponential backoff
   - Health monitoring and metrics
   - Thread-safe operations
4. ✅ Verified database connectivity and measured health-check response time

**Next Steps:** Proceed to `02_pydantic_models_and_validation.ipynb` to implement type-safe data models with Pydantic.