# Maritime Logistics Case Generation Pipeline

This notebook implements a multi-stage case generation system that creates realistic maritime logistics case studies with solutions. The pipeline leverages:

1. **Vector Database**: Qdrant Cloud collections with 1,346 maritime logistics datapoints and domain-specific guidelines
2. **Generative AI**: Gemini models for structured text generation
3. **Knowledge Integration**: Retrieval-augmented generation to incorporate accurate domain knowledge

The system follows the approach described in ADR 004-case-generation-strategy, creating cases that demonstrate practical applications of logistics regulations and requirements.

## 1. System Setup and Configuration

This section initializes connections to the vector database, sets up the Gemini API, and validates access to necessary data collections.

In [66]:
import sys
import os
import json
import time
import uuid
import random
from pathlib import Path
import pandas as pd
from tqdm.notebook import tqdm
from dotenv import load_dotenv
import google.generativeai as genai
from IPython.display import display, Markdown, HTML

# Add parent directory to path
sys.path.append("..")
from utils.qdrant_client import get_qdrant_client, get_embedding

# Load environment variables
load_dotenv()

# Configure Google Generative AI with API key
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Model configuration
LLM_MODEL = "gemini-2.0-flash-exp"  # LLM for text generation
EMBEDDING_MODEL = "text-embedding-004"  # Model for embeddings

# Initialize the LLM model
genai_model = genai.GenerativeModel(LLM_MODEL)

# Set up logging
import logging
logger = logging.getLogger("case_generation")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)

print(f"Models initialized: LLM={LLM_MODEL}, Embeddings={EMBEDDING_MODEL}")

Models initialized: LLM=gemini-2.0-flash-exp, Embeddings=text-embedding-004


In [67]:
# Additional imports to enable interim storage options
import re
import uuid
from qdrant_client.http import models
from datetime import datetime

In [68]:
# Set collection names
DATAPOINTS_COLLECTION = "maritime_logistics_kb"  # Your main collection with 1,346 datapoints
REFERENCES_COLLECTION = "case_generation_references"  # The collection with guidelines & examples

# Test connection
client = get_qdrant_client()
print(f"Connected to Qdrant Cloud. Available collections:")
collections = client.get_collections().collections
for collection in collections:
    count = client.count(collection_name=collection.name).count
    print(f"- {collection.name}: {count} points")

Connected to Qdrant Cloud. Available collections:
- case_generation_references: 46 points
- logistics_datapoints: 1345 points


In [149]:
# Update embedding model format for Google Vertex AI compatibility
print(f"Previous embedding model format: {EMBEDDING_MODEL}")
EMBEDDING_MODEL = f"models/{EMBEDDING_MODEL}" if not EMBEDDING_MODEL.startswith("models/") else EMBEDDING_MODEL
print(f"Updated embedding model format: {EMBEDDING_MODEL}")

# Test that the embedding works with updated format
def test_embedding():
    try:
        test_result = create_embeddings("test query")
        if test_result:
            print("✅ Embedding creation works with updated model format")
            return True
        else:
            print("❌ Embedding created but returned empty result")
            return False
    except Exception as e:
        print(f"❌ Error testing embeddings: {str(e)}")
        return False

# Run the test to verify our configuration works
test_embedding()

Previous embedding model format: text-embedding-004
Updated embedding model format: models/text-embedding-004


2025-03-28 17:08:23,824 - INFO - Created embeddings with models/text-embedding-004 in 0.35s
2025-03-28 17:08:23,824 - INFO - Created embeddings with models/text-embedding-004 in 0.35s
2025-03-28 17:08:23,824 - INFO - Created embeddings with models/text-embedding-004 in 0.35s
2025-03-28 17:08:23,824 - INFO - Created embeddings with models/text-embedding-004 in 0.35s


✅ Embedding creation works with updated model format


True

### 1.1 Checkpoint Storage Implementation

In [128]:
def make_json_serializable(obj):
    """Recursively convert objects to be JSON serializable
    
    This function handles:
    - Dictionaries (recursively processed)
    - Lists and tuples (recursively processed)
    - Custom objects with __dict__ attributes
    - Removes logger objects and callable items
    - Returns primitive types directly
    """
    if obj is None:
        return None
    elif isinstance(obj, (str, int, float, bool)):
        return obj
    elif isinstance(obj, dict):
        return {k: make_json_serializable(v) for k, v in obj.items() 
                if k != "logger" and not callable(v)}
    elif isinstance(obj, (list, tuple)):
        return [make_json_serializable(i) for i in obj]
    elif hasattr(obj, '__dict__'):
        # Convert custom objects to dictionaries
        try:
            return make_json_serializable(obj.__dict__)
        except:
            # If that fails, try string representation
            return str(obj)
    else:
        # Default to string representation for other types
        try:
            return str(obj)
        except:
            return "Non-serializable object"

In [69]:
def initialize_checkpoint(case=None):
    """Initialize a new case with checkpoint metadata"""
    checkpoint_dir = Path("../Data/Checkpoints")
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    
    if case is None:
        case = {}
        
    # Generate a unique ID if not present
    if "case_id" not in case:
        case["case_id"] = str(uuid.uuid4())
    
    # Set up checkpoint information
    case["checkpoint_file"] = str(checkpoint_dir / f"case_{case['case_id']}.json")
    case["checkpoint_history"] = []
    case["last_checkpoint"] = None
    case["creation_timestamp"] = datetime.now().isoformat()
    
    # Initial save
    save_checkpoint(case, "initialized")
    
    return case


In [129]:
def save_checkpoint(case, stage):
    """Save a checkpoint of the case generation process"""
    # Make a copy to avoid modifying the original
    case_copy = case.copy()
    
    # Update checkpoint metadata
    case_copy["last_checkpoint"] = stage
    case_copy["checkpoint_time"] = time.strftime("%Y-%m-%d %H:%M:%S")
    
    # Make JSON serializable
    case_copy = make_json_serializable(case_copy)
    
    # Ensure checkpoint directory exists
    os.makedirs(os.path.dirname(case_copy["checkpoint_file"]), exist_ok=True)
    
    # Save to file
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    
    print(f"✓ Checkpoint saved: {stage}")
    return case

In [71]:
def load_checkpoint(case_id=None):
    """Load a case from a checkpoint file"""
    checkpoint_dir = Path("../Data/Checkpoints")
    
    if case_id:
        # Load specific case
        checkpoint_file = checkpoint_dir / f"case_{case_id}.json"
        if checkpoint_file.exists():
            with open(checkpoint_file, "r", encoding="utf-8") as f:
                return json.load(f)
        else:
            print(f"No checkpoint found for case ID: {case_id}")
            return None
    else:
        # Find most recent checkpoint
        checkpoint_files = list(checkpoint_dir.glob("case_*.json"))
        if not checkpoint_files:
            print("No checkpoints found")
            return None
        
        # Sort by modification time (most recent first)
        checkpoint_files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
        
        # Load most recent checkpoint
        with open(checkpoint_files[0], "r", encoding="utf-8") as f:
            case = json.load(f)
            print(f"Loaded checkpoint from {checkpoint_files[0].name}")
            print(f"Last stage completed: {case.get('last_checkpoint', 'unknown')}")
            return case


In [72]:
def list_checkpoints(limit=10, include_completed=False):
    """List available checkpoints with their status"""
    checkpoint_dir = Path("../Data/Checkpoints")
    checkpoint_files = list(checkpoint_dir.glob("case_*.json"))
    
    if not checkpoint_files:
        print("No checkpoints found")
        return []
    
    # Sort by modification time (most recent first)
    checkpoint_files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
    
    results = []
    for file in checkpoint_files[:limit]:
        try:
            with open(file, "r", encoding="utf-8") as f:
                case = json.load(f)
                
                # Skip completed cases if requested
                if not include_completed and case.get('last_checkpoint') == 'completed':
                    continue
                    
                results.append({
                    "case_id": case.get("case_id", "unknown"),
                    "last_stage": case.get("last_checkpoint", "unknown"),
                    "title": case.get("title", "Untitled Case"),
                    "created": case.get("creation_timestamp", "unknown"),
                    "modified": datetime.fromtimestamp(file.stat().st_mtime).isoformat(),
                    "checkpoint_file": str(file)
                })
        except Exception as e:
            print(f"Error reading {file}: {e}")
    
    # Print summary table
    print(f"Found {len(results)} checkpoints:")
    for i, r in enumerate(results):
        print(f"{i+1}. [{r['last_stage']}] {r['title']} ({r['modified']})")
    
    return results

### 1.2 Logging Solution

In [73]:
import logging
import json
from datetime import datetime
from pathlib import Path

class CaseGenerationLogger:
    """Logger for case generation pipeline with JSON file output"""
    
    def __init__(self, case_id=None, log_dir="../Data/Logs"):
        self.case_id = case_id or str(uuid.uuid4())[:8]
        self.log_dir = Path(log_dir)
        self.log_dir.mkdir(parents=True, exist_ok=True)
        
        # Create log filename with case ID
        self.log_file = self.log_dir / f"case_{self.case_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.jsonl"
        
        # Initialize metrics
        self.start_time = datetime.now()
        self.stage_timings = {}
        self.stage_start_time = None
        self.current_stage = None
        
        # Initialize the log file with header
        self.info("LOGGING_INITIALIZED", "Case generation logging started", {
            "case_id": self.case_id,
            "log_file": str(self.log_file),
            "timestamp": self.start_time.isoformat()
        })
        
        print(f"✓ Logging initialized for case {self.case_id}")
    
    def _log(self, level, event_type, message, data=None):
        """Write a log entry to the log file"""
        log_entry = {
            "timestamp": datetime.now().isoformat(),
            "level": level,
            "case_id": self.case_id,
            "event": event_type,
            "message": message,
            "stage": self.current_stage
        }
        
        # Add additional data if provided
        if data:
            log_entry["data"] = data
        
        # Write to log file
        with open(self.log_file, "a", encoding="utf-8") as f:
            f.write(json.dumps(log_entry) + "\n")
    
    def start_stage(self, stage_name):
        """Start timing a new stage"""
        self.current_stage = stage_name
        self.stage_start_time = datetime.now()
        self.info(f"STAGE_START", f"Starting stage: {stage_name}")
    
    def end_stage(self, success=True, result_summary=None):
        """End timing for the current stage and log results"""
        if not self.current_stage or not self.stage_start_time:
            return
        
        duration = (datetime.now() - self.stage_start_time).total_seconds()
        self.stage_timings[self.current_stage] = duration
        
        log_data = {
            "duration_seconds": duration,
            "success": success
        }
        
        if result_summary:
            log_data["result_summary"] = result_summary
            
        event_type = "STAGE_COMPLETE" if success else "STAGE_FAILED"
        self.info(
            event_type, 
            f"Stage {self.current_stage} completed in {duration:.2f}s",
            log_data
        )
    
    def debug(self, event_type, message, data=None):
        """Log debug information"""
        self._log("DEBUG", event_type, message, data)
    
    def info(self, event_type, message, data=None):
        """Log informational message"""
        self._log("INFO", event_type, message, data)
    
    def warning(self, event_type, message, data=None):
        """Log warning"""
        self._log("WARNING", event_type, message, data)
    
    def error(self, event_type, message, data=None):
        """Log error"""
        self._log("ERROR", event_type, message, data)
    
    def critical(self, event_type, message, data=None):
        """Log critical error"""
        self._log("CRITICAL", event_type, message, data)
    
    def log_llm_request(self, model, prompt_length, temperature=None, max_tokens=None):
        """Log an LLM request being made"""
        self.debug("LLM_REQUEST", f"Request to {model}", {
            "model": model,
            "prompt_length": prompt_length,
            "temperature": temperature,
            "max_tokens": max_tokens
        })
    
    def log_llm_response(self, model, response_length, duration):
        """Log an LLM response received"""
        self.debug("LLM_RESPONSE", f"Response from {model}", {
            "model": model,
            "response_length": response_length,
            "duration_seconds": duration
        })
    
    def log_data_retrieval(self, query, results_count, duration=None):
        """Log data retrieval from vector DB"""
        self.debug("DATA_RETRIEVAL", f"Retrieved {results_count} results", {
            "query": query[:100] + "..." if len(query) > 100 else query,
            "results_count": results_count,
            "duration_seconds": duration
        })
    
    def get_summary(self):
        """Get a summary of the logging activity"""
        if self.stage_timings:
            total_duration = sum(self.stage_timings.values())
            stages = len(self.stage_timings)
        else:
            total_duration = 0
            stages = 0
            
        return {
            "case_id": self.case_id,
            "log_file": str(self.log_file),
            "total_duration": total_duration,
            "stages_completed": stages,
            "stage_timings": self.stage_timings
        }
    
    def finalize(self):
        """Log final summary and completion"""
        total_duration = (datetime.now() - self.start_time).total_seconds()
        
        self.info("GENERATION_COMPLETE", f"Case generation completed in {total_duration:.2f}s", {
            "total_duration_seconds": total_duration,
            "stage_timings": self.stage_timings
        })
        
        return self.get_summary()

### 1.3 Arize Phoenix Tracing Activation

In [90]:
# Arize Phoenix setup with VertexAI instrumentation
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Phoenix configuration
USE_PHOENIX_TRACING = True
PHOENIX_PROJECT_ID = "maritime_logistics_case_generator"
PHOENIX_ENVIRONMENT = "development"  # We'll use this in span attributes

def setup_phoenix_tracing():
    """Set up Arize Phoenix tracing with VertexAI instrumentation"""
    try:
        # Import required packages
        from openinference.instrumentation.vertexai import VertexAIInstrumentor
        from arize.otel import register
        import google.generativeai as genai
        
        # Get API credentials
        SPACE_ID = os.getenv("ARIZE_SPACE_ID") 
        API_KEY = os.getenv("ARIZE_API_KEY")
        
        if not SPACE_ID or not API_KEY:
            print("⚠️ Missing ARIZE_SPACE_ID or ARIZE_API_KEY in .env file")
            return False
            
        # Initialize tracing - removing the environment parameter
        tracer_provider = register(
            space_id=SPACE_ID,
            api_key=API_KEY,
            project_name=PHOENIX_PROJECT_ID  # No environment parameter
        )
        
        # Instrument with VertexAI
        VertexAIInstrumentor().instrument(tracer_provider=tracer_provider)
        
        # Add custom instrumentation for Google Generative AI
        from opentelemetry import trace
        
        # Get a tracer
        tracer = trace.get_tracer("google-generativeai-custom")
        
        # Store the original method
        original_generate_content = genai.GenerativeModel.generate_content
        
        # Create a traced version - we'll include environment in the span attributes
        def traced_generate_content(self, contents, **kwargs):
            with tracer.start_as_current_span(
                "google.generativeai.generate_content",
                attributes={
                    "model": self.model_name,
                    "project_id": PHOENIX_PROJECT_ID,
                    "environment": PHOENIX_ENVIRONMENT,  # Include environment as an attribute
                    "service.name": "google-generativeai"
                }
            ) as span:
                # Add input to span
                span.set_attribute("input.content", str(contents)[:1000])  # Truncate if too long
                
                # Add generation parameters to span if present
                if kwargs.get("generation_config"):
                    config = kwargs["generation_config"]
                    if hasattr(config, "temperature"):
                        span.set_attribute("parameter.temperature", config.temperature)
                    if hasattr(config, "top_p"):
                        span.set_attribute("parameter.top_p", config.top_p)
                    if hasattr(config, "top_k"):
                        span.set_attribute("parameter.top_k", config.top_k)
                
                # Call the original method
                result = original_generate_content(self, contents, **kwargs)
                
                # Add output to span
                if hasattr(result, "text"):
                    span.set_attribute("output.text", result.text[:1000])  # Truncate if too long
                
                return result
        
        # Replace the original method with our traced version
        genai.GenerativeModel.generate_content = traced_generate_content
        
        print("✅ Arize Phoenix tracing initialized")
        print(f"   Project: {PHOENIX_PROJECT_ID}")
        print(f"   Environment: {PHOENIX_ENVIRONMENT} (set as span attribute)")
        
        return True
        
    except ImportError as e:
        print(f"⚠️ Import error: {e}")
        return False
    except Exception as e:
        print(f"❌ Error initializing Phoenix: {e}")
        return False

In [82]:
def skip_arize_setup():
    """Skip Arize Phoenix setup and continue with the notebook"""
    from IPython.display import display, HTML
    
    # Add placeholder global variables so the rest of the notebook works
    global USE_PHOENIX_TRACING, arize_tracing_enabled
    global PHOENIX_PROJECT_ID, PHOENIX_ENVIRONMENT
    
    USE_PHOENIX_TRACING = False
    arize_tracing_enabled = False
    PHOENIX_PROJECT_ID = "not_configured"
    PHOENIX_ENVIRONMENT = "development"
    
    display(HTML("""
    <div style="padding: 15px; background-color: #e8f4f8; border-radius: 5px; margin: 20px 0;">
        <h4 style="margin-top: 0;">✓ Arize Phoenix Setup Skipped</h4>
        <p>You can continue using the notebook without Phoenix tracing.</p>
        <p style="font-size: 0.9em; color: #555;">
            The case generation system will work normally, but you won't see traces in the Phoenix dashboard.
        </p>
    </div>
    """))

In [91]:
# Initialize Phoenix tracing if enabled
arize_tracing_enabled = True
if USE_PHOENIX_TRACING:
    arize_tracing_enabled = setup_phoenix_tracing()

Overriding of current TracerProvider is not allowed
Attempting to instrument while already instrumented


🔭 OpenTelemetry Tracing Details 🔭
|  Arize Project: maritime_logistics_case_generator
|  Span Processor: BatchSpanProcessor
|  Collector Endpoint: otlp.arize.com
|  Transport: gRPC
|  Transport Headers: {'space_id': '****', 'api_key': '****', 'user-agent': '****'}
|  
|  Using a default SpanProcessor. `add_span_processor` will overwrite this default.
|  
|  `register` has set this TracerProvider as the global OpenTelemetry default.
|  To disable this behavior, call `register` with `set_global_tracer_provider=False`.

✅ Arize Phoenix tracing initialized
   Project: maritime_logistics_case_generator
   Environment: development (set as span attribute)


#### Test

In [92]:
# Test Phoenix tracing with a simple generation
import google.generativeai as genai

# Initialize model with our global variable
LLM_MODEL = "gemini-2.0-flash-exp"
genai_model = genai.GenerativeModel(LLM_MODEL)

def test_phoenix_tracing():
    """Test function to verify Phoenix tracing"""
    try:
        # Simple test prompt
        test_prompt = "Write a one-sentence description of maritime logistics."
        
        # Generate with tracing
        response = genai_model.generate_content(
            test_prompt,
            generation_config={"temperature": 0.7}
        )
        
        if response.text:
            print("✅ Generation successful!")
            print("\nGenerated text:")
            print("-" * 50)
            print(response.text)
            print("-" * 50)
            print("\nCheck the Phoenix dashboard to see the trace.")
        else:
            print("❌ No text generated")
            
    except Exception as e:
        print(f"❌ Error during test: {str(e)}")

# Run the test
test_phoenix_tracing()

✅ Generation successful!

Generated text:
--------------------------------------------------
Maritime logistics encompasses the planning, execution, and control of the movement and storage of goods via sea, ensuring efficient and timely delivery across global supply chains.

--------------------------------------------------

Check the Phoenix dashboard to see the trace.


## 2. Multi-Stage Pipeline Implementation

The case generation process follows five distinct stages, each building on the previous to create increasingly refined and accurate content.

### Stage 1: Case Draft Generation

**Process**: Selects a random example case from our reference collection and uses it as inspiration to generate a new, unique case draft focused on container shipping logistics between Asia and Northern Europe/Baltic.

**Expected Output**: Initial case scenario with realistic operational challenges, fictional but plausible entities, and a clear problem to solve.

In [96]:
# Set up logging
import logging
logger = logging.getLogger("case_generation")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)

In [132]:
def select_random_example():
    """Select a random example case from the database"""
    client = get_qdrant_client()
    
    try:
        # Get all examples
        examples, _ = client.scroll(
            collection_name=REFERENCES_COLLECTION,
            scroll_filter=models.Filter(  # Ensure this is scroll_filter, not filter
                must=[
                    models.FieldCondition(
                        key="content_type",
                        match=models.MatchValue(value="example")
                    )
                ]
            ),
            limit=100,
            with_payload=True
        )
        
        if not examples:
            print("⚠️ No example cases found in the database")
            # Return a default example
            return {
                "title": "Default Example",
                "summary": "This is a default example case for maritime logistics training. It involves a vessel carrying containers between ports, with delays and regulatory issues.",
                "filename": "default_example.md"
            }
        
        # Select a random example
        example = random.choice(examples)
        print(f"✅ Selected example: {example.payload.get('title', 'Untitled')}")
        
        return example.payload
        
    except Exception as e:
        print(f"⚠️ Error retrieving examples: {str(e)}")
        print("Using default example instead.")
        # Return a default example as fallback
        return {
            "title": "Default Example",
            "summary": "This is a default example case for maritime logistics training. It involves a vessel carrying containers between ports, with delays and regulatory issues.",
            "filename": "default_example.md"
        }

In [98]:
def generate_with_llm(prompt, temperature=0.7):
    """Generate text with LLM, with Phoenix tracing happening automatically"""
    try:
        # Start measuring time for our own logging
        start_time = time.time()
        
        # Use the model (tracing happens automatically through instrumentation)
        response = genai_model.generate_content(
            prompt,
            generation_config={"temperature": temperature}
        )
        duration = time.time() - start_time
        
        if response.candidates and response.candidates[0].content.parts:
            response_text = response.candidates[0].content.parts[0].text
            logger.info(f"Generated content with {LLM_MODEL}: {len(response_text)} chars in {duration:.2f}s")
            return response_text
        else:
            logger.warning("No response generated")
            return ""
            
    except Exception as e:
        error_msg = str(e)
        logger.error(f"Error generating with LLM {LLM_MODEL}: {error_msg}")
        raise

In [100]:
def generate_case_draft(example_case):
    """Generate initial case draft based on an example case"""
    # Create prompt for case generation
    prompt = f"""
    You are tasked with creating a new case study for maritime logistics training.
    
    I'll provide you with an EXAMPLE CASE for inspiration. Your task is to create a NEW CASE that:
    1. Is in a similar domain but with entirely different details
    2. Focuses on container shipping logistics between Asia and Northern Europe/Baltic
    3. Involves realistic operational challenges
    4. References specific ports, companies, and vessels (use realistic but fictional names)
    5. Presents a clear problem that needs resolution
    
    DO NOT copy the example directly - create something new that tests similar knowledge.
    
    EXAMPLE CASE SUMMARY:
    {example_case.get('summary', 'No summary available')}
    
    NEW CASE (write only the case description, not the solution):
    """
    
    print("Generating initial case draft...")
    case_draft = generate_with_llm(
        prompt,
        temperature=0.7  # Using default temperature, model is set in generate_with_llm
    )
    time.sleep(6)  # Rate limit
    
    return {
        "example_inspiration": example_case.get('title', 'Unknown example'),
        "example_filename": example_case.get('filename', 'unknown.md'),
        "draft_case": case_draft,
        "creation_date": time.strftime("%Y-%m-%d"),
        "stage": "draft",
        "model": LLM_MODEL  # Store the model used for reference
    }

In [101]:
def create_embeddings(text):
    """Create embeddings using the specified embedding model"""
    try:
        # Start measuring time
        start_time = time.time()
        
        # Generate embeddings
        embedding = genai.embed_content(
            model=EMBEDDING_MODEL,
            content=text,
            task_type="retrieval_document"
        )
        duration = time.time() - start_time
        
        logger.info(f"Created embeddings with {EMBEDDING_MODEL} in {duration:.2f}s")
        
        # Return the embedding values
        return embedding["embedding"]
    except Exception as e:
        logger.error(f"Error creating embeddings with {EMBEDDING_MODEL}: {str(e)}")
        raise

#### Test

In [102]:
# Test generation with tracing
test_result = generate_with_llm("Write a short paragraph about maritime logistics.")
print(test_result)

2025-03-28 14:17:44,667 - INFO - Generated content with gemini-2.0-flash-exp: 572 chars in 1.35s
2025-03-28 14:17:44,667 - INFO - Generated content with gemini-2.0-flash-exp: 572 chars in 1.35s
2025-03-28 14:17:44,667 - INFO - Generated content with gemini-2.0-flash-exp: 572 chars in 1.35s
2025-03-28 14:17:44,667 - INFO - Generated content with gemini-2.0-flash-exp: 572 chars in 1.35s


Maritime logistics is the backbone of global trade, encompassing the intricate processes of moving goods across oceans and waterways. It involves a complex network of ports, ships, terminals, and various transportation modes, all coordinated to ensure the efficient and cost-effective delivery of cargo. From managing customs clearance and documentation to optimizing vessel routes and container handling, maritime logistics plays a crucial role in connecting manufacturers, suppliers, and consumers worldwide, facilitating the flow of goods that fuel the global economy.



### Stage 2: Critical Analysis & Query Generation

**Process**: Analyzes the draft case against guidelines, identifies knowledge gaps, and generates search queries to find relevant regulations and information.

**Expected Output**: Structured analysis with specific search queries and keywords for retrieving relevant datapoints.

In [103]:
def analyze_and_generate_queries(case_draft):
    """Analyze the case draft and generate queries for datapoint retrieval"""
    
    # Load general guidelines
    guidelines = get_general_guidelines()
    
    # Create prompt for analysis
    prompt = f"""
    You are a maritime logistics expert analyzing a draft case study.
    
    CASE DRAFT:
    {case_draft['draft_case']}
    
    GUIDELINES FOR CASE QUALITY:
    {guidelines[:2000]}
    
    Your task:
    1. Identify key topics, entities, and regulations mentioned in the case
    2. Generate 5-8 specific search queries that would help find relevant datapoints 
    3. List 3-5 areas where the case could be improved with more specific regulatory details
    
    Format your response as follows:
    
    CASE ANALYSIS:
    [Brief analysis of the current case draft]
    
    KEY TOPICS:
    - Topic 1
    - Topic 2
    [etc.]
    
    SEARCH QUERIES:
    1. [Specific query 1]
    2. [Specific query 2]
    [etc.]
    
    AREAS FOR IMPROVEMENT:
    1. [Area 1 - what regulatory or factual detail is needed]
    2. [Area 2 - what regulatory or factual detail is needed]
    [etc.]
    """
    
    print("Analyzing case draft and generating queries...")
    analysis = generate_with_llm(prompt, temperature=0.2)  # Lower temperature for analysis
    time.sleep(6)  # Rate limit
    
    # Update case information
    case_draft.update({
        "analysis": analysis,
        "stage": "analysis",
        "model": LLM_MODEL  # Store the model used
    })
    
    return case_draft

In [125]:
def get_general_guidelines():
    """Retrieve general case generation guidelines"""
    client = get_qdrant_client()
    
    # Get guidelines
    guidelines, _ = client.scroll(
        collection_name=REFERENCES_COLLECTION,
        scroll_filter=models.Filter(  # FIXED: 'filter' → 'scroll_filter'
            must=[
                models.FieldCondition(
                    key="guideline_type",
                    match=models.MatchValue(value="general")
                )
            ]
        ),
        limit=100,
        with_payload=True
    )
    
    if not guidelines:
        return "No guidelines available"
    
    # Combine all guideline chunks
    combined = ""
    for guideline in guidelines:
        combined += guideline.payload.get('content', '') + "\n\n"
        
    return combined

### Stage 3: Datapoint Retrieval

**Process**: Uses the generated queries and keywords to search the vector database for relevant maritime regulations, requirements, and procedures.

**Expected Output**: Collection of contextually relevant datapoints that can inform case enhancements.

In [105]:
def find_relevant_datapoints(case_draft, max_points=15):
    """Find datapoints relevant to the case based on analysis"""
    # Extract queries from analysis
    analysis = case_draft.get('analysis', '')
    
    # Simple extraction of queries (in production, use more robust parsing)
    queries = []
    in_queries_section = False
    for line in analysis.split('\n'):
        if 'SEARCH QUERIES:' in line:
            in_queries_section = True
            continue
        if in_queries_section and line.strip() and not line.startswith('AREAS FOR'):
            # Strip numbers and punctuation
            query = line.strip()
            for prefix in ['1.', '2.', '3.', '4.', '5.', '6.', '7.', '8.', '9.', '- ']:
                if query.startswith(prefix):
                    query = query[len(prefix):].strip()
            queries.append(query)
        if 'AREAS FOR' in line:
            in_queries_section = False
    
    print(f"Extracted {len(queries)} queries from analysis:")
    for q in queries:
        print(f"- {q}")
        
    # Get relevant datapoints using each query
    all_datapoints = []
    client = get_qdrant_client()
    
    for query in queries:
        query_embedding = get_embedding(query)
        if not query_embedding:
            print(f"Warning: Failed to generate embedding for query '{query}'")
            continue
            
        results = client.search(
            collection_name=DATAPOINTS_COLLECTION,
            query_vector=query_embedding,
            limit=5  # Get top 5 per query
        )
        
        # Add results to our list, avoiding duplicates
        for result in results:
            datapoint_id = result.id
            if not any(dp['id'] == datapoint_id for dp in all_datapoints):
                all_datapoints.append({
                    'id': datapoint_id,
                    'score': result.score,
                    'payload': result.payload,
                    'query': query
                })
    
    # Sort by relevance and limit
    all_datapoints = sorted(all_datapoints, key=lambda x: x['score'], reverse=True)
    all_datapoints = all_datapoints[:max_points]
    
    print(f"Found {len(all_datapoints)} relevant datapoints")
    
    return all_datapoints


In [106]:
def find_relevant_guideline(case_draft):
    """Find the most relevant domain-specific guideline"""
    client = get_qdrant_client()
    
    # Get embedding for the case draft
    case_text = case_draft.get('draft_case', '')
    case_embedding = get_embedding(case_text)
    
    if not case_embedding:
        print("Warning: Failed to generate embedding for case draft")
        return "No relevant guidelines found"
    
    # Search for guidelines (excluding general guidelines)
    guidelines = client.search(
        collection_name=REFERENCES_COLLECTION,
        query_vector=case_embedding,
        filter=models.Filter(
            must=[
                models.FieldCondition(
                    key="content_type",
                    match=models.MatchValue(value="guideline")
                )
            ],
            must_not=[
                models.FieldCondition(
                    key="guideline_type",
                    match=models.MatchValue(value="general")
                )
            ]
        ),
        limit=5
    )
    
    if not guidelines:
        return "No relevant guidelines found"
    
    # Get most relevant guideline content
    top_guideline = guidelines[0].payload
    guideline_type = top_guideline.get('guideline_type', 'unknown')
    content = top_guideline.get('content', 'No content available')
    
    print(f"Found relevant guideline type: {guideline_type}")
    return content

### Stage 4: Contextual Enhancement

**Process**: Identifies the most relevant domain-specific guideline based on case content, then enhances the case with specific regulatory details from retrieved datapoints.

**Expected Output**: An improved case that incorporates specific regulations, requirements, and realistic logistics processes.

In [107]:
def enhance_case_with_context(case_draft, datapoints, guideline):
    """Enhance the case draft with datapoints and guidelines"""
    
    # Prepare datapoints in a readable format
    datapoint_text = ""
    for i, dp in enumerate(datapoints, 1):
        payload = dp['payload']
        datapoint_text += f"DATAPOINT {i}:\n"
        datapoint_text += f"Type: {payload.get('datapoint_type', 'Unknown')}\n"
        datapoint_text += f"Entity: {payload.get('relevant_entity', 'Unknown')}\n"
        datapoint_text += f"Content: {payload.get('content', 'No content')}\n\n"
    
    # Create prompt for enhancement
    prompt = f"""
    You are a maritime logistics expert improving a case study with specific factual and regulatory details.
    
    ORIGINAL CASE DRAFT:
    {case_draft['draft_case']}
    
    ANALYSIS AND SUGGESTIONS:
    {case_draft['analysis']}
    
    RELEVANT DOMAIN GUIDELINE:
    {guideline[:2000]}
    
    RELEVANT DATAPOINTS:
    {datapoint_text[:3000]}
    
    Your task is to enhance the original case draft by:
    1. Adding specific regulatory references from the datapoints
    2. Including more detailed logistics processes mentioned in the guidelines
    3. Making the scenario more realistic and detailed
    4. Ensuring the case aligns with industry best practices
    5. Developing the case narrative with attention to the guidelines
    
    ENHANCED CASE:
    """
    
    print("Enhancing case with contextual information...")
    # Using higher temperature for creative enhancement
    enhanced_case = generate_with_llm(prompt, temperature=0.7)
    time.sleep(6)  # Rate limit
    
    # Update case information
    case_draft.update({
        "enhanced_case": enhanced_case,
        "datapoints_used": [dp['id'] for dp in datapoints],
        "relevant_guideline": guideline[:500] + "...",
        "stage": "enhanced",
        "model": LLM_MODEL  # Store the model used
    })
    
    return case_draft

### Stage 5: Solution Development

**Process**: Generates a comprehensive solution addressing all aspects of the enhanced case, referencing specific regulations and providing clear guidance.

**Expected Output**: Professional consulting-style solution with step-by-step recommendations and regulatory justifications.

In [108]:
def generate_solution(case_draft, datapoints):
    """Generate a solution for the enhanced case"""
    
    # Prepare datapoints in a readable format
    datapoint_text = ""
    for i, dp in enumerate(datapoints, 1):
        payload = dp['payload']
        datapoint_text += f"DATAPOINT {i}:\n"
        datapoint_text += f"Type: {payload.get('datapoint_type', 'Unknown')}\n"
        datapoint_text += f"Entity: {payload.get('relevant_entity', 'Unknown')}\n"
        datapoint_text += f"Content: {payload.get('content', 'No content')}\n\n"
    
    # Create prompt for solution
    prompt = f"""
    You are a maritime logistics expert providing a solution to a case study.
    
    CASE:
    {case_draft.get('enhanced_case', case_draft['draft_case'])}
    
    RELEVANT DATAPOINTS:
    {datapoint_text[:3000]}
    
    Your task is to provide a comprehensive solution that:
    1. Addresses the core issues in the case
    2. References specific regulations and requirements from the datapoints
    3. Provides step-by-step guidance on how to resolve the situation
    4. Recommends best practices for similar situations in the future
    5. Includes any documentation or stakeholder communication needed
    
    Format your response as a professional consulting solution.
    
    SOLUTION:
    """
    
    print("Generating solution for the case...")
    # Medium temperature (0.5) for professional yet creative solution
    solution = generate_with_llm(prompt, temperature=0.5)
    time.sleep(6)  # Rate limit
    
    # Format as final case-solution pair
    final_case = {
        "title": generate_case_title(case_draft),
        "case": case_draft.get('enhanced_case', case_draft['draft_case']),
        "solution": solution,
        "metadata": {
            "creation_date": case_draft['creation_date'],
            "inspiration": case_draft['example_inspiration'],
            "datapoints_used": case_draft.get('datapoints_used', []),
            "process": "Four-stage guided generation with context enhancement",
            "model": LLM_MODEL  # Add model information to metadata
        }
    }
    
    return final_case

In [109]:
def generate_case_title(case_draft):
    """Generate a title for the case"""
    
    # Use the enhanced case if available, otherwise use the draft
    case_text = case_draft.get('enhanced_case', case_draft['draft_case'])
    
    # Create prompt for title generation
    prompt = f"""
    Based on the following case study, create a concise, professional title that:
    1. Captures the core challenge or situation
    2. Mentions the key company or organization
    3. Is under 10 words in length
    
    CASE:
    {case_text[:1000]}
    
    TITLE:
    """
    
    # Generate title with very low temperature for precision
    title = generate_with_llm(prompt, temperature=0.1)
    
    # Clean up the title (remove quotes or extra formatting)
    title = title.strip().strip('"').strip("'")
    
    return title

In [113]:
from qdrant_client.http import models

def get_general_guideline():
    """Retrieve the general case generation guideline using configured embedding model"""
    client = get_qdrant_client()
    
    # Query for general guidelines
    results = client.search(
        collection_name=REFERENCES_COLLECTION,
        query_vector=create_embeddings("general case study guidelines maritime logistics"),
        query_filter=models.Filter(
            must=[
                models.FieldCondition(
                    key="guideline_type",
                    match=models.MatchValue(value="general")
                )
            ]
        ),
        limit=10,
        with_payload=True,
        with_vectors=False  # No need to return vectors
    )
    
    # Combine all general guideline chunks
    guideline_text = ""
    for result in results:
        if "content" in result.payload:
            guideline_text += result.payload["content"] + "\n\n"
    
    return {
        "text": guideline_text,
        "model": EMBEDDING_MODEL,  # Track which embedding model was used
        "chunks_found": len(results)
    }

In [114]:
def analyze_case_draft(case_draft):
    """Analyze the case draft and generate search queries"""
    # Get general guidelines
    guideline = get_general_guideline()
    
    # Create prompt for analysis
    prompt = f"""
    I'll provide you with a DRAFT CASE and GUIDELINES for case creation. Your task is to:
    1. Critically analyze the case against the guidelines
    2. Identify topics and concepts that need to be researched in our datapoints
    3. Generate search queries to find relevant regulations and requirements
    
    GUIDELINES:
    {guideline[:3000]}  # Truncate if too long
    
    DRAFT CASE:
    {case_draft["draft_case"]}
    
    Please respond with:
    
    ## Case Analysis
    [Provide a critical assessment of the draft case against the guidelines]
    
    ## Knowledge Gaps
    [List specific areas where more regulatory/requirement information is needed]
    
    ## Search Queries
    [Provide 10 specific search queries that would help find relevant datapoints in our database]
    
    ## Keywords
    [List 10-15 specific keywords that are most relevant to this case]
    """
    
    print("Analyzing case draft and generating search queries...")
    # Low temperature for analytical task
    analysis = generate_with_llm(prompt, temperature=0.3)
    time.sleep(6)  # Rate limit
    
    # Add analysis to the case
    case_draft["analysis"] = analysis
    case_draft["stage"] = "analyzed"
    case_draft["model"] = LLM_MODEL  # Add model information
    
    return case_draft

In [115]:
def extract_queries_and_keywords(case):
    """Extract search queries and keywords from the analysis"""
    analysis = case.get("analysis", "")
    
    # Try to extract queries
    queries = []
    if "## Search Queries" in analysis:
        queries_section = analysis.split("## Search Queries")[1].split("##")[0]
        for line in queries_section.strip().split("\n"):
            clean_line = line.strip()
            if clean_line and not clean_line.startswith("#"):
                # Remove leading numbers or bullet points
                clean_line = re.sub(r"^\d+\.\s*|\-\s*", "", clean_line)
                if clean_line:
                    queries.append(clean_line)
    
    # Try to extract keywords
    keywords = []
    if "## Keywords" in analysis:
        keywords_section = analysis.split("## Keywords")[1].split("##")[0] if "##" in analysis.split("## Keywords")[1] else analysis.split("## Keywords")[1]
        for line in keywords_section.strip().split("\n"):
            clean_line = line.strip()
            if clean_line and not clean_line.startswith("#"):
                # Remove leading numbers or bullet points
                clean_line = re.sub(r"^\d+\.\s*|\-\s*", "", clean_line)
                if clean_line:
                    for word in clean_line.split(","):
                        word = word.strip()
                        if word:
                            keywords.append(word)
    
    # If no queries or keywords found, generate some default ones
    if not queries:
        print("No search queries found in analysis, using defaults")
        queries = ["maritime logistics", "container shipping", "customs requirements"]
    
    if not keywords:
        print("No keywords found in analysis, using defaults")
        keywords = ["logistics", "shipping", "maritime", "container"]
    
    return queries, keywords


In [116]:
def retrieve_relevant_datapoints(case):
    """Retrieve relevant datapoints using the queries and keywords with configured embedding model"""
    # Extract queries and keywords
    queries, keywords = extract_queries_and_keywords(case)
    print(f"Using {len(queries)} queries and {len(keywords)} keywords to find relevant datapoints")
    print(f"Using embedding model: {EMBEDDING_MODEL}")
    
    client = get_qdrant_client()
    all_datapoints = []
    query_stats = []  # Track performance of each query
    
    # Search using each query
    for query in tqdm(queries, desc="Searching datapoints"):
        query_embedding = create_embeddings(query)  # Use our standard embedding function
        if query_embedding:
            results = client.search(
                collection_name=REFERENCES_COLLECTION,  # Use global constant
                query_vector=query_embedding,
                limit=5,
                with_payload=True,
                with_vectors=False  # Save bandwidth by not retrieving vectors
            )
            
            # Track query performance
            query_stats.append({
                "query": query,
                "results_found": len(results),
                "top_score": results[0].score if results else 0
            })
            
            for result in results:
                # Add score, query info, and embedding model
                result.payload["search_score"] = result.score
                result.payload["matched_query"] = query
                result.payload["embedding_model"] = EMBEDDING_MODEL
                all_datapoints.append(result.payload)
    
    # Deduplicate datapoints
    unique_datapoints = []
    seen_ids = set()
    for datapoint in all_datapoints:
        dp_id = datapoint.get("id", str(datapoint))
        if dp_id not in seen_ids:
            seen_ids.add(dp_id)
            unique_datapoints.append(datapoint)
    
    print(f"Retrieved {len(unique_datapoints)} unique datapoints")
    
    # Add to case with enhanced metadata
    case.update({
        "relevant_datapoints": unique_datapoints,
        "search_queries": queries,
        "keywords": keywords,
        "stage": "datapoints_retrieved",
        "search_metadata": {
            "embedding_model": EMBEDDING_MODEL,
            "total_queries": len(queries),
            "unique_datapoints": len(unique_datapoints),
            "query_stats": query_stats,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
        }
    })
    
    return case

In [117]:
def find_relevant_guideline(case):
    """Find the most relevant domain-specific guideline using configured embedding model"""
    client = get_qdrant_client()
    
    # Create a query from the case draft
    query_text = case.get("draft_case", "")[:5000]  # Limit length
    print(f"Creating embedding using model: {EMBEDDING_MODEL}")
    query_embedding = create_embeddings(query_text)
    
    if not query_embedding:
        print(f"❌ Failed to create embedding for case draft using {EMBEDDING_MODEL}")
        return None
    
    # Search for domain-specific guidelines
    results = client.search(
        collection_name=REFERENCES_COLLECTION,
        query_vector=query_embedding,
        query_filter=models.Filter(
            must=[
                models.FieldCondition(
                    key="content_type", 
                    match=models.MatchValue(value="guideline")
                ),
                models.FieldCondition(
                    key="guideline_type",
                    match=models.MatchAny(any=["bishou", "maritime", "ocean"])
                )
            ]
        ),
        limit=3,
        with_payload=True,
        with_vectors=False  # Optimize by not retrieving vectors
    )
    
    if not results:
        print("❌ No relevant guidelines found")
        return None
    
    # Get the top guideline
    top_guideline = results[0].payload
    guideline_type = top_guideline.get("guideline_type", "unknown")
    print(f"✅ Found relevant guideline: {guideline_type} (score: {results[0].score:.4f})")
    
    # Add all relevant guideline chunks
    guideline_content = ""
    for result in results:
        if "content" in result.payload:
            guideline_content += result.payload["content"] + "\n\n"
    
    return {
        "guideline_type": guideline_type,
        "content": guideline_content,
        "score": results[0].score,
        "embedding_model": EMBEDDING_MODEL,
        "chunks_found": len(results),
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "query_length": len(query_text)
    }

In [118]:
def enhance_case_with_context(case):
    """Enhance the case with datapoints and domain-specific guidelines"""
    # Find relevant guideline
    guideline = find_relevant_guideline(case)
    
    # Prepare datapoints for prompt
    datapoints_text = ""
    for i, dp in enumerate(case.get("relevant_datapoints", [])[:15]):  # Limit to top 15
        datapoints_text += f"\nDATAPOINT {i+1}:\n"
        datapoints_text += f"Title: {dp.get('title', 'Untitled')}\n"
        datapoints_text += f"Content: {dp.get('content', 'No content')}\n"
        if "port_area" in dp:
            datapoints_text += f"Port Area: {dp.get('port_area', 'Unknown')}\n"
        if "relevant_entity" in dp:
            datapoints_text += f"Entity: {dp.get('relevant_entity', 'Unknown')}\n"
    
    # Create prompt
    prompt = f"""
    I'll provide you with a DRAFT CASE, DATAPOINTS, and DOMAIN GUIDELINES. Your task is to:
    1. Enhance the case with specific regulatory details from the datapoints
    2. Ensure it follows domain-specific guidelines 
    3. Make the scenario more realistic and educational
    4. Add a clear title for the case
    
    DRAFT CASE:
    {case.get("draft_case", "")}
    
    RELEVANT DATAPOINTS:
    {datapoints_text}
    
    DOMAIN GUIDELINES:
    {guideline.get("content", "") if guideline else "No specific guidelines available"}
    
    Please respond with:
    
    ## Case Title
    [Provide a concise, descriptive title for this case]
    
    ## Enhanced Case
    [Provide the improved case with specific regulations and requirements from the datapoints]
    """
    
    print(f"Enhancing case with contextual information using {LLM_MODEL}...")
    enhanced_content = generate_with_llm(
        prompt,
        temperature=0.7  # Higher temperature for creative enhancement
    )
    time.sleep(6)  # Rate limit
    
    # Extract title and enhanced case
    title = "Untitled Case"
    enhanced_case = enhanced_content
    
    if "## Case Title" in enhanced_content:
        parts = enhanced_content.split("## Case Title")
        title_section = parts[1].split("##")[0]
        title = title_section.strip()
        
    if "## Enhanced Case" in enhanced_content:
        parts = enhanced_content.split("## Enhanced Case")
        enhanced_case = parts[1].strip()
    
    # Update case with enhanced metadata
    case.update({
        "title": title,
        "enhanced_case": enhanced_case,
        "domain_guideline": guideline.get("guideline_type") if guideline else "none",
        "stage": "enhanced",
        "enhancement_metadata": {
            "model": LLM_MODEL,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "datapoints_used": len(case.get("relevant_datapoints", [])),
            "guideline_type": guideline.get("guideline_type") if guideline else "none",
            "enhancement_version": "2.0"
        }
    })
    
    return case

In [119]:
def develop_solution(case):
    """Develop a solution for the case using configured LLM model"""
    # Create prompt for solution development
    datapoints_text = ""
    for i, dp in enumerate(case.get("relevant_datapoints", [])[:10]):  # Limit to top 10
        datapoints_text += f"\nDATAPOINT {i+1}:\n"
        datapoints_text += f"Title: {dp.get('title', 'Untitled')}\n"
        datapoints_text += f"Content: {dp.get('content', 'No content')}\n"
        if "relevant_entity" in dp:
            datapoints_text += f"Entity: {dp.get('relevant_entity', 'Unknown')}\n"
    
    prompt = f"""
    I'll provide you with an ENHANCED CASE. Your task is to:
    1. Develop a comprehensive solution that addresses all aspects of the case
    2. Reference specific regulations and requirements that apply
    3. Explain the reasoning behind the solution
    4. Structure the solution clearly with steps/recommendations
    
    CASE TITLE:
    {case.get("title", "Untitled Case")}
    
    CASE SCENARIO:
    {case.get("enhanced_case", case.get("draft_case", ""))}
    
    RELEVANT DATAPOINTS:
    {datapoints_text}
    
    Please provide a detailed solution that demonstrates understanding of maritime logistics regulations and requirements.
    Structure your response as follows:
    
    ## Executive Summary
    [Brief overview of the solution]
    
    ## Detailed Solution Steps
    [Step-by-step solution with regulatory references]
    
    ## Recommendations
    [Key recommendations and best practices]
    
    ## Risk Mitigation
    [Potential risks and mitigation strategies]
    """
    
    print(f"Developing solution using {LLM_MODEL}...")
    solution = generate_with_llm(
        prompt,
        temperature=0.4  # Lower temperature for more focused solution
    )
    time.sleep(6)  # Rate limit
    
    # Update case with solution and metadata
    case.update({
        "solution": solution,
        "stage": "completed",
        "solution_metadata": {
            "model": LLM_MODEL,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "datapoints_referenced": len(case.get("relevant_datapoints", [])),
            "solution_version": "2.0",
            "prompt_tokens": len(prompt),  # Approximate token count
            "structure_version": "four_part"  # Track solution structure version
        }
    })
    
    return case

In [133]:
def log_stage_error(case, stage_name, error):
    """Log a stage error and print details"""
    error_msg = str(error)
    print(f"\n❌ ERROR IN {stage_name.upper()}: {error_msg}")
    
    if 'logger' in case:
        case['logger'].error(f"{stage_name.upper()}_ERROR", f"Error in {stage_name}: {error_msg}", {
            "error": error_msg,
            "stage": stage_name
        })
    
    print("\nStack trace:")
    import traceback
    traceback.print_exc()
    
    return error_msg


In [4]:
def initialize_checkpoint():
    """Initialize a new case checkpoint with basic metadata and unique ID"""
    # Generate unique case ID
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    random_suffix = ''.join(random.choices(string.ascii_lowercase + string.digits, k=6))
    case_id = f"case-{timestamp}-{random_suffix}"
    
    # Create initial case structure
    case = {
        "case_id": case_id,
        "creation_time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "last_checkpoint": "initialized",
        "pipeline_version": "2.0",
        "draft_complete": False,
        "analysis_complete": False,
        "datapoints_complete": False,
        "enhancement_complete": False,
        "solution_complete": False
    }
    
    return case

def save_final_case(case):
    """Save the final case to output directory"""
    # Create a sanitized file name for the case
    title = case.get("title", "Untitled_Case")
    sanitized_title = "".join(c if c.isalnum() or c in " _-" else "_" for c in title).replace(" ", "_")
    
    # Get the case ID
    case_id = case["case_id"]
    
    # Create the output directory if it doesn't exist
    output_dir = Path("../Output/Generated_Cases")
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Prepare the case copy for saving (remove logger)
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    
    # Save the file
    filepath = output_dir / f"{sanitized_title}.json"
    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    
    print(f"✅ Final case saved to {filepath}")
    
    return filepath

def log_stage_error(case, stage_name, error):
    """Log an error that occurred during a pipeline stage"""
    error_msg = str(error)
    print(f"❌ Error in {stage_name} stage: {error_msg}")
    
    if 'logger' in case and case['logger']:
        case['logger'].error(f"ERROR_{stage_name.upper()}", f"Error in {stage_name} stage: {error_msg}", {
            "error": error_msg,
            "stage": stage_name,
            "llm_model": LLM_MODEL,
            "embedding_model": EMBEDDING_MODEL
        })
        case['logger'].end_stage(success=False, result_summary={"error": error_msg})

def run_case_generation_pipeline(resume_from=None, save_checkpoints=True, debug=True):
    """Run the full case generation pipeline with checkpoint saving and logging"""
    case = None
    
    # Debug mode indication
    if debug:
        print("🔍 Running pipeline in DEBUG mode")
    
    # Resume from existing checkpoint if requested
    if resume_from:
        case = load_checkpoint(resume_from)
        if not case:
            print(f"Could not load checkpoint {resume_from}, starting new case")
    
    # Initialize new case if not resuming
    if not case:
        print("STAGE 0: INITIALIZATION")
        case = initialize_checkpoint()
        print(f"Created new case with ID: {case['case_id']}")
    
    # Initialize logger
    logger = CaseGenerationLogger(case_id=case['case_id'])
    case['logger'] = logger  # Store reference to logger
    
    # Track the current stage
    current_stage = case.get("last_checkpoint", "initialized")
    
    try:
        # Stage 1: Case Draft Generation
        if current_stage in ["initialized"]:
            logger.start_stage("draft_generation")
            print(f"\nSTAGE 1: CASE DRAFT GENERATION (using {LLM_MODEL})")
            
            try:
                if debug: print("  1.1 Selecting random example case...")
                example_case = select_random_example()
                logger.info("EXAMPLE_SELECTED", f"Selected example case", {
                    "example_length": len(example_case) if example_case else 0
                })
                
                if debug: print("  1.2 Generating case draft...")
                # Log LLM request
                prompt_length = len(example_case.get("summary", ""))[:2000] + 500  # Approximate
                logger.log_llm_request(LLM_MODEL, prompt_length)
                
                # Time the LLM call
                start_time = datetime.now()
                case.update(generate_case_draft(example_case))
                duration = (datetime.now() - start_time).total_seconds()
                
                # Log LLM response
                logger.log_llm_response(
                    LLM_MODEL, 
                    len(case.get("draft_case", "")),
                    duration
                )
                
                display(Markdown("### Initial Case Draft"))
                display(Markdown(case["draft_case"]))
                
                logger.info("DRAFT_GENERATED", "Case draft generated", {
                    "draft_length": len(case.get("draft_case", "")),
                    "model": LLM_MODEL
                })
                
                if save_checkpoints:
                    if debug: print("  1.3 Saving checkpoint...")
                    save_checkpoint(case, "draft")
                    
                logger.end_stage(result_summary={
                    "draft_length": len(case.get("draft_case", "")),
                    "model": LLM_MODEL
                })
            except Exception as e:
                log_stage_error(case, "draft_generation", e)
                if not debug:  # If not in debug mode, re-raise to exit
                    raise
        
        # Stage 2: Critical Analysis & Query Generation
        if current_stage in ["initialized", "draft"]:
            logger.start_stage("analysis")
            print(f"\nSTAGE 2: CRITICAL ANALYSIS & QUERY GENERATION (using {LLM_MODEL})")
            
            try:
                if debug: print("  2.1 Analyzing case draft...")
                # Log LLM request details 
                prompt_length = len(case.get("draft_case", "")) + 1000  # Approximate
                logger.log_llm_request(LLM_MODEL, prompt_length)
                
                # Time the LLM call
                start_time = datetime.now()
                case = analyze_case_draft(case)
                duration = (datetime.now() - start_time).total_seconds()
                
                # Log LLM response
                logger.log_llm_response(
                    LLM_MODEL, 
                    len(case.get("analysis", "")),
                    duration
                )
                
                display(Markdown("### Case Analysis"))
                display(Markdown(case["analysis"]))
                
                if debug: print("  2.2 Extracting queries and keywords...")
                # Extract and log queries/keywords
                queries, keywords = extract_queries_and_keywords(case)
                logger.info("ANALYSIS_COMPLETE", "Case analysis complete", {
                    "analysis_length": len(case.get("analysis", "")),
                    "query_count": len(queries),
                    "keyword_count": len(keywords),
                    "model": LLM_MODEL
                })
                
                if save_checkpoints:
                    if debug: print("  2.3 Saving checkpoint...")
                    save_checkpoint(case, "analyzed")
                    
                logger.end_stage(result_summary={
                    "query_count": len(queries),
                    "keyword_count": len(keywords),
                    "model": LLM_MODEL
                })
            except Exception as e:
                log_stage_error(case, "analysis", e)
                if not debug:  # If not in debug mode, re-raise to exit
                    raise
        
        # Stage 3: Retrieve Relevant Datapoints
        if current_stage in ["initialized", "draft", "analyzed"]:
            logger.start_stage("datapoint_retrieval")
            print(f"\nSTAGE 3: RETRIEVE RELEVANT DATAPOINTS (using {EMBEDDING_MODEL})")
            
            try:
                if debug: print("  3.1 Retrieving datapoints...")
                # Time datapoint retrieval
                start_time = datetime.now()
                case = retrieve_relevant_datapoints(case)
                duration = (datetime.now() - start_time).total_seconds()
                
                datapoint_count = len(case.get("relevant_datapoints", []))
                print(f"Retrieved {datapoint_count} datapoints")
                
                logger.info("DATAPOINTS_RETRIEVED", f"Retrieved {datapoint_count} datapoints", {
                    "datapoint_count": datapoint_count,
                    "retrieval_duration": duration,
                    "embedding_model": EMBEDDING_MODEL
                })
                
                if save_checkpoints:
                    if debug: print("  3.2 Saving checkpoint...")
                    save_checkpoint(case, "datapoints_retrieved")
                    
                logger.end_stage(result_summary={
                    "datapoint_count": datapoint_count,
                    "embedding_model": EMBEDDING_MODEL
                })
            except Exception as e:
                log_stage_error(case, "datapoint_retrieval", e)
                if not debug:  # If not in debug mode, re-raise to exit
                    raise
        
        # Stage 4: Contextual Enhancement
        if current_stage in ["initialized", "draft", "analyzed", "datapoints_retrieved"]:
            logger.start_stage("enhancement")
            print(f"\nSTAGE 4: CONTEXTUAL ENHANCEMENT (using {LLM_MODEL})")
            
            try:
                if debug: print("  4.1 Finding relevant domain-specific guideline...")
                guideline = find_relevant_guideline(case)
                
                if guideline:
                    print(f"✅ Found guideline: {guideline.get('guideline_type', 'unknown')} with {guideline.get('chunks_found', 0)} chunks")
                else:
                    print("⚠️ No relevant guidelines found, using default")
                    guideline = {
                        "guideline_type": "default_maritime",
                        "content": "Maritime logistics cases should demonstrate realistic challenges in container shipping."
                    }
                
                if debug: print("  4.2 Enhancing case with context...")
                # Log LLM request
                datapoints_text = json.dumps(case.get("relevant_datapoints", [])[:15])
                prompt_length = len(case.get("draft_case", "")) + len(datapoints_text)
                logger.log_llm_request(LLM_MODEL, prompt_length)
                
                # Time the enhancement
                start_time = datetime.now()
                case = enhance_case_with_context(case)
                duration = (datetime.now() - start_time).total_seconds()
                
                # Log LLM response
                logger.log_llm_response(
                    LLM_MODEL, 
                    len(case.get("enhanced_case", "")),
                    duration
                )
                
                display(Markdown(f"### Enhanced Case: {case.get('title', 'Untitled Case')}"))
                display(Markdown(case["enhanced_case"]))
                
                logger.info("CASE_ENHANCED", "Case enhanced with context", {
                    "title": case.get("title", "Untitled Case"),
                    "enhanced_length": len(case.get("enhanced_case", "")),
                    "domain_guideline": case.get("domain_guideline", "none"),
                    "model": LLM_MODEL
                })
                
                if save_checkpoints:
                    if debug: print("  4.3 Saving checkpoint...")
                    save_checkpoint(case, "enhanced")
                    
                logger.end_stage(result_summary={
                    "title": case.get("title", "Untitled Case"),
                    "domain_guideline": case.get("domain_guideline", "none"),
                    "model": LLM_MODEL
                })
            except Exception as e:
                log_stage_error(case, "enhancement", e)
                if not debug:  # If not in debug mode, re-raise to exit
                    raise
        
        # Stage 5: Solution Development
        if current_stage in ["initialized", "draft", "analyzed", "datapoints_retrieved", "enhanced"]:
            logger.start_stage("solution")
            print(f"\nSTAGE 5: SOLUTION DEVELOPMENT (using {LLM_MODEL})")
            
            try:
                if debug: print("  5.1 Developing solution...")
                # Log LLM request
                prompt_length = len(case.get("enhanced_case", "")) + 1000  # Approximate
                logger.log_llm_request(LLM_MODEL, prompt_length)
                
                # Time solution development
                start_time = datetime.now()
                case = develop_solution(case)
                duration = (datetime.now() - start_time).total_seconds()
                
                # Log LLM response
                logger.log_llm_response(
                    LLM_MODEL, 
                    len(case.get("solution", "")),
                    duration
                )
                
                display(Markdown("### Solution"))
                display(Markdown(case["solution"]))
                
                logger.info("SOLUTION_DEVELOPED", "Solution developed", {
                    "solution_length": len(case.get("solution", "")),
                    "model": LLM_MODEL
                })
                
                if save_checkpoints:
                    if debug: print("  5.2 Saving checkpoint...")
                    save_checkpoint(case, "completed")
                    
                logger.end_stage(result_summary={
                    "solution_length": len(case.get("solution", "")),
                    "model": LLM_MODEL
                })
            except Exception as e:
                log_stage_error(case, "solution", e)
                if not debug:  # If not in debug mode, re-raise to exit
                    raise
        
        # Finalization - Fixed indentation and structure
        current_stage = "finalization"
        try:
            if debug: print("\nFINALIZING CASE GENERATION")
            # Add generation metadata
            case["generation_metadata"] = {
                "llm_model": LLM_MODEL,
                "embedding_model": EMBEDDING_MODEL,
                "completion_time": time.strftime("%Y-%m-%d %H:%M:%S"),
                "pipeline_version": "2.0"
            }
            
            # Save the final case to output directory
            if debug: print("  6.1 Saving final case...")
            filepath = save_final_case(case)  # Using the proper function name
            case["final_path"] = str(filepath)
            
            # Log completion
            logger.info("CASE_SAVED", f"Case saved to {filepath}", {
                "filepath": str(filepath),
                "llm_model": LLM_MODEL,
                "embedding_model": EMBEDDING_MODEL
            })
            
            if save_checkpoints:
                if debug: print("  6.2 Saving final checkpoint...")
                save_checkpoint(case, "completed")
            
            # Finalize logging
            if debug: print("  6.3 Finalizing logs...")
            log_summary = logger.finalize()
            case["log_summary"] = log_summary
            
            print("\n✨ CASE GENERATION COMPLETE ✨")
            return case, filepath
            
        except Exception as e:
            log_stage_error(case, "finalization", e)
            return case, None
            
    except Exception as e:
        error_msg = str(e)
        print(f"\n❌ PIPELINE FAILURE: {error_msg}")
        
        # Log the error
        if 'logger' in case and case['logger']:
            case['logger'].error("PIPELINE_ERROR", f"Error in pipeline: {error_msg}", {
                "error": error_msg,
                "stage": current_stage,
                "llm_model": LLM_MODEL,
                "embedding_model": EMBEDDING_MODEL
            })
            case['logger'].end_stage(success=False, result_summary={"error": error_msg})
            
        if save_checkpoints and case:
            # Save checkpoint at point of failure
            case["error"] = error_msg
            case["error_metadata"] = {
                "llm_model": LLM_MODEL,
                "embedding_model": EMBEDDING_MODEL,
                "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
                "traceback": traceback.format_exc()
            }
            save_checkpoint(case, f"failed_at_{current_stage}")
            print(f"✓ Failure checkpoint saved: failed_at_{current_stage}")
            
        # Print stack trace in debug mode
        if debug:
            import traceback
            traceback.print_exc()
            
        return case, None

## 3. Pipeline Execution

This section executes the full pipeline and displays the resulting case-solution pair.

### 3.1 Pipeline Debugging

In [136]:
# 1. Initialize case and logging
def initialize_debug():
    """Initialize a new case for debugging"""
    print("🔍 Starting interactive debugging")
    
    # Create a new case
    case_id = str(uuid.uuid4())
    checkpoint_dir = Path("../Data/Checkpoints")
    os.makedirs(checkpoint_dir, exist_ok=True)
    
    case = {
        "case_id": case_id,
        "checkpoint_file": str(checkpoint_dir / f"case_{case_id}.json"),
        "checkpoint_history": [],
        "last_checkpoint": "initialized",  # FIXED: Set this explicitly to "initialized"
        "creation_timestamp": datetime.now().isoformat()
    }
    
    print(f"Created new case with ID: {case['case_id']}")
    print(f"Current stage: {case['last_checkpoint']}")
    
    # Initialize logger
    logger = CaseGenerationLogger(case_id=case['case_id'])
    case['logger'] = logger
    
    # Save initial checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]  # Remove logger before serializing
    
    # Ensure checkpoint directory exists
    os.makedirs(os.path.dirname(case_copy["checkpoint_file"]), exist_ok=True)
    
    # Save to file
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    
    print(f"✓ Checkpoint saved: initialized")
    
    return case


In [143]:
# Run the initialization
debug_case = initialize_debug()

🔍 Starting interactive debugging
Created new case with ID: 33b0c355-cb29-4e8b-af4f-683c248ce179
Current stage: initialized
✓ Logging initialized for case 33b0c355-cb29-4e8b-af4f-683c248ce179
✓ Checkpoint saved: initialized


In [137]:
# 2. Stage 1: Case Draft Generation
def debug_stage_1(case):
    """Run Stage 1: Case Draft Generation"""
    print(f"\nSTAGE 1: CASE DRAFT GENERATION (using {LLM_MODEL})")
    logger = case['logger']
    logger.start_stage("draft_generation")
    
    print("  1.1 Selecting random example case...")
    # Manually specify a simple example case to avoid database issues
    example_case = {
        "title": "Test Example Case",
        "summary": "A container ship experiences delays at Rotterdam port due to documentation issues and must navigate customs regulations.",
        "filename": "test_example.md"
    }
    logger.info("EXAMPLE_SELECTED", f"Selected example case", {
        "example_length": len(example_case["summary"])
    })
    
    print("  1.2 Generating case draft...")
    prompt = f"""
    You are tasked with creating a new case study for maritime logistics training.
    
    I'll provide you with an EXAMPLE CASE for inspiration. Your task is to create a NEW CASE that:
    1. Is in a similar domain but with entirely different details
    2. Focuses on container shipping logistics between Asia and Northern Europe/Baltic
    3. Involves realistic operational challenges
    4. References specific ports, companies, and vessels (use realistic but fictional names)
    5. Presents a clear problem that needs resolution
    
    DO NOT copy the example directly - create something new that tests similar knowledge.
    
    EXAMPLE CASE SUMMARY:
    {example_case["summary"]}
    
    NEW CASE (write only the case description, not the solution):
    """
    
    # Log LLM request
    prompt_length = len(prompt)
    logger.log_llm_request(LLM_MODEL, prompt_length)
    
    # Time the LLM call
    start_time = datetime.now()
    draft_case = generate_with_llm(prompt, temperature=0.7)
    duration = (datetime.now() - start_time).total_seconds()
    
    # Log LLM response
    logger.log_llm_response(
        LLM_MODEL, 
        len(draft_case),
        duration
    )
    
    case.update({
        "draft_case": draft_case,
        "example_inspiration": example_case["title"],
        "example_filename": example_case["filename"],
        "creation_date": time.strftime("%Y-%m-%d"),
        "stage": "draft",
        "last_checkpoint": "draft"  # Update the checkpoint stage
    })
    
    display(Markdown("### Initial Case Draft"))
    display(Markdown(case["draft_case"]))
    
    logger.info("DRAFT_GENERATED", "Case draft generated", {
        "draft_length": len(case.get("draft_case", "")),
        "model": LLM_MODEL
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: draft")
    
    logger.end_stage(result_summary={
        "draft_length": len(case.get("draft_case", "")),
        "model": LLM_MODEL
    })
    
    return case


In [144]:
# Run Stage 1
debug_case = debug_stage_1(debug_case)


STAGE 1: CASE DRAFT GENERATION (using gemini-2.0-flash-exp)
  1.1 Selecting random example case...
  1.2 Generating case draft...


2025-03-28 16:58:17,852 - INFO - Generated content with gemini-2.0-flash-exp: 3419 chars in 4.82s
2025-03-28 16:58:17,852 - INFO - Generated content with gemini-2.0-flash-exp: 3419 chars in 4.82s
2025-03-28 16:58:17,852 - INFO - Generated content with gemini-2.0-flash-exp: 3419 chars in 4.82s
2025-03-28 16:58:17,852 - INFO - Generated content with gemini-2.0-flash-exp: 3419 chars in 4.82s


### Initial Case Draft

## Case Study: The "Baltic Mariner" Bottleneck

**Scenario:**

Oceanic Shipping Solutions (OSS), a medium-sized container shipping company headquartered in Hamburg, Germany, specializes in trade routes between East Asia and the Baltic Sea region. OSS operates a fleet of ten container vessels, including the *Baltic Mariner*, a 6,800 TEU container ship.

The *Baltic Mariner* is currently en route from Shanghai, China, to St. Petersburg, Russia, a key destination for OSS's Baltic operations. Its cargo manifest includes a diverse range of goods, from electronics and textiles to automotive components and consumer goods, destined for various clients in Russia, Finland, and the Baltic states. The vessel's scheduled route includes calls at Busan (South Korea), Singapore, Colombo (Sri Lanka), and Rotterdam (Netherlands) before proceeding to St. Petersburg via the Kiel Canal.

Everything proceeded smoothly until the vessel’s arrival in Rotterdam. While initially scheduled for a 24-hour turnaround for unloading and loading operations, the *Baltic Mariner* encountered an unexpected delay. A new, stringent inspection regime, implemented by Dutch Customs and Port Authority due to recent concerns about undeclared hazardous materials, flagged several containers for mandatory inspection. These containers were identified based on a risk assessment algorithm that considered factors such as the declared cargo type, the shipper's history, and the origin of the goods.

Complicating matters further, the Rotterdam terminal, managed by ECT Delta Terminal, is experiencing peak season congestion due to a surge in import volumes from Asia. This has resulted in limited availability of inspection slots and longer waiting times for container handling.

Adding to the pressure, one of the refrigerated containers flagged for inspection is carrying temperature-sensitive pharmaceuticals destined for a hospital in St. Petersburg. Any significant delay could compromise the integrity of the cargo and lead to substantial financial losses and reputational damage for OSS.

The *Baltic Mariner* is now facing a potential 72-hour delay in Rotterdam. This delay will impact the vessel's arrival time in St. Petersburg, causing ripple effects on downstream logistics operations, including trucking and rail connections for the cargo destined for other Baltic countries. OSS is also contractually obligated to meet specific delivery deadlines with its clients, facing potential penalty clauses for late deliveries.

**The Problem:**

OSS needs to develop a strategy to minimize the impact of the delay in Rotterdam and ensure the timely delivery of its cargo, particularly the temperature-sensitive pharmaceuticals, to St. Petersburg and other Baltic destinations. This strategy must consider the following factors:

*   The limited availability of inspection slots and container handling capacity at the Rotterdam terminal.
*   The potential risks associated with delaying temperature-sensitive cargo.
*   The contractual obligations and potential penalty clauses for late deliveries.
*   The overall impact on OSS's reputation and customer relationships.
*   The cost implications of alternative solutions, such as expediting cargo handling or rerouting shipments.

How should OSS mitigate the impact of the Rotterdam delay and ensure the timely delivery of its cargo while minimizing financial losses and reputational damage?


✓ Checkpoint saved: draft


In [139]:
# 3. Stage 2: Critical Analysis & Query Generation
def debug_stage_2(case):
    """Run Stage 2: Critical Analysis & Query Generation"""
    print(f"\nSTAGE 2: CRITICAL ANALYSIS & QUERY GENERATION (using {LLM_MODEL})")
    logger = case['logger']
    logger.start_stage("analysis")
    
    print("  2.1 Using simplified guidelines...")
    # Hard-code simple guidelines for testing
    guidelines = """
    CASE STUDY GUIDELINES:
    1. Cases must include specific real-world regulations
    2. Include details about documentation requirements
    3. Reference actual port procedures and maritime regulations
    4. Make sure customs clearance processes are accurate
    5. Include realistic timelines for shipping operations
    """
    
    print("  2.2 Analyzing case draft...")
    prompt = f"""
    You are a maritime logistics expert analyzing a draft case study.
    
    CASE DRAFT:
    {case['draft_case']}
    
    GUIDELINES FOR CASE QUALITY:
    {guidelines}
    
    Your task:
    1. Identify key topics, entities, and regulations mentioned in the case
    2. Generate 5-8 specific search queries that would help find relevant datapoints 
    3. List 3-5 areas where the case could be improved with more specific regulatory details
    
    Format your response as follows:
    
    ## Case Analysis
    [Brief analysis of the current case draft]
    
    ## Key Topics
    - Topic 1
    - Topic 2
    [etc.]
    
    ## Search Queries
    1. [Specific query 1]
    2. [Specific query 2]
    [etc.]
    
    ## Areas for Improvement
    1. [Area 1 - what regulatory or factual detail is needed]
    2. [Area 2 - what regulatory or factual detail is needed]
    [etc.]
    """
    
    # Log LLM request
    prompt_length = len(prompt)
    logger.log_llm_request(LLM_MODEL, prompt_length)
    
    # Time the LLM call
    start_time = datetime.now()
    analysis = generate_with_llm(prompt, temperature=0.3)
    duration = (datetime.now() - start_time).total_seconds()
    
    # Log LLM response
    logger.log_llm_response(
        LLM_MODEL, 
        len(analysis),
        duration
    )
    
    case.update({
        "analysis": analysis,
        "stage": "analysis",
        "last_checkpoint": "analyzed",  # Update the checkpoint stage
        "model": LLM_MODEL
    })
    
    display(Markdown("### Case Analysis"))
    display(Markdown(case["analysis"]))
    
    print("  2.3 Extracting queries and keywords...")
    # Simplified extraction for debug purposes
    queries = []
    keywords = []
    
    for line in analysis.split('\n'):
        if line.strip().startswith('1.') and 'Search Queries' in analysis.split('\n')[analysis.split('\n').index(line)-5:analysis.split('\n').index(line)]:
            queries.append(line.strip()[3:])
        if line.strip().startswith('- ') and 'Key Topics' in analysis.split('\n')[analysis.split('\n').index(line)-5:analysis.split('\n').index(line)]:
            keywords.append(line.strip()[2:])
    
    # Ensure we have at least some queries and keywords
    if not queries:
        queries = ["maritime regulations Baltic Sea", "container shipping documentation requirements"]
    if not keywords:
        keywords = ["maritime logistics", "container shipping", "documentation"]
        
    case["search_queries"] = queries
    case["keywords"] = keywords
    
    print(f"  Extracted {len(queries)} queries and {len(keywords)} keywords")
    
    logger.info("ANALYSIS_COMPLETE", "Case analysis complete", {
        "analysis_length": len(case.get("analysis", "")),
        "query_count": len(queries),
        "keyword_count": len(keywords),
        "model": LLM_MODEL
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: analyzed")
    
    logger.end_stage(result_summary={
        "query_count": len(queries),
        "keyword_count": len(keywords),
        "model": LLM_MODEL
    })
    
    return case



In [145]:
# Run Stage 2
debug_case = debug_stage_2(debug_case)


STAGE 2: CRITICAL ANALYSIS & QUERY GENERATION (using gemini-2.0-flash-exp)
  2.1 Using simplified guidelines...
  2.2 Analyzing case draft...


2025-03-28 16:59:43,882 - INFO - Generated content with gemini-2.0-flash-exp: 3722 chars in 4.87s
2025-03-28 16:59:43,882 - INFO - Generated content with gemini-2.0-flash-exp: 3722 chars in 4.87s
2025-03-28 16:59:43,882 - INFO - Generated content with gemini-2.0-flash-exp: 3722 chars in 4.87s
2025-03-28 16:59:43,882 - INFO - Generated content with gemini-2.0-flash-exp: 3722 chars in 4.87s


### Case Analysis

## Case Analysis

The case study presents a realistic scenario of a container ship facing delays due to port congestion and increased customs scrutiny in Rotterdam. It highlights the complexities of maritime logistics, particularly the challenges of managing time-sensitive cargo and adhering to contractual obligations. The core problem revolves around mitigating the impact of the delay while minimizing financial and reputational damage. However, the case could benefit from more specific details regarding regulations, documentation, and port procedures to enhance its realism and analytical depth.

## Key Topics

- Port Congestion (Rotterdam)
- Customs Inspections (Netherlands)
- Temperature-Sensitive Pharmaceuticals
- Container Shipping
- Contractual Obligations (Shipping)
- Risk Assessment Algorithms (Customs)
- Reefer Container Management
- Maritime Logistics
- Kiel Canal Transit

## Search Queries

1. "Rotterdam port congestion peak season statistics"
2. "Dutch Customs inspection procedures containerized cargo"
3. "EU regulations temperature controlled pharmaceuticals transport"
4. "ECT Delta Terminal Rotterdam container handling capacity"
5. "Kiel Canal transit regulations container ships"
6. "INCOTERMS 2020 liability late delivery"
7. "EU customs risk assessment algorithm container cargo"
8. "Rotterdam Port Authority hazardous materials regulations"

## Areas for Improvement

1. **Specific Dutch Customs Regulations:** The case mentions a "new, stringent inspection regime." To improve the case, specify the exact regulation(s) being implemented (e.g., referencing specific articles within the Dutch Customs Act or EU regulations on customs controls). Include details on the documentation required for high-risk cargo and the specific criteria used to flag containers for inspection beyond the general "risk assessment algorithm."

2. **Reefer Container Monitoring and Documentation:** Elaborate on the specific temperature monitoring requirements for pharmaceutical shipments under EU regulations (e.g., GDP - Good Distribution Practice guidelines). Include details on the documentation required to prove temperature integrity (e.g., temperature logs, calibration certificates for monitoring equipment) and the procedures for reporting temperature excursions to relevant authorities. Mention specific standards like EN 12830 for temperature recorders.

3. **Contractual Obligations and Liability:** The case mentions "contractual obligations and potential penalty clauses." Specify the relevant INCOTERM used in the contract between OSS and its clients. This will define the point at which risk and responsibility for the cargo transfer. Detail the typical penalty clauses for late delivery in shipping contracts, including the calculation of demurrage and detention charges.

4. **Port Procedure Specifics:** Include details on the communication protocols between the shipping line (OSS), the terminal operator (ECT Delta), and Dutch Customs. What specific forms or electronic messages are required to request inspection slots, report hazardous materials, or appeal inspection decisions? What are the standard procedures for handling refrigerated containers flagged for inspection, including power supply and temperature monitoring during the inspection process?

5. **Hazardous Material Declaration Details:** Expand on the "undeclared hazardous materials" concern. What specific types of hazardous materials are Dutch Customs particularly concerned about? What are the consequences of misdeclaration or non-declaration of hazardous materials, including fines and potential criminal charges? What specific documentation (e.g., IMO declarations) is required for hazardous materials shipments?


  2.3 Extracting queries and keywords...
  Extracted 2 queries and 3 keywords
✓ Checkpoint saved: analyzed


In [138]:
# 4. Stage 3: Datapoint Retrieval
def debug_stage_3(case):
    """Run Stage 3: Datapoint Retrieval"""
    print(f"\nSTAGE 3: DATAPOINT RETRIEVAL (using {EMBEDDING_MODEL})")
    logger = case['logger']
    logger.start_stage("datapoint_retrieval")
    
    print("  3.1 Creating mock datapoints (avoiding database calls)...")
    
    # Create mock datapoints for debugging
    mock_datapoints = [
        {
            "id": "dp001",
            "title": "Baltic Sea Shipping Regulations",
            "content": "Vessels entering the Baltic Sea must comply with HELCOM regulations for environmental protection. Ships must use low-sulfur fuel (0.1% or less) and follow strict waste management procedures.",
            "datapoint_type": "regulation",
            "relevant_entity": "HELCOM",
            "search_score": 0.92
        },
        {
            "id": "dp002",
            "title": "Container Documentation Requirements",
            "content": "All containers must have a properly completed bill of lading, packing list, commercial invoice, and dangerous goods declaration if applicable. For EU ports, an Entry Summary Declaration (ENS) must be submitted electronically at least 24 hours before loading.",
            "datapoint_type": "requirement",
            "relevant_entity": "EU Customs",
            "search_score": 0.89
        },
        {
            "id": "dp003",
            "title": "Port of Hamburg Container Handling Procedures",
            "content": "The Port of Hamburg requires advance notification for all container vessels at least 48 hours before arrival. Container release requires proper customs clearance and port fees payment. The terminal operates 24/7 with special procedures for oversized or refrigerated containers.",
            "datapoint_type": "procedure",
            "relevant_entity": "Port of Hamburg",
            "search_score": 0.85
        },
        {
            "id": "dp004",
            "title": "Customs Clearance in Northern Europe",
            "content": "Customs clearance in Northern European ports requires digital submission of the Single Administrative Document (SAD), valid certificates of origin, and compliance with EU product safety regulations. AEO-certified companies benefit from expedited procedures.",
            "datapoint_type": "process",
            "relevant_entity": "EU Customs Union",
            "search_score": 0.82
        },
        {
            "id": "dp005",
            "title": "Container Shipping Transit Times Asia-Europe",
            "content": "Standard transit times from major Asian ports to Northern Europe: Shanghai to Hamburg: 30-35 days, Singapore to Rotterdam: 24-28 days, Hong Kong to Gdańsk: 35-40 days. Weather conditions in the North Sea can add 1-3 days delay during winter months.",
            "datapoint_type": "reference",
            "relevant_entity": "Shipping Lines",
            "search_score": 0.79
        }
    ]
    
    # Add to case
    case["relevant_datapoints"] = mock_datapoints
    case["stage"] = "datapoints_retrieved"
    case["last_checkpoint"] = "datapoints_retrieved"
    
    print(f"  Mock datapoints created: {len(mock_datapoints)}")
    
    # Add search metadata
    case["search_metadata"] = {
        "embedding_model": EMBEDDING_MODEL,
        "total_queries": len(case.get("search_queries", [])),
        "unique_datapoints": len(mock_datapoints),
        "query_stats": [
            {"query": q, "results_found": random.randint(3, 8), "top_score": round(0.75 + random.random()*0.2, 2)}
            for q in case.get("search_queries", ["default query"])
        ],
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
    }
    
    logger.info("DATAPOINTS_RETRIEVED", f"Retrieved {len(mock_datapoints)} datapoints", {
        "datapoint_count": len(mock_datapoints),
        "embedding_model": EMBEDDING_MODEL
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: datapoints_retrieved")
    
    logger.end_stage(result_summary={
        "datapoint_count": len(mock_datapoints),
        "embedding_model": EMBEDDING_MODEL
    })
    
    return case



In [147]:
# 5. Stage 3: Real Datapoint Retrieval
def debug_stage_3_real(case):
    """Run Stage 3: Datapoint Retrieval with real database"""
    print(f"\nSTAGE 3: DATAPOINT RETRIEVAL (using {EMBEDDING_MODEL})")
    logger = case['logger']
    logger.start_stage("datapoint_retrieval")
    
    print("  3.1 Retrieving datapoints using search queries...")
    
    # Time datapoint retrieval
    start_time = datetime.now()
    try:
        # Use the actual retrieve_relevant_datapoints function
        case = retrieve_relevant_datapoints(case)
        print("✅ Successfully retrieved datapoints from database")
    except Exception as e:
        print(f"❌ Error retrieving datapoints: {str(e)}")
        print("Falling back to mock datapoints...")
        # Create mock datapoints as fallback
        mock_datapoints = [
            {
                "id": "dp001",
                "title": "Baltic Sea Shipping Regulations",
                "content": "Vessels entering the Baltic Sea must comply with HELCOM regulations...",
                "datapoint_type": "regulation",
                "relevant_entity": "HELCOM",
                "search_score": 0.92
            },
            # Include the other mock datapoints
        ]
        case["relevant_datapoints"] = mock_datapoints
        case["using_mock_data"] = True
    
    duration = (datetime.now() - start_time).total_seconds()
    
    datapoint_count = len(case.get("relevant_datapoints", []))
    print(f"Retrieved {datapoint_count} datapoints")
    
    logger.info("DATAPOINTS_RETRIEVED", f"Retrieved {datapoint_count} datapoints", {
        "datapoint_count": datapoint_count,
        "retrieval_duration": duration,
        "embedding_model": EMBEDDING_MODEL,
        "using_mock_data": case.get("using_mock_data", False)
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: datapoints_retrieved")
    
    logger.end_stage(result_summary={
        "datapoint_count": datapoint_count,
        "embedding_model": EMBEDDING_MODEL,
        "using_mock_data": case.get("using_mock_data", False)
    })
    
    return case


In [150]:
# Run Stage 3 with real data
debug_case = debug_stage_3_real(debug_case)


STAGE 3: DATAPOINT RETRIEVAL (using models/text-embedding-004)
  3.1 Retrieving datapoints using search queries...
No keywords found in analysis, using defaults
Using 8 queries and 4 keywords to find relevant datapoints
Using embedding model: models/text-embedding-004


Searching datapoints:   0%|          | 0/8 [00:00<?, ?it/s]

2025-03-28 17:09:54,788 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
2025-03-28 17:09:54,788 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
2025-03-28 17:09:54,788 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
2025-03-28 17:09:54,788 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
  results = client.search(
2025-03-28 17:09:55,195 - INFO - Created embeddings with models/text-embedding-004 in 0.26s
2025-03-28 17:09:55,195 - INFO - Created embeddings with models/text-embedding-004 in 0.26s
2025-03-28 17:09:55,195 - INFO - Created embeddings with models/text-embedding-004 in 0.26s
2025-03-28 17:09:55,195 - INFO - Created embeddings with models/text-embedding-004 in 0.26s
2025-03-28 17:09:55,606 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
2025-03-28 17:09:55,606 - INFO - Created embeddings with models/text-embedding-004 in 0.37s
2025-03-28 17:09:55,606 - INFO - Created embeddings w

Retrieved 40 unique datapoints
✅ Successfully retrieved datapoints from database
Retrieved 40 datapoints
✓ Checkpoint saved: datapoints_retrieved


In [153]:
# 5. Stage 4: Contextual Enhancement with REAL guidelines
def debug_stage_4(case):
    """Run Stage 4: Contextual Enhancement using real guidelines from Qdrant"""
    print(f"\nSTAGE 4: CONTEXTUAL ENHANCEMENT (using {LLM_MODEL})")
    logger = case['logger']
    logger.start_stage("enhancement")
    
    print("  4.1 Finding relevant domain-specific guideline...")
    try:
        # Use the real guideline retrieval function
        guideline = find_relevant_guideline(case)
        
        if guideline:
            print(f"✅ Found guideline: {guideline.get('guideline_type', 'unknown')} with {guideline.get('chunks_found', 0)} chunks")
        else:
            print("⚠️ No relevant guidelines found, using default")
            guideline = {
                "guideline_type": "default_maritime",
                "content": "Maritime logistics cases should demonstrate realistic challenges in container shipping."
            }
    except Exception as e:
        print(f"❌ Error finding guideline: {str(e)}")
        print("Using default guideline instead")
        guideline = {
            "guideline_type": "default_maritime",
            "content": "Maritime logistics cases should demonstrate realistic challenges in container shipping."
        }
    
    print("  4.2 Preparing datapoints for prompt...")
    # Format datapoints for the prompt
    datapoints_text = ""
    for i, dp in enumerate(case.get("relevant_datapoints", [])[:15]):
        datapoints_text += f"\nDATAPOINT {i+1}:\n"
        datapoints_text += f"Title: {dp.get('title', 'Untitled')}\n"
        datapoints_text += f"Content: {dp.get('content', 'No content')}\n"
        if "relevant_entity" in dp:
            datapoints_text += f"Entity: {dp.get('relevant_entity', 'Unknown')}\n"
    
    print("  4.3 Enhancing case with context...")
    prompt = f"""
    I'll provide you with a DRAFT CASE, DATAPOINTS, and DOMAIN GUIDELINES. Your task is to:
    1. Enhance the case with specific regulatory details from the datapoints
    2. Ensure it follows domain-specific guidelines 
    3. Make the scenario more realistic and educational
    4. Add a clear title for the case
    
    DRAFT CASE:
    {case.get("draft_case", "")}
    
    RELEVANT DATAPOINTS:
    {datapoints_text}
    
    DOMAIN GUIDELINES:
    {guideline.get("content", "No specific guidelines available")}
    
    Please respond with:
    
    ## Case Title
    [Provide a concise, descriptive title for this case]
    
    ## Enhanced Case
    [Provide the improved case with specific regulations and requirements from the datapoints]
    """
    
    # Log LLM request
    prompt_length = len(prompt)
    logger.log_llm_request(LLM_MODEL, prompt_length)
    
    # Time the LLM call
    start_time = datetime.now()
    enhanced_content = generate_with_llm(prompt, temperature=0.7)
    duration = (datetime.now() - start_time).total_seconds()
    
    # Log LLM response
    logger.log_llm_response(
        LLM_MODEL, 
        len(enhanced_content),
        duration
    )
    
    # Extract title and enhanced case
    title = "Untitled Case"
    enhanced_case = enhanced_content
    
    if "## Case Title" in enhanced_content:
        parts = enhanced_content.split("## Case Title")
        title_section = parts[1].split("##")[0]
        title = title_section.strip()
        
    if "## Enhanced Case" in enhanced_content:
        parts = enhanced_content.split("## Enhanced Case")
        enhanced_case = parts[1].strip()
    
    # Update case
    case.update({
        "title": title,
        "enhanced_case": enhanced_case,
        "domain_guideline": guideline.get("guideline_type", "default"),
        "stage": "enhanced",
        "last_checkpoint": "enhanced",
        "enhancement_metadata": {
            "model": LLM_MODEL,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "datapoints_used": len(case.get("relevant_datapoints", [])),
            "guideline_type": guideline.get("guideline_type", "default"),
            "guideline_chunks": guideline.get("chunks_found", 0) if guideline else 0
        }
    })
    
    display(Markdown(f"### Enhanced Case: {case.get('title', 'Untitled Case')}"))
    display(Markdown(case["enhanced_case"]))
    
    logger.info("CASE_ENHANCED", "Case enhanced with context", {
        "title": case.get("title", "Untitled Case"),
        "enhanced_length": len(case.get("enhanced_case", "")),
        "domain_guideline": case.get("domain_guideline", "none"),
        "model": LLM_MODEL
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: enhanced")
    
    logger.end_stage(result_summary={
        "title": case.get("title", "Untitled Case"),
        "domain_guideline": case.get("domain_guideline", "default"),
        "model": LLM_MODEL
    })
    
    return case

In [154]:
# Run Stage 4
debug_case = debug_stage_4(debug_case)


STAGE 4: CONTEXTUAL ENHANCEMENT (using gemini-2.0-flash-exp)
  4.1 Finding relevant domain-specific guideline...
Creating embedding using model: models/text-embedding-004


2025-03-28 17:16:24,815 - INFO - Created embeddings with models/text-embedding-004 in 0.32s
2025-03-28 17:16:24,815 - INFO - Created embeddings with models/text-embedding-004 in 0.32s
2025-03-28 17:16:24,815 - INFO - Created embeddings with models/text-embedding-004 in 0.32s
2025-03-28 17:16:24,815 - INFO - Created embeddings with models/text-embedding-004 in 0.32s
  results = client.search(


✅ Found relevant guideline: maritime (score: 0.6963)
✅ Found guideline: maritime with 3 chunks
  4.2 Preparing datapoints for prompt...
  4.3 Enhancing case with context...


2025-03-28 17:16:32,911 - INFO - Generated content with gemini-2.0-flash-exp: 7149 chars in 7.95s
2025-03-28 17:16:32,911 - INFO - Generated content with gemini-2.0-flash-exp: 7149 chars in 7.95s
2025-03-28 17:16:32,911 - INFO - Generated content with gemini-2.0-flash-exp: 7149 chars in 7.95s
2025-03-28 17:16:32,911 - INFO - Generated content with gemini-2.0-flash-exp: 7149 chars in 7.95s


### Enhanced Case: **The "Baltic Mariner" Crisis: Navigating Rotterdam's Regulatory Bottleneck**

**Scenario:**

Oceanic Shipping Solutions (OSS), a medium-sized container shipping company headquartered in Hamburg, Germany, specializes in trade routes between East Asia and the Baltic Sea region. OSS operates a fleet of ten container vessels, including the *Baltic Mariner*, a 6,800 TEU container ship.

The *Baltic Mariner* is currently en route from Shanghai, China, to St. Petersburg, Russia, a key destination for OSS's Baltic operations. Its cargo manifest includes a diverse range of goods, from electronics and textiles to automotive components and consumer goods, destined for various clients in Russia, Finland, and the Baltic states. The vessel's scheduled route includes calls at Busan (South Korea), Singapore, Colombo (Sri Lanka), and Rotterdam (Netherlands) before proceeding to St. Petersburg via the Kiel Canal. The shipment operates under DAP Incoterms, placing the responsibility for import clearance on OSS.

Everything proceeded smoothly until the vessel’s arrival in Rotterdam. While initially scheduled for a 24-hour turnaround for unloading and loading operations, the *Baltic Mariner* encountered an unexpected delay. A new, stringent inspection regime, implemented by Dutch Customs and the Port of Rotterdam Authority, was in effect. This regime, driven by heightened security concerns and stricter enforcement of EU Regulation No 952/2013 (Union Customs Code - UCC) Article 46, aims to combat the import of undeclared hazardous materials and counterfeit goods. Several containers were flagged for mandatory inspection based on a risk assessment algorithm. This algorithm considered factors such as the declared cargo type, the shipper's history, the origin of the goods (with specific attention to regions flagged for export control violations), and inconsistencies in the 24-hour manifest data submitted prior to arrival.

The containers flagged include:

*   One containing high-value electronics. The Bill of Lading description was deemed too vague ("Electronic Components") and requires a more detailed Commercial Invoice and Packing List according to customs requirements.
*   One refrigerated container with pharmaceuticals.
*   Several containers originating from a region known for intellectual property violations. Dutch Customs requires verification of the authenticity and proper documentation of the goods, increasing inspection time.

Complicating matters further, the Rotterdam terminal, managed by ECT Delta Terminal, is experiencing peak season congestion due to a surge in import volumes from Asia. This has resulted in limited availability of inspection slots and longer waiting times for container handling. The terminal is also experiencing a backlog in processing Entry Summary Declarations (ENS), a pre-loading notification required under EU customs regulations, further delaying the inspection process.

Adding to the pressure, one of the refrigerated containers flagged for inspection is carrying temperature-sensitive pharmaceuticals destined for a hospital in St. Petersburg. The pharmaceuticals require a constant temperature between 2°C and 8°C. Any significant delay could compromise the integrity of the cargo, violating Good Distribution Practice (GDP) guidelines for pharmaceuticals and leading to substantial financial losses, potential health risks, and reputational damage for OSS. The Bill of Lading must accurately reflect the temperature requirements and a temperature monitoring log is essential.

The *Baltic Mariner* is now facing a potential 72-hour delay in Rotterdam. This delay will impact the vessel's arrival time in St. Petersburg, causing ripple effects on downstream logistics operations, including trucking and rail connections for the cargo destined for other Baltic countries. OSS is also contractually obligated to meet specific delivery deadlines with its clients, facing potential penalty clauses for late deliveries. Furthermore, the delay increases the risk of demurrage and detention charges at the Rotterdam terminal.

**The Problem:**

OSS needs to develop a strategy to minimize the impact of the delay in Rotterdam and ensure the timely delivery of its cargo, particularly the temperature-sensitive pharmaceuticals, to St. Petersburg and other Baltic destinations. This strategy must consider the following factors:

*   The limited availability of inspection slots and container handling capacity at the Rotterdam terminal.
*   The potential risks associated with delaying temperature-sensitive cargo, including GDP compliance and potential health risks.
*   The contractual obligations and potential penalty clauses for late deliveries, as well as potential demurrage and detention charges.
*   The overall impact on OSS's reputation and customer relationships.
*   The cost implications of alternative solutions, such as expediting cargo handling, rerouting shipments, or using alternative transportation modes.
*   Compliance with EU customs regulations, including the Union Customs Code (UCC) and ENS requirements. The strategy must consider the need for accurate and complete documentation, including the Bill of Lading, Commercial Invoice, Packing List, and potentially a Certificate of Origin.
*   Adherence to ISPS Code compliance and security documentation requirements, given the heightened security measures in place. Ensure the Ship Security Certificate (ISSC) is valid and the Ship Security Plan (SSP) has been followed.

**Specifically, OSS must address:**

1.  **Expediting the Inspection Process:** How can OSS proactively engage with Dutch Customs and the Port of Rotterdam Authority to expedite the inspection process for the flagged containers? This includes providing all necessary documentation promptly and accurately. This may include pre-arrival submission of documents via the Port Community System (PCS).
2.  **Protecting the Pharmaceuticals:** What measures can OSS take to ensure the integrity of the temperature-sensitive pharmaceuticals during the delay, including potentially arranging for cold storage at the terminal or using a temperature-controlled transportation solution?
3.  **Mitigating Downstream Impacts:** How can OSS communicate effectively with its clients in Russia, Finland, and the Baltic states to manage expectations and minimize disruptions to their supply chains? Can alternative transportation arrangements be made for cargo destined for these other countries?
4.  **Minimizing Costs:** How can OSS optimize its strategy to minimize financial losses, including penalty clauses, demurrage and detention charges, and the cost of alternative solutions?

How should OSS mitigate the impact of the Rotterdam delay and ensure the timely delivery of its cargo while minimizing financial losses and reputational damage? Furthermore, how can OSS improve its import workflows and documentation procedures to prevent similar delays in the future, taking into account Incoterms, data quality issues, and security regulations? Should OSS implement EDI for Bill of Lading exchange to prevent future delays?

✓ Checkpoint saved: enhanced


In [155]:
# 6. Stage 5: Solution Development
def debug_stage_5(case):
    """Run Stage 5: Solution Development"""
    print(f"\nSTAGE 5: SOLUTION DEVELOPMENT (using {LLM_MODEL})")
    logger = case['logger']
    logger.start_stage("solution")
    
    print("  5.1 Preparing solution prompt...")
    # Format datapoints for the solution prompt
    datapoints_text = ""
    for i, dp in enumerate(case.get("relevant_datapoints", [])[:5]):
        datapoints_text += f"\nDATAPOINT {i+1}:\n"
        datapoints_text += f"Title: {dp.get('title', 'Untitled')}\n"
        datapoints_text += f"Content: {dp.get('content', 'No content')}\n"
        if "relevant_entity" in dp:
            datapoints_text += f"Entity: {dp.get('relevant_entity', 'Unknown')}\n"
    
    prompt = f"""
    I'll provide you with an ENHANCED CASE. Your task is to:
    1. Develop a comprehensive solution that addresses all aspects of the case
    2. Reference specific regulations and requirements that apply
    3. Explain the reasoning behind the solution
    4. Structure the solution clearly with steps/recommendations
    
    CASE TITLE:
    {case.get("title", "Untitled Case")}
    
    CASE SCENARIO:
    {case.get("enhanced_case", case.get("draft_case", ""))}
    
    RELEVANT DATAPOINTS:
    {datapoints_text}
    
    Please provide a detailed solution that demonstrates understanding of maritime logistics regulations and requirements.
    Structure your response as follows:
    
    ## Executive Summary
    [Brief overview of the solution]
    
    ## Detailed Solution Steps
    [Step-by-step solution with regulatory references]
    
    ## Recommendations
    [Key recommendations and best practices]
    
    ## Risk Mitigation
    [Potential risks and mitigation strategies]
    """
    
    print("  5.2 Generating solution...")
    # Log LLM request
    prompt_length = len(prompt)
    logger.log_llm_request(LLM_MODEL, prompt_length)
    
    # Time the LLM call
    start_time = datetime.now()
    solution = generate_with_llm(prompt, temperature=0.4)
    duration = (datetime.now() - start_time).total_seconds()
    
    # Log LLM response
    logger.log_llm_response(
        LLM_MODEL, 
        len(solution),
        duration
    )
    
    # Update case
    case.update({
        "solution": solution,
        "stage": "completed",
        "last_checkpoint": "completed",
        "solution_metadata": {
            "model": LLM_MODEL,
            "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
            "datapoints_referenced": len(case.get("relevant_datapoints", [])),
            "solution_version": "2.0"
        }
    })
    
    display(Markdown("### Solution"))
    display(Markdown(case["solution"]))
    
    logger.info("SOLUTION_DEVELOPED", "Solution developed", {
        "solution_length": len(case.get("solution", "")),
        "model": LLM_MODEL
    })
    
    # Save checkpoint
    case_copy = case.copy()
    if "logger" in case_copy:
        del case_copy["logger"]
    with open(case_copy["checkpoint_file"], "w", encoding="utf-8") as f:
        json.dump(case_copy, f, ensure_ascii=False, indent=2)
    print(f"✓ Checkpoint saved: completed")
    
    logger.end_stage(result_summary={
        "solution_length": len(case.get("solution", "")),
        "model": LLM_MODEL
    })
    
    return case



In [156]:
# Run Stage 5
debug_case = debug_stage_5(debug_case)


STAGE 5: SOLUTION DEVELOPMENT (using gemini-2.0-flash-exp)
  5.1 Preparing solution prompt...
  5.2 Generating solution...


2025-03-28 17:17:59,649 - INFO - Generated content with gemini-2.0-flash-exp: 10133 chars in 11.45s
2025-03-28 17:17:59,649 - INFO - Generated content with gemini-2.0-flash-exp: 10133 chars in 11.45s
2025-03-28 17:17:59,649 - INFO - Generated content with gemini-2.0-flash-exp: 10133 chars in 11.45s
2025-03-28 17:17:59,649 - INFO - Generated content with gemini-2.0-flash-exp: 10133 chars in 11.45s


### Solution

## Executive Summary

The "Baltic Mariner" crisis requires a multi-pronged approach focusing on immediate action to expedite inspections, protect temperature-sensitive cargo, manage stakeholder communication, and minimize costs. This involves proactive engagement with Dutch Customs and the Port of Rotterdam Authority, securing appropriate cold storage for pharmaceuticals, transparent communication with clients, and exploring alternative transportation options. Long-term, OSS needs to improve its documentation procedures, enhance data quality, and leverage technology to prevent future delays. This solution addresses the immediate crisis while also laying the groundwork for a more resilient and efficient supply chain.

## Detailed Solution Steps

**Step 1: Immediate Engagement and Expediting Inspections**

*   **Action:** Immediately contact Dutch Customs and the Port of Rotterdam Authority to understand the specific reasons for the flagged containers and the required corrective actions. Designate a dedicated liaison to handle all communication.
*   **Regulatory Reference:** EU Regulation No 952/2013 (Union Customs Code - UCC) Article 46 outlines the basis for customs controls and risk assessment. The Port of Rotterdam Authority operates under Dutch customs regulations, which are aligned with the UCC.
*   **Reasoning:** Proactive communication can demonstrate OSS's commitment to compliance and potentially expedite the inspection process. Understanding the specific concerns allows for targeted responses.
*   **Deliverables:**
    *   Establish direct communication channels with Dutch Customs and the Port of Rotterdam Authority.
    *   Obtain a detailed explanation of the reasons for each flagged container.
    *   Provide all requested documentation immediately (Commercial Invoice, Packing List, Certificate of Origin if applicable, Bill of Lading).
*   **Specific Actions for Flagged Containers:**
    *   **Electronics Container:** Immediately provide a detailed Commercial Invoice and Packing List specifying the exact components, their value, and their intended use. If possible, provide product datasheets.
    *   **Pharmaceuticals Container:** Emphasize the time-sensitive nature of the cargo and the GDP requirements. Provide the temperature monitoring log, Bill of Lading, and any relevant certifications. Request priority inspection due to the critical nature of the cargo.
    *   **IP Violation Region Containers:** Provide evidence of the authenticity of the goods, such as licenses, trademarks, or certificates of origin. Prepare for a thorough inspection and be ready to answer any questions regarding the origin and legitimacy of the goods.
*   **Leverage Port Community System (PCS):** Ensure all documentation is submitted electronically via the Port Community System (PCS) to facilitate faster processing.
*   **ISPS Code Compliance:** Re-verify that all ISPS Code requirements have been met, including ensuring the Ship Security Certificate (ISSC) is valid and the Ship Security Plan (SSP) has been followed.

**Step 2: Protecting the Pharmaceuticals**

*   **Action:** Immediately contact the Rotterdam terminal (ECT Delta Terminal) to arrange for cold storage for the pharmaceuticals container. If cold storage is unavailable, explore alternative temperature-controlled transportation solutions.
*   **Regulatory Reference:** Good Distribution Practice (GDP) guidelines for pharmaceuticals mandate the maintenance of a consistent temperature range (2°C to 8°C in this case). Failure to comply can result in regulatory penalties and product recalls.
*   **Reasoning:** Maintaining the integrity of the pharmaceuticals is paramount. Failure to do so can have severe consequences.
*   **Deliverables:**
    *   Secure cold storage at the Rotterdam terminal or arrange for temperature-controlled transportation.
    *   Continuously monitor the temperature of the container.
    *   Document all temperature readings and actions taken.
*   **Contingency Plan:** If cold storage is unavailable, consider transferring the pharmaceuticals to a temperature-controlled truck and transporting them directly to St. Petersburg via road. This would be a costly option but may be necessary to preserve the integrity of the cargo.

**Step 3: Mitigating Downstream Impacts**

*   **Action:** Communicate proactively and transparently with clients in Russia, Finland, and the Baltic states to inform them of the delay and its potential impact on their deliveries.
*   **Regulatory Reference:** While no specific regulation mandates communication, maintaining good customer relationships is crucial for business continuity and reputation management.
*   **Reasoning:** Managing expectations and providing timely updates can minimize customer dissatisfaction and potential disputes.
*   **Deliverables:**
    *   Prepare a communication template explaining the situation and the expected delay.
    *   Contact each client individually to discuss their specific needs and concerns.
    *   Provide regular updates on the progress of the inspection and the revised delivery schedule.
*   **Alternative Transportation Options:**
    *   **Air Freight:** For time-critical cargo, consider air freighting the goods from Rotterdam to St. Petersburg or other Baltic destinations. This is an expensive option but may be necessary to meet contractual obligations.
    *   **Trucking/Rail:** Explore alternative trucking or rail routes to bypass the congestion in Rotterdam. This may involve transloading the cargo to another vessel or mode of transport.

**Step 4: Minimizing Costs**

*   **Action:** Negotiate with the Rotterdam terminal (ECT Delta Terminal) to minimize demurrage and detention charges. Explore options for expedited container handling.
*   **Regulatory Reference:** Demurrage and detention charges are governed by the terminal's tariff and the terms of the contract between OSS and the terminal operator.
*   **Reasoning:** Minimizing financial losses is a key objective.
*   **Deliverables:**
    *   Negotiate a waiver or reduction of demurrage and detention charges.
    *   Secure expedited container handling services.
*   **Cost-Benefit Analysis:** Conduct a thorough cost-benefit analysis of all alternative solutions, including air freight, trucking, and expedited container handling. Factor in the potential costs of penalty clauses, demurrage and detention charges, and reputational damage.
*   **Insurance Claim:** Assess the possibility of filing an insurance claim to cover the costs associated with the delay.

**Step 5: Improving Import Workflows and Documentation Procedures**

*   **Action:** Conduct a thorough review of OSS's import workflows and documentation procedures to identify areas for improvement.
*   **Regulatory Reference:** EU Regulation No 952/2013 (Union Customs Code - UCC) emphasizes the importance of accurate and complete documentation for customs clearance.
*   **Reasoning:** Preventing future delays requires addressing the root causes of the current problem.
*   **Deliverables:**
    *   Identify and correct any data quality issues in the Bill of Lading and other import documents.
    *   Implement a system for verifying the accuracy and completeness of all import documents before shipment.
    *   Develop a checklist of required documents for each type of cargo and destination.
*   **Specific Improvements:**
    *   **Bill of Lading Accuracy:** Implement a system for verifying the accuracy of the Bill of Lading before it is issued. This should include cross-checking the information against the Commercial Invoice, Packing List, and other relevant documents. Consider implementing EDI for Bill of Lading exchange to improve data accuracy and efficiency.
    *   **Data Quality:** Implement data validation rules to ensure that all required fields are completed accurately and consistently.
    *   **Incoterms:** Ensure that all parties understand their responsibilities under the applicable Incoterms.
    *   **Security Regulations:** Stay up-to-date on the latest security regulations and ensure that all shipments comply with these regulations.
*   **Training:** Provide training to all employees involved in the import process on the importance of accurate and complete documentation.

## Recommendations

*   **Proactive Communication:** Establish and maintain strong relationships with Dutch Customs and the Port of Rotterdam Authority.
*   **Data Quality:** Implement robust data quality controls to ensure the accuracy and completeness of all import documents.
*   **Technology:** Leverage technology, such as EDI and Port Community Systems, to improve efficiency and reduce errors.
*   **Contingency Planning:** Develop contingency plans for dealing with unexpected delays and disruptions.
*   **Training:** Provide ongoing training to employees on import regulations and best practices.

## Risk Mitigation

*   **Risk:** Potential for further delays due to unforeseen circumstances.
    *   **Mitigation:** Maintain close communication with Dutch Customs and the Port of Rotterdam Authority. Have alternative transportation options readily available.
*   **Risk:** Damage to the pharmaceuticals due to temperature fluctuations.
    *   **Mitigation:** Continuously monitor the temperature of the container. Have a backup cold storage solution available.
*   **Risk:** Customer dissatisfaction due to late deliveries.
    *   **Mitigation:** Communicate proactively and transparently with clients. Offer alternative solutions, such as air freight, to mitigate the impact of the delay.
*   **Risk:** Financial losses due to penalty clauses, demurrage and detention charges, and the cost of alternative solutions.
    *   **Mitigation:** Negotiate with the Rotterdam terminal to minimize demurrage and detention charges. Conduct a thorough cost-benefit analysis of all alternative solutions. Explore the possibility of filing an insurance claim.
*   **Risk:** Security breach or non-compliance with ISPS Code.
    *   **Mitigation:** Re-verify ISPS Code compliance. Ensure all security documentation is up-to-date and accurate.


✓ Checkpoint saved: completed


In [157]:
# 7. Final Case Package
def save_final_case(case):
    """Save the final case to output directory"""
    print("\nFINALIZING CASE GENERATION")
    logger = case['logger']
    
    # Add generation metadata
    case["generation_metadata"] = {
        "llm_model": LLM_MODEL,
        "embedding_model": EMBEDDING_MODEL,
        "completion_time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "pipeline_version": "2.0"
    }
    
    # Format as final case-solution pair
    final_case = {
        "title": case.get("title", "Untitled Case"),
        "case": case.get("enhanced_case", case.get("draft_case", "")),
        "solution": case.get("solution", ""),
        "metadata": {
            "creation_date": case.get("creation_date", time.strftime("%Y-%m-%d")),
            "case_id": case.get("case_id", "unknown"),
            "model": LLM_MODEL,
            "generation_pipeline": "Interactive debugging"
        }
    }
    
    # Create output directory
    output_dir = Path("../Output/Generated_Cases")
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Create filename from title
    if "title" in final_case and final_case["title"]:
        # Clean title for filename
        title = re.sub(r'[^\w\s-]', '', final_case["title"]).strip()
        title = re.sub(r'[-\s]+', '_', title)
        filename = f"{title}.json"
    else:
        filename = f"case_{case.get('case_id', 'unknown')}.json"
    
    filepath = output_dir / filename
    
    # Save to file
    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(final_case, f, ensure_ascii=False, indent=2)
    
    print(f"✅ Final case saved to {filepath}")
    
    # Finalize logging
    logger.info("CASE_SAVED", f"Case saved to {filepath}", {
        "filepath": str(filepath),
        "llm_model": LLM_MODEL,
        "embedding_model": EMBEDDING_MODEL
    })
    
    log_summary = logger.finalize()
    
    print("\n✨ CASE GENERATION COMPLETE ✨")
    
    return filepath



In [158]:
# Save the final case
final_filepath = save_final_case(debug_case)


FINALIZING CASE GENERATION
✅ Final case saved to ../Output/Generated_Cases/The_Baltic_Mariner_Crisis_Navigating_Rotterdams_Regulatory_Bottleneck.json

✨ CASE GENERATION COMPLETE ✨


### 3.2 Pipeline Execution

In [2]:
# Run the full pipeline with debugging enabled
case, filepath = run_case_generation_pipeline(debug=True)

🔍 Running pipeline in DEBUG mode
STAGE 0: INITIALIZATION


NameError: name 'initialize_checkpoint' is not defined

### 3.2 Case Visualization

Formatted display of the final case and solution, showing how domain knowledge has been integrated into a realistic scenario.

In [None]:
def display_case(case):
    """Display the case in a formatted way"""
    if not case:
        display(Markdown("### No case available to display"))
        return
    
    # Create a formatted HTML view
    html = f"""
    <div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ddd; border-radius: 8px;">
        <h1 style="text-align: center; margin-bottom: 30px;">{case.get('title', 'Untitled Case')}</h1>
        
        <h2>Case Scenario</h2>
        <div style="background: #f9f9f9; padding: 15px; border-radius: 5px; margin-bottom: 20px;">
            {case.get('enhanced_case', case.get('draft_case', '')).replace('\n', '<br>')}</div>
        
        <h2>Solution</h2>
        <div style="background: #f0f7ff; padding: 15px; border-radius: 5px;">
            {case.get('solution', 'No solution available').replace('\n', '<br>')}</div>
        
        <div style="margin-top: 30px; font-size: 0.9em; color: #777;">
            <p>Created: {case.get('creation_date', 'Unknown date')}</p>
            <p>Based on example: {case.get('example_inspiration', 'Original')}</p>
            <p>Domain guideline: {case.get('domain_guideline', 'None')}</p>
        </div>
    </div>
    """
    
    display(HTML(html))

# Display the completed case
if case:
    display_case(case)
else:
    print("No case was generated")

## 4. Batch Generation (Optional)

For generating multiple cases with different focus areas or regions, this section provides batch processing capabilities.

In [None]:
def generate_case_with_focus(example_case, focus_area, region):
    """Generate a case draft with specific focus area and region"""
    prompt = f"""
    Create a new maritime logistics case study scenario inspired by but not copying the example below.
    
    The case should focus specifically on: {focus_area}
    The region/route context should be: {region}
    
    EXAMPLE CASE FOR INSPIRATION:
    {example_case[:2000]}
    
    IMPORTANT GUIDELINES:
    1. Create a realistic container shipping/logistics scenario with clear problems to solve
    2. Include fictional but plausible company names and stakeholders
    3. Focus on international regulatory compliance and documentation issues
    4. Present specific logistics challenges that require expertise to resolve
    5. The scenario should prompt the reader to consider relevant regulations
    6. Make the case educational while remaining engaging
    7. Include enough specific details to make the case realistic
    8. Focus on the {focus_area} aspects of maritime logistics
    9. Set the scenario in the {region} context
    
    CASE:
    """
    
    print(f"Generating case draft with focus on {focus_area} in {region}...")
    draft_case = generate_with_llm(prompt, model="gemini-1.5-pro-latest")
    time.sleep(6)  # Rate limit
    
    # Create case object
    return {
        "draft_case": draft_case,
        "example_inspiration": example_case[:100] + "...",
        "focus_area": focus_area,
        "region": region,
        "creation_date": time.strftime("%Y-%m-%d"),
        "stage": "draft"
    }


In [3]:
def batch_generate_cases(num_cases=3, focus_areas=None, regions=None, max_retries=2):
    """
    Generate multiple cases in batch with different focus areas or regions
    
    Parameters:
    -----------
    num_cases : int
        Number of cases to generate
    focus_areas : list
        List of focus areas to distribute across cases (e.g., ['customs', 'documentation', 'compliance'])
    regions : list
        List of regions to focus on (e.g., ['Baltic', 'North Sea', 'Asia-Europe'])
    max_retries : int
        Maximum number of retries per case if generation fails
        
    Returns:
    --------
    dict
        Summary of generation results with paths to saved cases
    """
        # Create batch record
    batch_id = str(uuid.uuid4())
    batch_dir = Path("../Data/Checkpoints/batches")
    batch_dir.mkdir(parents=True, exist_ok=True)
    
    batch_record = {
        "batch_id": batch_id,
        "timestamp": datetime.now().isoformat(),
        "requested_cases": num_cases,
        "focus_areas": focus_areas,
        "regions": regions,
        "case_records": [],
        "completed_cases": []
    }
    
    # Save initial batch record
    with open(batch_dir / f"batch_{batch_id}.json", "w") as f:
        json.dump(batch_record, f, indent=2)
    
    # Default focus areas if none provided
    if not focus_areas:
        focus_areas = [
            "customs documentation", 
            "container compliance", 
            "port operations", 
            "multimodal transfer", 
            "regulatory requirements"
        ]
    
    # Default regions if none provided
    if not regions:
        regions = [
            "Baltic Sea ports", 
            "North European gateways", 
            "China-Europe route", 
            "Southeast Asia logistics", 
            "Nordic regional distribution"
        ]
    
    # Ensure we have enough combinations for requested cases
    if num_cases > len(focus_areas) * len(regions):
        print(f"Warning: Requested {num_cases} cases but only {len(focus_areas) * len(regions)} unique combinations. Some combinations will repeat.")
    
    print(f"Starting batch generation of {num_cases} cases...")
    generated_cases = []
    failed_attempts = 0
    results = {
        "success": [],
        "failed": [],
        "total_requested": num_cases,
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S")
    }
    
    # Create progress bar
    progress_bar = tqdm(total=num_cases, desc="Generating cases")
    
    # Try to generate the requested number of cases
    case_index = 0
    while len(generated_cases) < num_cases and failed_attempts <= num_cases * max_retries:
        # Select focus area and region for this case
        focus_area = focus_areas[case_index % len(focus_areas)]
        region = regions[(case_index // len(focus_areas)) % len(regions)]
        
        print(f"\n\n{'='*80}\nGenerating case {len(generated_cases)+1}/{num_cases}")
        print(f"Focus Area: {focus_area} | Region: {region}\n{'='*80}\n")
        
        try:
            # Select random example
            example_case = select_random_example()
            
            # Generate case draft with focus area and region guidance
            case = generate_case_with_focus(example_case, focus_area, region)
            
            # Continue with the rest of the pipeline
            case = analyze_case_draft(case)
            case = retrieve_relevant_datapoints(case)
            case = enhance_case_with_context(case)
            case = develop_solution(case)
            
            # Add metadata about the batch generation
            case["batch_metadata"] = {
                "focus_area": focus_area,
                "region": region,
                "batch_index": len(generated_cases),
                "generation_time": time.strftime("%Y-%m-%d %H:%M:%S")
            }
            
            # Save the case
            filepath = save_final_case(case)
            
            # Add to results
            generated_cases.append(case)
            results["success"].append({
                "title": case.get("title", "Untitled"),
                "filepath": str(filepath),
                "focus_area": focus_area,
                "region": region
            })
            
            # Update progress
            progress_bar.update(1)
            
        except Exception as e:
            failed_attempts += 1
            print(f"Error generating case: {e}")
            import traceback
            traceback.print_exc()
            
            results["failed"].append({
                "focus_area": focus_area,
                "region": region,
                "error": str(e)
            })
            
            # Wait before retrying
            time.sleep(10)
        
        # Move to next case configuration
        case_index += 1
    
    progress_bar.close()
    
    # Add summary to results
    results["total_generated"] = len(generated_cases)
    results["success_rate"] = len(generated_cases) / num_cases * 100 if num_cases > 0 else 0
    
    # Save summary report
    report_file = Path("../Data/GeneratedCases/batch_report.json")
    with open(report_file, "w") as f:
        json.dump(results, f, indent=2)
    
    print(f"\nBatch generation complete!")
    print(f"Successfully generated: {len(results['success'])}/{num_cases} cases")
    print(f"Failed attempts: {len(results['failed'])}")
    print(f"Report saved to: {report_file}")
    
    return results


In [None]:
# Example of how to use the batch generator
def run_batch_generation_demo():
    """Run a small batch generation demo"""
    # Define focus areas and regions for diverse case generation
    focus_areas = [
        "customs documentation requirements",
        "container loading compliance",
        "hazardous materials handling",
        "import/export regulations",
        "port security procedures"
    ]
    
    regions = [
        "Baltic Sea to East Asian ports",
        "Northern Europe hub distribution",
        "China-Hamburg express route",
        "Scandinavian multimodal network"
    ]
    
    # Generate a small batch (2 cases) for demonstration
    results = batch_generate_cases(
        num_cases=2,  
        focus_areas=focus_areas,
        regions=regions
    )
    
    # Show summary of generated cases
    if results["success"]:
        print("\nGenerated Case Summaries:")
        for i, case_info in enumerate(results["success"]):
            print(f"\nCase {i+1}: {case_info['title']}")
            print(f"Focus: {case_info['focus_area']}")
            print(f"Region: {case_info['region']}")
            print(f"Saved to: {case_info['filepath']}")
    
    return results

In [None]:
# Run the batch generation demo
# Uncomment to execute:
# batch_results = run_batch_generation_demo()

### 4.1 Display Batch Results

In [None]:
def display_batch_results(results):
    """Display batch generation results in a formatted way"""
    if not results or "success" not in results:
        display(Markdown("### No batch results available to display"))
        return
    
    # Create summary table
    html = f"""
    <div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ddd; border-radius: 8px;">
        <h1 style="text-align: center; margin-bottom: 20px;">Batch Generation Results</h1>
        
        <div style="margin-bottom: 20px;">
            <h3>Summary</h3>
            <table style="width: 100%; border-collapse: collapse;">
                <tr style="background-color: #f2f2f2;">
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Total Requested</th>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{results.get('total_requested', 'N/A')}</td>
                </tr>
                <tr>
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Successfully Generated</th>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{len(results.get('success', []))}</td>
                </tr>
                <tr style="background-color: #f2f2f2;">
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Failed Attempts</th>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{len(results.get('failed', []))}</td>
                </tr>
                <tr>
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Success Rate</th>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{results.get('success_rate', 0):.1f}%</td>
                </tr>
                <tr style="background-color: #f2f2f2;">
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Timestamp</th>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{results.get('timestamp', 'N/A')}</td>
                </tr>
            </table>
        </div>
        
        <div style="margin-bottom: 20px;">
            <h3>Generated Cases</h3>
            <table style="width: 100%; border-collapse: collapse;">
                <tr style="background-color: #f2f2f2;">
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">#</th>
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Title</th>
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Focus Area</th>
                    <th style="padding: 8px; text-align: left; border: 1px solid #ddd;">Region</th>
                </tr>
    """
    
    # Add row for each successful case
    for i, case in enumerate(results.get('success', [])):
        bg_color = "#f2f2f2" if i % 2 == 0 else "white"
        html += f"""
                <tr style="background-color: {bg_color};">
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{i+1}</td>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{case.get('title', 'Untitled')}</td>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{case.get('focus_area', 'N/A')}</td>
                    <td style="padding: 8px; text-align: left; border: 1px solid #ddd;">{case.get('region', 'N/A')}</td>
                </tr>
        """
    
    html += """
            </table>
        </div>
    </div>
    """
    
    display(HTML(html))



In [None]:
# Display batch results when available:
# if 'batch_results' in globals():
#     display_batch_results(batch_results)

## 5. Checkpoint Management

This section provides tools for managing case generation checkpoints. The system saves interim results after each pipeline stage, enabling:

- **Recovery from failures**: Resume generation from the last successful stage
- **Process inspection**: Examine intermediate outputs for debugging
- **Generation pausing**: Split the generation process across multiple sessions

Each case is stored as a single JSON file with a unique ID, tracking the history of all processing stages.

In [36]:
def display_checkpoint_manager():
    """Display an interface for managing checkpoints"""
    # List available checkpoints
    checkpoints = list_checkpoints(include_completed=True)
    
    if not checkpoints:
        display(HTML("<p style='color:red'>No checkpoints available</p>"))
        return
    
    # Create HTML table for checkpoints
    html = """
    <style>
        .checkpoint-table {width:100%; border-collapse:collapse; margin:20px 0}
        .checkpoint-table th, .checkpoint-table td {padding:8px; border:1px solid #ddd}
        .checkpoint-table tr:nth-child(even) {background-color:#f2f2f2}
        .checkpoint-table th {background-color:#4CAF50; color:white; text-align:left}
    </style>
    <h3>Available Checkpoints</h3>
    <table class="checkpoint-table">
        <tr>
            <th>#</th>
            <th>Case ID</th>
            <th>Title</th>
            <th>Stage</th>
            <th>Modified</th>
        </tr>
    """
    
    for i, cp in enumerate(checkpoints):
        html += f"""
        <tr>
            <td>{i+1}</td>
            <td>{cp['case_id'][:8]}...</td>
            <td>{cp['title']}</td>
            <td>{cp['last_stage']}</td>
            <td>{cp['modified']}</td>
        </tr>
        """
    
    html += "</table>"
    
    # Display instructions for resuming
    html += """
    <p>To resume a checkpoint, use:</p>
    <pre>case, filepath = run_case_generation_pipeline(resume_from="CASE_ID")</pre>
    <p>Where CASE_ID is the ID from the table above.</p>
    """
    
    display(HTML(html))

In [None]:
# Run the checkpoint manager
display_checkpoint_manager()

# Examples:
# List recent checkpoints
# list_checkpoints()

# Resume a specific checkpoint
# case, filepath = run_case_generation_pipeline(resume_from="the-case-id-here")

# Start new case with checkpointing
# case, filepath = run_case_generation_pipeline()

## 6. Log Management & Analysis

This section provides tools for managing, analyzing, and visualizing logs from the case generation process. The logging system captures detailed information about each stage of generation, including:

- **Timing metrics**: How long each stage takes
- **LLM interactions**: Requests and responses to language models
- **Data retrievals**: Vector searches and results
- **Errors and warnings**: Issues encountered during generation
- **Overall performance**: Success rates and bottlenecks

These tools help identify performance issues, track errors, and analyze trends across multiple case generations.

In [39]:
def display_logs(case_id=None, limit=100):
    """Display logs for a specific case or most recent logs"""
    log_dir = Path("../Data/Logs")
    
    if not log_dir.exists():
        display(HTML("<p style='color:red'>No logs directory found</p>"))
        return
    
    # Find log files
    if case_id:
        log_files = list(log_dir.glob(f"case_{case_id}_*.jsonl"))
    else:
        log_files = list(log_dir.glob("case_*.jsonl"))
    
    if not log_files:
        display(HTML("<p style='color:red'>No log files found</p>"))
        return
    
    # Sort by modification time (most recent first)
    log_files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
    
    # Select most recent log file
    log_file = log_files[0]
    
    # Read and parse log entries
    log_entries = []
    try:
        with open(log_file, 'r') as f:
            for line in f:
                try:
                    entry = json.loads(line.strip())
                    log_entries.append(entry)
                except json.JSONDecodeError:
                    continue
    except Exception as e:
        display(HTML(f"<p style='color:red'>Error reading log file: {e}</p>"))
        return
    
    # Limit number of entries
    log# filepath: /Users/max/Documents/Code/magdeburg25/Notebooks/04_Case_Generation.ipynb


In [41]:

def load_log_file(log_path):
    """Load and parse a JSONL log file"""
    logs = []
    with open(log_path, 'r') as f:
        for line in f:
            try:
                logs.append(json.loads(line))
            except json.JSONDecodeError:
                print(f"Warning: Could not parse log line: {line[:50]}...")
    return logs

def find_log_files(case_id=None):
    """Find log files, optionally filtered by case_id"""
    log_dir = Path("../Data/Logs")
    if not log_dir.exists():
        return []
    
    if case_id:
        pattern = f"case_{case_id}_*.jsonl"
    else:
        pattern = "case_*.jsonl"
        
    return sorted(log_dir.glob(pattern), key=lambda p: p.stat().st_mtime, reverse=True)



In [42]:
def analyze_log(log_path):
    """Analyze a log file and extract key metrics"""
    logs = load_log_file(log_path)
    if not logs:
        return {"error": "No logs found or could not parse log file"}
    
    # Extract case ID
    case_id = logs[0].get('case_id', 'unknown')
    
    # Find all stages
    stages = []
    stage_timings = {}
    current_stage = None
    stage_start_time = None
    
    errors = []
    llm_calls = []
    data_retrievals = []
    
    for entry in logs:
        if entry.get('event') == 'STAGE_START':
            current_stage = entry.get('stage')
            stage_start_time = datetime.fromisoformat(entry.get('timestamp'))
            stages.append(current_stage)
        
        elif entry.get('event') in ['STAGE_COMPLETE', 'STAGE_FAILED']:
            if current_stage and 'data' in entry and 'duration_seconds' in entry['data']:
                stage_timings[current_stage] = entry['data']['duration_seconds']
                
        # Track errors
        if entry.get('level') in ['ERROR', 'CRITICAL']:
            errors.append(entry)
            
        # Track LLM calls
        if entry.get('event') == 'LLM_REQUEST':
            llm_calls.append(entry)
            
        # Track data retrievals
        if entry.get('event') == 'DATA_RETRIEVAL':
            data_retrievals.append(entry)
    
    # Calculate overall timing if available
    start_log = next((l for l in logs if l.get('event') == 'LOGGING_INITIALIZED'), None)
    end_log = next((l for l in reversed(logs) if l.get('event') == 'GENERATION_COMPLETE'), None)
    
    total_duration = None
    if start_log and end_log:
        start_time = datetime.fromisoformat(start_log.get('timestamp'))
        end_time = datetime.fromisoformat(end_log.get('timestamp'))
        total_duration = (end_time - start_time).total_seconds()
    
    return {
        "case_id": case_id,
        "log_file": str(log_path),
        "log_entries": len(logs),
        "stages": stages,
        "stage_timings": stage_timings,
        "total_duration": total_duration,
        "errors": len(errors),
        "llm_calls": len(llm_calls),
        "data_retrievals": len(data_retrievals),
        "success": any(l.get('event') == 'GENERATION_COMPLETE' for l in logs),
    }




In [43]:
def display_log_analysis(log_path):
    """Display an analysis of a log file with visualizations"""
    analysis = analyze_log(log_path)
    if 'error' in analysis:
        display(HTML(f"<p style='color:red'>{analysis['error']}</p>"))
        return
    
    # Create summary HTML
    html = f"""
    <div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ddd; border-radius: 8px;">
        <h2>Log Analysis: Case {analysis['case_id']}</h2>
        
        <div style="display: flex; margin-bottom: 20px;">
            <div style="flex: 1; padding: 10px; background-color: #f8f9fa; border-radius: 5px; margin-right: 10px;">
                <h3 style="margin-top: 0;">Summary</h3>
                <p><b>Log entries:</b> {analysis['log_entries']}</p>
                <p><b>Total duration:</b> {analysis['total_duration']:.2f}s ({analysis['total_duration']/60:.2f} min)</p>
                <p><b>Success:</b> {"✅" if analysis['success'] else "❌"}</p>
                <p><b>Errors:</b> {analysis['errors']}</p>
                <p><b>LLM calls:</b> {analysis['llm_calls']}</p>
            </div>
            
            <div style="flex: 1; padding: 10px; background-color: #f8f9fa; border-radius: 5px;">
                <h3 style="margin-top: 0;">Stage Timings</h3>
    """
    
    # Add stage timing bars
    if analysis['stage_timings']:
        max_time = max(analysis['stage_timings'].values())
        for stage, time in analysis['stage_timings'].items():
            percentage = (time / max_time) * 100
            html += f"""
                <div style="margin-bottom: 10px;">
                    <div>{stage}: {time:.2f}s</div>
                    <div style="background-color: #ddd; border-radius: 3px; height: 20px; width: 100%;">
                        <div style="background-color: #4CAF50; height: 20px; width: {percentage}%; border-radius: 3px;"></div>
                    </div>
                </div>
            """
    else:
        html += "<p>No stage timing data available</p>"
    
    html += """
            </div>
        </div>
    </div>
    """
    
    display(HTML(html))



In [44]:
def log_management_dashboard():
    """Display a dashboard for log management"""
    log_files = find_log_files()
    
    if not log_files:
        display(HTML("<p>No log files found</p>"))
        return
    
    # Create basic dashboard
    html = """
    <style>
    .log-table {width:100%; border-collapse:collapse; margin:20px 0}
    .log-table th, .log-table td {padding:8px; border:1px solid #ddd; text-align:left;}
    .log-table tr:nth-child(even) {background-color:#f2f2f2}
    .log-table th {background-color:#4CAF50; color:white;}
    </style>
    <h2>Case Generation Logs</h2>
    <table class="log-table">
        <tr>
            <th>#</th>
            <th>Case ID</th>
            <th>Timestamp</th>
            <th>File Size</th>
            <th>Action</th>
        </tr>
    """
    
    for i, log_file in enumerate(log_files[:20]):  # Limit to 20 most recent
        size_kb = log_file.stat().st_size / 1024
        timestamp = datetime.fromtimestamp(log_file.stat().st_mtime).strftime("%Y-%m-%d %H:%M:%S")
        case_id = log_file.name.split('_')[1]  # Extract case ID from filename
        
        html += f"""
        <tr>
            <td>{i+1}</td>
            <td>{case_id}</td>
            <td>{timestamp}</td>
            <td>{size_kb:.1f} KB</td>
            <td>
                <button onclick="IPython.notebook.kernel.execute('display_log_analysis(\"{log_file}\")')">
                    Analyze
                </button>
            </td>
        </tr>
        """
    
    html += """
    </table>
    <p>To analyze a specific log file:</p>
    <code>display_log_analysis("/path/to/log_file.jsonl")</code>
    
    <h3>Log Management Options</h3>
    <ul>
        <li><b>Cleanup old logs:</b> <code>cleanup_logs(days=30)</code></li>
        <li><b>Archive logs:</b> <code>archive_logs("../Data/LogArchive")</code></li>
        <li><b>Find logs for a case:</b> <code>find_log_files("case_id")</code></li>
    </ul>
    """
    
    display(HTML(html))


In [45]:
def cleanup_logs(days=30):
    """Remove logs older than the specified number of days"""
    log_dir = Path("../Data/Logs")
    if not log_dir.exists():
        print("Log directory not found")
        return
    
    cutoff_time = datetime.now() - timedelta(days=days)
    old_logs = []
    
    for log_file in log_dir.glob("*.jsonl"):
        mod_time = datetime.fromtimestamp(log_file.stat().st_mtime)
        if mod_time < cutoff_time:
            old_logs.append(log_file)
    
    if not old_logs:
        print(f"No logs older than {days} days found")
        return
    
    print(f"Found {len(old_logs)} logs older than {days} days:")
    for log in old_logs[:5]:
        print(f"- {log.name}")
    
    if len(old_logs) > 5:
        print(f"... and {len(old_logs)-5} more")
        
    confirm = input(f"Delete these {len(old_logs)} log files? (yes/no): ")
    if confirm.lower() == 'yes':
        for log in old_logs:
            log.unlink()
        print(f"Deleted {len(old_logs)} log files")
    else:
        print("Operation cancelled")



In [46]:
def archive_logs(archive_dir="../Data/LogArchive", days=30):
    """Archive logs older than the specified number of days"""
    import shutil
    from datetime import datetime, timedelta
    
    log_dir = Path("../Data/Logs")
    archive_path = Path(archive_dir)
    archive_path.mkdir(parents=True, exist_ok=True)
    
    cutoff_time = datetime.now() - timedelta(days=days)
    old_logs = []
    
    for log_file in log_dir.glob("*.jsonl"):
        mod_time = datetime.fromtimestamp(log_file.stat().st_mtime)
        if mod_time < cutoff_time:
            old_logs.append(log_file)
    
    if not old_logs:
        print(f"No logs older than {days} days found")
        return
    
    # Create a ZIP archive
    import zipfile
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    zip_path = archive_path / f"logs_archive_{timestamp}.zip"
    
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
        for log in old_logs:
            zipf.write(log, arcname=log.name)
    
    print(f"Archived {len(old_logs)} logs to {zip_path}")
    
    # Ask if we should delete the original files
    confirm = input("Delete the archived log files? (yes/no): ")
    if confirm.lower() == 'yes':
        for log in old_logs:
            log.unlink()
        print(f"Deleted {len(old_logs)} log files")
    else:
        print("Log files preserved")

In [47]:
# Display the log management dashboard (uncomment to use)
# log_management_dashboard()

## 7. Arize Phoenix Integration (Optional)

This section adds optional LLM monitoring and evaluation capabilities using Arize Phoenix.

### Options:

1. **Enable Arize Phoenix** (requires additional setup)
   - Run the dependency installation
   - Restart the kernel
   - Set up API keys
   - Complete the setup process
   
2. **Skip Arize Phoenix** (simpler)
   - Run `skip_arize_setup()`
   - Continue with the notebook without tracing

The case generation system works the same either way, but enabling Phoenix provides additional insights into your LLM's performance.

### Benefits of Phoenix Tracing (if enabled):

- Monitor token usage and costs
- Track generation quality
- Identify performance bottlenecks
- Debug problematic prompts

In [94]:
def launch_phoenix_dashboard():
    """Launch Arize Phoenix dashboard"""
    from IPython.display import display, HTML
    
    # Check if Phoenix is enabled
    phoenix_enabled = globals().get('USE_PHOENIX_TRACING', False) and globals().get('arize_tracing_enabled', False)
    
    if not phoenix_enabled:
        display(HTML("""
        <div style="padding: 20px; background-color: #ffeeee; border-radius: 5px;">
            <h3>⚠️ Phoenix Tracing Not Active</h3>
            <p>Tracing is not currently active. Please check:</p>
            <ol>
                <li>API credentials in .env file</li>
                <li>Installation of required packages</li>
                <li>Phoenix setup execution</li>
            </ol>
        </div>
        """))
        return
    
    # Phoenix dashboard URLs
    cloud_url = "https://app.arize.com/openinference"
    
    # Display links
    display(HTML(f"""
    <div style="padding: 20px; background-color: #f8f9fa; border-radius: 5px;">
        <h3>Arize Phoenix Dashboard</h3>
        
        <div style="padding: 15px; background-color: white; border-radius: 5px; box-shadow: 0 2px 5px rgba(0,0,0,0.1);">
            <p>View all traces in the Arize Cloud dashboard:</p>
            <a href="{cloud_url}" target="_blank" style="
                display: inline-block;
                padding: 10px 15px;
                background-color: #4CAF50;
                color: white;
                text-decoration: none;
                border-radius: 4px;
                font-weight: bold;">
                Open Phoenix Dashboard
            </a>
            <p style="margin-top: 10px; font-size: 0.9em;">
                Project: {PHOENIX_PROJECT_ID}<br>
                Environment: {PHOENIX_ENVIRONMENT}
            </p>
        </div>
    </div>
    """))



In [95]:
# Run this to show the dashboard link
launch_phoenix_dashboard()