# FHIR Test Kit Generator

This notebook generates a consolidated test kit markdown file from FHIR Implementation Guide requirements. The output serves as a complete specification that can be used by an LLM to generate executable test scripts.

#### What it does

- Processes each requirement from a markdown input file
- Generates comprehensive test specifications including:
  - Testability assessment (Automatic/Manual/Hybrid)
  - Implementation strategy with specific FHIR operations
  - Required test data and validation criteria
  - Implementable pseudocode
  - Edge cases and considerations
- Creates a single, well-structured markdown file with a table of contents

#### How to use

1. **Setup**: Individual cert setup may need to be modified in `setup_clients()` function. API keys should be in .env file. Make sure you have API keys for at least one of:
   - Anthropic Claude (`ANTHROPIC_API_KEY`)
   - Google Gemini (`GEMINI_API_KEY`) 
   - OpenAI GPT-4 (`OPENAI_API_KEY`)

2. **Input**: A markdown file with requirements in the following format:
   ```markdown
   # REQ-ID
   **Summary**: Requirement summary
   **Description**: Detailed description
   **Verification**: Test approach
   **Actor**: System component responsible
   **Conformance**: SHALL/SHOULD/MAY
   **Conditional**: True/False
   **Source**: Original requirement sources
   ---
   ```

3. **Run**: Execute the `run_test_kit_generator()` function and follow the prompts:
   - Select which LLM to use
   - Provide the path to your requirements file
   - Enter the Implementation Guide name
   - Specify the output directory, or use the default

4. **Output**: A single markdown file will be generated with the format:
   `[ig_name]_[llm]_test_kit_[timestamp].md`

In [174]:
import re
import os
import logging
import time
import json
from datetime import datetime
from pathlib import Path
from typing import List, Dict, Any, Optional, Tuple

import pandas as pd
from dotenv import load_dotenv
from anthropic import Anthropic, RateLimitError
import google.generativeai as gemini
from openai import OpenAI
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_exception_type

# Set up logging
logging.basicConfig(level=logging.INFO, 
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


In [189]:
# Constants
PROJECT_ROOT = Path.cwd().parent  # Go up one level to project root
OUTPUT_DIR = os.path.join(PROJECT_ROOT, '/reqs_extraction/test_plan_output')


In [176]:

# API Configuration
API_CONFIGS = {
    "claude": {
        "model_name": "claude-3-7-sonnet-20250219", 
        "max_tokens": 8192,
        "temperature": 0.3,  # Lower temperature for more consistent output
        "batch_size": 5,
        "delay_between_chunks": 1,
        "delay_between_batches": 3,
        "requests_per_minute": 900,
        "max_requests_per_day": 20000,
        "delay_between_requests": 0.1
    },
    "gemini": {
        "model": "models/gemini-1.5-pro-001",
        "max_tokens": 8192,
        "temperature": 0.3,
        "batch_size": 5,
        "delay_between_chunks": 2,
        "delay_between_batches": 5,
        "requests_per_minute": 900,
        "max_requests_per_day": 50000,
        "delay_between_requests": 0.1,
        "timeout": 60
    },
    "gpt": {
        "model": "gpt-4",
        "max_tokens": 3000,
        "temperature": 0.3,
        "batch_size": 5,
        "delay_between_chunks": 2,
        "delay_between_batches": 5,
        "requests_per_minute": 450,
        "max_requests_per_day": 20000,
        "delay_between_requests": 0.15
    }
}

# System prompts for test generation
SYSTEM_PROMPT = """You are a specialized FHIR testing engineer with expertise in healthcare interoperability.
Your task is to analyze FHIR Implementation Guide requirements and generate practical, implementable test specifications."""



In [177]:
# Main prompt for consolidated test kit
CONSOLIDATED_TEST_KIT_PROMPT = """
Analyze the following FHIR Implementation Guide requirement and create a comprehensive test specification.

For the requirement:
{requirement}

Create a structured test specification with the following sections:

1. Requirement Analysis:
   - Testability Assessment: Classify as Automatic, Manual, or Hybrid
   - Complexity: Simple, Moderate, or Complex
   - Prerequisites: Required system configurations, data, or setup

2. Test Implementation Strategy:
   - Required FHIR Operations: List specific API calls/operations needed
   - Test Data Requirements: Define specific test data needed
   - Validation Criteria: Specific checks to verify conformance
   
3. Pseudocode Implementation:
   - Provide detailed, implementable pseudocode that could be directly translated to a test script
   - Handle both positive and negative test cases where applicable
   - Include proper error handling and edge cases

4. Potential Issues and Edge Cases:
   - Identify corner cases that should be tested
   - Note performance or security considerations

Format your response as markdown with clear headers and code blocks for pseudocode.
"""

In [178]:
def create_rate_limiter():
    """Create a rate limiter state dictionary for all APIs"""
    return {
        api: {
            'requests': [],
            'daily_requests': 0,
            'last_reset': time.time()
        }
        for api in API_CONFIGS.keys()
    }

def check_rate_limits(rate_limiter: dict, api: str):
    """Check and wait if rate limits would be exceeded"""
    if api not in rate_limiter:
        raise ValueError(f"Unknown API: {api}")
        
    now = time.time()
    state = rate_limiter[api]
    config = API_CONFIGS[api]
    
    # Reset daily counts if needed
    day_seconds = 24 * 60 * 60
    if now - state['last_reset'] >= day_seconds:
        state['daily_requests'] = 0
        state['last_reset'] = now
    
    # Check daily limit
    if state['daily_requests'] >= config['max_requests_per_day']:
        raise Exception(f"{api} daily request limit exceeded")
    
    # Remove old requests outside the current minute
    state['requests'] = [
        req_time for req_time in state['requests']
        if now - req_time < 60
    ]
    
    # Wait if at rate limit
    if len(state['requests']) >= config['requests_per_minute']:
        sleep_time = 60 - (now - state['requests'][0])
        if sleep_time > 0:
            time.sleep(sleep_time)
        state['requests'] = state['requests'][1:] 
    
    # Add minimum delay between requests
    if state['requests'] and now - state['requests'][-1] < config['delay_between_requests']:
        time.sleep(config['delay_between_requests'])
    
    # Record this request
    state['requests'].append(now)
    state['daily_requests'] += 1

In [179]:
def setup_clients():
    """Initialize clients for each LLM service"""
    try:
        # Claude setup
        claude_client = Anthropic(
            api_key=os.getenv('ANTHROPIC_API_KEY'),
        )
        
        # Gemini setup
        gemini_api_key = os.getenv('GEMINI_API_KEY')
        if not gemini_api_key:
            raise ValueError("GEMINI_API_KEY not found")
        gemini.configure(api_key=gemini_api_key)
        gemini_client = gemini.GenerativeModel(
            model_name=API_CONFIGS["gemini"]["model"],
            generation_config={
                "max_output_tokens": API_CONFIGS["gemini"]["max_tokens"],
                "temperature": API_CONFIGS["gemini"]["temperature"]
            }
        )
        
        # OpenAI setup
        openai_api_key = os.getenv('OPENAI_API_KEY')
        if not openai_api_key:
            raise ValueError("OPENAI_API_KEY not found")
        openai_client = OpenAI(
            api_key=openai_api_key,
            timeout=60.0
        )
        
        return {
            "claude": claude_client,
            "gpt": openai_client,
            "gemini": gemini_client
        }
        
    except Exception as e:
        logging.error(f"Error setting up clients: {str(e)}")
        raise

In [180]:
def parse_requirements_file(file_path: str) -> List[Dict[str, str]]:
    """
    Parse an INCOSE requirements markdown file into a structured list of requirements
    
    Args:
        file_path: Path to the requirements markdown file
        
    Returns:
        List of dictionaries containing structured requirement information
    """
    with open(file_path, 'r') as f:
        content = f.read()
    
    # Split by requirement sections (separated by ---)
    req_sections = content.split('---')
    
    requirements = []
    for section in req_sections:
        if not section.strip():
            continue
            
        # Parse requirement data
        req_data = {}
        
        # Extract ID from format "# REQ-XXX-XXX-XX"
        id_match = re.search(r'#\s+([A-Z0-9\-]+)', section)
        if id_match:
            req_data['id'] = id_match.group(1)
        
        # Extract other fields
        for field in ['Summary', 'Description', 'Verification', 'Actor', 'Conformance', 'Conditional', 'Source']:
            pattern = rf'\*\*{field}\*\*:\s*(.*?)(?:\n\*\*|\n---|\\Z)'
            field_match = re.search(pattern, section, re.DOTALL)
            if field_match:
                req_data[field.lower()] = field_match.group(1).strip()
        
        if req_data:
            requirements.append(req_data)
    
    return requirements

In [181]:
def format_requirement_for_prompt(requirement: Dict[str, str]) -> str:
    """
    Format a requirement dictionary into markdown for inclusion in prompts
    
    Args:
        requirement: Requirement dictionary
        
    Returns:
        Formatted markdown string
    """
    formatted = f"# {requirement.get('id', 'UNKNOWN-ID')}\n"
    formatted += f"**Summary**: {requirement.get('summary', '')}\n"
    formatted += f"**Description**: {requirement.get('description', '')}\n"
    formatted += f"**Verification**: {requirement.get('verification', '')}\n"
    formatted += f"**Actor**: {requirement.get('actor', '')}\n"
    formatted += f"**Conformance**: {requirement.get('conformance', '')}\n"
    formatted += f"**Conditional**: {requirement.get('conditional', '')}\n"
    formatted += f"**Source**: {requirement.get('source', '')}\n"
    
    return formatted

In [182]:
@retry(
    wait=wait_exponential(multiplier=1, min=4, max=60),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type((RateLimitError, TimeoutError))
)
def make_llm_request(client, api_type: str, prompt: str, rate_limit_func) -> str:
    """Make rate-limited API request with retries"""
    rate_limit_func()
    
    config = API_CONFIGS[api_type]
    
    try:
        if api_type == "claude":
            response = client.messages.create(
                model=config["model_name"],
                max_tokens=config["max_tokens"],
                messages=[{
                    "role": "user", 
                    "content": prompt
                }],
                system=SYSTEM_PROMPT
            )
            return response.content[0].text
            
        elif api_type == "gemini":
            response = client.generate_content(
                prompt,
                generation_config={
                    "max_output_tokens": config["max_tokens"],
                    "temperature": config["temperature"]
                }
            )
            if hasattr(response, 'text'):
                return response.text
            elif response.candidates:
                return response.candidates[0].content.parts[0].text
            else:
                raise ValueError("No response generated from Gemini API")
                    
        elif api_type == "gpt":
            response = client.chat.completions.create(
                model=config["model"],
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": prompt}
                ],
                max_tokens=config["max_tokens"],
                temperature=config["temperature"]
            )
            return response.choices[0].message.content
            
    except Exception as e:
        logging.error(f"Error in {api_type} API request: {str(e)}")
        raise


In [183]:
def generate_test_specification(
    client, 
    api_type: str,
    requirement: Dict[str, str],
    rate_limit_func
) -> str:
    """
    Generate a comprehensive test specification for a single requirement
    
    Args:
        client: The API client
        api_type: API type (claude, gemini, gpt)
        requirement: Requirement dictionary
        rate_limit_func: Function to check rate limits
        
    Returns:
        Test specification for the requirement
    """
    logger.info(f"Generating test specification for {requirement.get('id', 'unknown')} using {api_type}...")
    
    # Format requirement as markdown
    formatted_req = format_requirement_for_prompt(requirement)
    
    # Create prompt with the requirement
    prompt = CONSOLIDATED_TEST_KIT_PROMPT.format(requirement=formatted_req)
    
    # Make the API request
    return make_llm_request(client, api_type, prompt, rate_limit_func)


In [184]:
def generate_consolidated_test_kit(
    api_type: str,
    requirements_file: str,
    ig_name: str = "FHIR Implementation Guide",
    output_dir: str = OUTPUT_DIR
) -> Dict[str, Any]:
    """
    Process requirements and generate a consolidated test kit
    
    Args:
        api_type: API type (claude, gemini, gpt)
        requirements_file: Path to requirements markdown file
        ig_name: Name of the Implementation Guide
        output_dir: Directory for output files
        
    Returns:
        Dictionary containing path to output file
    """
    logger.info(f"Starting test kit generation with {api_type} for {ig_name}")
    
    # Initialize API clients and rate limiters
    clients = setup_clients()
    client = clients[api_type]
    config = API_CONFIGS[api_type]
    rate_limiter = create_rate_limiter()
    
    def check_limits():
        check_rate_limits(rate_limiter, api_type)
    
    # Create output directory
    os.makedirs(output_dir, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    
    try:
        # Parse requirements from file
        requirements = parse_requirements_file(requirements_file)
        logger.info(f"Parsed {len(requirements)} requirements from {requirements_file}")
        
        # Output file
        test_kit_path = os.path.join(
            output_dir, 
            f"{ig_name.lower().replace(' ', '_')}_{api_type}_test_kit_{timestamp}.md"
        )
        
        # Initialize test kit content
        test_kit = f"# Consolidated Test Kit for {ig_name}\n\n"
        test_kit += f"## Generated on: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n"
        test_kit += "## Table of Contents\n\n"
        
        # Add table of contents
        for req in requirements:
            req_id = req.get('id', 'UNKNOWN-ID')
            req_summary = req.get('summary', 'No summary')
            test_kit += f"- [{req_id}: {req_summary}](#{req_id.lower()})\n"
        
        test_kit += "\n## Test Specifications\n\n"
        
        # Process each requirement
        for i, req in enumerate(requirements):
            req_id = req.get('id', 'UNKNOWN-ID')
            logger.info(f"Processing requirement {i+1}/{len(requirements)}: {req_id}")
            
            # Generate test specification
            test_spec = generate_test_specification(client, api_type, req, check_limits)
            
            # Add to test kit content with proper anchor for TOC linking
            test_kit += f"<a id='{req_id.lower()}'></a>\n\n"
            test_kit += f"### {req_id}: {req.get('summary', 'No summary')}\n\n"
            test_kit += f"**Description**: {req.get('description', '')}\n\n"
            test_kit += f"**Actor**: {req.get('actor', '')}\n\n"
            test_kit += f"**Conformance**: {req.get('conformance', '')}\n\n"
            test_kit += f"{test_spec}\n\n"
            test_kit += "---\n\n"
            
            # Add delay between requests
            if i < len(requirements) - 1:  # No need to delay after the last request
                time.sleep(config["delay_between_chunks"])
        
        # Save consolidated test kit
        with open(test_kit_path, 'w') as f:
            f.write(test_kit)
        logger.info(f"Consolidated test kit saved to {test_kit_path}")
        
        return {
            "requirements_count": len(requirements),
            "test_kit_path": test_kit_path
        }
        
    except Exception as e:
        logger.error(f"Error processing requirements: {str(e)}")
        raise

In [185]:
# Interactive notebook cell for running the generator
def run_test_kit_generator():
    # Load environment variables
    load_dotenv()
    
    # Get input from user or set default values
    print("\nFHIR IG Test Kit Generator")
    print("=" * 50)
    
    # Let user select the API
    print("\nSelect the API to use:")
    print("1. Claude")
    print("2. Gemini")
    print("3. GPT-4")
    api_choice = input("Enter your choice (1-3, default 1): ") or "1"
    
    api_mapping = {
        "1": "claude",
        "2": "gemini",
        "3": "gpt"
    }
    
    api_type = api_mapping.get(api_choice, "claude")
    
    # Get requirements file path
    requirements_file = input("\nEnter path to requirements markdown file: ")
    
    # Check if requirements file exists
    if not os.path.exists(requirements_file):
        logger.error(f"Requirements file not found: {requirements_file}")
        print(f"Error: Requirements file not found at {requirements_file}")
        return
    
    # Get IG name
    ig_name = input("\nEnter Implementation Guide name (default 'FHIR Implementation Guide'): ") or "FHIR Implementation Guide"
    
    # Get output directory
    output_dir = input(f"\nEnter output directory (default '{OUTPUT_DIR}'): ") or OUTPUT_DIR
    
    print(f"\nProcessing requirements with {api_type.capitalize()}...")
    print(f"This may take several minutes depending on the number of requirements.")
    
    try:
        # Process requirements and generate test kit
        result = generate_consolidated_test_kit(
            api_type=api_type,
            requirements_file=requirements_file,
            ig_name=ig_name,
            output_dir=output_dir
        )
        
        # Output results
        print("\n" + "="*80)
        print(f"Test kit generation complete!")
        print(f"Processed {result['requirements_count']} requirements")
        print(f"Consolidated test kit: {result['test_kit_path']}")
        print("="*80)
        
        return result
        
    except Exception as e:
        logger.error(f"Error: {str(e)}")
        print(f"\nError occurred during processing: {str(e)}")
        print("Check the log for more details.")
        return None


In [None]:
# Run the generator when executed in a notebook cell
if __name__ == "__main__":
    run_test_kit_generator()


FHIR IG Test Kit Generator

Select the API to use:
1. Claude
2. Gemini
3. GPT-4


2025-03-19 12:50:12,356 - __main__ - INFO - Starting test kit generation with gpt for Plan Net
2025-03-19 12:50:12,371 - __main__ - INFO - Parsed 18 requirements from /Users/ceadams/Documents/onclaive/onclaive/reqs_extraction/revised_reqs/refined_requirements_gemini_20250319_114029.md
2025-03-19 12:50:12,371 - __main__ - INFO - Processing requirement 1/18: R
2025-03-19 12:50:12,372 - __main__ - INFO - Generating test specification for R using gpt...



Processing requirements with Gpt...
This may take several minutes depending on the number of requirements.


2025-03-19 12:50:35,776 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-19 12:50:37,798 - __main__ - INFO - Processing requirement 2/18: REQ-AUTH-01
2025-03-19 12:50:37,799 - __main__ - INFO - Generating test specification for REQ-AUTH-01 using gpt...
2025-03-19 12:50:55,873 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-19 12:50:57,885 - __main__ - INFO - Processing requirement 3/18: REQ-CLIENT-01
2025-03-19 12:50:57,886 - __main__ - INFO - Generating test specification for REQ-CLIENT-01 using gpt...
2025-03-19 12:51:14,861 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-19 12:51:16,868 - __main__ - INFO - Processing requirement 4/18: REQ-DATA-01
2025-03-19 12:51:16,869 - __main__ - INFO - Generating test specification for REQ-DATA-01 using gpt...
2025-03-19 12:51:35,598 - httpx - INFO - HTTP Request: POST https://a


Test kit generation complete!
Processed 18 requirements
Consolidated test kit: /Users/ceadams/Documents/onclaive/onclaive/reqs_extraction/test_plan_output/plan_net_gpt_test_kit_20250319_125012.md
