# FHIR Requirements Refinement Tool

This tool processes a raw list of FHIR Implementation Guide requirements and uses an LLM to produce a refined, concise list of only the testable requirements.

### What It Does

- Takes a markdown file containing FHIR requirements (generated from an IG)
- Applies filtering to identify only testable requirements
- Consolidates duplicate requirements and merges related ones
- Formats each requirement with consistent structure
- Outputs a clean, testable requirements list

### How to Use

1. Run the interactive mode in the notebook: `run_refinement()` or `result = run_refinement()`
2. When prompted, enter the path to the requirements list you want to refine
3. The refined requirements will be saved as `revised_reqs_output/{api}_reqs_list_v2_{timestamp}.md`
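
The output naming scheme in step 3 can be sketched as follows. `build_output_path` is a hypothetical helper written for illustration (the notebook builds the name inline), and the timestamp format is the one used later in `refine_requirements`.

```python
from datetime import datetime
from pathlib import Path


def build_output_path(output_dir: str, api_type: str, now: datetime) -> Path:
    """Return <output_dir>/<api>_reqs_list_v2_<YYYYmmdd_HHMMSS>.md."""
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(output_dir) / f"{api_type}_reqs_list_v2_{timestamp}.md"


example = build_output_path(
    "revised_reqs_output", "claude", datetime(2025, 7, 23, 14, 27, 1)
)
print(example.name)  # claude_reqs_list_v2_20250723_142701.md
```

Because the timestamp has one-second resolution, two runs with the same API inside the same second would write to the same file.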

Notes:
- Supports Claude, Gemini, or GPT-4o
- API keys should be in a `.env` file
- API configurations are set in `llm_utils.py`; make any configuration changes there
- Individual certificate setup may need to be modified in the `setup_clients()` function in `llm_utils.py` before running this notebook
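
Before running, it can help to confirm that the keys were actually loaded. The sketch below is only an illustration: the environment variable names are assumptions (check `llm_utils.py` for the names it really reads), and `missing_api_keys` is a hypothetical helper, not part of this project.

```python
import os
from typing import Dict, List, Mapping

# Assumed variable names; llm_utils.py may read different ones.
ASSUMED_KEY_NAMES: Dict[str, str] = {
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "gpt": "OPENAI_API_KEY",
}


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the API types whose key is absent or empty in env."""
    return [api for api, name in ASSUMED_KEY_NAMES.items() if not env.get(name)]


print(missing_api_keys({"ANTHROPIC_API_KEY": "sk-example"}))  # ['gemini', 'gpt']
```

Called with no argument after `load_dotenv()`, the helper checks the live process environment.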

### Inputs and Setup

In [None]:
import os
import logging
import time
from pathlib import Path
from typing import Dict, Any, Optional, List, Tuple
from datetime import datetime, timedelta
import sys
import re
import tiktoken

from dotenv import load_dotenv

# Set up logging
logging.basicConfig(level=logging.INFO, 
                   format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)


In [13]:
# Define paths
PROJECT_ROOT = Path.cwd().parent  # Parent directory (one level above cwd)
CURRENT_DIR = Path.cwd()  # Current working directory
#DEFAULT_INPUT_DIR = CURRENT_DIR / "initial_reqs_output"  # Default input directory
#DEFAULT_OUTPUT_DIR = CURRENT_DIR / "revised_reqs_output"  # Default output directory

# Machine-specific defaults; wrapped in Path so they can be joined with "/" later
DEFAULT_INPUT_DIR = Path("/Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/requirements_extraction/markdown")
DEFAULT_OUTPUT_DIR = Path("/Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/revised_reqs_extraction")

# Create output directory
#DEFAULT_OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# Load environment variables
load_dotenv()

# Log the directories
logging.info(f"Current working directory: {CURRENT_DIR}")
logging.info(f"Project root: {PROJECT_ROOT}")
logging.info(f"Default input directory: {DEFAULT_INPUT_DIR}")
logging.info(f"Default output directory: {DEFAULT_OUTPUT_DIR}")


2025-07-23 14:25:22,005 - root - INFO - Current working directory: /Users/ceadams/Documents/onclaive/onclaive/reqs_extraction
2025-07-23 14:25:22,006 - root - INFO - Project root: /Users/ceadams/Documents/onclaive/onclaive
2025-07-23 14:25:22,006 - root - INFO - Default input directory: /Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/requirements_extraction/markdown
2025-07-23 14:25:22,006 - root - INFO - Default output directory: /Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/revised_reqs_extraction


In [14]:
import importlib.util
module_path = os.path.join(PROJECT_ROOT, 'llm_utils.py')

spec = importlib.util.spec_from_file_location("llm_utils", module_path)
llm_utils = importlib.util.module_from_spec(spec)
spec.loader.exec_module(llm_utils)
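
The same three-step load is repeated for `prompt_utils` in the next cell, so it could be wrapped in a small helper. This is a sketch only: `load_module_from_path` is a hypothetical name, and the throwaway `demo_utils.py` file stands in for `llm_utils.py` so the example runs on its own.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Load a Python source file as a module without touching sys.path."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration with a temporary module file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_utils.py"
    demo_path.write_text("GREETING = 'hello'\n", encoding="utf-8")
    demo = load_module_from_path("demo_utils", demo_path)

print(demo.GREETING)  # hello
```

The explicit `None` check gives a clearer error than the `AttributeError` that the inline version would raise if the spec could not be created.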

In [15]:
# Import prompt utilities
prompt_utils_path = os.path.join(PROJECT_ROOT, 'prompt_utils.py')
spec = importlib.util.spec_from_file_location("prompt_utils", prompt_utils_path)
prompt_utils = importlib.util.module_from_spec(spec)
spec.loader.exec_module(prompt_utils)

# Setup the prompt environment
prompt_env = prompt_utils.setup_prompt_environment(PROJECT_ROOT)
PROMPT_DIR = prompt_env["prompt_dir"]
REQUIREMENTS_REFINEMENT_PATH = prompt_env["requirements_refinement_path"]

logging.info(f"Using prompts directory: {PROMPT_DIR}")
logging.info(f"Requirements refinement prompt: {REQUIREMENTS_REFINEMENT_PATH}")

2025-07-23 14:25:22,019 - root - INFO - Prompt environment set up at: /Users/ceadams/Documents/onclaive/onclaive/prompts
2025-07-23 14:25:22,020 - root - INFO - Using prompts directory: /Users/ceadams/Documents/onclaive/onclaive/prompts
2025-07-23 14:25:22,020 - root - INFO - Requirements refinement prompt: /Users/ceadams/Documents/onclaive/onclaive/prompts/requirements_refinement.md


### API Configuration

In [16]:
# System prompts
SYSTEM_PROMPTS = {
    "claude": "You are a Healthcare Standards Expert tasked with analyzing and refining FHIR Implementation Guide requirements.",
    "gemini": "Your role is to analyze and refine FHIR Implementation Guide requirements, focusing on making them concise, testable, and conformance-oriented.",
    "gpt": "As a Healthcare Standards Expert, analyze and refine FHIR Implementation Guide requirements to produce a concise, testable requirements list."
}

### Prompt Development

In [17]:
def get_requirements_refinement_prompt(requirements_list: str) -> str:
    """
    Create the prompt for refining requirements list using external prompt file
    
    Args:
        requirements_list: The original list of requirements
        
    Returns:
        str: The prompt for the LLM loaded from external file
    """
    return prompt_utils.load_prompt(
        REQUIREMENTS_REFINEMENT_PATH,
        requirements_list=requirements_list
    )
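
The body of `prompt_utils.load_prompt` is not shown in this notebook. As a rough mental model, and assuming the prompt file marks its slot with a `{requirements_list}` placeholder, it might behave like the hypothetical `load_prompt_sketch` below; the real function may differ.

```python
import tempfile
from pathlib import Path


def load_prompt_sketch(template_path: Path, **variables: str) -> str:
    """Read a template file and fill each {name} placeholder."""
    text = Path(template_path).read_text(encoding="utf-8")
    for name, value in variables.items():
        # str.replace leaves unrelated braces (e.g. JSON examples) untouched
        text = text.replace("{" + name + "}", value)
    return text


with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "requirements_refinement.md"
    template.write_text(
        "Refine these requirements:\n{requirements_list}\n", encoding="utf-8"
    )
    filled = load_prompt_sketch(
        template, requirements_list="# REQ-01\n**Summary**: ..."
    )

print(filled)
```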

In [None]:

def estimate_tokens(text: str, api_type: str = "claude") -> int:
    """
    Estimate the token count of text for the given API.

    Prefers LLMApiClient.count_tokens; falls back to tiktoken for GPT and to a
    rough 4-characters-per-token heuristic otherwise.
    """
    try:
        # Create a temporary client instance to use its count_tokens method
        temp_client = llm_utils.LLMApiClient()
        return temp_client.count_tokens(text, api_type)
    except Exception:
        # Fallback estimation if client creation or counting fails
        if api_type == "gpt":
            try:
                encoding = tiktoken.encoding_for_model("gpt-4o")
                return len(encoding.encode(text))
            except Exception:
                pass
        # Character-based fallback (Claude, Gemini, or GPT when tiktoken fails)
        return len(text) // 4

def get_token_limits(api_type: str) -> Tuple[int, int]:
    """
    Get input and output token limits for different APIs
    Uses your existing API_CONFIGS
    """
    # Input limits (conservative estimates for context windows)
    input_limits = {
        "claude": 180000,    # Claude 3.5 Sonnet has 200K, leave buffer
        "gemini": 900000,    # Gemini 2.5 Pro has 1M, leave buffer  
        "gpt": 120000        # GPT-4o has 128K, leave buffer
    }
    
    # Output limits from your config
    output_limits = {
        "claude": llm_utils.API_CONFIGS["claude"]["max_tokens"],
        "gemini": llm_utils.API_CONFIGS["gemini"]["max_tokens"], 
        "gpt": llm_utils.API_CONFIGS["gpt"]["max_tokens"]
    }
    
    return input_limits.get(api_type, 100000), output_limits.get(api_type, 4000)

def split_requirements_by_tokens(requirements_text: str, max_input_tokens: int, 
                                prompt_template: str, api_type: str) -> List[str]:
    """
    Split requirements into chunks that fit within token limits
    """
    # For prompt overhead calculation, use your existing prompt loading function
    sample_prompt = prompt_utils.load_prompt(REQUIREMENTS_REFINEMENT_PATH, requirements_list="PLACEHOLDER")
    prompt_overhead = estimate_tokens(sample_prompt, api_type) - estimate_tokens("PLACEHOLDER", api_type)
    
    # Available tokens for requirements (with safety buffer)
    available_tokens = max_input_tokens - prompt_overhead - 2000  # Larger safety buffer
    
    logger.info(f"Prompt overhead: {prompt_overhead} tokens")
    logger.info(f"Available tokens for requirements: {available_tokens}")
    
    # Split by individual requirements first
    req_pattern = r'(?=^---\s*\n#\s*REQ-\d+)'
    requirements = re.split(req_pattern, requirements_text, flags=re.MULTILINE)
    requirements = [req.strip() for req in requirements if req.strip()]
    
    logger.info(f"Found {len(requirements)} individual requirements to process")
    
    chunks = []
    current_chunk = ""
    current_tokens = 0
    
    for i, req in enumerate(requirements):
        req_tokens = estimate_tokens(req, api_type)
        
        # If single requirement exceeds limit, it needs special handling
        if req_tokens > available_tokens:
            logger.warning(f"Requirement {i+1} exceeds token limit ({req_tokens} > {available_tokens})")
            if current_chunk:
                chunks.append(current_chunk)
                current_chunk = ""
                current_tokens = 0
            
            # No further splitting is attempted: send the oversized requirement
            # as its own chunk and let the API accept or reject it
            chunks.append(req)
            continue
        
        # Check if adding this requirement would exceed limit
        if current_tokens + req_tokens > available_tokens:
            if current_chunk:
                chunks.append(current_chunk)
            current_chunk = req
            current_tokens = req_tokens
        else:
            current_chunk += "\n\n" + req if current_chunk else req
            current_tokens += req_tokens
    
    if current_chunk:
        chunks.append(current_chunk)
    
    logger.info(f"Split into {len(chunks)} chunks")
    return chunks

def merge_and_renumber_requirements(results: List[str]) -> str:
    """
    Merge multiple result chunks and renumber requirements sequentially
    """
    all_requirements = []
    
    for result in results:
        # Skip safety filter blocked content
        if "[CONTENT BLOCKED BY SAFETY FILTER - SKIPPED]" in result:
            logger.warning("Skipping chunk blocked by safety filter")
            continue
            
        # Extract individual requirements from each result
        req_pattern = r'---\s*\n(#\s*REQ-\d+.*?)(?=---|\Z)'
        matches = re.findall(req_pattern, result, re.DOTALL)
        all_requirements.extend(matches)
    
    # Renumber requirements
    final_output = ""
    for i, req in enumerate(all_requirements, 1):
        # Replace only the heading's REQ-XX number, not later matches in the body
        req_renumbered = re.sub(r'#\s*REQ-\d+', f'# REQ-{i:02d}', req, count=1)
        final_output += f"---\n{req_renumbered}\n"
    
    return final_output


def validate_output_completeness(input_count: int, output_count: int, 
                               estimated_output_tokens: int, max_output_tokens: int):
    """
    Validate that output wasn't truncated
    """
    warnings = []
    
    # Check if output might be truncated
    if estimated_output_tokens > max_output_tokens * 0.95:
        warnings.append(f"Output may be truncated. Estimated tokens: {estimated_output_tokens}, Limit: {max_output_tokens}")
    
    # Check if significant requirements were lost (more than 70% reduction might indicate issues)
    if output_count < input_count * 0.3:
        warnings.append(f"Large reduction in requirements count (from {input_count} to {output_count}). Verify this is expected.")
    
    return warnings

def make_api_request_with_limits(client_instance, api_type: str, content: str) -> str:
    """Enhanced API request with token limit checking using your LLMApiClient"""
    
    # Get token limits for this API
    max_input_tokens, max_output_tokens = get_token_limits(api_type)
    
    # Get the full prompt using your existing function
    full_prompt = get_requirements_refinement_prompt(content)
    estimated_tokens = estimate_tokens(full_prompt, api_type)
    
    logger.info(f"Estimated input tokens: {estimated_tokens}")
    logger.info(f"API input limit: {max_input_tokens}")
    logger.info(f"API output limit: {max_output_tokens}")
    
    if estimated_tokens > max_input_tokens * 0.9:  # 90% threshold
        logger.warning("Input may exceed token limits. Splitting into chunks...")
        
        # Load the prompt template from your external file instead of hardcoding
        with open(REQUIREMENTS_REFINEMENT_PATH, 'r') as f:
            prompt_template = f.read()
        
        # Split requirements into manageable chunks
        chunks = split_requirements_by_tokens(content, max_input_tokens, prompt_template, api_type)
        logger.info(f"Split into {len(chunks)} chunks")
        
        all_results = []
        for i, chunk in enumerate(chunks):
            logger.info(f"Processing chunk {i+1}/{len(chunks)}")
            
            # Use your existing prompt loading function for each chunk
            chunk_prompt = get_requirements_refinement_prompt(chunk)
            
            # Use your existing LLMApiClient method
            try:
                response = client_instance.make_llm_request(
                    api_type=api_type,
                    prompt=chunk_prompt,
                    sys_prompt=SYSTEM_PROMPTS[api_type],
                    reformat=False  # We're already providing the formatted prompt
                )
                all_results.append(response)
                
                # Add small delay between chunks
                time.sleep(1)
                
            except Exception as e:
                logger.error(f"Error processing chunk {i+1}: {str(e)}")
                all_results.append(f"[ERROR PROCESSING CHUNK {i+1}: {str(e)}]")
        
        # Merge results and renumber requirements
        return merge_and_renumber_requirements(all_results)
    
    else:
        # Single API call - use your existing method
        return client_instance.make_llm_request(
            api_type=api_type,
            prompt=full_prompt,
            sys_prompt=SYSTEM_PROMPTS[api_type],
            reformat=False  # We're already providing the formatted prompt
        )

# Updated make_api_request function 
def make_api_request(client_instance, api_type: str, content: str) -> str:
    """Make API request with token limit checking using your LLMApiClient"""
    return make_api_request_with_limits(client_instance, api_type, content)



# Updated refine_requirements function
def refine_requirements(input_file: str, api_type: str = "claude", 
                       output_dir: Optional[str] = None) -> Dict[str, Any]:
    """
    Refine requirements using the specified API with token limit handling
    """
    logger.info(f"Starting requirements refinement with {api_type}")
    
    # Use the default output directory if none is provided; wrap in Path so the
    # "/" join below works even when the default is a plain string
    output_dir = Path(output_dir) if output_dir is not None else Path(DEFAULT_OUTPUT_DIR)
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Validate input file
    input_path = Path(input_file)
    if not input_path.exists():
        raise FileNotFoundError(f"Input file not found: {input_file}")
    
    # Read input requirements
    with open(input_path, 'r') as f:
        requirements_content = f.read()
    
    # Count original requirements
    original_req_count = count_requirements_in_markdown(requirements_content)
    logger.info(f"Original requirements count: {original_req_count}")
    
    # Check input size
    input_tokens = estimate_tokens(requirements_content, api_type)
    logger.info(f"Input size: {len(requirements_content)} characters, ~{input_tokens} tokens")
    
    # Initialize your LLMApiClient
    try:
        client_instance = llm_utils.LLMApiClient()
        if api_type not in client_instance.clients or client_instance.clients[api_type] is None:
            raise ValueError(f"API client for {api_type} is not available")
    except Exception as e:
        logger.error(f"Error initializing LLMApiClient: {str(e)}")
        raise
    
    try:
        # Process the requirements with enhanced token handling
        logger.info(f"Sending requirements to {api_type} for refinement...")
        refined_requirements = make_api_request(client_instance, api_type, requirements_content)
        
        # Validate output
        refined_req_count = count_requirements_in_markdown(refined_requirements)
        output_tokens = estimate_tokens(refined_requirements, api_type)
        max_input_tokens, max_output_tokens = get_token_limits(api_type)
        
        # Check for potential issues
        warnings = validate_output_completeness(
            original_req_count, refined_req_count, output_tokens, max_output_tokens
        )
        
        if warnings:
            for warning in warnings:
                logger.warning(warning)
                print(f"WARNING: {warning}")
        
        # Generate output filename
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        output_filename = f"{api_type}_reqs_list_v2_{timestamp}.md"
        output_file_path = output_dir / output_filename
        
        # Save refined requirements
        with open(output_file_path, 'w') as f:
            f.write(refined_requirements)
        
        logger.info(f"Requirements refinement complete. Output saved to: {output_file_path}")
        logger.info(f"Refined {original_req_count} -> {refined_req_count} requirements")
        logger.info(f"Output size: {len(refined_requirements)} characters, ~{output_tokens} tokens")
        
        return {
            "input_file": str(input_path),
            "output_file": str(output_file_path),
            "api_used": api_type,
            "timestamp": timestamp,
            "original_requirements_count": original_req_count,
            "requirements_count": refined_req_count,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "warnings": warnings
        }
        
    except Exception as e:
        logger.error(f"Error refining requirements: {str(e)}")
        raise
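
The split, refine, and merge flow in the cell above can be tried without any API access. The sketch below is a simplified stand-in, not the notebook's implementation: it uses the character-based estimate in place of `LLMApiClient.count_tokens`, the helper names are hypothetical, and the model call is faked by resetting every heading to `REQ-01`.

```python
import re
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Character-based estimate (about 4 characters per token)."""
    return len(text) // 4


def greedy_chunks(requirements: List[str], budget: int,
                  count: Callable[[str], int] = rough_token_count) -> List[str]:
    """Pack requirements into chunks whose estimated size stays within budget."""
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for req in requirements:
        cost = count(req)
        # Flush the running chunk before it would overflow the budget
        if current and used + cost > budget:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(req)  # an oversized requirement ends up alone in a chunk
        used += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def renumber(chunk_outputs: List[str]) -> str:
    """Collect every '---' / '# REQ-NN' block and number them 01, 02, ..."""
    pattern = r"---\s*\n(#\s*REQ-\d+.*?)(?=---|\Z)"
    blocks = [m.strip() for out in chunk_outputs
              for m in re.findall(pattern, out, re.DOTALL)]
    return "".join(
        "---\n" + re.sub(r"#\s*REQ-\d+", f"# REQ-{i:02d}", block, count=1) + "\n"
        for i, block in enumerate(blocks, 1)
    )


reqs = [f"---\n# REQ-{n:02d}\n**Summary**: requirement {n}" for n in (1, 2, 3)]
chunks = greedy_chunks(reqs, budget=20)
# Pretend each chunk came back from the model with every heading reset to REQ-01.
fake_outputs = [re.sub(r"REQ-\d+", "REQ-01", chunk) for chunk in chunks]
merged = renumber(fake_outputs)
print(len(chunks))
print(merged)
```

Note that the `(?=---|\Z)` lookahead ends a block at any `---` in the text, so a requirement whose body contains a horizontal rule or a markdown table separator would be cut short by this pattern.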

### API Call

In [None]:
# def make_api_request(client, api_type: str, content: str) -> str:
#     """Make API request with retries"""

#     prompt = get_requirements_refinement_prompt(content)
    
#     # Create a rate limiter for this request
#     rate_limiter = llm_utils.create_rate_limiter()
#     rate_limit_func = llm_utils.create_rate_limit_function(rate_limiter, api_type)
    
#     return llm_utils.make_llm_request(
#         client=client,
#         api_type=api_type,
#         prompt=prompt,
#         system_prompt=SYSTEM_PROMPTS[api_type],
#         rate_limit_func=rate_limit_func
#     )

### Main Processing Function

In [None]:
# def refine_requirements(input_file: str, api_type: str = "claude", 
#                        output_dir: str = None) -> Dict[str, Any]:
#     """
#     Refine requirements using the specified API
    
#     Args:
#         input_file: Path to the input requirements list markdown file
#         api_type: The API to use ("claude", "gemini", or "gpt")
#         output_dir: Directory to save the output (optional)
        
#     Returns:
#         Dict containing processing results and path to refined requirements
#     """
#     logger.info(f"Starting requirements refinement with {api_type}")
    
#     # Use default output directory if none provided
#     if output_dir is None:
#         output_dir = DEFAULT_OUTPUT_DIR
#     else:
#         output_dir = Path(output_dir)
#         output_dir.mkdir(parents=True, exist_ok=True)
    
#     # Validate input file
#     input_path = Path(input_file)
#     if not input_path.exists():
#         raise FileNotFoundError(f"Input file not found: {input_file}")
    
#     # Read input requirements
#     with open(input_path, 'r') as f:
#         requirements_content = f.read()
    
#     # Initialize API clients
#     clients = llm_utils.setup_clients()
#     if api_type not in clients or clients[api_type] is None:
#         raise ValueError(f"API client for {api_type} is not available")
    
#     client = clients[api_type]
    
#     try:
#         # Process the requirements
#         logger.info(f"Sending requirements to {api_type} for refinement...")
#         refined_requirements = make_api_request(client, api_type, requirements_content)
        
#         # Generate output filename
#         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
#         output_filename = f"{api_type}_reqs_list_v2_{timestamp}.md"
#         output_file_path = output_dir / output_filename
        
#         # Save refined requirements
#         with open(output_file_path, 'w') as f:
#             f.write(refined_requirements)
        
#         # Count refined requirements
#         refined_req_count = count_requirements_in_markdown(refined_requirements)
        
#         logger.info(f"Requirements refinement complete. Output saved to: {output_file_path}")
#         logger.info(f"Identified {refined_req_count} requirements")
        
#         return {
#             "input_file": str(input_path),
#             "output_file": str(output_file_path),
#             "api_used": api_type,
#             "timestamp": timestamp,
#             "requirements_count": refined_req_count
#         }
        
#     except Exception as e:
#         logger.error(f"Error refining requirements: {str(e)}")
#         raise

### Main Execution

In [None]:
def count_requirements_in_markdown(markdown_text):
    """
    Count the number of requirements in a markdown file that follow the REQ-XX format.
    
    Handles both formats:
    # REQ-01
    or
    ## REQ-01
    
    Example of the expected formats:
    # REQ-01
    **Summary**: Some requirement summary
    
    ## REQ-02
    **Summary**: Another requirement summary
    """
    # Pattern for both formats: either # REQ-XX or ## REQ-XX
    req_pattern = r"^\s*(#|##)\s+REQ-\d+"
    
    # Count the occurrences
    lines = markdown_text.split('\n')
    count = 0
    
    for line in lines:
        if re.match(req_pattern, line):
            count += 1
    
    return count
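
A quick check of the heading pattern on a small sample is shown below. It is a standalone sketch: `count_requirements` is a compact equivalent written for illustration, using one multiline regex instead of the line loop.

```python
import re

# Matches "# REQ-NN" or "## REQ-NN" at the start of a line (optional indent)
REQ_HEADING = re.compile(r"^[ \t]*#{1,2}[ \t]+REQ-\d+", re.MULTILINE)


def count_requirements(markdown_text: str) -> int:
    """Count lines that start with '# REQ-NN' or '## REQ-NN'."""
    return len(REQ_HEADING.findall(markdown_text))


sample = """\
# REQ-01
**Summary**: First requirement

## REQ-02
**Summary**: Second requirement, which mentions REQ-01 inline

### REQ-03
**Summary**: Third-level heading, not counted
"""
print(count_requirements(sample))  # 2
```

Inline mentions and deeper heading levels are ignored, so a model that emits `### REQ-NN` headings would be reported as producing zero requirements.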

In [None]:
def run_refinement():
    """Run the refinement process with user input"""
    print("\n" + "="*80)
    print("FHIR Requirements Refinement Tool")
    print("="*80)
    
    # Start timing the entire function execution
    start_time = time.time()
    
    # Get input directory or use default
    input_dir = input(f"Enter input directory path or accept default (default '{DEFAULT_INPUT_DIR}'): ") or str(DEFAULT_INPUT_DIR)
    input_dir_path = Path(input_dir)
    
    if not input_dir_path.exists():
        print(f"Warning: Input directory {input_dir} does not exist.")
        input_file = input("Enter full path to requirements markdown file: ")
    else:
        # List all markdown files in the input directory
        md_files = list(input_dir_path.glob("*.md"))
        
        if md_files:
            # Sort files by modification time (newest first)
            md_files.sort(key=lambda x: x.stat().st_mtime, reverse=True)
            
            # Show only the 10 most recent files
            recent_files = md_files[:10]
            
            print("\nMost recent files:")
            for idx, file in enumerate(recent_files, 1):
                # Format the modification time as part of the display
                mod_time = datetime.fromtimestamp(file.stat().st_mtime).strftime("%Y-%m-%d %H:%M")
                print(f"{idx}. {file.name} ({mod_time})")
            
            # Let user select from the list, see more files, or enter a custom path
            print("\nOptions:")
            print("- Select a number (1-10) to choose one of the following most recently generated files")
            print("- Enter 'all' to see all files")
            print("- Enter a full path to use a specific file")
            
            selection = input("\nReview the printed options for choosing a requirements file and enter applicable selection: ")
            
            if selection.lower() == 'all':
                # Show all files with pagination
                all_files = md_files
                page_size = 20
                total_pages = (len(all_files) + page_size - 1) // page_size
                
                current_page = 1
                while current_page <= total_pages:
                    start_idx = (current_page - 1) * page_size
                    end_idx = min(start_idx + page_size, len(all_files))
                    
                    print(f"\nAll files (page {current_page}/{total_pages}):")
                    for idx, file in enumerate(all_files[start_idx:end_idx], start_idx + 1):
                        mod_time = datetime.fromtimestamp(file.stat().st_mtime).strftime("%Y-%m-%d %H:%M")
                        print(f"{idx}. {file.name} ({mod_time})")
                    
                    if current_page < total_pages:
                        next_action = input("\nPress Enter for next page, 'q' to select, or enter a number to choose a file: ")
                        if next_action.lower() == 'q':
                            break
                        elif next_action.isdigit() and 1 <= int(next_action) <= len(all_files):
                            input_file = str(all_files[int(next_action) - 1])
                            break
                        else:
                            current_page += 1
                    else:
                        break
                
                if 'input_file' not in locals():
                    # If we went through all pages without selection
                    file_number = input("\nEnter the file number to process: ")
                    if file_number.isdigit() and 1 <= int(file_number) <= len(all_files):
                        input_file = str(all_files[int(file_number) - 1])
                    else:
                        input_file = file_number  # Treat as a custom path
            
            elif selection.isdigit() and 1 <= int(selection) <= len(recent_files):
                input_file = str(recent_files[int(selection) - 1])
            else:
                input_file = selection  # Treat as a custom path
        else:
            print(f"No markdown files found in {input_dir}")
            input_file = input("Enter full path to requirements markdown file: ")
    
    # Get output directory or use default
    output_dir = input(f"Enter output directory path or accept default (default '{DEFAULT_OUTPUT_DIR}'): ") or str(DEFAULT_OUTPUT_DIR)
    output_dir_path = Path(output_dir)
    
    # Create output directory if it doesn't exist
    output_dir_path.mkdir(parents=True, exist_ok=True)
    
    # Select the API to use
    print("\nSelect the API to use:")
    print("1. Claude")
    print("2. Gemini")
    print("3. GPT-4")
    api_choice = input("Enter your choice of API to use, based on the printed listing (1-3, default 1): ") or "1"
    
    api_mapping = {
        "1": "claude",
        "2": "gemini",
        "3": "gpt"
    }
    
    api_type = api_mapping.get(api_choice, "claude")
    
    try:
        # Run the refinement
        print(f"\nProcessing requirements with {api_type.capitalize()}...")
        result = refine_requirements(input_file, api_type, output_dir_path)
        
        # Calculate total execution time
        total_elapsed_time = time.time() - start_time
        total_elapsed_formatted = str(timedelta(seconds=int(total_elapsed_time)))
        
        print("\n" + "="*80)
        print("Requirements Refinement Complete!")
        print(f"Input file: {result['input_file']}")
        print(f"Refined requirements saved to: {result['output_file']}")
        print(f"API used: {result['api_used']}")
        print(f"Number of requirements identified: {result['requirements_count']}")
        print(f"Total execution time: {total_elapsed_formatted}")
        print("="*80)
        
        return result
    
    except Exception as e:
        logger.error(f"Error: {str(e)}")
        print(f"\nError occurred during refinement: {str(e)}")
        print("Check the log for more details.")
        return None
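
The page arithmetic used by the `all` option (ceiling division followed by slicing) can be isolated as below. `paginate` is a hypothetical helper shown for illustration; the notebook computes the indices inline.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], page_size: int) -> List[Sequence[T]]:
    """Split items into consecutive pages of at most page_size entries."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    total_pages = (len(items) + page_size - 1) // page_size  # ceiling division
    return [items[p * page_size:(p + 1) * page_size] for p in range(total_pages)]


pages = paginate(list(range(1, 46)), page_size=20)
print([len(page) for page in pages])  # [20, 20, 5]
```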

In [22]:
# Run the interactive version
run_refinement()


FHIR Requirements Refinement Tool

Most recent files:
1. claude_reqs_list_v1_20250723_134708.md (2025-07-23 13:47)
2. claude_reqs_list_v1_20250723_132400.md (2025-07-23 13:24)
3. claude_reqs_list_v1_20250722_083504.md (2025-07-22 08:35)
4. gemini_reqs_list_v1_20250722_082855.md (2025-07-22 08:28)
5. gpt_reqs_list_v1_20250722_081950.md (2025-07-22 08:19)
6. claude_reqs_list_v1_20250620_115908.md (2025-06-20 11:59)

Options:
- Select a number (1-10) to choose one of the following most recently generated files
- Enter 'all' to see all files
- Enter a full path to use a specific file

Select the API to use:
1. Claude
2. Gemini
3. GPT-4


2025-07-23 14:25:30,711 - __main__ - INFO - Starting requirements refinement with claude
2025-07-23 14:25:30,736 - __main__ - INFO - Sending requirements to claude for refinement...



Processing requirements with Claude...


2025-07-23 14:26:31,259 - anthropic._base_client - INFO - Retrying request to /v1/messages in 0.435537 seconds
2025-07-23 14:27:01,112 - httpx - INFO - HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
2025-07-23 14:27:01,117 - __main__ - INFO - Requirements refinement complete. Output saved to: /Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250723_142701.md
2025-07-23 14:27:01,118 - __main__ - INFO - Identified 15 requirements



Requirements Refinement Complete!
Input file: /Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/requirements_extraction/markdown/claude_reqs_list_v1_20250723_134708.md
Refined requirements saved to: /Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250723_142701.md
API used: claude
Number of requirements identified: 15
Total execution time: 0:01:39


{'input_file': '/Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/requirements_extraction/markdown/claude_reqs_list_v1_20250723_134708.md',
 'output_file': '/Users/ceadams/Documents/onclaive/onclaive/pipeline/checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250723_142701.md',
 'api_used': 'claude',
 'timestamp': '20250723_142701',
 'requirements_count': 15}