## FHIR Implementation Guide Testing Pipeline
This notebook runs a pipeline that extracts requirements from FHIR Implementation Guides (IGs) and generates executable test suites from them. The pipeline transforms IG documentation into structured test code that can validate FHIR server implementations.

#### Overview
This automated pipeline takes FHIR Implementation Guide documentation and produces comprehensive test suites through several integrated stages:

- Implementation Guide Preparation: Convert and clean IG HTML documentation to markdown format
- Requirements Extraction: Use AI to identify and extract testable requirements from the IG
- Requirements Refinement: Consolidate and refine the extracted requirements
- Requirements Downselection: Combine multiple requirement sets and remove duplicates
- Test Plan Generation: Convert requirements into detailed test specifications
- Test Kit Generation: Generate executable Inferno test code

#### Running this Notebook
The notebook is structured to run each stage sequentially. You can either:

- Run the complete pipeline: Execute all cells to process a complete IG
- Run individual stages: Execute specific sections as needed

Input and output directories can be customized for each step. The pipeline automatically saves intermediate outputs in checkpoint directories for review and iteration.

#### Output Structure
The pipeline generates organized outputs in checkpoint directories:

checkpoints/

├── markdown1/          # Converted markdown files

├── markdown2/          # Cleaned markdown files  

├── requirements_extraction/   # Initial AI-extracted requirements

├── revised_reqs_extraction/  # Refined requirements lists

├── requirements_downselect/  # Final consolidated requirements

├── testplan_generation/     # Detailed test specifications

└── testkit_generation/      # Executable Inferno test suites

Each stage preserves its outputs, allowing for iteration, review, and alternative processing paths.
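
If you want the checkpoint layout in place before running any stage, it can be created up front. This is a minimal sketch that assumes the directory names listed above; `ensure_checkpoint_dirs` is a hypothetical helper, not part of the pipeline modules, and the modules may well create their own output directories.

```python
from pathlib import Path

# Stage directories copied from the checkpoint tree described above.
CHECKPOINT_STAGES = [
    "markdown1",
    "markdown2",
    "requirements_extraction",
    "revised_reqs_extraction",
    "requirements_downselect",
    "testplan_generation",
    "testkit_generation",
]


def ensure_checkpoint_dirs(root="checkpoints"):
    """Create every stage directory under `root` and return the paths."""
    root_path = Path(root)
    created = []
    for stage in CHECKPOINT_STAGES:
        stage_dir = root_path / stage
        # exist_ok=True makes the call safe to repeat between runs.
        stage_dir.mkdir(parents=True, exist_ok=True)
        created.append(stage_dir)
    return created
```

Calling `ensure_checkpoint_dirs()` from the notebook's working directory leaves any existing outputs untouched.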

## Setup

#### Importing Notebooks as Modules (from the [Jupyter Notebook Documentation](https://jupyter-notebook.readthedocs.io/en/4.x/examples/Notebook/rstversions/Importing%20Notebooks.html))

In [5]:
import inspect
import json
import llm_utils
import importlib
import tiktoken
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse
from glob import glob

## Initializing LLM Clients

In [6]:
importlib.reload(llm_utils)
llm_clients = llm_utils.LLMApiClient()

In [7]:
llm_clients.clients

{'claude': <anthropic.Anthropic at 0x16907a9f0>,
 'gemini': genai.GenerativeModel(
     model_name='models/gemini-2.5-pro',
     generation_config={'max_output_tokens': 8192, 'temperature': 0.3},
     safety_settings={<HarmCategory.HARM_CATEGORY_HARASSMENT: 7>: <HarmBlockThreshold.BLOCK_NONE: 4>, <HarmCategory.HARM_CATEGORY_HATE_SPEECH: 8>: <HarmBlockThreshold.BLOCK_NONE: 4>, <HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: 9>: <HarmBlockThreshold.BLOCK_NONE: 4>, <HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: 10>: <HarmBlockThreshold.BLOCK_NONE: 4>},
     tools=None,
     system_instruction=None,
     cached_content=None
 ),
 'gpt': <openai.OpenAI at 0x10ce50d10>}

## Implementation Guide Preparation

### Stage 1: Text Extraction and Cleaning
- Converts HTML IG files to markdown format
- Cleans unnecessary content (navigation, headers, formatting artifacts)
- Prepares clean, structured text for AI processing

Inputs: HTML files from FHIR IG downloads

Outputs: Clean markdown files
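
To make the idea concrete, the sketch below shows one way to drop navigation and page chrome while keeping headings and paragraphs as markdown. It is not the implementation in `html_narrative_extractor` or `markdown_cleaner`. It uses only the standard library, and the set of skipped tags and the names `NarrativeExtractor` and `html_to_markdown` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumption: these tags hold navigation or page chrome, not IG narrative.
SKIPPED_TAGS = {"nav", "header", "footer", "script", "style"}
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class NarrativeExtractor(HTMLParser):
    """Collect headings and paragraphs as markdown lines, skipping page chrome."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._prefix = None   # markdown prefix of the block being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in HEADING_TAGS:
            self._prefix = "#" * int(tag[1]) + " "
            self._buffer = []
        elif self._skip_depth == 0 and tag == "p":
            self._prefix = ""
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif (self._skip_depth == 0 and self._prefix is not None
              and (tag in HEADING_TAGS or tag == "p")):
            # Collapse runs of whitespace left over from the HTML layout.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.lines.append(self._prefix + text)
            self._prefix = None

    def handle_data(self, data):
        if self._skip_depth == 0 and self._prefix is not None:
            self._buffer.append(data)


def html_to_markdown(html):
    """Return the headings and paragraphs of `html` as a markdown string."""
    parser = NarrativeExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines)
```

This sketch discards inline formatting, tables, and lists, all of which a real IG page relies on, so treat it as an outline of the approach only.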

#### 1a) HTML to Markdown Conversion

In [None]:
import html_narrative_extractor #import html extractor module

# Process directory with default settings
result = html_narrative_extractor.convert_local_html_to_markdown(
    "../us-core/test_set", #input directory of downloaded IG HTML files
    "checkpoints/markdown1_test/" #output directory
)

Found 1 HTML files to process
Processed 1/1 files
Conversion complete. Successfully processed 1 files. Encountered 0 errors.


#### 1b) Markdown Post-processing

In [None]:
import markdown_cleaner #import markdown cleaner module
markdown_cleaner.process_directory("checkpoints/markdown1_test", #input directory of IG markdown files
                                   "checkpoints/markdown2_test/") #output directory

Found 1 markdown files in checkpoints/markdown1_test
Cleaned and saved: checkpoints/markdown2_test/CapabilityStatement-us-core-server.md

Processing complete: 1 files successfully cleaned, 0 failed


## Stage 2: Requirements Extraction

### 2a) Prompt-based Requirement Extraction
LLM Requirements Identification
- Processes markdown files using LLM to extract clear, testable requirements
- Formats requirements according to set standards, following INCOSE guidance
- Generates structured requirements with IDs, descriptions, actors, and conformance levels
- Handles large documents through chunking
- Provides source tracking

Inputs: Cleaned IG markdown files

Outputs: Structured requirements list as markdown file
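
The chunking step can be pictured as packing heading-delimited sections into size-limited pieces. The run log for this stage mentions "dynamic sizing", whose details are not shown here, so the sketch below uses a fixed character budget instead. The function name `chunk_markdown` and the `max_chars` default are assumptions.

```python
import re


def chunk_markdown(text, max_chars=8000):
    """Pack heading-delimited sections into chunks of at most max_chars.

    A single section longer than max_chars is kept whole rather than split
    mid-section, so a chunk can exceed the budget in that case.
    """
    # Split before every markdown heading line, keeping each heading with its body.
    sections = [s for s in re.split(r"(?m)^(?=#{1,6} )", text) if s.strip()]
    chunks, current = [], ""
    for section in sections:
        if current and len(current) + len(section) > max_chars:
            chunks.append(current)
            current = ""
        current += section
    if current:
        chunks.append(current)
    return chunks
```

Splitting on headings keeps each requirement's surrounding context in the same chunk, which also makes it easier to trace a requirement back to its source section.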

In [26]:
import reqs_extraction #import LLM requirements extraction module

In [None]:
reqs_extraction.run_requirements_extractor(
    'checkpoints/markdown2_test',  # input directory of markdown files
    'checkpoints/requirements_extraction/us-core',  # output directory
    'claude',  # API type
    llm_clients)  # LLM client wrapper initialized above

INFO:root:Found markdown directory at checkpoints/markdown2_test
INFO:root:Found 1 markdown files
INFO:root:Processing with claude...
INFO:root:Starting processing with claude on directory: checkpoints/markdown2_test
INFO:root:Found 1 markdown files
INFO:root:Organized 1 files into 1 processing groups
INFO:root:Processing single file: CapabilityStatement-us-core-server.md
INFO:root:Split CapabilityStatement-us-core-server.md into 1 chunks using dynamic sizing
INFO:root:Processing chunk 1/1 of CapabilityStatement-us-core-server.md



Processing Implementation Guide with Claude...
This may take several minutes depending on the size of the Implementation Guide.


INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:root:Completed processing 1 files
INFO:root:Generated requirements document saved to checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_234511.md



Processing complete!
Generated requirements document: checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_234511.md
Processed 1 files


### 2b) Requirements Refinement
LLM-Based Requirements Review & Consolidation
- Filters and identifies only testable requirements from raw extractions
- Consolidates duplicate requirements and merges related ones
- Applies consistent formatting and structure
- Removes non-testable assertions and narrative content

Inputs: Raw requirements from extraction stage in markdown format

Outputs: Refined requirements list in markdown format

In [None]:
# import requirements refinement script as module
import reqs_reviewer
importlib.reload(reqs_reviewer)

<module 'reqs_reviewer' from '/Users/ceadams/Documents/onclaive/onclaive/pipeline/reqs_reviewer.py'>

#### Large Number of Requirements (500+)

In [None]:
result = reqs_reviewer.run_batch_requirements_refinement(
    input_file="checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_234511.md",  # input requirements list markdown file
    llm_client_instance=llm_clients,  # LLM client wrapper initialized above
    batch_size=50,  # requirements per batch
    api_type="claude"  # API type
)

INFO:root:Prompt environment set up at: /Users/ceadams/Documents/onclaive/onclaive/prompts


STARTING BATCH PROCESSING
Input: checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_234511.md
Output: checkpoints/revised_reqs_extraction
Batch size: 50 requirements
API: claude

File size: 31,830 characters
Splitting requirements...
Found 70 total requirements
Will process in 2 batches

BATCH 1/2
   Requirements: 50 (#1-#50)
   Size: 22,791 chars (~5,697 tokens)


INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"


   Completed in 66.3s
   Pausing 2s...
   Progress: 1/2 (50.0%)
   ETA: 1.1 minutes remaining

BATCH 2/2
   Requirements: 20 (#51-#70)
   Size: 9,037 chars (~2,259 tokens)


INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"


   Completed in 19.1s
   Progress: 2/2 (100.0%)
   ETA: 0.0 minutes remaining

COMBINING RESULTS
--------------------
Merging batch results and renumbering...
   Processing batch 1 results...
   Processing batch 2 results...
   Renumbered 70 requirements
BATCH PROCESSING COMPLETE!
Output saved: checkpoints/revised_reqs_extraction/claude_refined_requirements_batch_20250826_234745.md
Original requirements: 70
Final requirements: 70
Successful batches: 2/2
Failed batches: 0/2
Total time: 1.5 minutes
Average per batch: 43.7 seconds
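
The batch count and token figures in the log above are consistent with simple arithmetic: 70 requirements at a batch size of 50 give 2 batches, and each "~tokens" value matches the character count integer-divided by 4. The sketch below reproduces those numbers. The 4-characters-per-token heuristic is an assumption, since the actual logic inside `reqs_reviewer` is not shown here, and `estimate_tokens` and `split_into_batches` are hypothetical names.

```python
def estimate_tokens(char_count, chars_per_token=4):
    """Rough token estimate; 4 characters per token is an assumption."""
    return char_count // chars_per_token


def split_into_batches(items, batch_size):
    """Return consecutive slices of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


requirements = [f"REQ-{n:03d}" for n in range(1, 71)]  # 70 placeholder IDs
batches = split_into_batches(requirements, batch_size=50)

print(len(batches), [len(b) for b in batches])          # 2 [50, 20]
print(estimate_tokens(22_791), estimate_tokens(9_037))  # 5697 2259
```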


#### Small Number of Requirements (<500)

In [14]:
# Manual refinement with specific file
refinement_result = reqs_reviewer.refine_requirements(
    input_file="checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_084443.md",
    api_type="claude",
    output_dir="checkpoints/revised_reqs_extraction",
    llm_client_instance=llm_clients
)

print(f"✅ Refined requirements saved to: {refinement_result['output_file']}")
print(f"📊 {refinement_result['original_requirements_count']} → {refinement_result['requirements_count']} requirements")

if refinement_result.get('warnings'):
    print("⚠️  Warnings:")
    for warning in refinement_result['warnings']:
        print(f"  - {warning}")

INFO:root:Prompt environment set up at: /Users/ceadams/Documents/onclaive/onclaive/prompts
INFO:root:Starting requirements refinement with claude
INFO:root:Original requirements count: 67
INFO:root:Input size: 30785 characters, ~7696 tokens
INFO:root:Sending requirements to claude for refinement...
INFO:root:Estimated input tokens: 8307
INFO:root:API input limit: 180000
INFO:root:API output limit: 8192
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:root:Requirements refinement complete. Output saved to: checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250826_085415.md
INFO:root:Refined 67 -> 66 requirements
INFO:root:Output size: 29582 characters, ~7395 tokens


✅ Refined requirements saved to: checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250826_085415.md
📊 67 → 66 requirements


### 2c) Requirements Downselection
- Combines multiple requirements lists from different extraction runs
- Uses semantic similarity analysis to identify and remove duplicates
- Creates a deduplicated final requirements set

Inputs: Multiple refined requirements files in markdown or JSON format

Outputs: Final consolidated requirements in markdown or JSON format
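
The deduplication idea can be sketched as a greedy filter: embed each requirement, then keep it only if it is not too similar to anything already kept. The run log in this section shows a SentenceTransformer model (`all-mpnet-base-v2`) being loaded. The sketch below replaces that model with a toy word-count vector so that it runs on the standard library alone. The 0.9 threshold, the greedy strategy, and all function names are assumptions rather than the behavior of `requirement_downselect`.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def deduplicate(requirements, threshold=0.9):
    """Keep a requirement only if it is not too similar to one already kept."""
    kept, kept_vectors = [], []
    for requirement in requirements:
        vector = embed(requirement)
        if all(cosine_similarity(vector, other) < threshold
               for other in kept_vectors):
            kept.append(requirement)
            kept_vectors.append(vector)
    return kept
```

With a real sentence-embedding model, paraphrased requirements would also score as similar. The word-count stand-in only catches near-verbatim repeats.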

In [19]:
import requirement_downselect
importlib.reload(requirement_downselect)
requirement_downselect.full_pass(
    md_files=["checkpoints/requirements_extraction/us-core/claude_reqs_list_v1_20250826_084443.md", "checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250826_085415.md"]
    #rag_files=["checkpoints/requirements_extraction/RAG/raw_output.json"]
    )

  headings = list(re.finditer("\*\*\w+\*\*\:", split))
INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: mps
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: all-mpnet-base-v2


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

50 of 133

100 of 133

Final number of requirements: 62
Output saved in Markdown format in directory: checkpoints/requirements_downselect


## Stage 3: Test Plan Generation
- Transforms requirements into detailed test specifications
- Analyzes IG capability statements for context
- Generates implementation strategies with specific FHIR operations
- Creates structured test plans with validation criteria

Inputs: Refined requirements and IG capability statements in markdown format

Outputs: Detailed test plan in markdown format

In [21]:
import logging
llm_clients.logger.setLevel(logging.INFO)

##### Without RAG (Faster)

In [25]:
import req_to_testplan
importlib.reload(req_to_testplan)

result = req_to_testplan.generate_consolidated_test_plan(
    llm_clients,
    'claude',  # 'claude' or 'gemini' or 'gpt'
    llm_clients.logger,
    "checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250826_085415.md",
    "../us-core/test_set/CapabilityStatement-us-core-server.html",
    "US Core IG",
    output_dir='checkpoints/testplan_generation/us-core'
)

print(f"Generated test plan with improved capability parsing: {result['test_plan_path']}")

INFO:root:Prompt environment set up at: /Users/ceadams/Documents/onclaive/onclaive/prompts
INFO:llm_utils:Starting test plan generation with claude for US Core IG
INFO:llm_utils:Parsed 66 requirements from checkpoints/revised_reqs_extraction/claude_reqs_list_v2_20250826_085415.md
INFO:llm_utils:Parsed capability statement from ../us-core/test_set/CapabilityStatement-us-core-server.html
INFO:llm_utils:Identifying group for requirement REQ-01 using claude...
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:llm_utils:Identifying group for requirement REQ-02 using claude...
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:llm_utils:Identifying group for requirement REQ-03 using claude...
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
INFO:llm_utils:Identifying group for requirement REQ-04 using claude...
INFO:httpx:HTTP Request: POST https://api.anthropic.com/v1/messag

Generated test plan with improved capability parsing: checkpoints/testplan_generation/us-core/claude_test_plan_20250826_230821.md


##### With RAG

In [None]:
import logging
import urllib3

# Suppress InsecureRequestWarning messages from unverified HTTPS requests
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Raise logging thresholds to reduce noise
logging.getLogger("urllib3.connectionpool").setLevel(logging.ERROR)
logging.getLogger("backoff").setLevel(logging.ERROR)

import req_to_testplan_rag  # RAG-based test plan generation module
importlib.reload(req_to_testplan_rag)

# Clear any existing capability statements from the vector database
req_to_testplan_rag.clear_capability_collection("capability_statements")

No existing collection found: capability_statements


In [10]:
req_to_testplan_rag.generate_consolidated_test_plan(
    llm_clients, 
    'claude', #api
    llm_clients.logger, 
    "checkpoints/revised_reqs_extraction/claude_refined_requirements_batch_20250826_234745.md", #input requirements list markdown file
    "checkpoints/markdown2/CapabilityStatement-us-core-server.md", #capability statement
    "US Core IG", #name of IG
    output_dir='checkpoints/testplan_generation/us-core', #output directory
    verbose=True)

ERROR:backoff:Giving up send_request(...) after 4 tries (requests.exceptions.SSLError: HTTPSConnectionPool(host='us.i.posthog.com', port=443): Max retries exceeded with url: /batch/ (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)'))))



Processing Group: AllergyIntolerance (5 requirements)

Processing REQ-008: Server SHALL support search-type and read interactions for AllergyIntolerance

RAG RETRIEVAL FOR REQ-008
Query: Server SHALL support search-type and read interactions for AllergyIntolerance "**SHALL** support `search-type`, `read`." Required interactions for AllergyIntolerance resource in US Core Server US Core Server FHIR SHALL
Searching for 2 most relevant capability chunks...

Found 2 matching chunks:

  Match 1 (distance: 0.532333493232727):
  Length: 90628 chars
  Preview: FHIR Resource Detail: #### 14.3.3.2 AllergyIntolerance

Conformance Expectation: **SHALL**

Supporte...
  ...d read/write formats for notes on the server.

---

  Match 2 (distance: 0.7728408575057983):
  Length: 1452 chars
  Preview: FHIR Major Section: ## 14.3 CapabilityStatement: US Core Server CapabilityStatement

|  |  |  |  |  ...
  ...pi.json) | [Download](us-core-server.openapi.json)

Sending request to CLAUDE API...



Completed test specification for REQ-008

Processing REQ-009: Server SHALL support returning AllergyIntolerance resource by ID

RAG RETRIEVAL FOR REQ-009
Query: Server SHALL support returning AllergyIntolerance resource by ID "A Server **SHALL** be capable of returning a AllergyIntolerance resource using: `GET [base]/AllergyIntolerance/[id]`" Required capability for retrieving specific AllergyIntolerance resources US Core Server FHIR SHALL
Searching for 2 most relevant capability chunks...

Found 2 matching chunks:

  Match 1 (distance: 0.4979751706123352):
  Length: 90628 chars
  Preview: FHIR Resource Detail: #### 14.3.3.2 AllergyIntolerance

Conformance Expectation: **SHALL**

Supporte...
  ...d read/write formats for notes on the server.

---

  Match 2 (distance: 0.8217650055885315):
  Length: 2229 chars
  Preview: FHIR Resource/Component: ### 14.3.2 FHIR RESTful Capabilities

The US Core Server **SHALL**:

1. Sup...
  ... **MAY** support the `history-system` interaction.

Sending

{'requirements_count': 71,
 'group_count': 29,
 'test_plan_path': 'checkpoints/testplan_generation/us-core/claude_test_plan_20250827_083723.md'}
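
The retrieval step in the log above ("Searching for 2 most relevant capability chunks", each match reported with a distance) amounts to a nearest-neighbor lookup over embedded chunks. The sketch below shows that lookup with plain Python lists. The log does not state which distance metric is used, so cosine distance is an assumption, as are the names `cosine_distance` and `top_k_chunks` and the toy vectors in the usage note.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k_chunks(query_vector, chunks, k=2):
    """Return the k (distance, text) pairs closest to the query, nearest first.

    `chunks` is a list of (text, vector) pairs.
    """
    scored = [(cosine_distance(query_vector, vec), text) for text, vec in chunks]
    return sorted(scored, key=lambda pair: pair[0])[:k]
```

For example, `top_k_chunks([1.0, 0.1], [("A", [1.0, 0.0]), ("B", [1.0, 1.0]), ("C", [0.0, 1.0])])` returns the entries for "A" and then "B", since a smaller distance means a closer match.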

## Stage 4: Test Kit Generation
- Converts test specifications into executable Inferno Ruby tests
- Generates complete test suites with proper file organization
- Creates modular test structures following Inferno framework patterns
- Includes validation and alignment checking

Inputs: Test plan specification in markdown format

Outputs: Complete Inferno test kit
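
For orientation, the sketch below renders the general shape of an Inferno test class (a Ruby class inheriting from `Inferno::Test` with an `id`, a `title`, and a `run` block) from a requirement ID and title. It is not how `plan_to_tests` produces its output. The helper names, the `GeneratedTestKit` module name, and the placeholder `skip` body are all assumptions made for illustration.

```python
import re


def to_snake_case(text):
    """Lowercase `text` and join its alphanumeric runs with underscores."""
    return "_".join(re.findall(r"[a-z0-9]+", text.lower()))


def to_class_name(text):
    """Build a CamelCase name from the alphanumeric runs in `text`."""
    return "".join(part.capitalize() for part in re.findall(r"[A-Za-z0-9]+", text))


def render_inferno_test(req_id, title, module_name="GeneratedTestKit"):
    """Render a minimal Inferno test class as Ruby source text."""
    safe_title = title.replace("'", "\\'")  # escape single quotes for Ruby
    return (
        f"module {module_name}\n"
        f"  class {to_class_name(req_id)}Test < Inferno::Test\n"
        f"    id :{to_snake_case(req_id)}_test\n"
        f"    title '{safe_title}'\n"
        f"\n"
        f"    run do\n"
        f"      skip 'Generated placeholder: assertions not implemented yet'\n"
        f"    end\n"
        f"  end\n"
        f"end\n"
    )
```

A generated test body would replace the `skip` line with FHIR requests and assertions derived from the corresponding test plan entry.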

In [None]:
import plan_to_tests
import importlib
importlib.reload(plan_to_tests)

plan_to_tests.generate_inferno_test_kit(
    llm_clients,  # LLM client wrapper initialized above
    'claude',  # API type
    'checkpoints/testplan_generation/us-core/claude_test_plan_20250827_083723.md',  # input test plan file
    #'../test_kit_dev/inferno-guidance.md',  # not used right now
    output_dir='checkpoints/testkit_generation/us_core/test',  # output directory
    expected_actors=["Server", "Client"]  # not used right now
)

  1. Each 'require_relative' statement should reference the actual file path INCLUDING the module folder


Found 70 requirements with updated pattern
Processing requirement: REQ-008
Added requirement REQ-008 to section AllergyIntolerance
Processing requirement: REQ-009
Added requirement REQ-009 to section AllergyIntolerance
Processing requirement: REQ-010
Added requirement REQ-010 to actor-based section US Core Server
Processing requirement: REQ-011
Added requirement REQ-011 to actor-based section US Core Client
Processing requirement: REQ-012
Added requirement REQ-012 to actor-based section US Core Server
Processing requirement: REQ-001
Added requirement REQ-001 to section Capability Statement
Processing requirement: REQ-004
Added requirement REQ-004 to section Capability Statement
Processing requirement: REQ-005
Added requirement REQ-005 to section Capability Statement
Processing requirement: REQ-013
Added requirement REQ-013 to section CarePlan
Processing requirement: REQ-014
Added requirement REQ-014 to section CarePlan
Processing requirement: REQ-015
Added requirement REQ-015 to sectio

