# IG to Test Kit FULL pipeline

## Setup

### Importing Notebooks as Modules (from the [Jupyter Notebook Documentation](https://jupyter-notebook.readthedocs.io/en/4.x/examples/Notebook/rstversions/Importing%20Notebooks.html))

In [12]:
import inspect
import json
import llm_utils
import importlib

## Initializing LLM Clients

In [13]:
importlib.reload(llm_utils)
llm_clients = llm_utils.LLMApiClient()

In [14]:
llm_clients.clients

{'claude': <anthropic.Anthropic at 0x12fb732c0>,
 'gemini': genai.GenerativeModel(
     model_name='models/gemini-2.5-pro-preview-03-25',
     generation_config={'max_output_tokens': 8192, 'temperature': 0.3},
     safety_settings={},
     tools=None,
     system_instruction=None,
     cached_content=None
 ),
 'gpt': <openai.OpenAI at 0x12beeb470>}

## Text Extraction

### HTML to Markdown Conversion Using Markdownify (LangChain Tool)

In [4]:
import HTML_extractor

USER_AGENT environment variable not set, consider setting it to identify your requests.
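
The warning above appears to come from LangChain's web-loading code, which checks for a `USER_AGENT` environment variable. A minimal sketch of one way to address it, assuming the variable is read when the module is first imported; the agent string is a hypothetical placeholder, not a value from this project:

```python
import os

# Set USER_AGENT before importing HTML_extractor so the library can pick it up.
# "ig-to-testkit-notebook/0.1" is a hypothetical placeholder; use a string that
# identifies your own requests. setdefault keeps any value that is already set.
os.environ.setdefault("USER_AGENT", "ig-to-testkit-notebook/0.1")

print(os.environ["USER_AGENT"])
```

Running this in a cell before `import HTML_extractor` should be enough if the assumption holds; if the module was already imported, a kernel restart may be needed.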


In [6]:
urls = [
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/index.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/ChangeHistory.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/examples.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/implementation.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/profiles.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/artifacts.html",
    "https://hl7.org/fhir/us/davinci-pdex-plan-net/CapabilityStatement-plan-net.html"
]

HTML_extractor.convert_urls_to_markdown(urls, output_dir="checkpoints/text_extraction/PlanNet/site/markdown")

Fetching pages: 100%|##########| 7/7 [00:00<00:00,  7.93it/s]


Created: index.md
Created: ChangeHistory.md
Created: examples.md
Created: implementation.md
Created: profiles.md
Created: artifacts.md
Created: CapabilityStatement_plan_net.md


### Markdown Post-processing

In [7]:
import markdown_cleaner
markdown_cleaner.process_directory("checkpoints/text_extraction/PlanNet/site/markdown", "checkpoints/post_processing/")

Found 7 markdown files in checkpoints/text_extraction/PlanNet/site/markdown
Cleaned and saved: checkpoints/post_processing/implementation.md
Cleaned and saved: checkpoints/post_processing/examples.md
Cleaned and saved: checkpoints/post_processing/profiles.md
Cleaned and saved: checkpoints/post_processing/ChangeHistory.md
Cleaned and saved: checkpoints/post_processing/artifacts.md
Cleaned and saved: checkpoints/post_processing/index.md
Cleaned and saved: checkpoints/post_processing/CapabilityStatement_plan_net.md

Processing complete: 7 files successfully cleaned, 0 failed


## Requirements Extraction

### Prompt-based Requirement Extraction

In [18]:
import reqs_extraction
importlib.reload(reqs_extraction)


In [None]:
reqs_extraction.run_requirements_extractor(
    'checkpoints/post_processing', 
    'checkpoints/requirements_extraction/markdown', 
    'claude', 
    llm_clients)

### RAG-based Requirement Extraction

This requirement extraction method differs from the prompt-based one in how it builds its prompt. It runs a semantic search over a database of example FHIR IG text sections, each paired with the human-written requirements derived from it, to find the section(s) most similar to the input text. Those matched sections and their associated requirements are then supplied to the LLM as few-shot examples.
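
The retrieval step described above can be sketched roughly as follows. This is an illustration only, not the code in `rag_reqs_extraction`: it uses a standard-library bag-of-words cosine similarity as a stand-in for a real embedding model, and the names `EXAMPLES`, `top_k_examples`, and `build_few_shot_prompt`, as well as the sample texts, are made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical example store: IG text sections paired with human-written requirements.
EXAMPLES = [
    {
        "ig_text": "Servers SHALL support the search parameters listed in the capability statement.",
        "requirements": ["Server supports every search parameter marked SHALL."],
    },
    {
        "ig_text": "Clients SHALL NOT include patient identifiers when querying the directory.",
        "requirements": ["Client omits patient identifiers from directory queries."],
    },
    {
        "ig_text": "The change history lists updates made in each published version.",
        "requirements": [],
    },
]

def vectorize(text):
    """Lowercased bag-of-words counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k_examples(query, examples, k=1):
    """Return the k stored examples whose IG text is most similar to the query."""
    q = vectorize(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, vectorize(ex["ig_text"])), reverse=True)
    return ranked[:k]

def build_few_shot_prompt(query, examples, k=1):
    """Place the retrieved (IG text, requirements) pairs ahead of the new IG text."""
    parts = []
    for ex in top_k_examples(query, examples, k):
        reqs = "\n".join(f"- {r}" for r in ex["requirements"]) or "- (none)"
        parts.append(f"IG text:\n{ex['ig_text']}\nRequirements:\n{reqs}")
    parts.append(f"IG text:\n{query}\nRequirements:")
    return "\n\n".join(parts)

QUERY = "A server SHALL support all search parameters defined for the Practitioner resource."
prompt = build_few_shot_prompt(QUERY, EXAMPLES)
print(prompt)
```

In a real pipeline the similarity function would typically be replaced by embeddings and a vector store; the prompt-assembly step stays the same in spirit.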

In [3]:
import rag_reqs_extraction

In [None]:
importlib.reload(rag_reqs_extraction)
rag_reqs_extraction.full_pass(llm_clients, 'claude', "checkpoints/post_processing/", "checkpoints/requirements_extraction/RAG")

2 of 2

## Requirement Downselection

In [24]:
import requirement_downselect
importlib.reload(requirement_downselect)
requirement_downselect.full_pass(
    md_files=["checkpoints/requirements_extraction/claude_reqs_list_v1_20250429_081756.md"],
    rag_files=["checkpoints/requirements_extraction/RAG/plan_net_reqs.json"]
    )

Pair 53824 of 53824

## Test Plan Generation

In [31]:
import logging
llm_clients.logger.setLevel(logging.INFO)

In [10]:
import req_to_testplan
importlib.reload(req_to_testplan)

req_to_testplan.generate_consolidated_test_plan(
    llm_clients, 
    'claude', 
    llm_clients.logger, 
    "/Users/ceadams/Documents/onclaive/onclaive/reqs_extraction/revised_reqs_output/plan-net-requirements.md", 
    "../full-ig/markdown7_cleaned/CapabilityStatement_plan_net.md", 
    "Plan-Net IG",
    output_dir='checkpoints/testplan_generation')

2025-06-20 11:48:36,696 - root - INFO - Prompt environment set up at: /Users/ceadams/Documents/onclaive/onclaive/prompts
2025-06-20 11:48:36,697 - llm_utils - INFO - Starting test plan generation with claude for Plan-Net IG
2025-06-20 11:48:36,698 - llm_utils - ERROR - Error processing requirements: Expecting value: line 1 column 1 (char 0)


JSONDecodeError: Expecting value: line 1 column 1 (char 0)
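
`Expecting value: line 1 column 1 (char 0)` is the message Python's `json` module raises when the input does not begin with a JSON value, for example when it is empty or is Markdown text. The log does not show which input was being parsed, so the sketch below only illustrates a general way to fail with a more informative message. `load_json_file` is a hypothetical helper, not part of `req_to_testplan`.

```python
import json
import tempfile
from pathlib import Path

def load_json_file(path):
    """Parse a JSON file; on failure, name the file and show how it starts."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        preview = text[:40].replace("\n", " ")
        raise ValueError(
            f"{path} is not valid JSON ({err.msg}); starts with: {preview!r}"
        ) from err

# Demonstration: a temporary Markdown file stands in for a non-JSON input.
with tempfile.TemporaryDirectory() as tmp:
    md_path = Path(tmp) / "requirements.md"
    md_path.write_text("# Requirements\n- REQ-01\n", encoding="utf-8")
    try:
        load_json_file(md_path)
    except ValueError as err:
        message = str(err)

print(message)
```

Seeing the first characters of the offending input usually makes it clear whether the wrong file type was passed or the content was empty.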

## Test Kit Generation

In [8]:
import plan_to_tests
importlib.reload(plan_to_tests)

plan_to_tests.generate_inferno_test_kit(
    llm_clients,
    'claude',
    '/Users/ceadams/Documents/onclaive/onclaive/test_kit_dev/test_plan_output/example_claude_test_plan_20250416_143409.md',
    #'../test_kit_dev/inferno-guidance.md',
    output_dir='checkpoints/testkit_generation',
    expected_actors=["Health Plan API Actor", "Application Actor"]
)

2025-06-19 15:10:47,505 - plan_to_tests - INFO - Starting Inferno test generation with claude for PlanNet
2025-06-19 15:10:47,516 - plan_to_tests - INFO - Parsed test plan into 11 sections
2025-06-19 15:10:47,516 - plan_to_tests - INFO - Found 11 total requirements
2025-06-19 15:10:47,518 - plan_to_tests - INFO - Loaded Inferno DSL guidance
2025-06-19 15:10:47,518 - plan_to_tests - INFO - Processing section: Application-Level Requirements with 1 requirements
2025-06-19 15:10:47,518 - plan_to_tests - INFO - Generating tests for section: Application-Level Requirements
2025-06-19 15:10:47,519 - plan_to_tests - INFO - Generating test for requirement: REQ-08
2025-06-19 15:10:47,520 - plan_to_tests - INFO - Requirement REQ-08: Sending 989 tokens to claude API (limit: 16000)


Found 11 potential requirements
Processing requirement: REQ-08
Added requirement REQ-08 to section Application-Level Requirements
Processing requirement: REQ-01
Added requirement REQ-01 to section Authentication
Processing requirement: REQ-09
Added requirement REQ-09 to section Base Requirements
Processing requirement: REQ-07
Added requirement REQ-07 to section CORE Conformance
Processing requirement: REQ-06
Added requirement REQ-06 to section Cross-Resource
Processing requirement: REQ-04
Added requirement REQ-04 to section General Requirements
Processing requirement: REQ-05
Added requirement REQ-05 to section Global
Processing requirement: REQ-11
Added requirement REQ-11 to section OrganizationAffiliation
Processing requirement: REQ-02
Added requirement REQ-02 to section Plan-Net API Security
Processing requirement: REQ-10
Added requirement REQ-10 to section PractitionerRole
Processing requirement: REQ-03
Added requirement REQ-03 to section Security


2025-06-19 15:10:58,370 - httpx - INFO - HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
2025-06-19 15:10:58,371 - plan_to_tests - INFO - Successfully generated test for requirement: REQ-08
2025-06-19 15:10:58,372 - plan_to_tests - INFO - Validating test for requirement: REQ-08
2025-06-19 15:10:58,372 - plan_to_tests - INFO - Validation for test: Sending 1528 tokens to claude API
2025-06-19 15:11:07,548 - httpx - INFO - HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
2025-06-19 15:11:07,552 - plan_to_tests - INFO - Successfully validated test for requirement: REQ-08
2025-06-19 15:11:10,556 - plan_to_tests - INFO - Processing section: Authentication with 1 requirements
2025-06-19 15:11:10,558 - plan_to_tests - INFO - Generating tests for section: Authentication
2025-06-19 15:11:10,562 - plan_to_tests - INFO - Generating test for requirement: REQ-01
2025-06-19 15:11:10,565 - plan_to_tests - INFO - Requirement REQ-01: Sending 983 token

{'total_sections': 11,
 'total_requirements': 11,
 'generated_tests': 11,
 'module_dir': 'checkpoints/testkit_generation/plannet',
 'module_file': 'checkpoints/testkit_generation/plannet.rb'}