The idea is to focus only on the generative portion of the model and to test new models as they become available. We will start by testing Marco-o1.

In [1]:
import json
from typing import Any, Dict, List, Optional, Tuple
import requests

In [2]:
def extract_examples(data: List[Dict], index: int) -> Tuple[str, List[Tuple[str, str]]]:
    """
    Extract the base finding and reference examples from the evaluation results.
    
    Args:
        data: List of dictionaries containing the evaluation results
        index: Index of the item to extract
        
    Returns:
        Tuple containing:
            - Base finding string
            - List of tuples, each containing (finding, conclusion) pairs from reference examples
    """
    item = data[index]
    base_finding = item['finding']
    
    reference_examples = [
        (example['finding'], example['conclusion'])
        for example in item['reference_examples']
    ]
    
    return base_finding, reference_examples

def create_prompt(
    base_finding: str,
    reference_examples: List[Tuple[str, str]],
    base_prompt: Optional[str] = None
) -> str:
    """
    Create a formatted prompt combining the base finding and reference examples.
    
    Args:
        base_finding: The primary finding to generate conclusions for
        reference_examples: List of (finding, conclusion) pairs to use as examples
        base_prompt: Optional custom instructions for the prompt
        
    Returns:
        Formatted prompt string
    """
    if base_prompt is None:
        base_prompt = """Please generate the "Conclusions" section of a medical report based on the provided "Findings" section. 
Below are several example pairs of "Findings" and their corresponding "Conclusions" to help guide the format and style of your response.

After reviewing these examples, please generate an appropriate conclusion for the new finding provided at the end."""

    # Format reference examples
    examples_text = "\n\n".join([
        f"Example {i+1}:\n"
        f"Findings:\n{finding}\n\n"
        f"Conclusion:\n{conclusion}"
        for i, (finding, conclusion) in enumerate(reference_examples)
    ])
    
    # Combine all parts
    full_prompt = f"""{base_prompt}

{examples_text}

New Finding to generate conclusion for:
{base_finding}"""
    
    return full_prompt

In [3]:
with open("evaluation_results.json", "r") as f:
    data = json.load(f)
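
Before indexing into `data`, it can help to confirm that every record has the shape `extract_examples` relies on. The following is a minimal sketch, not part of the original notebook: `find_malformed_records` is a hypothetical helper, and the assumption that `finding` is a string is inferred from the output shown further down rather than from any schema.

```python
from typing import Any, Dict, List


def find_malformed_records(data: List[Dict[str, Any]]) -> List[int]:
    """Return indices of records missing the keys extract_examples reads."""
    bad: List[int] = []
    for i, item in enumerate(data):
        examples = item.get("reference_examples")
        ok = (
            # Assumption: the base finding is stored as a plain string
            isinstance(item.get("finding"), str)
            and isinstance(examples, list)
            and all(
                isinstance(ex, dict) and "finding" in ex and "conclusion" in ex
                for ex in examples
            )
        )
        if not ok:
            bad.append(i)
    return bad


# Toy records for illustration only (not taken from evaluation_results.json)
sample = [
    {"finding": "A", "reference_examples": [{"finding": "B", "conclusion": "C"}]},
    {"finding": "A", "reference_examples": [{"finding": "B"}]},  # no conclusion
    {"reference_examples": []},  # no base finding
]
print(find_malformed_records(sample))  # → [1, 2]
```

In the notebook, `find_malformed_records(data)` returning an empty list would suggest that any index can be passed to `extract_examples` safely.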

In [4]:
base_finding, reference_examples = extract_examples(data, 0)

In [5]:
base_finding

'Orthogonal view whole body images dated May 11, 2023 are provided for review (total of 2 images). The cardiac silhouette is generally broad on both images, measuring above the limits of normal. This is most notable in the apex on the ventrodorsal image, with a broad-point of contact with the diaphragm. No distinct chamber enlargement is identified. No aortic bulge is noted. No pulmonary vasculature dilation is identified. The lungs are well-inflated and normal in appearance. No esophageal dilation is noted. The trachea is normal in diameter and position. The stomach and small intestine contain moderate granular mixed mineral and opaque ingesta and are within normal limits for a juvenile. The hepatic silhouette is within normal limits for a juvenile patient. The visible abdomen is normal. The surrounding osseous structures are normal.'

In [6]:
prompt = create_prompt(base_finding, reference_examples)
print(prompt)

Please generate the "Conclusions" section of a medical report based on the provided "Findings" section. 
Below are several example pairs of "Findings" and their corresponding "Conclusions" to help guide the format and style of your response.

After reviewing these examples, please generate an appropriate conclusion for the new finding provided at the end.

Example 1:
Findings:
Orthogonal whole body images dated May 30, 2024 are provided for review (total of 2 images). The three images dated May 22, 2021 are not evaluated. The cardiac silhouette margins are generally rounded. There is right-sided axial rotation on the ventrodorsal image, inhibiting evaluation in the region of the left atrium and left auricle. The axial rotation also causes artifactual more severe rounding of the right side of the cardiac silhouette. The visible cardiac silhouette margins are otherwise normal. The pulmonary vasculature are normal in size and margin. A longitudinal soft tissue opacity overlies the dorsal 

In [7]:
def run_ollama_inference(
    prompt: str,
    model: str = "mistral",
    system_prompt: Optional[str] = None,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None
) -> str:
    """
    Run inference using an Ollama model.
    
    Args:
        prompt: The input text to process
        model: Name of the Ollama model to use (default: "mistral")
        system_prompt: Optional system prompt to set context
        temperature: Sampling temperature (default: 0.7)
        max_tokens: Maximum tokens to generate (optional)
        
    Returns:
        Generated text response
        
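    Raises:
        RuntimeError: If the API call fails
    """
    url = "http://localhost:11434/api/generate"
    
    # Ollama reads sampling parameters from the nested "options" object;
    # a top-level "temperature" or "max_tokens" key is not applied.
    options: Dict[str, Any] = {"temperature": temperature}
    if max_tokens is not None:
        # Ollama's name for the generation length limit is "num_predict"
        options["num_predict"] = max_tokens

    payload: Dict[str, Any] = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": options
    }
    
    if system_prompt:
        payload["system"] = system_prompt

    try:
        response = requests.post(url, json=payload)
        response.raise_for_status()
        return response.json()["response"]
    except requests.exceptions.RequestException as e:
        # Chain the original exception so the underlying HTTP error stays visible
        raise RuntimeError(f"Failed to get response from Ollama: {e}") from e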

In [8]:
!ollama list

NAME                    ID              SIZE      MODIFIED          
opencoder:latest        cd882db52297    4.7 GB    About an hour ago    
smollm2:latest          cef4a1e09247    1.8 GB    About an hour ago    
marco-o1:latest         007603b83a96    4.7 GB    2 days ago           
qwen2.5:latest          845dbda0ea48    4.7 GB    3 weeks ago          
gemma2:latest           ff02c3702f32    5.4 GB    3 weeks ago          
qwen2.5-coder:latest    4a26c19c376e    4.7 GB    5 weeks ago          
internlm2:latest        5050e36678ab    4.5 GB    4 months ago         
llama3:latest           365c0bd3c000    4.7 GB    4 months ago         
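
The same listing is available programmatically from Ollama's `/api/tags` endpoint, which makes it possible to check that a model is installed before requesting inference. The sketch below only parses a response dictionary; `installed_model_names` and `is_installed` are hypothetical helpers, and the sample response is a hand-written subset mirroring the listing above.

```python
from typing import Any, Dict, List


def installed_model_names(tags_response: Dict[str, Any]) -> List[str]:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_response.get("models", [])]


def is_installed(model: str, names: List[str]) -> bool:
    """Check a model name, treating a missing tag as ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


# Hand-written sample; a live response could be fetched with
# requests.get("http://localhost:11434/api/tags").json()
sample_tags = {
    "models": [
        {"name": "smollm2:latest"},
        {"name": "marco-o1:latest"},
        {"name": "qwen2.5:latest"},
    ]
}

names = installed_model_names(sample_tags)
print(is_installed("marco-o1", names))  # → True
print(is_installed("qwq", names))       # → False
```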


In [9]:
system_prompt = "You are an AI radiology assistant, helping to generate conclusions from findings in radiology reports. Please make sure that all output is in English."


In [None]:
# Each call overwrites `response`, so only the last uncommented call's output is kept.
# response = run_ollama_inference(prompt=prompt, model='qwen2.5', system_prompt=system_prompt)
# response = run_ollama_inference(prompt=prompt, model='gemma2', system_prompt=system_prompt)
response = run_ollama_inference(prompt=prompt, model='smollm2', system_prompt=system_prompt)
# Note: 'qwq' does not appear in the `ollama list` output above, so it may need to be pulled first.
response = run_ollama_inference(prompt=prompt, model='qwq', system_prompt=system_prompt)
# response = run_ollama_inference(prompt=prompt, model='marco-o1', system_prompt=system_prompt) # This one doesn't seem to work well
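
Because each assignment to `response` replaces the previous one, comparing models side by side is easier when the outputs are keyed by model name. This is a minimal sketch under that assumption: `collect_responses` is a hypothetical helper that accepts any inference callable, and `fake_infer` stands in for a real Ollama call so the example runs without a server.

```python
from typing import Callable, Dict, List


def collect_responses(
    prompt: str,
    models: List[str],
    infer: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run the same prompt through each model and key the outputs by model name."""
    results: Dict[str, str] = {}
    for model in models:
        try:
            results[model] = infer(prompt, model)
        except Exception as exc:
            # Record the failure and continue with the remaining models
            results[model] = f"ERROR: {exc}"
    return results


def fake_infer(prompt: str, model: str) -> str:
    """Stand-in for a real inference call; fails for one model name."""
    if model == "missing-model":
        raise RuntimeError("model not found")
    return f"[{model}] conclusion for: {prompt}"


results = collect_responses("Findings text", ["smollm2", "missing-model"], fake_infer)
for name, text in results.items():
    print(name, "->", text)
```

In the notebook, the callable could be `lambda p, m: run_ollama_inference(prompt=p, model=m, system_prompt=system_prompt)`.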

In [12]:
base_finding

'Orthogonal view whole body images dated May 11, 2023 are provided for review (total of 2 images). The cardiac silhouette is generally broad on both images, measuring above the limits of normal. This is most notable in the apex on the ventrodorsal image, with a broad-point of contact with the diaphragm. No distinct chamber enlargement is identified. No aortic bulge is noted. No pulmonary vasculature dilation is identified. The lungs are well-inflated and normal in appearance. No esophageal dilation is noted. The trachea is normal in diameter and position. The stomach and small intestine contain moderate granular mixed mineral and opaque ingesta and are within normal limits for a juvenile. The hepatic silhouette is within normal limits for a juvenile patient. The visible abdomen is normal. The surrounding osseous structures are normal.'

In [11]:
print(response)

Based on the provided findings, there is mild to moderately broad cardiac enlargement with an otherwise normal study. This could be related to cardiomegaly or possible vacuolar degeneration/regeneration in the heart muscles and may require further evaluation through echocardiogram for a definitive diagnosis and appropriate therapeutic recommendations. Additionally, considering the clinical signs of a cough that persists despite antibiotic treatment, consider further workup with thoracic and abdominal ultrasound, as well as follow-up blood work to correlate with biochemical results to determine if there are any underlying causes contributing to this condition.


In [None]:
# Alternative implementation using the ollama package
def run_ollama_client(
    prompt: str,
    model: str = "mistral",
    system_prompt: Optional[str] = None,
    temperature: float = 0.7
) -> str:
    """
    Run inference using the Ollama client package.
    
    Args:
        prompt: The input text to process
        model: Name of the Ollama model to use (default: "mistral")
        system_prompt: Optional system prompt to set context
        temperature: Sampling temperature (default: 0.7)
        
    Returns:
        Generated text response
    """
    try:
        import ollama
    except ImportError as e:
        raise ImportError("Please install ollama package: pip install ollama") from e
    
    try:
        response = ollama.generate(
            model=model,
            prompt=prompt,
            system=system_prompt or "",
            # Sampling parameters go in `options`; generate() has no `temperature` keyword
            options={"temperature": temperature}
        )
        return response['response']
    except Exception as e:
        # Chain the original exception so the underlying error stays visible
        raise RuntimeError(f"Failed to get response from Ollama: {e}") from e