# PharmAI Demo: Building an LLM Powered Pharmacometric Workflow
This notebook is the practical, hands-on guide accompanying the blog post '[The Foundation: LLM Powered Pharmacometric Workflows](https://arianthony.github.io/demo/2025/06/09/from-task-to-code-building-reliable-llm-workflows-with-langchain.html)'. It turns the vision of an AI-powered pharmacometric workflow into executable code.

We will build the foundational 'atom' of an AI agent, which operates on a **Plan, Act, Observe, Reflect** cycle:
1.  **Plan:** The LLM analyzes a task and creates an R script—its plan of action.
2.  **Act:** We execute that script in a controlled, safe environment.
3.  **Observe:** We capture the results of the action—the `STDOUT` and `STDERR`.
4.  **Reflect:** We feed these observations back to the LLM to evaluate the outcome and generate a corrected plan if needed.

This iterative feedback loop is what transforms a simple script into a basic AI agent capable of self-correction. We achieve this using:
- **LangChain:** For orchestrating the agentic loop.
- **Local LLMs via Ollama:** For secure, on-premises execution, ensuring no data leaves your environment.
- **Pydantic:** For reliable, structured output, making our agent's actions auditable and reproducible.
- **Safe R Execution:** For the crucial 'Act' and 'Observe' steps of the cycle.

This approach is well suited to pharmaceutical environments that require security, reliability, and reproducibility.

**On Data Security:** This demo uses Ollama to run LLMs locally, so all data remains within your environment. Foundation model providers such as [Anthropic](https://www.anthropic.com/), [OpenAI](https://openai.com/), and [AWS Bedrock](https://aws.amazon.com/bedrock/) also offer enterprise-grade security. Their larger models would likely improve the capabilities of this agentic workflow, and because LangChain is modular, you can swap in these providers without changing the core logic of the agent (e.g., `ChatBedrock` instead of `ChatOllama`).
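
As a rough illustration of that swap, the sketch below wraps model construction in a small factory. The `make_chat_model` helper and its `provider` argument are hypothetical names introduced here, not part of the notebook. The Bedrock branch assumes the `langchain-aws` package is installed and AWS credentials are configured, and the model ID is left for you to supply.

```python
def make_chat_model(provider: str, model: str, temperature: float = 0.1):
    """Return a LangChain chat model for the given provider (hypothetical helper)."""
    if provider == "ollama":
        # Local model served by Ollama; imported lazily so the other
        # branch does not require this package.
        from langchain_ollama import ChatOllama
        return ChatOllama(model=model, temperature=temperature)
    if provider == "bedrock":
        # Assumption: `pip install langchain-aws` and configured AWS credentials.
        from langchain_aws import ChatBedrock
        return ChatBedrock(model_id=model, model_kwargs={"temperature": temperature})
    raise ValueError(f"Unknown provider: {provider!r}")
```

With this in place, `make_chat_model("ollama", "qwen2.5-coder:7b")` would give the same kind of object the notebook builds directly later on, and the rest of the workflow would not need to know which provider sits behind it.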

**Windows Ollama Installation:**

1. Download the Windows installer from [https://ollama.com/download](https://ollama.com/download)
2. Run the installer and follow the prompts
3. After installation, open a new terminal and run:
   ```powershell
   ollama run qwen2.5-coder:7b
   ```
   (This will download and start the model)
4. For more details, see the [Ollama Windows Guide](https://github.com/ollama/ollama/blob/main/docs/windows.md)


In [27]:
# Install requirements if needed
# !pip install langchain langchain-community langchain-ollama pydantic ollama pandas

import subprocess
from pathlib import Path
from pydantic import BaseModel, Field
import pandas as pd


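# Check Ollama is running
try:
    # check=True raises CalledProcessError if `ollama list` exits with an error
    result = subprocess.run(['ollama', 'list'], capture_output=True, text=True, check=True)
    print("✅ Ollama is running")
    # The first line of `ollama list` is the table header; model rows follow
    models = result.stdout.splitlines()[1:]
    print("Available models:", models[0] if models else "No models found")
except FileNotFoundError:
    print("❌ Ollama not found. Install with: curl -fsSL https://ollama.com/install.sh | sh")
    print("Then: ollama pull qwen2.5-coder:7b")
except subprocess.CalledProcessError as e:
    print("❌ Ollama is installed but `ollama list` failed:", e.stderr.strip())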

LLM_MODEL = "qwen2.5-coder:7b" # High performing model with tool use capabilities

✅ Ollama is running
Available models: qwen2.5-coder:7b                       dae161e27b0e    4.7 GB    3 hours ago      


## Step 1: Define Our Data Structures

Using Pydantic for consistent structured output from our LLM calls. This is essential for ensuring the LLM-generated code adheres to our expected format and can be safely executed within our workflow. See the LangChain structured output documentation for more details: [LangChain Structured Output](https://python.langchain.com/docs/concepts/structured_outputs/).



In [28]:
# Define our input schema 
class TaskInput(BaseModel):
    """Structure for pharmacometric analysis tasks"""
    task_name: str = Field(description="Short name for the analysis")
    task_details: str = Field(description="Natural language description of what to do")
    data_directory: str = Field(default="./data", description="Path to data files")

# Define our output schema
class AnalysisOutput(BaseModel):
    """What our LLM returns"""
    script: str = Field(description="Complete executable R script")
    thoughts: str = Field(description="LLM reasoning about the approach")
    status_complete: bool = Field(default=False, description="Whether the task is complete")
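
To see what the schema buys us, here is a small sketch of how a raw JSON reply can be validated against `AnalysisOutput`. The JSON strings are invented for illustration, and the class is repeated so the block runs on its own. This is roughly the kind of check that structured output gives us, though the exact mechanics inside LangChain may differ.

```python
from pydantic import BaseModel, Field, ValidationError

class AnalysisOutput(BaseModel):
    """Same schema as above, repeated so this block is self-contained."""
    script: str = Field(description="Complete executable R script")
    thoughts: str = Field(description="LLM reasoning about the approach")
    status_complete: bool = Field(default=False, description="Whether the task is complete")

# A well-formed reply (invented for illustration) parses into a typed object.
good_reply = '{"script": "print(1)", "thoughts": "Trivial example."}'
parsed = AnalysisOutput.model_validate_json(good_reply)
print(parsed.status_complete)  # not supplied in the reply, so the default is used

# A reply missing a required field is rejected instead of silently accepted.
bad_reply = '{"thoughts": "I forgot the script."}'
try:
    AnalysisOutput.model_validate_json(bad_reply)
    rejected = False
except ValidationError as err:
    rejected = True
    print(f"Rejected: {err.error_count()} validation error(s)")
```

Note that validation only checks the shape of the reply. Whether the R code inside `script` is correct or safe to run is a separate question.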


## Step 2: Build up the context for the LLM
The critical step for a successful LLM interaction is providing the right context: everything a human analyst would need to perform the analysis. We start with a preview of our data (as a string), then add an example of the kind of code we want (an nlmixr2 model in this case), and finally format the prompt.

### Putting data into context 

In [29]:
# Generalized data preview for all CSVs in DATA_DIR
data_context = ""
DATA_DIR = "./data"
for csv_file in Path(DATA_DIR).glob("*.csv"):
    try:
        df_head = pd.read_csv(csv_file).head().to_string()
        data_context += f"\n--- pd.head() for file: {csv_file.name} ---\n{df_head}\n"
    except Exception as e:
        data_context += f"\n--- {csv_file.name} ---\nError reading file: {e}\n"
print("Data context loaded:\n", data_context)

Data context loaded:
 
--- pd.head() for file: simulated_pk_data.csv ---
   ID  TIME        DV  AMT  EVID
0   1   0.0  0.010000  100     1
1   1   0.5  0.971317    0     0
2   1   1.0  1.394780    0     0
3   1   2.0  2.022982    0     0
4   1   4.0  1.545126    0     0



In [30]:
# Example from the nlmixr2 documentation
example_context = """
library(nlmixr2)

## The basic model consists of an ini block that has initial estimates
one.compartment <- function() {
  ini({
    tka <- log(1.57); label("Ka")
    tcl <- log(2.72); label("Cl")
    tv <- log(31.5); label("V")
    eta.ka ~ 0.6
    eta.cl ~ 0.3
    eta.v ~ 0.1
    add.sd <- 0.7
  })
  # and a model block with the error specification and model specification
  model({
    ka <- exp(tka + eta.ka)
    cl <- exp(tcl + eta.cl)
    v <- exp(tv + eta.v)
    d/dt(depot) <- -ka * depot
    d/dt(center) <- ka * depot - cl / v * center
    cp <- center / v
    cp ~ add(add.sd)
  })
}

## The fit is performed by the function nlmixr/nlmixr2 specifying the model, data and estimate
fit <- nlmixr(one.compartment, theo_sd, "saem",
              control=list(print=0), 
              table=list(cwres=TRUE, npde=TRUE))

# Print summary of parameter estimates and confidence intervals
print(fit)

# Basic Goodness of Fit Plots
plot(fit)
"""

### Building the initial LLM chain

This is where the magic happens: converting natural language to R code. I will first demonstrate this with a linear "toy" example so you can get a feel for how LLMs ingest context and produce output. Strictly speaking, this is a chain rather than an agent, because nothing is fed back to the model yet.

In [31]:
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

# Define a sample analysis task
sample_task = TaskInput(
    task_name="fit_poppk_model",
    task_details="""
    Load population pharmacokinetic data from the data directory.
    Fit a nonlinear mixed effects model using nlmixr2. 
    Print a summary table of parameter estimates and confidence intervals.
    Use ggplot2 for visualization.
    Write complete, executable R scripts using tidyverse principles
    """,
    data_directory="./data"
)

# Define the system prompt 
system_prompt = f"""You are an expert pharmacometrician generating R scripts. Data for your analysis is located in {sample_task.data_directory}.

Here is a summary of the data:
<data_context>
{data_context}
</data_context>

<task> 
{sample_task.task_name} 
</task> 

<task_details>
{sample_task.task_details}
</task_details>

<example_context>
{example_context}
</example_context>
"""


## Step 3: Define the LLM Chain
Now we define the LLM chain that takes our task description and converts it into executable R code. We use LangChain's structured output capabilities so that the response conforms to the `AnalysisOutput` schema. Note that this validates the format of the response, not the safety or correctness of the generated R code.

In [32]:
# Build the messages list
messages = [
    SystemMessage(content=system_prompt),
    HumanMessage(content="Please generate the R script for the above task.")
]

# Set up the LLM and parser
llm = ChatOllama(
    model=LLM_MODEL,
    temperature=0.1
)

# Use structured output with Pydantic
llm_struct = llm.with_structured_output(AnalysisOutput)

# Call the LLM (this may take up to a few minutes depending on the task complexity)
print("🧠 Generating analysis script...")
generated_analysis = llm_struct.invoke(messages)
print("✅ Analysis script generated successfully!")

🧠 Generating analysis script...
✅ Analysis script generated successfully!


In [33]:
print("✅ Analysis generated!")
print(f"Completed Status: {generated_analysis.status_complete}")
print(f"\nLLM thoughts: {generated_analysis.thoughts}")
print(f"\n📝 Generated R script ({len(generated_analysis.script)} characters):")

✅ Analysis generated!
Completed Status: False

LLM thoughts: The script loads the population pharmacokinetic data from a CSV file. It then defines a one-compartment model with inter-individual variability using nlmixr2 syntax. The model is fitted using the SAEM algorithm, and the results are printed along with basic goodness-of-fit plots.

📝 Generated R script (922 characters):


## Step 4: R Script Execution

Now that we've done a single pass, we can write the script to a file. In the next step we will run it with the `subprocess` module and capture the output and any errors, so we can debug if something goes wrong.

In [34]:
# Write R script to the current directory
script_path = Path(sample_task.task_name.replace(" ", "_") + "_analysis.R")
script_path.write_text(generated_analysis.script)


922

### Check out the Script
This is a good place to stop and look at the generated R script. This is the code that will be executed in the next step. You can review it and get a feel for how the LLM has interpreted the task and generated the code.


## Step 5: Execute the R Script

Finally, we execute the generated R script using the `subprocess` module, which lets us capture any output or errors for debugging. Note that LLM generation is not deterministic, so the script (and therefore its output) may vary from run to run. During my testing, the script ran successfully about 50% of the time. If your run fails, don't worry! This is just a simple linear chain, and we have not added the iterative agentic capabilities yet.
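
A slightly more defensive version of this execution step might look like the sketch below. The `run_script` helper is a hypothetical name, not part of the notebook. It turns a missing executable or a timeout into a message rather than an exception, so the failure can be passed back to the model like any other observation. It also reports the exit code, since R writes warnings and messages to `STDERR` even when a script succeeds.

```python
import subprocess
import sys

def run_script(command: list[str], timeout: int = 180) -> str:
    """Run a command and return its output as one feedback string (hypothetical helper)."""
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        # Raised when the executable (e.g. Rscript) is not on the PATH
        return f"ERROR: executable not found: {command[0]}"
    except subprocess.TimeoutExpired:
        # Raised when the process runs longer than `timeout` seconds
        return f"ERROR: script exceeded the {timeout}s timeout"
    return (
        f"EXIT CODE: {result.returncode}\n\n"
        f"STDOUT:\n{result.stdout}\n\n"
        f"STDERR:\n{result.stderr}"
    )

# Demonstrated with the Python interpreter so the block runs without R installed;
# in the notebook the command would be ["Rscript", str(script_path)].
print(run_script([sys.executable, "-c", "print('hello')"]))
```

The cell below keeps the simpler direct call, which is enough for a first pass.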

In [35]:
# Execute R script in the current directory
result = subprocess.run(
    ["Rscript", str(script_path)],
    cwd=".",
    capture_output=True,
    text=True,
    timeout=180
)

output = f"STDOUT:\n{result.stdout}\n\nSTDERR:\n{result.stderr}"
print(output)

STDOUT:
[====|====|====|====|====|====|====|====|====|====] 0:00:00 

[... the same progress-bar line repeats 15 more times ...]



## Step 6: Giving the Model Some Feedback
Now that we have executed the R script, we can give the model feedback. This is the key step in iteratively refining the LLM's output: we append the captured output and any errors from the R script to the conversation, so the model can check its script against what actually happened. Even if the script executed successfully, the output lets the model confirm the result and mark the task as complete.

In [36]:
# Append the model output and the script results to the messages
new_messages = messages.copy()
new_messages.append(
    AIMessage(content=generated_analysis.model_dump_json())
)

new_messages.append(
    HumanMessage(content=output)
)

# Pass the updated messages back to the LLM to try again
new_generated_analysis = llm_struct.invoke(new_messages)
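
Execution logs can be long and repetitive, as the progress bars in the output above show. One option, sketched here under my own assumptions rather than taken from the notebook, is to condense the log before sending it back so it uses less of the model's context window. The `condense_output` helper and its `max_chars` limit are hypothetical.

```python
def condense_output(text: str, max_chars: int = 4000) -> str:
    """Collapse consecutive duplicate lines and cap the length (hypothetical helper)."""
    condensed = []
    previous = None
    repeats = 0
    for line in text.splitlines():
        if not line.strip():
            continue  # drop blank lines
        if line == previous:
            repeats += 1  # count the duplicate instead of keeping it
            continue
        if repeats:
            condensed.append(f"[previous line repeated {repeats} more time(s)]")
        condensed.append(line)
        previous, repeats = line, 0
    if repeats:
        condensed.append(f"[previous line repeated {repeats} more time(s)]")
    result = "\n".join(condensed)
    if len(result) <= max_chars:
        return result
    # Keep the tail: a failing Rscript run halts at the error, so it appears last.
    return "[...truncated...]\n" + result[-max_chars:]

# Invented log for illustration: three identical progress bars, then an error.
bar = "[====|====] 0:00:00 "
log = "\n\n".join([bar] * 3) + "\nError: object 'x' not found"
print(condense_output(log))
```

The condensed string could then be used as the content of the `HumanMessage` in place of the raw `output`.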

In [37]:
# Let's look at how the model reacted by printing its thoughts
print("🧠 LLM thoughts after execution:"
      f"\n{new_generated_analysis.thoughts}")

# Print the status
print("✅ Final status complete:", new_generated_analysis.status_complete)

# Let's look at the updated script
# Write R script to the current directory
script_path = Path(sample_task.task_name.replace(" ", "_") + "_analysis.R")
script_path.write_text(new_generated_analysis.script)

🧠 LLM thoughts after execution:
The script loads the population pharmacokinetic data from a CSV file. It then defines a one-compartment model with inter-individual variability using nlmixr2 syntax. The model is fitted using the SAEM algorithm, and the results are printed along with basic goodness-of-fit plots.
✅ Final status complete: True


922

## Step 7: Iterative Agentic Capabilities
Now that we've seen the power of feedback, let's put it into a function so that we can loop through the process multiple times. This will allow us to refine the LLM's output iteratively, improving the generated R code with each pass. We will also add some error handling to ensure that if something goes wrong, we can capture it and provide feedback to the model.


In [38]:
from llm_feedback import llm_feedback_loop

results = llm_feedback_loop(
    llm=llm_struct,
    messages=new_messages,
    task=sample_task,
    max_iterations=3
)

✅ Analysis generated!
Completed Status: True

LLM thoughts: The script loads the population pharmacokinetic data from a CSV file. It then defines a one-compartment model with inter-individual variability using nlmixr2 syntax. The model is fitted using the SAEM algorithm, and the results are printed along with basic goodness-of-fit plots.

📝 Generated R script (922 characters):
LLM response: completed successfully.
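
The `llm_feedback` module itself is not shown in this notebook. As a rough idea of what such a loop could look like, here is a minimal sketch. It is not the actual implementation: `feedback_loop_sketch`, `FakeAnalysis`, `FakeLLM`, and `fake_run` are all hypothetical names, the LLM and the R execution are replaced by stand-ins so the block runs on its own, and messages are plain `(role, content)` tuples rather than the `AIMessage` and `HumanMessage` objects used above.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FakeAnalysis:
    """Stand-in for AnalysisOutput so the sketch runs without an LLM."""
    script: str
    thoughts: str
    status_complete: bool = False

def feedback_loop_sketch(llm, messages: list, run_script: Callable[[str], str],
                         max_iterations: int = 3):
    """Plan, act, observe, and reflect until the model reports completion."""
    history = list(messages)  # copy so the caller's list is untouched
    analysis = None
    for _ in range(max_iterations):
        analysis = llm.invoke(history)             # Plan (or Reflect on feedback)
        if analysis.status_complete:
            break                                  # model is satisfied with the result
        observation = run_script(analysis.script)  # Act and Observe
        history.append(("ai", f"{analysis.thoughts}\n{analysis.script}"))
        history.append(("human", observation))
    return analysis, history

class FakeLLM:
    """Returns a broken script, then a fixed one, then reports completion."""
    def __init__(self):
        self.calls = 0

    def invoke(self, history):
        self.calls += 1
        if self.calls == 1:
            return FakeAnalysis("stop('boom')", "First attempt.")
        if self.calls == 2:
            return FakeAnalysis("print('ok')", "Fixed the error.")
        return FakeAnalysis("print('ok')", "Output looks right.", status_complete=True)

def fake_run(script: str) -> str:
    """Pretend to execute R: scripts that call stop() fail, others succeed."""
    return "STDERR: Error: boom" if "stop(" in script else 'STDOUT: [1] "ok"'

analysis, history = feedback_loop_sketch(FakeLLM(), [("human", "Fit the model.")], fake_run)
print(analysis.status_complete, len(history))  # True 5
```

The `max_iterations` cap matters in practice: without it, a model that never sets `status_complete` would loop indefinitely.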


## What We Just Built: The Foundational Atom of an AI Agent

🎉 Congratulations! You've successfully built and executed a workflow that represents the foundational atom of a true AI partner. By implementing the **Plan, Act, Observe, Reflect** cycle, we've solved several core challenges:

1.  **Solved for Reliability:** We used Pydantic for structured outputs, moving beyond fragile scripts to the **auditable, reproducible workflows** required in our industry.
2.  **Solved for Decision-Making:** We built an iterative feedback loop, creating a basic agent that can **interpret results and self-correct**, a leap beyond traditional automation.
3.  **Solved for Security:** We used Ollama to run a powerful local LLM, ensuring all proprietary data and analysis **remain securely on-premises**.
4.  **Solved for Expertise:** We showed how to provide data and code context, the first step towards encoding **deep domain knowledge** into our automated systems.

This simple, powerful atom is the key building block for more advanced agents that can tackle increasingly complex tasks.

## Next Steps

With this foundation, we can now look ahead. Future explorations will build on this 'atom' to create more sophisticated capabilities:

1. **Advanced Task Orchestration**: Composing our agentic atoms to tackle multi-step analyses.
2. **RAG for Domain Knowledge**: Granting our agent access to a knowledge base of past analyses, allowing it to learn from experience.
3. **Production Deployment**: Building in the validation, monitoring, and compliance checks needed for real-world use.

## Try Your Own Tasks

Modify the `task_details` in the notebook to experiment with:
- Different types of PK/PD analysis
- Population modeling with nlmixr2
- Exposure-response analysis
- Dose optimization scenarios

The same framework scales from basic NCA to complex PopPK models!