PV Signal Intelligence Workbench

Local-LLM pharmacovigilance platform · Drug-agnostic · Clinician-designed

A production-grade pharmacovigilance (PV) platform that pairs 18 years of ICU/critical care clinical expertise with agentic AI and PV data science. Built for a PharmD, BCPS, BCCCP consultant managing multiple client drugs in parallel — every AI output is a draft reviewed by the clinician before any regulatory use.

The Problem This Solves

PV consultants face a daily information overload: FAERS adverse event feeds, PubMed literature, MedDRA coding decisions, E2B(R3) narrative drafts, and regulatory signal management across FDA and EMA frameworks — often for multiple drugs simultaneously. Commercial platforms are expensive, cloud-dependent, and not built for solo consultants. This workbench brings the full PV workflow local, private, and clinician-controlled.

System Architecture

graph TB
    subgraph Sources["📥 Data Sources"]
        FAERS["OpenFDA FAERS API\nAdverse Event Reports"]
        PubMed["PubMed / Entrez API\nLiterature"]
        Vault["Obsidian Vault\nRegulatory Guidelines\nICH · FDA · EMA GVP"]
    end

    subgraph RAG["🔍 RAG Pipeline"]
        Chunker["Header-Aware\nMarkdown Chunker"]
        Embed["nomic-embed-text\nOllama Embeddings"]
        ChromaDB["ChromaDB\nPersistent Vector Store"]
    end

    subgraph LLMs["🤖 Local LLM Stack — Ollama"]
        G26b["gemma4:26b\nThinking Mode\nReasoning · Analysis · MedDRA"]
        G4b["gemma4:e4b\nDrafting · Prose · Digests"]
    end

    subgraph Modules["⚙️ PV Modules"]
        M1["Module 1\nRegulatory Q&A\nRAG over FDA + EMA guidelines"]
        M2["Module 2\nMedDRA Coder\nPT suggestion with reviewer flag"]
        M3["Module 3\nFAERS Signal Detection\nPRR · Chi² · Evans criteria"]
        M4["Module 4\nICSR Narrative Generator\nE2B(R3)-aligned draft"]
        M5["Module 5\nLiterature Monitor\nPubMed digest + Discord"]
    end

    subgraph Projects["📁 Multi-Drug Project Layer"]
        PC["ProjectConfig\nper-drug collection\ncomparator · pubmed terms"]
    end

    subgraph Output["📊 Outputs"]
        Dash["Streamlit Dashboard\nSenior Reviewer Interface"]
        Discord["Discord #lit-monitor\nWeekly Digest"]
        JFILE["Signal JSON\nAudit Trail"]
    end

    Vault --> Chunker --> Embed --> ChromaDB
    ChromaDB --> M1 & M2 & M3 & M5
    FAERS --> M3
    PubMed --> M5
    G26b --> M1 & M2 & M3
    G4b --> M4 & M5
    PC --> M1 & M2 & M3 & M4 & M5
    M1 & M2 & M3 & M4 & M5 --> Dash
    M5 --> Discord
    M3 --> JFILE

    subgraph HW["💻 Hardware"]
        GPU["RTX 5060 Ti · 16 GB VRAM"]
        RAM["32 GB System RAM"]
    end
    LLMs -.->|runs on| GPU

Design principle: All AI outputs carry is_draft=True and reviewer_flag. The platform is a junior analyst — the PharmD is the senior reviewer.

Tech Stack

Layer	Choice	Why
Reasoning LLM	`gemma4:26b` (256K ctx, Thinking Mode)	MedDRA deliberation, signal interpretation, regulatory Q&A — needs long context and structured reasoning
Drafting LLM	`gemma4:e4b`	ICSR narratives, lit digests — fast, coherent prose without heavy compute
Embeddings	`nomic-embed-text` via Ollama	Local, no API key, strong retrieval performance
LLM Serving	Ollama `http://127.0.0.1:11434`	Single-command model management, GPU scheduling
Orchestration	LangChain (`ChatOllama` + `ChatPromptTemplate`)	Structured prompt→parse pipelines per module
Vector Store	ChromaDB (persistent, per-drug collections)	Drug isolation without separate servers
Knowledge Base	Obsidian vault → header-aware chunker	Clinician-editable; wikilinks resolved at ingest
Dashboard	Streamlit	Rapid iteration; works locally, no frontend build step
Signal API	OpenFDA FAERS (quarterly-partitioned pagination)	Bypasses 5000-result cap; deduplicates by `safetyreportid`
Literature	Biopython Entrez (PubMed)	Standard; handles date-windowed search across multiple query terms
Notifications	Discord REST API (discord_utils.py)	Weekly digest delivery with escalation alerts to #lit-monitor
Hardware	RTX 5060 Ti 16 GB · 32 GB RAM	26B model fits comfortably; no cloud dependency

Project Spotlight: FAERS Signal Detection Pipeline

The FAERS signal detection module is the most technically demanding piece — combining statistical disproportionality analysis with LLM-powered clinical interpretation.

What It Does

Fetch — Retrieves adverse event reports from OpenFDA using quarterly date-range partitioning. The FAERS API caps results at 5000 per search; partitioning into calendar quarters yields the complete dataset without truncation.
Compute PRR — Calculates Proportional Reporting Ratio (PRR) against a configurable comparator drug (default: meropenem) using the Evans criteria: PRR ≥ 2.0 AND N ≥ 3 AND χ² ≥ 4.0.
Statistical rigor — Three production-grade adjustments:
- Artifact exclusion: Administrative FAERS PTs (off label use, no adverse event, drug ineffective, etc.) are filtered before analysis
- Continuity correction: b=0 reactions (drug-specific signals with no background cases) use b=0.5 instead of being silently dropped — preserves novel signals for new drugs
- Yates' χ² correction: Applied when any expected cell count < 5, reducing false positives common in sparse FAERS data for recently-approved drugs
Clinical interpretation — One batched gemma4:26b call interprets all positive signals with confounding analysis (critical for last-resort antibiotics where severity bias inflates mortality PRRs), ICH E2A regulatory action classification, and reviewer notes.
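
The Fetch step can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's `_fetch_quarter()`: the helper names `quarter_ranges`, `date_clause`, and `dedupe_reports` are hypothetical, and filtering on the `receivedate` field is an assumption (the bracketed `[YYYYMMDD+TO+YYYYMMDD]` range form follows openFDA's query syntax). No network call is made here.

```python
from datetime import date, timedelta
from typing import Iterator


def quarter_ranges(start: date, end: date) -> Iterator[tuple[date, date]]:
    """Yield (first_day, last_day) per calendar quarter, clipped to [start, end]."""
    if start > end:
        return
    year, q = start.year, (start.month - 1) // 3  # q runs from 0 to 3
    while True:
        q_start = date(year, 3 * q + 1, 1)
        if q_start > end:
            break
        # First day of the following quarter (rolls over to January of next year).
        next_start = date(year + (1 if q == 3 else 0), (3 * (q + 1)) % 12 + 1, 1)
        q_end = next_start - timedelta(days=1)
        yield max(q_start, start), min(q_end, end)
        year, q = next_start.year, (next_start.month - 1) // 3


def date_clause(first: date, last: date) -> str:
    """Build a date-range search clause (the field name is an assumption)."""
    return f"receivedate:[{first:%Y%m%d}+TO+{last:%Y%m%d}]"


def dedupe_reports(reports: list[dict]) -> list[dict]:
    """Keep the first report seen for each safetyreportid."""
    seen: set[str] = set()
    unique = []
    for report in reports:
        report_id = report["safetyreportid"]
        if report_id not in seen:
            seen.add(report_id)
            unique.append(report)
    return unique
```

Each yielded range would drive one paginated query, and the merged results would pass through `dedupe_reports` so a report returned by more than one query is counted once.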

PRR Formula

PRR = (a / (a+c)) / (b / (b+d))

         Drug    Comparator
React.     a          b
No React.  c          d

Signal if: PRR ≥ 2.0 AND a ≥ 3 AND χ² ≥ 4.0
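
A minimal sketch of the formula and the three-part rule, assuming the 2×2 layout above. `SignalResult` and `evaluate_signal` are hypothetical names, the counts in the usage note are invented for illustration, and the χ² uses the standard closed form for a 2×2 table with Yates' correction applied when any expected cell falls below 5, as described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalResult:
    prr: float
    chi2: float
    is_signal: bool


def evaluate_signal(a: int, b: int, c: int, d: int,
                    min_prr: float = 2.0, min_n: int = 3,
                    min_chi2: float = 4.0) -> SignalResult:
    """a, c: drug reports with / without the reaction; b, d: comparator."""
    if a + c == 0:
        raise ValueError("the drug has no reports")
    b_adj = 0.5 if b == 0 else float(b)  # continuity correction for b = 0
    prr = (a / (a + c)) / (b_adj / (b_adj + d))

    n = a + b_adj + c + d
    row1, row2 = a + b_adj, c + d      # reaction / no reaction totals
    col1, col2 = a + c, b_adj + d      # drug / comparator totals
    margins = row1 * row2 * col1 * col2
    if margins == 0:
        return SignalResult(prr, 0.0, False)

    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    diff = abs(a * d - b_adj * c)
    if min(expected) < 5:              # Yates' correction for sparse tables
        diff = max(diff - n / 2, 0.0)
    chi2 = n * diff ** 2 / margins

    is_signal = prr >= min_prr and a >= min_n and chi2 >= min_chi2
    return SignalResult(prr, chi2, is_signal)
```

With invented counts a=20, b=5, c=80, d=195, the drug proportion is 20/100 and the comparator proportion is 5/200, giving PRR = 8.0 and χ² ≈ 26.73, so all three thresholds are met.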

Clinically-Aware Design

Confounding by indication is explicitly flagged in the interpretation prompt: last-resort antibiotics (cefiderocol, colistin) treat critically ill patients who would have high mortality regardless of the drug. gemma4:26b is instructed to identify this bias in its CONFOUNDING field and weight it in the regulatory action recommendation.

Worked Example: Cefiderocol (Real Pipeline Output)

Cefiderocol is a siderophore cephalosporin approved for gram-negative infections with limited treatment options — a true last-resort antibiotic with a small, critically ill patient population. This makes it an ideal test case for signal detection methodology.

Pipeline run: May 2026 — faers_pipeline/output/

REACTION PT                              N    BG       PRR    Chi²   Signal
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
therapy non-responder                   11     1     17.71   54.36    YES
pneumonia pseudomonal                    8     1     16.10   37.85    YES
treatment failure                       54    11      9.88  196.97    YES
death                                  109    24      9.34  389.77    YES
ototoxicity                              9     3      7.25   25.59    YES
pulmonary bacterial infection            7     2      6.96   20.75    YES
septic shock                            13     6      5.16   19.63    YES
nephrotoxicity                          10     5      4.73   12.87    YES

24 signals detected · 92 reactions analyzed · artifact-filtered
† Continuity correction applied to b=0 reactions

Clinical interpretation (applied by gemma4:26b):

Death signal (PRR 9.34, N=109): High PRR warrants attention but almost certainly reflects confounding by indication. Cefiderocol is reserved for carbapenem-resistant gram-negative infections — the sickest patients in the ICU who face high baseline mortality from the underlying infection, not the drug. A naive statistical read would flag this as a safety signal; clinical context explains it.
Treatment failure / therapy non-responder (PRR 9.88 / 17.71): Clinically important, and consistent with emerging resistance among carbapenem-resistant organisms and the "last-resort" population this drug treats. Warrants signal validation against published MIC data.
Ototoxicity and nephrotoxicity: Both known adverse effects of beta-lactam antibiotics in critically ill patients with polypharmacy. Signals are expected and serve as a positive control — the pipeline is detecting real, known effects.
Artifact exclusion working correctly: "no adverse event" (would have been PRR 28.75) and "drug ineffective" filtered pre-analysis. Without this correction, these administrative PTs dominate the signal table and obscure clinically meaningful reactions.
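
The artifact exclusion can be pictured as a pre-analysis filter on reaction counts. The sketch below is an assumption about shape only: `ADMIN_PTS` lists just the three terms named in this README (the full exclusion list is not reproduced here), and `count_clinical_reactions` is a hypothetical helper name.

```python
from collections import Counter

# Only the administrative PTs named in this README; the real list is longer.
ADMIN_PTS = frozenset({"off label use", "no adverse event", "drug ineffective"})


def count_clinical_reactions(reaction_terms: list[str]) -> Counter:
    """Count reaction PTs case-insensitively, dropping administrative terms."""
    counts = Counter(term.strip().lower() for term in reaction_terms)
    return Counter({pt: n for pt, n in counts.items() if pt not in ADMIN_PTS})
```

Filtering before the PRR step keeps administrative terms out of both the signal table and the denominator of reactions analyzed.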

This is the kind of nuanced, clinically-contextualized analysis that distinguishes a PV professional from a data scientist running a PRR formula.

Five Modules

#	Module	Model	Status	Output
1	Regulatory Q&A	gemma4:26b	✅ Complete	Cited answer with confidence + source notes
2	MedDRA Coder	gemma4:26b	✅ Complete	Primary PT, SOC, alternatives, reviewer flag
3	FAERS Signal Detection	gemma4:26b + PRR	✅ Complete	Signal table + clinical interpretation per reaction
4	ICSR Narrative Generator	gemma4:e4b	✅ Complete	E2B(R3)-aligned narrative draft
5	Literature Monitor	gemma4:e4b + PubMed	✅ Complete	Weekly digest + Discord escalation alerts

Drug-Agnostic Design

Every module is parameterized by drug, not hardcoded. A ProjectConfig object carries:

from dataclasses import dataclass

@dataclass
class ProjectConfig:
    drug_name: str                    # e.g. "cefiderocol", "colistin", "vancomycin"
    comparator: str = "meropenem"     # FAERS background comparator
    collection_name: str = ""         # auto: pv_{drug} — isolated ChromaDB collection
    vault_folder: str = ""            # auto: Drugs/{DrugName} — per-drug vault notes
    pubmed_terms: list[str] = ...     # configurable search queries

Projects persist to projects.json. The Streamlit dashboard loads all active projects at startup and exposes a drug selector in the sidebar. Switching drugs swaps all module contexts without code changes.
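
One way the "auto" fields and the `projects.json` round trip might work is sketched below. The derivation rules are read off the comments in the dataclass above; the `__post_init__` logic, the empty-list default for `pubmed_terms`, the JSON layout, and the `save_projects` / `load_projects` helpers are all assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ProjectConfig:
    drug_name: str
    comparator: str = "meropenem"
    collection_name: str = ""
    vault_folder: str = ""
    pubmed_terms: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Fill the "auto" fields only when the caller left them empty.
        if not self.collection_name:
            self.collection_name = f"pv_{self.drug_name.lower()}"
        if not self.vault_folder:
            self.vault_folder = f"Drugs/{self.drug_name.capitalize()}"


def save_projects(projects: list[ProjectConfig], path: Path) -> None:
    """Write all projects as a JSON array of plain dicts."""
    payload = [asdict(project) for project in projects]
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_projects(path: Path) -> list[ProjectConfig]:
    """Rebuild ProjectConfig objects from the JSON array."""
    items = json.loads(path.read_text(encoding="utf-8"))
    return [ProjectConfig(**item) for item in items]
```

For example, `ProjectConfig("cefiderocol")` would resolve to the collection `pv_cefiderocol` and the vault folder `Drugs/Cefiderocol` under these assumed rules.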

Regulatory Knowledge Base

The RAG pipeline indexes structured markdown notes (Obsidian vault) covering:

ICH Guidelines

E2A — Clinical Safety Data: definitions, seriousness, expedited timelines
E2B(R3) — Electronic ICSR transmission: data elements, narrative requirements
E2E — Pharmacovigilance planning: signal management, PSUR/PBRER, RMP

EMA

GVP Module IX — Signal management: PRAC, EudraVigilance, EU timelines

Signal Detection

Evans Criteria — PRR formula, thresholds, biases, worked examples

Coding

MedDRA Conventions — PT selection rules, hierarchy, common decisions

Notes use frontmatter tags and source fields. The ingester strips [[wikilinks]], chunks by header hierarchy, and upserts idempotently using SHA-256 chunk IDs.
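
The ingest steps in that sentence could look roughly like this. It is a simplified sketch, not the repository's ingester: `strip_wikilinks` and `chunk_by_headers` are hypothetical names, frontmatter parsing is omitted, and hashing the source path plus header trail (rather than the chunk text) is an assumption chosen so that an edited note overwrites its existing chunk on upsert.

```python
import hashlib
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
HEADER = re.compile(r"^(#{1,3})\s+(.*)$")  # H1 to H3 only


def strip_wikilinks(text: str) -> str:
    """Replace [[target]] with target and [[target|alias]] with alias."""
    return WIKILINK.sub(lambda m: m.group(2) or m.group(1), text)


def chunk_by_headers(markdown: str, source: str) -> list[dict]:
    """Split a note at H1/H2/H3 boundaries and attach a stable SHA-256 ID."""
    chunks: list[dict] = []
    trail: list[str] = []   # current header path, e.g. ["Evans Criteria", "Thresholds"]
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            header = " > ".join(trail)
            chunk_id = hashlib.sha256(f"{source}::{header}".encode("utf-8")).hexdigest()
            chunks.append({"id": chunk_id, "header": header, "text": text})
        body.clear()

    for line in strip_wikilinks(markdown).splitlines():
        match = HEADER.match(line)
        if match:
            flush()
            level = len(match.group(1))
            del trail[level - 1:]          # drop headers at this depth or deeper
            trail.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks
```

A fuller version would also skip `#` lines inside fenced code blocks, which this sketch treats as headers.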

Benchmark: 100% Precision@5, MRR 0.861 on a 31-question gold standard covering all five module domains.

Retrieval Benchmark Results

Questions: 31 | P@5: 31/31 (100.0%) | MRR: 0.861
Domain breakdown: Regulatory(8/8) · Signal(7/7) · MedDRA(6/6) · ICSR(5/5) · Lit(5/5)
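
For readers unfamiliar with the two metrics, here is a small sketch of how they might be computed. It assumes P@5 is scored per question as a hit when any relevant chunk appears in the top 5, which is how the 31/31 figure reads; `reciprocal_rank` and `score_benchmark` are hypothetical names, not the functions in `run_benchmark.py`.

```python
def reciprocal_rank(ranked_ids: list[str], relevant_ids: set[str]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def score_benchmark(runs: list[tuple[list[str], set[str]]], k: int = 5) -> tuple[float, float]:
    """Return (hit rate at k, mean reciprocal rank) over all questions."""
    hits = sum(1 for ranked, relevant in runs if relevant.intersection(ranked[:k]))
    mrr = sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs) / len(runs)
    return hits / len(runs), mrr
```

Under this reading, an MRR below 1.0 alongside a 100% hit rate means the relevant chunk is always in the top 5 but is not always ranked first.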

Clinical Oversight Model

AI Output ──► is_draft=True ──► reviewer_flag=True ──► Senior Reviewer Sign-off
                                                              │
                                                    PharmD · BCPS · BCCCP
                                                    18 years ICU/critical care

The platform never makes final regulatory determinations. is_draft cannot be set to False by any module function — it is a design invariant, not a configuration option.
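
One way such an invariant can be enforced in Python is with a frozen dataclass whose `is_draft` field is excluded from the constructor. This is a sketch of the idea only; `DraftOutput` is a hypothetical class name and the repository may enforce the rule differently.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class DraftOutput:
    text: str
    reviewer_flag: bool = True
    # init=False keeps is_draft out of the constructor signature;
    # frozen=True makes ordinary attribute assignment raise an error.
    is_draft: bool = field(default=True, init=False)


def try_to_finalize(output: DraftOutput) -> bool:
    """Return True if is_draft could be flipped by normal assignment."""
    try:
        output.is_draft = False  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True
```

Passing `is_draft=False` to the constructor raises `TypeError`, and `try_to_finalize` returns `False`. Deliberate use of `object.__setattr__` can still bypass a frozen dataclass, so this guards against accidental changes rather than determined ones.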

Quick Start

# 1. Clone and set up environment (Python 3.11 required — chromadb wheels)
git clone https://github.com/molszewskiPV/PV-Signal-Intelligence-Workbench
cd PV-Signal-Intelligence-Workbench
python3.11 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Pull models (requires Ollama — https://ollama.ai)
ollama pull nomic-embed-text
ollama pull gemma4:26b
ollama pull gemma4:e4b

# 3. Ingest knowledge base
PYTHONPATH=src python -m ingester.vault_ingester

# 4. Verify retrieval quality
PYTHONPATH=src python tests/benchmark/run_benchmark.py
# Target: P@5=100%, MRR≥0.85

# 5. Launch dashboard
PYTHONPATH=src streamlit run src/dashboard/app.py

Interfaces

Desktop — Streamlit Dashboard (Primary)

The Streamlit dashboard is the main workbench interface, running locally at localhost:8501 on Debian 13:

Project selector — switch between client drugs instantly; all modules update to reflect the active drug
All 5 modules — full UI for regulatory Q&A, MedDRA coding, signal detection with charts, ICSR drafting, literature digests
Rich visualizations — PRR signal charts, MedDRA hierarchy views, ICSR draft editor, lit digest display
Real-time status — Ollama model health, ChromaDB connectivity, active pipeline status
Vault manager — add/edit knowledge base notes, re-ingest, run benchmark

Discord — Argus Bot (Mobile Mirror)

The Argus Discord bot (src/argus_bot.py) mirrors the full workbench over Discord — making the platform accessible on mobile (iPhone, iPad) anywhere with internet:

#regulatory-qa       → answer_regulatory_question()  → formatted embed with citations
#meddra-coding       → suggest_meddra_pt()           → PT + SOC + reviewer flag embed
#signal-detection    → signal context from RAG        → embed with source notes
#icsr-generator      → draft ICSR narratives          → formatted case report
#lit-monitor         → literature digests             → key findings + escalation alerts
#argus-status        → bot status, pipeline runs      → health embeds
#workbench-logs      → audit trail of all queries

Argus is an intelligent router: gemma4:26b handles all conversation and clinical reasoning; gemma4:e4b handles narrative generation; the workbench RAG pipeline provides real-time guideline retrieval. The routing is transparent to the user — Argus feels like one unified assistant regardless of which model handles a specific task.

Voice interface: !voice join → Argus joins your voice channel. !voice speak <text> plays TTS via edge-tts. Audio file transcription via !transcribe uses faster-whisper on the RTX 5060 Ti GPU. Real-time voice conversation requires the Hermes voice stack (Hermes Discord gateway + VoiceReceiver).

Shared state: Pipeline runs triggered from Discord update the Streamlit dashboard, and vice versa. Both interfaces share the same backend module functions.

Technical Showcase

Local-First AI Infrastructure

Inference is fully local: all LLM calls run on an RTX 5060 Ti 16 GB, served by Ollama, so no case data is sent to a cloud LLM API:

Model	VRAM	Role
`gemma4:26b` (A4B sparse)	~8 GB active	Clinical reasoning, signal interpretation, regulatory Q&A
`gemma4:e4b`	~3 GB	ICSR narratives, lit digests, prose drafting
`medgemma:27b` (planned)	~14 GB	Medical entity extraction, clinical NLP tasks
`qwen3:30b` (planned)	~16 GB	Alternative reasoning, multilingual regulatory documents
`nomic-embed-text`	~0.3 GB	Vault embeddings (retrieval only)

Layer offloading to 32 GB system RAM handles models that exceed VRAM. Ollama manages GPU scheduling automatically.

Signal Detection Pipeline

Statistical rigor matching industry standards:

PRR/ROR disproportionality using the 2×2 contingency table
Evans criteria: PRR ≥ 2.0 AND N ≥ 3 AND χ² ≥ 4.0 — all three required simultaneously
Continuity correction: b=0 reactions use b=0.5 instead of silent discard — preserves drug-specific signals
Yates' χ² correction: applied when any expected cell < 5 — reduces false positives in sparse FAERS data
Artifact exclusion: 12 FAERS administrative PTs filtered before analysis
Quarterly pagination bypass: _fetch_quarter() per calendar quarter overcomes the 5000-result OpenFDA API cap

Vector Database

ChromaDB with nomic-embed-text embeddings:

Header-aware chunking: splits on H1/H2/H3 boundaries, not arbitrary character count
Idempotent ingestion: SHA-256 chunk IDs; re-running is safe, changed notes update in place
Dual-jurisdiction filtering: jurisdiction metadata field (FDA, EMA, ICH, BOTH); query_vault(jurisdiction="FDA") restricts retrieval
Benchmark: 100% Precision@5, MRR 0.861 on 31-question gold standard across all 5 module domains
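
The jurisdiction restriction can be expressed as a metadata filter. The sketch below only builds the filter dictionary; `jurisdiction_filter` is a hypothetical helper, and including `ICH` and `BOTH` notes in a single-jurisdiction query is an assumption based on ICH guidance applying to both frameworks.

```python
from typing import Optional

SINGLE_JURISDICTIONS = {"FDA", "EMA"}


def jurisdiction_filter(jurisdiction: Optional[str]) -> Optional[dict]:
    """Build a metadata filter for one jurisdiction, or None for no restriction."""
    if jurisdiction is None:
        return None
    key = jurisdiction.upper()
    if key not in SINGLE_JURISDICTIONS:
        raise ValueError(f"unknown jurisdiction: {jurisdiction!r}")
    # Shared guidance is kept so that ICH notes stay retrievable.
    return {"jurisdiction": {"$in": [key, "ICH", "BOTH"]}}
```

In recent ChromaDB versions a dictionary of this shape can be passed as the `where` argument of a collection query, so `query_vault(jurisdiction="FDA")` could forward `jurisdiction_filter("FDA")` unchanged.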

Regulatory Knowledge Base (Dual Jurisdiction)

13 structured notes covering FDA and EMA regulatory frameworks:

ICH (applies to both jurisdictions)

E2A — ICSR criteria, seriousness definitions, expedited timelines
E2B(R3) — Electronic ICSR transmission, data elements
E2E — PV planning, signal management, PSUR/PBRER

FDA

21 CFR Part 312 — IND safety reporting: 7-day/15-day reports, causality standards
FDA MedWatch and FAERS — Post-marketing reporting, FAERS data structure and biases

EMA GVP

Module I — PV systems and quality (PSMF, QPPV requirements)
Module V — Risk management systems (RMP structure, aRMMs)
Module VI — Adverse reaction management and reporting
Module VII — PBRER/PSUR periodic safety reports
Module IX — Signal management (PRAC, EVDAS, BCPNN methodology)

Cross-jurisdictional

Evans Criteria — PRR formula, signal thresholds, statistical biases
MedDRA Coding Conventions — PT selection, hierarchy navigation

Infrastructure

OS: Debian 13 (Trixie), Linux 6.12
GPU: RTX 5060 Ti 16 GB VRAM — inference + STT (faster-whisper CUDA)
RAM: 32 GB — Ollama layer offloading for large models
Ollama: Custom configuration for model scheduling, context length, layer distribution
Python: 3.11 (venv) — chromadb wheels not available for 3.13
Discord bot: discord.py 2.7.1, PyNaCl, ffmpeg, edge-tts, faster-whisper

How It's Built — Multi-AI Development Workflow

This workbench is itself a demonstration of AI-augmented development methodology. Several AI systems collaborate under human supervision to build the platform; the diagram shows the three core roles:

┌─────────────────────────────────────────────────────────────┐
│                   Human Supervision                         │
│           PharmD · BCPS · BCCCP · 18 years ICU             │
│  Clinical judgment · Architectural decisions · QA sign-off  │
└────────────┬──────────────┬──────────────────┬─────────────┘
             │              │                  │
    ┌────────▼─────┐ ┌──────▼──────┐  ┌───────▼────────┐
    │  Claude Code  │ │   Hermes    │  │  Gemma 4 26B   │
    │  (Architect)  │ │(Implementor)│  │  (Reasoner)    │
    │               │ │             │  │                │
    │ Architecture  │ │  Module     │  │ Clinical       │
    │ Task specs    │ │  implement- │  │ interpretation │
    │ Complex fixes │ │  ation      │  │ Signal analysis│
    │ Code review   │ │  Tool calls │  │ MedDRA coding  │
    │ Vault design  │ │  Skills     │  │ Regulatory Q&A │
    └───────────────┘ └─────────────┘  └────────────────┘
             │              │                  │
    ┌────────▼──────────────▼──────────────────▼─────────────┐
    │                   Shared Codebase                       │
    │             ~/pv-workbench/  (this repo)                │
    └─────────────────────────────────────────────────────────┘

Division of AI labor:

Claude Code (Anthropic): Architecture decisions, vault design, complex statistical fixes, task spec authoring, code review. High-level thinking, broad context. Sessions are expensive, used strategically.
Hermes + Gemma 4 (local): Module implementation from Claude's task specs. Tool-calling agent that writes and tests code autonomously using the Hermes skill system. Free to run, handles well-specified implementation tasks.
Gemma 4 26B (Ollama, reasoning): Clinical reasoning at inference time — signal interpretation, MedDRA deliberation, regulatory Q&A. Not used in development, used in production.
Gemini CLI (Google): Code auditing, cross-file consistency checks, large-context document review.

Why this matters: The multi-AI workflow demonstrates that a solo consultant can maintain a production-grade clinical AI platform with near-zero cloud costs by using each AI system for what it does best. Claude's architectural judgment × Hermes' implementation throughput × Gemma's local reasoning × human clinical expertise = a system that would require a full engineering team to build traditionally.

This is the methodology, not just the tool.

About

Built by a PharmD, BCPS, BCCCP with 18 years of ICU and critical care experience who got tired of waiting for enterprise PV platforms to catch up with what local AI can already do. The clinical judgment layer isn't a guardrail bolted on — it's the reason the system exists.

Stack philosophy: Local-first. No cloud dependencies for core function. Data stays on-machine. The 26B reasoning model runs on consumer hardware (RTX 5060 Ti) and outperforms cloud-hosted GPT-3.5-class models on structured clinical PV tasks.

All AI outputs are drafts. Clinical and regulatory determinations require qualified human review.
