A few months ago I found out about OpenTrials.net, a project by Prof. Ben Goldacre. It aimed to "locate, match, and share all publicly accessible data and documents, on all trials conducted, on all medicines and other treatments, globally".

This edition is a show of appreciation for Prof. Goldacre's previous work.

GregoryAi is a modest answer to the same problem. This version makes a number of improvements to the way we fetch clinical trials from the world's top three registries. We now rely more on registry identifiers to ensure the data is sound. The tradeoff is that a trial listed in two or more registries may now show up as a few duplicate records.

Subscribers to Brain-Regeneration alerts may receive some duplicates as a result. I am working on this problem, starting by keeping a chronological record of the raw data so it can be analysed in more detail.

Gregory AI v25

Range: v24 (2026-05-30) → main (2026-06-10). 15 merged PRs, ~113 commits.

Highlights

Clinical trial identity was rebuilt around registry identifiers. Trials with the same title are no longer merged into one record when their registry IDs say they are different studies, and a trial's link no longer flip-flops between sources on every import.
Nothing gets lost anymore: a new links field on trials and articles keeps one URL per source. The main link is set by whichever source arrived first and stays put.
Richer trial data: a dedicated parser for the EU CTIS feed, new fields from the WHO/ICTRP export (acronym, secondary sponsor, results information), and a fix for EU dates that were being read month-first — 8 December was becoming 12 August.
The trials API now exposes every field, adds lookups by registry identifier (NCT, EudraCT, EUCT, CTIS), and exports to Excel.
Categories distinguish manual curation from automatic matching: rebuilds never touch assignments made by a human, and the pipeline only re-categorizes content that changed instead of everything, every time.
The test suite now runs on GitHub Actions on every push, in parallel.

⚠️ Before you upgrade

Three things to know before deploying. Details in the Breaking changes and Upgrade sections below.

API clients: the trial field retrospective_flag is now called prospective_registration (same values, clearer name). Update anything that reads the old name.
Migration 0050 adds and drops indexes on the largest tables — plan a short maintenance window.
Migration 0054 refuses to apply if the database holds real duplicate registry IDs. New commands help you find and merge those duplicates first.

What's new

Trial identity and de-duplication

GregoryAI ingests the same real-world trials from ClinicalTrials.gov, the EU registers, and the WHO portal. Until now, deciding whether two incoming records were "the same trial" leaned too much on the title — which merged distinct studies that shared a title, and let two sources fight over a single trial's link, overwriting each other on every import.

Registry identifiers now lead. A record with a matching title is no longer treated as the same trial when its registry identifiers point to a different study.
The database no longer enforces one globally unique title. Instead, each registry identifier (NCT, EudraCT, EUCT, EUCTR, CTIS) is unique on its own.
A trial's main link is set once, by the first source that reported it. The new links field keeps one URL per source — keyed by registry or hostname — so every source's address is preserved and visible in the admin.
Three new management commands:
- audit_trial_merges — flags historical records that were probably merged wrongly, so you can review them.
- merge_trials — merges confirmed duplicates into one trial, moving all related data before deleting the spares.
- capture_trial_streams — records the raw inbound trial feeds to a file without touching the database, useful for analysing what the registries actually send.
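
The identity and link rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual code: the `Trial` dataclass, `same_trial`, and `record_source_link` are hypothetical names, the real model carries many more fields, and the title fallback used when two records share no registry is a guess at the behaviour rather than something the notes confirm.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Trial:
    # Hypothetical, simplified stand-in for the real Trials model.
    title: str
    identifiers: dict[str, str] = field(default_factory=dict)  # e.g. {"nct": "NCT00000001"}
    link: Optional[str] = None                                  # main link, set once
    links: dict[str, str] = field(default_factory=dict)        # one URL per source


def same_trial(existing: Trial, incoming: Trial) -> bool:
    """Registry identifiers lead; the title is only consulted as a fallback."""
    shared = existing.identifiers.keys() & incoming.identifiers.keys()
    if shared:
        # Both records carry an ID from the same registry: the IDs decide.
        return all(
            existing.identifiers[key].lower() == incoming.identifiers[key].lower()
            for key in shared
        )
    # No registry in common: compare titles (an assumption of this sketch).
    return existing.title.strip().lower() == incoming.title.strip().lower()


def record_source_link(trial: Trial, source: str, url: str) -> None:
    """Keep one URL per source; the first source to arrive sets the main link."""
    trial.links[source] = url
    if trial.link is None:
        trial.link = url
```

With this shape, two records that share a title but carry different NCT numbers stay separate, and a second source adds an entry to `links` without changing `link`.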

Trial ingestion

New parser for the EU CTIS RSS feed extracts far more detail from each entry.
New fields from the WHO/ICTRP export: trial acronym, secondary sponsor, whether results are available, and the plan for sharing individual participant data.
New results_posted field, with proper parsing of results from ClinicalTrials.gov.
Fixed a date bug in the EU feed: dates are day-first (DD/MM/YYYY) but were being parsed month-first, silently shifting dates like 8 December to 12 August.
Feeds no longer overwrite existing data with blanks when a source omits a field.
The WHO importer records proper change history again.
Plain-language labels and help texts throughout the trial admin, with references to where each value comes from.
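
The day-first date fix listed above is easy to reproduce with the standard library. The sketch below is illustrative only: `parse_eu_date` is a hypothetical helper name and the sample value is made up, not taken from a real feed entry.

```python
from datetime import date, datetime


def parse_eu_date(value: str) -> date:
    """Parse a day-first DD/MM/YYYY string."""
    return datetime.strptime(value.strip(), "%d/%m/%Y").date()


raw = "08/12/2025"  # illustrative value

day_first = parse_eu_date(raw)
month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # the old, wrong reading

print(day_first.isoformat())    # 2025-12-08
print(month_first.isoformat())  # 2025-08-12
```

A month-first reading only goes unnoticed when the day is 12 or lower. A value such as 25/12/2025 raises a `ValueError` under `%m/%d/%Y`, which is why this kind of bug shifts some dates silently instead of failing on every record.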

Trials API and exports

The API now returns every trial field, including all the new ones.
New filters to look up trials by registry identifier: ?identifiers= matches across all registries at once, or scope to one with ?nct=, ?eudract=, ?euct=, ?ctis=. All accept comma-separated lists, match case-insensitively, and are backed by new database indexes.
Filter by ?acronym= and ?has_results=true (results posted, a results date, a results link, or "results available: yes"). Acronyms are now populated from all three sources: the WHO/ICTRP export, the live ClinicalTrials.gov feed (captured from this release onwards), and a one-time backfill_trial_acronyms command that fills historical CTGov rows from the registry API — idempotent and safe to rerun.
New export_trials_xlsx command produces an Excel workbook with one sheet per subject, scoped to a team.
Trial CSV downloads now stream like article CSVs, so large exports no longer time out.
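
As a rough client-side illustration of the identifier filters described above, the sketch below builds query URLs with comma-separated lists. The base URL and path are placeholders, since these notes do not state the endpoint, and `trials_url` is a hypothetical helper. Only the parameter names (`nct`, `acronym`, `has_results`) come from the release notes.

```python
from urllib.parse import urlencode

# Placeholder base URL and path; substitute your own instance's trials endpoint.
BASE_URL = "https://example.org/trials/"


def trials_url(**filters) -> str:
    """Build a trials query URL; list values are joined with commas."""
    params = {}
    for name, value in filters.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        params[name] = value
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


print(trials_url(nct=["NCT00000001", "NCT00000002"]))
# https://example.org/trials/?nct=NCT00000001,NCT00000002
print(trials_url(acronym="EXAMPLE", has_results="true"))
# https://example.org/trials/?acronym=EXAMPLE&has_results=true
```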

Articles and subjects

Articles get the same links treatment as trials: the canonical link is whichever source arrived first, and every other source URL is kept in the new links field.
Subjects now keep edit history.

Ops, settings, and CI

New GitHub Actions workflow runs the test suite on every push, in parallel, with faster Docker image builds.
DEBUG is now driven by the DJANGO_DEBUG environment variable and defaults to off. The container picks the right server automatically: gunicorn in production, Django's dev server when debugging.
Database optimization (migration 0050): adds indexes to the hot paths on Articles and Trials and drops redundant ones, with a step-by-step production runbook.
The admin bulk action "Disable all emails" now also unsubscribes the person from every list, so the global flag and per-list subscriptions can no longer drift apart.
New prepare_v24_upgrade helper for anyone still upgrading from v23, and the v24 release documents moved to docs/releases/v24/.
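
For readers wiring up their own settings, one common way to drive a flag like DEBUG from an environment variable with a default of off looks like this. It is a sketch, not the project's settings code: `env_flag` is a hypothetical helper, and the accepted spellings ("1", "true", "yes", "on") are an assumption.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment.

    An unset variable yields `default`. A set variable is True only if it
    matches one of the accepted truthy spellings (an assumption of this sketch).
    """
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


DEBUG = env_flag("DJANGO_DEBUG")  # stays off unless explicitly enabled
```

Defaulting to off means a deployment that forgets to set the variable fails safe, which matches the behaviour described in these notes.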

⚠️ Breaking changes

retrospective_flag renamed to prospective_registration on the Trials model and in the API. It is a pure rename — values are unchanged — but any client reading the old field name must update.
Trial title uniqueness replaced by per-registry identifier uniqueness. Migration 0054 checks for real duplicate registry IDs first and fails loudly if it finds any, listing them. Run audit_trial_merges to review and merge_trials to fix, then re-run migrations.
DEBUG defaults to off. Local development setups must set DJANGO_DEBUG=True in .env (see example.env).
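
API clients that may talk to both older and upgraded instances during the transition could read the renamed field defensively. The sketch below assumes a trial arrives as a decoded JSON object (a Python dict). The helper name is hypothetical and the sample values are placeholders, not real field values.

```python
def read_prospective_registration(trial: dict):
    """Return the renamed field, falling back to the pre-v25 name."""
    if "prospective_registration" in trial:
        return trial["prospective_registration"]
    # Older instances still expose the value under the previous name.
    return trial.get("retrospective_flag")


new_style = {"prospective_registration": "placeholder"}
old_style = {"retrospective_flag": "placeholder"}

# Both payloads yield the same value because the rename left values unchanged.
assert read_prospective_registration(new_style) == read_prospective_registration(old_style)
```

Once every instance you query runs v25, the fallback branch can be dropped.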

Upgrade

Back up the database and confirm the dump is restorable.
Add DJANGO_DEBUG to your .env (False in production, True for local development).
Apply migrations during a short maintenance window — migration 0050 rebuilds indexes on the largest tables. See apply_0050_prod_runbook.md.
If migration 0054 fails with duplicate registry IDs, review them with audit_trial_merges, merge with merge_trials, and re-run migrations.
Update API clients that read retrospective_flag to use prospective_registration.
Backfill acronyms for historical ClinicalTrials.gov trials: docker exec gregory python manage.py backfill_trial_acronyms. The command is idempotent — rerunning is safe if interrupted.

v25 The Ben Goldacre edition
