
Litplan

Litplan turns a research paper into structured document data, a checked and versioned PipelineSpec, optional approval records, and either a reference execution path or workflow files for your own runtime. It is the governed-plan layer, not a replacement for Snakemake, Nextflow, Airflow, or your HPC stack.

Start Here

Prerequisites

  • Python 3.11+
  • uv
  • Docker, only if you want live PDF ingest via GROBID
  • An LLM API key, only if you want live goal-driven planning

Run these commands from the repository root:

uv sync
cp .env.example .env
uv run litplan environment
uv run litplan ingest fixtures/tei/structural_minimal.tei.xml \
  --parser-route tei_fixture_replay \
  --document-ir

What to expect:

  • uv sync installs the package and CLI from uv.lock.
  • cp .env.example .env gives you a local config file; it defaults to LITPLAN_LLM_PROVIDER=stub, so this path stays offline.
  • uv run litplan environment prints JSON hints about paths and config resolution.
  • The offline ingest command should return JSON with "ok": true and inline DocumentIR output.

Choose Your Path

  • Offline demo — use it when you want a first successful run with no external services. Needs: uv. First step: run the tei_fixture_replay ingest command above.
  • Live PDF ingest — use it when you want to parse a real PDF through GROBID. Needs: Docker + GROBID. First step: docker compose -f docker-compose.grobid.yml up -d.
  • LLM planning — use it when you want plan-compile to draft a pipeline from a goal. Needs: a real provider in .env plus a matching API key. First step: set LITPLAN_LLM_PROVIDER and the provider key in .env.
  • Workflow export — use it when you want Snakemake, Nextflow, or Airflow files from a stored run. Needs: a persisted run in the local DB. First step: compile with --create-run, then approve, then export-workflow.

Fastest Successful Runs

1. Offline ingest demo

This is the fastest self-contained path and works in a fresh checkout after uv sync.

uv run litplan ingest fixtures/tei/structural_minimal.tei.xml \
  --parser-route tei_fixture_replay \
  --document-ir

Expected outcome:

  • You should get JSON output with "ok": true.
  • You should get inline DocumentIR JSON.
  • No Docker, database, or API key is required.
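Every CLI result above is a JSON envelope with a top-level "ok" flag. If you script around the CLI, a minimal success check can be sketched as follows; the only field assumed here is the documented "ok" boolean:

```python
import json


def ingest_succeeded(raw_output: str) -> bool:
    """Return True when a litplan JSON response reports success.

    Assumes only the documented envelope field: a top-level "ok" boolean.
    Anything unparseable or missing "ok" counts as failure.
    """
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return payload.get("ok") is True
```

For example, capture the ingest command's stdout to a file and pass its contents to this function before moving on to later steps.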

2. Persist a local run

Use this when you want approvals, execution, status, or export commands to work against the local SQLite store.

export LITPLAN_HOME="$(pwd)/.litplan-home"
uv run alembic upgrade head
RUN_ID="examples-linear-chain-run-$(date +%s)"
uv run litplan compile fixtures/workflowgen/pipeline_specs/linear_chain.json \
  --pipeline-id examples.linear_chain \
  --lockfile uv.lock \
  --run-id "$RUN_ID" \
  --create-run

Expected outcome:

  • This creates .litplan-home/ and initializes the SQLite schema there.
  • The compile command should return "ok": true.
  • The run and compiled revision are now persisted for later approve, run-status, execute, and export commands.
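The shell snippet above derives a unique RUN_ID from the current epoch via $(date +%s). If you drive compilation from Python instead, the same pattern is a one-liner; the prefix is just an illustrative convention, not required by litplan:

```python
import time


def make_run_id(prefix: str = "examples-linear-chain-run") -> str:
    """Mirror the shell pattern PREFIX-$(date +%s): prefix plus epoch seconds."""
    return f"{prefix}-{int(time.time())}"
```

Epoch-second suffixes are unique enough for manual runs; for concurrent automation you may prefer a UUID suffix.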

3. Live PDF ingest with GROBID

If PDF ingest is the feature you care about, this is the shortest working setup.

Start GROBID:

docker compose -f docker-compose.grobid.yml up -d
curl -fsSL http://127.0.0.1:8070/api/isalive

Then set LITPLAN_GROBID_URL=http://127.0.0.1:8070 in .env and run:

uv run litplan ingest \
  fixtures/papers/pdf/batatia-2022-mace-force-fields-arxiv-2206.07697v2.pdf \
  --document-ir

Expected outcome:

  • The health check should print an alive response from GROBID.
  • The ingest command should return "ok": true.
  • You should get parsed document metadata and, with --document-ir, inline DocumentIR JSON for the PDF.
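GROBID can take a little while to come up after docker compose starts it, so a single curl may fail on the first try. A small polling helper, sketched here with an injectable probe so the wait logic stays testable, covers that gap:

```python
import time


def wait_until_alive(probe, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call probe() until it returns truthy or attempts are exhausted."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            time.sleep(delay)
    return False
```

In practice, probe would wrap an HTTP GET of http://127.0.0.1:8070/api/isalive (the same endpoint the curl health check hits) and return True on a 200 response.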

4. Live LLM planning

For this path, change .env from the default stub provider to a real provider and add its key. Example:

LITPLAN_LLM_PROVIDER=gemini
GOOGLE_API_KEY=your-real-key

Then run:

uv run litplan plan-compile "build linear noop pipeline" \
  --pipeline-id examples.goal_demo \
  --lockfile uv.lock \
  --expanded-spec

Expected outcome:

  • The command should return "ok": true when the provider is configured correctly.
  • You should get a compiled pipeline result and expanded spec JSON.
  • Use compile instead of plan-compile when you already have full PipelineSpec JSON and do not want an LLM involved.
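The stub-versus-live switch is just an environment lookup with an offline default, as .env.example sets LITPLAN_LLM_PROVIDER=stub. A minimal sketch of that resolution logic (the function name is illustrative, not litplan's API):

```python
def resolve_provider(env: dict) -> str:
    """Pick the LLM provider name; the default mirrors .env.example's offline stub."""
    return env.get("LITPLAN_LLM_PROVIDER", "stub")
```

Passing os.environ gives the live behavior; passing a plain dict keeps tests hermetic.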

5. Workflow export

Once you have a persisted run, you can approve it and generate engine-native workflow files.

uv run litplan approve "$RUN_ID"
uv run litplan export-workflow "$RUN_ID" snakemake ./tmp/workflow-out

Expected outcome:

  • approve records a non-pending run status for RUN_ID.
  • export-workflow writes files under ./tmp/workflow-out.
  • You can switch snakemake to nextflow or airflow.
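When wrapping export-workflow in a script, it is worth validating the engine argument early against the three engines this README lists, rather than letting a typo surface later. A sketch:

```python
SUPPORTED_ENGINES = {"snakemake", "nextflow", "airflow"}


def check_engine(engine: str) -> str:
    """Normalize and validate the export target against the documented engines."""
    name = engine.strip().lower()
    if name not in SUPPORTED_ENGINES:
        raise ValueError(
            f"unsupported engine: {engine!r}; choose one of {sorted(SUPPORTED_ENGINES)}"
        )
    return name
```

The lowercase normalization is a convenience assumption for scripting; the CLI itself expects the engine names as shown above.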

Main Commands

  • litplan ingest — parse a PDF or TEI file into DocumentIR; can include inline JSON output.
  • litplan compile — validate and compile a PipelineSpec JSON input.
  • litplan plan-compile — use a goal-driven planner, then run the same compiler.
  • litplan approve — record approval or other status transitions for a run.
  • litplan execute — execute a persisted approved run through the reference Prefect path.
  • litplan run-status — show unified run, planning, and execution status.
  • litplan export-audit-bundle — write a reproducibility-oriented bundle for a run.
  • litplan export-workflow — generate Snakemake, Nextflow, or Airflow files from a stored revision.
  • litplan environment — print non-secret environment and path hints.

Run uv run litplan --help for the full CLI and option details.

Project Surfaces

  • CLI: uv run litplan
  • MCP server: uv run litplan-mcp
  • Python library: import litplan
  • Streamlit UI (optional): uv run streamlit run src/litplan/ui/plan_draft_app.py

The CLI and MCP server call the same shared JSON tool layer, so they expose the same core operations.

Streamlit UI

The optional Streamlit app is a browser UI on top of the same SQLite project database. It lets you:

  • edit PlanDraft text and compare it to a prior draft,
  • append approval records,
  • append hallucination flags with chunk provenance,
  • load unified run status (LangGraph checkpoints, run timeline, execute-phase node checkpoints, OpenTelemetry hints).

Ingest, compile, plan-compile, approve, execute, export, and other pipeline steps still run from the CLI or MCP; the UI does not replace those tools.

(Screenshot: Litplan Streamlit PlanDraft UI)

From the repository root, after uv sync and uv run alembic upgrade head for your chosen LITPLAN_HOME:

export LITPLAN_HOME="$(pwd)/.litplan-home"
uv run streamlit run src/litplan/ui/plan_draft_app.py

For a full step-by-step (including diff panels, approvals, flags, and run explorer), see Part 3 in MANUAL_CLI_MCP_UI_WORKFLOW.md.

Examples

Library-first examples live in examples/:

  • examples/paper_repro_sketch_tei.py: saved TEI -> DocumentIR -> chunking -> hand-authored PipelineSpec -> compile
  • examples/paper_repro_sketch_pdf.py: PDF path -> live GROBID ingest -> the same chunk/plan/compile flow

Run them from the repository root:

uv run python examples/paper_repro_sketch_tei.py
uv run python examples/paper_repro_sketch_pdf.py \
  fixtures/papers/pdf/batatia-2022-mace-force-fields-arxiv-2206.07697v2.pdf

What To Expect

  • The package version is 0.0.1 (see pyproject.toml); pre-1.0 semver releases may still evolve public APIs.
  • Offline and stub-backed flows remain supported for local development and tests.
  • The bundled execute path demonstrates orchestration, retries, checkpoints, and bookkeeping; use your own workflow engine or cluster stack for production-scale execution.
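To make the "retries" point concrete: the idea is simply that a failing step is re-attempted a bounded number of times before the run is marked failed. The following is an illustrative sketch of that pattern only, not the mechanism the bundled Prefect path actually uses:

```python
def run_with_retries(task, max_attempts: int = 3):
    """Call task() up to max_attempts times, re-raising the last failure.

    Illustrative retry pattern; real orchestrators add backoff,
    checkpointing, and persistent state on top of this core loop.
    """
    last_exc = None
    for _attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:  # noqa: BLE001 - demo catches everything
            last_exc = exc
    raise last_exc
```

Production engines layer backoff, jitter, and durable checkpoints onto this loop, which is exactly why the README recommends your own workflow engine for production-scale execution.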
