Skip to content

chaoxu/coverify

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Coverify

Coverify is a command-line harness for using LLMs and other proof backends on a mathematical knowledge base without losing source grounding, verification, or durable project state. It is designed to work with Cosheaf, and uses Coflat as the mathematical document format.

In practice, Coverify does four things:

  1. Takes a user question plus a bounded source bundle from a Cosheaf workspace or local directory.
  2. Prepares the relevant context, preferably with an agentic gatherer that can inspect the allowed files directly.
  3. Calls a reasoner or verifier backend and records what was asked, what came back, and which sources were used.
  4. Returns a checked answer, or publishes it as a Cosheaf issue comment, review, PR artifact, or knowledge-base update.

The useful loop is:

question + thread + source bundle
  -> prepare relevant context
  -> call a reasoner
  -> verify the candidate answer
  -> return or publish the checked result

Coverify is for workflows such as:

  • answering a branch-scoped chat question using only the current project files
  • asking a strong model for a proof attempt while recording an audit trail
  • verifying whether a proposed argument is source-backed and gap-free enough to publish
  • turning useful model output into a reviewed Cosheaf PR or comment
  • evaluating whether context preparation found the passages a task actually needed

Coverify deliberately does not:

  • prove theorems by itself
  • replace a formal verifier or human mathematical review
  • own the wiki, chat UI, issue tracker, or PR system
  • keep hidden long-term project memory outside Cosheaf
  • browse the web or read unrelated local files by default
  • ship project-specific research tools in the public CLI

Principles

  • Durable mathematical state lives in Cosheaf.
  • Local files are tools, prompts, tests, and reproducibility scripts.
  • Python exposes stable mechanisms: source bundles, backend calls, audit bundles, schema checks, path/range validation, and verifier gates.
  • Agents and oracles handle judgment: context selection, route choice, mathematical reasoning, and correctness review.
  • Deterministic code is added only when the behavior is stable, replayable, and simpler than asking an agent to inspect the allowed context.

Install

python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e .
coverify --help

From a checkout without installing:

PYTHONPATH=src python3 -m coverify --help

Main Commands

  • ask-oracle: send one prompt to a configured backend and record an audit bundle.
  • repo-oracle ask: answer against a local source bundle.
  • repo-oracle gather: inspect which source passages the gatherer selects.
  • repo-oracle eval-gather: run JSONL checks for context-preparation quality.
  • chat ask: create or append to a branch-scoped chat issue and answer it.
  • chat-reply: read a Cosheaf issue thread, run the oracle, and post a reply.
  • run-eval: run JSONL eval cases against a fixture, script, or enabled Codex backend.
  • seed-research-evals: seed selected research eval candidates into a Cosheaf workspace.
  • Cosheaf primitives: tree, read-file, write-file, delete-file, create-branch, issue commands, PR commands, reviews, and merge.

Examples

Direct oracle call:

PYTHONPATH=src python3 -m coverify ask-oracle \
  --prompt "Give a concise proof attempt for the current lemma." \
  --json

Self-verifying fixture smoke:

PYTHONPATH=src python3 -m coverify ask-oracle \
  --backend verifying \
  --verify-inner-backend fixture \
  --prompt "Prove that there are infinitely many prime numbers." \
  --json

Run against a local source bundle:

PYTHONPATH=src python3 -m coverify repo-oracle ask \
  --source-bundle /path/to/source-bundle \
  --message "What does the current branch prove?" \
  --json

Evaluate gather quality:

PYTHONPATH=src python3 -m coverify repo-oracle eval-gather \
  --source-bundle /path/to/source-bundle \
  --cases evals/gather/sample-math-workspace.jsonl

Reply to a Cosheaf issue chat:

PYTHONPATH=src python3 -m coverify chat-reply \
  --workspace my-workspace \
  --issue 23 \
  --backend verifying

Backend Configuration

Common environment variables:

COSHEAF_API_URL=http://localhost:3030/api/v1
COSHEAF_TOKEN=<token>
COSHEAF_USERNAME=<username>
COSHEAF_PASSWORD=<password>
COSHEAF_REVIEW_TOKEN=<review-token>
COVERIFY_BACKEND=codex
COVERIFY_CODEX_MODEL=gpt-5.5
COVERIFY_CODEX_REASONING_EFFORT=xhigh
CHATGPT_CLI=chatgpt-cli
COVERIFY_CHATGPT_TIMEOUT_SECONDS=6000

Backends share the same basic contract: prompt in, answer out, with audit metadata recorded locally. The current backend surfaces include fixture, script, Codex, ChatGPT CLI adapter, QED adapter, and the composite verifying backend.

Skills

Coverify ships repo-owned Codex skills in skills. They are the preferred operational interface for longer runs:

  • context building
  • exploration planning
  • proof attempts
  • KB writing
  • PR review
  • KB management
  • lightweight run loop

Install or verify the skills:

python3 scripts/link_skills.py
python3 scripts/link_skills.py --check

Documents

Prompt files under docs/prompts are compatibility references for older flows. Prefer the skills for new operational work.

Checks

python3 scripts/check_skills.py
python3 scripts/link_skills.py --check
PYTHONPATH=src python3 -m coverify --help
PYTHONPATH=src python3 -m unittest discover -s tests

GitHub Actions runs the skill manifest check, installed CLI help, Python syntax compile, and the unit test suite.

About

CLI harness for LLM-backed mathematical proof work over Cosheaf

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages