New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression proving with Coq: past, present, and future #9262

Open
palmskog opened this Issue Dec 21, 2018 · 1 comment

Comments

Projects
None yet
2 participants
@palmskog
Copy link
Contributor

palmskog commented Dec 21, 2018

In ongoing work, we are investigating regression proving techniques for Coq. Regression proving is concerned with re-checking affected proofs after a change has been made to a verification project.

Our main goal is to increase the productivity of proof engineers in large scale projects, where a short "edit-check cycle" is critical. Inspiration comes primarily from regression testing research in software engineering: we consider Coq proofs analogous to program tests. In particular, we assume engineers make generous use of CI and other automation as in software projects.

Here is an attempt at a summary of the current state of our work and ideas for future improvements. Many of these improvements require changes in Coq itself.

Background

Our starting point is the document-oriented model and underlying proof processing architecture introduced in Coq 8.5. Specifically, the .vio file format introduced in that version supported two key features for the batch processing toolchain:

  1. asynchronous checking of a single opaque proof in a file (e.g., coqc -check-vio-tasks 1 A)
  2. fine-grained proof checking parallelism (e.g., coqc -schedule-vio-checking 3 A.vio B.vio)

These two features mirror the possibilities in software batch test execution, where individual tests can be run in isolation, and where test execution can be parallelized both at the test method and test class level.

iCoq: sequential regression proof selection

Regression test selection techniques for Java-like languages track dependencies among classes and methods and use this information to perform change impact analysis when a project is modified.
Regression test selection tools are typically run in CI and persist dependency metadata between runs.

iCoq is our analogue of Java regression test selection tools for Coq and builds on asynchronous checking of opaque proofs. It analyzes dependencies between definitions and lemmas, and finds and checks (only) impacted proofs. It can be run either locally or in CI.

iCoq implementation highlights:

  • plugin for dependency extraction from .vio files using the approach from dpdgraph
  • plugin for smart checksumming of elaborated Coq terms ("AST fingerprinting")
  • modification of asynchronous proof checking to print proof term dependencies (like dpdgraph)

See an example of how iCoq works.

Highlights from empirical study on version histories of large projects:

  • up to 10x speedup in CI environment compared to checking from scratch
  • up to 3x speedup on local machines compared to coq_makefile

Main limitations of iCoq:

  • only supports Coq 8.5 (slow proof checking)
  • no analysis of custom tactics or tracking of tactic dependencies in proofs
  • no support for parameterized modules
  • brittle parsing of .v files to (1) extract proof tasks and (2) compute checksums of proof scripts
  • dependency graph analysis and command generation implemented using Java rather than OCaml
  • no checking of universe consistency

Possible improvements:

  • analysis of tactic dependencies
  • tracking all tactics used by proofs
  • hook into asynchronous proof checking without modifying Coq itself for dependency extraction
  • interface directly with .vio files to extract proof tasks and their kernel names, and proof script line numbers for checksumming
  • compile unimpacted, unchanged files to .vos instead of .vio (since .vio overhead not needed there)
  • implement smart checksumming using Template-Coq
  • no need to dump dependencies for proofs produced from unchanged proof scripts with no tactic behavior changes

piCoq: parallel regression proving

iCoq performed only sequential checking. piCoq extends iCoq to parallel checking and includes classical modes for incremental, file-level parallel checking (similar to coq_makefile). We included more histories of large projects for comparison.

Implementation highlights:

  • implemented essentially all combinations of parallelism and selection, according to schema below
  • introduced coqc option for parallel checking of proofs spread out over many files (generalization of -schedule-vio-checking)
No selection File selection Proof selection
File parallelism
  • .vo
  • .vo
  • .vo
Proof parallelism
  • .vio
  • .vio
  • .vio

Highlights of empirical study on version histories of large projects (including Verdi, Flocq, and Coquelicot):

  • several projects had very restrictive file dependency graphs which made file parallelization slow (manual reorganization could probably fix this, though)
  • proof selection combined with proof parallelization was always faster than other modes, for all projects
    • up to 28x speedup compared to sequential checking from scratch
    • up to 3x speedup compared to sequential proof selection (iCoq)

Possible improvements:

  • smarter checksumming of .v files, ignoring comments and noise (via SerAPI?)
  • use .vos and .vok whenever checking all proofs in a file
  • compile unimpacted, unchanged files to .vos instead of .vio (since .vio overhead not needed there)

Suggestions towards future regression proving support

  • efficient regression proving depends crucially on both 1 and 2 above; the new vos/vok toolchain leaves out the former and implements a restricted form of the latter, so we hope it can be extended to support the missing functionality (#8783, #8642, #8767)
  • documentation of -check-vio-tasks option for coqc in manual
  • external tools should be able to control fine-grained proof scheduling, or at least provide their own data to improve scheduling (instead of only relying on .aux files)
  • a dependency extraction plugin should ideally be bundled with Coq
  • asynchronous proving should be able to use elaborated kernel names (e.g., Lib.Mod.my_proof) instead of task numbers
  • need a simple API to extract metadata from .vos/.vio files
  • possibility to "fill in" proofs in .vos/.vio files (tasks are "holes")
  • possibility for more sophisticated debugging output when checking proofs asynchronously on demand, e.g., coqc -verbose -check-vio-tasks 1 A shows name of lemma and statistics
@palmskog

This comment has been minimized.

Copy link
Contributor

palmskog commented Dec 21, 2018

@gares @charguer let me know what you think and if you have questions or additions. Thus far, I feel our work has mostly been about exploring feasibility. But with help from Coq developers, we believe regression proving tools could become robust enough to be adopted by large projects, especially in their CI.

We will also try to prepare specific API suggestions that can be discussed at CoqPL.

cc: @ejgallego

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment