gmalecha edited this page Oct 5, 2012 · 11 revisions

Coq-ext-lib is a project to improve on Coq standard library.

Why a GitHub project for such a library?

Easy to commit to. Central repository (avoid having several different, and incompatible, yet related contributions).


Why use Coq at all?

  • Proving Coq is a proof assistant to develop machine checkable proofs.
  • Programming Coq is a powerful programming language. This library is aimed at both of these goals. Much of it is inspired by Haskell and it makes extensive use of type classes. The proving is done in a slightly more "after-the-fact" style. I don't feel that this is ideal in all instances, but it seems to be desireable from an engineering standpoint since programmatically manipulating proof objects is expensive.

What should be in such a library?

  • We could start with data-structure libraries. A decent replacement to List, ListTheory inspired by what is done in ssreflect seq.
  • However, providing a duplicate of the whole standard library is most certainly a bad idea. We could focus on data-structures (and algorithms) that are often handy in programing languages research. (Non exhaustive list include: lists, heterogeneous lists, maps, sets...)

What should not be in such a library?

  • The whole standard library

What is the best way to contribute to such a library?

What is the official style?

Quoting Aaron Bohannon, from Coq-club:

As a prerequisite for such improvement, it seems critical to develop an official, written, agreed-upon style guide that is as thorough and detailed as possible (*). The library code should (obviously) follow the style guide and be a perfect model for how users should write their own code. Furthermore, whenever a choice is made by a library contributor that is not covered by the style guide (perhaps because it is specific to the mathematics of the library), it should really be accompanied by a clearly explained design rationale. In general, anything in a library that is not obvious should be accompanied by comments. Adhering to these concepts would help ensure that we end up with new libraries that are better than what we have now.

(*) The style guide should cover the parts of the Coq code that can be seen when looking through a library's coqdocs, i.e., the Gallina and generic tactic definitions. A style guide for proof scripts would be a whole other can of worms and seems much less urgent.

Following in that thread, Thomas Braibant said:

While I agree that such a style guide is crucial, I think that, at the moment, no one can tell whether, e.g., a partial function should return an option, or a default value. Moreover, writing the guide lines on how to write perfect code can take an infinite amount of time.

Later in that thread, Hugo Herbelin answered:

In order to ease the reviewing of extensions or corrections of existing libraries by users, having guidelines seems indeed a prerequisite. One of the obvious problem is that Coq standard library is an heterogenous collection of works made at different times of the history of Coq when tactics were not necessary very powerful, when the specification language was not necessarily as developed as it is now and when some convenient formalization or proving techniques were not yet "invented".

Nevertheless, I started to make public on Cocorico a set of tentative guidelines that "we" expect users to follow before submitting an addition or a modification to a library. I put the text on the wiki (page http://coq.inria.fr/cocorico/HowToContributeToTheStandardLibrary) so that anyone can discuss the guidelines, add new questions about unclear or missing recommendations, comment about whether such or such guideline is reasonable or not, etc. I hope that at some time, we will have converged to something comprehensive and useful for everyone.

What kind of (backward/forward) compatibility do we want to ensure?

What kind of development model?

Thomas Braibant started the aforementioned coq-club thread saying:

We think that the community could profit from a repository for such extensions, to gather lemmas that could be useful to other people. This raw materials could then be peer-reviewed, improved, and when meeting some quality-levels, moved to a blessed repo of more stable extensions.

From these lemmas, the community could then organize more coherent libraries, but from our point of view, an easy-to-contribute library is much better, and we should aim to a state where every ten lines can be submitted. At this point it must be stressed that we do not want to target whole contribs, only the lemmas that exists "in the wild".

and Guillaume Melquiond answered:

There exist some other initiatives of this kind for other languages and it might be worth looking at them. I will only talk about my own experience with the Boost project. Boost is a C++ repository that was created to account for all the features that did not make it into the original ISO/C++ standard library (named STL in the following). It is also released a bit more often than the STL so that it helps bring fresh blood to the language. (Think one release every 25 years for the STL. That's not a typo, it's really a quarter century.) And finally, and perhaps the most important point for some people, it ensures that the libraries that are ultimately added to the STL have actually been widely tested: a library that would have not gone through Boost would have few chances of being standardized in the STL.

More to the point, Boost is a repository of C++ library. Anybody is free to submit a new library. Informal (heated) discussions follow. A review manager is then nominated and checks whether there are enough interest and whether the library is finished enough. The review wizard sets a review period: one or two weeks of review, depending on the size of the library. Note that no two reviews are scheduled the same week, so that the attention is not diluted. The formal review is public, it happens on the main mailing-list, and anybody is free to contribute a review. At the end of the review period, the review manager takes into account all the reviews (possibly tens of detailed reviews) and summarize them into an official one, stating whether the library is accepted after minor revisions, or major revisions (another review is then scheduled), or rejected.

It looks like a rather heavyweight process, and it is. Much more than an article submission, since you might have to discuss with tens of reviewers for several months on an everyday basis. (In the case of my library, the whole process took about 5 months.) But at the end of the day, the Boost libraries are "perfect". Meaning that they could be/are inserted into the STL without raising an eyebrow, and that their documentation could be/are published into peer-reviewed international journals.

So Boost isn't what you had in mind, since it is aimed at whole contributions rather than small additions. But I think it is worth keeping such a model in mind, since it is rather successful (one million downloads per month, not counting alternate packaging like deb/rpm/..., and repository accesses) and it has a major impact on the language: it is one of the main reasons the language is still widely used.

My point is that the "easy-to-contribute library" might not be the best choice on the long run. I would be interested in examples of cases where it was successful.

Note that the development of the ssreflect library is more like the Boost development model: a coherent library, built over years by a small team.

Also, remark that building such coherent libraries may have a profond impact on the definition of Coq itself. For instance, ssreflect provides a complete and coherent set of tactics that can be used in replacement of usual Coq tactics.

Yet, ssreflect may not be the right compromise to use Coq in a PL context.