New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for an update of the recommended style in programming Coq. #267
Conversation
This is an excellent idea, for instance for engineers joining the development. One thing that comes to my mind: the document talks a bit about match on one line line. Maybe that should be discouraged, because it is not robust to insertion of new constructors. That's a minor point anyway. I believe one of the motivation for avoiding OCaml's polymorphic equality is performance. Sometimes, fixing the types of arguments makes the compiler use more efficient equality functions. Typically, let f x y = Int.equal x y will be much faster than let f x y = x = y (of course, it's an over-simplified example). Another argument is that polymorphic equality breaks parametricity. |
I don't buy your argument about discouraging match on one line. If ever one eventually needs an extra constructor, one can just reindent the code, the same way as e.g. "let in" line which becomes too long would be split over several lines. But maybe you have a specific example in mind which would motivate your view. Equality: I did not know. If I understand correctly, this means that OCaml does not implement some form of "type class" selection of the appropriate equality when the type is known? [I thought it did it for float, already?] In any case, constraining oneself to choose the equality by hand is somehow a burden, and you have to know that you have to do it, but I guess the issue is well-known from OCaml developers. Btw, I added a file api.txt to start listing some recommendations on using correctly the API. |
welcome if it can make comments more readable (then | ||
"toggle-enable-multibyte-characters" can help when using the | ||
debugger in emacs) | ||
- all of initial "open File", or of small-scope File.(...), or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is that the typical litany of "open" at the beginning of the file is bad in OCaml, since the order is relevant. So one shoult limit that to what replaces/augments the standard prelude (cUtill).
What I found to be reasonable in Matita is to start with
module R = WhateverReduction
module TC = TypeSomethingChecker
... R.whd ... TC.typeof ...
i.e. shorten names. The eventually open these modules locally. Such practice serves a documentation purpose (in the "header" you see the dependencies) and you can still get compact code (repeating long module names).
Such practice can only be applied if the number of modules one has to use is limited. Coq has too many little modules, so an API refactoring is somewhat needed to make my proposal applicable.
I think it is a very good initiative. I've a simple remark that I attached to the patch. I have a more general one on the "multiple choices" on naming. Pick "evd" or "sigma" but then always use that everywhere. A newcomer seeing two different names thinks they are different things! To me such document has to go in tandem with an equivalent document on files (file names, files size, file place). Well, file names are non uniform. Given That OCaml forces you to start with a capital, we should probably go toward CamlCaseModuleNames. We have too many little files and this complicates navigating the sources. What I found to work reasonably well is to use modules inside files to separate related but different functions. E.g. what Matej did for the context/Rel/Named. Also "proxy files", ie mapping almost 1:1 two APIs are quite annoying. The can be a valuable transition strategy, in the spirit of a CompatV8X module, but they should not be part of the current sources. |
About ";;": I fully agree, this is the best way to find the origin of a syntax error of the "unbalanced" style. In term of style, Coq is uniform here and I don't feel a change of trend coming, so I would recommend keeping things as they are, reminding however that the ";;" trick is indeed useful for debugging syntax errors. About file names, the trend is apparently changing ( About the Matita convention, I know it from Claudio who used it extensively in the xml plugin, but I must confess that it was for me more a source of noise in reading his code than an help. To me, explicit qualification is useful when the name (or 1 or 2-letter code of the name) of the file gives information about what the function does (List.map style), but otherwise heavy. Anyway, there is no accounting for tastes, so this is why I would recommend to rather move the focus on finding a good balance between uniformity of style, some reasonable stability over time, as well as developers feeling globally comfortable with the conventions adopted. |
It does indeed for base types such as integers or constant constructors, but it is fragile in the sense that it is easy to loose monomorphisation when reordering code (e.g. by inadvertently creating a let-binding that is polymorphic). Furthermore, as Maxime said, it breaks parametricity. People writing code such as
instead of
probably deserve to die in a horrible well-deserved suffering (and I did see such instances in the Coq codebase). Structural equality should never be used on an abstract type because it may not be the relevant one, and nothing in the type-checker makes this robust to code change. So until we get modular implicits, I think we should discourage the use of Pervasives.(=). |
0302c0e
to
b1cf452
Compare
Thanks for the explanation, I made an update. |
Would
be supported? I perfectly know the reasons why = is bad in OCaml, but I find the infix syntax way better. |
The real question is, to me, what would be easier to port to whatever form of correct = OCaml will provide. To me, Map.(x = y) would require just the erasure of Map, and is both readable today and will stay readable later. |
We could do this indeed. Note that I believe it is still better to use As of today, there is still a problem though, which is that it is easy to shoot yourself in the foot because |
@gares: I don't know the opinion of Matthieu and others, but if you are ready to implement a uniform naming scheme of ML files, together with a sed script to allow an easy migration, this would probably be a good way not only to rely on a good basis for further extensions, but also an easier path for explaining the code. |
8087682
to
8c3272c
Compare
A minor comment I have is how does the recommended coding style interact with popular Ocaml development tools such as Another issue on style in the current codebase is the presence of tabs. IMO tab use should be deprecated and even cleaned up. |
8c3272c
to
6084feb
Compare
Starting listing some recommendations in using the API.
6084feb
to
6eefbf2
Compare
@ejgallego: when Jean-Christophe started Coq V7, he was using the default configuration of tuareg with the only modification "(setq tuareg-in-indent 2)". If my memories are correct, the default configuration of tuareg evolved, and, as a consequences, new indentation styles arrived in the code! The document lists a few settings of tuareg which were relevant in 2008. I don't know if it they are still relevant for current tuareg. If you could provide an up-to-date list of settings to apply to tuareg or ocp, I guess that that would be useful. About TAB, I guess they were inserted automatically by tuareg. Your point is that they may cause trouble with diff tools, or with early versions of merlin, etc. is that correct? Personally, I don't care if someone wants to replace all TABS with sequences of spaces, as we removed all spaces preceding newlines at some time a few years ago. Additionally, we should also configurate our tools to avoid they come back inadvertently. |
I think a major point would be to write self documenting code. For instance, by using records instead of tuples. No function should take an argument of type bool * bool * bool. Likewise, arguments of type int or bool should carry a label. Avoid e.g. a function of type int -> int -> constr -> constr. In general, the type of functions should give a rough idea of the flow of data manipulated by the function. The other day, while looking for a bug, I saw for the first time: set_var_scope : Loc.t -> Names.Id.Map.key -> bool -> intern_env -> (bool ref * A map that contains refs that contain a ref on a boolean, a ref on an option of an option of something and a list of the same thing. I can't get even a vague idea of what this is here for. In practice, I resigned and moved on to the next bug. |
Maybe I should mention that I'm currently preparing a PR that turns some tuples into records, so that we don't do the work twice. |
@herbelin In my limited experience the default tuareg+merlin+ocp setup seems fairly close to the current style, and won't insert tabs anymore. Given history, it could make sense indeed to recommend tuareg+merlin+ocp defaults then. |
Also, generic union types (Interface.union, CSig.union and Util.union) should generally be avoided because they don't document what the alternatives mean. |
@maximedenes, @ejgallego: all this looks very reasonable methodology to me! |
instantiated on a particular type (e.g. use List.mem_f, | ||
List.assoc_f, rather than List.mem, List.assoc, etc, unless it is | ||
absolutely clear that "=" will implement the intended equality, and | ||
with the right complexity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, using Pervasives.(=)
is a good way to express the intent to use polymorphic =
after careful consideration. (Another option would be for Coq to export it under another name.) It is also robust to the technique of using let (=) = ()
at the beginning of a file to make sure to find all potentially-dangerous uses of polymorphic equality, turned into compilation failures -- I think that @ppedrot used it in the past.
- Common OCaml pitfalls | ||
- in "match ... with Case1 -> try ... with ... -> ... | Case2 -> ...", or in | ||
"match ... with Case1 -> match ... with SubCase -> ... | Case2 -> ...", or in | ||
parentheses are needed around the "try" and the inner "match" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: using begin .. end
instead of parentheses for nested matches is a good way to have a style close to Coq's grammar (which is superior to OCaml's in this case), especially if begin match
are kept together.
begin match ...
| ...
end
(Whether or not to update the existing codebase to harmonize the indentation style is another choice.) |
Just my 2 cents: In the Elm community, people are going crazy about a tool called elm-format, which enforces very strict formatting guidelines. It does not leave much choice to the developer and is applied at file saving so what people really like about it is: 1) not having to even think about formatting; 2) clean diffs. I don't know much the various tools that exist for OCaml so I have no idea how much coding style could be automatically enforced this way but that could be something to aim at. |
@gashe that sounds great! I guess if we set some rules we should start by implementing them on the existing codebase ourselves no? |
So I just tried to see whether I could come up with a set of For reference, the various knobs provided by ocp-indent are explained in its own configuration file: https://github.com/OCamlPro/ocp-indent/blob/master/.ocp-indent (We could easily implement more if required.) |
Thats very useful, thanks @gasche . Indeed there seems to be many different indentation styles on the codebase. IMVHO, I would agree on a common ocp-indent set of rules and suggest a soft policy on indentation. By the way, this will be discussed in today's Coq WG (sorry for the late notice) see coq-dev for details. |
This PR seems to have unrelated commits. @herbelin can you filter them out? Could we merge it? We can still add more info on the recommended style later. |
Rebase done wrt v8.6, but cherry-picking or rebasing to trunk should be immediate. Note that there were comments on using ocp, but I don't feel able to write on that so I would prefer someone else to do something about this. So, as far as I'm concerned, I consider I'm done about what I could do. |
Merged in trunk, thanks a lot! |
Post-merge comment, maybe it can be added in another PR. This came up in some code awhile ago, IIRC. Would a good style rule be, don't rebind the same identifier, that is, don't write code like:
because if you're looking at |
@psteckler doesn't your IDE tell you that? In fact I think it is often a very good practice to do it, when the two let evd = .... in
let evd = .... in
... I would even strongly suggest to never use the other style in these cases: let evd = .... in
let evd' = .... in
...
let evd'''''' = .... in
...
`` |
Indeed it is sometimes useful to alias bindings, however, I am not sure if this is a good idea at the module toplevel, which the codebase does quite often. I find it confusing when reading code to have to versions of function |
Sure, Merlin tells me via Also, if you have several rebindings, and you remove one in the middle, using distinct names will generate an error, which you might miss otherwise. |
In the late 90s, Jean-Christophe Filliâtre wrote a small document with writing guidelines. This is a proposal for an update of this document covering: