Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle single-quotes in arguments to -arg #590

Merged
merged 1 commit into from May 25, 2022

Conversation

Columbus240
Copy link
Contributor

@Columbus240 Columbus240 commented Jun 25, 2021

A haphazard sketch to issue #589. It is not an acceptable final solution, I think

coq_makefile parses _CoqProject files in a slightly weird way. If we write -arg "a b" in _CoqProject then it passes the arguments a and b to coqc, but if we write -arg "'a b'" it passes a b to coqc, as a single argument.

I tried to adapt PG to this behavior. It works for my use-case, but I haven’t tested where the differences in behaviour (between PG and coq_makefile) lie now. A weird fact is, that coq_makefile produces some string, which it stores in Makefile.coq.conf, which later gets processed by make and passed on to coqc. So a "good" solution would try to reproduce the complete processing chain, i.e. first the processing done by coq_makefile, then the processing done by make.

Maybe we could call lib/coqProject_file.mli from elisp and hand over the parsing to the OCaml code of the coq repo.

A solution sketch to issue ProofGeneral#589.

coq_makefile parses _CoqProject files in a slightly weird way. If we
write `-arg "a b"` in _CoqProject then it passes the arguments `a` and
`b` to coqc, but if we write `-arg "'a b'"` it passes `a b` to coqc, as
a single argument.
This commit tries to adapt to this behavior.
@Columbus240 Columbus240 marked this pull request as ready for review July 1, 2021 14:29
@Columbus240
Copy link
Contributor Author

Columbus240 commented Jul 1, 2021

Somehow this seems to work in the cases I want to use. But I expect it doesn’t mirror the behaviour of coq_makefile & make in all cases.
If such cases arise in practice, the algorithm can be changed accordingly.

@Matafou
Copy link
Contributor

Matafou commented Aug 15, 2021

Hi, this seems like a good patch, my main concern is about already written _CoqProject files that would break...
I have the feeling that neither -arg "a b" not -arg "'a b'" is used very much.
I also have the feeling that this patch reduces differences between pg and coqc so even if both do not match perfectly it still makes the situation better than it is now.
Relying on ocaml code seems too complicated.
One day dune will save us all... Or not.
@cpitclaudel @erikmd do you have an opinion?

@Matafou
Copy link
Contributor

Matafou commented Aug 27, 2021

I will merge this soon if no one objects.

@Matafou
Copy link
Contributor

Matafou commented Aug 27, 2021

But we need to add something to the CHANGES file, since it may break a few use case relying on the old (and wrong) behaviour.

@Columbus240
Copy link
Contributor Author

Hi. Since I recently changed how coq_makefile interprets the _CoqProject, we might adapt the algorithm, so it precisely mirrors the behavior of coq.

@Matafou
Copy link
Contributor

Matafou commented Sep 5, 2021

Hi @Columbus240 can you be more precise please? What is the new behaviour? In what versions of coq(_makefile) can we expect it to be available?

@Matafou
Copy link
Contributor

Matafou commented Sep 12, 2021

Pinging @Columbus240 you said you had changed the behaviour of coq_makefile, can you tell us more about this change please?

@erikmd
Copy link
Member

erikmd commented Sep 17, 2021

Hi, this seems like a good patch, my main concern is about already written _CoqProject files that would break...

Yes, exactly.

To this aim, I'd just suggest browsing the repositories of Coq libraries that are part of Coq's CI (but there are many of them 😅)
→ cf. https://github.com/coq/coq/blob/master/Makefile.ci#L114
→ and https://github.com/coq/coq/tree/master/dev/ci

And among these libraries, a number of them have some -arg-based _CoqProject, e.g.:
https://github.com/coq/coq/tree/master/dev/ci https://github.com/JasonGross/fiat-crypto/blob/master/_CoqProject.in#L2

I have the feeling that neither -arg "a b" not -arg "'a b'" is used very much.
I also have the feeling that this patch reduces differences between pg and coqc so even if both do not match perfectly it still makes the situation better than it is now.

I didn't take a close look but I believe you're right; if the fix is "conservative" w.r.t. most libraries I suggested above, +1 for merging indeed.

@Matafou
Copy link
Contributor

Matafou commented Sep 19, 2021

I am still waiting for @Columbus240 to clarify these sentence:

Hi. Since I recently changed how coq_makefile interprets the _CoqProject, we might adapt the algorithm, so it precisely mirrors the behavior of coq.

If coq_makefile changed its behaviour we should try to understand how it works now before merging this I guess.

@Columbus240
Copy link
Contributor Author

Sorry for letting you wait.
The new behaviour is documented here: https://github.com/coq/coq/blob/master/doc/sphinx/practical-tools/utilities.rst. In particular the paragraph The Grammar of _CoqProject should contain the necessary information.
A very meandering text is also available in the corresponding PR, coq/coq#14558.

The parsing algorithm is implemented in lib/coqProject_file.ml if you've got the time to decipher code.

Regarding availability: the PR with the new behaviour has been merged in July but there has not been a release since then. It is thus only available in development versions of Coq. The first version in which the new behaviour will be available is v8.14.

I'm not sure, if the patch I made in this PR is still compatible with the new behaviour for simple cases. I didn't get around to checking it. Certainly, this PR does not mirror the old nor the new behaviour of coq_makefile. If I remember correctly, the parsing code of ProofGeneral processes some escape sequences (it's the call to split-string-and-unquote I think). This is something coq_makefile currently does certainly not do.

To give a short overview of the parsing algorithm:

  1. Tokenization. Read the file byte by byte and produce a list of strings. Explained in the documentation linked above.
  2. Process the list of strings as if they were command line arguments. See process_cmd_line. Note that a string following "-arg" is further processed (see line 242) by process_extra_args.

I made an exemplary _CoqProject file and went through the processing steps. I hope it's useful and that I didn't make a mistake.

lorem "ipsum dolor" "sit'amet" #consectetur "adipiscing \n elit
-arg sed -arg do -arg "eiusmod tempor"
-arg 'incididunt ut' -arg "labore 'et dolore' magna"
-arg "'aliqua ut'"
-R "enim ad" -R "'minim veniam'"
"quis # nostrud"

Tokenization produces the following list of strings (where a newline indicates the end of a string)

lorem
ipsum dolor
sit'amet
-arg
sed
-arg
do
-arg
eiusmod tempor
-arg
'incidunt
ut'
-arg
labore 'et dolore' magna
-arg
'aliqua ut'
-R
enim ad
-R
'minim veniam'
quis # nostrud

Of course the next step wouldn't be able to interpret the arbitrary strings as filenames or arguments. But lets ignore this and consider what lists the -arg and the -R arguments collected. I.e. how coqc would get called.
The argument list by -arg (the one stored in COQMF_OTHERFLAGS in the makefile) will look like this: (the ut' is removed, because it doesn't count as "belonging to -arg" anymore. Processing -arg only looks at the immediately following token)

sed
do
eiusmod
tempor
incidunt
labore
et dolore
magna
aliqua ut

While the argument list concerning -R will look like this. I wanted to illustrate here, that the ' aren't specially interpreted here.

-R
enim ad
-R
'minim veniam'

Please ask further questions if something is unclear.

@erikmd
Copy link
Member

erikmd commented Oct 11, 2021

Thanks a lot @Columbus240 for your detailed answer! (and sorry for late reply as well)

As already said by @cpitclaudel, the inherent discrepancy between both _CoqProject parsers (either elisp or (ocaml+make)-based) has always been an issue; anyway your patch looks very sensible and as @Matafou I agree we should merge this soon,

and as we have a planned PG team telco on this Wednesday evening, we'll be able to discuss whether it can be merged at once… or maybe after adding some unit tests 🙂 (to check that coq_makefile and PG's parsing behave the same on, at least, a few specific cases; relying on PG's CI that can test PG against several different Coq versions…)

BTW (just to be sure, I just skimmed https://github.com/coq/coq/pull/14558/commits), is there another recent change in coq's _CoqProject parser algorithm that should be taken into account for PG? (beyond this support of single-quotes in -arg arguments)
→ in this case, we could just as well add some conditional behavior in PG to support both Coq 8.14+ and Coq ≤ 8.13.

Thanks again!

@Matafou
Copy link
Contributor

Matafou commented Oct 12, 2021

From reading https://github.com/coq/coq/blob/master/doc/sphinx/practical-tools/utilities.rst I think that the only thing to change is the way double and simple quotes are interpreted.
@Columbus240 the documentation implies that spaces and newlines are treated exactly the same. Am I right?

@Matafou
Copy link
Contributor

Matafou commented Oct 15, 2021

I think this PR should come with a test of the coq version. Am I wrong?
Let us sum up: we need to do/decide no to do the following things:

  • leave full compliance with coq_makefile new semantic (in particular wrt newlines) for another PR?
  • add a coq version testing (>=8.14?)?
  • add tests?

@erikmd
Copy link
Member

erikmd commented Oct 16, 2021

Hi all,

  • add tests?

Indeed @Matafou I guess that adding unit or integrations tests in this PR would be very nice (all the more as we can test from now on _CoqProject + PG with Coq 8.14.* since there is an image coqorg/coq:8.14).

These tests could for example, be inspired by the links I gave in this old comment of mine.

  • leave full compliance with coq_makefile new semantic (in particular wrt newlines) for another PR?
  • add a coq version testing (>=8.14?)?

However I believe that these two other items can wait for another PR: all the more as if/when we have _CoqProject-dedicated tests, it will be easy to detect if the semantics is broken for a given version of Coq (Coq 8.14 or Coq 8.15), and add conditional elisp code in this case.

@Matafou @Columbus240 if you agree with my remarks above,
I volunteer to help adding one or two commits dedicated to this CI tests in the PR, although I'm afraid this task won't be done before the October-25th week I think (unless if @hendriktews could and would like to have a look meanwhile)

@Columbus240
Copy link
Contributor Author

You're welcome!

Regarding:

BTW (just to be sure, I just skimmed https://github.com/coq/coq/pull/14558/commits), is there another recent change in coq's _CoqProject parser algorithm that should be taken into account for PG? (beyond this support of single-quotes in -arg arguments) → in this case, we could just as well add some conditional behavior in PG to support both Coq 8.14+ and Coq ≤ 8.13.

I now skimmed over https://github.com/coq/coq/commits/master/lib/coqProject_file.ml and don't think anything noteworthy happened to the parsing of _CoqProject in the last few years other than my change.

@Columbus240 the documentation implies that spaces and newlines are treated exactly the same. Am I right?

The only difference between spaces, horizontal tabs and newlines is, that only newlines end comments. Otherwise they are always treated the same. (A short look at coqProject_file.ml can confirm this)

I think implementing version-dependent behaviour is a good intention. But the current behaviour of PG isn't even compliant with coq v8.14 and earlier, because there the semantics of coq_makefile and _CoqProject were also influenced by how Make (and the shell called by Make) would escape and interpret characters.
I don't recall how precisely arguments of -arg were interpreted before my patch, but I'm sure it was messy and would even involve environment variables (though luckily nobody abused this misfeature, as far as I could find).
I'm not sure whether version-dependent behaviour is actually worth the effort, because I don't know how many projects would break by the new interpretation of single/double quotes.

I had to reduce how much I use Coq and with that PG as well. So I'm not particularly waiting for this feature.

@Matafou
Copy link
Contributor

Matafou commented Oct 20, 2021

Thanks @Columbus240. I think we should merge this and see. I am quite sure almost nobody will notice the difference, and this goes in the right direction.

@Matafou
Copy link
Contributor

Matafou commented Mar 28, 2022

This is bit-rotting. Should we merge this immediately?

@Matafou Matafou merged commit ac9ba11 into ProofGeneral:master May 25, 2022
@Matafou
Copy link
Contributor

Matafou commented May 25, 2022

This was waiting for much too long. I don't see much risk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants