New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implement responsefiles support PR7050 #748

Closed
wants to merge 2 commits into
base: 4.04
from

Conversation

Projects
None yet
6 participants
@bschommer
Contributor

bschommer commented Aug 10, 2016

This is a first implementation of support for giving arguments via response file as @file. The quoting conventions are the ones of the gnu tools. The implementation itself is a variation of the patch attached to the original mantis issue http://caml.inria.fr/mantis/view.php?id=7050.
The functionality is implemented via two new functions in the Arg module.

bschommer added some commits Aug 10, 2016

Added expandargv and writeargv.
Teh functions expandargv and writeargv implement the nearly the
same functionality as the expandargv and writeargv functions of
the libiberty, which is used by the gnu tools.

Expandargv allows it to specify response files for too long
command lines and writeargv prints an array of arguments in a
compatible way.
Call expandargv before parsing commandline args.
This allows the ocaml compiler to read arguments from
responsefiles.
@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Aug 11, 2016

Contributor

Dear @bschommer,

Once upon a time, the Win32 version of OCaml had expansion of @responsefile built-in. It was performed by C code (less sophisticated than your Caml code) and run systematically at program start-up time, so that by the time we got into Arg parsing, the @responsefiles were already expanded, just like in your implementation.

Then it was noticed this conflicted with the -w and -warn-error options of the OCaml compilers, whose argument can start with @, e.g. -w @ae. Rather than rethinking the syntax of -w options, it was decided to turn off expansion of responsefiles entirely, with a comment that "nobody needs it". Commit: 9c275a2

In conclusion: your pull request reintroduces the problem with expanding @responsefile before parsing the options using Arg, and for this reason it cannot be accepted.

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users can write

   "-@", Arg.Responsefile, "<file>  Read command-line arguments from <file>";

in their argument parsing specifications.

Contributor

xavierleroy commented Aug 11, 2016

Dear @bschommer,

Once upon a time, the Win32 version of OCaml had expansion of @responsefile built-in. It was performed by C code (less sophisticated than your Caml code) and run systematically at program start-up time, so that by the time we got into Arg parsing, the @responsefiles were already expanded, just like in your implementation.

Then it was noticed this conflicted with the -w and -warn-error options of the OCaml compilers, whose argument can start with @, e.g. -w @ae. Rather than rethinking the syntax of -w options, it was decided to turn off expansion of responsefiles entirely, with a comment that "nobody needs it". Commit: 9c275a2

In conclusion: your pull request reintroduces the problem with expanding @responsefile before parsing the options using Arg, and for this reason it cannot be accepted.

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users can write

   "-@", Arg.Responsefile, "<file>  Read command-line arguments from <file>";

in their argument parsing specifications.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Aug 11, 2016

Contributor

Another problematic point is that we still haven't agreed on a concrete syntax for response files.

The one proposed in this PR is fine for Unix-like platforms, as it inverts Filename.quote just fine. However, unless I misread the patch, it is useless for Win32.

Also, @alainfrisch adamantly champions the "one line = one option" syntax.

See, the problem is not so much to code a reasonable solution; it is to agree on the specs of a reasonable solution.

Contributor

xavierleroy commented Aug 11, 2016

Another problematic point is that we still haven't agreed on a concrete syntax for response files.

The one proposed in this PR is fine for Unix-like platforms, as it inverts Filename.quote just fine. However, unless I misread the patch, it is useless for Win32.

Also, @alainfrisch adamantly champions the "one line = one option" syntax.

See, the problem is not so much to code a reasonable solution; it is to agree on the specs of a reasonable solution.

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 11, 2016

Contributor

It was more of a first throw, I should have made it clearer that this is not some finished implementation but rather some first try.

Then it was noticed this conflicted with the -w and -warn-error options of the OCaml compilers, whose argument can start with @, e.g. -w @AE. Rather than rethinking the syntax of -w options, it was decided to turn off expansion of responsefiles entirely, with a comment that "nobody needs it".

That is the reason why I called it before any argument parsing, however with the current implementation you still could not write @AE if there is a file called ae in the working directory.

The one proposed in this PR is fine for Unix-like platforms, as it inverts Filename.quote just fine. However, unless I misread the patch, it is useless for Win32.

My thoughts for using the Unix syntax is that is widely used and gcc as well as clang both support it. For the compiler itself one could support different quoting styles and choose them according to the OS or some ENV variable. This would also allow the simple one arg = one line approach.

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users can write

"-@", Arg.Responsefile, "<file> Read command-line arguments from <file>";
in their argument parsing specifications.

This seems actually a pretty good idea. How about adding the spec to the Arg module and adding some new module for response file support containing some basic functions for reading and writing such files in different syntax?

Contributor

bschommer commented Aug 11, 2016

It was more of a first throw, I should have made it clearer that this is not some finished implementation but rather some first try.

Then it was noticed this conflicted with the -w and -warn-error options of the OCaml compilers, whose argument can start with @, e.g. -w @AE. Rather than rethinking the syntax of -w options, it was decided to turn off expansion of responsefiles entirely, with a comment that "nobody needs it".

That is the reason why I called it before any argument parsing, however with the current implementation you still could not write @AE if there is a file called ae in the working directory.

The one proposed in this PR is fine for Unix-like platforms, as it inverts Filename.quote just fine. However, unless I misread the patch, it is useless for Win32.

My thoughts for using the Unix syntax is that is widely used and gcc as well as clang both support it. For the compiler itself one could support different quoting styles and choose them according to the OS or some ENV variable. This would also allow the simple one arg = one line approach.

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users can write

"-@", Arg.Responsefile, "<file> Read command-line arguments from <file>";
in their argument parsing specifications.

This seems actually a pretty good idea. How about adding the spec to the Arg module and adding some new module for response file support containing some basic functions for reading and writing such files in different syntax?

@bschommer bschommer changed the title from Implement responsefiles support PR7050 to [WIP] Implement responsefiles support PR7050 Aug 11, 2016

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Aug 25, 2016

Contributor

For the compiler itself one could support different quoting styles and choose them according to the OS or some ENV variable.

Please, don't do that.

Concerning the choice of a syntax, I maintain my position that the "one line = one arg" solution is the simplest one, both in terms of implementation and of usability (for all build systems that will need to produce response file), and also the most portable one (including for cross-compilation scenarios). What are the benefits of requiring any quoting convention?

Contributor

alainfrisch commented Aug 25, 2016

For the compiler itself one could support different quoting styles and choose them according to the OS or some ENV variable.

Please, don't do that.

Concerning the choice of a syntax, I maintain my position that the "one line = one arg" solution is the simplest one, both in terms of implementation and of usability (for all build systems that will need to produce response file), and also the most portable one (including for cross-compilation scenarios). What are the benefits of requiring any quoting convention?

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 25, 2016

Contributor

I think the only implementing the gnu quoting conventions is that they allow writing the command line in the same way than passing it through the actually command line.
But I agree that for writing responsefiles by hand the "one line = one arg" solution is the simplest one.

Contributor

bschommer commented Aug 25, 2016

I think the only implementing the gnu quoting conventions is that they allow writing the command line in the same way than passing it through the actually command line.
But I agree that for writing responsefiles by hand the "one line = one arg" solution is the simplest one.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 25, 2016

Member

What about '\000' terminated arguments? Many command line tools (grep, xargs, ...) support reading&writing entries terminated by a '\000', so such files are easy to produce with shell commands.

There are two advantages:

  • you can pass arguments containing newlines via responsefiles
  • you don't need to be careful about not writing arguments with newlines (as they would be split into two arguments)

It's just a bit annoying if you want to write a responsefile by hand, so maybe we could support both: -@ and -@0.

Member

diml commented Aug 25, 2016

What about '\000' terminated arguments? Many command line tools (grep, xargs, ...) support reading&writing entries terminated by a '\000', so such files are easy to produce with shell commands.

There are two advantages:

  • you can pass arguments containing newlines via responsefiles
  • you don't need to be careful about not writing arguments with newlines (as they would be split into two arguments)

It's just a bit annoying if you want to write a responsefile by hand, so maybe we could support both: -@ and -@0.

@bschommer bschommer referenced this pull request Aug 25, 2016

Merged

Expand option #778

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 25, 2016

Contributor

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users >can write

"-@", Arg.Responsefile, " Read command-line arguments from ";
in their argument parsing specifications.

I opened a PR request with the implementation of this.

Contributor

bschommer commented Aug 25, 2016

What you could investigate instead is to add to Arg.spec a new action, say, Responsefile, so that users >can write

"-@", Arg.Responsefile, " Read command-line arguments from ";
in their argument parsing specifications.

I opened a PR request with the implementation of this.

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Aug 25, 2016

Contributor

It's just a bit annoying if you want to write a responsefile by hand, so maybe we could support both: -@ and -@0.

I suspect that nobody is ever going to need (and thus probably use) the -@0 variant, since command-line arguments containing newlines are extremely uncommon.

Contributor

alainfrisch commented Aug 25, 2016

It's just a bit annoying if you want to write a responsefile by hand, so maybe we could support both: -@ and -@0.

I suspect that nobody is ever going to need (and thus probably use) the -@0 variant, since command-line arguments containing newlines are extremely uncommon.

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 25, 2016

Contributor

The only arguments I think of are defines for the C source code, but those can be passed in response file for the C compiler.

Contributor

bschommer commented Aug 25, 2016

The only arguments I think of are defines for the C source code, but those can be passed in response file for the C compiler.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 25, 2016

Member

I guess one example would be an interpreter where you want to pass some input from the command line.

But more generally, even though having newlines in filenames is a bad idea, it's not technically forbidden. So by using \n as a separator you make the implicit assumption that your input is free from newlines. On the other hand, I know of no systems allowing NUL in filenames

Member

diml commented Aug 25, 2016

I guess one example would be an interpreter where you want to pass some input from the command line.

But more generally, even though having newlines in filenames is a bad idea, it's not technically forbidden. So by using \n as a separator you make the implicit assumption that your input is free from newlines. On the other hand, I know of no systems allowing NUL in filenames

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 25, 2016

Contributor

But in other options it could appear, that is basically the problem with any separator sign. However these kind of options could still be passed via the command line.

Contributor

bschommer commented Aug 25, 2016

But in other options it could appear, that is basically the problem with any separator sign. However these kind of options could still be passed via the command line.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 25, 2016

Member

You cannot have NUL in command line arguments, so it's not really a problem if responsefiles don't support it either

Member

diml commented Aug 25, 2016

You cannot have NUL in command line arguments, so it's not really a problem if responsefiles don't support it either

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 26, 2016

Contributor

Another the problem with NUL is that it is writing such files by hand and/or by application can be a little bit more complicated.

Contributor

bschommer commented Aug 26, 2016

Another the problem with NUL is that it is writing such files by hand and/or by application can be a little bit more complicated.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 26, 2016

Member

That's right, newlines are better for files written by humans (you can use a normal editor) while NUL characters are better for files generated by programs (no need to compromise)

Member

diml commented Aug 26, 2016

That's right, newlines are better for files written by humans (you can use a normal editor) while NUL characters are better for files generated by programs (no need to compromise)

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 29, 2016

Contributor

How about using both? The newline as separators and the NUL for quoting options that contain a new line. This allow easy writing of command line options for most cases and still one could have a new line separated option if it was quoted with NUL.

Contributor

bschommer commented Aug 29, 2016

How about using both? The newline as separators and the NUL for quoting options that contain a new line. This allow easy writing of command line options for most cases and still one could have a new line separated option if it was quoted with NUL.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Aug 29, 2016

Contributor

In the Mantis discussion http://caml.inria.fr/mantis/view.php?id=7050 , I suggested to split at NUL bytes if the response file contains NUL bytes, otherwise split at newlines. This was ignored, but it seems we're on our way to propose it again. This discussion is really going in circles.

Contributor

xavierleroy commented Aug 29, 2016

In the Mantis discussion http://caml.inria.fr/mantis/view.php?id=7050 , I suggested to split at NUL bytes if the response file contains NUL bytes, otherwise split at newlines. This was ignored, but it seems we're on our way to propose it again. This discussion is really going in circles.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 29, 2016

Member

Sorry, I didn't read the mantis discussion initially. Splitting at NUL bytes if there are some seems best to me

Member

diml commented Aug 29, 2016

Sorry, I didn't read the mantis discussion initially. Splitting at NUL bytes if there are some seems best to me

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Aug 30, 2016

Contributor

It seems we are converging to Xavier's solution 1b in http://caml.inria.fr/mantis/view.php?id=7050#c14736

What would be the exact rules? In particular:

  • When there is a NUL byte: would consecutive NUL bytes create empty arguments? is a final NUL byte allowed (not creating a final empty argument)?
  • When there is no NUL byte: same questions as above (with \n instead of NUL); and: are lines whitespace-trimmed (removing \032, \r, \t; at least from the end of the lines)?
Contributor

alainfrisch commented Aug 30, 2016

It seems we are converging to Xavier's solution 1b in http://caml.inria.fr/mantis/view.php?id=7050#c14736

What would be the exact rules? In particular:

  • When there is a NUL byte: would consecutive NUL bytes create empty arguments? is a final NUL byte allowed (not creating a final empty argument)?
  • When there is no NUL byte: same questions as above (with \n instead of NUL); and: are lines whitespace-trimmed (removing \032, \r, \t; at least from the end of the lines)?
@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 30, 2016

Contributor

I would ignore consecutive NUL bytes and also empty lines. I would not whitespace-trimm the lines and would take the arguments as is.

Contributor

bschommer commented Aug 30, 2016

I would ignore consecutive NUL bytes and also empty lines. I would not whitespace-trimm the lines and would take the arguments as is.

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Aug 30, 2016

Contributor

I would ignore consecutive NUL bytes and also empty lines.

This makes it impossible to pass arbitrary arguments (it's not like empty arguments are very common, but neither are those with newlines).

I would also suggest that at least for the text version (i.e. no NUL), we trim a final \r character on each line (assuming lines are split on \n) to avoid problems if Windows tools are used to produce (in text mode) those response files.

Contributor

alainfrisch commented Aug 30, 2016

I would ignore consecutive NUL bytes and also empty lines.

This makes it impossible to pass arbitrary arguments (it's not like empty arguments are very common, but neither are those with newlines).

I would also suggest that at least for the text version (i.e. no NUL), we trim a final \r character on each line (assuming lines are split on \n) to avoid problems if Windows tools are used to produce (in text mode) those response files.

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 30, 2016

Member

The various tools that takes a -0 or -print0 argument consider that NUL is used to terminate entries; so a final NUL byte is allowed and doesn't create a final empty argument. I think we should do the same, at least it allows to use the output of find -print0 as it.

For the same reason I think a final newline shouldn't create a final empty argument.

Ignoring consecutive NUL bytes doesn't seem right to me; given that such files will always be machine generated, programmers can manually filter empty arguments if they want to

Member

diml commented Aug 30, 2016

The various tools that takes a -0 or -print0 argument consider that NUL is used to terminate entries; so a final NUL byte is allowed and doesn't create a final empty argument. I think we should do the same, at least it allows to use the output of find -print0 as it.

For the same reason I think a final newline shouldn't create a final empty argument.

Ignoring consecutive NUL bytes doesn't seem right to me; given that such files will always be machine generated, programmers can manually filter empty arguments if they want to

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Aug 30, 2016

Contributor

Fine with me. About the implementation: My suggestion would be to add the Expand spec and for the response file reading and writing as separate functions either in Arg (if one wants to make them available for all) or in some misc (for internal use only).

Contributor

bschommer commented Aug 30, 2016

Fine with me. About the implementation: My suggestion would be to add the Expand spec and for the response file reading and writing as separate functions either in Arg (if one wants to make them available for all) or in some misc (for internal use only).

@diml

This comment has been minimized.

Show comment
Hide comment
@diml

diml Aug 31, 2016

Member

Agreed. I would add the reading and writing functions directly to the Arg module as it will be useful for all tools calling or wrapping the compiler

Member

diml commented Aug 31, 2016

Agreed. I would add the reading and writing functions directly to the Arg module as it will be useful for all tools calling or wrapping the compiler

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Sep 29, 2016

Member

I totally agree with @diml on all points.

Member

damiendoligez commented Sep 29, 2016

I totally agree with @diml on all points.

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Sep 29, 2016

Member

BTW you will have to open a new PR against trunk, as this is not going into 4.04.

Member

damiendoligez commented Sep 29, 2016

BTW you will have to open a new PR against trunk, as this is not going into 4.04.

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Sep 29, 2016

Contributor

I will open a new PR once PR #778 is merged.

Contributor

bschommer commented Sep 29, 2016

I will open a new PR once PR #778 is merged.

@bschommer

This comment has been minimized.

Show comment
Hide comment
@bschommer

bschommer Sep 29, 2016

Contributor

Concerning the scope: I would also like to add responsefile support for the other tools, e.g. ocamldep, ocamldoc, etc.

Contributor

bschommer commented Sep 29, 2016

Concerning the scope: I would also like to add responsefile support for the other tools, e.g. ocamldep, ocamldoc, etc.

@bschommer bschommer closed this Oct 10, 2016

@bschommer bschommer deleted the bschommer:responsefile branch Oct 10, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment