New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

limit how much data we read from a command substitution #3822

Closed
krader1961 opened this Issue Feb 7, 2017 · 12 comments

Comments

Projects
None yet
5 participants
@krader1961
Copy link
Contributor

krader1961 commented Feb 7, 2017

At the moment we don't impose any limits on the amount of data that we accept from read or a command substitution (e.g., (cmd)). Issue #3712 proposes setting a limit of 100 MiB for any single read execution. This issue is to discuss whether the same, or a similar, limit should be imposed on command substitution output. See issue #3817 for examples of what can happen without such a limit.

@krader1961 krader1961 added this to the fish-future milestone Feb 7, 2017

@ChrisJefferson

This comment has been minimized.

Copy link

ChrisJefferson commented Feb 7, 2017

Also (don't know if this should be a separate issue), ctrl+c doesn't work while a command substitution is running, leaving no method of escaping from a bad substitution.

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Feb 7, 2017

Yes, @ChrisJefferson, that is a distinct issue separate from this one. In fact, it's already been reported. See issue #3527.

@ChrisJefferson

This comment has been minimized.

Copy link

ChrisJefferson commented Feb 8, 2017

thanks, didn't want it lost. Sorry.

@faho

This comment has been minimized.

Copy link
Member

faho commented Feb 9, 2017

ARG_MAX is probably of interest here. See e.g. http://www.in-ulm.de/~mascheck/various/argmax/ and http://pubs.opengroup.org/onlinepubs/7908799/xsh/limits.h.html.

It is defined as "Maximum length of argument to the exec functions including environment data. Minimum Acceptable Value: _POSIX_ARG_MAX ".

So that means once we exceed it, we can't rely on external commands being launchable anymore. We could allow more for our builtins, so instead of doing one invocation of an external command with a huge amount of data you'd store it in a variable and then run the external command with smaller chunks.

But the goal here isn't to disallow "illegal" invocations, it's to avoid a hung shell, so we should probably just pick a high limit and use it for everything. If you need to e.g. remove a gazillion files, use rm -r or find -delete instead of a command substitution.

(On my system, ARG_MAX is 2MiB, btw)

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Feb 11, 2017

We should not distinguish between the ARG_MAX limit for external commands and builtins (or functions) not subject to that limit. Doing so just leads to inconsistencies that confuses users. At least with respect to limiting the text interpolated by a command substitution. Not least because other text on the command line outside of the command substitution could easily result in ARG_MAX being exceeded. The two failure modes are independent of each other.

Should we use the maximum of ARG_MAX or 10 MiB for command substitution? I vote no for consistency with the read command.

@zanchey

This comment has been minimized.

Copy link
Member

zanchey commented Feb 27, 2017

Which bit are you voting no to? :-)

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Feb 27, 2017

No to using ARG_MAX. Using that limit for any single command substitution would not guarantee that an external command could be launched because other arguments on the command line could push us over the OS limit. Too, on many systems ARG_MAX can be quite small (e.g., a few tens of KB). It would also be inconsistent with the recently imposed limit for the read command and unnecessarily limit the utility of fish builtins and functions.

@floam

This comment has been minimized.

Copy link
Member

floam commented Mar 3, 2017

Should this perhaps be something queryable/set through our ulimit builtin rather than an environment variable?

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Mar 3, 2017

Should this perhaps be something queryable/set through our ulimit builtin rather than an environment variable?

Yes, but only if we can be reasonably confident it won't need to be set prior to reading the users ~/.config/fish/config.fish. That's because all the other ulimit options affect child processes whereas this would not. So using ulimit is not 100% equivalent to using an env var for this particular setting. Unless, of course, we magically mapped it to an env var and vice-versa. But if we did that why bother involving ulimit since this is a var, like fish_use_posix_spawn, that we do not expect anyone to ever need to set except for debugging; e.g., like I did in the read unit tests.

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Mar 4, 2017

Since no one has argued for not implementing a limit on how much data we will accept from a command substitution I'm going to remove the RFC label.

The only question I think still needs an answer is what to do if the limit is exceeded for a non-interactive shell. We should obviously emit a warning to stderr whether or not the shell is interactive. But if non-interactive should the shell immediately exit or simply treat that command substitution has having failed with a non-zero status (and zero characters substituted)?

@krader1961 krader1961 removed the RFC label Mar 4, 2017

@zanchey

This comment has been minimized.

Copy link
Member

zanchey commented Jun 1, 2017

I think the behaviour should mirror that of read - that is, exit with a non-zero status without any substitution.

@krader1961 krader1961 modified the milestones: fish-3.0, fish-future Jun 23, 2017

@krader1961

This comment has been minimized.

Copy link
Contributor

krader1961 commented Jun 23, 2017

We snuck in the same change to the read command in the 2.6.0 release. Which was arguably wrong to do since it wasn't backward compatible. We should only do the same thing for command substitutions in a major release. So retargeting this to the fish 3.0.0 release it looks like we're going to do a major release in the near future.

Also, it is a good sign that no one has reported a problem with the 10 MiB limit on read commands since 2.6.0 was released. That increases my confidence this is a low risk change that will only make it harder for badly written fish script to result in a denial of service type problem.

@krader1961 krader1961 self-assigned this Jul 27, 2017

krader1961 added a commit to krader1961/fish-shell that referenced this issue Aug 1, 2017

implement limits on command substitution output
This makes command substitutions impose the same limit on the amount
of data they accept as the `read` builtin. It does not limit output of
external commands or builtins in other contexts.

Fixes fish-shell#3822

krader1961 added a commit to krader1961/fish-shell that referenced this issue Aug 1, 2017

implement limits on command substitution output
This makes command substitutions impose the same limit on the amount
of data they accept as the `read` builtin. It does not limit output of
external commands or builtins in other contexts.

Fixes fish-shell#3822

krader1961 added a commit to krader1961/fish-shell that referenced this issue Aug 1, 2017

implement limits on command substitution output
This makes command substitutions impose the same limit on the amount
of data they accept as the `read` builtin. It does not limit output of
external commands or builtins in other contexts.

Fixes fish-shell#3822

krader1961 added a commit that referenced this issue Aug 4, 2017

implement limits on command substitution output
This makes command substitutions impose the same limit on the amount
of data they accept as the `read` builtin. It does not limit output of
external commands or builtins in other contexts.

Fixes #3822

@krader1961 krader1961 closed this Aug 4, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment