Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't let users accidentally create a function with the name of a builtin #3000

Closed
zanchey opened this issue May 6, 2016 · 32 comments
Closed

Comments

@zanchey
Copy link
Member

zanchey commented May 6, 2016

Fairly regularly, somebody writes a function called test, and then their entire environment spews garbage and often becomes unresponsive, as many shipped functions and prompts expect test to be the builtin (or at least behave the same way).

Options include:

  • banning a test function
  • converting all shipped functions &c to use either builtin test or [.
@zanchey zanchey added the cleanup label May 6, 2016
@zanchey zanchey added this to the fish-future milestone May 6, 2016
@krader1961
Copy link
Contributor

I vote for the first option. Having to preface all uses with builtin is too fragile and annoying to contemplate. And at least within the fish code base I'd prefer we standardize on test rather than [ for clarity and consistency and to make it less likely for people to think they can use [[ ... ]].

@ghost
Copy link

ghost commented May 6, 2016

Does "banning a test function" mean test will become a keyword? How does bash solve this problem?

@krader1961
Copy link
Contributor

If we implemented that proposal then, yes, test would effectively become a keyword even if the parser wasn't modified to treat it as such. Bash will happily let you shoot yourself in the foot:

bash-4.3$ function test { echo wtf; return 1; }
bash-4.3$ if test -n "abc" ; then
> echo yes
> fi
wtf
bash-4.3$

Do we want to let people hurt themselves this way? Is there a legitimate use case for doing function test ? Sure, someone might want to write a wrapper around builtin test (e.g., to log the arguments) but is allowing that compelling enough to allow doing so when most of the time I suspect that people doing function test don't understand why that's a bad idea?

@ghost
Copy link

ghost commented May 7, 2016

It is not bad if test become a keyword, since [ should also be taken as a keyword:

prompt> [ ]; or echo empty
empty
prompt> []; or echo empty
fish: Unknown command '[]'
fish:  []; or echo empty
       ^
empty
prompt> [$str]; or echo empty
Commands may not contain variables. In fish, please use 'eval [$str]'.
fish:  [$str]
       ^
empty
prompt> [ $str]; or echo empty
[: the last argument must be ']'
empty

There are many moments you can shoot yourself if you think builtin makes the script verbose. (not just about the builtin test)

@occivink
Copy link
Contributor

occivink commented May 8, 2016

I'm not sure about reserving just that particular keyword. What about other generic-sounding builtin like set or string? The latter especially is just being introduced in 2.3, so it was fair game to have such a function before. In my opinion it's better in terms of consistency and resilience to prepend builtin everywhere in internal functions.
Sure the user could still shoot themselves in the foot but I don't think it's especially opaque what's happening when you define a function test and then use it interactively.

@krader1961
Copy link
Contributor

In my opinion it's better in terms of consistency and resilience to prepend builtin everywhere in internal functions.

And what about 3rd-party and user defined functions? I'm not convinced we should reserve test (or any other builtin) but I'm quite certain we should not pepper every use of those commands with builtin. As you say, there are an infinite number of ways for people to shoot themselves in the foot. We all recognize that we can't guard against all of them. The question is whether redefining test happens often enough to warrant special protection.

@krader1961
Copy link
Contributor

I propose adding a --shadow-builtin flag to the function command. It would have to be used to define a function with the same name as a builtin and would cause an error if used with a non-builtin name. That way we keep inexperienced users from borking their shell while will allowing experienced users to do creative things such as extending a builtin.

@faho
Copy link
Member

faho commented May 8, 2016

I propose adding a --shadow-builtin flag to the function command.

@krader1961: That's reasonable, and I'd be okay with it.

@krader1961 krader1961 modified the milestones: next-2.x, fish-future May 8, 2016
@krader1961 krader1961 self-assigned this May 8, 2016
@krader1961 krader1961 changed the title Avoid explosions when defining function test don't let users accidentally create a function with the name of a builtin May 8, 2016
krader1961 added a commit that referenced this issue May 8, 2016
I'm going to modify these functions as part of dealing with issue #3000
and don't want those changes to be masked by running the files through
`make style`.
@krader1961 krader1961 added the release notes Something that is or should be mentioned in the release notes label May 15, 2016
@floam
Copy link
Member

floam commented Jul 1, 2016

This change will in the future have the effect that any scripts which do override a builtin and work fine on 2.3.0/2.3.1 will need this new --shadow-builtin option, to be able work on the next version, which is fine. But these function definitions when used on fish 2.2.0, 2.3.0, 2.3.1 cause fish to get all crashy when it doesn't like this unexpected option on function. It becomes not easy for a user to make or us to provide a function cd wrapper that works everywhere. That's less than super convenient.

@krader1961
Copy link
Contributor

That's less than super convenient.

That's the whole idea. It's not supposed to be convenient to shoot yourself in the foot.

@floam
Copy link
Member

floam commented Jul 1, 2016

The part where that broke forwards and backwards compatibility to take advantage of this is the inconvenient part. I don't think we want to punish people for having used our software. The error output from master cd.fish on Integration-2.3.1 is extremely loud and unhelpful (it literally prints the whole manpage, and I guess is the next "string"-type of problem) and the function does not get created.

@faho
Copy link
Member

faho commented Sep 4, 2016

This has been removed with cfefaaf. Removing from the milestone.

@faho faho removed this from the next-2.x milestone Sep 4, 2016
@faho faho removed enhancement release notes Something that is or should be mentioned in the release notes labels Sep 4, 2016
@zanchey zanchey reopened this Sep 5, 2016
@krader1961 krader1961 removed their assignment Sep 5, 2016
@krader1961
Copy link
Contributor

@zanchey, Why reopen this? It's pretty clear that no one wants to break both forward and backward compatibility to keep people from shooting themselves in the foot.

The only thing I think we can do that wouldn't be controversial is to give a warning if, and only if, the function that shadowed a builtin was defined interactively. And even then we shouldn't keep the function from being defined.

@zanchey
Copy link
Member Author

zanchey commented Sep 11, 2016

I don't think we've come up with the right answer yet, but that doesn't mean there isn't one; let's keep it open and keep thinking.

@floam
Copy link
Member

floam commented Sep 11, 2016

Why not go back around and just do the simple thing that'll make function test less likely to blow up: we use our [ builtin in preference over test. It's inherently safer to be using.

@floam
Copy link
Member

floam commented Sep 11, 2016

it's totally safe to shadow these builtins... right up until you start autoloading, and then things to to hell quickly. We could make funcsave reluctant, and educate the user a bit when saving a new function interactively. Maybe that's where ---shadows-builtin could have a home?

I think if we say that [ ]; is the preferred variant (or just use it without saying that) and we're catching mistake with funcsave, we've made the world a better place.

@krader1961
Copy link
Contributor

Why not go back around and just do the simple thing that'll make function test less likely to blow up: we use our [ builtin in preference over test. It's inherently safer to be using.

No because it doesn't address all the other failure modes; e.g., creating a string function. Too, the [...] syntax is a hack that was created by someone who wanted shell syntax to more closely resemble that of the other languages they were used to. And it is a hack. In the Bourne shell this was implemented by creating a /bin/[ command. If anything we should be replacing all the places we do if [...] with if test .... The if [...] syntax is not inherently safer. It is only safer if you use the if [[...]] syntax introduced by the Korn shell (ksh).

@krader1961 krader1961 added the RFC label Nov 16, 2016
@floam
Copy link
Member

floam commented Nov 16, 2016

It's safer because a user is less likely to accidentally define [ than "test". If we use [ everywhere then creation of a test function doesn't break as much stuff.

@krader1961
Copy link
Contributor

It's safer because a user is less likely to accidentally define ...

Which misses the point that someone defining a function that shadows any builtin can break fish. The test command is simply one example out of many; albeit one that is somewhat more likely to be redefined than the other builtins. There are plenty of other builtins that are likely to be inadvertently redefined by a user thus breaking things. I'm not going to bother enumerating them because all you have to do is run builtin --names.

Regardless, the use of the [...] syntax is a hack that is not consistent with the core fish goal to use commands rather than magic syntax. Especially since in this case it only looks like syntax of the core fish language. Since it is not that argues against using it in the fish scripts we ship.

@krader1961
Copy link
Contributor

A new fish use shot themselves in the foot by doing function test; ...; end (see issue #3890).

There is no way to disallow defining a function with the same name as a builtin without breaking backward and forward compatibility. As I said earlier when that became clear and my change was reverted the best we can do is issue a warning if someone does so interactively. Which is likely to deal with the majority of these situations since such problems typically involve someone defining a function interactively. So I propose we do so then consider this issue resolved. Note that we need to be careful not to issue a warning if the function is autoloaded in an interactive session.

There is one other possible, but problematic, solution. Examine the parse tree of the function to see if it contains builtin $funcname. If it does assume the user knows what they are doing. If not conclude the function will break things. This, however, is still quite fragile. It would, among other things, keep people from reimplementing a builtin as a function without ultimately invoking the builtin. While it is extremely unlikely anyone would do so I don't think we should disallow it.

@mqudsi
Copy link
Contributor

mqudsi commented Feb 21, 2018

From the duplicate #4742, with emphasis added:

While filing #4741 I ran into a nasty case where I broke test and my system went haywire and almost overwrote critical data, as both the default fish prompt and the completions for virtually all system commands end up invoking test which wound up calling my function, most infelicitously also named test.

A simpler alternative has just occurred to me, however. If a user defines a function whose name conflicts with a builtin (or perhaps just test), we could either display a warning message ("Warning: user function test conflicts with and hides builtin test") - or, because in some cases it would be too late, a confirmation ("Warning: user function test conflicts with and hides builtin test. Are you sure you would like to proceed?").

@faho
Copy link
Member

faho commented Feb 21, 2018

If a user defines a function whose name conflicts with a builtin (or perhaps just test), we could either display a warning message

Would we then have to store what functions we've warned about?


I believe the major issue here is test, followed by a couple of other builtins with really generic names. Of our builtins, a bunch are also keywords (and et al, continue, for) that already cannot be overridden.

Two already have functions overriding them (cd and history).

Honestly, I believe that most of the benefit can be had by just forbidding functions named test, set and status, and possibly some others. There really is no use overriding these (especially set, because a function can't define local variables that are visible outside of it), and so everything that overrides them is already at the very least superfluous, but most likely buggy.

By not requiring an additional flag for those builtins that are allowed to be overridden (at the very least cd and history because we actually do that), we get around the backward-compatibility problem that tanked "--shadow-builtin".

@faho
Copy link
Member

faho commented Feb 22, 2018

See #4732, where status was overridden (alias status 'git status').

@mqudsi
Copy link
Contributor

mqudsi commented Feb 23, 2018

I would be onboard with that. There's not much to be gained by refusing to expand the reversed keywords list, imho.

@faho
Copy link
Member

faho commented Feb 23, 2018

So, here's a list of builtins that currently aren't reserved:

[
argparse
bg
bind
block
breakpoint
cd
commandline
complete
contains
count
disown
echo
emit
exit
false
fg
functions
history
jobs
math
printf
pwd
random
read
realpath
set
set_color
source
status
string
test
true
ulimit
wait

Of these, we currently override cd, history and realpath, so they need to be excluded.

At the very least argparse, read and set cannot work in overridden form because they need access to local variables, and I think status might change its mind about being in a command substitution or block, and return a different filename/function/line-number. So they absolutely need to be forbidden.

Of the rest, I don't see any great reason to override any of them - one might be tempted to e.g. add logging to source or make random always return 4 for debugging purposes, but those are massive hacks. Maybe override set_color to disable color?

For math and string there is the possibility of someone using an old function, and I'm not sure we want that to error out (downgrading to fish before builtin math would then make all sessions spew errors).

So the minimal list is:

argparse
read
set
status

And the maximal list is

[
argparse
bg
bind
block
breakpoint
commandline
complete
contains
count
disown
echo
emit
exit
false
fg
functions
jobs
printf
pwd
random
read
set
set_color
source
status
test
true
ulimit
wait

(i.e. no math, string or any of the already overridden ones)

Does anyone have any objections to any in the maximal list? Maybe an idea for how one could override it in a useful manner?

@zanchey
Copy link
Member Author

zanchey commented Feb 25, 2018

I'd stick with the minimal list, plus test, in order to keep the risk of blowing something up as small as possible.

I'm not even sure about argparse - can't that work as a function with --no-scope-shadowing?

@faho
Copy link
Member

faho commented Feb 25, 2018

I'm not even sure about argparse - can't that work as a function with --no-scope-shadowing?

In my tests, it doesn't -

function argparse --no-scope-shadowing
    builtin argparse $argv
end

already causes type somewhere in my prompt to fail. I'd imagine this is because argparse modifies $argv, which I think gets shadowed even with --no-scope-shadowing. Or maybe because argparse sets argv with "ENV_LOCAL".

I'd stick with the minimal list, plus test, in order to keep the risk of blowing something up as small as possible.

I'd add [ to that as well, because it feels weird to allow one way of invoking test but forbidding the other.

@faho faho closed this as completed in 69f68d3 Feb 25, 2018
@faho
Copy link
Member

faho commented Feb 25, 2018

Okay, done with argparse, read, set, status, test and [.

@mqudsi
Copy link
Contributor

mqudsi commented Feb 27, 2018

Thanks, @faho. Do we use [ internally? Because I could see someone wanting to specifically repurpose that syntax, and so long as it's not going to cause internal problems in fish, I guess they should feel free to do so?

@kfkonrad
Copy link
Contributor

Having syntax rather than commands is not a good idea generally. Giving the user the possibility to alter the meaning of syntax is just too confusing and it also goes against everything bash or sh users would expect to happen. Implementing such details (or rather leaving them unguarded) will obfuscate code using such a 'syntax redefinition' for most fish users

@faho
Copy link
Member

faho commented Feb 27, 2018

Do we use [ internally?

Yes, we do. About 80 occurences in share/**.fish.

so long as it's not going to cause internal problems in fish

Even if it didn't cause internal problems in fish, it'd cause problems with other people's scripts. And even if it didn't do that, it would cause issues with other people reading your scripts.

We really don't want to support that. Like we don't want to support people repurposing "if" or "for". The minor advantage of subjective aesthetics is dwarfed by the disadvantage of compatibility with other people's stuff and us having to remove [ from all our scripts (and keeping it removed).

If you want to add something syntax-like (which doesn't really behave like syntax because all of the expansions and such still happen), you can just use a different name - you can even add a function called +, so instead of [ some arguments ], you can do + some arguments +.

@faho faho modified the milestones: fish-future, fish-3.0 Feb 28, 2018
@mqudsi
Copy link
Contributor

mqudsi commented Mar 5, 2018

Just a note for the benefit of future searchers (or myself in a day or two), while exec was never blacklisted and it was possible to declare a function named exec, in effect the tokenizer would always treat it as the exec builtin and the parser would necessarily execute it as such, regardless of the presence or lack thereof of a function by the same name.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 17, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants