Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resholve doesn't seem to resholve shebangs #15

Closed
grahamc opened this issue Nov 4, 2020 · 3 comments · Fixed by #16
Closed

Resholve doesn't seem to resholve shebangs #15

grahamc opened this issue Nov 4, 2020 · 3 comments · Fixed by #16

Comments

@grahamc
Copy link
Contributor

grahamc commented Nov 4, 2020

Furthermore, if the shebang can't be rewritten in to an absolute path, maybe fail unless it is explicitly authorized.

@abathur
Copy link
Owner

abathur commented Nov 30, 2020

"do something smarter" seems like a good goal. :)

2 things make me leery of committing until I have a good handle on how to do this:

  • I don't think the Oil AST visitor resholve uses exposes the comments or the shebang (it's possible I just need to dig more to figure it out--I haven't picked at this too much previously since I don't otherwise need the comments), making this more manual/custom/involved.
  • I'm aware of stuff like platform quirks and multiline shebangs--but it's shallow knowledge.

The obvious approach seems to be a resholve feature we could think of as a checkedPatchShebangs, but I want to brainstorm approaches (feel free to add) in case this is a hammer/nail mistake?

obvious: add something like a checkedPatchShebangs to resholve

  • not too hard if it's as simple as recognizing a first-line shebang and parsing #!first-word-executable and #!/usr/bin/env second-word-executable
  • I worry a little that it won't be this simple, and resholve will end up on a treadmill of increasingly-nuanced shebang support beyond the above (but I lack the perspective to know how realistic this concern is)
  • I worry a little that the smarter the behavior gets, the more likely there are to be gaps between the correct general behavior, and the correct behavior in a Nix build, i.e.:
    • During a Nix build, a shebang of #!/bin/bash should do something like trigger a search for a bash on RESHOLVE_PATH, and replace it if one is found
    • Outside of a Nix build, a shebang of #!/bin/bash should do something like fail if /bin/ wasn't in RESHOLVE_PATH?

Error if any shebang is present

The user will need a triage API to instruct resholve to ignore the shebang (swallow the error) or specify an interpreter (relative? absolute?) so resholve can lop off all existing shebang lines and write a new one. (Maybe also support specifying a new custom shebang from scratch?)

  • this feels like a fairly resholve approach (it's explicit, and consistent with existing patterns)
  • it creates busywork in scenarios that do seem simple to handle automatically
  • it spares resholve from needing to understand internal shebang structure at all (let alone well enough to accurately separate scenarios it can handle automatically from ones it can't)
  • it feels fairly global/stable (it seems less likely that resholve ends on a treadmill of increasingly-nuanced shebang issues/handling)

Find an existing standalone tool that can wrangle any complexity here

I've looked a bit and didn't find anything obviously up to the task, but that doesn't mean one doesn't exist.

Refactor patchShebangs to support adding a checkedPatchShebangs

The Nix API for resholve could then invoke this before actually resholving.

  • avoids a slippery-slope into reimplementing patchShebangs in python2
  • avoids gotcha differences between resholve's behavior and patchShebangs
  • leaves resholve without a feature that is probably also useful to non-Nix users
  • I'm personally a bit queasy about introducing subtle nixpkgs problems in the process, so I'd prefer finding someone who can confidently muck with it
  • I assume it's an epic rebuild

Extract a standalone tool from the core of patchShebangs

  • Existing patchShebangs and resholve could invoke it (are there other obvious users that would be a knock-on benefit?)
  • Would need to find someone happy to do this
  • I assume it's also an epic rebuild
  • it might be easier to take along if someone decides to package resholve for pip or a distribution

Not sure if GH will notify @shlevy or @zimbatm since they aren't contributors, but I'm also curious what they think.

@shlevy
Copy link

shlevy commented Dec 1, 2020

https://www.in-ulm.de/~mascheck/various/shebang/ was my reference for https://github.com/shlevy/long-shebang. As far as I know, long-shebang and nix's shebang mechanism are the only tools that use multiple lines. shebangs are moderately thorny, but I do think the complexity is ultimately tractable and I doubt it will grow further in the future.

@abathur
Copy link
Owner

abathur commented Dec 1, 2020

@shlevy Good to know it's tractable! :)

As far as I know, long-shebang and nix's shebang mechanism are the only tools that use multiple lines.

I haven't specifically tried to survey for nonstandard shebangs, but I noticed a similar project from spack: https://github.com/spack/sbang.

shebangs are moderately thorny, but I do think the complexity is ultimately tractable and I doubt it will grow further in the future.

(Can skip this para if you're familiar with what resholve does) In case more context is helpful, resholve tries to find references to relative external executables/scripts, and either resolve them to absolute paths (from directories in the RESHOLVE_PATH) or raise an error if it can't. I think @grahamc is imagining something like RESHOLVE_PATH="${expect}/bin" resholve blah.sh turning:

#!/usr/bin/env expect
...

into

#!/nix/store/ypn4mxpnimsjy54xv7y06hssqjzpvfck-expect-5.45.4/bin/expect
...

My treadmill worry is less (but still a little!) about the core syntax getting harder, and more about the parsing and resolution scenarios getting a lot thornier each step down the slope.

For example, consider what the user might expect if they call RESHOLVE_PATH="${long-shebang}/bin:${bash_5}/bin" resholve ohno.sh on:

#!/usr/bin/env long-shebang
#!sh -e
false
true

Given resholve's value-prop, they probably expect it to go one more step down the slope:

#!/nix/store/b89qg3zzznr1j84l469irjbr8nx6ryg0-long-shebang-1.2.0/bin/long-shebang
#!/nix/store/j5ry26cch1xghfa2gyfp5zdh20lwkb9a-bash-5.0-p17/bin/sh -e
false
true

That isn't incrementally too bad, but my treadmill worry is a few small reasonable steps leading users to ask for stuff that'll make me want to tear my hair out (i.e., can't really be supported without full shell parsing for each shebang line). This example isn't very sensible:

#!/usr/bin/env long-shebang
#!/usr/bin/env -i BLAH=hehe HEY=abathur,\ stop\ overthinking\ this\ stuff!\ :D ssh-agent bash
declare -p

But it does run!

 $ ./sshagent.sh 
BASH=/bin/bash
BASH_ARGC=()
BASH_ARGV=()
BASH_LINENO=([0]="0")
BASH_SOURCE=([0]="./sshagent.sh")
BASH_VERSINFO=([0]="3" [1]="2" [2]="57" [3]="1" [4]="release" [5]="x86_64-apple-darwin19")
BASH_VERSION='3.2.57(1)-release'
BLAH=hehe
DIRSTACK=()
EUID=501
GROUPS=()
HEY='abathur, stop overthinking this stuff! :D'
HOSTNAME=8db58a52
HOSTTYPE=x86_64
IFS=$' \t\n'
MACHTYPE=x86_64-apple-darwin19
OPTERR=1
OPTIND=1
OSTYPE=darwin19
PATH=/usr/gnu/bin:/usr/local/bin:/bin:/usr/bin:.
PPID=48047
PS4='+ '
PWD=/Users/abathur/work/nix/scripts
SHELL=/run/current-system/sw/bin/bash
SHELLOPTS=braceexpand:hashall:interactive-comments
SHLVL=1
SSH_AGENT_PID=49133
SSH_AUTH_SOCK=/tmp/ssh-Frxs1RZi9nLy/agent.49132
TERM=dumb
UID=501
_=bash
__CF_USER_TEXT_ENCODING=0x1F5:0x0:0x0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants