Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistently use `#! /usr/bin/env bash' as shebang #2308

Merged
merged 1 commit into from Dec 28, 2020
Merged

Consistently use `#! /usr/bin/env bash' as shebang #2308

merged 1 commit into from Dec 28, 2020

Conversation

iamleot
Copy link
Contributor

@iamleot iamleot commented Dec 23, 2020

This increase portability of semgrep on platforms not having bash as
/bin/bash.

This is not just a possible theoretical problem but actually helps helping build
semgrep on NetBSD!

(And, FTR, I was finally able to build semgrep and can use it on NetBSD with mostly just this patch and a couple of notes that I should tidy up and share (but probably only relevant for NetBSD and pkgsrc users! :))

This increase portability of semgrep on platforms not having bash as
/bin/bash.
@cla-bot
Copy link

cla-bot bot commented Dec 23, 2020

Even though most of our projects are open-source, we ask contributors to sign a contributor license agreement before we merge your PR. There are a few reasons for this that we've learned over time: it's important to have an explicit statement that you wrote the code yourself, that no one else has a copyright claim on it, and it gives us as the long-term maintainers flexibility should we need to change the license in the future (though we don't plan to).

We don't need your address or any other personal information, just a signature and your email address. Sorry to bug you with the legal stuff but the good news is once filed we can accept PRs from you on any project automatically!

Please email cla@r2c.dev with a signed copy of the contributor license agreement.

If you prefer anonymity, we will accept a cryptographically signed PDF using the public key associated with your Github account.

Copy link
Member

@ievans ievans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks great @iamleot -- thanks for the contribution! @cla-bot check

tagging @DrewDennison or @brendongo for packaging/devops review

@mjambon or @minusworld -- could we write a semgrep generic-mode rule for #!/bin/bash -> #! /usr/bin/env bash?

My attempt: https://semgrep.dev/s/vzxn

@iamleot
Copy link
Contributor Author

iamleot commented Dec 23, 2020

Thanks @ievans!

@ievans
Copy link
Member

ievans commented Dec 23, 2020

@cla-bot check pls (seems flaky)

@cla-bot cla-bot bot added the cla-signed label Dec 23, 2020
@cla-bot
Copy link

cla-bot bot commented Dec 23, 2020

The cla-bot has been summoned, and re-checked this pull request!

@iamleot
Copy link
Contributor Author

iamleot commented Dec 24, 2020

@mjambon or @minusworld -- could we write a semgrep generic-mode rule for #!/bin/bash -> #! /usr/bin/env bash?

My attempt: https://semgrep.dev/s/vzxn

Really a minor cosmetic nit: in my experience it is more common to have #!/usr/bin/env bash instead of #! /usr/bin/env bash (with a white space between the #! and /usr/bin/env). In this patch I have added a whitespace in order to be consistent with existent shebangs in semgrep codebase.

In pkgsrc package system we have the following sed(1) invocation to spot shebangs (via pkgsrc/mk/check/check-interpreter.mk:67):

sed -n -e '1s/^#![[:space:]]*\([^[:space:]]*\).*/\1/p' -e '1q'

(...and then the interpreter in \1 is double-checked to be existent (so possible packages installing scripts but not having the corresponding interpreter available as dependency are catched), but that is not useful in the semgrep context! So we can ignore it! :))

While checking possible [[:space:]]* between the #! and the /bin/bash would be nice I think that checking only the first line is even more important in this case to avoid possible false positives in the middle of text files.
I have tried to adjust the @ievans' rule trying to address that by trying:

=~/#!\/bin\/bash/

(both unquoted and quoted via a single quote ('...') but it seems to not match anything.)
Regarding the other possible problem: only catching the first line of the file, it seems that I wasn't able to find any possible way to express that by skimming the docs.

Anyway, these are mostly my side-notes on thinking aloud about the rule proposed by @ievans! :) (i.e. not strictly related to this PR :))

@iamleot
Copy link
Contributor Author

iamleot commented Dec 25, 2020

While checking possible [[:space:]]* between the #! and the /bin/bash would be nice I think that checking only the first line is even more important in this case to avoid possible false positives in the middle of text files.
I have tried to adjust the @ievans' rule trying to address that by trying:

=~/#!\/bin\/bash/

(both unquoted and quoted via a single quote ('...') but it seems to not match anything.)
Regarding the other possible problem: only catching the first line of the file, it seems that I wasn't able to find any possible way to express that by skimming the docs.

Of course I have not skimmed enough! (pattern-regex can solve the first problem and it seems by accident using ^ in the regex also address the second problem (are multiple lines concatenated together in the generic mode?)):

https://semgrep.dev/s/gLle

wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Dec 25, 2020
However, despite the bit pessimistic TODO, I was available to build it
on NetBSD/amd64 -current and pkgsrc-current after installing
devel/ocaml-opam (and wip/mccs), hacking a bit ocaml-tree-sitter to avoid a
`sudo make install` and with the following PR:

 semgrep/semgrep#2308

...and keeping semgrep-core and spacecat binaries available in PATH
environment variable.
Copy link
Member

@underyx underyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this is good practice. We should probably write a semgrep rule for it, too! Thanks for the patch.

Edit: oh, I clicked directly to the review screen and failed to notice the discussion above. I see I'm not the only one with these thoughts :D

@underyx
Copy link
Member

underyx commented Dec 26, 2020

This one works only with the standard no-space rule, but adds a rule ID, message, and matching any binary, not just bash. https://semgrep.dev/s/underyx:hardcoded-shebang-path

@aryx aryx merged commit ef8ab7d into semgrep:develop Dec 28, 2020
@iamleot
Copy link
Contributor Author

iamleot commented Dec 28, 2020

Thanks for review/discussion/merge @ievans, @underyx and @aryx!

Happy holidays!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants