Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove more tools from Bento configuration #1150

Merged
merged 2 commits into from
Jun 30, 2020

Conversation

mschwager
Copy link
Contributor

Comments inline.

.bento/config.yml Show resolved Hide resolved
flake8:
ignore:
- bare-except-bugbear
run: true
r2c.registry.latest:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is running https://r2c.dev/default-r2c-checks, which should be covered by our semgrep-run-r2c-config job.

.bento/config.yml Show resolved Hide resolved
ignore:
- use-timeout
run: true
semgrep:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Covered by semgrep-run-r2c-config.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does let semgrep run automatically as a pre-commit hook. Should we add semgrep to .pre-commit.yaml first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required a bit more work, but I updated the pre-commit configuration as well.

ignore:
- use-timeout
run: true
semgrep:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required a bit more work, but I updated the pre-commit configuration as well.

rev: 'v0.12.0'
hooks:
- id: semgrep
args: ['--config', 'https://semgrep.live/p/python', '--include', '*.py', '--precommit', '--error']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying this with something like /p/r2c was really slow and produced lots of parse errors. Using /p/python and --include *.py sped things up.

@@ -1,52 +1,39 @@
## semgrep build

FROM python:3.7.7-alpine3.11 as build-semgrep
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this is no longer necessary now that we're not burning the commit hash into the package build/install. I.e. the .git directory is no longer necessary, and we can greatly simplify the install process (a simple pip install should do the trick).

USER opam

WORKDIR /home/opam/opam-repository
RUN git pull && opam update && opam switch create 4.10.0+musl+static+flambda

COPY --chown=opam . /home/opam/semgrep/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPYing everything in significantly slows down the build process because the Docker context is huge. I.e. if anything changes in the working directory we have to rebuild semgrep-core from scratch, which takes a long time. Now we only rebuild if anything in .git, pfff, semgrep-core, or .gitmodules changes, which helpfully avoids the rebuild if semgrep/ changes.

COPY --from=build-semgrep-core /home/opam/semgrep/semgrep-core/_build/default/bin/Main.exe /bin/semgrep-core
RUN semgrep-core --help
COPY semgrep /semgrep
RUN HOMEBREW_SYSTEM='NOCORE' python -m pip install /semgrep
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HOMEBREW_SYSTEM='NOCORE' is now a misnomer referencing an env variable used in setup.py, but we can change that in a separate PR.

ENTRYPOINT [ "/root/.local/bin/semgrep" ]
ENV PYTHONUNBUFFERED=1
ENTRYPOINT ["semgrep"]
CMD ["--help"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no args are passed to docker run print the --help.

@@ -27,6 +27,7 @@
IN_DOCKER = "SEMGREP_IN_DOCKER" in os.environ
IN_GH_ACTION = "GITHUB_WORKSPACE" in os.environ
REPO_HOME_DOCKER = "/home/repo/"
REPO_HOME_DOCKER_PRECOMMIT = "/src/"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here was the big blocker that was breaking pre-commit - it mounts the code at /src/. We can re-purpose the --precommit flag to handle this case.

@@ -305,6 +306,10 @@ def close(self) -> None:

if self.final_error:
raise self.final_error
elif self.rule_matches and self.settings.error_on_findings:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --error flag wasn't doing anything prior to this. It's functionality got lost somewhere along the line in a refactor.

@mschwager mschwager force-pushed the mschwager-more-bento-removal branch from 7ddf9b9 to e3b0604 Compare June 30, 2020 14:45
Copy link
Contributor Author

@mschwager mschwager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that the pre-commit lint will break until these changes are in since the new pre-commit configuration requires the latest Docker image.

ENV PYTHONIOENCODING=utf8
ENTRYPOINT [ "/root/.local/bin/semgrep" ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was another issue with running semgrep via pre-commit. pre-commit runs Docker containers with the UID and GID of the host user. This was breaking this entrypoint because an unprivileged user was trying to run something from /root/.... Now that we simply pip install semgrep we can access the script as an unprivileged user 👍

WORKDIR /home/pythonbuild/semgrep
# downgrade to 20.1 since 20.1.1 reverts building in place
# https://github.com/pypa/pip/issues/7555
RUN pip install pip==20.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed cause of comment above re: no longer needing .git directory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to pin the version anymore since we're no longer using that functionality to burn in SCM information.

@brendongo brendongo merged commit 46eb085 into develop Jun 30, 2020
@brendongo brendongo deleted the mschwager-more-bento-removal branch June 30, 2020 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants