Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pronto on GitHub actions #401

Closed
wants to merge 6 commits into from

Conversation

dsander
Copy link
Contributor

@dsander dsander commented Feb 20, 2021

When the specs action is triggered by pull_request Pronto can not
post reports to GitHub because there isn't a token present (the action
only has read access to the repository).

Since the pull_request_target action has write access to the repository
we must make sure that no code that is coming from the PR is executed.
The workflow definition used for those actions are always taken from the
target branch which means the pull request can not modify what is run.

The pronto workflow is a mouth full for a couple of reasons:

  • Since we can not use bundle exec (it would execute code from the
    pull request) we must install all dependencies and pronto itself as
    system gems.
  • bundle install --system does not install pronto itself, we must
    build and install it from the gemspec
  • Installing rugged takes a while, to be able to cache the system gems
    we have to use a custom GEM_HOME
  • pronto-rubocop needs a newer version because 0.10.0 is not
    compatible with pronto 0.11 (this wasn't an issue before because
    bundle install does not verify the version of the gem it is run in)
  • After all gems are installed we are checking out the pull requests
    code and finally run pronto
  • pronto has to run on Ruby 2.5 because version 4.14.0 is marked as
    compatible with ruby 2.4 but breaks with a runtime error
  • Explicitly setting the pronto runners makes sure the workflow does not
    silently succeed if - for some reason - pronto-rubocop was not
    installed properly or could not be loaded by pronto

When the specs action is triggered by `pull_request` Pronto can not
post reports to GitHub because there isn't a token present (the action
only has read access to the repository).

Since the `pull_request_target` action has write access to the repository
we must make sure that no code that is coming from the PR is executed.
The workflow definition used for those actions are always taken from the
target branch which means the pull request can not modify what is run.

The `pronto` workflow is a mouth full for a couple of reasons:

* Since we can not use `bundle exec` (it would execute code from the
  pull request) we must install all dependencies and pronto itself as
  system gems.
* `bundle install --system` does not install pronto itself, we must
  build and install it from the `gemspec`
* Installing `rugged` takes a while, to be able to cache the system gems
  we have to use a custom `GEM_HOME`
* `pronto-rubocop` needs a newer version because `0.10.0` is not
  compatible with pronto `0.11` (this wasn't an issue before because
  bundle install does not verify the version of the gem it is run in)
* After all gems are installed we are checking out the pull requests
  code and finally run pronto
* pronto has to run on Ruby 2.5 because version `4.14.0` is marked as
  compatible with ruby 2.4 but [breaks with a runtime error][1]
* Explicitly setting the pronto runners makes sure the workflow does not
  silently succeed if - for some reason - pronto-rubocop was not
  installed properly or could not be loaded by pronto

[1]: NARKOZ/gitlab#550
@dsander
Copy link
Contributor Author

dsander commented Feb 20, 2021

Pronto will not be run for this PR because pull_request_target workflows are always taken from the target branch of a PR and it does not exist in master yet.

After a lot of iterations this seems to be working well on my fork: dsander#7
https://github.com/dsander/pronto/actions/workflows/pronto.yml

It would be nice if someone could open a PR to my fork which should trigger pronto comments to make sure the workflow also works for non-members of a repository.

@dsander dsander marked this pull request as ready for review February 20, 2021 12:43
@dsander dsander requested a review from a team as a code owner February 20, 2021 12:43
@ashkulz
Copy link
Member

ashkulz commented Feb 21, 2021

This configuration looks quite complicated. Why wouldn't the first example in the wiki work? I'd prefer to dog-food what we're recommending to users 😁

@dsander
Copy link
Contributor Author

dsander commented Feb 21, 2021

Because of this:

${{ github.token }} is scoped to the current repository, so if you want to checkout a different repository that is private you will need to provide your own PAT. e.g ${{ secrets.GitHub_PAT }} # GitHub_PAT is a secret that contains your PAT.

That configuration works for pull request from prontolabs/pronto to prontolabs/pronto but not for pull requests from forks. Pronto needs write access to the target repo and the token only has read access when the action is running for a PR coming from a fork.

As a alternative you could create a dummy pronto GitHub user and publicly expose it's personal access token in the action configuration. I wanted to avoid this (mainly for my own sake so that I can copy the provided workflow to my own repos without having to create an additional github user).

@ashkulz
Copy link
Member

ashkulz commented Feb 23, 2021

I'm not sure how something like this action would really work -- it uses the same formatters that you have defined. Or does it also have the same limitation?

@dsander
Copy link
Contributor Author

dsander commented Feb 27, 2021

Converting CI to GitHub Action on this repo is the first instance for me where linting is required on a public repository that accepts pull request and I have to say a lot (if not all) documentation and examples do not cover that I tried both examples you linked, the one from the wiki and the pronto-action.

Not modifying anything the pronto-action fails as expected, it does not have the permissions to create comments: https://github.com/prontolabs/pronto/runs/1993385704?check_suite_focus=true
I then switched the example to use github.token instead of SECRETS.GITHUB_TOKEN which didn't make a difference: https://github.com/prontolabs/pronto/runs/1993433673?check_suite_focus=true

The example from the wiki first failed in a similar way CI currently fails on master: https://github.com/prontolabs/pronto/runs/1993385721?check_suite_focus=true
After making a small adjustment so that the target sha is also checked out (fetch-depth: 0 contrary to the number actually fetches all history) it also fails with a permission error: https://github.com/prontolabs/pronto/runs/1993422654?check_suite_focus=true

I still don't know how it used to work on Travis (I am assuming you have a configured PRONTO_GITHUB_ACCESS_TOKEN in their configuration), but the pull_request_target I added was the simplest (albeit still pretty large) configuration that is both hopefully secure and doesn't require an additional GitHub token.

@ashkulz
Copy link
Member

ashkulz commented Feb 27, 2021

I think this is an area where we definitely need to improve the documentation. Apparently, this is a known limitation for forks -- but I don't think using pull_request_target is recommended, see this article by GitHub's security lab:

Together with the pull_request_target, a new trigger workflow_run was introduced to enable scenarios that require building the untrusted code and also need write permissions to update the PR with e.g. code coverage results or other test results. To do this in a secure manner, the untrusted code must be handled via the pull_request trigger so that it is isolated in an unprivileged environment. The workflow processing the PR should then store any results like code coverage or failed/passed tests in artifacts and exit. The following workflow then starts on workflow_run where it is granted write permission to the target repository and access to repository secrets, so that it can download the artifacts and make any necessary modifications to the repository or interact with third party services that require repository secrets (e.g. API tokens).

I don't think the pronto formatters are designed to run this way (separate step to store results as artifact and then upload via API), I think we need a new formatter to cope with this. I don't think it's safe to checkout the PR in the context of pull_request_target, as not all the plugins can guarantee that code from the PR can't be injected into the process.

This looks like a good feature to build, but I don't think I'll merge this PR because of the security impact 😞 I'd rather build something like this and dogfood it ourselves, so that other projects can use pronto by looking at our configuration as an example of doing it securely via GitHub Actions.

@dsander
Copy link
Contributor Author

dsander commented Feb 28, 2021

I think this is an area where we definitely need to improve the documentation.

Sorry I meant that more in a general way, it feels to me that our use case wasn't thought of when Actions were designed. One easy solution would have been to allow pull_request Actions to comment on pull requests since that can be done by anyone manually anyways.

as not all the plugins can guarantee that code from the PR can't be injected into the process.

Oh, I was assuming that they were we are mostly using the robocop and simplecov reporters and, to my knowledge, they don't eval the to code they are checking. But if that is something that can't be guaranteed the action definitely isn't something that should be recommended for all pronto users.

@ashkulz
Copy link
Member

ashkulz commented Mar 1, 2021

I think this is an area where we definitely need to improve the documentation.

Sorry I meant that more in a general way, it feels to me that our use case wasn't thought of when Actions were designed. One easy solution would have been to allow pull_request Actions to comment on pull requests since that can be done by anyone manually anyways.

Exactly! pronto works well right now, except within GitHub Actions for a pull request workflow -- which would be most open-source Ruby projects, due to mass-migration away from TravisCI. We definitely need to develop something to cater to this, as pronto is such a project too 🙂 While a solution may take some time, I'd rather incorporate this fact into the README.

as not all the plugins can guarantee that code from the PR can't be injected into the process.

Oh, I was assuming that they were we are mostly using the robocop and simplecov reporters and, to my knowledge, they don't eval the to code they are checking. But if that is something that can't be guaranteed the action definitely isn't something that should be recommended for all pronto users.

That's possibly true for this repository, but I'm not sure if it's true for most projects. Also, I couldn't find a FAQ if it was safe to run rubocop on untrusted Ruby code -- I'd rather be pessimistic by default. Although you've used a different cache key for the action cache, someone malicious can also use cache poisoning attacks -- safest is not to use a cache, but then like you said rugged installation is very slow.

I have a possible idea, but not sure I'll get a chance to work on it right away:

  • introduce a --replay-file option to pronto run which would serialize all the messages to it, replacing all formatters with NullFormatter
  • introduce a pronto replay command which would take the replay file and just call the formatters with messages from it (effectively doing the API calls).

This would fit the GitHub Actions model very well, with the replay file being the artifact that's used to share data across runs. Not sure about how to achieve the serialization, as the current in-memory structure isn't really friendly to that.

@dsander
Copy link
Contributor Author

dsander commented Mar 5, 2021

, I couldn't find a FAQ if it was safe to run rubocop on untrusted Ruby code

Me neither, I searched through their code for eval and similarly insecure calls but could not find any. This ofc isn't a guarantee that it is safe to use on untrusted code.

Although you've used a different cache key for the action cache, someone malicious can also use cache poisoning attacks -- safest is not to use a cache, but then like you said rugged installation is very slow.

As far as I know they would only be able to write something malicious into the cache if they get to the point to execute their own code. bundle install is running based on the latest master branch so this should be safe. If we don't want to trust rubocop to be safe everything else really doesn't matter much 😄

This would fit the GitHub Actions model very well, with the replay file being the artifact that's used to share data across runs. Not sure about how to achieve the serialization, as the current in-memory structure isn't really friendly to that.

The plan sounds solid to me, just don't use an unsafe deserializer like YAML.load 😜

I don't think that I will have enough time to look into it myself the next week. Feel free to ping me if you would like some help with the updated action setup. I think I now know how most of their logic and setup works after trying and failing with so many attempts 😄

dsander added a commit to dsander/pronto that referenced this pull request Mar 5, 2021
In prontolabs#401 we discussed that there
currently is no 100% safe way to run pronto on `pull_request_target`
actions we can not send any reports to the GitHub pull request.

This simple action will show up as a failed check. The offences then
need to be viewed in the action output.
@dsander
Copy link
Contributor Author

dsander commented Mar 5, 2021

Closing in favor of #403

@dsander dsander closed this Mar 5, 2021
dsander added a commit to dsander/pronto that referenced this pull request Mar 5, 2021
In prontolabs#401 we discussed that there
currently is no 100% safe way to run pronto on `pull_request_target`
actions we can not send any reports to the GitHub pull request.

This simple action will show up as a failed check. The offences then
need to be viewed in the action output.
@ashkulz
Copy link
Member

ashkulz commented Mar 8, 2021

Although you've used a different cache key for the action cache, someone malicious can also use cache poisoning attacks -- safest is not to use a cache, but then like you said rugged installation is very slow.

As far as I know they would only be able to write something malicious into the cache if they get to the point to execute their own code. bundle install is running based on the latest master branch so this should be safe. If we don't want to trust rubocop to be safe everything else really doesn't matter much 😄

As a pathological case, imagine the following:

  • step 1: create a normal pull request workflow installs a malicious gem to the cache (by explicitly overriding the same cache key) -- you can't access any secrets as yet
  • step 2: force push something non-malicious. From the above GitHub article, the cache isn't partitioned so you've got a malicious gem loaded from the cache loaded in the pull_request_target trigger.

There's too many things which can go wrong 🤷‍♂️

@dsander
Copy link
Contributor Author

dsander commented Mar 13, 2021

There's too many things which can go wrong 🤷‍♂️

That could work, but only if the malicious user already found a way to execute their own code. If we are assuming that all used pronto plugins are safe the cache could not be touched by the user because it is build based on the dependencies defined on pronto/master. Everything in the action up to this command is run based on the one target branch of the pull request, it might be possible to get very creative and get something malicious into the target branch, but I don't know if that is possible without having commit access to the pronto repository.

ashkulz pushed a commit that referenced this pull request Mar 16, 2021
In #401 we discussed that there
currently is no 100% safe way to run pronto on `pull_request_target`
actions we can not send any reports to the GitHub pull request.

This simple action will show up as a failed check. The offences then
need to be viewed in the action output.
AdamHawkinsa pushed a commit to AdamHawkinsa/pronto that referenced this pull request Nov 20, 2023
In prontolabs/pronto#401 we discussed that there
currently is no 100% safe way to run pronto on `pull_request_target`
actions we can not send any reports to the GitHub pull request.

This simple action will show up as a failed check. The offences then
need to be viewed in the action output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants