Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(lambda): Prevent scale-up lambda from starting runner for user repo if org level runners is enabled #3909

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

PerGon
Copy link

@PerGon PerGon commented May 13, 2024

We run a setup where we only have "Org Level Runners" (enable_organization_runners: true).

When creating an action job in a repository that does not belong to an Organization (User level repository), that ScaleUp lambda will happily create a new VM for it.
Well, this new VM in turn, will never be able to register itself in GitHub because it cannot register itself as a "Org Level Runner" as such concept does not exist at the user level.
This VM will then become a "zombie" VM and will also cause issues with the ScaleDown lambda (described in ToBeCreatedIssue).

In this PR, we will pass the "Repo Owner Type" all the way from the Webhook lambda to the ScaleUp lambda (by adding a new field in the SQS payload). Then, the ScaleUp lambda can decide to drop the request if the "Repo Owner Type" is not of the type Organization and enable_organization_runners is set to true.

Copy link
Member

@npalm npalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time crating the PR. What I don't understand why the user event is sent to the webook. The common way of using the module is installing repositories to an app.

For me it is not 100% clear what you mean with user space repo's. Are that forks of org repo's?

@@ -305,6 +316,11 @@ describe('scaleUp with GHES', () => {
expect(mockOctokit.paginate).toHaveBeenCalledTimes(1);
});

it('Throws error if it is a User repo and org level runners is enabled', () => {
process.env.ENABLE_ORGANIZATION_RUNNERS = 'true';
expect(() => scaleUpModule.scaleUp('aws:sqs', TEST_DATA_USER_REPO)).rejects.toBeInstanceOf(Error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer not to add a global event (module scope) for only a single test. Please can you update the event in the test to set the repoUserType

if (enableOrgLevel && payload.repoOwnerType !== 'Organization') {
logger.warn(`Repository ${payload.repositoryOwner}/${payload.repositoryName} does not belong to a GitHub` +
`organization and organization runners are enabled. This is not supported. Not scaling up for this event.`);
throw Error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not throw an error, this will send the message back to SQS. And trigger a retry.

@PerGon
Copy link
Author

PerGon commented May 22, 2024

Howdy! Thanks for having a look.

Sorry I was not very clear 😅 I'll try to elaborate!

Let's consider the following scenario with 2 different repos:

  1. github.my_enterprise.com/some_team/app1 - In this case we have a repo called app1 in one GitHub Org called some_team
  2. github.my_enterprise.com/john_doe/some_poc - In this case we have a repo called some_poc in a GitHub User called john_doe

In our setup we use a GitHub App. We also have plenty of GitHub Orgs in our enterprise - we tell the users who want to use our centrally maintained GHA Runners to install the specific GitHub App in their GitHub Orgs so they can use GHA (we have other CIs in our company 🙄 ) - in our internal documentation we clearly state "Do not install the GitHub App in your GitHub User account - ONLY install in GitHub Organizations".
But, even tho this is clearly stated in our internal docs, some engineers will eventually install the GitHub App for GHA runners in their GitHub User account - and once they try to create Action in their "personal" repos, it will cause the erratic behavior I described in OP.

As far as we can see, there's no way to prevent a GitHub App from being installed in GitHub User accounts 😢

The TL:DR; is that we have a self-service model where engineers can onboard their GitHub Orgs to use GHA Runners, but if they install the GitHub App in their own account (instead of their org) to try our GHA, it will cause issues.

Here's an example of two GitHub Apps installations, one that is OK and one that is not OK.

This is an OK use case

image

This is an NOT an OK use case

image

Sorry for the wall of text 😅 hopefully I could exemplify the case that causes issues. Let me know if it made things a bit more clear 😸 and also thanks for the comments in the code - will address them as soon as I have a few minutes!

@PerGon
Copy link
Author

PerGon commented May 30, 2024

I've addressed the comments on the code (I think 😅 ) - thanks for the review. Let me know how it looks and if you think more changes are needed 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants