Document difference between public/private repo's for organizational runners #732

consideRatio · 2021-08-18T21:30:09Z

UPDATE

An organizational runner won't pick up a job from a public repo unless it's explicitly allowed to via a quite hard-to-find checkbox: see #732 (comment).

Action point to close issue: document it.

There is a troubleshooting section to fix common mistakes, but there is no debugging section to help users figure out how to identify what could have gone wrong.

I'd be happy to contribute such documentation if someone can help me on my way to debug my situation better. I've been stuck on this for ~6 hours or so now.

My specific situation in need of debugging advice

1. I've created an app within Org A, and only granted the repository level permission listed for organizational runners.
2. I've installed the app in Org B.
3. I've created a k8s secret with credentials to use.
4. I've got pods from a RunnerDeployment running and ready, and they are listed in Org B as idle.
5. I've created a PR that references self-hosted, but the job isn't picked up by the org runner, and my runner is stuck with the following logs as read via kubectl.

√ Connected to GitHub

2021-08-18 20:29:02Z: Listening for Jobs

Version details

Chart 0.19.0
summerwind/actions-runner:latest (Translates to v2.280.2-ubuntu-20.04-b6465c5 thanks to imagePullPolicy: Always)
A k3s-based k8s cluster running on Raspberry Pi 4B computers with an Arm64 architecture.

mumoshu · 2021-08-19T00:49:43Z

@consideRatio Hey! I read your situation and wasn't sure what your goal was.

So you seem to have correctly set up actions-runner-controller to successfully register organizational runners for Org B. That looks good.

Now, in step 5, in which repository are you submitting a PR? Is that repo in Org B? Then it should trigger a workflow run whose jobs will be scheduled onto Org B's runners. If it doesn't, it may be a bug in GitHub, not us, as all we do is to configure and deploy runners as you've specified.

If you're submitting a PR against a repo in Org A and asking why the jobs aren't scheduled onto Org B's runners, I don't understand how it can work. If that's the case, you should probably clarify a bit more about your goal.

consideRatio · 2021-08-19T01:27:36Z

Now, in step 5, in which repository are you submitting a PR? Is that repo in Org B?

Ah yes, I'm submitting a PR to a repo in Org B where the runners are observed as registered as running.

[...] it may be a bug in GitHub, not us, as all we do is to configure and deploy runners as you've specified.

Absolutely. Not having a deep understanding of this, actions/runner, or related code bases, I find it hard to draw a conclusion on where to focus my attention to resolve the issue I have. Is it your belief that if the following conditions are met, then it probably is a bug in GitHub somehow rather than this repo?

- Runners are registered to an org according to GitHub's UI
- Runners report "Listening for Jobs"
- A repo in the org receives a pull request that has a job that runs-on: self-hosted, and the runner has a label of self-hosted.

A checklist like the one above would actually be helpful when debugging an unknown issue!
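A checklist like this can also be expressed as a small script. The following is only a minimal sketch: it assumes a runner is described by a dict shaped like one entry in the response of GitHub's REST API for listing an organization's self-hosted runners (`status`, `busy`, and `labels` with a `name` each), and the function name `checklist_problems` is hypothetical. Passing these checks does not guarantee a job gets picked up, since the runner group must also allow the repository in question.

```python
from __future__ import annotations


def checklist_problems(runner: dict, runs_on: list[str]) -> list[str]:
    """Return the reasons a runner fails the checklist; an empty list means all checks pass.

    `runner` is assumed to look like one entry of the "runners" array in the
    REST API response (an assumption, not something this thread confirms).
    """
    problems = []

    # The runner should be registered and reachable.
    if runner.get("status") != "online":
        problems.append("runner is not online")

    # A busy runner cannot take another job right now.
    if runner.get("busy"):
        problems.append("runner is busy")

    # Every label requested by the job's runs-on must be present on the runner.
    runner_labels = {label["name"] for label in runner.get("labels", [])}
    missing = [label for label in runs_on if label not in runner_labels]
    if missing:
        problems.append("missing labels: " + ", ".join(missing))

    return problems


if __name__ == "__main__":
    example = {
        "name": "jupyterhub-org-abcde",  # hypothetical runner name
        "status": "online",
        "busy": False,
        "labels": [{"name": "self-hosted"}, {"name": "linux"}],
    }
    print(checklist_problems(example, ["self-hosted"]))
```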

consideRatio · 2021-08-19T01:28:04Z

My next debugging ideas

The org I've tried against doesn't grant the GITHUB_TOKEN any notable permissions by default and instead requires them to be explicitly requested.
- Action point on my end: verify this doesn't matter by testing against another org without this restriction
The PR I've tried updated the GitHub workflow file as part of the PR, so the job was triggered by the same PR that opted in to using the self-hosted runners.
- Action point on my end: verify this doesn't matter by triggering jobs via push events to the default branch of some repo instead
I haven't granted the GitHub App the read/write administration permission on the repository level, which is needed for repository runners, even though it was pre-filled when clicking the link for organizational runners. I am only interested in organizational runners though.
- Action point on my end: try a GitHub App that requests that permission as well, install it in a GitHub org, and test it there.

Btw thank you @mumoshu for your work on this project and responding to this potentially support-like issue. I hope to make it a contribution rather than just become a support errand for you maintainers by focusing on identifying advice to document under a "debugging" topic or similar.

mumoshu · 2021-08-19T01:42:17Z

if the following conditions are met, then it probably is a bug in GitHub somehow rather than this repo?

Yes, I believe so. Thanks for summarizing it nicely!

To be extra sure, can you share your RunnerDeployment manifest here? You should replace some concrete values in your manifest, like spec.organization to Org B to make it comparable with your description made above.

mumoshu · 2021-08-19T01:46:08Z

@consideRatio I saw your comment and although I see nothing suspicious in your setup right now, you could try:

- Give more permissions to your GitHub App
- Do use a private key, not GITHUB_TOKEN, for a GitHub App based deployment of actions-runner-controller, as explained in our README (you mention GITHUB_TOKEN, but I don't understand why you need to mention that when you're deploying it with a GitHub App)

consideRatio · 2021-08-19T01:56:48Z

# kubectl apply -f actions-runner-controller-runnerdeployment.yaml
#
# reference example: https://github.com/actions-runner-controller/actions-runner-controller#additional-tweaks
#
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: jupyterhub-org
  namespace: actions-runner-controller
spec:
  replicas: 1
  template:
    spec:
      organization: jupyterhub
      labels:
        - self-hosted

I'm using a private key created from the GitHub App's config page, downloaded and added to a k8s secret read by the actions-runner-controller pod, so I'm not using a GitHub token. I also doubt that the organization's configuration of default permissions granted to the GitHub token injected into jobs is relevant, but it was one of the things that I know could make my configuration stand out from others' in some way or another. I'm just guessing at wild things at this point =/

mumoshu · 2021-08-19T02:14:39Z

@consideRatio What do you see in the Actions tab of your repo? Does the triggered job fail due to a "no runner" error, or anything else?

In your manifest, why do you explicitly specify the self-hosted runner label? I thought self-hosted was automatically added by GitHub to any runner on registration. I have never tried specifying it from my end so that might make some difference.

consideRatio · 2021-08-19T02:21:58Z

Now it says "starting job...", but it has sometimes said that no org or repo level runner matched the label "self-hosted".

I specified self-hosted explicitly as the error mentioned it, even though I saw they already had such a label automatically registered.

I don't know why it sometimes says it is starting up and other times says there is no runner with a matching label.

consideRatio · 2021-08-19T02:23:32Z

mumoshu · 2021-08-19T02:29:53Z

After a few moments, it shows this error. So, I believe this is due to some discrepancy between your expectation and the actual config.

https://github.com/jupyterhub/zero-to-jupyterhub-k8s/runs/3365601314?check_suite_focus=true

mumoshu · 2021-08-19T02:30:40Z

I specified self-hosted explicitly as the error mentioned it, even though I saw they already had such a label automatically registered.

Anyway, AFAIK, you don't need it. Can you try removing it from RunnerDeployment yaml?

mumoshu · 2021-08-19T02:32:53Z

Chart 0.19.0

To be extra sure, you should recheck your chart version. There's no chart of that version. 0.19.0 might be the controller version.

consideRatio · 2021-08-19T09:16:35Z

NAME                     	NAMESPACE                	REVISION	UPDATED                                	STATUS  	CHART                           	APP VERSION
actions-runner-controller	actions-runner-controller	1       	2021-08-18 00:40:12.75749028 +0200 CEST	deployed	actions-runner-controller-0.12.7	0.19.0

Woops, okay, it is Chart version 0.12.7 - sorry for the confusion.

Anyway, AFAIK, you don't need it. Can you try removing it from RunnerDeployment yaml?

Absolutely, I've already done it. It was how I started, and I'll continue without listing it explicitly.

Thank you for your attention and help to debug this @mumoshu, I have some work to do to investigate this further and will try to summarize findings after that!

consideRatio · 2021-08-20T22:12:38Z

Yikes okay, I've tried all the ideas we discussed, with no change in the outcome. I've also tried the summerwind/actions-runner images with versions 2.277.1, 2.278.0, 2.279.0, 2.280.2, and 2.280.3 without any change in the outcome.

It seems some people report something similar from time to time on their discourse forum without a clear resolution. I couldn't identify a sticky issue about this in https://github.com/actions/runner/issues either.

Overall, I remain clueless and not sure at all how to proceed.

mumoshu · 2021-08-23T00:23:02Z

@consideRatio Hey! Thanks for reporting. Unfortunately, I have no idea what would be the answer to your issue yet.

If I were you, I would try to isolate the cause by creating the GitHub App in the same organization you install it onto, i.e. both create and install the app in either Org A or Org B, not across them.

I would also try verifying that all other settings are correct by trying to make it work with a personal access token instead of a GitHub App.

Another possibility: try repository runners rather than the organizational runners you have already tried.

mumoshu · 2021-08-23T00:24:59Z

@consideRatio I was rereading your original issue details and this caught my eye:

I've created an app within Org A, and only granted the repository level permission listed for organizational runners.

Could you share the exact list of permissions you've provided to your app?

Are you sure you also provided the Self-hosted runners (read / write) organizational permission?

https://github.com/actions-runner-controller/actions-runner-controller#deploying-using-github-app-authentication

consideRatio · 2021-08-23T00:42:50Z

I've already reduced the complexity by using a single GitHub organization where the app is both defined and installed. It made no difference.

Are you sure you also provided the Self-hosted runners (read / write) organizational permission?

Yepp!

Could you share the exact list of permissions you've provided to your app?

Repo	Org

The installed application within the organization describes the following permissions granted. It does not mention repository related permissions.

consideRatio · 2021-08-23T00:54:44Z

Do you think these points are of relevance?

1. I didn't configure a webhook URL, and explicitly unchecked "active" in this section so that I could create the app without a webhook URL configured. My Helm chart installation has the webhook setup disabled as well.
2. I have not created a client secret for the GitHub App, but instead relied on a private key that I did create according to this repo's README.
3. I observe there is an opt-out'able feature enabled for the GitHub App. I didn't explicitly opt in or similar, it was a default.
4. I did make the app installable by any org during creation of the app.

mumoshu · 2021-08-23T01:11:23Z

@consideRatio You aren't using actions-runner-controller's webhook-based autoscale, right? Then point 1 seems ok.

For 2, honestly, I haven't tried using client secrets so it may make some difference. Sry for answering a question with a question, but could a client secret be used as an alternative to a personal access token?

For 3 and 4, I have not tried changing the defaults while testing actions-runner-controller (sry but developing all the features and testing all the combinations of settings myself isn't sustainable) so I can't say for sure how they affect the setup.

consideRatio · 2021-08-29T04:36:26Z

[...] could a client secret be used as an alternative to a personal access token?

I don't think so, I'm just confused in general. I remember reacting to the fact that I could not delete a client secret after creating one without creating a new one first. So, it's like they required you to have one, but at the same time, we are not using one. Due to that hint from GitHub, I got a bit confused about the situation without any proper understanding of it.

sry but developing all the features and testing all the combinations of settings myself isn't sustainable

Absolutely understand that. I've bashed my head against this a bit more and tried various permutations. Still stuck with the same issue. I'll raise a question in the GitHub forums to ask how to debug a situation where I have a runner that is registered and idle according to GitHub, while at the same time not picking up any jobs even though it has matching labels.

My current hypothesis is that they are sending some request to my runner, but the runner either doesn't receive it due to some networking issue, or receives it and fails without logging an error or similar.

consideRatio · 2021-08-29T06:00:58Z

I installed tcpdump and analyzed the traffic coming to my runner pod. It seems like it's just a repeat of some form of keep-alive packets coming.

https://github.community/t/self-hosted-runner-registered-as-idle-but-not-picking-up-jobs/198240/2?u=consideratio

mumoshu · 2021-08-30T00:19:20Z

@consideRatio Thanks for the update and your patience. I just saw the response you got on the forum. Glad to see you've found the solution.

I wish I could have pointed it out myself. I'm feeling very sorry about that 😢

I think I've tested my GitHub App-based deployment by triggering jobs on a private repo. Apparently, that was the difference.

Based on this experience, we should at least add some notes about how to use organizational runners on public repositories. Would you agree?

The biggest gotcha from my perspective was that you need to configure the Default runner group even if you aren't going to use runner groups with your organizational runners. An organizational runner implicitly belongs to the Default runner group but who cares.
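The gotcha above can also be checked programmatically rather than by hunting for the checkbox. A minimal sketch, assuming the input is the list of runner groups shaped like the `runner_groups` array returned by GitHub's REST API for listing an organization's self-hosted runner groups, where each group carries an `allows_public_repositories` flag. The function name is hypothetical, and the field name should be verified against the current API documentation.

```python
from __future__ import annotations


def groups_blocking_public_repos(runner_groups: list[dict]) -> list[str]:
    """Return the names of runner groups that do not allow public repositories.

    A missing flag is treated as "not allowed", which is an assumption made
    here to err on the side of reporting a potential problem.
    """
    return [
        group["name"]
        for group in runner_groups
        if not group.get("allows_public_repositories", False)
    ]


if __name__ == "__main__":
    # Hypothetical payload: the Default group with public repositories disallowed.
    example_groups = [
        {"name": "Default", "default": True, "allows_public_repositories": False},
    ]
    for name in groups_blocking_public_repos(example_groups):
        print(f"Runner group '{name}' will not serve jobs from public repos")
```

If the Default group shows up in the result, organizational runners that were never assigned to another group would be expected to ignore jobs from public repositories, which matches the behavior described in this issue.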

consideRatio · 2021-08-30T00:24:43Z

Based on this experience, we should at least add some notes about how to use organizational runners on public repositories. Would you agree?

Haha yes, it can save some workdays of debugging ;D Excellent image!

I wish I could have pointed it out myself. I'm feeling very sorry about that 😢

Oh no worries at all, I'm very thankful for your help considering this with me! 🙇 ❤️

stale · 2021-09-29T01:25:37Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2021-10-29T12:18:21Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dongho-jung · 2022-12-05T03:04:29Z

you saved my day indeed

consideRatio mentioned this issue Aug 25, 2021

Improve performance of CI system jupyter/docker-stacks#1407

Closed

stale bot added the stale label Sep 29, 2021

consideRatio changed the title ~~Debugging advice documentation~~ Document difference between public/private repo's for organizational runners Sep 29, 2021

stale bot removed the stale label Sep 29, 2021

stale bot added the stale label Oct 29, 2021

mumoshu added documentation Improvements or additions to documentation and removed stale labels Nov 10, 2021
