Add GitHub runner telemetry to aid in diagnosing concurrent E2E test failures #309

Open
Tracked by #50
jeromy-cannon opened this issue May 16, 2024 · 3 comments
Labels
  • Design: Issue/PR for feature design documents
  • New Feature: A new feature, service, or documentation. Major changes that are not backwards compatible.
  • P2: Required to be completed in the assigned milestone, but may or may not impact release schedule.

Comments

@jeromy-cannon
Contributor

jeromy-cannon commented May 16, 2024

Possible Solutions

  1. workflow-telemetry-action
  2. Argo Prometheus Metrics
  3. Enabling GitHub ARC Metrics

1. workflow-telemetry-action

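It was discovered that there were two issues with this solution:

  • No support for reusable workflows: catchpoint/workflow-telemetry-action#67
  • Self-hosted GH Runners require elevated security: catchpoint/workflow-telemetry-action#44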

2. Argo Prometheus Metrics

This might help us with some metrics such as the following:

  • Keeping track of the duration of a workflow or template
  • Keeping track of the number of times a Workflow or Template fails over time
  • Custom metrics
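
As a rough sketch of what the first two bullets could look like, assuming Argo Workflows' `metrics.prometheus` block on a Workflow spec. This only applies if the E2E tests are run as Argo Workflows, which is itself an assumption. The workflow name, metric names, label values, and container image are all hypothetical:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: e2e-telemetry-   # hypothetical name
spec:
  entrypoint: run-e2e
  metrics:
    prometheus:
      # Gauge recording how long the whole workflow took.
      - name: e2e_workflow_duration_seconds   # hypothetical metric name
        help: "Duration of the E2E workflow in seconds"
        labels:
          - key: suite
            value: e2e
        gauge:
          value: "{{workflow.duration}}"
      # Counter incremented only when the workflow ends in failure.
      - name: e2e_workflow_failures_total     # hypothetical metric name
        help: "Number of failed E2E workflow runs"
        when: "{{workflow.status}} == Failed"
        counter:
          value: "1"
  templates:
    - name: run-e2e
      container:
        image: alpine:3.19                    # placeholder image
        command: [sh, -c]
        args: ["echo 'E2E tests would run here'"]
```

Neither metric covers CPU or RAM on the runner itself, so this would complement runner-level telemetry rather than replace it.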

3. Enabling GitHub ARC Metrics

This provides some metrics, but no CPU or RAM utilization:

  • busy runners
  • idle runners
  • number of jobs assigned to a runner
  • many more...
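
For reference, a sketch of the Helm values that I believe turn these metrics on for the `gha-runner-scale-set-controller` chart. The keys are taken from that chart's default values file as I understand it, so verify them against the chart version in use. The port and path shown are illustrative:

```yaml
# values.yaml for the gha-runner-scale-set-controller chart (assumed layout)
metrics:
  controllerManagerAddr: ":8080"   # address the controller serves metrics on
  listenerAddr: ":8080"            # address each listener pod serves metrics on
  listenerEndpoint: "/metrics"     # HTTP path scraped by Prometheus
```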

NOTES:

@jeromy-cannon jeromy-cannon added the New Feature and Needs Triage labels May 16, 2024
@JeffreyDallas
Contributor

Experimented with workflow-telemetry-action, but I am getting this error:

/runner/_work/_actions/catchpoint/workflow-telemetry-action/v1.8.4/dist/webpack:/workflow-telemetry-action/node_modules/@octokit/auth-action/dist-node/index.js:19
    throw new Error("[@octokit/auth-action] The token variable is specified more than once. Use either `with.token`, `with.GITHUB_TOKEN`, or `env.GITHUB_TOKEN`. See https://github.com/octokit/auth-action.js#createactionauth");
^
Error: [@octokit/auth-action] The token variable is specified more than once. Use either `with.token`, `with.GITHUB_TOKEN`, or `env.GITHUB_TOKEN`. See https://github.com/octokit/auth-action.js#createactionauth

I have tried different release versions; all of them produce the same error.

Workflow logs

https://github.com/hashgraph/solo/actions/runs/9164986979/job/25197553053

@jeromy-cannon
Contributor Author

jeromy-cannon commented May 21, 2024

@JeffreyDallas, it looks like it is complaining that we have more than one token it can use. We probably need to pick one and pass it in explicitly. In the settings we have several that look similar, so we should specify one to avoid confusing it.

I think that since this is a reusable workflow, we will need to declare a secret at the top and then pass it in from where the workflow is called. The caller would pass in GITHUB_TOKEN.

Search for snyk-token and you can see the pattern used there.
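
A minimal sketch of that pattern, not our actual workflow files. The file name `zxc-example.yaml`, the secret name `telemetry_token`, and the `runs-on` value are hypothetical, and the `github_token` input name is my reading of the action's README, so treat it as an assumption. The reusable workflow declares the secret and hands it to the action through exactly one channel:

```yaml
# .github/workflows/zxc-example.yaml (hypothetical reusable workflow)
on:
  workflow_call:
    secrets:
      telemetry_token:               # hypothetical secret name
        description: "Token passed to the telemetry action"
        required: true

jobs:
  e2e:
    runs-on: ubuntu-latest           # placeholder; ours would be a self-hosted runner
    steps:
      - name: Collect workflow telemetry
        uses: catchpoint/workflow-telemetry-action@v1.8.4
        with:
          # Supply the token in one place only; also setting env.GITHUB_TOKEN
          # appears to be what triggers "specified more than once".
          github_token: ${{ secrets.telemetry_token }}
```

The caller then maps GITHUB_TOKEN onto that secret:

```yaml
# Caller workflow (hypothetical job name)
jobs:
  e2e-tests:
    uses: ./.github/workflows/zxc-example.yaml
    secrets:
      telemetry_token: ${{ secrets.GITHUB_TOKEN }}
```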

@JeffreyDallas
Contributor

This workflow failed with: Unable to get current workflow job info. Please sure that your workflow have "actions:read" permission!
This happens even though actions:read is already defined.
https://github.com/hashgraph/solo/actions/runs/9179531736/job/25241904073

As Jeromy noticed, it looks like there are two showstoppers:

  • No support for reusable workflows: catchpoint/workflow-telemetry-action#67
  • Self-hosted GH Runners require elevated security: catchpoint/workflow-telemetry-action#44

@jeromy-cannon jeromy-cannon added the P2 and Blocker labels and removed the Needs Triage label May 31, 2024
@jeromy-cannon jeromy-cannon added the Design label and removed the Blocker label May 31, 2024
Projects
Status: 📋 Backlog
Development

No branches or pull requests

2 participants