Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added allowed_workflows to pytorch probot #7485

Merged
merged 4 commits into from
Apr 5, 2023

Conversation

DanilBaibak
Copy link
Contributor

@DanilBaibak DanilBaibak commented Mar 31, 2023

Added allowed_workflows to pytorch probot. This is a follow up PR regarding the retry bot.

cc @seemethere

@pytorch-bot
Copy link

pytorch-bot bot commented Mar 31, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7485

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 Failures

As of commit c7d2255:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@@ -1 +1,8 @@
tracking_issue: 2447
allowed_workflows:
- lint
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions here:

  1. The lint workflows aren't flaky so why would we rerun them? Seems like a waste of resources.

  2. For all the others we seem to use name of the workflow, e.g.

    name: Build Linux Wheels

    However, for the lint workflow we have

    Is this case-insensitive? Even if so, could we use the "right" spelling here to avoid confusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not case sensitive. I removed the lint workflow from the list. Once the configuration is moved to the domain repo, you guys decide which worker processes should be restarted.

@@ -1 +1,8 @@
tracking_issue: 2447
allowed_workflows:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment what this actually does? Preferably with a permalink to the implementation of the bot that uses this.

@DanilBaibak DanilBaibak requested a review from pmeier April 3, 2023 13:36
Comment on lines +6 to +10
- Build Linux
- Build Macos
- Build M1
- Tests on Linux
- Tests on macOS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are case-sensitive and basically just check job_name.startswith(allowed_workflow), shouldn't we be able to reduce this to

Suggested change
- Build Linux
- Build Macos
- Build M1
- Tests on Linux
- Tests on macOS
- Build
- Tests

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just prefer to be a bit more explicit. But again, this config should be maintained by domains. For now, I need it to be present. After, you can change it according to your needs or requirements.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize, all workflows going by the name: "..." attribute listed here, will be retried on the main branch in case they fail, right? Meaning, if we have flaky tests, the viable/strict branch can be updated anyway, correct?

@DanilBaibak
Copy link
Contributor Author

No, the retry mechanism should work for all PRs. But there are some additional rules: the number of failed jobs should be less than 5 per workflow, retry can happen once for workflow.

@DanilBaibak DanilBaibak merged commit 715db1d into main Apr 5, 2023
@DanilBaibak DanilBaibak deleted the Add-Allowed-Workflows-for-Retry-Bot branch April 5, 2023 11:40
@github-actions
Copy link

github-actions bot commented Apr 5, 2023

Hey @DanilBaibak!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@pmeier
Copy link
Collaborator

pmeier commented Apr 5, 2023

No, the retry mechanism should work for all PRs.

Meaning just on PRs or on main and PRs?

number of failed jobs should be less than 5 per workflow

Could you define "job" and "workflow" here? IIUC, "workflow" means a configuration file and a job is anything under the jobs key, right?

jobs:
unittests:

Do jobs that use a matrix count as multiple ones or as one?

retry can happen once for workflow.

Does the retry happen on workflow or job level? Meaning, if I have a single failing job in a large workflow, will everything be retried or just the failing job?

facebook-github-bot pushed a commit that referenced this pull request Apr 24, 2023
Reviewed By: vmoens

Differential Revision: D45183678

fbshipit-source-id: d1fcbcf8d10c8acb3937b56804f77860b8eb946a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants