Skip to content

Conversation

clee2000
Copy link
Contributor

@clee2000 clee2000 commented Jan 12, 2023

  • flatten the workflows into just jobs in order to give more specific links (link to the specific job that failed instead of just pull), this should make it easier to implement bypass certain failures in the future
  • try catch of MandatoryChecksMissingError from find_matching_merge_rule should fix error where merge loops instead of raising runtime error when trunk job fails
  • remove usage of on_green and mandatory_only flags just in case. on_green and force are the only two behaviors we currently use
  • fail if ghstack pr has non ghstack change, tested locally with useless change 2 #92177 but unsure how to write tests b/c requires use of repo._run_git

@pytorch-bot
Copy link

pytorch-bot bot commented Jan 12, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92097

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 Failures

As of commit 8da3a25:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jan 12, 2023
@clee2000 clee2000 changed the title [mergebot] Flatten workflows into jobs, fix bug where bot does not fail early [mergebot] Flatten workflows into jobs, fix bugs Jan 13, 2023
@clee2000 clee2000 force-pushed the csl/trymergechanges branch from d48377c to c691d1d Compare January 13, 2023 22:56
@clee2000 clee2000 marked this pull request as ready for review January 13, 2023 23:13
@clee2000 clee2000 requested a review from a team as a code owner January 13, 2023 23:13
@clee2000 clee2000 force-pushed the csl/trymergechanges branch from c691d1d to 536f655 Compare January 17, 2023 17:15
required_checks = []
failed_rule_message = None
try:
find_matching_merge_rule(pr, repo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the code correctly, this would raise MandatoryChecksMissingError when there are pending jobs. So how are the 2 branches of failed_rule_message is not None run on line 1511?

I guess the question is about the exceptions raised by find_matching_merge_rule (also calls get_combined_checks_from_pr_and_land_validation inside) v.s. the exceptions raising here in merge. They looks duplicated. But then trymerge logic is complicated, so I wonder what I miss here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not entirely sure if i understand the first question, but some None checks are just in case.

For the exception raising duplication, let's say a pull job is pending, but a trunk job has failed. Then find_matching_merge_rule doesn't see the failing trunk job because it only checks the mandatory jobs (none of which include trunk at the moment). Then it raises MandatoryChecksMissingError which previously would get caught at the end of the loop, and we would wait 5 minutes and run the loop again. By catching the MandatoryChecksMissingError earlier, we can also check the non mandatory checks (trunk) and raise RuntimeError, which would cause the trymerge script to exit early.

get_combined_checks_from_pr_and_land_validation does get called twice but get_checkrun_conclusions does check for self.conclusions first, so at least its not rerunning all the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, so find_matching_merge_rule is for mandatory checks while merge will handle both mandatory and non-mandatory ones. The latter behavior is what I'm familiar with, so I wrongly thought that find_matching_merge_rule did both. This also answers my first question.

Copy link
Contributor

@huydhn huydhn Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion with @clee2000, we should consider just get rid of the obsolete distinction between mandatory and non-mandatory checks in trymerge (not as part of this PR though). So the logic could be as simple as:

  • Does the PR have the correct approval and labels?
    • Yes, continue
    • No, the merge process stops and clearly says the reason
  • Are pull and trunk run?
    • Yes, continue
    • No, the merge process stops and clearly says the reason
  • No job fails. All is good and the commit will be merged
  • Some jobs fail. Are they ignorable?
    • Yes, the commit will be merged
    • No, the merge process stops and clearly says the reason

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this is the most confusing chunk of our merge rules and would be great to simplify them like that

required_checks = []
failed_rule_message = None
try:
find_matching_merge_rule(pr, repo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this is the most confusing chunk of our merge rules and would be great to simplify them like that

@clee2000
Copy link
Contributor Author

@pytorchbot merge -f "lint passed, failures are disabled dynamo tests and some other windows thing that ive seen on some other prs"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@atalman
Copy link
Contributor

atalman commented Jan 23, 2023

cc @clee2000 I want to revert this PR: #92592 and get this error: Can't revert PR that was landed via phabricator as D42154677. Please revert by going to the internal diff and clicking Unland. Could these changes be related to this latest mergebot change ?

@clee2000
Copy link
Contributor Author

cc @clee2000 I want to revert this PR: #92592 and get this error: Can't revert PR that was landed via phabricator as D42154677. Please revert by going to the internal diff and clicking Unland. Could these changes be related to this latest mergebot change ?

that change was introduced in #91975

@github-actions github-actions bot deleted the csl/trymergechanges branch July 20, 2024 01:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants