Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only run one full sync on main at a time #4977

Closed
teor2345 opened this issue Aug 29, 2022 · 2 comments · Fixed by #4981
Closed

Only run one full sync on main at a time #4977

teor2345 opened this issue Aug 29, 2022 · 2 comments · Fixed by #4981
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles C-bug Category: This is a bug I-integration-fail Continuous integration fails, including build and test failures I-remote-node-overload Zebra can overload other nodes on the network I-slow Problems with performance or responsiveness

Comments

@teor2345
Copy link
Contributor

Motivation

There isn't much point in running multiple Mainnet full syncs on the main branch at the same time.

It slows down the Zcash network, costs the Foundation more money, and it doesn't provide much more diagnostic information.

Designs

We could skip or cancel full syncs if there is already one running on main.
(But still run all the dependent jobs, if possible.)

We don't want to just limit the concurrency of jobs running on main, because that could queue up days worth of jobs on main.

@teor2345 teor2345 added C-bug Category: This is a bug A-devops Area: Pipelines, CI/CD and Dockerfiles S-needs-triage Status: A bug report needs triage P-High 🔥 I-slow Problems with performance or responsiveness I-integration-fail Continuous integration fails, including build and test failures I-remote-node-overload Zebra can overload other nodes on the network labels Aug 29, 2022
@gustavovalverde
Copy link
Member

gustavovalverde commented Aug 29, 2022

What should be an expected scenario with 10 PRs waiting to be merged?

  • Running the sync with the first merge and cancelling the following ones
  • Waiting for a "merge-cycle" and run the sync with the last merge (every X PRs)
  • Allow to trigger it with specific merges (not all PRs/merges need a full sync)

@teor2345
Copy link
Contributor Author

teor2345 commented Aug 29, 2022

The goals are:

  • automatically do a full sync when the main branch has changed,
  • only have one full sync running on main at the same time, and
  • don't have a large queue of syncs waiting to run: the latest code is the most important to test.

It is ok to occasionally have two syncs running, if it turns out that is hard to avoid.

Also remember that the GitHub trigger is per-push, not per-PR. Mergify might have batched some of the PRs, there's no need to test every PR/commit in a batch push.

What should be an expected scenario with 10 PRs waiting to be merged?
* Running the sync with the first merge and cancelling the following ones
* Waiting for a "merge-cycle" and run the sync with the last merge (every X PRs)

The exact implementation doesn't really matter - either of those implementations would be ok.

But it seems like GitHub has exactly the implementation we want:
https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idconcurrency

If a concurrent job is running:

  • when a new job arrives, make it pending, and cancel any jobs that were previously pending
  • run the latest pending job when the original job finishes
  • Allow to trigger it with specific merges (not all PRs/merges need a full sync)

The check must be automated. When we've made checks manual in the past, people forget to do them.

gustavovalverde added a commit that referenced this issue Aug 29, 2022
Previous behavior:
Multiple Mainnet full syncs were able to run on the main branch at the
same time, and pushing multiple commits to the same branch would run
multiple CI workflows, when only the run from last commit was relevant

Expected behavior:
Ensure that only a single CI workflow runs at the same time in PRs.
The latest commit should cancel any previous running workflows from the
same PR.

Solution:
Use GitHub actions concurrency feature https://docs.github.com/en/actions/using-jobs/using-concurrency

Fixes #4977
Fixes #4857
@mergify mergify bot closed this as completed in #4981 Aug 30, 2022
mergify bot pushed a commit that referenced this issue Aug 30, 2022
* ci(concurrency)!: run a single CI workflow as required

Previous behavior:
Multiple Mainnet full syncs were able to run on the main branch at the
same time, and pushing multiple commits to the same branch would run
multiple CI workflows, when only the run from last commit was relevant

Expected behavior:
Ensure that only a single CI workflow runs at the same time in PRs.
The latest commit should cancel any previous running workflows from the
same PR.

Solution:
Use GitHub actions concurrency feature https://docs.github.com/en/actions/using-jobs/using-concurrency

Fixes #4977
Fixes #4857

* docs: typo

* ci(concurrency): do not cancel running full syncs

Co-authored-by: teor <teor@riseup.net>

* fix(concurrency): explain the behavior better & add new ones

Co-authored-by: teor <teor@riseup.net>
@ftm1000 ftm1000 removed the S-needs-triage Status: A bug report needs triage label Sep 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles C-bug Category: This is a bug I-integration-fail Continuous integration fails, including build and test failures I-remote-node-overload Zebra can overload other nodes on the network I-slow Problems with performance or responsiveness
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants