Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Nim-matrix workflow to run on merge queue #693

Merged
merged 2 commits into from
Feb 6, 2024

Conversation

veaceslavdoina
Copy link
Contributor

@veaceslavdoina veaceslavdoina commented Feb 4, 2024

Context

In PR Build improvements #680, we raised the question about building Codex using different Nim versions, but only before the merge to not slow-down existing PR flow and to not waste resources on every push.

We can try to use GitHub merge queue to implement that.

PR

  • Add build-status job to CI workflow to check matrix execution status - will use it in branch rule
  • Add Nim-matrix workflow to run only on merge queue and only on Linux
  • Using concurrency for CI workflow to cancel previous run at PR next commits

Next steps

  • Are we ready for experiments ? 😄
  • Merge
  • Update branch rules with just build / status job
  • Enable merge queue in branch rules

-- Not ready? Just not enable merge queue and then run Nim-matrix workflow manually.

Known issues

  • There is no way to set different checks for PR and merge queue and this is why we use single build-status job name for both workflows
  • When merge queue checks fail we should press merge button again they will re-run them from scratch
  • Jumping to the top of a merge queue will cause a full rebuild of all in-progress pull requests
  • By running multiple additional jobs in Nim-matrix workflow we consume GitHub Org Total concurrent jobs limit
  • Nim-matrix workflow fail very often on BuildJet runners at tests - we use GitHub ones for now
  • By adding an additional step before the merge, we will delay merge up to 20 minutes
More details

Implementation

In the current merge queue implementation we can't set branch rules separately for PR and merge queue checks and this is very confusing. In that way, both workflows should be run on PR and merge queue and we would need to run steps conditionally and that adds some ambiguity.

This is why we workaround that by using a single build-status job name for both workflows and run them separately. In that way, PR and merge queue checks will pass at their own stages.

Basically, flow will be the following

PR

  1. PR is created
  2. CI workflow will be triggered
  3. At next push, p.2 will be repeated
  4. When CI workflow will be finished successfully and auto-merge is enabled, PR will be moved into the merge queue

Merge queue

  1. Nim-matrix workflow will be triggered
  2. Nim-matrix workflow will mark branch as protected (push is not permitted) and will
    • create a temporary branch and merge a current one with the master
    • run build over merged code
    • report build status
  3. If Nim-matrix workflow will fail we should press auto-merge button again or start from PR-p.3
  4. When Nim-matrix checks will pass, PR will be merged automatically

More information can be found in How merge queues work.

Speed

Merge queue will add an additional matrix of jobs to build over different Nim versions. Because they will run in parallel, max merge delay will be equal to the job max execution time.

Runner Duration, max Costs, job Costs, matrix
ubuntu-latest 19m 23s Free Free
buildjet-4vcpu-ubuntu-2204 15m 33s 0.128 $/job 0.64 $/run
buildjet-8vcpu-ubuntu-2204 13m 14s 0.224 $/job 1.12 $/run
buildjet-16vcpu-ubuntu-2204 13m 00s 0.416 $/job 2.08 $/run

Copy link
Member

@markspanbroek markspanbroek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @veaceslavdoina for figuring this out, looks like it wasn't trivial 😄

Is there something we can do to avoid the duplication between ci.yml and nim-matrix.yml? They both contain very similar steps, and I'm afraid that we'll forget to update one of them in the future.

@veaceslavdoina
Copy link
Contributor Author

veaceslavdoina commented Feb 5, 2024

@markspanbroek, it was a long story with lots of experiments 😄

Thank you for the idea. Now we use Reusable workflows for CI and Nim-matrix.

We probably should enable merge queue only after confirmation that builds with Nim v2 passes. We can trigger Nim-matrix workflow manually until that.

Also, jobs names were changed slightly and now include Nim version to have an unified pattern for PR and merge queue. Even without enabling merge queue, we should update branch rule checks with the new ones or use just build / status which include them all.

Screenshot 2024-02-06 at 08 06 53

Copy link
Member

@markspanbroek markspanbroek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for removing the duplication! It all looks good to me now.

@veaceslavdoina veaceslavdoina enabled auto-merge (squash) February 6, 2024 10:32
@veaceslavdoina veaceslavdoina merged commit 403b9ba into master Feb 6, 2024
10 checks passed
@veaceslavdoina veaceslavdoina deleted the feat/github-merge-queue branch February 6, 2024 10:56
@veaceslavdoina
Copy link
Contributor Author

Branch rule was changed and PR automatically merged

Screenshot 2024-02-06 at 12 56 20


env:
cache_nonce: 0 # Allows for easily busting actions/cache caches
nim_version: v1.6.14, v1.6.16, v1.6.18, v2.0.0, v2.0.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guys, why do we have 2.x in the compiler matrix? Is Codex supposed to work with 2.x?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we want to start moving to 2.x

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... but this will fail for now, since we haven't completed the move @veaceslavdoina

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the idea?

We probably should enable merge queue only after confirmation that builds with Nim v2 passes. We can trigger Nim-matrix workflow manually until that.

Merge queue is not enabled and Nim-matrix workflow is not used yet, may be just to test how it works. If we want to use it right now, we then should remove v2 from the list.

Should we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, well, if it isn't being used for now, then no harm in keeping it, but as soon as we do, it will break with 2.x and so it would need to be disabled until we port Codex over.

@veaceslavdoina veaceslavdoina mentioned this pull request Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants