Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci(build): distill workflow #502

Open
innovaker opened this issue Dec 11, 2020 · 13 comments
Open

ci(build): distill workflow #502

innovaker opened this issue Dec 11, 2020 · 13 comments
Assignees
Labels
help wanted Extra attention is needed refactor rfc

Comments

@innovaker
Copy link
Contributor

innovaker commented Dec 11, 2020

I recently increased our ci test coverage to include all boards and shields because we had some blind spots which needed to be covered for verifying #401. However, GitHub Action's random ECONNRESET issue (during the archiving step) is subsequently hitting us harder than before and we're having to restart runs more often.

I've suggested we explore a new strategy:
a) PRs and merges trigger testing of a select (small) group of boards/shields as well as build-args/configs, targeting broad feature coverage.
b) Nightly builds for all boards/shields as a safety net.

I'm happy to implement this change, but I don't have sufficient insight to design the specifics of (a). We need from all the contributors.

Please use this issue to discuss the approach and suggest ideas, hopefully leading to a design for (a).

@innovaker innovaker added help wanted Extra attention is needed rfc refactor labels Dec 11, 2020
@okke-formsma
Copy link
Collaborator

I would like to propose the following:

  • build all boards for a small set of shields (proton-c and nice nano for example)
  • build all shields for a small set of boards (kyria and qaz for example)

This would cover the most used combinations.

This would require us to define multiple jobs, as there is only one matrix definition per job possible.

@Nicell
Copy link
Member

Nicell commented Dec 11, 2020

  • build all boards for a small set of shields (proton-c and nice nano for example)
  • build all shields for a small set of boards (kyria and qaz for example)

This seems like a good start. I think we can also keep the current include list since that's quite short. I'd like to see #500 be done for underglow and other future features.

The only potential issues I see for this small coverage case is when a new shield is submitted that works on the nice!nano, but it breaks on other boards due to missing a board specific overlay or enabling a feature that isn't supported. These shields should be able to work without a board specific overlay and shouldn't have bad config options enabled, but it will be harder to catch that error without testing a few boards (or all). See this commit as an example: a50fdec

It would be neat if we could have a way of building a shield on every supported board for PRs, but I'm thinking GitHub actions may be a bit too limited to do this. This page has some information on labels be used to trigger certain builds, but I'm not sure if we can work with this: https://github.community/t/do-something-if-a-particular-label-is-set/17149

@innovaker
Copy link
Contributor Author

innovaker commented Dec 11, 2020

There's also the option of writing a dedicated script (i.e. Python) which generates a tailored build matrix. Would that be more useful @Nicell?

See: https://github.blog/changelog/2020-04-15-github-actions-new-workflow-features/

@mcrosson
Copy link
Contributor

I'll add a 👍 to the build shield + all boards option ; as I worked on the tidbit bring up I've tripped over some proton_c differences compared to the nice!nano. Nothing too complicated to fix but definitely not caught by local testing as I don't have a proton_c on-hand.

@Nicell
Copy link
Member

Nicell commented Dec 12, 2020

I think we could do some on magic in the workflow file to only build every shield/board combo when the boards directory has been adjusted. Otherwise for other changes, only build the small matrix as outlined above.

It's not perfect, as ideally it'd only build the changed boards/shields, but that type of logic would probably require scripting, which I'm not sure if we need on a first pass of updating this system.

Thoughts @innovaker?

@innovaker
Copy link
Contributor Author

Composition/reuse within GitHub workflows will become easier over time as the GitHub development team is working on it. Moreover, I'm fairly sure that the modular approach will soon be considered the best approach for shields. So for those reasons, I'm not bothered by the repetition of workflow code for the short-term as long as we're mindful of it.

@Nicell, yes, I was also thinking of us leveraging on and I'm keen to fully understand the critical paths. If we're relegating our current blanket approach, I'd like to consider our tactical options before settling on another (albeit cut-down) blanket approach. Given that the current matrix is mostly promicro-orientated shields, what benefits do we get running every shield on every push with the default settings? Can/should we be more selective? I'd appreciate your insight on that please to save me studying every shield in detail.

@innovaker
Copy link
Contributor Author

Composition/reuse within GitHub workflows will become easier over time as the GitHub development team is working on it. Moreover, I'm fairly sure that the modular approach will soon be considered the best approach for shields. So for those reasons, I'm not bothered by the repetition of workflow code for the short-term as long as we're mindful of it.

Adding to my previous comment (above): in parallel to this issue, I'm intending to prototype wrapping up the core of the build job (west commands and variables) with a composite action because it also has strong potential for modules. Hopefully the checkout and archiving steps too once uses is also supported. This should minimize any code reuse/duplication issues which frees our hands somewhat.

@petejohanson
Copy link
Contributor

So, for a minimum basic "coverage", I would see.

  • One complete/onboard nRF52 based board (do we have any in tree yet?)
  • One complete/onboard stm32 based board.
  • One split w/ display and encoder enabled (kyria? Lily?) for at least one pro-micro compat board (e.g. nrfmicr_11/nice_nano)
  • One non-split w/ display and encoder enabled (Romac+?) with both nRF52 (nice_nano) and stm32 (proton_c) based pro-micro compat boards.

That seems like our baseline "cover most edge cases" set.

@mcrosson
Copy link
Contributor

So, for a minimum basic "coverage", I would see.

* One non-split w/ display and encoder enabled (Romac+?) with both nRF52 (`nice_nano`) and stm32 (`proton_c`) based pro-micro compat boards.

If / when the Tidbit is merged I am willing to volunteer to keep tabs on build failures. Tidbit is similar to the Romac+ but with more features/flexibility present making it a little more complex of a test case day to day.

@petejohanson
Copy link
Contributor

Once we have tidbit in, we could move to that. Until then I would say:

  • dz60rgb_rev1
  • kyria_left/kyria_rightwith-DCONFIG_ZMK_DISPLAY=y -DCONFIG_EC11=y -DCONFIG_EC11_TRIGGER_GLOBAL_THREAD=y -DCONFIG_ZMK_RGB_UNDERGLOW=y -DCONFIG_WS2812_STRIP=yand built againstnice_nano`
  • romac_plus with -DCONFIG_EC11=y -DCONFIG_EC11_TRIGGER_GLOBAL_THREAD=y with proton_c and nrfmicro_13

@innovaker
Copy link
Contributor Author

@Nicell has kindly offered to take on the implementation of this issue. Thanks!

@innovaker
Copy link
Contributor Author

I'm tracking this issue related to the ECONNRESET: actions/upload-artifact#116

okke-formsma added a commit to okke-formsma/zmk that referenced this issue Dec 21, 2020
Build steps are often failing with "ECONNRESET" errors due to
rate-limiting by github. We usually don't need/care about the
artifacts, so don't have to fail the build when the upload fails.

related to zmkfirmware#502
okke-formsma added a commit to okke-formsma/zmk that referenced this issue Dec 21, 2020
Build steps are often failing with "ECONNRESET" errors due to
rate-limiting by github. We usually don't need/care about the
artifacts, so don't have to fail the build when the upload fails.

related to zmkfirmware#502 and #actions/upload-artifact/issues/116
petejohanson pushed a commit that referenced this issue Dec 21, 2020
Build steps are often failing with "ECONNRESET" errors due to
rate-limiting by github. We usually don't need/care about the
artifacts, so don't have to fail the build when the upload fails.

related to #502 and #actions/upload-artifact/issues/116
@caksoylar
Copy link
Contributor

Related: #889 (not sure if this is addressed by it sufficiently?)

tyalie pushed a commit to tyalie/zmk that referenced this issue Nov 15, 2022
Build steps are often failing with "ECONNRESET" errors due to
rate-limiting by github. We usually don't need/care about the
artifacts, so don't have to fail the build when the upload fails.

related to zmkfirmware#502 and #actions/upload-artifact/issues/116
yuanbin pushed a commit to yuanbin/zmk that referenced this issue Jun 14, 2023
Build steps are often failing with "ECONNRESET" errors due to
rate-limiting by github. We usually don't need/care about the
artifacts, so don't have to fail the build when the upload fails.

related to zmkfirmware#502 and #actions/upload-artifact/issues/116
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed refactor rfc
Projects
None yet
Development

No branches or pull requests

6 participants