Proprietary toolchain support sustainability #63738

fabiobaltieri · 2023-10-10T09:35:45Z

Introduction

Problem description

Hi, I recently bumped into a few compiler quirks:

- the first priority check implementation broke ARCMWDT due to differences in section naming
- the second implementation broke ARMClang due to missing symbols
- the troubleshooting of the second implementation was slowed down because ARMClang support was already broken
- a recent build time optimization added a warning in clang builds for pretty much anything but native_posix, which turned out to be an LLVM bug
- we are now blocking a Kconfig option removal for unclear compiler limitations (may not be the case, but it adds to the point)

None of the above were detectable in CI. On paper the project supports a handful of different compilers, but in practice the only ones tested in CI are:

- Zephyr SDK at the current stable version
- Clang/LLVM at the version in the CI image, but really only builds for native_posix

Any breakage that does not hit any of these is going to go undetected and has to get caught manually by a reviewer or post-submit, potentially post-release, which then means a regression and a potential backport.

More importantly, many of those toolchains are proprietary and most developers have no access to them, so one has to rely on someone else to test their code. At the same time there are no toolchain maintainers or a known list of points of contact for validating changes against a specific toolchain.

Proposed change

- Supported proprietary toolchains should have an area in the MAINTAINERS file listing users who can test potentially breaking changes. Ideally there should be a way for Zephyr contributors to access proprietary toolchains for troubleshooting, or the support team should help by running the changes on the toolchain and providing build artifacts, or by contributing fixes and workarounds directly.
- Supported open source toolchains should have some level of testing in CI (Clang/LLVM beyond native_posix is what I'm thinking).
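
To make the first point a bit more concrete, here is a minimal Python sketch of how a per-toolchain contact list could be queried once such areas exist. The area names, the handles, and the `contacts_for` helper are all hypothetical; the dictionary is only an in-memory stand-in and does not reflect the real layout of the MAINTAINERS file.

```python
# Hypothetical in-memory stand-in for per-toolchain areas in a maintainers
# file. Area names and handles are made up for illustration.
TOOLCHAIN_AREAS = {
    "Toolchain A": {"maintainers": ["alice"], "collaborators": ["bob"]},
    "Toolchain B": {"maintainers": [], "collaborators": ["carol", "carol"]},
}


def contacts_for(area, areas=TOOLCHAIN_AREAS):
    """Return the people to ping for a toolchain area, maintainers first."""
    entry = areas.get(area)
    if entry is None:
        # Unknown toolchain: nobody is listed, which is the problem
        # described above.
        return []
    seen = set()
    ordered = []
    for handle in entry.get("maintainers", []) + entry.get("collaborators", []):
        if handle not in seen:  # drop duplicates, keep first occurrence
            seen.add(handle)
            ordered.append(handle)
    return ordered
```

An empty result for a supported toolchain would be the signal that nobody is on record to validate a change against it.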

fabiobaltieri · 2023-10-10T09:38:27Z

cc @cfriedt @evgeniy-paltsev @galak @gmarull @kartben @keith-zephyr @kokas-a @nashif @tejlmand

cfriedt · 2023-10-10T11:38:50Z

At one point, we spoke about reporting test results from trusted test runners outside of Zephyr CI such as SoC vendors like ST or NXP.

The toolchain spec should be an integral part of that. Possibly an enumeration. Custom LLVM and custom GCC should also be a thing. We have for example CROSS_COMPILE, but it would be great to capture more metadata than that for reporting purposes.

Like external modules, external toolchains would need to run a subset of tests (often proprietary toolchains only support a limited number of platforms).
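
As a rough illustration of the two ideas above (a toolchain spec captured as an enumeration plus metadata, and a reduced test set for toolchains that only support some platforms), here is a minimal Python sketch. Every name in it (`ToolchainVariant`, `ToolchainSpec`, `select_tests`, `report_header`, and the platform and test strings used with them) is hypothetical and not part of Zephyr or twister.

```python
from dataclasses import dataclass, field
from enum import Enum


class ToolchainVariant(Enum):
    # Hypothetical enumeration; the string values are illustrative labels.
    ZEPHYR_SDK = "zephyr"
    LLVM = "llvm"
    ARCMWDT = "arcmwdt"
    ARMCLANG = "armclang"
    CROSS_COMPILE = "cross-compile"


@dataclass(frozen=True)
class ToolchainSpec:
    variant: ToolchainVariant
    version: str
    # Platforms the external runner can actually build for (assumed field).
    platforms: frozenset = field(default_factory=frozenset)


def select_tests(spec, tests):
    """Keep only the (test, platform) pairs the toolchain supports."""
    return [(name, plat) for name, plat in tests if plat in spec.platforms]


def report_header(spec):
    """Metadata to attach to an externally reported test result."""
    return {"toolchain": spec.variant.value, "version": spec.version}
```

The point of the explicit record is that a report from an external runner carries more than what `CROSS_COMPILE` alone would tell you.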

Maybe we could reuse some of that machinery

fabiobaltieri · 2023-10-10T13:17:45Z

At one point, we spoke about reporting test results from trusted test runners outside of Zephyr CI such as SoC vendors like ST or NXP.

Would it make sense to investigate the possibility of having some custom runners with those toolchains available running a limited set of core tests? Something on the order of magnitude of the current Clang/LLVM one.

tejlmand · 2023-10-10T13:20:28Z

Would it make sense to investigate the possibility of having some custom runners with those toolchains available running a limited set of core tests? Something on the order of magnitude of the current Clang/LLVM one.

It makes a lot of sense, and has been discussed in the Toolchain WG, but I guess it mostly has to do with priorities.
So if anyone has the skills and bandwidth to carry out such work, it will be highly appreciated.

Fixes: zephyrproject-rtos#63738 Create dedicated entries for ARC MWDT, arm compiler 6, and one Api toolchains. This helps contributors to identify whom to contact in case of issues related to those toolchains. The Zephyr SDK, cross-compile, and other GCC based compilers are covered as part of the general cmake/toolchain,compiler,linker,bintools entry in the MAINTAINERS file. Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

tejlmand · 2023-10-10T13:30:37Z

please find the following proposal as a starting point: #63759

Hopefully this will provide more visible information on whom to contact in case of issues related to those toolchains.
Not the final solution, but a step in a better direction.


abrodkin · 2023-10-10T20:16:57Z

I do support that discussion here as we (Synopsys ARC maintainers) and our customers are suffering from exactly that problem: even though we do quite extensive in-house testing with our proprietary tools, simulators and boards, we only do it post-merge upstream. In that sense we'd prefer to enable 3rd-party developers to see problems which are introduced by their changes before merging these changes in the upstream main tree.

But then we start seeing logistical problems:

- From one point of view it's hard to make proprietary tools, and even more so HW boards, freely available to a wide audience. There are usually legal and availability/scalability problems.
- From another point of view all the above difficulties could be solved one way or another, but that requires some work to be done. For example, coming up with an extensible Zephyr CI pipeline architecture so that external (vendor managed) runners could be involved. And on top of the initial implementation we'll get problems with stability (i.e. we'll need to troubleshoot things in those external runners) and, again, scalability (even if we may allocate 10 boards, it will be nowhere close to the required bandwidth).

That said I fully agree with @tejlmand here: if there's somebody who's willing to bite that bullet and work on that, we'll provide all the support with reviewing, testing etc.


fabiobaltieri · 2023-10-11T09:50:09Z

@tejlmand @abrodkin do you know if the licensing model of those toolchains would even allow usage in CI for a public project? I imagine we could not just make a public docker image with those, but if someone were to put in the work to set it up, would it be possible licensing wise to have custom runners with proprietary compilers installed?

I'm not thinking board testing or anything particularly extensive, just something to build a few core samples to test the core stuff, at least as a start, but if the licensing does not allow that then it's not even worth discussing it. :-)

tejlmand · 2023-10-11T10:14:40Z

@tejlmand @abrodkin do you know if the licensing model of those toolchains would even allow usage in CI for a public project?

Past talks with ARM regarding Arm compiler 6 licenses have been positive, so it should be possible, but we haven't gotten to the point where we have a concrete way of doing the implementation with CI.
I would say we're lacking that step before having a final 👍 / 👎 to the solution.

cfriedt · 2023-10-11T12:04:09Z

@tejlmand
GitHub also fully supports Remote Test Runners as shown in a recent Tech Talk by @kartben featuring @szczys from Golioth.
https://www.youtube.com/live/940O1CUgh4Q?si=zJlRCSqFqO73tHla

https://blog.golioth.io/golioth-hil-testing-part1/

Technically speaking, Synopsys (or another partner) could easily instantiate the existing public Zephyr container image (on-premises for hardware-in-the-loop testing or in a private cloud for emulation), add a volume on top of that for the proprietary toolchains, add a secret for a toolchain license key, and lastly add another secret for AWS storage (or elsewhere) to provide test reports.

Very doable. My only concerns are the volume of traffic and of course security (it would be best to host the runner in an isolated environment).

fabiobaltieri · 2023-10-11T14:06:07Z

Very doable. My only concerns are the volume of traffic and of course security (it would be best to host the runner in an isolated environment).

Yeah that would be the main concern, there's some documentation here about that: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#self-hosted-runner-security, it's not trivial to set up properly but it can certainly be done.


abrodkin · 2023-10-11T23:01:06Z

From a discussion with @stephanosio a couple of months ago I understood that the idea of using a private/internal CI image in the main Zephyr CI was rejected by the TSC in the past. That's because it limits users' ability to easily reproduce the issues seen in the CI locally (i.e. users won't be able to simply download the CI docker image and run the tests locally).

And indeed, that's a good question: how to react to failures reproduced by vendor runners?

I mean, what if your PR triggers some issue in a vendor CI. At best you see a build issue and from compiler error message may get enough information for the fix. But what if a test fails during execution: how are you going to debug it? I see only 2 options:

- Get creative and re-spin your PR until the problem is gone.
- Get hold of the vendor's maintainers and "co-debug" with the person.

If we're OK with such limitations, let's try to prototype some solution for "external" vendor Zephyr CI runners.

Also, speaking of Synopsys runner I'd say we don't need to have on-premises runners which will build tests and run on SDK's QEMU or on our proprietary nSIM simulator. Instead, the same cloud infrastructure which is used by the main Zephyr CI will do for us. I mean the same cloud provider like AWS or whatever is used now.


fabiobaltieri · 2023-10-12T10:22:18Z

Yeah ideally the toolchain would be generally available, but having it in CI only seems like a better compromise than not testing at all.

Instead, the same cloud infrastructure which is used by the main Zephyr CI will do for us. I mean the same cloud provider like AWS or whatever is used now.

Sounds ideal, private runners seem like a bit of a logistical nightmare. I've no idea how the docker setup works though, @cfriedt could you expand a bit on that?

I'm also wondering if there's some other open source project on GitHub running such a setup that we could look into.

stephanosio · 2023-10-16T11:24:57Z

As @abrodkin mentioned, the problem has more to do with the project policy (i.e. to ensure that any problems in the Zephyr main CI are locally reproducible) than technical hurdles.

In fact, I have already tested the "private" CI Docker image idea for supporting ARM FVP in our CI in #45099; but, it was rejected for the aforementioned reason.

Yeah ideally the toolchain would be generally available, but having it in CI only seems like a better compromise than not testing at all.

@fabiobaltieri The compromise suggested by the TSC at the time was to create standalone workflows (separate from the Zephyr main CI twister workflow) that run with proprietary tools; but, this comes with some logistical issues (see #45099 (comment)).

By the way, in terms of the CI infrastructure, we already have a very flexible Kubernetes-based self-hosted runner deployment where we can use private Docker images for a specific set of runners or even mount an NFS volume containing proprietary tools as needed. With the current infrastructure, we do not need externally hosted runners, which would truly be a logistical nightmare.

Create dedicated entries for ARC MWDT, arm compiler 6, and one Api toolchains. This helps contributors to identify whom to contact in case of issues related to those toolchains. The Zephyr SDK, cross-compile, and other GCC based compilers are covered as part of the general cmake/toolchain,compiler,linker,bintools entry in the MAINTAINERS file. (cherry picked from commit 2f79f5e) Original-Fixes: zephyrproject-rtos#63738 Original-Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no> GitOrigin-RevId: 2f79f5e Change-Id: Ic01f3f9411ac143e1faae10e19c13ebe60f818a4 Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/zephyr/+/4933966 Tested-by: Keith Short <keithshort@chromium.org> Tested-by: ChromeOS Prod (Robot) <chromeos-ci-prod@chromeos-bot.iam.gserviceaccount.com> Reviewed-by: Keith Short <keithshort@chromium.org> Commit-Queue: Keith Short <keithshort@chromium.org>


marc-hb · 2024-02-29T05:05:00Z

I mean, what if your PR triggers some issue in a vendor CI. At best you see a build issue and from compiler error message may get enough information for the fix. But what if a test fails during execution: how are you going to debug it?

Good point. Also, trying to somewhat "centralize" the CI of multiple, proprietary toolchains may not be very "scalable"[1]

FWIW SOF CI (tries to) test the Zephyr main branch daily. The feedback loop is obviously slower than testing PRs before merge but it's not too bad and has exposed many toolchain issues fairly quickly. Post-merge but very soon after merge.

All in all this has so far proved to be a decent trade-off and this obviously scales to any number of proprietary tools because there is zero centralized involvement. The lack of permissions management and other administrative tasks like licenses is especially nice (I saw scary keywords like "security" and "logistical" lurking above already...)

Testing daily means the workload is constant, which in turn means more tests can be run for longer. Testing daily and pre-merge are of course not mutually exclusive; but trying to test every PR with some proprietary toolchain would be putting the cart before the horse if that toolchain does not already have a good track record, with some independent automation able to test the Zephyr main branch regularly and engineers actually looking at failures and trying to fix them. In my experience, the second most common CI sin is trying to automate something that isn't maintained much in the first place [2].

My 2 SOF cents.

[1] An approach where some sort of "publisher" workflow notifies a list of distributed and independent CIs would help.
[2] The most common CI sin is regularly testing configurations that developers don't use.
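
For what footnote [1] could look like in its simplest form, here is a small Python sketch of a "publisher" that fans a new-commit event out to a list of independent CIs. It is only an illustration under assumptions: `Subscriber`, `publish`, and the event fields are hypothetical, and the `notify` callables stand in for whatever transport (webhook, mail, message queue) each CI would actually use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Subscriber:
    name: str                        # hypothetical label for an independent CI
    notify: Callable[[dict], None]   # stand-in for the real transport


def publish(commit_sha, subscribers):
    """Notify every subscriber about a new commit on main.

    A subscriber that fails must not block the others, so failures are
    collected instead of raised.
    """
    event = {"ref": "main", "sha": commit_sha}
    delivered = []
    failed = []
    for sub in subscribers:
        try:
            sub.notify(dict(event))  # pass a copy so nobody mutates the event
            delivered.append(sub.name)
        except Exception:
            failed.append(sub.name)
    return delivered, failed
```

The publisher knows nothing about licenses, runners, or toolchains; each subscriber stays fully in charge of its own setup, which is the "zero centralized involvement" property described above.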

fabiobaltieri · 2024-02-29T11:37:16Z

Good point. Also, trying to somewhat "centralize" the CI of multiple, proprietary toolchains may not be very "scalable"[1]

I would be more interested in a plan to make such toolchains available to any Zephyr developer as needed somehow.

marc-hb · 2024-02-29T15:58:51Z

I would be more interested in a plan to make such toolchains available to any Zephyr developer as needed somehow.

Anyone can change sof/west.yml today, point it at their (unmerged) Zephyr commit and submit it to SOF CI.

I agree a more direct method would be extremely useful. It's also more complicated to set up.

stephanosio · 2024-03-02T01:57:00Z

I would be more interested in a plan to make such toolchains available to any Zephyr developer as needed somehow.

The problem is more on the legal (licensing) side than the technical side -- obviously, you cannot just let anyone use a proprietary toolchain that has been licensed to the Zephyr Project without getting permission from the company that licensed it.

On the technical side, it is actually quite simple -- just provide a one-time SSH access to an ephemeral runner (mxschmitt/action-tmate) with the proprietary toolchain.

marc-hb · 2024-03-02T18:21:46Z

You mean it's technically easy to provide access to a proprietary toolchain if it is... not proprietary? :-)

cfriedt · 2024-03-02T20:35:45Z

Sounds ideal, private runners seem like a bit of a logistical nightmare. I've no idea how the docker setup works though, @cfriedt could you expand a bit on that?

I remember using private runners via GitLab a long time ago. It was surprisingly easy. Not entirely sure how it's done with GitHub, but I did link to a video about it up above.

TL;DR - private runners get authenticated with a key and communicate via a server API as if they were regular runners.

fabiobaltieri added the RFC Request For Comments: want input from the community label Oct 10, 2023

fabiobaltieri changed the title ~~Proprietary toolchain compatibility testing~~ Proprietary toolchain support sustainability Oct 10, 2023

henrikbrixandersen added the area: Toolchains Toolchains label Oct 10, 2023

zephyrbot assigned tejlmand Oct 10, 2023

gmarull mentioned this issue Oct 10, 2023

dts: drop HAS_DTS #63696

Merged

tejlmand mentioned this issue Oct 10, 2023

MAINTAINERS: Create dedicated entries for 3rd party toolchains #63759

Merged

jhedberg closed this as completed in #63759 Oct 12, 2023

fabiobaltieri reopened this Oct 12, 2023

stephanosio self-assigned this Oct 16, 2023

henrikbrixandersen mentioned this issue Oct 28, 2023

Portable C output generation #64522

Open

marc-hb mentioned this issue Apr 15, 2024

toolchain: gcc: Simplify GEN_ABSOLUTE_SYM and GEN_ABSOLUTE_SYM_KCONFIG #70999

Merged
