Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mainline QEMU #1971

Merged
merged 3 commits into from Jul 10, 2020
Merged

Use mainline QEMU #1971

merged 3 commits into from Jul 10, 2020

Conversation

alistair23
Copy link
Contributor

@alistair23 alistair23 commented Jun 22, 2020

Pull Request Overview

Instead of cloning a fork QEMU let's instead clone mainline.

This also applies two patches from #1938 but updates them to refer to mainline QEMU instead of the old fork.

Testing Strategy

Running the CI.

TODO or Help Wanted

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make prepush.

Copy link
Contributor

@bradjc bradjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Git submodules are incredibly un-user-friendly. We can't just hardcode a commit hash in our scripts somehow?

@alistair23
Copy link
Contributor Author

I agree, submodules are awful

@gendx suggested them here: #1865 and I thought that was the general agreement.

Although submodulres are a pain, I don't think hard coding a SHA in a script is any better. In this case I find submodules the lesser of two evils as at least submodules are more standard and easier for a new person to figure out (instead of grepping for SHAs).

ppannuto
ppannuto previously approved these changes Jun 22, 2020
Copy link
Member

@ppannuto ppannuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While submodules do have an awful UX, the build scripts do a good job of hiding them here. I've validated that this all works correctly on my machine.

@bradjc
Copy link
Contributor

bradjc commented Jun 22, 2020

I'm not sure I understand why we need this (presumably short-term?) fix. I don't think submodules are appropriate here (Tock development has nothing to do with QEMU development). If we need a specific version that is fine: we can specify the version in a readme or makefile (we do this with rustup, rust, OT, and openocd, at least). But adding a submodule is a rather big hammer solution to a bit of CI that, as far as I know, is still in the "see how it works" phase.

@alistair23
Copy link
Contributor Author

Something worth considering is that since we have had even limited QEMU support multiple people have asked about it on the Slack channels and GitHub issues. There is interest in running Tock on QEMU and by making it a submodule we simplify the process of running it. QEMU allows people to easily try Tock without having the hardware (for OT it's very expensive) and also simplifies development and debugging, especially for new users who aren't used to embedded..

If you don't want it to be a submodule we can just clone it as part of the CI and tell users to go and build it themselves. I'm not going to hardcode a SHA in a script for a long term solution as that just seems prone to bit rotting and confusing people.

Let me know if that's the final decision and I'll write up some steps.

@bradjc
Copy link
Contributor

bradjc commented Jun 22, 2020

In the context of QEMU being a board, we hardcode versions of boards in READMEs (hifive1 and opentitan), so I guess I don't see an issue with doing that for QEMU too.

As for supporting new users, I really don't think submodules are ever the answer. I think step-by-step instructions are preferable.

@alistair23 alistair23 changed the title gitmodules: Add mainline QEMU as a submodule Use mainline QEMU Jun 22, 2020
@alistair23
Copy link
Contributor Author

Ok, I borrowed @ppannuto's patches from: #1938 (I hope that's ok) and updated them to include steps for building.

No more submodules

@gendx
Copy link
Contributor

gendx commented Jun 23, 2020

Git submodules are incredibly un-user-friendly.

It would be nice to have some more specific explanation for this statement. Although they are not trivial (but neither is version control in general), the documentation is quite clear https://git-scm.com/book/en/v2/Git-Tools-Submodules.

In particular, if one doesn't use the submodule (git submodule init), the folder is not populated.

As a couple examples on related projects:

So outright rejecting submodules because they are supposedly unfriendly without any specifics about what's unfriendly seems wrong to me.

We can't just hardcode a commit hash in our scripts somehow?

In the end, that's what a submodule is under the hood. The main difference I see is that a submodule is more integrated, e.g. you can click directly through the submodule in GitHub's UI to visit the underlying repository.


All in all, I'm not in favor of one or the other. But cloning an external repository at its current HEAD without pinning any commit seems risky to me (as I explained it in #1865 (comment)).

@ppannuto
Copy link
Member

I guess I don't see what is so bad about a submodule in this case, it seems like exactly the right tool for the job?

Right now, the proposed build script is strictly worse than a submodule. It pulls the head of the qemu repo, and we're in the same problem that we are with the riscv brew tap -- different qemu instances depending on when users run this. We could of course fix this by updating the build script to check out a specific version for the qemu repo, but then we are literally recreating exactly what submodules do.

We could also simply download a tarball of the desired qemu commit instead of doing anything with git, and that works fine for the first install case, but it makes updating reliably much harder, as now the make system would have to track which qemu version is installed versus what is desired, and must download a new image if our pinned qemu version ever needs to change, at which point we are recreating even more submodule functionality.

Yes, the UX of submodules is criminally bad, but in this case it's all hidden behind make rules anyway. It has no impact on users that don't run any of the CI + QEMU rules locally and for those that do, they never directly interact with the submodule anyway.

Makefile Outdated Show resolved Hide resolved
@bradjc
Copy link
Contributor

bradjc commented Jun 23, 2020

git clone && git checkout <sha> seems totally fine to me.


A different comment: what about making a tock-testing (or something) repo for experimenting with new CI and testing for Tock? I'm imagining that tool would be run on the tock kernel periodically. I think this would have three benefits:

  1. Testing development could iterate much more quickly, without all of the overhead of opening PRs on the main repo.
  2. CI issues wouldn't disrupt PRs to tock/tock.
  3. It would be the place to explore new benchmarks or cloud-hosted on-device testing.

Downsides that I see:

  1. We would lose some CI testing on each PR. But, I think more testing periodically (say once per day) is worth it.

@ppannuto
Copy link
Member

In this case, the re-implemented submodule tracking is basically already there. @alistair23, if you reproduce the sha-checking that it was doing when it looked at your fork for setup and then add the checkout of a specific commit, I think we'll be at a place where this can go through.

@alistair23
Copy link
Contributor Author

I'm not going to hard code a SHA. I have worked on projects that do that and it's incredibly unfriendly for new users. It's extremely difficult to figure out where the SHA comes from if you don't know the code base. What if your change causes a failure so you are trying to update QEMU but it keeps getting overwritten by the CI scripts?

Plus then it's pretty much the exact same as a sub-module, I don't see why it would be any better. It's just using a custom implementation instead of a standard one.

bradjc
bradjc previously approved these changes Jun 25, 2020
@bradjc
Copy link
Contributor

bradjc commented Jun 26, 2020

Can we disentangle the two use cases? CI infrastructure only needs to support the CI use case, and the CI server needs to know exactly what version to use.

For users we can explain how to get a compatible version of qemu, like we do with so many other tools.

@alistair23
Copy link
Contributor Author

The problem is they are kind of the same use case.

If I make a change and it fails on the CI I want to reproduce the CI locally. If I'm trying to debug why something fails in the QEMU CI then obfuscating the Git SHA makes it harder to figure out what exactly is being tested.

@bradjc
Copy link
Contributor

bradjc commented Jun 26, 2020

I don't have an expectation that one should have to or even be able to reproduce CI locally. What we really want is to run tests on actual boards via CI, and there is no way that all developers would have access to all boards. So I don't see anything wrong with not being able to replicate CI locally in the general sense.

It sounds like we all agree that hardcoding a SHA is not user friendly, either via submodules or makefiles. So why are we so determined to do it? Is there a third option? Maybe QEMU, like risc-v on libtock-c, just isn't there yet.

@alistair23
Copy link
Contributor Author

Running the CI locally is part of the CI policy (3 paragraphs down https://github.com/tock/tock/blob/master/doc/CodeReview.md#3-continuous-integration) and something that I also think makes sense. I agree it shouldn't be a hard requirement, but it's a good goal.

The current option doesn't hard code a SHA, we instead just use master QEMU. That seems like a third option

I know it's easy to say that automated testing is hard so we shouldn't do it yet, but it's very useful. Tock on OpenTitan has been broken for the last few weeks because of a PMP change (made by me) and it was never caught. Even with me manually testing Tock on the board.

Having a CI (I'm working on running apps in the CI as well) saves so much time in debugging regressions and means that Tock will overall be much more stable.

@gendx
Copy link
Contributor

gendx commented Jun 26, 2020

It sounds like we all agree that hardcoding a SHA is not user friendly, either via submodules or makefiles. So why are we so determined to do it? Is there a third option? Maybe QEMU, like risc-v on libtock-c, just isn't there yet.

If the SHA itself is the unfriendly part, we can point to a tag instead, e.g. https://github.com/qemu/qemu/releases/tag/v5.0.0. We can then either point to the corresponding commit via git, or clone a .zip archive from GitHub. In the end is not much different from hardcoding a SHA, except it's a more "user friendly" version name.

But any kind of versioning of a dependency will ultimately come back to a variant of submodules or hardcoding a tag or commit in some script/Makefile, so I don't really understand all the push-back. And again, I still didn't see any specific explanation as to what is unfriendly in submodules.

@alistair23
Copy link
Contributor Author

If hard coding a SHA is the only way this will be accepted then I can do that. Although I really don't like it and think it makes new contributions hard (if their change affects the CI) it is more user friendly to have a more thorough CI.

@alistair23
Copy link
Contributor Author

@gendx Thanks for the idea. I agree with you that pointing to a release is basically the same thing in the end. The part I want to avoid is that a user thinks they are testing x when in fact the CI is going behind their back and testing y.

Submodules fix this by using a standard (although clunky) implementation. Downloading/cloning source from scripts should be avoided where possible.

@bradjc
Copy link
Contributor

bradjc commented Jun 26, 2020

I want to see more CI happen, and I don't think it should be subject to so much review. I know I sound absurd saying that, but I would like to see a tock-[ci|testing|?] repo that can iterate quickly and try out all sorts of continuous integration and testing in a separate work flow from the kernel development. Since that would be highly related to tock development, to me that would make sense as a submodule, and a periodic PR would be needed just to update the submodule sha and therefore update the CI. It is entirely possible, however, that this general approach wouldn't work in the github actions workflow, and in that case I will give up this idea.

While risking a slippery slope argument: if latest QEMU doesn't always work for us right now, is it reasonable to expect that the same QEMU commit will work for all of our QEMU needs going forward? What if one QEMU commit works for one platform but breaks another? This might be more of a problem as #1827 progresses and ARM platforms are involved. Similarly, it might be nice to allow a board author to specify a version of QEMU that they have tested and know works on their board, so even if they can't work on Tock for a while other users will have a known working version.

@alistair23
Copy link
Contributor Author

Splitting the CI into a different repo defeats the purpose of CI. If it doesn't run on every commit then why bother?

I don't think the CI really slows anything down and if running tests in a CI is slowing down development then something else is probably wrong with the development flow.

Latest QEMU should always work, in saying that there could always be regressions and bugs.

There should never be a case where different boards are on different QEMU commits. Some boards will need newer versions of QEMU (OpenTitan for example) but there should never be conflicting versions.

As for known working commits, that is where a submodule would be useful. It would be a known working version that would only be bumped after all QEMU supported boards are tested.

@bradjc
Copy link
Contributor

bradjc commented Jun 27, 2020

A stable version of the CI would run on every commit. It's not that the CI slows down development, it's that development and making sure that PRs don't incorrectly get a red X on them slows down developing CI.

@alistair23
Copy link
Contributor Author

Ah, I see. You mean have the CI logic in a separate repo but still include it in the main Tock repo. That seems difficult to do as usually they end up being tightly integrated (for example adding a new feature means extending the CI to test that feature).

@ppannuto
Copy link
Member

At the risk of talking in circles, I really feel like submodules are the right solution to this problem.

I am staunchly opposed to something non-deterministic, i.e. just pointing to master or something that might change depending on when people set up their QEMU.

For all of the development ease reasons that @alistair23 has mentioned (i.e. branches that live before or after QEMU / CI changes), it would be preferable if this "auto-fixed" itself whenever a user changes branches or pulls updates to the main repository. As I understand it, submodules do that automatically, or, we can recreate the functionality by hard-coding SHAs in scripts. The latter is less discoverable to new users, as it's something custom rather than something standard.


Submodules, for better or worse, as the git standard solution to this problem. Much like we prefer not using cfg, but there are rare places where it's appropriate, I think the same applies here. This is not to be seen as license to add submodules willy-nilly, and there will always be reticence, but we should use them when they are the right tool for the job.

Is there a reason beyond "the UX is awful" to not use submodules?

@bradjc
Copy link
Contributor

bradjc commented Jul 2, 2020

I'm not blocking this, but I'm not supportive.

@alistair23
Copy link
Contributor Author

I have changed this back to submodules.

Makefile Outdated Show resolved Hide resolved
ppannuto and others added 3 commits July 7, 2020 14:52
[ Changes by AF:
 - Update to use mainline QEMU
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
[ Changes by AF:
 - Remove reference to the fork
 - Add details about building QEMU
]
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Copy link
Member

@ppannuto ppannuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Okay, I think this is good to go. Sorry that what should be simple turned into such a big affair.

@bors bors bot merged commit a3b9a8c into tock:master Jul 10, 2020
@alistair23 alistair23 deleted the alistair/qemu branch July 10, 2020 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants