Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0021] Pull Request Testing on Hydra #21

Closed
wants to merge 3 commits into from

Conversation

Projects
None yet
10 participants
@globin
Copy link
Member

commented Nov 6, 2017

@globin globin changed the title [RFC 00XX] Pull Request Testing on Hydra [RFC 0021] Pull Request Testing on Hydra Nov 6, 2017

@globin globin force-pushed the mayflower:nixborg branch 2 times, most recently from 5c17235 to 73f3711 Nov 6, 2017

globin added some commits Nov 6, 2017

@globin globin force-pushed the mayflower:nixborg branch from 73f3711 to 6c48f62 Nov 6, 2017

@zimbatm

This comment has been minimized.

Copy link
Member

commented Nov 6, 2017

Sorry for bringing this up again, can you expand on why we can't re-use bors? It would be nice to re-use existing software if we can. Basically can Hydra:

  • build a set of fixed branches every time some code is pushed to them?
  • report the build status as a github build status on the commit that has been built?
@globin

This comment has been minimized.

Copy link
Member Author

commented Nov 7, 2017

Currently it can do neither, I also have too many difficulties writing perl to be able to implement that properly..

Also we definitely don't want the bot merging PRs right now, this might be an option in the future but currently it should only give us an opportunity to test PRs more easily.

@Profpatsch

This comment has been minimized.

Copy link
Member

commented Nov 7, 2017

Sorry for bringing this up again, can you expand on why we can't re-use bors? It would be nice to re-use existing software if we can.

Would be awesome to reuse the work of the rust team. Especially considering not very many people know (or want to write) perl these days, but lots of people want to learn Rust anyway.

@moretea

This comment has been minimized.

Copy link

commented Nov 7, 2017

@zimbatm @Profpatsch AFAIK, the bors currently used by Cargo (package manager for Rust) is bors-ng which is implemented in Elixir.

It appears that they don't use it yet for Rust itself (see bottom comments in this thread)

@globin

This comment has been minimized.

Copy link
Member Author

commented Nov 7, 2017

Can we please keep discussing PR testing on hydra for now which is a small piece of work compared to changing our whole CI system.

@Ericson2314

This comment has been minimized.

Copy link
Member

commented Nov 7, 2017

I think we all agree that eventually, only CI should be changing master (i.e. the ref itself). Is it a good idea to tackle this in this RFC, or best to leave it for later? Relatedly, I'm a bit unsure on rebasing or merging PRs.

@7c6f434c

This comment has been minimized.

Copy link
Member

commented Nov 8, 2017

@Ericson2314

This «eventually» requires more Hydra capacity than there seems to be right now. And more trust in the handling of transient failures than we currently have (and a solution probably requires that the transient failures, and not the capacity, are the bottleneck for large rebuilds). Also a clear idea of what breakage is acceptable at what points (this obviously depends on the previous points, because we need confidence that we find out about failures, and find out quickly). This RFC is a good step to try now (and see what needs fixing), and only-CI-moves-master (and even only-CI-merges-PRs) seems significantly out of scope.

@globin

This comment has been minimized.

Copy link
Member Author

commented Nov 8, 2017

I generally agree with most of the points mentioned but I wanted to keep this RFC rather small in scope to not reach a deadlock in discussion and planning while not moving forward :)

@fpletz

fpletz approved these changes Nov 9, 2017

Copy link
Member

left a comment

The real question here is if we want PR testing on Hydra as a service for the community or not.

I'm not sure if we have enough resources but this should be implemented. Maybe even on a second Hydra or with another bot. The specifics aren't important.

@Ericson2314

This comment has been minimized.

Copy link
Member

commented Nov 9, 2017

@7c6f434c the incremental step would be having CI able to push to master (if it's a fast-forward to the commit it just built successfully), without requiring CI to push to master. The extra step of going back to merge a PR after CI succeed is generally a waste of human time, too.

That should ideally be just an additional @-mention command + a conditional git push HEAD:master || true + credentials for bot, but yes I don't want to derail this on bonus features if its more work than that.

[design]: #detailed-design

The general idea is to have a service which receives webhooks from github,
then requires some acknowledgement of a maintainer to build/test a PR and

This comment has been minimized.

Copy link
@7c6f434c

7c6f434c Nov 10, 2017

Member

Re: «a maintainer» — maybe add something about supporting a flexible definition of a maintainer (any NixOS project member/Nixpkgs committer or the package maintainer who doesn't yet have access) to «Future work»? Not in the main body, because this discussion should not delay checking if Hydra can build all PRs. Maybe also people who have multiple accepted commits to the package.

This comment has been minimized.

Copy link
@moretea

moretea Nov 10, 2017

See #19 also.

AFAIK the current implementation of that bot just has a hardcoded list of people who should be allowed access.

I think that people from the maintainers file + anyone who currently has commit access should be allowed to instruct the bot via @-mentions.

This comment has been minimized.

Copy link
@globin

globin Nov 12, 2017

Author Member

Currently "a maintainer" refers to a contributor with commit access, "member" in GitHub terminology, I'll clarify this. In the current test instance this doesn't work as @nixborg is not member of the NixOS GitHub org and therefore cannot read the membership status. That's why it currently has an additional hard-coded whitelist, which should be removed in the future..

- Which "release.nix" should be used to test:
- We probably want to test aarch64 and x86_64-darwin, not only x86_64-linux, as only a little number of maintainers/pull requesters have access to these architectures.
- We probably want to also test NixOS module tests, especially for PRs touching these
- Can Hydra handle the additional workload by this, we probably have to start with a small number of PRs being tested to evaluate this.

This comment has been minimized.

Copy link
@7c6f434c

7c6f434c Nov 10, 2017

Member

That, and maybe perform initial tests only x86_64-linux? How the Hydra capacity (or maybe scaling costs, if cloud builds are used regularly — I do not know) compare across targets?


Some small remarks on the architecture:
- the rebasing is done in order to:
- make sure Hydra builds a commit that has been reviewed by a maintainer

This comment has been minimized.

Copy link
@grahamc

grahamc Nov 11, 2017

Member

We should note that the bot compares the time the nixborg comment was made to when the last push was, to ensure there isn't a race condition between the comment and a malicious push.

This comment has been minimized.

Copy link
@globin

globin Nov 12, 2017

Author Member

Will do

@globin

This comment has been minimized.

Copy link
Member Author

commented Nov 12, 2017

Added some minor clarifications, after the comments above and am happy that @grahamc is co-authoring this.

@edolstra do you have some comments on this?

@Ericson2314

This comment has been minimized.

Copy link
Member

commented Nov 13, 2017

It just occurred to me that in the failing case I wish the original branch was being built, so I'd have some cached binaries to help debug. But I suppose the PR could be updated to the nixborg's rebase itself.

Also, does it rebase on master or non-master when the target branch is non-master (e.g. staging)? The answer, arrived at empirically, is the actual target branch.

@globin

This comment has been minimized.

Copy link
Member Author

commented Nov 16, 2017

Says so in the RFC, too:

  1. a nixborg-worker rebases the PR on the base branch of the PR (mainly master, staging or release-XX.XX)

@moretea moretea referenced this pull request Nov 16, 2017

Closed

Structured maintainers #31739

0 of 1 task complete
@zimbatm

This comment has been minimized.

Copy link
Member

commented Nov 19, 2017

How much work left is required to make this work?

The sooner we have any sort of automation the better, and it's hard to predict the problems that might come up without deploying and running the code. So instead of mulling over the RFC too much, I'm in favor of getting something running and have a trial period, let's say 3 months. After that period we will review the state of things and decide if we want to keep it or not.

During that period, the implementer need to have access to the machines where the code is deployed so that they can fix and iterate.

The only big concern would be to overload Hydra. If that happens we either need a separate Hydra deployment or a way to pull the plug until a solution can be found.

@dtzWill

This comment has been minimized.

Copy link
Contributor

commented Feb 20, 2018

This may be incorrect or premature, but... hasn't ofborg filled this gap more or less?

@7c6f434c

This comment has been minimized.

Copy link
Member

commented Feb 20, 2018

@zimbatm

This comment has been minimized.

Copy link
Member

commented Feb 20, 2018

It's coming. The next step is going to be able to share the binary cache between ofborg and hydra so that hydra can basically become a noop.

@edolstra

This comment has been minimized.

Copy link
Member

commented Feb 20, 2018

In that case, we should just get rid of Hydra?

@7c6f434c

This comment has been minimized.

Copy link
Member

commented Feb 20, 2018

@zimbatm not yet close in power to what is needed

@edolstra it's like a blockchain, if in the future Borg can perform a 51% buildrate attack against Hydra…

@edolstra

This comment has been minimized.

Copy link
Member

commented Jan 10, 2019

Closing after discussion in the RFC SC meeting: this is obsoleted by ofborg.

@edolstra edolstra closed this Jan 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.