18.09 Zero Hydra Failures #45960

Open
samueldr opened this Issue Sep 2, 2018 · 73 comments

Member

samueldr commented Sep 2, 2018

Let's make Jellyfish the best release so far!

We have: the main jobset starting at 425 failures, x86_64-darwin at 829, and aarch64-linux at ~1120. The numbers may seem large, but one ~~weird trick~~ appropriate fix may fix many at once.

How can I help???

  • Choose some package from those that fail on Hydra. Hurry before the good ones have been taken!
  • Find a fix. That may mean simply restricting meta.platforms, if the package inherently doesn't support the platforms currently listed there.
  • Typically the package was broken on master already. You can verify that on Hydra (example URL: https://hydra.nixos.org/job/nixpkgs/trunk/bash.x86_64-linux). In that case, base the fix on master and request backporting in the description of the pull request.
  • Ping this issue from the PR, e.g. /cc ZHF #45960. Do this also if you have some WIP. Alternatively, you may just post a note in this issue. If the breakage is specific to darwin (#45961) or aarch64 (#45962), mention the respective issue instead.

The remaining packages will be marked as broken before the release (on the failing platforms), i.e. at the end of September. /cc @NixOS/nixpkgs-committers, but everyone can help out!
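
As a rough illustration of the "restricting meta.platforms" and "marked as broken" options mentioned above, a package expression might end up looking something like the sketch below. The package name, URL, and hash are hypothetical placeholders, not a real nixpkgs package; only the `meta` attributes are the point.

```nix
{ stdenv, fetchurl }:

stdenv.mkDerivation rec {
  # Hypothetical package, used only for illustration.
  name = "example-tool-${version}";
  version = "1.0";

  src = fetchurl {
    url = "https://example.org/example-tool-${version}.tar.gz";
    # Placeholder hash; a real expression needs the actual sha256.
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };

  meta = with stdenv.lib; {
    description = "Hypothetical Linux-only tool";
    # Declare only the platforms the package really supports,
    # so it is no longer expected to build on darwin.
    platforms = platforms.linux;
    # Alternative for a package that should work but currently fails:
    # broken = true;
  };
}
```

Which of the two is appropriate depends on whether the package inherently cannot support a platform (restrict `platforms`) or is merely failing at the moment (`broken`).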

@samueldr samueldr added this to the 18.09 milestone Sep 2, 2018

@worldofpeace worldofpeace referenced this issue Sep 3, 2018

Merged

Various build fixes #45967

4 of 9 tasks complete

@symphorien symphorien referenced this issue Sep 3, 2018

Merged

purple-matrix: 2016-07-11 -> 2018-08-02 #45974

5 of 9 tasks complete

@danieldk danieldk referenced this issue Sep 3, 2018

Merged

vowpalwabbit: fix build against boost-python. #45987

4 of 9 tasks complete

@veprbl veprbl referenced this issue Sep 3, 2018

Merged

[18.09] Revert "arrow-cpp: 0.9.0 -> 0.10.0" #45991

0 of 9 tasks complete

danieldk added a commit to danieldk/nixpkgs that referenced this issue Sep 3, 2018

danieldk added a commit to danieldk/nixpkgs that referenced this issue Sep 3, 2018

Keras: fix build by updating expected dependencies.
Keras expects keras_preprocessing 1.0.2 and 1.0.4. 1.0.3 and 1.0.5
are respectively in nixpkgs.

ZHF #45960

@symphorien symphorien referenced this issue Sep 3, 2018

Merged

gede: 2.6.1 -> 2.10.9 #45995

4 of 9 tasks complete

Contributor

volth commented Sep 3, 2018

Reverting ad47c38 will fix nixpkgs.perl*Packages.MouseXGetOpt

Contributor

xeji commented Sep 3, 2018

> Reverting ad47c38 will fix nixpkgs.perl*Packages.MouseXGetOpt

reverted in 9889c0f and 4c00a04

xeji added a commit that referenced this issue Sep 3, 2018

Keras: fix build by updating expected dependencies. (#45992)
Keras expects keras_preprocessing 1.0.2 and 1.0.4. 1.0.3 and 1.0.5
are respectively in nixpkgs.

ZHF #45960

(cherry picked from commit e33be2a)

xeji added a commit that referenced this issue Sep 3, 2018

Keras: fix build by updating expected dependencies. (#45992)
Keras expects keras_preprocessing 1.0.2 and 1.0.4. 1.0.3 and 1.0.5
are respectively in nixpkgs.

ZHF #45960

@dywedir dywedir referenced this issue Sep 3, 2018

Merged

ion: broken on darwin #46010

0 of 9 tasks complete

@markuskowa markuskowa referenced this issue Sep 3, 2018

Merged

gnss-sdr: set boost version to 1.66 #46014

4 of 9 tasks complete

@andir andir referenced this issue Sep 3, 2018

Merged

python.pkgs.pytest-fixture-config: disable tests #46021

3 of 9 tasks complete

Member

samueldr commented Sep 4, 2018

Failures report as of right now.

Let's see how useful this is as a format. This was queried from the last finished eval, there were evals running while this was made.

Contributor

volth commented Sep 4, 2018

> Failures report as of right now.

I assumed that perl52[68]Packages.TestMagpie should be ignored as broken because it depends on broken perl52[68]Packages.UNIVERSALref (#45983).

Or should each dependent have its own meta.broken = versionAtLeast perl.version "5.26" ?

Contributor

xeji commented Sep 4, 2018

> This was queried from the last finished eval, there were evals running while this was made.

@volth the table shows an earlier eval where UNIVERSALref wasn't marked as broken yet (see the build logs). It is fine in the latest eval: https://hydra.nixos.org/eval/1477017#tabs-removed

> Or should each dependent have its own meta.broken = versionAtLeast perl.version "5.26" ?

No, only the package that is broken itself.

@danieldk danieldk referenced this issue Sep 4, 2018

Merged

mxnet: 1.1.0 -> 1.2.1 #46026

3 of 9 tasks complete

Member

vcunat commented Sep 4, 2018

I'd add that broken and other checks are transitive during evaluation (implemented as exceptions).
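
For the meta.broken question discussed above, a minimal sketch of where the condition would go: only in the expression of the package that fails itself. This is a fragment of a hypothetical Perl module's attributes, and it assumes `stdenv` and `perl` are in scope, as they would be in a typical nixpkgs expression.

```nix
# Fragment of the expression for the package that is broken itself
# (a hypothetical Perl module; `stdenv` and `perl` assumed in scope).
meta = {
  # Broken only when built against Perl 5.26 or newer.
  broken = stdenv.lib.versionAtLeast perl.version "5.26";
};
# Packages that depend on this module need no `broken` of their own:
# evaluating them fails through the broken dependency.
```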

Contributor

timokau commented Sep 4, 2018

The sage failures are due to a mistake I made when adding pkg-config aliases to openblas and the recent numpy update. I fixed openblas in #46016 in staging. I don't know if that means it will also be merged into 18.09. I haven't gotten to backporting the numpy upgrade from sage upstream yet.

I really think it is a shame that hydra doesn't ping maintainers on failures anymore. Seems like an essential feature to miss.

Contributor

xeji commented Sep 4, 2018

@vcunat @samueldr there are a number of changes currently in staging/staging-next that should go to 18.09 once they reach master - openblas, texlive 2018, a systemd bugfix, etc.
What's the workflow for these? Guess we'll need a staging-18.09 branch + Hydra job.

Member

vcunat commented Sep 4, 2018

Since the fork point the staging branch won't get to 18.09 anymore. Cherry-picking should be done if desired. For this one I did it in 6f8e07a.

Member

vcunat commented Sep 19, 2018

I don't expect large transient failures on mac anymore – those would probably be visible as jumps on large rebuilds. Looking at Hydra numbers, during this ZHF, aarch64 has seen lots of improvements, whereas darwin didn't improve much. (can't say why)

Mic92 added a commit to Mic92/nixpkgs that referenced this issue Sep 19, 2018

python3.pkgs.typeguard: fix builds by applying utf-8 locales
ZHF #45960

(cherry picked from commit c9fc0a609dd0eb4fad67394148bf9adb66a79e41)

Mic92 added a commit to Mic92/nixpkgs that referenced this issue Sep 19, 2018

Mic92 added a commit that referenced this issue Sep 19, 2018

Contributor

LnL7 commented Sep 20, 2018

A lot of what's broken on darwin, 340 jobs + new ones, has been broken for a long time or forever. I've not had the time/energy to go through those and mark them linux only.

Contributor

xeji commented Sep 20, 2018

@LnL7 maybe you can define clear criteria like "if it's been broken on darwin for >x months it should be marked linux only" and open a WIP PR so all committers could add to it whenever they have time?

Contributor

LnL7 commented Sep 20, 2018

I linked to an overview of the jobs that have been broken for over a year on the darwin ZHF issue; the rest is kind of a grey zone. I started to go over those, marking them broken or linux only where appropriate, like #46584 and #46628, but I'm kind of busy with other things now.

Contributor

xeji commented Sep 21, 2018

staging-18.09 looks pretty good on Hydra, but there are some timed out gnome3 jobs that need restarting.
@vcunat @samueldr

Member

vcunat commented Sep 21, 2018

OK, merged. The cheese error looked like a parallel-make problem, but I can't see it happen often, so I simply restarted it along with the rest of the new failures.

Contributor

markuskowa commented Sep 22, 2018

Could someone please re-trigger julia_10.x86_64-linux and julia_07.x86_64-linux? These two should build successfully.

Member

samueldr commented Sep 23, 2018
