Disable most VM tests #80036

Closed
wants to merge 2 commits into from

Conversation

@edolstra
Member

@edolstra edolstra commented Feb 13, 2020

Motivation for this change

Unfortunately VM tests take an excessive amount of time and memory to evaluate (more than half an hour, dozens of gigabytes of RAM). The tested job alone takes 5 minutes and ~12 GB. So disable most VM tests. Most of these should be moved into separate repos/flakes anyway in the future.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
@edolstra edolstra added this to the 20.03 milestone Feb 13, 2020
@grahamc
Member

@grahamc grahamc commented Feb 13, 2020

This is unfortunate; I think they are extremely valuable. I wonder if there is a way we can test each of these as part of the release, but not in the same evaluation? I'd hate to permanently lose such an important thing. None of the removals specifically made me squirm.

@dtzWill
Contributor

@dtzWill dtzWill commented Feb 13, 2020

Perhaps conditionally disable? For a "quick-tests" vs "release-test-all-the-things" jobset?

If resources are needed, perhaps a fundraising round would be a good idea? (or is the sort of agility lost w/these not easily addressed that way?)

Perhaps there are ways to slim the eval impact of our vm tests, with some re-use or something? (not sure how much that's been looked at...)

Disabled tests seem like they might as well be removed, which would be unfortunate indeed.
Since most contributors don't have the ability to run these tests regularly, would pushing to flakes be useful for evaluation costs or because .. these tests aren't run anymore by NixOS infra?
(not sure if the suggestion is to push into flakes for some eval-time improvements that don't require removing them...?)
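
A minimal sketch of the "conditionally disable" idea, assuming a hypothetical `runAllTests` argument and a hypothetical `heavyTest` wrapper (neither exists in nixpkgs; `handleTest` here is a stand-in for the real helper in `nixos/tests/all-tests.nix`):

```nix
# Hypothetical sketch: `runAllTests` and `heavyTest` are illustrative names,
# not part of nixpkgs.
{ runAllTests ? false }:

let
  # Stand-in for the real handleTest; it only records its arguments.
  handleTest = path: args: { inherit path args; };

  # Include an expensive test only when the full jobset is requested;
  # otherwise contribute an empty set, as handleTestOn does for
  # unsupported systems.
  heavyTest = path: args:
    if runAllTests then handleTest path args else {};
in
{
  netdata = handleTest ./netdata.nix {};
  ceph-single-node = heavyTest ./ceph-single-node.nix {};
}
```

Saved under a hypothetical name such as `tests-sketch.nix`, this could be evaluated with `nix-instantiate --eval --strict --arg runAllTests true tests-sketch.nix` for the "test all the things" variant, or without `--arg` for the quick one. Whether this would actually reduce Hydra's evaluation cost depends on how the release jobsets reference the tests, which this sketch does not model.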

@grahamc
Member

@grahamc grahamc commented Feb 13, 2020

@ofborg ofborg bot added the 6.topic: nixos label Feb 13, 2020
@edolstra edolstra force-pushed the edolstra:disable-most-vm-tests branch from 9610421 to 2ea4427 Feb 13, 2020
@@ -97,11 +97,11 @@ in rec {
(all nixos.tests.ipv6)
(all nixos.tests.i3wm)
(except ["aarch64-linux"] nixos.tests.keymap.azerty)

@worldofpeace

worldofpeace Feb 13, 2020
Member

azerty was commented out before I re-added these in bf49181#diff-0f15ebe03b218d11d461288a9b051eeb

Member

@worldofpeace worldofpeace left a comment

These omissions made me uncomfortable.
But doing this certainly is 😿

netdata = handleTest ./netdata.nix {};
#ndppd = handleTest ./ndppd.nix {};
#neo4j = handleTest ./neo4j.nix {};
#nesting = handleTest ./nesting.nix {};

@worldofpeace

worldofpeace Feb 13, 2020
Member

this tests nesting.clone and nesting.children, IMHO not a great omission

#openstack-image-userdata = (handleTestOn ["x86_64-linux"] ./openstack-image.nix {}).userdata or {};
#openstack-image-metadata = (handleTestOn ["x86_64-linux"] ./openstack-image.nix {}).metadata or {};
#orangefs = handleTest ./orangefs.nix {};
#os-prober = handleTestOn ["x86_64-linux"] ./os-prober.nix {};

@worldofpeace

worldofpeace Feb 13, 2020
Member

it scares me to not run this continuously, but it's actually broken so nvm https://hydra.nixos.org/job/nixos/trunk-combined/nixos.tests.os-prober.x86_64-linux

#kafka = handleTest ./kafka.nix {};
#keepalived = handleTest ./keepalived.nix {};
#kerberos = handleTest ./kerberos/default.nix {};
#kernel-latest = handleTest ./kernel-latest.nix {};

@worldofpeace

worldofpeace Feb 13, 2020
Member

we have the latest in the releases so I think that's not the best omission either, testing being omitted is fine though.

@vcunat
Member

@vcunat vcunat commented Feb 13, 2020

Note: as #79907 wasn't mentioned, I didn't notice and pushed ceb90b0 (20.03 only for now).

#cadvisor = handleTestOn ["x86_64-linux"] ./cadvisor.nix {};
#cassandra = handleTest ./cassandra.nix {};
#ceph-single-node = handleTestOn ["x86_64-linux"] ./ceph-single-node.nix {};
#ceph-multi-node = handleTestOn ["x86_64-linux"] ./ceph-multi-node.nix {};

@devhell

devhell Feb 13, 2020
Contributor

Ouch. Ceph without tests makes me feel queasy.

@flokli
Contributor

@flokli flokli commented Feb 13, 2020

I don't really think simply disabling tests is the right way forward here. Also, having tests in all-tests.nix (and in nixosTests.*) makes them discoverable.

It surely might ease some of the memory consumption issues we're currently facing. But having those tests around, with most of the "stable ones" included in the tested jobset, and thereby ensuring the features described there don't suddenly break, is a very nice property we should not give up.

I'd really prefer if we could teach hydra to evaluate parts of that individually, as proposed in NixOS/hydra#715.

@bjornfor
Contributor

@bjornfor bjornfor commented Feb 14, 2020

The NixOS VM tests are a major selling point for me. Removing (so many of) them due to ~12 GiB of memory use and CPU cycles? I don't get that. Human cycles are much more expensive, and a change like this will mean more human cycles will have to be spent.

@flokli
Contributor

@flokli flokli commented Feb 18, 2020

With all the changes that have happened in the meantime on master and 20.03, I assume we no longer plan to disable almost all tests. Please reopen if I'm wrong.

@flokli flokli closed this Feb 18, 2020
8 participants