-
-
Notifications
You must be signed in to change notification settings - Fork 13.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing{-python}.nix: Remove log pretty-printing cruft #87191
Conversation
This completes the removal of the nested log feature, which previously got removed from Nix, Hydra, stdenv and GNU Make. In particular, this means that the output of VM builds no longer contains a copy of jQuery.
The time window between opening the PR and merging it (2 days) was way too quick for me this week, i did not have a chance to study these changes before they were merged.
Do you have links or any other references to discussions about this that were ongoing?
I don't understand how the change makes the tests more reproducible, as in my experience they were reproducible already, as test authors usually write their tests in a way that they are not timing-sensitive. What test was damaged so hard by the logging that the feature has to be removed completely? The output folder now contains no logging whatsoever. In the beginning i carefully ported the perl driver's logging code to python because there was a lot of pushback against removing it. In the long term the idea was to use native python logging frameworks, but people were against having no logging in the meantime. If it was just about jQuery, it would surely have been sufficient to just remove jquery from the template file and make it a static no-js html page. I am confused about the hurry and the method of removing this. |
I was also baffled by this, and would appreciate some reaction from @edolstra. For now, we kept "feature compatibility" with the perl driver, and originally planned to do all the refactoring and yakshaving after all the remaining perl test driver occurences have been removed. As written in #86486 (comment), this PR didn't really fix the issue it was supposed to fix, and my experiments around using the python logging framework revealed the test driver got a lot less powerful (no more "log nesting" for subtests, as there's no mutable log object, screenshots without any context in test output). I'd personally suggest to make the writes to stderr threadsafe, and revert the nested log output removal (and the "no more log in |
Yes, putting logs in an output makes the build non-reproducible. Let's pick a random line from a VM test run:
The timestamp 1.667557 is unlikely to be the same across runs. So running with More importantly, it's bad UX to have two different kinds of log. Also, viewing "pretty-printed" logs puts strain on Hydra since it needs to fetch paths from the binary cache and unpack them to serve them to the client (whereas regular logs are fetched from the binary cache by the client directly). BTW, I believe I did mention during the Python migration that there was no need to port the logging stuff. We removed log pretty-printing everywhere else so why should we keep it here? |
You did, but this started a sub-discussion that exposed that there are people who want to have such logging facilities in general (i certainly do), that we did not resolve. A flag like @flokli mentioned, that is either off by default or in jobsets that do reproducibility checks, but still allows for logging (without depending on jquery etc) would be great. |
I don't understand how this makes the tests reproducible, if there is no output to verify most of the time.
Two runs could have wildly differing executions, possibly with actually differing behaviours, but be understood as "reproducible" according to what I understand. What could happen, to help make the logs more reproducible, which may be a fool's errand, is to have the Now, that won't fix ordering of the messages, but that's because the kernel execution is not reproducible in that context. EDIT: until we have truly reproducible execution of a system, we shouldn't describe a VM test as reproducible, and shouldn't pretend it is. They are not. They are notoriously fickle enough to sometimes work and sometimes fail. |
I don't think anyone expects we have a truly reproducible system execution, especially not across multiple VMs. NixOS VM tests are used to "integration test" one or multiple NixOS systems. Many people expressed the wish to have the logs available in the output. I also don't see a reason on why we should drop logs, but still keep (also non-reproducible) screenshots and coverage-data. We could not do this for hydra builds if their outputs are really an issue, but TBH, it's mostly text, which can be well compressed, and we probably fill the cache with much bigger things on every world rebuild. So IMHO, let's put the log output back in the output. Whether it should be behind a flag, whether HTML output (nested/non-nested) should be preserved, whether we can use the python logging framework - that's all details, but just removing it probably isn't what we should do here. |
I disagree. Putting logs in $out is wildly inconsistent with every other kind of derivation. It's very surprising (in a bad way) from a UX perspective. The nested logging once made sense because the rest of the Nix ecosystem implemented nested logs as well. But since we got rid of it everywhere else, it no longer makes sense to keep it around just for VM builds. |
Following that logic, screenshots and coverage data also don't belong into the output as well, and the output should be entirely empty. IMHO, VM tests are something different with other "package derivations", as they don't build packages, but do integration tests, using Nix derivations as a tool to compose them. People make use of the VM testing infrastructure in scopes outside of nixpkgs, and even in nixpkgs, being able to view the screenshots interleaved with the logs helps to debug them. We shouldn't just remove them without any alternative. |
Isn't the output of a test, some kind of result, probably data proving the test? As @flokli says, tests don't build packages, they run tests. Right now, we're losing the output of the tests. There isn't even a "dirty" VM image to peruse in the output. If it isn't logs, the full VM image would be something I would expect to see as the output of a test, just as I expect the output of packages to be the package. I'm sure we'll hear " It may be that "log" is an overloaded term here, there is the log for running the tests, i.e. harness output, and there is the test result log, i.e. console output from the system. |
Motivation for this change
This completes the removal of the nested log feature, which previously got removed from Nix, Hydra, stdenv and GNU Make. In particular, this means that the output of VM builds no longer contains a copy of jQuery.
This also makes VM builds more reproducible because the log output contained a lot of timing-sensitive data.
Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)