Upgrade to Nixpkgs 25.11 #303
|
Closing for now, will revisit in future cycle with GRUB skeleton pinned. |
|
Re-based on current |
@knuton does kiosk still segfault from I suggest to test with |
|
Rebased on top of main. |
Fixed an issue with kiosk going into an infinite reload in 509788d. Strangely, this seems to also get rid of the segfault for me at least. Nevertheless, I also disabled Vulkan in 8513fc4, because it gets rid of some related errors. |
|
The automated tests now pass, remaining efforts should focus on manual testing / verification. In the meantime, @knuton @dividat-jgu could you check if the following work for you on this branch:
|
All good! Notes: I had to use |
|
3e43b4f to bfb9dad
nixpkgs 25.11 deprecates substituteAll, however replaceVars has slightly
different behaviour. Using `substituteAll`, the substitute values were
implicitly coerced to store paths. `replaceVars` doesn't do that and it
causes local paths to appear (at least) in tests, e.g. in e2e-tests:
Traceback (most recent call last):
  File "/nix/store/qkjvh813d6kjw4clp5kh2fgp3gkb2l8y-install-playos-2026.3.0-TEST/bin/.install-playos-wrapped", line 497, in <module>
    _main(parser.parse_args())
  File "/nix/store/qkjvh813d6kjw4clp5kh2fgp3gkb2l8y-install-playos-2026.3.0-TEST/bin/.install-playos-wrapped", line 438, in _main
    install_bootloader(disk, machine_id)
  File "/nix/store/qkjvh813d6kjw4clp5kh2fgp3gkb2l8y-install-playos-2026.3.0-TEST/bin/.install-playos-wrapped", line 215, in install_bootloader
    shutil.copy2(GRUB_CFG, '/mnt/boot/grub/grub.cfg')
  File "/nix/store/kjgslpdqchx1sm7a5h9xibi5rrqcqfnl-python3-3.12.8/lib/python3.12/shutil.py", line 475, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/nix/store/kjgslpdqchx1sm7a5h9xibi5rrqcqfnl-python3-3.12.8/lib/python3.12/shutil.py", line 260, in copyfile
    with open(src, 'rb') as fsrc:
         ^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/yfyf/src/playos2/skeleton/bootloader/grub.cfg'
In earlier NixOS versions, connman would silently "win" /etc/resolv.conf management over resolvconf. Since openresolv was updated, it raises an error when it detects that /etc/resolv.conf is not managed by it, which makes `network-setup.service` fail. This does not seem to break anything, but it produces noise and confusion in the system logs. Since we have been using connman to manage /etc/resolv.conf, we keep things as they are and simply disable resolvconf. This has the side effect of disabling `network-setup.service` as well, which is desirable.
Since NixOS 25.05, `multi-user.target` no longer depends on
`network-online.target`, see:
NixOS/nixpkgs@2370696
Moreover, when using connman, nothing starts the
`network-online.target` (see the message of the previous commit, 0ec53e4).
Transitively, depending on the setup, this means that `network.target`
might never be activated at all, which is what actually happens in the
controller-proxy test.
More nonsense caused by NixOS misconfiguring networking for connman. This fixes the testing/integration/controller-interface-labeling.nix test, and also avoids potential real mDNS issues down the line.
The `dontWrapQtApps = true` is recommended by Qt packaging docs for PyQt applications and the makeWrapperArgs approach is what many PyQt packages seem to use in nixpkgs.
We do not use Nvidia cards at the moment and without this, at least in
the VM we get confusing errors in logs such as:
radv/amdgpu: failed to initialize device.
MESA: info: could not get caps: Function not implemented
MESA: info: could not get caps: Function not implemented
MESA: error: vdrm_device_connect failed
radv/amdgpu: failed to initialize device.
MESA: info: could not get caps: Function not implemented
MESA: error: vdrm_device_connect failed
radv/amdgpu: failed to initialize device.
MESA: info: could not get caps: Function not implemented
MESA: error: vdrm_device_connect failed
radv/amdgpu: failed to initialize device.
In earlier Qt / QWebEngine versions, the loadFinished signal arrived
before the next play:beforeunload event, which would clear the
`_is_full_reload` variable.
After the nixpkgs bump, the events look like:
load page -> SW triggers play:beforeunload ->
start full_reload -> SW triggers play:beforeunload -> loop
To prevent this, we do not re-enter the full_reload if one is already in
progress.
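
The guard described above can be sketched in a few lines of Python. This is a minimal illustration, not the kiosk's actual code: only `_is_full_reload`, `play:beforeunload` and `loadFinished` come from the commit message. `ReloadGuard`, its method names, the `reload_page` callback and the assumption that `loadFinished` is what clears the flag are all hypothetical.

```python
from typing import Callable


class ReloadGuard:
    """Hypothetical stand-in for the kiosk's full-reload handling."""

    def __init__(self, reload_page: Callable[[], None]) -> None:
        self._reload_page = reload_page
        self._is_full_reload = False
        self.reloads_started = 0

    def on_before_unload(self) -> None:
        # Called for every play:beforeunload event.
        if self._is_full_reload:
            # A full reload is already in progress: do not re-enter it.
            return
        self._is_full_reload = True
        self.reloads_started += 1
        self._reload_page()

    def on_load_finished(self) -> None:
        # Assumption: loadFinished marks the end of the reload and
        # clears the flag, so that a later reload is allowed again.
        self._is_full_reload = False
```

With this shape, a play:beforeunload event that arrives while the reload is still running is ignored, regardless of whether loadFinished arrives before or after it.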
From the NixOS 25.11 release notes:

> NixOS display manager modules now strictly use tty1, where many of
> them previously used tty7. Options to configure display managers’ VT
> have been dropped.

This was breaking release-validation tests, since they expect to find a terminal on TTY1 for triggering the reboot. Using TTY3 should be backwards compatible with previous releases, since TTY1-6 were all terminals. Under some distro configurations, TTY2 is sometimes also used for the "active graphical session", so we jump straight to TTY3.

Due to this, switching to TTY7 should no longer be allowed, since it no longer holds the graphical session.
…sion
This is mostly to help existing users learn about the new shortcut if
they have been using Ctrl+Alt+{F7,F8} to switch between the status
screen and graphical interface for whatever reason.
Follow-up to 7756649
At first I thought it was a problem local to nixos-test-script-helpers.py, so I tried to fix it by making it a separate package to ensure that it typechecks standalone, but it turns out the issue is elsewhere...

When typechecking, `runNixOSTest` prepends a silly piece of code to the test script that defines various global types and variables. It added `t: TestCase`, which conflicts with the `TestCase` we define.

The only ways to avoid the conflict are to rename our `TestCase` or to always import it using a qualified name. Since we use it everywhere, it is nicer without the qualified name, so I am renaming it.

Closes: dividat#298
There is testers.nixosTest, but it is in the process of being deprecated: NixOS/nixpkgs#293891
This seems to have always been useless, since NixOS does not actually set up a `connman-wait-online.service` and `network-online.target` was activated by `multi-user.target` implicitly, without any relation to an actual network "online" status. Recent NixOS versions have removed the dependency on `network-online.target` from `multi-user.target`, which means that (with connman as the network manager) it never gets activated, which causes the TestPrecondition to fail.
New static-web-server version prevents serving files that are symlinks which resolve to paths outside of the webroot.
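
To illustrate the kind of check that produces this behaviour, here is a small Python sketch. It is not static-web-server's implementation; the function name `resolves_inside_webroot` and its parameters are made up for this example. It only shows how a symlink inside the webroot can resolve to a path outside of it and be rejected for that reason.

```python
from pathlib import Path


def resolves_inside_webroot(webroot: Path, requested: Path) -> bool:
    """Return True if `requested`, after following symlinks, stays inside `webroot`."""
    root = webroot.resolve()
    # resolve() follows symlinks, so a link pointing elsewhere yields its real target.
    target = requested.resolve()
    return target == root or root in target.parents
```

A bundle symlinked into the webroot from elsewhere (for example from the Nix store) fails this kind of check, while a copy of the same file placed inside the webroot passes it.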
The new log message is:
PlayOS network watchdog skipped, unmet condition check ConditionPathExists=!/home/play/.config/playos-network-watchdog/disabled
The NixOS Stage 2 thing seems to be gone, but it is not relevant.
This vastly increases the test disk size, but seems to be the only way
that works for making the kiosk / QtWebEngine work without crashing /
errors / no display.
Tried disabling graphics acceleration / GPU rendering with all of the
following:
export QTWEBENGINE_DISABLE_GPU="1"
export LIBGL_ALWAYS_SOFTWARE="1"
export QTWEBENGINE_CHROMIUM_FLAGS="--disable-gpu"
export QSG_RHI_BACKEND="software"
but kiosk/Qt still either crashes or renders nothing.
The positive aspect is that this makes the e2e setup closer to "real
world".
The recent static-web-server version made symlinks that point outside the web root illegal (see previous commits). For tests using UpdateServer, it is preferable to avoid copying the bundles, since they are quite large and tests running in /tmp already occasionally run out of memory. So instead we switch to a different static web server, which cares less about security :P
In CI, this sometimes fails at the precondition stage,
while running:
playos.succeed("curl ${primaryCheckUrl}")
which seems to hang forever, indicating incorrect NAT / port forwarding.
Cannot reproduce locally.
watchdog logs also indicate the primaryCheckUrl is not reachable, while
secondaryCheckUrl is reachable:
2026-05-07T07:53:19.2454857Z playos # [ 16.693981] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13939/check succeeded!
2026-05-07T07:53:20.4631618Z playos # [ 17.911811] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13838/check failed: HTTPConnectionPool(host='10.0.2.88', port=13838): Read timed out. (read timeout=0.2)
2026-05-07T07:53:20.4753877Z playos # [ 17.923982] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13939/check succeeded!
2026-05-07T07:53:21.7012434Z playos # [ 19.149858] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13838/check failed: HTTPConnectionPool(host='10.0.2.88', port=13838): Read timed out. (read timeout=0.2)
2026-05-07T07:53:21.7125924Z playos # [ 19.162291] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13939/check succeeded!
2026-05-07T07:53:22.9341384Z playos # [ 20.380213] playos-network-watchdog[661]: DEBUG:watchdog:URL check for http://10.0.2.88:13838/check failed: HTTPConnectionPool(host='10.0.2.88', port=13838): Read timed out. (read timeout=0.2)
...
This is very confusing, since the two `HTTPStubServer`s are started
identically and also have identical port forwarding setups.
If the networking setup is broken (e.g. NAT failing or QEMU's slirp being only partially set up), a client connection might be kept open indefinitely, since the client never finishes the read or closes the connection. This is mostly to avoid "strange" hangs and make the failures more explicit. The ThreadingHTTPServer seems to be sufficient to fix the network-watchdog integration test flakiness.
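
For reference, the combination of a per-connection socket timeout and a threaded server can be sketched with the Python standard library alone. This is a hedged illustration, not the actual `HTTPStubServer` code: `CheckHandler`, `make_server` and the 5 second timeout are assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CheckHandler(BaseHTTPRequestHandler):
    # Socket timeout in seconds, applied to each connection by
    # StreamRequestHandler.setup(). The value is an assumption.
    timeout = 5

    def do_GET(self) -> None:
        body = b"OK"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        # Silence per-request logging to keep test output readable.
        pass


def make_server(port: int = 0) -> ThreadingHTTPServer:
    # Port 0 lets the OS pick a free port. Each connection is handled in
    # its own thread, so one stalled client cannot block the others.
    server = ThreadingHTTPServer(("127.0.0.1", port), CheckHandler)
    server.daemon_threads = True
    return server
```

With a single-threaded `HTTPServer`, a client that connects but never sends a request would block every later request until it goes away. Here the stalled connection only occupies its own thread and is dropped once the timeout expires.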
The multiple initial rfkill switches alone take ~20 seconds on CI, before anything even "happens". Also increase the timeout further for good measure.
|
@knuton I believe this is now at a stage where we can move into manual testing. I think it would be a good idea to perform the standard PlayOS pre-release manual tests. Do you maybe want to take over this part, since you have all the peripheral hardware? 😊 Alternatively, we can go through code review and then do the manual tests. |
Sure
No, this is the way. |
Updated definitions to resolve deprecation warnings, eval errors and build errors.
Issues
run-in-vm because QML imports are not found
nixos-test-script-helpers don't run due to mypy errors
main
Checklist
du -BM -sL $(nix-build -A components.unsignedRaucBundle))
pkgs.linux-firmware to fit actually used hardware #343
du -BM -sL $(nix-build -A components.installer.isoImage))
2252M 3345M
pkgs.linux-firmware to fit actually used hardware #343 is still the best remedy