NixOS is an amazing Linux distribution. The InfoQ article and thesis are well worth your time to read. Meanwhile, here is a new trick I discovered for debugging Linux distribution upgrades using git bisect.
I upgraded from NixOS 15.09 to 17.03 and found that the Pharo Virtual Machine had broken. Starting the VM would cause a segmentation fault within around one second. There was no obvious cause in the Pharo VM code itself: it seemed to be indirectly caused by a change in some dependency. There had been around 35,000 package updates to NixOS between those two releases, so how do you know which one is the problem?
It turns out that you can use git bisect to answer that question automatically. This is because the whole NixOS distribution is defined in a Git repository (nixpkgs) and so the history of every update to every package is tracked. So all I needed to do is write a script that starts the Pharo VM and checks whether it prints Segmentation fault within the first few seconds of execution. Easy, here it is:
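A minimal sketch of what such a script could look like is below. It is not the original script. The nixpkgs attribute `pharo-vm`, the binary path under `./result`, and the five-second limit are placeholders chosen for illustration, and the name check at the bottom is only there so that the functions can be loaded without triggering a build. It relies on the `git bisect run` convention that exit code 0 means good, 125 means "skip this commit", and other codes from 1 to 127 mean bad.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of pharo-nix-bisect.sh, run from the root of a nixpkgs checkout.

ATTR="${ATTR:-pharo-vm}"                    # assumption: attribute that builds the VM
VM_BIN="${VM_BIN:-./result/bin/pharo-vm}"   # assumption: path of the VM binary
RUN_SECONDS="${RUN_SECONDS:-5}"             # assumption: how long to let the VM run

# Turn the captured VM output into a verdict:
# returns 1 (bad) if the log mentions a segmentation fault, 0 (good) otherwise.
verdict_from_log() {
    if grep -q "Segmentation fault" "$1"; then
        return 1
    fi
    return 0
}

main() {
    local log status
    log="$(mktemp)"

    # Build the package from the commit that git bisect has checked out.
    # If the build fails, this commit cannot be tested, so ask bisect to skip it.
    if ! nix-build . -A "$ATTR" > /dev/null 2>&1; then
        rm -f "$log"
        exit 125
    fi

    # Start the VM, stop it after a few seconds, and keep everything it printed.
    timeout "$RUN_SECONDS" "$VM_BIN" > "$log" 2>&1

    verdict_from_log "$log"
    status=$?
    rm -f "$log"
    exit "$status"
}

# Only run when executed under its own file name (illustrative guard).
if [ "${0##*/}" = "pharo-nix-bisect.sh" ]; then
    main
fi
```

Treating a failed build as "skip" rather than "bad" matters here: between two releases there will be commits where the package does not build at all, and those should not be blamed for the crash.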
Then once I have this script I can ask git bisect to please find the commit that introduces the segmentation fault, considering all updates to all packages in the whole NixOS universe:
git bisect start master 15.09
git bisect run ./pharo-nix-bisect.sh
Finding the bad commit from a set of 35,000 actually only requires around 15 tests because git bisect uses a logarithmic-time binary search.
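To see where that number comes from, here is a small sketch that counts how many times a set of candidates must be halved before only one is left. The function name `bisect_steps` is made up for this example, and it models an idealized binary search rather than the exact commit selection that git bisect performs on a real history.

```shell
# Count the halvings needed to narrow n candidates down to a single one.
bisect_steps() {
    local n=$1 steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))    # worst case: the culprit is in the larger half
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 35000   # prints 16
```

Since 2^15 = 32,768 is just under 35,000, the worst case comes out at 16 tests, in line with the "around 15" figure.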
This test ran for a few hours, testing many different versions of the whole OS including compiler toolchains, etc., and then finally pointed me in the right direction. It turns out that the problem was introduced by adding "hardening" to the default CFLAGS on NixOS, and particularly by building Pharo with -fPIC, which is not compatible with the VM. So I disabled -fPIC for the Pharo package on my nixpkgs branch, sent a pull request upstream, and went on with my day.
Truly, this feels like a small step towards "dependency heaven." Thanks, Nix!
I used the same technique to trace this issue down to a guile upgrade and rapidly fix it. Nix and git are both wonderful tools, and in combination they're amazing. On top of that, Nix didn't even have to rebuild weechat every time, because not all the commits affected weechat's dependencies, and it took only about 10 minutes to find the problem.