prefetch-yarn-deps, fetchYarnDeps: init #140701
Is this sufficient to perform a proper build? I would assume you tried, but we'll need to know.
Is this part of the FOD? Otherwise it's ineffective.
If you check early, you can retry. Also failing early is helpful when fetchYarnDeps takes a long time. It'd be bad when downloads are failing slowly, because that could make the fetcher take a very very long time to fail. It's also irresponsible from a distributed systems perspective. If the load is too high and a service starts to fail, clients should back off. The only (responsible) thing we can do is fail early.
This is convenient, but it adds the risk that minute changes to the lock file break the FOD in a way we can't detect, because we don't have a hash for the lock file. It does not make the FOD more powerful, so I recommend wiring that file into the build independently of the FOD.
It must do so in the FOD, otherwise we can't get any info out about the mismatch, except a useless combined hash.
Use https://nixos.org/manual/nixpkgs/unstable/#sec-pkgs-invalidateFetcherByDrvHash for success cases. Testing failures is harder, but can be done with "passthru.tests" on functions via OfBorg is an open problem.
It is sufficient to build an
54fbaaf to a153aef
I have implemented hashsum checking, and ensured it will always exit with a non-zero code when something went wrong. As a demo, I have converted gitlab, hedgedoc and element-desktop to fetchYarnDeps.
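For readers following along, the fail-early behaviour described here can be sketched roughly as below. This is a minimal illustration, not the code from this PR: it assumes each download comes with an SRI-style integrity string (`<algo>-<base64 digest>`), and the names `verify_integrity` and `check_all` are hypothetical.

```python
import base64
import hashlib
import sys


def verify_integrity(data: bytes, integrity: str) -> bool:
    """Check data against an SRI-style string such as 'sha512-<base64 digest>'."""
    algo, _, expected_b64 = integrity.partition("-")
    if algo not in ("sha1", "sha256", "sha512"):
        raise ValueError(f"unsupported hash algorithm: {algo}")
    actual = hashlib.new(algo, data).digest()
    return actual == base64.b64decode(expected_b64)


def check_all(downloads):
    """Exit with a non-zero code on the first mismatch instead of continuing."""
    for name, data, integrity in downloads:
        if not verify_integrity(data, integrity):
            print(f"hash mismatch for {name}", file=sys.stderr)
            sys.exit(1)
```

Stopping at the first bad download keeps the error message tied to a single file, and it avoids hammering a failing server with further requests.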
I really like this personally. I'm running a nixpkgs-review just to confirm all is good. This is obviously outside the scope of this PR, but there are 2 yarn.lock parsers in pure Nix. I think in the future we can think about doing that parsing of the lock in Nix. I'm just starting the conversation here to see how people feel about that.
@happysalada from my point of view there are multiple use cases that have really different requirements:
So in my opinion it's really nice to have a yarn.lock parser in pure Nix, but right now it's useless for packaging stuff in nixpkgs, because it would have worse evaluation time than pre-converted yarn.nix files, and would still need to check in big yarn.lock files to nixpkgs.
@roberth I'm not sure if I understand your other comments.
I don't understand why it would be ineffective to check outside of the FOD if the FOD generated some nonsense offline cache, but anyways this is solved now: It will exit immediately if one download fails or gives the wrong hash.
Again, solved by checking hashes and exiting
This is how it's supposed to work: The risk I am talking about is: Someone updates the src, but forgets to update the FOD hash. It will not attempt to build the FOD again, because the output for the old FOD hash is cached. No check in the FOD could help against this, because it would simply not be run. The question I am asking is: Do we need additional protection from this like fetchCargoTarball has?
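To make the stale-hash risk concrete, here is a toy model of it. It is a deliberate simplification and not how Nix is implemented: a dictionary stands in for the store or binary cache, and `realise_fod` is a hypothetical name.

```python
import hashlib

# Toy stand-in for the Nix store: declared hash -> cached output.
store = {}


def realise_fod(declared_hash: str, fetch):
    """Return the output for declared_hash, running fetch() only on a cache miss."""
    if declared_hash in store:
        # Cache hit: fetch() is never called, so no check inside it can run.
        return store[declared_hash]
    output = fetch()
    actual = hashlib.sha256(output).hexdigest()
    if actual != declared_hash:
        raise RuntimeError(f"hash mismatch: declared {declared_hash}, got {actual}")
    store[declared_hash] = output
    return output
```

If the source moves to a new version but `declared_hash` stays the same, a machine with the old output cached silently gets the old dependencies, while a machine with an empty cache gets a hash mismatch.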
I agree with you that a pure Nix parser won't be of much use in nixpkgs. In order to not be IFD, the lock file would need to be committed, and that's not great. An FOD sounds like a much better solution (in my opinion too).
Another package suggestion I had, if you are interested in testing more, is https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/applications/video/mirakurun/default.nix
In investigating yarn and the madness around it, we found a couple of places that are badly broken. For example, in the 'react' lock https://github.com/facebook/react/blob/main/yarn.lock#L12604, a version of 'react-is' that fits the semver is not included. I'm not saying that the FOD should be mindful of these, just that we might have problems with this approach on some packages. Personally I think as long as we can use this approach, we should still go for it.
I couldn't finish the nixpkgs-review, due to
Regarding the problem of updating the source and not the sha256, perhaps the solution should just be to have update scripts for these packages? Of course, an error can still be made, but it would reduce the potential for such mistakes.
Result of 6 packages built:
Actually I will remove the "convert xyz to fetchYarnDeps" commits again, and we can do that in separate PRs. For now I just want to add fetchYarnDeps with this PR.
e484cb9 to b38d147
I think this is in a pretty good state, with update scripts and all.
If the lock file changes subtly, you'll get a hash mismatch, telling you nothing about the cause of the problem. Such a change could be caused by anything: a formatter, Git CRLF handling, or editors appending or removing final newlines. As usual with FOD arguments, the error will only appear after some time, or when run by someone else, removing the bad error message from its context.
I think you're right. Maybe there's a middle ground where the lock is in a second output. Having two hashes means you can tell which one broke, but it also leads to another slight chance of human error and more implementation complexity. Another way to improve the error context is to incorporate a lock source hash (hash of source path or hash of FOD out path or regular drv hash, depending on how we get the lock file) in the FOD. I can't think of a solution that ticks all the boxes, including a good error message when the lock file changes subtly. I don't think this scenario will be as frequent as forgetting to update the hash, so adding the lock file to the fixed output seems best.
pkgs/applications/networking/instant-messengers/element/element-desktop.nix
FYI before this gets merged, I'd also like to review it, at least the element changes. However, I don't think I'll get to it before the weekend :)
But this going unnoticed is exactly what we are preventing by comparing the yarn.lock in the FOD with the yarn.lock from the current src. If the yarn.lock is edited (for example a newline is appended), the main derivation will tell you that there is a mismatch and this change will not even make it onto master (or will be directly apparent).
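As a rough sketch of that comparison (not the check used in this PR), the two lock files can be compared byte for byte and the first differing line reported. The function name `first_difference` is hypothetical.

```python
from itertools import zip_longest


def first_difference(src_lock: bytes, fod_lock: bytes):
    """Return the 1-based number of the first differing line, or None if identical."""
    if src_lock == fod_lock:
        return None
    # keepends=True preserves line endings, so CRLF vs LF counts as a difference.
    src_lines = src_lock.splitlines(keepends=True)
    fod_lines = fod_lock.splitlines(keepends=True)
    for number, (a, b) in enumerate(zip_longest(src_lines, fod_lines), start=1):
        if a != b:
            return number
    return None
```

Because the comparison runs in the main derivation rather than in the FOD, it can point at the offending line even for changes like an appended newline or converted line endings.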
Of course, we'll wait for your review :)
The only problem is that the FOD may not be buildable. It relies on an FOD with the same output having been built before on the same host or cache. Without a preexisting FOD output, the error will not be informative. I hope we are correct to assume that it's unlikely to happen in practice. As said, I'm ok with adding the lock file to the FOD output. It's a good trade-off.
Without a preexisting FOD output, the error will not be informative. I hope we are correct to assume that it's unlikely to happen in practice
Isn't that a common issue with FODs?
While I'd personally prefer to generate fixed-output drvs while reading a lock-file at eval time (despite increased eval-times, see e.g. https://github.com/edolstra/import-cargo), I acknowledge that this is pretty complex for Yarn's lockfiles.
Since this also avoids relying on tools such as Yarn that bring in potential impurities or imply a risk of hard upgradability in case something changes internally (e.g. the issues we have with Go 1.17), I think that this is a good solution for us, hence 👍
This is quite a unique fetcher because we're in a position to validate the individual files. It lets us give an error message that leads to the cause of the problem, rather than "we downloaded the interweb again and this time it's different". However, this does not apply to the lock file itself if we add it to the output, because we don't have a hash of that.
Yes, it is, and that's why I would normally recommend against complex FODs. This one can provide good error messages, unless the lock file is in the output, in which case it can also produce a bad error message: a hash mismatch with no clue as to what has changed. Someone who knows this fetcher really well might be able to speculate what went wrong, but it's bad UX.
pkgs/applications/networking/instant-messengers/element/keytar/update.sh
dc72676 to 99caa8b
99caa8b to f40ff0c
Anything more left to do? It would be nice to get it done by the end of the week.
I can't think of anything else
Would it be possible that an empty hash doesn't produce
Motivation for this change
Each time we want to add a big application with lots of JS dependencies, someone says it will slow down eval.
Running yarn itself in a fixed-output derivation is relatively high risk (and, depending on who you ask, a misuse of FODs), as changes in yarn might lead to invisible breakage of the FODs.
=> Let's instead only do the downloads of tars and git repos with a small (~100 line) program
Afterwards we can run yarn in a proper derivation, just as if we had imported the offline_cache of a yarn2nix yarn.nix (see gitlab, element).
Inspired by:
Implementation details that need to be worked out:
Things done