Modifying version dependencies doesn't appear to work -- but I like it that way. #52
Comments
+1. If version numbers do not represent equality, we will have to develop a new system in order to guarantee reproducible builds. |
Indeed, the commit is not working. I will have to investigate that. On the more general issue, I do understand the concerns. I think a good way to start this discussion is for me to give a much more detailed explanation, and a blog post is probably a good starting point for that. |
Would it be possible to have that discussion here instead? That way all the relevant details are in one place. I would imagine that if there's a good reason to have this feature then it should be pretty straightforward to summarize the reasons it's important and the reasons why it won't lead to the problems that @creswick outlined. |
The summary is that other systems (e.g. Gentoo) do this successfully. They do track "revisions", and we're doing the same here (with the x-hackage-revision field for now). They also make these revisions visible in the UI when building so people know in case it's relevant. The issue with reproducible builds I think isn't really valid: right now builds are not reproducible unless you fix the versions of all packages, and if you do freeze the versions then it will work with this scheme too. Builds are not reproducible between different users now for a couple reasons: 1. different snapshots of hackage will lead to different solutions in general, 2. the solution the solver picks will also depend on what the user already has installed. In summary I think what we want is:
There's room for discussion and design on both points. For example in gentoo you can select which revision to use if you care, though with gentoo revisions can include patches which is not the case for us. On the second point, we have the mechanisms to do this now (local config files), we just need a convenient UI. |
Part of what concerns me is that I don't understand what problem this feature is meant to solve; any changes to the cabal metadata will require that clients run a With regard to your points:
One way to resolve this (unless I'm mistaken...) would be to append the x-hackage-version value to the cabal version number. Another solution (which I'm /much/ less keen on, but it's better than nothing) is to add support for x-hackage-version to ghc-pkg, cabal and cabal-install before providing the ability to make such changes via Hackage. At least then we could use the current set of tools to determine when a build was failing due to a hackage-based metadata change. |
It seems to me that some potential changes to package metadata will fall under some criteria addressed by the PVP. Therefore a package version number increase should be required for any such change. Since this mechanism already does some validation of the changed cabal file, it should enforce a version number increase if any of the rest of the file changes, and I would be quite happy with this solution. However the scheme of adding |
Yes.
No, that is not the case. This change does not change the source code for the package you are building, so if you are still using the same package versions then nothing has changed. None of the changes allowed in the .cabal file should leak into the build artefacts. The only wrinkle here that I'm aware of is that it is possible to mistakenly over-constrain the dependencies so that a set of versions that did successfully build is now an invalid set of versions. However this is easily detected and fixed, and never leads to building something different, just to not finding a solution. In principle even that could be guarded against as part of a configuration-freeze feature, by picking the exact versions and then overriding/ignoring the package constraints (which is a useful feature independently). So, to address the main Q: why do we want this in the first place, given that it's always possible for the author/maintainer to upload a new version? Yes, that's quite true but in practice, as we all know, it just does not happen. This means that there are many packages that could work, or work together, that are just broken. This is unhelpful to new and casual users. The solution is not to force maintainers to be more attentive, or to make them think about the package collection as a whole, but to create a new class of people who see that as their role. There are people doing this job right now, but they do it for the distros. This duplicates work across all the distros, and people working directly from hackage cannot benefit from it. Once we see that it is a separate role to think about the health of a set of packages rather than a single package, then it is clear that we cannot use the mechanism of uploading new package versions. The package version is solely the responsibility of the author/maintainer, and distro-maintainers must never fiddle with it. Instead distros need their own extra revision mechanism. For Hackage, we are taking a deliberately conservative approach.
In Linux distros, distro packages can and do contain code patches. For hackage I believe that would be a step too far. But we can still gain a lot of the benefits without patching, just by allowing some flexibility with dependency constraints. The issues about tracking revisions, and reproducible builds are very important. I think the answer again comes from the distros who have made this work in practice for years. They typically do use an extra revision number, distinct from the package's upstream version number. For distros this revision number is even more important than it is for us, since they can include code patches. For us the revision number should I think be somewhat less important, however there are obviously places in our toolchain where it may be important from time to time. I'm happy to discuss the best options here. My several years experience with the Haskell Gentoo packages convinces me that in the end it will really be worth it, that we'll be able to make most packages install out of the box most of the time (or at least get a clearer constraint solver error rather than a build failure). |
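To make the over-constraining wrinkle concrete, here is a minimal Python sketch of how a frozen set of versions could be checked against edited bounds. It is not how cabal-install or Hackage does it; `violated_bounds`, the `(lower, upper)` bound encoding, and the package data are all hypothetical and chosen only for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.7.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def violated_bounds(frozen, bounds):
    """Return the dependencies whose frozen version falls outside the bounds.

    frozen: dependency name -> exact version string that is known to build.
    bounds: dependency name -> (inclusive lower bound, exclusive upper bound
            or None), i.e. the hypothetical result of a metadata edit.
    """
    problems = []
    for dep, (lower, upper) in sorted(bounds.items()):
        if dep not in frozen:
            continue  # nothing pinned for this dependency, so nothing to check
        version = parse_version(frozen[dep])
        too_old = version < parse_version(lower)
        too_new = upper is not None and version >= parse_version(upper)
        if too_old or too_new:
            problems.append(dep)
    return problems


# A plan that built before the edit (made-up data).
frozen_plan = {"base": "4.7.0.1", "text": "1.2.0.0"}
# An edit that mistakenly tightens the upper bound on base.
edited_bounds = {"base": ("4", "4.7"), "text": ("1.0", None)}
print(violated_bounds(frozen_plan, edited_bounds))
```

A check along these lines only ever reports "no solution" for the pinned plan; it never selects different source code, which is the point made above.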
BTW, I'm also open to not fully enabling this feature on hackage until we have some tool changes, if that's the general consensus, or constraining what changes are allowed at first. |
This will still be true, in the sense that it is almost true now, and it should in practice almost be true in future. But this particular point I think is not worth arguing over. Since it is only an "almost" solution, we agree that the right solution is to fix all the versions direct and indirect and to make that convenient. People are working on a cabal-install feature to do that now. |
Yes, that's one reasonable approach. For example, Gentoo uses a scheme like foo-1.0-r2. Then the issue is where would we want this reported, and where would it be accepted as input. We never need it in .cabal files. We could start by merely reporting it in the output of cabal-install's build plan and log output "Building foo-1.0-r2". |
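As a rough illustration of the Gentoo-style `foo-1.0-r2` identifier mentioned above, the following Python sketch parses and renders such strings. The `PackageId` type, the regular expression, and the convention that revision 0 means "as uploaded" are assumptions made for this example, not an existing Cabal or Hackage format.

```python
import re
from typing import NamedTuple


class PackageId(NamedTuple):
    name: str
    version: tuple   # upstream version, owned by the author/maintainer
    revision: int    # metadata revision; 0 is assumed to mean "as uploaded"


# name, then "-", then a dotted numeric version, then an optional "-rN" suffix
_ID_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9-]*?)"
    r"-(?P<version>\d+(?:\.\d+)*)"
    r"(?:-r(?P<rev>\d+))?$"
)


def parse_package_id(text):
    match = _ID_RE.match(text)
    if match is None:
        raise ValueError(f"not a package id: {text!r}")
    version = tuple(int(p) for p in match.group("version").split("."))
    return PackageId(match.group("name"), version, int(match.group("rev") or 0))


def render_package_id(pid):
    base = f"{pid.name}-{'.'.join(str(p) for p in pid.version)}"
    return base if pid.revision == 0 else f"{base}-r{pid.revision}"


def same_source(a, b):
    """Two ids refer to the same source code if only the revision differs."""
    return a.name == b.name and a.version == b.version


print("Building " + render_package_id(parse_package_id("foo-1.0-r2")))
```

Keeping the revision as a separate field, rather than folding it into the version, leaves the upstream version untouched while still letting log output distinguish revisions.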
Since this is not currently working (Issue #52)
There has been significant progress recently on haskell/cabal#1519. Once that feature is implemented, and hopefully also haskell/cabal#949, then I don't see any reason not to enable editing dependency version constraints. Even so, the ability to refer to revision numbers both here and in cabal would be very valuable. |
So I'd like to do the minimum possible in Cabal/cabal-install, at least initially, so that we can get this feature rolled out soon. So we could report in cabal install --dry-run and when actually installing packages what revision is being used, but I'd rather avoid (at least initially) providing the ability to specify on the command line which revision to use. That's because the implementation is a lot easier if we don't have to have all the different revisions available in the source index at all times. |
Just to clarify, would this ship with the ability to freeze and ignore constraints then? If so I think it sounds good. |
dcoutts is exactly correct here; fixing the dependencies and version constraints is neither a sufficient nor a necessary component of reproducible builds. Reproducible builds (and exactly how reproducible do you want them to be?) are outside the scope of what cabal is currently capable of, and you are going to have to create significant new functionality regardless. Builds also depend on the package indexes you are compiling against (both installed and what's available), the particular constraint solver you are using, the compiler version you are using, the operating system you are using, any shared objects/dlls your executable depends on, etcetera etcetera etcetera. Freezing the transitive closure of the dependency graph gets you a significant way towards reproducible builds, and would be awesome to have, but you are still going to have to write another system (subsystem of cabal?) to do it. A version should identify the source code of a package, not its dependencies. You need some other identifier (e.g. a manifest of package versions) to identify a build. And there are significant costs to mandating that dependency constraints be immutable. For starters, I would be far less opposed to putting upper bounds on my package dependencies if I could change them after the fact. From a certain intellectual point of view, it's incorrect to advertise an immutable upper bound because I have no idea if a particular version of my package will work with a future version of your package. And in practice, dependency version constraints tend to be not quite right anyway; it's a lot more convenient to release with dependencies that are approximately correct and then fix them up later. But mandating a version bump just to change a dependency means that it's possible to cause other packages to tweak their versions as well.
And the version bump tends to cause the binary haskell distributions for various operating systems to get recompiled as well, when there is no reason to do so if the only thing that has changed is that you advertise modified dependency constraints. |
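The idea of identifying a build by a manifest rather than by a package version can be sketched as follows. This is a minimal Python illustration under stated assumptions: the manifest layout, the `build_fingerprint` helper, and the sample data are hypothetical, and a real manifest would also need to cover the other inputs listed above (operating system, shared objects, and so on).

```python
import hashlib
import json


def build_fingerprint(compiler, pinned):
    """Hash a build manifest into a single identifier.

    compiler: compiler name and version, e.g. 'ghc-7.8.3'.
    pinned:   package name -> (version string, metadata revision number).
    """
    manifest = {
        "compiler": compiler,
        # Sort so that dictionary ordering does not change the fingerprint.
        "packages": sorted([name, ver, rev] for name, (ver, rev) in pinned.items()),
    }
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same versions, but one machine saw a later metadata revision (made-up data).
workstation = {"foo": ("1.0", 0), "bar": ("2.3.1", 1)}
ci_server = {"bar": ("2.3.1", 2), "foo": ("1.0", 0)}

print(build_fingerprint("ghc-7.8.3", workstation))
print(build_fingerprint("ghc-7.8.3", ci_server))
```

Comparing two such fingerprints says only that the recorded inputs differ; the manifest itself is still needed to see which entry changed.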
BUMP I want to fix shit. this would make my life easier, and not require i give myself uploader acls on projects just to fix stale version constraints. I don't care about the philosophical issues, i care about making sure stuff builds on 7.8 without having to (ab)use being able to edit uploaders groups. |
So I'm minded to enable this feature now. We now have a cabal-install release with the freeze feature (and the ability to ignore upper bounds), so I think this addresses most of the issues that people have brought up. There are some further improvements to make, such as displaying somewhere in the cabal-install UI output which revisions are being used, but I don't think that's a blocker. |
+1 :) |
@dcoutts +1 for enabling the feature... (...while hoping it will get used judiciously; I've already noticed it requires a bit of investigative work with older GHCs to find out how to change bounds properly if it's not obvious which build-dep package version was the first one to really break) |
+1 for enabling this |
It seems editing package metadata now works (Thanks!), but the edit page still says "NOTE: This is work in progress. It's not currently actually possible to publish new revisions (see Issue 52)." Can that text go away? |
I would feel better about this feature if the Of course this overlay could also be contained inside the |
But that would still not allow us to say “I want package version 0.1.0.0 from 01-index” would it? AFAICT the only way to solve the conflict of ‘I want newer bounds on my package but don't have time to release right now – maintainer’ and ‘I want to be able to actually specify what goes into my builds and not have that randomly start failing – distributions’ is to make the updates explicitly reflected in package versions. So IMHO if Hackage could generate new package versions for each info bump, this would solve the problem from user perspective. Currently the only way for distributions to deal with this would be to check the Hackage-only cabal file and if it's different, make sure it somehow gets included in the build, effectively resulting in |
@Fuuzetsu Indeed, it's not a perfect solution. Yours is indeed much better in the long run. AFAICS there are (roughly) two use cases:
The second group wasn't served at all that well before this change, while the first group isn't served at all that well after the change. Looking at it from that perspective I hope you can confirm that it would offer a better situation than we have right now. It would give me (and you) the opportunity to just keep doing what we did before the change, while cabal-install users can enjoy a better experience (until they also become grunts). If it's cheaper to implement it can serve as a quick fix for now, until a proper solution (like yours) can be put in place. |
@magthe can you please explain why this hinders reproducible builds? In my experience it's improved. If someone previously had a too lenient upper bound then simply doing a |
@bergmark I'm not a cabal-install user, I am however a user of the As far as I understand it's also broken for all users that can't easily upgrade cabal-install for exactly the same reason. I can of course modify my tooling to mimic how I understand that cabal-install works (copy the .cabal from the |
@bergmark, I was referring to the |
Another weird side-effect of this feature is that it's impossible to tell which package users have installed. The version number looks the same for all x-revisions of the Cabal file. |
@peti the version the user has installed doesn't change. The source code is still the same. |
The source code may be, but how the package is installed might not be. It's easy to change a setting in the cabal file that causes different flags to be used. |
@dcoutts, I'm quoting from http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/113502:
|
Yes, this!
Imagine that the difference is between your workstation (with a cabal update as of 8am) and your CI server (with a cabal update as of 12 midnight). CI fails, but the local machine works, you have 30-60 commits that came in yesterday, and you have no idea when the last OS update happened on the build machine (which runs a different OS, and can only be accessed via a single-user VNC session). Is this cabal/hackage feature really the first thing you're going to check, or are you going to spend the next 4-6 hours trying to figure out why the CI machine broke, or where the platform-specific code got injected in the last 24 hours of hacking? |
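One way to rule this cause in or out quickly would be to diff the package metadata seen by the two machines. The Python sketch below assumes each machine's index has already been loaded into a dictionary from `(name, version)` to the text of the `.cabal` file; that loading step, the `metadata_drift` helper, and the sample data are all hypothetical and are not part of cabal-install.

```python
import hashlib


def cabal_digest(cabal_text):
    """Content hash of a .cabal file, used to spot edits behind a fixed version."""
    return hashlib.sha256(cabal_text.encode("utf-8")).hexdigest()


def metadata_drift(index_a, index_b):
    """Return package ids present in both indexes whose .cabal text differs.

    Each index maps (name, version) -> .cabal file contents.
    """
    shared = index_a.keys() & index_b.keys()
    return sorted(
        pid for pid in shared
        if cabal_digest(index_a[pid]) != cabal_digest(index_b[pid])
    )


# Made-up snapshots: same package versions, one edited dependency bound.
workstation = {
    ("foo", "1.0"): "name: foo\nversion: 1.0\nbuild-depends: base <5\n",
    ("bar", "2.3.1"): "name: bar\nversion: 2.3.1\nbuild-depends: base <5\n",
}
ci_server = {
    ("foo", "1.0"): "name: foo\nversion: 1.0\nbuild-depends: base <4.7\n",
    ("bar", "2.3.1"): "name: bar\nversion: 2.3.1\nbuild-depends: base <5\n",
}

print(metadata_drift(workstation, ci_server))
```

Packages that appear in only one index are ignored here, since those are ordinary new uploads and already show up as version differences.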
The problem here is rather the (IMHO wrong) assumption that having only the information of package I don't know if such a tool exists yet, but something that dumps out that configuration information would be desirable (I've been missing something like that to document all package versions that were used to produce a deployed binary artifact) |
@hvr, you are right, Cabal flags are one more feature with the potential to break reproducible Haskell builds in addition to Hackage's destructive edits because the chosen flag assignment doesn't reflect in the version string. Anyway, I don't think that the existence of Cabal flags is a compelling reason to break reproducible builds in yet another way on top of that. When a Cabal file on Hackage is edited, its version number should be bumped. It is true that this mechanism won't remedy all possible causes for ambiguity, but it does improve the situation over what we have now. |
I'd really love an example where the edits have made fewer things
|
@cartazio, nobody claims that edits make fewer things buildable. The problem is that the builds become unpredictable and non-repeatable. Your the output of a |
I think these examples will be very hard to find in the wild because you can throw up your hands in frustration, run 'cabal update' on every machine, and (most likely) get consistent behavior (modulo sandbox details, etc...). I think that's much more likely to happen than someone actually debugging the actual cause, recording the details, and reporting it. "Rebooting" everything will probably work, but it's frustrating. My whole beef with this change is that it adds potential complexity, and in a way/place that is not directly associated with the software artifacts that we're already accustomed to investigating when a build failure occurs. It adds yet another thing you need to know in order to debug cabal build failures -- and that's _already_ a topic of great consternation. I strongly believe that we should be doing everything we can to make build failures easy to debug. We certainly shouldn't be making it easy to apply duct-tape fixes if/when it adds to the complexity of the debugging problem. My mind is completely blown that the Haskell community has a global, stateful source of information that can potentially cause any package to stop building (without the degree of oversight that goes into compiler development). Aren't we all drawn to Haskell to get away from that sort of complexity? The arguments in favor of this sound (to me) like paraphrases of the justifications for dynamic languages, OOP Singletons, and global variables. Sure, you can use those things to quickly make something work, and with sufficient effort you can continue to do that for some time, but eventually the long-term consequences start to raise issues. |
I totally agree we should work on creating more visibility into understanding how things work. My stance is thus "let's improve the tools and empower" rather than "that feature doesn't work for me, don't add it". Can we focus on the latter in this discussion? What are actionable things we can add to empower? |
Fwiw, this was already so before mutable |
indeed! you almost need to specify a "commit hash" for the entire state of
|
@cartazio: Isn't haskell/cabal#2222 precisely an example of a situation where post-release edits have caused end-user problems? |
@hvr,
again, nobody claims that Haskell builds were perfectly deterministic and reproducible before destructive edits were introduced on Hackage. They were not. Everyone agrees on that. Those of us who care about deterministic reproducible builds, however, feel that this fact is not a compelling argument to make Haskell builds even less deterministic than they already were. Deterministic builds are an important feature for everyone who distributes binaries, i.e. virtually every non-trivial commercial enterprise needs this. IMHO, we should be moving towards deterministic builds, but instead it seems like we are going boldly in the opposite direction. This is a little disconcerting. |
|
Curiously, meta-data editing is a feature that actually helps provide deterministic builds. Without that you can't fix-up existing meta-data that became unsound (due to bounds being too lax) when new package uploads occurred, and would not be able to restore a working So far I've experienced far more such cases where meta-data editing (to add missing version bounds) helped provide deterministic builds, than cases where meta-data editing was done incorrectly (and hindered perfectly fine install-plans). So IMO meta-data updating actually solves real problems we have with Hackage, and reduces the pain more than it adds new pains. |
@hvr, yes, this is true. Destructive edits can be used to remedy issues like the one you've described. Do you agree that it's also possible to edit Cabal files in a way that hurts deterministic builds? Also, it seems like you are arguing against my suggestion that edits should be reflected in the version number. (At least, I never saw you concede that this might be useful.) I would be interested to know what your reasons for that position are? |
@hvr, ping? |
...as with most tools, you can also misuse it to do harm. The question is whether the benefits outweigh the risk of harmful misuse.
Well, since cabal has access to the Does that align with your suggestion or did you have something else in mind? |
@peti said:
@hvr, are you proposing to modify cabal-install so that every command is aware of Hackage-specific package description revision numbers, and users are allowed to declare revision numbers in |
no @mietek, you're proposing that. |
Popping in to say that I do not want hackage revisions to introduce themselves as a suffix onto the Cabal version. If a distro like Nix wants to track that, that's fine, but for the general Cabal user's purposes, I don't think that's a good idea. We already enforce "reproducible builds" much more rigidly than any other language ecosystem I can think of. Python is buzzing right along on completely mutable package uploads, zero type-safety, and dependency management that doesn't even pretend to care about version compatibility. I care about reproducible builds. That's why I don't like Golang's package management story. However, this enforcement of version intersections in Cabal is largely responsible for the difficulties people complain about. I don't want to roll this backward, but making things even more annoying to build is not the direction we should be headed in. I'm in favor of ignoring hackage revisions in Cabal and enabling edits of Cabal files. The former avoids a mess, the latter helps us continue to maintain a higher standard for consistent builds than other language ecosystems. If building things with Cabal gets any more inconvenient or annoying, we are going to alienate a lot of people. Maybe some don't care about that, but I do. Other people are getting along with far less tooling support and soundness. Maybe they aren't doing so "just fine", but the world isn't ending for them either. I know this is the Hackage thread, not the Cabal thread. I don't care. |
let me clarify my opinion on this, because I wasn't articulating it well earlier: I think there are really two / three classes of uses And I guess what my implicit concern in this thread and related ones comes down to: how can we support those who care about use cases (a) and (b), while not adding any new complexity burden on (c). There are a LOT of valid arguments for (a),(b) to be compelling, but it has a HIGH engineering cost for those who don't need it / don't want it. I'd like us to push to solutions that don't put a heavy burden on the people in (c) |
I agree with what @bitemyapp said, and I deeply care about reproducible builds — which is why I built Halcyon. If this wasn’t clear — I’m also in favor of ignoring Hackage revisions in cabal-install, and retaining the ability to edit |
Can we close this ticket? It works now, whatever the controversy 😃 |
In hackage 2, you can modify the cabal file "...edit certain bits of package metadata after a release...". Including to change dependencies.
This doesn't appear to work; but I actually think that this is a dangerous feature to enable.
There is significant risk that this will result in non-reproducible builds; particularly when trying to resurrect older software, but also when debugging during new software releases. If a dependency change is made to the cabal file without changing the version (which is specifically disallowed by the current interface) there is no practical way to verify that the software you are building is the same as the software I am building -- we end up in a situation worse than the dependency hell we've all worked to avoid with sandboxing.
Years of experience teach us to look at version numbers, and to trust in the version numbers to represent equality across systems, but this feature violates that ingrained assumption. I can't imagine trying to remotely debug a build failure caused by a difference in the dependency specification of a transitive dependency.
I strongly suggest that the cabal file modifications be restricted to non-semantically relevant fields: fixing documentation typos, source repos, authors, maintainers, etc., i.e. things that cannot cause cabal to behave differently.
At the moment, this is easily fixed by changing the explanatory text on the 'Edit package metadata...' page (although it appears that modifying descriptions is also broken).