Pkg3: enforcement of immutability? #4

Open
tkelman opened this issue Oct 22, 2016 · 8 comments

@tkelman

tkelman commented Oct 22, 2016

Questions to answer in further revisions:

Will installed packages be read-only (at least until deletion is called for), or strictly checksummed, or immutable by convention only? What consequences, if any, would modifications have? Using code that doesn't match what the manifest specifies was installed isn't good for reproducibility, but what's the intended level of tracking and granularity of this? If packages aren't always git repositories any more, how are generated files and downloaded resources, which would be ignored from a version control perspective, dealt with?

This ties into the bigger separate question of where and how development happens if everything Pkg3 touches is immutable (by convention only, or strictly enforced). If you want to make a local change, does that require a separate installation mechanism and modifiable copy that lives outside of Pkg3 somewhere? Or if you make it locally do you then have untracked local modifications that never get recorded anywhere? (People will forget they've made this kind of change if packages aren't git repos.) Most other package systems work this way, but most other package systems have an unfriendly distinction between the way users work with packages and the way developers/contributors do. The low barrier to entry of contributing to Julia packages is a huge benefit to our ability to get users to become developers.

@nalimilan
Member

I guess we could keep git repos somewhere, and make source-only copies for each used release. A "master" pseudo-version would allow using these instead of releases. That would also be an improvement over the current system for developers, since they would be able to switch easily between the development version they are hacking on and the releases of their own packages.

@tkelman
Author

tkelman commented Oct 23, 2016

Keeping track of whether you're using the development version or an installed version needs to be made extremely clear. Pkg.status() does a decent job right now, but if Pkg3 doesn't handle development versions, how do you track them? There's very little visibility right now into where things get loaded from if they happen to be found somewhere on LOAD_PATH, so I don't think it's an improvement to make that the primary mechanism for developing or modifying registered packages. And the fewer steps it takes for a newcomer to make a change to a line of code, test it locally and turn that into a pull request, the better. Pkg.submit is a really good idea for simplifying this process; it's mostly underused because of authentication-related issues.

@StefanKarpinski
Sponsor Member

I think that read-only, checksumming or convention are all viable approaches. Installing packages read-only seems simple enough to me. We should also record checksums and paths and reflect them in Config and Manifest files. I think a key part has to be that if you're building/testing using development versions, that's reflected in the files you would be committing so that you can see it before you commit and tag something to avoid breaking published packages.
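For illustration only, here is a minimal sketch of what "installing read-only" could amount to at the filesystem level. It is written in Python purely as a neutral language; the function names and the `Example/1.2.3` layout are hypothetical, and nothing here reflects an actual Pkg3 implementation. It clears the write bits on installed files only, so existing files cannot be edited by accident while the directory tree can still be deleted later.

```python
import stat
from pathlib import Path

# Write permission bits for owner, group and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(root: Path) -> None:
    """Clear the write bits on every regular file under root.

    Directories are left untouched, so the tree can still be removed
    when the package manager decides to clean it up.
    """
    for path in root.rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~WRITE_BITS)


def is_read_only(root: Path) -> bool:
    """Return True if no regular file under root has a write bit set."""
    return all(
        path.stat().st_mode & WRITE_BITS == 0
        for path in root.rglob("*")
        if path.is_file()
    )
```

This only guards against accidental edits: anyone with ownership of the files can restore the write bits, so it is still immutability by convention rather than enforcement.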

@rdeits
Copy link

rdeits commented Oct 31, 2016

This is an issue that's particularly important to me as well. I find that being able to poke around inside an installed package to change a line or add a @show is extremely helpful in my day-to-day work. I recognize the disadvantage of allowing this (since I no longer can prove that I'm using exactly the dependencies I've declared), but I think the trade-off is favorable.

I realize that one goal of Pkg3 is to avoid the requirement that every package be stored in git, but if you'll hypothetically ignore that for a moment, then it may be possible to let git help out here.

git 2.5 introduced the git worktree subcommand, which allows a user to maintain multiple checked-out copies of the same git repository, potentially in completely different places on disk. I wonder if it might be possible for each installed version of a particular package to be an independent worktree of the same git repository (with the central repository stored in a single location on disk). This does not help with enforcing immutability, but it would allow us to easily detect that a package had been changed (via git status), without interfering with package development. In this case, it would be just as easy for users to make and commit changes to installed packages (since normal git commands just work inside a worktree). Worktrees can also be safely deleted at any time, which seems to be quite compatible with the Pkg3 cleanup mechanism.

I can imagine users not necessarily wanting to be bound to git for their Julia packages, but I think there are some pretty substantial benefits. The relatively uniform process for downloading and contributing to packages in Julia is one of my favorite things about the language, and I would hate to lose it.
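The worktree idea can be sketched roughly as follows. This is a hedged illustration in Python that shells out to the real `git worktree add` and `git status --porcelain` commands; the helper names, the tag naming and the directory layout are assumptions made for the example, not part of any Pkg3 design.

```python
import subprocess
from pathlib import Path


def git(*args: str, cwd: Path) -> str:
    """Run a git command in cwd and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def add_version_worktree(central_repo: Path, tag: str, dest: Path) -> None:
    """Check out `tag` from the central clone into its own detached worktree."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    git("worktree", "add", "--detach", str(dest), tag, cwd=central_repo)


def is_modified(worktree: Path) -> bool:
    """Return True if the worktree has uncommitted or untracked changes."""
    return git("status", "--porcelain", cwd=worktree).strip() != ""
```

With this layout, each installed version would be a separate worktree of one central clone, and a check like `is_modified` could drive a warning in a status listing. Note that files ignored via `.gitignore` would not show up in this check.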

@tbreloff

tbreloff commented Nov 1, 2016

I agree this is an important issue. I think it would be really nice if a Pkg.add installed a minimal, immutable, non-git-repo version of a specific version of a package, while a Pkg.checkout did a full mutable install, git history and all, into an alternate directory. This alternate/mutable directory would be similar to the ~/.julia/v0.5 directory, and when loading packages with using XYZ we'd first check for a package of that name in the "development" directory.

One nice thing about this separation is that one can easily see/track the packages they are developing, and it would be clear which version would be loaded on a call to using. When a user is done with development, they can just remove the git repo from their development folder and then the tagged release will be used again (and that git history won't hang around in caches, etc.).

I worry that an attempt at immutability for packages in development would make the workflow very cumbersome.
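The lookup order described above (development directory first, then the installed release) can be sketched as a small resolution function. This is only an illustration in Python; `resolve_package`, `dev_dir` and `release_dir` are hypothetical names, and the `name/version` layout for releases is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_package(
    name: str, version: str, dev_dir: Path, release_dir: Path
) -> Optional[Path]:
    """Return the directory a package would be loaded from.

    A mutable checkout in dev_dir takes precedence; otherwise the
    immutable copy of the requested release is used, if it exists.
    """
    dev_copy = dev_dir / name
    if dev_copy.is_dir():
        return dev_copy
    release_copy = release_dir / name / version
    if release_copy.is_dir():
        return release_copy
    return None
```

Removing the checkout from the development directory makes the same call fall back to the tagged release, which matches the "done with development" step described above.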

@rdeits

rdeits commented Nov 1, 2016

@tbreloff I like that idea as well. I really like the ability to modify installed packages in-place, but I do agree that it can get cumbersome to remember which packages I've modified and ensure that those changes aren't lost when I eventually nuke my project-specific JULIA_PKGDIR folder.

@StefanKarpinski
Sponsor Member

I think @tbreloff's idea is a good one. Packages in development certainly shouldn't be immutable; that almost sounds like an oxymoron. I would also point out that unless Julia proactively verifies that code loaded from an "immutable" install matches a given cryptographic hash, immutability is always going to be by convention: you can go edit the code if you want to. I don't think we should prevent that, but I don't think it should be the recommended way to edit packages. If you edit .julia/packages/Example/1.2.3 that would affect every user of the 1.2.3 version of the Example package. We can certainly allow that, but we should probably print a warning in that case so that you don't edit some version and then forget that you've changed it.
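As a rough sketch of the "verify against a recorded hash and warn" behavior, the following Python snippet hashes an installed tree and emits a warning on mismatch. The choice of SHA-256, the hashing scheme (sorted relative paths plus length-prefixed contents) and the function names are all assumptions for illustration; the thread does not specify any of them.

```python
import hashlib
import warnings
from pathlib import Path


def tree_hash(root: Path) -> str:
    """Hash all files under root: sorted relative paths plus contents."""
    digest = hashlib.sha256()
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*")
        if p.is_file()
    )
    for rel, path in files:
        name = rel.encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (name, content) boundaries unambiguous.
        digest.update(len(name).to_bytes(8, "big") + name)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def check_installed(root: Path, expected: str) -> bool:
    """Warn and return False if the tree no longer matches its recorded hash."""
    matches = tree_hash(root) == expected
    if not matches:
        warnings.warn(
            f"{root} does not match its recorded hash; "
            "it may have been edited in place"
        )
    return matches
```

The hash would be recorded at install time and checked at load time; the edit is still allowed, but it no longer goes unnoticed.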

@StefanKarpinski
Sponsor Member

I should also point out that I don't particularly want public Julia packages to start using a menagerie of different version control systems. Dictatorial as it may be, I also think that the fact that all packages use git is a good thing. Private packages might be a different story, however, since some companies are pretty much locked into using something that is not git for development. It would be nice to allow for that situation. But the real motivation for moving away from identifying package versions by git commit hashes is the history business and the fact that SHA1 is no longer really secure.

c42f added a commit that referenced this issue Sep 14, 2019
Add a few more links
5 participants