Trying to convert a large repo to glide, not sure what's up #300

Open
thockin opened this issue Mar 4, 2016 · 13 comments

@thockin
Contributor

thockin commented Mar 4, 2016

I'm trying to see how Kubernetes would do with Glide. I ran glide init and it imported our Godeps - great start.

First thing I notice is that Glide imported a lot of files that Godeps didn't. I would prefer not to bloat our repo any more than I have to. Specifically:

$ ls Godeps/_workspace/src/github.com/coreos/rkt/
api  LICENSE

$ ls vendor/github.com/coreos/rkt/
api            common        CONTRIBUTING.md  Godeps      MAINTAINERS  pkg         scripts     store        vendoredApps
app-container  config.guess  DCO              install-sh  Makefile     README.md   stage0      tests        version
autogen.sh     config.sub    dist             LICENSE     makelib      rkt         stage1      tools
CHANGELOG.md   configure.ac  Documentation    logos       networking   ROADMAP.md  stage1_fly  Vagrantfile

Second, as it imported, I saw a lot of "conflict" yellow messages fly past. What I really want is for it to stop and ask. "You have foo.com/bar @12345. Your new import qux.com/zrb vendors foo.com/bar @86753. Do you want to (k)eep what you have, (c)hange to the new version, or (a)bort". Something like that. Even better if it could tell me that 86753 was newer than 12345. Possible?

Third, I want to update one dep. glide get -u github.com/coreos/rkt/api/v1alpha spews a lot of stuff to the screen, but doesn't actually update anything. "Package "github.com/coreos/rkt/api/v1alpha" is already in glide.yaml. Skipping". I asked for an update - why is it skipping?

Fourth, all of the vendor/* dirs still have their .git directories. I am not a git wizard, but that seems wrong to me. Certainly different than Godeps.

I could use some help understanding these - am I doing something wrong?

@mattfarina
Member

@thockin I'm happy to try and help.

We like the unix philosophy. Particularly:

combining "small, sharp tools" to accomplish larger tasks

Glide handles the package management side of things, but that doesn't mean you need to store the external packages in your project's repo. That's up to you, and many, for a variety of reasons, choose not to store external packages in their project's repo. It's common in most other modern languages not to store them.

That being said we're not opposed to this kind of vendoring, we support it to a certain extent, and I'm currently working to make that aspect easier.

The glide init command figures out what's being used and will import information from Godep and others. The glide.yaml file is a configuration file designed to hold intent, versions (including branches, tags, and commit ids), and version ranges (like semver ranges). Running glide update will download the external dependencies, resolve versions, make sure the complete tree is available, and generate a glide.lock file with the complete tree pinned to commit id.

With a glide.lock file the glide install command will reproducibly install all the same versions to the vendor/ folder. If the glide.lock file isn't present the update action will be performed.

Also, there's a flag to update vendored packages when VCS data isn't available (it's a flag because we make this opt-in): use glide up --update-vendored if you vendor as you do. VCS data for those that were vendored in the first place is not there on vendored repos after an update.

To update a dependency it's glide up [PACKAGE]. The reason you see a bunch of other things happen is because Glide attempts to re-resolve versions given this new data. It will look at SemVer ranges and work out what versions to use before pinning to commit ids again.

In Glide the first version specified in the glide.yaml or encountered wins. If you need to specify a version it works to put it in there.
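The "first version wins" rule can be sketched in Go as follows. This is an assumption-laden illustration, not Glide's resolver: the `Dep` type, `resolveFirstWins`, and the version strings are all hypothetical, and the real tool also handles SemVer ranges, which this ignores.

```go
package main

import "fmt"

// Dep is a hypothetical (name, version) pair in the order it was encountered.
type Dep struct {
	Name    string
	Version string
}

// resolveFirstWins keeps the first version seen for each package name.
// A later, different version is recorded as a conflict and otherwise ignored.
func resolveFirstWins(deps []Dep) (map[string]string, []string) {
	pinned := make(map[string]string)
	var conflicts []string
	for _, d := range deps {
		have, seen := pinned[d.Name]
		switch {
		case !seen:
			pinned[d.Name] = d.Version
		case have != d.Version:
			conflicts = append(conflicts,
				fmt.Sprintf("%s: keeping %s, ignoring %s", d.Name, have, d.Version))
		}
	}
	return pinned, conflicts
}

func main() {
	// Placeholder revisions; entries from glide.yaml come first, so they win.
	deps := []Dep{
		{"github.com/golang/protobuf", "12345"}, // listed in the top-level glide.yaml
		{"github.com/coreos/rkt", "abcde"},
		{"github.com/golang/protobuf", "86753"}, // wanted by a transitive dependency
	}
	pinned, conflicts := resolveFirstWins(deps)
	fmt.Println("protobuf pinned at", pinned["github.com/golang/protobuf"])
	for _, c := range conflicts {
		fmt.Println("conflict:", c)
	}
}
```

Under this rule, listing a package explicitly in glide.yaml is what guarantees its version is the one encountered first.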

Other details are in the docs.

That manages the dependencies. But, what about storing just the packages?

We take the safe road for the masses. Some companies and projects don't want stripping, and with some combinations of licenses you can't vendor and then distribute in open source. So, it's all opt-in.

The first step is stripping VCS data in the first place. I wrote about that yesterday.

For the next release I'm working on what's needed to enable stripping packages not being used. It will follow the unix philosophy (allowing to be combined with other tools). We'll make it easy to work with Glide, too.

We already have the ability to resolve just the package tree in use. I'm estimating that the next release of Glide with more details on this will come out in March.

Does that help?

I'm in the k8s slack channel (username is mattfarina) if you want to talk sometime when I'm in there. I'm happy to help.

@thockin
Contributor Author

thockin commented Mar 4, 2016

doesn't mean you need to store the external packages in your project's repo

We have had upstream projects disappear, move, and have git history re-written. In fact all of those have happened. There's no way we're NOT going to copy deps into our repo. :)

That being said we're not opposed to this kind of vendoring, we support it to a certain extent

I am having a hard time comprehending this statement. It APPEARS that glide supports it directly - in fact far more easily than godeps. The only thing that is confusing to me is that godeps trims what it stores down to the bare minimum, and glide doesn't. It's not a deal killer (I think), just different and less efficient. Additionally, some of our deps use Godeps, which means we copy their Godeps/_workspace/src tree into our vendor tree, which is just inane.

VCS data for those that were vendored in the first place is not there on vendored repos after an update

That is not what I am seeing. I am seeing all vendored deps retaining their .git directories.

The first step is stripping VCS data in the first place. I wrote about that yesterday.

Blech! Sorry. I want less deps and less tools and less steps for this. Our instructions are already too complex. I'd reaaaaaaallly rather this as a --strip-vcs-data flag to go along with --update-vendored.

I did read the docs, FWIW. It's a little tricky because what you describe as "vendoring" isn't exactly what I/we think of as vendoring. Vendoring without copying deps into my tree seems almost pointless (or hopelessly naive, at best). As such, I was a little surprised that the whole CLI doesn't do that by default. :)

I'll try to take another look today or this weekend. The transition has revealed that we have some existing deps that use godep rewriting, which means that glide imports their Godeps as packages. I want to update those, but they have a tree of deps that ALSO need to be updated. This is terrifying and tedious, and something I really wish the import tool would help with. Specifically, consider this.

kubernetes depends on github.com/coreos/rkt/api/v1alpha, which transitively depends on github.com/golang/protobuf/proto

I want to update rkt to a newer release. I put their semver tag into glide.yaml and run some glide command to re-vendor it.

The build fails because newer rkt needs a newer protobuf. What other newer libs does it depend on? I have no idea. Now I am scared to update because I will almost certainly miss an update, and I do not have time to go through rkt's Godeps file by hand and see if the git hashes they use (because, sadly, it is all git hashes, not semvers) are "newer" or "older" than the ones I already have vendored.

I want glide to say "I see you're updating 'rkt' and that has its own deps. I will go through them one-by-one and offer you a choice". If glide doesn't do that, then I have to do it by hand, and anything I do by hand I will probably screw up. Am I asking for something impossible?

I feel like dep management is a Really Big Deal and should at least have an option to defer to humans, rather than ignoring conflicts. At least that way, when it screws up, it is my fault rather than yours :)

@technosophos
Member

@thockin I've been curious about the legal ramifications of modifying packages (like Godeps does) and then checking in the modified code to your source repository. Have the attorneys at Google vetted that procedure? There's no danger of triggering, say, the LGPL viral clause because of that, is there?

@thockin
Contributor Author

thockin commented Mar 5, 2016

I feel like I might hack this weekend, but I'd love some guidance.

  1. Are you saying you're against stripping out VCS metadata in glide itself? I can't see any reason why a --update-vendored run would want to keep vcs info, and I am not super keen on making our instructions even more complicated...

  2. Any feeling on an "interactive mode" which lets the user choose which dep to use in case of conflict?

@mattfarina
Member

@thockin thank you for the feedback. We're discussing what to put in the 0.10 release right now. So, the timing is great.

I understand why you want to check external packages into the k8s repo. I've seen some of the k8s dependencies go away or move without a redirect being put in place. Working with other languages that fetch them at install time, even from GitHub, I've not encountered the level of change I've seen in k8s.

My hope is to support both the case where someone stores in their repo or fetches at install time.

Decisions like this we (@technosophos and I) talk about. So, I'm going to try and get some of his time in the next couple days to talk this out. We've had a number of deep discussions on this so I want to make sure we're on the same page. I'll post back here as soon as we have some direction.

@mattfarina
Member

@thockin Here's what we're thinking. This would be for the 0.10 release coming out the middle of this month.

  • Add a --strip flag that removes the VCS data (directories like .git).
  • Add an interactive resolver for conflicts. This may still be a little bit of manual work for a project with the number of dependencies k8s has, but it's possible. I've got some UI ideas to aid the person making the decisions on conflict resolution.

I'm struggling to see a good experience if packages are stripped from a repo. If the packages are stripped and someone working on the project needs another package that's part of a library they are already using, they'll need to go out and fetch it again. This can create a complicated, and sometimes difficult, experience for anyone doing active development. If you see something else or a way to make a good experience, I'm happy to talk about it.

If you want to discuss any of this I'm happy to connect on slack (I likely won't be back on there until Monday) or jump on a Hangout. Just let me know.

@thockin
Contributor Author

thockin commented Mar 5, 2016

A --strip flag sounds perfect and an interactive resolver will at least give me a chance of doing the right thing. I know that some of it will be manual, but even just pausing and asking me which one I want gives me a big leg up on hunting it all down manually.

I'm hacking a little bit now, just to understand the flow of the tool and to see if I can make the output more obvious. Two things jump out. First, there's a lot of output I don't care about ("X is already set to version V. Skipping update") that I would like to suggest not be output at all (or relegated to debug). Second, there's stuff in debug that is interesting that is NOT printed by default ("X imports Y") which I think would give a lot of context.

I'd like to make some PRs around this just so I can understand the tool better - objections?

I also want to augment the "Conflict" messages to make it clear why there is a conflict. "X (@fe345def28592fbc) imports Y @47369daab367, but you already have @925255deaf24c. Skipping". As a first step towards interactive mode, this would be super useful. It looks like it requires a significant amount of plumbing to pass around a queue of structs that represent the graph edges (from, to) rather than just the dep. Objections?
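A rough Go sketch of the edge-based idea: carry (from, to) pairs instead of bare deps, so the message can say who asked for what. The `Edge` type and `conflictMessage` are hypothetical names and do not reflect how Glide's internals are structured.

```go
package main

import "fmt"

// Edge is a hypothetical dependency-graph edge: the importer From, at
// revision FromRev, wants package To at revision ToRev.
type Edge struct {
	From, FromRev string
	To, ToRev     string
}

// conflictMessage explains why an edge disagrees with what is already
// pinned. It returns "" when the package is new or the revisions match.
func conflictMessage(e Edge, pinned map[string]string) string {
	have, ok := pinned[e.To]
	if !ok || have == e.ToRev {
		return ""
	}
	return fmt.Sprintf("%s (@%s) imports %s @%s, but you already have @%s. Skipping",
		e.From, e.FromRev, e.To, e.ToRev, have)
}

func main() {
	// Revisions are the placeholder hashes from the comment above.
	pinned := map[string]string{"Y": "925255deaf24c"}
	e := Edge{From: "X", FromRev: "fe345def28592fbc", To: "Y", ToRev: "47369daab367"}
	if msg := conflictMessage(e, pinned); msg != "" {
		fmt.Println(msg)
	}
}
```

The same edge value would give an interactive mode everything it needs to phrase its question.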

@thockin
Contributor Author

thockin commented Mar 5, 2016

As for minimizing packages, I think that is the least of my concerns. If that is the only thing about the changeover that gives me pause, I'll be pretty darn content.

@mattfarina
Member

@thockin no objections on PRs. We like PRs. I'm mostly unavailable until Monday. I'll look at them this week.

@thockin
Contributor Author

thockin commented Mar 5, 2016

No sweat. I just don't want to spend a lot of time on something if you can recognize it as a dead end right away.

@sgotti

sgotti commented Mar 7, 2016

@thockin @mattfarina I wrote some thoughts on vendor flattening and nested vendor folders in #303

Additionally I wrote an initial glide vendor cleaner here: https://github.com/sgotti/gvc (will probably be renamed to glide-vc and I'm preparing other fixes/changes thanks to @mattfarina suggestions).

It's not clear to me if, as I understand from some comments here, something similar will be part of glide. In this case I'll be happy to open a PR to merge it in glide.

@mattfarina
Member

@thockin @sgotti if you have a moment, can you weigh in on #318? It relates to this issue.

@mattfarina
Member

#319 is an issue to track an interactive mode.

4 participants