
Improving or replacing godep for dependency vendoring #44873

Closed
cblecker opened this issue Apr 24, 2017 · 33 comments
Labels
area/build-release lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience.

Comments

@cblecker
Member

cblecker commented Apr 24, 2017

There are a number of issues with godep for dependency management including but not limited to:

  • godep requires packages in your GOPATH to match the version in your Godeps file
  • godep doesn't cache specific versions of the dependencies -- again they have to match what you have in your path
  • there isn't a built in way to verify the deps are up to date -- you need to restore and download and then diff
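
The restore-and-diff dance the last bullet describes amounts to something like the following sketch (the paths and tree layout are fabricated for illustration; this is not k8s's actual verify script):

```shell
# Hypothetical sketch of verifying deps by restore-and-diff. "restored" stands
# in for what godep restore would fetch per Godeps.json; "vendor" stands in
# for what is checked in. Both trees here are fabricated for illustration.
work="$(mktemp -d)"
mkdir -p "$work/restored/example.com/dep" "$work/vendor/example.com/dep"
echo v1 > "$work/restored/example.com/dep/ref.txt"
echo v1 > "$work/vendor/example.com/dep/ref.txt"

# The actual "verification" is just a recursive diff of the two trees.
if diff -r "$work/restored" "$work/vendor" >/dev/null; then
  status="deps up to date"
else
  status="deps out of date"
fi
echo "$status"
```

There is no single built-in command for this; the download-then-diff step is exactly the workaround the bullet complains about.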

Overall, this creates a barrier to contributors if they haven't used godep. Long term, we may want to look at moving to dep, but it's still in alpha and won't be released for a while.

I wanted to open up a discussion around possibly moving to another vendor dependency solution such as Glide. It may help deal with things like vendor caching and pinning specific versions without destroying your GOPATH. What may be complex, however, is the staging directory and having that vendor correctly with this tool.

Does anyone have any thoughts or opinions?

Related issues:

cc: @kubernetes/sig-contributor-experience-misc @sttts @thockin @caesarxuchao

@cblecker
Member Author

cblecker commented Apr 24, 2017

/area build-release
/sig contributor-experience

@thockin
Member

thockin commented Apr 24, 2017 via email

@cblecker
Member Author

I don't think it's EOL'd, but perhaps @technosophos (who is also involved with kubernetes/helm) can help us out.

@spxtr
Contributor

spxtr commented Apr 24, 2017

I'm not a fan of moving to glide if we're planning on moving to dep one or two years later once it's stable.

@cblecker
Member Author

@spxtr I guess it depends on the effort involved. If it's a low bar to move to glide and some of the bugs Tim saw last year are fixed, then it may significantly improve the experience for new contributors, even if it's just for the 1-3 years until the first-party dep tool is released (which may not even end up working for us initially, because of the way kubernetes does things like staging/).

Right now godep, as is, presents some huge barriers. Maybe those can be fixed with a layer of bash-script duct tape, but perhaps glide is easier/better.

@spxtr
Contributor

spxtr commented Apr 24, 2017

Things that use godep in a nontrivial way:

  1. hack/verify-godeps.sh will need to be replaced with an equivalent script.
  2. hack/update-godep-licenses.sh ditto.
  3. hack/update-staging-godeps.sh
  4. staging/copy.sh
  5. staging/godeps-json-updater.go
  6. Community docs, and others.

I probably missed some; I just did `ag godep` in k/k. I'm not sure how the staging stuff will work out, since I'm not sure what the plan is for the staging directory long-term.

@caesarxuchao
Member

If we can get rid of staging/ and develop in the repos directly, items 3, 4, and 5 will go away, though that probably requires more effort than switching to glide.

@eparis
Contributor

eparis commented Apr 24, 2017

I tried to use dep a couple of months ago and it fell over for our repo. I filed some bugs and they seem to have fixed everything I found. Haven't gone back to try again...

@ericchiang
Contributor

ericchiang commented Apr 27, 2017

We routinely have issues with the godep instructions failing, not producing the correct change, taking forever, using too much memory, etc. Internally, CoreOS moved from godep to glide about a year ago because the tool routinely ate into developers' time, and I've seen Kubernetes contributors have similar issues with the tool (e.g. #42669 (comment)).

I think moving off godep is something worth pursuing.

Even if we want to wait for dep, we should probably figure out early if it'll work for a repo this size. It'd be really unfortunate if we waited for dep to go stable, found out it didn't work for Kubernetes, then were stuck with godep until the next tool came along.

Having been fighting with godep all day, I'd be happy to help with any work in this direction.

@0xmichalis 0xmichalis added the sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. label Apr 27, 2017
@cblecker cblecker changed the title Investigate using Glide instead of godep for dependency vendoring Improving or replacing godep for dependency vendoring Apr 29, 2017
@cblecker
Member Author

I had a look at using glide a couple days ago, but my experiment didn't go well due to the complexity of k/k. I ran into two different bugs, as well as issues with the staging dir. The other issue with glide is the way it interacts with bazel: the way it handles subpackages doesn't play well with bazel's build files. I have a feeling it could be worked around, but it would be nontrivial for sure.

Very timely, there was also some movement on dep (golang/dep#170 (comment)). It looks like @ericchiang is taking a poke at it.

There's still more investigation to be done.

@sdboyer

sdboyer commented Apr 29, 2017

Even if we want to wait for dep, we should probably figure out early if it'll work for a repo this size. It'd be really unfortunate if we waited for dep to go stable, found out it didn't work for Kubernetes, then were stuck with godep until the next tool came along.

We dep folks are trying to actively pay attention to y'all (and I've not forgotten my discussions with @thockin), though there are a lot of plates spinning right now 😄. If someone could continue poking at it regularly, that'd be the best way to ensure we're continuing to iron out bugs that only projects the size of k8s tend to run into. For my purposes, anyway, I think just giving it a poke once a week should be plenty.

For example:

because of the way kubernetes does things like staging/

I don't know how k/k uses staging/, but if it's any kind of well-defined use pattern, it'd be good to know about. (If it's a question of forcing the tool to ignore code within that dir, that's possible now - though wildcard ignores are not supported).

or:

The way [glide] handles subpackages doesn't play well with bazel's build files.

While dep's architecture is quite different from glide's, I've no idea what the underlying issue here might be, as I don't know bazel. This is exactly the sort of thing that I'd love to get information about - and the sooner the better, so that we can validate our architectural decisions. Because...

it's still in alpha and won't be released for awhile.

We're actually trying to move towards, at least, a stable manifest and lock (golang/dep#276), so that folks can feel safe about committing those files and using dep, for real. I'm hoping we land that by mid-May. My cautious side tells me it'd still probably be premature for k8s to try to switch at that point, but stabilizing the command set comes next (golang/dep#277), after which I'd say it should be safe for y'all to evaluate in earnest.

@cblecker
Member Author

Commenting to link related issue: kubernetes/client-go#78

@cblecker
Member Author

To update on this:

  • hack/update-staging-godeps.sh has been updated so it works inside a docker container now. This seems to be working well, although the execution time of this script has gone from ~260s to ~460s as it's restoring the godeps a second time (Jenkins does this once to the "real" gopath, and this script does it a second time inside the container). But, this step no longer relies on the user's environment other than having docker installed.
  • client-go is now authoritative so a number of other scripts (including staging/copy.sh) are now no longer needed and have been removed (kudos to @caesarxuchao and @sttts! 🎉 )
  • The next steps as I see it are to work on:
    • Move hack/verify-godeps.sh to run in a container as well
    • Clean up jenkins scripts as they wouldn't need to restore godeps at all into the "real" gopath
    • Add scripts/tooling for contributors to easily add/remove/change godeps via the containers without needing to alter their local environment. Something like hack/godep.sh.

This would get us to the point that users wouldn't need to change their local environment/GOPATH to be able to work on k8s, or to run the make verify steps on their PRs. These steps would also hopefully make it easy to migrate to another dependency manager like dep at a later date without needing to change contributor workflows.
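
The hack/godep.sh idea mentioned above could be sketched roughly like this (the image name and mount path are assumptions for illustration, not the real k8s build image):

```shell
# Hypothetical wrapper: run godep inside a build container so contributors
# never alter their local GOPATH. "kube-build" is an illustrative image name.
# DRY_RUN=1 prints the command instead of invoking docker.
godep_in_container() {
  cmd="docker run --rm -v $(pwd):/go/src/k8s.io/kubernetes kube-build godep $*"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

DRY_RUN=1 godep_in_container save ./...
```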

Please let me know if anyone has any thoughts. I'll operate on lazy consensus and if there are no objections, I'll continue work on this 😃 .

@liggitt
Member

liggitt commented Aug 10, 2017

although the execution time of this script has gone from ~260s to ~460s

that's really unfortunate... godep already dominates the verify CI job... ~30 minutes of the 1 hour run time from what I've seen debugging timeouts on https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/pr-logs/pull/49642/pull-kubernetes-verify/

@cblecker
Member Author

It's because we're restoring the godeps potentially up to 4 times depending on a couple different factors. However, once we transition to restoring the godeps in a container and passing that data container between steps, this should drop the time by a good chunk.

@liggitt
Member

liggitt commented Aug 10, 2017

we should not knowingly merge changes that drastically lengthen the run time of the godep scripts... the verify CI job is already by far the longest running job on the merge queue (it's now at 1h9m when godeps are verified, the next longest job is at 45m)

@liggitt
Member

liggitt commented Aug 10, 2017

The next steps as I see it are to work on:

  • Move hack/verify-godeps.sh to run in a container as well

I'm especially concerned about this step taking longer... it currently takes ~18 minutes

k8s-github-robot pushed a commit that referenced this issue Sep 30, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Godep scripts cleanup

Another try at sanitizing godep scripts.  Instead of messing with containers, or assuming anything about GOPATHs, let's just work with whatever GOPATH we are given.  If you want sane godep/restore behavior you can use `./hack/run-in-gopath.sh ./hack/godep-restore.sh`.  This will restore into your _output dir.  You can update deps yourself, and then run `./hack/run-in-gopath.sh ./hack/godep-save.sh`.

xref  #44873

This also checks out godep into your working GOPATH.  Without this, we have to `go get` it every time.  We should just vendor it.
@ibuildthecloud
Contributor

I want to throw my hat into the ring here and propose trash (https://github.com/rancher/trash). Trash is not a beautiful piece of work. It was designed to be quick and dirty. It took inspiration from Docker's old vendor script, which just took a list of repos, cloned them, and deleted unused code. The usage model is: edit vendor.conf in the root of your repo, add `${REPO} ${HASH}` lines, run trash, and it builds the vendor folder. It does just that, but very fast. It is not a dependency manager; it's a vendor folder builder. After trash was written, the Docker folks replaced their vendor script with vndr (https://github.com/LK4D4/vndr), which is almost exactly the same as trash, but for whatever reason trash is faster. I tried out trash on the k8s repo.

$ time trash
real	0m13.083s
user	0m17.876s
sys	0m3.881s

So it takes 13 seconds to process k8s (on my i7-7600U laptop, with SSD). That is assuming all git repos are already cached (trash maintains a cache of repos in ~/.trash-cache). If you have to pull the git repos fresh, it just depends on bandwidth and all that.
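
For reference, a vendor.conf in the format described above might look like this (the repos and revisions are purely illustrative):

```
# vendor.conf: one dependency per line, as "repo revision"
github.com/golang/glog 23def4e6c14b4da8ac2ed8007337bc5eb5007998
github.com/spf13/pflag v1.0.0
```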

The output of trash is not a perfect match of what Godeps produces, though it's not far off. I could get an engineer at Rancher Labs to modify trash specifically to make it work for k8s. But I'd rather not do that if it won't get accepted. I agree that long term dep is the way to go. But I'm impatient. So what I'm proposing is that we replace Godeps with trash, and trash will be modified to match k8s oddities (like staging/). One day when dep is done, we trash trash (see what I did there?) and move to dep.

@sttts
Contributor

sttts commented Dec 30, 2017

So what I'm proposing is that we replace Godeps with trash and trash will be modified to match k8s oddities (like staging/). One day when dep is done we trash trash (see what I did there?) and move to dep.

As written on Twitter, I have my concerns about switching to some intermediate tool (before dep). Main reason: there are more artifacts and consumers than in a "normal" Go project. We have to

  1. support something like staging/
  2. output Godeps.json at least for the published staging repos, preferably also for k8s.io/kubernetes itself because downstream consumers expect it.

There are reasons that godep is as slow as it is. Compare my old tools/godep#538. If speed is our only issue, we can also fork godep.

In general, changing the vendoring tool can easily disrupt the workflow of k/k devs, possibly the publishing tool chain, and a number of downstream consumers. As much as I hate godep and would love to see us move to something better, we have to be careful to understand all the consequences. At the very least, such a switch must be prototyped with the new tool before we can realistically discuss whether it is worth it. The devil is in the details. If trash happens to be a drop-in replacement hidden in our hack/ scripts, 30x faster, but with the same artifacts (Godeps.json for k/k and all the staging repos), this prototype could be very convincing though.

@sttts
Contributor

sttts commented Dec 30, 2017

So it takes 13 second to process k8s (on my i7-7600U laptop, with SSD). That is assuming all git repos are already cached

Afaik in CI we do not cache. Am I wrong? What does it look like without caching? @liggitt mentions 18 min, which trash has to beat.

@ibuildthecloud
Contributor

@sttts Yep, I totally understand the concerns with the switch, which is why I'm asking if anybody would want this. Basically what I'm saying is that I'm willing to bend trash to the requirements of k8s. It's our stupid little tool and it works well for our needs. We work more and more with the k/k code base, and frankly godep is like pulling teeth for us, since in our day-to-day work we usually use a much faster tool. So I'm more than willing to support Godeps.json and make it produce the same output.

It basically comes down to this: do you want the "correct" solution, or one that "works"? I think everybody agrees that long term, dep should be the correct answer. For that to happen, dep needs to mature, and k8s needs to change to match how dep works. Going with trash would be something that just works, but it won't exactly be a pretty solution. It would mostly keep the status quo and just make things faster and easier to deal with. So it's a hack...

If you aren't caching the git clones in CI then it's hard to make things very fast. I just tried running trash using an empty cache on k8s repo and it takes about 10 minutes just to clone the dependencies using trash. That was on a 2GB digital ocean VM (so not super fast machine).

@cblecker
Member Author

cblecker commented Feb 5, 2018

An update on this:

  • godep as it is is still... well, godep. It's still awful, but the scripts around it have been touched up to be slightly less awful. There are updated docs here about how to work with it as it stands currently in k/k. We've also made some changes to speed things up in CI when running the verify job, as well as to make it clearer and more verbose when things fail.
  • The direction forward seems to be coalescing around dep. Development on dep has been rapid, and @sttts has a mostly working prototype here: EXPERIMENTAL: golang/dep support #57795
  • We also have dep running on k/test-infra, and have been working through some of the kinks in the interactions between dep and the bazel build system.

As @sttts mentioned, actually moving away from godep is a big change, and will be disruptive both to those who vendor kubernetes (and the derivative staging repos) and to developers working on kubernetes itself. With how close dep is to working for us, I don't think moving to another vendoring tool makes sense.

This was referenced Feb 5, 2018
@justinsb
Member

justinsb commented Feb 5, 2018

Adding notes about how we got this to work for kops (mostly). Originally posted in #59332, but reposting here.

With the latest version of dep and a few tweaks, it now appears to be possible to get dep working with kubernetes.

Here are some learnings from getting dep to not crash for kops (cf kubernetes/kops#4382 ):

  • We created a script kubernetes/kops@9fbbbcb which compares the versions in Godeps.json to the versions resolved by dep
  • dep version resolution can be complicated, so we ended up effectively locking every dependency. This is probably no bad thing, as otherwise it seems that introduction of new code in either the repo or a dependency could change the versions resolved (i.e. the builds might not be repeatable). (cf kubernetes/kops@8c18d03 )
  • requires blocks let us declare dependencies that are used by our build script, for example for apimachinery code generation (cf kubernetes/kops@69ff65d )
  • github.com/chai2010/gettext-go has a weird dependency in the version declared in Godeps.json (at least in the 1.9 branch), so we had to deviate from that version and use bf70f2a70fb1b1f36d90d671a72795984eab0fcb (hat-tip to @eparis for the fix)
  • For use with bazel, we use some horrific trickery: we dep ensure, then we delete all the bazel files from vendor, then we run bazel with bazel run //:gazelle -- -proto disable. For projects which use proto in bazel, we then need to revert the BUILD.bazel changes outside the vendor directory (as we only want --proto disable in the vendor directory). (cf kubernetes/kops@61b23a6)
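
The bazel trickery in the last bullet, as a rough sketch: dep and bazel themselves aren't invoked here; only the "delete all the bazel files from vendor" step is exercised, on a fabricated tree.

```shell
# Sketch of the dep + bazel workflow described above, on an illustrative tree.
work="$(mktemp -d)"
mkdir -p "$work/vendor/example.com/dep"
touch "$work/vendor/example.com/dep/BUILD.bazel" \
      "$work/vendor/example.com/dep/dep.go"

# 1. dep ensure                                # (not run in this sketch)
# 2. delete generated bazel files from vendor/
find "$work/vendor" \( -name BUILD -o -name BUILD.bazel \) -delete
# 3. bazel run //:gazelle -- -proto disable    # (not run in this sketch)

remaining="$(ls "$work/vendor/example.com/dep")"
echo "$remaining"
```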

It's not ideal; we're basically using dep as a glorified wrapper around git pull, but at least it finally can be made to work.

One intriguing option: this could solve our problem of coordinating versions for a release in a multi-repo world: we could publish a list of the dep overrides for each release. In theory dep won't use those overrides unless the source code actually uses them (in theory).
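
In dep's Gopkg.toml terms, such a published override list might look roughly like this (the names and versions are illustrative, not a real release set); note that an override only takes effect if the project actually imports that dependency, which is the behavior the idea above relies on:

```toml
# Hypothetical overrides pinning a coordinated set of repos for one release
[[override]]
  name = "k8s.io/apimachinery"
  branch = "release-1.9"

[[override]]
  name = "github.com/gogo/protobuf"
  version = "v0.5.0"
```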

@filbranden
Contributor

What's the status here? I'm very interested in this issue (in fact, I was looking for one because I'd have filed one if this wasn't here yet... I've been quite frustrated with godep lately.)

I saw that kubernetes/test-infra#4868 (comment) from January says "We are using dep now" but #44873 (comment) from February describes a plan to move from godep to dep... So I imagine this is still in progress?

One part that's been particularly frustrating to me with godep (because there's an easy fix for it) is how it uses unstable git describe strings, so whenever I update Godeps.json it basically touches the whole file... I filed tools/godep#560 to fix that, but I'm not sure how much priority there is in fixing godep at this point... Maybe someone with an "in" with the godep project could look into getting that PR merged? If we're gonna have to live with godep for a while longer... can we at least get that part fixed?

Thanks!
Filipe

@cblecker
Member Author

@filbranden The discrepancy you note there is that test-infra has moved to dep, but kubernetes/kubernetes has not yet. The goal is still to move towards dep, but some of the work stalled out due to other priorities (specifically near the end of the 1.10 release).

@filbranden
Contributor

Oh thanks a lot for the clarification @cblecker !

So it looks like things are on track for dep in kubernetes/kubernetes as well... That's great! I'll be very happy when godep is gone...

Cheers,
Filipe

@fasaxc
Contributor

fasaxc commented May 31, 2018

Looks like vgo is gaining traction before dep has even reached GA...

@lavalamp
Member

I filed #63607 about figuring out our vgo strategy, though that is from the perspective of what we publish and this seems to be about how we consume.

@cblecker
Member Author

An update on this: yes, we're looking at moving towards vgo now that it is experimental in go1.11. There's a POC here: #65683

Both the publish and consumption use cases need to be taken into account.

@cblecker cblecker added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Nov 9, 2018
@paralin
Contributor

paralin commented Feb 26, 2019

Is there an update on this now that we have reached Go 1.12 and go.mod has been supported for a full version cycle?

@nikhita
Member

nikhita commented Mar 10, 2019

Is there an update on this now that we have reached Go 1.12 and go.mod has been supported for a full version cycle?

#74877

@liggitt liggitt self-assigned this Mar 19, 2019
@liggitt
Member

liggitt commented Apr 6, 2019

@k8s-ci-robot
Contributor

@liggitt: Closing this issue.

In response to this:

addressed in https://github.com/kubernetes/enhancements/blob/master/keps/sig-architecture/2019-03-19-go-modules.md and #74877

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
