New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support vendor directory as $GOPATH/src/vendor #313

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@campoy
Copy link

campoy commented Mar 10, 2017

Updates #148

@googlebot googlebot added the cla: yes label Mar 10, 2017

@jessfraz

This comment has been minimized.

Copy link
Contributor

jessfraz commented Mar 10, 2017

LGTM!

context.go Outdated
@@ -124,6 +124,11 @@ func (c *Ctx) LoadProject(path string) (*Project, error) {
//
// The second returned string indicates which GOPATH value was used.
func (c *Ctx) SplitAbsoluteProjectRoot(path string) (string, error) {
// allow vendor directry to be directly under GOPATH/src

This comment has been minimized.

@fern4lvarez

fern4lvarez Mar 10, 2017

directory ☺️

This comment has been minimized.

@campoy
context.go Outdated
@@ -124,6 +124,11 @@ func (c *Ctx) LoadProject(path string) (*Project, error) {
//
// The second returned string indicates which GOPATH value was used.
func (c *Ctx) SplitAbsoluteProjectRoot(path string) (string, error) {
// allow vendor directry to be directly under GOPATH/src

This comment has been minimized.

@sdboyer

sdboyer Mar 10, 2017

Member

nit: s/directry/directory/

This comment has been minimized.

@campoy
Francesc Campoy

@campoy campoy force-pushed the campoy:master branch from 05bcf26 to 233cb1f Mar 10, 2017

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 10, 2017

So on the one hand YAAY 👏 welcome @campoy 😄

On the other...I'm actually not sure what the effects are of trying to do it this way. I think it'll be one of two things:

  1. every possible import path is considered to be the "current project", so dep does nothing because it assumes that all imports are internal and everything is taken care of
  2. no import path is considered to be internal, which is...actually kinda weird, I'm not sure how it would work out.

Unless I've totally misimagined how it would handle this case, and it...somehow works out?

I need to test a bit locally and see how it behaves. And we'll also probably want tests to verify behavior.

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 10, 2017

So it seems to work out as some weird combination of the two - nothing is considered to be internal, but dep still doesn't write anything to the lock/vendor.

I think we need a specification about what we actually want the behavior to be. dep/gps are built around the idea that there is a non-empty import path that is the "project root," and that all packages within the project are the lexical children of that root in a slash-delimited import path hierarchy.

If that root is "." (the result of this PR), then no import path will be its child. If that root is the empty string "", then every import path would be its child. Neither of these are really the desired outcome.

Note that this has nothing to do with what's actually on your GOPATH - we abstract from that, interacting symbolically with import paths. (Without doing that, the behavior of dep would be entirely driven by folks' local GOPATH setup, which is the opposite of what we want).

That's maybe bigger than the scope of this PR, though - I think maybe here, if the goal is just to write vendor to the current GOPATH root instead of the project root (but run dep from within a project, not at the root of the GOPATH), then that's much easier.

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 10, 2017

My case is a bit strange, so let me document this.

I have a project that uses many languages, Go is one of them.
To avoid forcing non gophers to checkout the project in a specific path I created the following file structure:

*
|- .git
|- frontend (with java inside)
\- backend (with Go inside)
  \- src (GOPATH is set here by Makefiles for make build, etc)
    |- server
    |- proxy
    |- tools
    \- vendor (packages used by server, proxy, and tools)

In this case calling dep ensure from backend/src will create the vendor directory and everything will work correctly, and I (and my non Go writing friends) will be happy.

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 10, 2017

I can't share my code yet but this does work, it generates this lock.json:

{
    "memo": "93f2c886cd1a50ba14c0af5b308587b66c1faae7cdd52cfe3bd05f576707b392",
    "projects": [
        {
            "name": "github.com/Sirupsen/logrus",
            "version": "v0.11.4",
            "revision": "0208149b40d863d2c1a2f8fe5753096a9cf2cc8b",
            "packages": [
                "."
            ]
        },
        {
            "name": "github.com/golang/glog",
            "branch": "master",
            "revision": "23def4e6c14b4da8ac2ed8007337bc5eb5007998",
            "packages": [
                "."
            ]
        },
        {
            "name": "github.com/golang/protobuf",
            "branch": "master",
            "revision": "c9c7427a2a70d2eb3bafa0ab2dc163e45f143317",
            "packages": [
                "jsonpb",
                "proto",
                "protoc-gen-go/descriptor"
            ]
        },
        {
            "name": "github.com/grpc-ecosystem/grpc-gateway",
            "version": "v1.1.0",
            "revision": "a8f25bd1ab549f8b87afd48aa9181221e9d439bb",
            "packages": [
                "runtime",
                "third_party/googleapis/google/api",
                "utilities"
            ]
        },
        {
            "name": "golang.org/x/net",
            "branch": "master",
            "revision": "a6577fac2d73be281a500b310739095313165611",
            "packages": [
                "context",
                "http2",
                "http2/hpack",
                "trace"
            ]
        },
        {
            "name": "golang.org/x/sys",
            "branch": "master",
            "revision": "99f16d856c9836c42d24e7ab64ea72916925fa97",
            "packages": [
                "unix"
            ]
        },
        {
            "name": "google.golang.org/grpc",
            "version": "v1.0.5",
            "revision": "708a7f9f3283aa2d4f6132d287d78683babe55c8",
            "packages": [
                ".",
                "codes",
                "grpclog",
                "metadata"
            ]
        }
    ]
}

and populates the vendor directory accordingly

@peterbourgon

This comment has been minimized.

Copy link
Member

peterbourgon commented Mar 10, 2017

We went through this quite extensively in the past — put vendor/ anywhere other than the repo root and you're gonna have a bad time. I'm super dubious...

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 10, 2017

So, what do you recommend for my usecase where I need to have GOPATH inside of my repo, @peterbourgon? 😕

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 10, 2017

Huh. Weird. I wonder how the environment I set up is different. Let me try...

Ah! I put my main.go file in $GOPATH/src/main.go. If you put it into a subdir it...works. Which just makes this even weirder.

@campoy to my mind, we should think of this as a case where we're actually trying to use the tool without a GOPATH: "This dir and below is a bunch of Go source; work out the deps for it." Of course, a world without GOPATH does not exist (right now), but it's the behavior we're trying to simulate here, no?

@peterbourgon re: locating the vendor at a different place than the project root? Yeah, it doesn't strike me as a good idea, either, but there are definitely some use patterns out there we're missing, so I'm keeping an open mind.

@peterbourgon

This comment has been minimized.

Copy link
Member

peterbourgon commented Mar 10, 2017

Yeah, at this stage, @campoy, I'm afraid dep has the same restrictions as the rest of the go toolchain: it needs to operate on repositories that are "properly homed" within a GOPATH.

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 11, 2017

How is $GOPATH/src not under GOPATH?
How is it different from $GOPATH/src/foo?

@freeformz

This comment has been minimized.

Copy link
Contributor

freeformz commented Mar 11, 2017

A vendor/ dir at the top level, under $GOPATH's src/ dir is supported by the go tooling, in so much as the go tool will look for packages in it when resolving package locations. So strictly speaking this is supported by the go tool. Furthermore, the packages in his $GOPATH are properly namespaced underneath $GOPATH/src (they live in sub dirs). @campoy patch solves this issue where the project root == "$GOPATH/src" AFAICT, which is how he has explained using the tool AFAICT, since AFAIR, our spec just looks for manfiest files to determine the project root from the current directory upwards.

I'm 👍 on this change.

@campoy The only thing I would like to see is a test for this behavior to make sure it's not broken in the future.

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 12, 2017

Thanks for your input, @freeformz
I added a simple test, let me know if you'd like more.

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 12, 2017

How is $GOPATH/src not under GOPATH?
How is it different from $GOPATH/src/foo?

When we say "under GOPATH", we really mean "under $GOPATH/src"; go get has established that as the root of the import path namespace for source code. So src is (arguably) different from src/foo, as it's the root of the tree, vs a node in the tree.

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 12, 2017

Note: this ended up longer than I'd intended. But this is part of a knotty set of problems that IMO will require elbow grease; thus, I'm hoping this writeup will be a useful reference more generally.

My concerns here are that this all works more or less by accident, and only in a somewhat restricted case. We COULD decide this is fine. But I'd rather not just blindly step into that decision; my preference is to take advantage of this experimental stage as a way of trying out deeper solutions to this problem.

Now that I've had some time to really dig at this, I can be more concrete about what the problems are.

Say that, instead of the path structure @campoy originally proposed, you have this structure:

*
|- .git
|- frontend (with java inside)
\- backend (with Go inside)
  \- src (GOPATH is set here by Makefiles for make build, etc)
    \- example.com
      |- server
      |- proxy
      |- tools
    \- vendor (packages used by server, proxy, and tools)

Maybe not the most normal thing, but not crazy, either. (Mentally substitute mycompany.com for example.com, if that helps)

If there are any imports between the local packages (which it would be wildly uncommon for there NOT to be), init/ensure will currently fail with something like:

sm.DeduceProjectRoot: unable to deduce repository and source type for: "example.com/server"

If, instead of example.com, the path structure is something like, say, github.com/kr/pretty that the tool knows how to deduce a root from, then it won't error...but lock.json and vendor/ will end up with the github.com/kr/pretty project in them, even though those paths are members of the local/current project:

{
    "memo": "1542339c5bb5ce57680a30b9ef308b6d822fe634c38dcf33458cfdfd354b0050",
    "projects": [
        {
            "name": "github.com/kr/pretty",
            "branch": "master",
            "revision": "cfb55aafdaf3ec08f0db22699ab822c50091b1c4",
            "packages": [
                "."
            ]
        },
        {
            "name": "github.com/kr/text",
            "branch": "master",
            "revision": "7cafcd837844e784b526369c9bce262804aebc60",
            "packages": [
                "."
            ]
        }
    ]
}

Note that the versions of those libs on the GOPATH have no effect on what's selected in the lock.json - which makes this even less like the current behavior of dep init.

The basic problem here has to do with how gps classifies import paths. For that, some background is needed.


When running either dep init or dep ensure, we need to know the list of imports that are external to our/the current project. So, one of our first steps is to perform static analysis on the tree of packages within the current project by calling gps.ListPackages(), which constructs a gps.PackageTree, starting from the ProjectRoot. Shortly thereafter, we call PackageTree.ToReachMap(), which computes a partial transitive closure of imports, where external paths are treated as terminal nodes.

Here's some visuals, to help explain.

Imagine we have three projects, A, B, and C. They all contain at least two packages, and some packages in A imports some in B and C. Let's consider A the current project - the one for which we're solving its dependencies.

reachmap

The orange packages are the ones that were scanned in by ListPackages() when called pointing at A as the root; arrows designate that the tail package has an import statement corresponding to the head; the teal packages are members of other projects that are directly imported by some package in A, and the grey area is the logical scope of the tree covered by the import path root named A.

When we call ToReachMap() on the PackageTree representing A, then the returned ReachMap will indicate the following linkages:

reachmapt2

Grey arrows indicate tree-internal import statements, and black arrows indicate tree-external import statements. (I've also dropped the groupings around B and C because a real ReachMap does not make any such distinctions - they're just strings.)

The problem with the change introduced by this patch is that it breaks a basic assumption of the model - that the packages scanned in to the current project's PackageTree operate on paths that are logical descendants of the ProjectRoot. When the ProjectRoot is ".", as this patch sets it to be, but A is the analyzed tree, the resulting ReachMap looks like this:

reachmapbt

The ReachMap contains entries for the four orange A* packages (even though they don't match the ProjectRoot), but there are no more internal imports - A importing A/bar is now considered a tree-external import, because, by design, internality/externality is not determined by the relationship between A and A/bar, but between . and A/bar.

OK, that's the first part. Part two is how gps treats internal vs. external imports. Internal imports are given cursory checks for correctness (e.g., does an internal import name a path without a corresponding package? like if A/bar imported a nonexistent A/zot in our example), but as long as the tree is internally consistent, they don't induce a requirement within the solver.

This is the "door carefully left ajar" I mentioned, because it allows people to use a project root that isn't deducible - like example.com above, or maybe company-monorepo/subpath/subpath/blah/ fizz/myproject - for the project they're working on, and they won't get that error I mentioned earlier:

sm.DeduceProjectRoot: unable to deduce repository and source type for: "example.com/server"

External paths, however, DO induce a requirement, because that's literally the point of this exercise - to find external dependency requirements, and ensure they're satisfied. There is, however, an exception - if one of those "external" imports is in stdlib, then we also ignore it as a requirement. This is a reasonable thing to do because the Go 1 guarantee promises us backwards-compatibility, and by definition, imports into stdlib never escape stdlib.

This exception for stdlib is why @campoy's example works; this function decides if an import path points to stdlib. If you have a look at it, you'll see why it's mistaking everything under server, proxy, and tools for stdlib. Consequently, the everything-is-external misclassification described above is happening, but the packages in the current project don't get re-investigated because of the stdlib rule.

So, it works...for now. I don't really anticipate a change to the isStdLib() func anytime soon, but I also don't think it's inconceivable. It would be bizarre and absurd for this use pattern to stop working because we change that function.


To reiterate, I believe this is an important use case, and I want to address it. But I'd really prefer it worked because we designed for it, rather than by accident with weird edge cases.

In writing this up, it's seemed increasingly reasonable to me that we could accommodate an explicitly empty ProjectRoot. It would take some doing, but I think we could thread it through in a way that lacks side effects...for the solver. Eliminating weirdness in the UX depending on what's in the GOPATH may be a larger challenge, though.

@campoy

This comment has been minimized.

Copy link

campoy commented Mar 13, 2017

Wow, that's a great explanation @sdboyer!

Looking forward to this on your keynote at GopherCon too

It is interesting that this will work only because I'm not using any . on my import paths, and indeed that feels like an accident rather than a feature.

The solution for my case is, for now, to create a directory containing all of my packages. It's not a huge pain, but it could be blocking adoption for existing codebases that use this directory structure. If I recall correctly, that is the structure promoted by gdb, so I expect some friction there.

Thanks again for the analysis, I agree my PR does not solve the deeper problem, and I'm OK with you closing it.

I'll have to fix some typo on the docs to become a contributor now ... 😄

@sdboyer

This comment has been minimized.

Copy link
Member

sdboyer commented Mar 20, 2017

Thanks! sorry for delayed response - busy week :)

I do think it would be worth it to have explicit support for an empty ProjectRoot. I'll close this PR, but I'm going to open some follow-ups.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment