Skip to content
This repository has been archived by the owner on Feb 26, 2019. It is now read-only.

godep unintentionally stripping license files from vendored libraries when target is a subdirectory #245

Closed
dropwhile opened this issue Jul 16, 2015 · 26 comments

Comments

@dropwhile
Copy link

It appears godep strips license files and other data from dependencies when godep save is used.
I have a project which uses a nested subdirectory as the main entry point for most folks. When godep save is used by consumers of the library, this is the resulting file hierarchy.

Godeps/_workspace/src/github.com/cactus/go-statsd-client
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/buffered.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/buffered_bench_test.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/buffered_test.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/doc.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/main.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/main_bench_test.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/main_test.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/noop.go
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/test-client
Godeps/_workspace/src/github.com/cactus/go-statsd-client/statsd/test-client/main.go

Note the lack of a LICENSE file which is present in the original repository. This means projects using the repo in conjunction with godep would be in non-compliance with the license, wholly unintentionally.

There are several other projects which use similar hierarchy layouts. Here is one such example.

@dropwhile dropwhile changed the title godep unintentionally stripping license files with nested projects godep unintentionally stripping license files from vendored libraries when target is a subdirectory Jul 16, 2015
@freeformz
Copy link

Hmm. Godep uses go's own libraries/tools to handle package/file selection, which basically ignores everything that isn't a *.go file. This is unlikely to be a "trivial" fix. :-(

@kr
Copy link
Member

kr commented Jul 17, 2015

It should be copying all files, not just .go source files. See copySrc in save.go.

@kr
Copy link
Member

kr commented Jul 17, 2015

Aha, it seems the license file is in a parent directory of all the go packages. Since godep copies only packages, it misses the parent directory. Yes, this would be difficult to fix.

Although godep mainly operates on packages, it already has some special cases for the root of a repository. It is not too hard to imagine another special case that copies all regular files from the repo root of any dependency. (We still want to avoid simply copying the entire repo, because it might contain some very large packages that are unused and could otherwise be omitted.)

@dropwhile
Copy link
Author

Copying the files at the repo root would certainly be sufficient to cover consumers of my repos.
I do not have any repos more deeply nested than that, nor do I have any individually exportable sub-directories with different licenses.

However, I could potentially imagine someone doing something like this, no matter how confusing or possibly ill advised it may be.

github.com/someuser/someproject
github.com/someuser/someproject/README
github.com/someuser/someproject/pickles/LICENSE(BSD)
github.com/someuser/someproject/pickles/canning/canning.go
github.com/someuser/someproject/pickles/canning/vinegar.go
github.com/someuser/someproject/relish/LICENSE(GPL)
github.com/someuser/someproject/relish/canning/canning.go
github.com/someuser/someproject/relish/canning/vinegar.go

If the consumer simply imported github.com/someuser/someproject/relish/canning, maybe godep could copy bare files from the import up to and including the root? In this case LICENSE(GPL) and README.
Not sure if this would be easier or more difficult, than simply special casing the root.

Like I said, special casing the root would cover my needs though.

@kr
Copy link
Member

kr commented Jul 17, 2015

Yeah, that would be reasonable as well.

@ahmetb
Copy link

ahmetb commented Oct 12, 2015

+1 for this. I think godep can preserve LICENSE* or COPYING* files in the vendored source trees. @cactus are you going to be working on this?

@dropwhile
Copy link
Author

@ahmetalpbalkan Nope. I do not use godep. Someone using one of my libs had the license stripped (not cool as it means they were unknowingly noncompliant with just about any open source license when using godep), so I filed this ticket.

@ahmetb
Copy link

ahmetb commented Oct 12, 2015

@kr are you ok with having this feature preserving LICENSEs by default?

@freeformz
Copy link

I am okay with it. Basic implementation would be to do a case insensitive check for common License files (LICENSE, LICENSE.TXT, LICENSE.MD, ??) in the root of a repo and include the file if it exists. @ahmetalpbalkan Feel free to work on it and submit a PR when you are ready.

@client9
Copy link
Contributor

client9 commented Oct 13, 2015

FYI - https://github.com/ryanuber/go-license
checks the following files:

// A set of reasonable license file names to use when guessing where the
// license may be.
var DefaultLicenseFiles = []string{
        "LICENSE", "LICENSE.txt", "LICENSE.md", "license.txt",
        "COPYING", "COPYING.txt", "COPYING.md", "copying.txt",
        "UNLICENSE",
}

and I'm looking at adding README*

@ahmetb
Copy link

ahmetb commented Oct 13, 2015

@client9 the list above looks enough but I'm not sure if we need READMEs to be vendored. I know they might have a license clauses in them, but maybe include them if they contain the word “license”?

@client9
Copy link
Contributor

client9 commented Oct 13, 2015

@ahmetalpbalkan yeah, sorry I wasn't clear. I don't recommend you copy the readme unless it has license info (but even then you might not want to include that logic in godep). Im only updating go-license code to peek inside readmes as a last resort if a COPYING/LICENSE is missing.

@ahmetb
Copy link

ahmetb commented Oct 13, 2015

@client9 hmm maybe even that's a bit overkill. I think anyone serious with the licenses would just have it in a separate file. I also looked at go-license repo and spotted an issue with case-sensitivity in file names. Other than that, we should be able to just use it.

@client9
Copy link
Contributor

client9 commented Oct 13, 2015

yeah, you'd be amazed at the number of packages that just add a link in readme and call it a day. I attempt to file tickets to get the authors to convert to using a separate file, but...

Agree they should ignore case. I'll hope to hack on go-license a bit more today.

Thanks for looking into this and for your work in godep!

@client9
Copy link
Contributor

client9 commented Oct 17, 2015

Hi @ahmetalpbalkan , I think Im about done with a PR for go-license

ryanuber/go-license#9

My guess is .. do you even care about the license itself? meaning you might just want a function "give me files that look like a license" (e.g. COPYING, LICENSE, etc) so you can copy it Anyways, let me know if you want a different API in go-license or if you want some of the private functions made Public.

Of note are

// returns a []string of files in a directory, or error
func readDirectory(dir string) ([]string, error)
// returns files that case-insensitive matches any of the license
// files.  This is generic functionality so pulled out into separate
// function for testing
func matchLicenseFile(licenses []string, files []string) []string {

@ahmetb
Copy link

ahmetb commented Oct 18, 2015

@client9 I think a method returning license filenmes would be just fine. I don't think we need the licenses parameter above. Something like

func licenseFiles(dir string) (licenseFiles []string)

would be more than enough.

@client9
Copy link
Contributor

client9 commented Oct 18, 2015

yeah that my thinking too.. ok will hack away!

@client9
Copy link
Contributor

client9 commented Oct 18, 2015

hi @ahmetalpbalkan

Looking into this a bit more, I think you want to copy a few more "files of interest". While most cases are handled by go-license there are number of other cases that don't quite fall into software license, that you may wish to copy as well.

Im happy to write up (perhaps not in go-license) that captures all the examples below

func FilesOfInterest(dir string) ([]string)

Let me know what you think.

examples to follow:

Ex: everything Facebook open sources is now BSD in one file, and patents file in another.
Using https://github.com/facebookgo/startstop as typical example we got:

https://github.com/facebookgo/startstop/blob/master/license
https://github.com/facebookgo/startstop/blob/master/patents

Ex: other legal notices, eg. Docker stuff contains NOTICE
https://github.com/docker/docker/blob/master/NOTICE

The following aren't go, but we can expect this as time goes on:

Ex: Copying + Copyright
ex: https://github.com/airbnb/airflow

Ex. Duplicate licenses in different files. Haven't seen so much so in go but some repos have COPYLEFT and COPYING/LICENSE
https://github.com/elmom/pypsyc
https://github.com/discourse/discourse

Ex: More of everything. If one notice is good, more must be better

these guys
https://github.com/janestreet/fieldslib

Sure have a license but also:
https://github.com/janestreet/fieldslib/blob/master/LICENSE.txt

Legal disclaimer
https://github.com/janestreet/fieldslib/blob/master/INRIA-DISCLAIMER.txt

Separate copyright
https://github.com/discourse/discourse/blob/master/COPYRIGHT.txt

And notice of other stuff
https://github.com/janestreet/fieldslib/blob/master/THIRD-PARTY.txt

Phew!

@freeformz
Copy link

@ahmetalpbalkan Recent changes I made to the vcs file listing commands should play well with this. vcs also now has a way to determine the repo root (since it's given the package path).

@client9
Copy link
Contributor

client9 commented Oct 22, 2015

@ahmetalpbalkan @freeformz

Im happy to donate IsLicenseFile and/or IsLegalFile (along with tests)

https://github.com/client9/gosupplychain/blob/master/license.go
https://github.com/client9/gosupplychain/blob/master/license_test.go

if it helps. Let me know if you'd like a pull request, and/or any file/function naming changes.

regards,

n

@freeformz
Copy link

@client9 A PR for the functionality described in this issue would be great!

@client9
Copy link
Contributor

client9 commented Oct 22, 2015

Pull request #301 ready for review @freeformz

@client9
Copy link
Contributor

client9 commented Oct 22, 2015

Ok I think see the code that needs to change, and nifty way of unit testing cpySrc @kr
Im happy to take a stab at this if you want @freeformz (or you can of course too.. not sure what other work is in progress)

@freeformz
Copy link

@client9 Go for it. My focus for the next week or so is elsewhere (other projects) unfortunately. I'm just trying to keep anything godep related (Issues/PRs) moving along until I can get back to working directly on things.

@client9
Copy link
Contributor

client9 commented Oct 23, 2015

Ok, not soo bad. It's a bit tricky since things are working in filesystem paths, but this seems to work.

For each dependency, it walks backwards grabbing any license files, until it hits the top level. meaning if we have github.com/foo/bar/something it will check the following directories for license files.

  • github.com/foo/bar
  • github.com/foo
  • github.com

There is a simple test case added.

Next step is to make a fake app that uses github.com/cactus/go-statsd-client and try it out.

@client9
Copy link
Contributor

client9 commented Nov 2, 2015

pull request #301 updated, waiting for review.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants