Scan local go mod licenses for golang packages #1645

Merged
merged 21 commits into from
Mar 23, 2023
Changes from 15 commits
21 commits
f399e5c
support for scanning license files in golang packages
deitch Feb 28, 2023
4fe1cbc
Merge remote-tracking branch 'upstream/main' into golang-licenses-local
kzantow Mar 7, 2023
6f955d8
chore: refactor local go mod to use FileResolver and add configuration
kzantow Mar 17, 2023
7b306e7
Merge remote-tracking branch 'upstream/main' into golang-licenses-local
kzantow Mar 17, 2023
dd36f8b
chore: add processCaps test function
kzantow Mar 17, 2023
53eb828
chore: PR feedback, add more testing
kzantow Mar 20, 2023
41b6724
chore: more PR feedback
kzantow Mar 20, 2023
5db49e1
chore: update README
kzantow Mar 20, 2023
cab8224
chore: tweak go license test
kzantow Mar 20, 2023
193fc15
chore: use t.Setenv
kzantow Mar 20, 2023
2b13ab6
Merge remote-tracking branch 'upstream/main' into golang-licenses-local
kzantow Mar 20, 2023
9009dba
chore: update to use homedir lib
kzantow Mar 20, 2023
0333860
Merge remote-tracking branch 'upstream/main' into golang-licenses-local
kzantow Mar 21, 2023
299bd2d
chore: update naming and address PR feedback
kzantow Mar 21, 2023
d9cb99b
chore: add licenses for replace directives & update tests
kzantow Mar 21, 2023
606bd93
chore: fix flaky license sorting
kzantow Mar 21, 2023
5cd38ad
chore: add configuration option for gopath
kzantow Mar 23, 2023
252b980
chore: PR feedback
kzantow Mar 23, 2023
334e0f0
Merge remote-tracking branch 'upstream/main' into golang-licenses-local
kzantow Mar 23, 2023
4f1da6d
chore: add docs, adjust mod cache lookup behavior
kzantow Mar 23, 2023
911f884
chore: correct doc
kzantow Mar 23, 2023
6 changes: 6 additions & 0 deletions README.md
@@ -494,6 +494,12 @@ package:
# same as -s ; SYFT_PACKAGE_CATALOGER_SCOPE env var
scope: "squashed"

golang:
# search for go package licences in the GOPATH of the system running Syft, note that this is outside the
# container filesystem and potentially outside the root of a local directory scan
# SYFT_GOLANG_SEARCH_LOCAL_MOD_CACHE_LICENSES env var
search-local-mod-cache-licenses: false

# cataloging file contents is exposed through the power-user subcommand
file-contents:
cataloger:
1 change: 1 addition & 0 deletions go.mod
@@ -55,6 +55,7 @@ require (
github.com/anchore/stereoscope v0.0.0-20230317134707-7928713c391e
github.com/docker/docker v23.0.1+incompatible
github.com/google/go-containerregistry v0.14.0
github.com/google/licensecheck v0.3.1
github.com/invopop/jsonschema v0.7.0
github.com/knqyf263/go-rpmdb v0.0.0-20221030135625-4082a22221ce
github.com/opencontainers/go-digest v1.0.0
2 changes: 2 additions & 0 deletions go.sum
@@ -264,6 +264,8 @@ github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeN
github.com/google/go-containerregistry v0.14.0 h1:z58vMqHxuwvAsVwvKEkmVBz2TlgBgH5k6koEXBtlYkw=
github.com/google/go-containerregistry v0.14.0/go.mod h1:aiJ2fp/SXvkWgmYHioXnbMdlgB8eXiiYOY55gfN91Wk=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/licensecheck v0.3.1 h1:QoxgoDkaeC4nFrtGN1jV7IPmDCHFNIVh54e5hSt6sPs=
github.com/google/licensecheck v0.3.1/go.mod h1:ORkR35t/JjW+emNKtfJDII0zlciG9JgbT7SmsohlHmY=
github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs=
github.com/google/martian/v3 v3.0.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
github.com/google/martian/v3 v3.1.0/go.mod h1:y5Zk1BBys9G+gd6Jrk0W3cC1+ELVxBWuIGO+w/tUAp0=
5 changes: 5 additions & 0 deletions internal/config/application.go
@@ -18,6 +18,7 @@ import (
"github.com/anchore/syft/internal"
"github.com/anchore/syft/internal/log"
"github.com/anchore/syft/syft/pkg/cataloger"
golangCataloger "github.com/anchore/syft/syft/pkg/cataloger/golang"
)

var (
@@ -48,6 +49,7 @@ type Application struct {
Log logging `yaml:"log" json:"log" mapstructure:"log"` // all logging-related options
Catalogers []string `yaml:"catalogers" json:"catalogers" mapstructure:"catalogers"`
Package pkg `yaml:"package" json:"package" mapstructure:"package"`
Golang golang `yaml:"golang" json:"golang" mapstructure:"golang"`
Attest attest `yaml:"attest" json:"attest" mapstructure:"attest"`
FileMetadata FileMetadata `yaml:"file-metadata" json:"file-metadata" mapstructure:"file-metadata"`
FileClassification fileClassification `yaml:"file-classification" json:"file-classification" mapstructure:"file-classification"`
@@ -69,6 +71,9 @@ func (cfg Application) ToCatalogerConfig() cataloger.Config {
},
Catalogers: cfg.Catalogers,
Parallelism: cfg.Parallelism,
Golang: golangCataloger.GoCatalogerOpts{
SearchLocalModCacheLicenses: cfg.Golang.SearchLocalModCacheLicenses,
},
}
}

11 changes: 11 additions & 0 deletions internal/config/golang.go
@@ -0,0 +1,11 @@
package config

import "github.com/spf13/viper"

type golang struct {
SearchLocalModCacheLicenses bool `json:"search-local-mod-cache-licenses" yaml:"search-local-mod-cache-licenses" mapstructure:"search-local-mod-cache-licenses"`
}

func (cfg golang) loadDefaultValues(v *viper.Viper) {
v.SetDefault("golang.search-local-mod-cache-licenses", false)
}
53 changes: 53 additions & 0 deletions internal/licenses/list.go
@@ -0,0 +1,53 @@
package licenses

import "github.com/anchore/syft/internal"

// all of these taken from https://github.com/golang/pkgsite/blob/8996ff632abee854aef1b764ca0501f262f8f523/internal/licenses/licenses.go#L338
// which unfortunately is not exported. But fortunately is under BSD-style license.

var (
FileNames = []string{
"COPYING",
"COPYING.md",
"COPYING.markdown",
"COPYING.txt",
"LICENCE",
"LICENCE.md",
"LICENCE.markdown",
"LICENCE.txt",
"LICENSE",
"LICENSE.md",
"LICENSE.markdown",
"LICENSE.txt",
"LICENSE-2.0.txt",
"LICENCE-2.0.txt",
"LICENSE-APACHE",
"LICENCE-APACHE",
"LICENSE-APACHE-2.0.txt",
"LICENCE-APACHE-2.0.txt",
"LICENSE-MIT",
"LICENCE-MIT",
"LICENSE.MIT",
"LICENCE.MIT",
"LICENSE.code",
"LICENCE.code",
"LICENSE.docs",
"LICENCE.docs",
"LICENSE.rst",
"LICENCE.rst",
"MIT-LICENSE",
"MIT-LICENCE",
"MIT-LICENSE.md",
"MIT-LICENCE.md",
"MIT-LICENSE.markdown",
"MIT-LICENCE.markdown",
"MIT-LICENSE.txt",
"MIT-LICENCE.txt",
"MIT_LICENSE",
"MIT_LICENCE",
"UNLICENSE",
"UNLICENCE",
}

FileNameSet = internal.NewStringSet(FileNames...)
)
33 changes: 33 additions & 0 deletions internal/licenses/parser.go
@@ -0,0 +1,33 @@
package licenses

import (
"io"

"github.com/google/licensecheck"
"golang.org/x/exp/slices"
)

const (
coverageThreshold = 75
unknownLicenseType = "UNKNOWN"
)

// Parse scans the contents of a license file to attempt to determine the type of license it is
func Parse(reader io.Reader) (licenses []string, err error) {
contents, err := io.ReadAll(reader)
if err != nil {
return nil, err
}
cov := licensecheck.Scan(contents)

if cov.Percent < float64(coverageThreshold) {
licenses = append(licenses, unknownLicenseType)
}
for _, m := range cov.Match {
if slices.Contains(licenses, m.ID) {
continue
}
licenses = append(licenses, m.ID)
}
return
}
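
The control flow of `Parse` can be illustrated without the scanner dependency. The following is a minimal sketch, not the PR's code: `match`, `coverage`, and `summarize` are hypothetical stand-ins for the scan result and the function above, and only the threshold check and de-duplication are modelled.

```go
package main

import "fmt"

// match is a hypothetical stand-in for one license match reported by a scanner.
type match struct {
	ID string
}

// coverage is a hypothetical stand-in for an overall scan result.
type coverage struct {
	Percent float64
	Matches []match
}

const (
	coverageThreshold  = 75
	unknownLicenseType = "UNKNOWN"
)

// summarize follows the same steps as Parse: flag low coverage first, then
// list each matched license ID once, preserving first-seen order.
func summarize(cov coverage) []string {
	var out []string
	seen := map[string]bool{}
	if cov.Percent < coverageThreshold {
		out = append(out, unknownLicenseType)
		seen[unknownLicenseType] = true
	}
	for _, m := range cov.Matches {
		if seen[m.ID] {
			continue
		}
		seen[m.ID] = true
		out = append(out, m.ID)
	}
	return out
}

func main() {
	fmt.Println(summarize(coverage{Percent: 98.5, Matches: []match{{"MIT"}, {"MIT"}}}))
	fmt.Println(summarize(coverage{Percent: 40, Matches: []match{{"Apache-2.0"}}}))
}
```

Note that a low-coverage file still reports whatever IDs were matched, in addition to `UNKNOWN`; the sketch keeps that behavior rather than returning early.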
10 changes: 5 additions & 5 deletions syft/pkg/cataloger/cataloger.go
@@ -48,7 +48,7 @@ func ImageCatalogers(cfg Config) []pkg.Cataloger {
java.NewJavaCataloger(cfg.Java()),
java.NewNativeImageCataloger(),
apkdb.NewApkdbCataloger(),
- golang.NewGoModuleBinaryCataloger(),
+ golang.NewGoModuleBinaryCataloger(cfg.Go()),
dotnet.NewDotnetDepsCataloger(),
portage.NewPortageCataloger(),
sbom.NewSBOMCataloger(),
@@ -72,8 +72,8 @@ func DirectoryCatalogers(cfg Config) []pkg.Cataloger {
java.NewJavaPomCataloger(),
java.NewNativeImageCataloger(),
apkdb.NewApkdbCataloger(),
- golang.NewGoModuleBinaryCataloger(),
- golang.NewGoModFileCataloger(),
+ golang.NewGoModuleBinaryCataloger(cfg.Go()),
+ golang.NewGoModFileCataloger(cfg.Go()),
rust.NewCargoLockCataloger(),
dart.NewPubspecLockCataloger(),
dotnet.NewDotnetDepsCataloger(),
@@ -105,8 +105,8 @@ func AllCatalogers(cfg Config) []pkg.Cataloger {
java.NewJavaPomCataloger(),
java.NewNativeImageCataloger(),
apkdb.NewApkdbCataloger(),
- golang.NewGoModuleBinaryCataloger(),
- golang.NewGoModFileCataloger(),
+ golang.NewGoModuleBinaryCataloger(cfg.Go()),
+ golang.NewGoModFileCataloger(cfg.Go()),
rust.NewCargoLockCataloger(),
rust.NewAuditBinaryCataloger(),
dart.NewPubspecLockCataloger(),
6 changes: 6 additions & 0 deletions syft/pkg/cataloger/config.go
@@ -1,11 +1,13 @@
package cataloger

import (
"github.com/anchore/syft/syft/pkg/cataloger/golang"
"github.com/anchore/syft/syft/pkg/cataloger/java"
)

type Config struct {
Search SearchConfig
Golang golang.GoCatalogerOpts
Catalogers []string
Parallelism int
}
@@ -23,3 +25,7 @@ func (c Config) Java() java.Config {
SearchIndexedArchives: c.Search.IncludeIndexedArchives,
}
}

func (c Config) Go() golang.GoCatalogerOpts {
return c.Golang
}
18 changes: 14 additions & 4 deletions syft/pkg/cataloger/golang/cataloger.go
@@ -8,14 +8,24 @@ import (
"github.com/anchore/syft/syft/pkg/cataloger/generic"
)

type GoCatalogerOpts struct {
SearchLocalModCacheLicenses bool
}

// NewGoModFileCataloger returns a new Go module cataloger object.
- func NewGoModFileCataloger() *generic.Cataloger {
+ func NewGoModFileCataloger(opts GoCatalogerOpts) *generic.Cataloger {
c := goModCataloger{
licenses: newGoLicenses(opts.SearchLocalModCacheLicenses),
}
return generic.NewCataloger("go-mod-file-cataloger").
- WithParserByGlobs(parseGoModFile, "**/go.mod")
+ WithParserByGlobs(c.parseGoModFile, "**/go.mod")
}

// NewGoModuleBinaryCataloger returns a new Golang cataloger object.
- func NewGoModuleBinaryCataloger() *generic.Cataloger {
+ func NewGoModuleBinaryCataloger(opts GoCatalogerOpts) *generic.Cataloger {
c := goBinaryCataloger{
licenses: newGoLicenses(opts.SearchLocalModCacheLicenses),
}
return generic.NewCataloger("go-module-binary-cataloger").
- WithParserByMimeTypes(parseGoBinary, internal.ExecutableMIMETypeSet.List()...)
+ WithParserByMimeTypes(c.parseGoBinary, internal.ExecutableMIMETypeSet.List()...)
}
4 changes: 2 additions & 2 deletions syft/pkg/cataloger/golang/cataloger_test.go
@@ -27,7 +27,7 @@ func Test_Mod_Cataloger_Globs(t *testing.T) {
FromDirectory(t, test.fixture).
ExpectsResolverContentQueries(test.expected).
IgnoreUnfulfilledPathResponses("src/go.sum").
- TestCataloger(t, NewGoModFileCataloger())
+ TestCataloger(t, NewGoModFileCataloger(GoCatalogerOpts{}))
})
}
}
@@ -52,7 +52,7 @@ func Test_Binary_Cataloger_Globs(t *testing.T) {
pkgtest.NewCatalogTester().
FromDirectory(t, test.fixture).
ExpectsResolverContentQueries(test.expected).
- TestCataloger(t, NewGoModuleBinaryCataloger())
+ TestCataloger(t, NewGoModuleBinaryCataloger(GoCatalogerOpts{}))
})
}
}
108 changes: 108 additions & 0 deletions syft/pkg/cataloger/golang/licenses.go
@@ -0,0 +1,108 @@
package golang

import (
"fmt"
"os"
"path"
"regexp"
"strings"

"github.com/mitchellh/go-homedir"

"github.com/anchore/syft/internal/licenses"
"github.com/anchore/syft/internal/log"
"github.com/anchore/syft/syft/source"
)

type goLicenses struct {
searchLocalModCacheLicenses bool
localModCacheResolver source.FileResolver
}

func newGoLicenses(searchLocalModCacheLicenses bool) goLicenses {
return goLicenses{
searchLocalModCacheLicenses: searchLocalModCacheLicenses,
localModCacheResolver: deferredModCacheResolver,
}
}

// this needs to be shared between GoMod & GoBinary so it's only scanned once
var deferredModCacheResolver = newDeferredModCacheResolver()

func newDeferredModCacheResolver() source.FileResolver {
return source.NewDeferredResolverFromSource(func() (source.Source, error) {
goPath := os.Getenv("GOPATH")

if goPath == "" {
homeDir, err := homedir.Dir()
if err != nil {
log.Debug("unable to determine user home dir: %v", err)
}
goPath = path.Join(homeDir, "go")
Contributor

It doesn't look like we should use the value of homeDir if err != nil. Should we return an error instead (and remove the log statement)?

Contributor Author

See above comment. I don't think we should try and recreate the GOPATH. Either it is set and we look there, or it is not, and we do not even try.

If we get requests later that lots of people want us to determine this automatically, we always can add it later.

Contributor

Actually, I think I like that better. If goPath is empty, then don't return a resolver to search against at all.

Contributor

According to the go docs, the default gopath is:

The GOPATH environment variable specifies the location of your workspace. It defaults to a directory named go inside your home directory, so $HOME/go on Unix, $home/go on Plan 9, and %USERPROFILE%\go (usually C:\Users\YourName\go) on Windows.

Why wouldn't we default to this same location if GOPATH is not explicitly set?

Contributor Author

Unfortunately, you are right. My thinking was: "if you set this flag, we will look wherever GOPATH points, trusting that you know where you want us to go; if not, we will not check." But that is not aligned with the go standard, as @kzantow pointed out.

That leaves us with two possibilities, based on two assumptions:

  • we assume that the user only wants us to go where they explicitly declare; therefore if GOPATH is not set, we do not go anywhere.
  • we assume that the user wants to follow the go standard, whether GOPATH is implicit or explicit; therefore if GOPATH is not set, we replicate the logic

Much as I like the first, I think the second is what people will expect. Requiring them to set the env var to enable it is sufficient for them to say, "go where GOPATH takes you, whether explicit or implicit." If you don't want us to go there, do not enable it.

I was hoping the go command's resolution of GOPATH was a library we could just hook into, but no such luck; it is a private func inside src/cmd/go/internal/cfg/, i.e. internal.

}

return source.NewFromDirectory(path.Join(goPath, "pkg", "mod"))
})
}

func (c *goLicenses) getLicenses(resolver source.FileResolver, moduleName, moduleVersion string) (licenses []string, err error) {
moduleName = processCaps(moduleName)

licenses, err = findLicenses(resolver,
fmt.Sprintf(`**/go/pkg/mod/%s@%s/*`, moduleName, moduleVersion),
)

if c.searchLocalModCacheLicenses && err == nil && len(licenses) == 0 {
// if we're running against a directory on the filesystem, it may not include the
// user's homedir / GOPATH, so we defer to using the localModCacheResolver
licenses, err = findLicenses(c.localModCacheResolver,
fmt.Sprintf(`**/%s@%s/*`, moduleName, moduleVersion),
)
}

// always return a non-nil slice
if licenses == nil {
licenses = []string{}
}

return
}
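
The two-stage lookup in `getLicenses` can be exercised in isolation. In this sketch `finder` and `lookup` are hypothetical names that replace the resolver types with plain functions; the glob patterns are the ones used in the diff.

```go
package main

import "fmt"

// finder is a hypothetical stand-in for a resolver-backed license search.
type finder func(glob string) ([]string, error)

// lookup searches the scanned source first and consults the local module
// cache only when that is enabled, no error occurred, and nothing was found.
func lookup(primary, local finder, searchLocal bool, module, version string) ([]string, error) {
	found, err := primary(fmt.Sprintf("**/go/pkg/mod/%s@%s/*", module, version))
	if searchLocal && err == nil && len(found) == 0 {
		found, err = local(fmt.Sprintf("**/%s@%s/*", module, version))
	}
	if found == nil {
		found = []string{} // always return a non-nil slice
	}
	return found, err
}

func main() {
	// Print the globs each stage would receive; neither stage finds anything here.
	show := func(name string) finder {
		return func(glob string) ([]string, error) {
			fmt.Printf("%s search: %s\n", name, glob)
			return nil, nil
		}
	}
	licenses, err := lookup(show("source"), show("local cache"), true, "github.com/google/uuid", "v1.3.0")
	fmt.Println(licenses, err)
}
```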

func findLicenses(resolver source.FileResolver, globMatch string) (out []string, err error) {
if resolver == nil {
return
}

locations, err := resolver.FilesByGlob(globMatch)
if err != nil {
return nil, err
}

for _, l := range locations {
fileName := path.Base(l.RealPath)
if licenses.FileNameSet.Contains(fileName) {
contents, err := resolver.FileContentsByLocation(l)
if err != nil {
return nil, err
}
parsed, err := licenses.Parse(contents)
if err != nil {
return nil, err
}

if parsed != nil {
out = append(out, parsed...)
}
}
}

return
}
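
The file-name filter in `findLicenses` is an exact, case-sensitive match on the base name. A minimal sketch of that step, using a hypothetical `licenseCandidates` helper and a shortened name set in place of `licenses.FileNameSet`:

```go
package main

import (
	"fmt"
	"path"
)

// licenseFileNames is an abbreviated, illustrative subset of the names in list.go.
var licenseFileNames = map[string]struct{}{
	"COPYING": {}, "LICENSE": {}, "LICENSE.md": {}, "LICENSE.txt": {}, "UNLICENSE": {},
}

// licenseCandidates keeps only the paths whose base name is a known license file name.
func licenseCandidates(paths []string) []string {
	var out []string
	for _, p := range paths {
		if _, ok := licenseFileNames[path.Base(p)]; ok {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	fmt.Println(licenseCandidates([]string{
		"github.com/google/uuid@v1.3.0/LICENSE",
		"github.com/google/uuid@v1.3.0/uuid.go",
		"github.com/google/uuid@v1.3.0/license", // lowercase: not in the set
	}))
}
```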

var capReplacer = regexp.MustCompile("[A-Z]")

func processCaps(s string) string {
return capReplacer.ReplaceAllStringFunc(s, func(s string) string {
return "!" + strings.ToLower(s)
})
}
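
`processCaps` exists because the Go module cache stores module paths with each uppercase letter escaped as `!` followed by its lowercase form. The short program below copies the function from the diff to show the effect on a sample path; the module path is only an example input, and version escaping is not covered.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var capReplacer = regexp.MustCompile("[A-Z]")

// processCaps is copied from the diff above: every uppercase ASCII letter
// becomes "!" plus its lowercase form.
func processCaps(s string) string {
	return capReplacer.ReplaceAllStringFunc(s, func(s string) string {
		return "!" + strings.ToLower(s)
	})
}

func main() {
	fmt.Println(processCaps("github.com/BurntSushi/toml")) // github.com/!burnt!sushi/toml
}
```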