Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve discovery of Maven modules and dependencies #449

Merged
merged 19 commits into from Apr 30, 2019

Conversation

dchenk
Copy link
Contributor

@dchenk dchenk commented Apr 24, 2019

Two important things had previously been overlooked:

  1. The CLI did not recurse into directories below that of any discovered pom.xml file, yet some codebases have Maven projects within one repository which are not Maven modules of any module or project higher in the tree. So now we will recurse down but keep track of the modules that we have already looked at.
  2. The BuildTarget field of FOSSA modules wasn't as useful as it could've been because it didn't consistently hold just the path to the corresponding Maven module (or "project"). We use it with the --projects flag when running commands such as dependency:list or dependency:tree, which should now take the relative path to a module or a module's POM file. Importantly, we no longer assume that only a "pom.xml" file can be the manifest for a Maven project, since a <module> element can point directly to such an XML file with another name.

@codecov
Copy link

codecov bot commented Apr 24, 2019

Codecov Report

Merging #449 into master will increase coverage by 0.23%.
The diff coverage is 57.79%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #449      +/-   ##
==========================================
+ Coverage   49.92%   50.16%   +0.23%     
==========================================
  Files          86       88       +2     
  Lines        4467     4645     +178     
==========================================
+ Hits         2230     2330     +100     
- Misses       2020     2084      +64     
- Partials      217      231      +14
Impacted Files Coverage Δ
analyzers/maven/maven.go 20% <16.12%> (ø)
buildtools/maven/maven.go 57.14% <62.22%> (+25.89%) ⬆️
buildtools/maven/pom.go 84.84% <84.84%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad23bb5...d3a70dd. Read the comment docs.

@dchenk dchenk requested review from elldritch and zlav April 24, 2019 22:41
analyzers/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
analyzers/maven/maven.go Outdated Show resolved Hide resolved
@dchenk dchenk changed the title Improve discovery of Maven modules Improve discovery of Maven modules and dependencies Apr 25, 2019
@dchenk dchenk force-pushed the feat/improve-maven-support branch from adae403 to cb2a87e Compare April 26, 2019 18:30
Copy link
Contributor

@cnr cnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍 -- a few minor nits. I'll wait for @zlav to approve

analyzers/maven/maven_test.go Show resolved Hide resolved
analyzers/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
Copy link
Member

@zlav zlav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pom file parsing looks great and I'm really happy you dove into how nested pom files works, I'm convinced that this PR is going to solve more of our maven analysis than I initially anticipated. Areas I think we can improve this:

  1. I think the optimizations here are a little eager at the cost of increasing the complexity of the code, specifically in relation to `manifestCache.
  2. We should have the maven.buildtools return a graph.Deps object instead of handling that instead of handling it in the analyzer. This gives us the ability to contain the logic for each strategy into their build tool and

analyzers/maven/maven.go Outdated Show resolved Hide resolved
analyzers/maven/maven.go Show resolved Hide resolved
log.WithField("path", path).Debug("skipping")
// Don't continue recursing, because anything else is probably a
// subproject.
return filepath.SkipDir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does removing filepath.SkipDir makes the call to maven.Modules()redundant? I think it does, previously we assumed that a pom file governed the entire directory and so any modules found through it were added to the list of modules. Removing this line means we are now both scanning for child modules and adding all modules. This solution does a good job of finding all modules in a directory, but if we were to remove the call tomave.Modules()` would it do any worse? It looks like we would still get the same results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To those who weren't on our call: Removing SkipDir doesn't make looking at <modules> redundant because of projects (such as this) which have a generic pom.xml at root but split out their logic into other .xml files that are named something other than "pom.xml". The root pom.xml file would have, for example: <module>pom-gwt.xml</module>. So if we don't look at these <module> elements we won't know what the project's depedencies are.

@@ -141,16 +141,26 @@ func (a *Analyzer) Analyze() (graph.Deps, error) {
Command: a.Options.Command,
})
if err != nil {
// Because this was a custom shell command, we do not fall back to any other strategies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the same topic as this, can we add a switch statement to handle user-defined strategies? These would be set in a.Options.Strategy. Right now we would have mvn-tree and pom.

analyzers/maven/maven.go Outdated Show resolved Hide resolved
analyzers/maven/maven.go Outdated Show resolved Hide resolved
Version string `xml:"version"`

// Scope is where the dependency is used, such as "test" or "runtime".
Scope string `xml:"scope"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use this to support filtering on test dependencies? For example, gradle has compile and implementation deps which we look for and ignore all others. Would this be possible? For reference, other analyzers scan for production dependencies by default unless you specify something like --all-dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was trying to not also do this in this PR because of PR #240 which addresses this question. But there's enough that I needed to change that I think that PR may not turn out to be very helpful. What do you think, do it here anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, srclib-java applies scope to each dependency it returns, and by default, the fossa UI filters to ‘compile’ and ‘runtime’ deps. You can turn other scopes on in the project settings as well. Might be worth a discussion before merging this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this with @zlav —we decided that at least in this PR we'll focus on just discovery of modules and simply parsing the pom.xml files, and next we'll check out #240 to see how best to add this feature

buildtools/maven/maven.go Outdated Show resolved Hide resolved
analyzers/maven/maven_test.go Show resolved Hide resolved
Copy link
Contributor

@anuccio1 anuccio1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Keep in mind that I have 0 context on this feature and am just bored so decided to review some Pr’s)

buildtools/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
@dchenk dchenk requested a review from zlav April 30, 2019 00:52
Copy link
Member

@zlav zlav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Lets add tests for ToGraphDeps once you move all the logic into the build tool. Other than that this review is mostly style nits and questions about style choices.

case "maven-tree":
return a.Maven.DependencyTree(a.Module.Dir, a.Module.BuildTarget)
default:
if a.Options.Command != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not suggesting we do this in this PR, but we should add tests for the Maven shell command similar to how we did for Composer. This would enable us to accurately test the fallback method,

buildtools/maven/maven.go Show resolved Hide resolved
return graph.Deps{},
errors.Wrapf(err, "could not identify dependencies; original error: %v", mvnError)
}
return pom.ToGraphDeps(), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make ToGraphDeps a private function so thatResolveManifestFromBuildTarget returns the dependency graph?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need something like ResolveManifestFromBuildTarget to get just the contents of a POM file because we sometimes need things like the groupId, artifactId, and parent from it

analyzers/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
buildtools/maven/maven.go Outdated Show resolved Hide resolved
if err != nil {
return graph.Deps{}, err
}
return graph.Deps{Direct: depsList(imports).toImports(), Transitive: depsMap(deps).toPkgGraph()}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] This might look better if you break the lines out.

graph.Deps{
   Direct: depsList(imports).toImports(), 
   Transitive: depsMap(deps).toPkgGraph()
}, nil

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of type conversions in order to support methods, although this is another area where I don't have a lot of opinions. I think this would be much more clear if we were to make this:

dependencyListToImports(dependencyList)

func dependencyListToImports(imports []Dependency)

This still enforces the type of []Dependency but allows us to remove both the type conversion and the type definition. Why is your instinct here to make this conversion and add a method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed—this is something I too was going to revise to be as you suggest after thinking about it some more

//go:generate bash -c "genny -in=$GOPATH/src/github.com/fossas/fossa-cli/graph/readtree.go gen 'Generic=Dependency' | sed -e 's/package graph/package maven/' > readtree_generated.go"

func ParseDependencyTree(stdin string) ([]Dependency, map[Dependency][]Dependency, error) {
func ParseDependencyTree(stdin string) (graph.Deps, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for taking care of this one too!

buildtools/maven/maven.go Outdated Show resolved Hide resolved
if err2 != nil {
// Using buildTarget as a module ID or as a path to a manifest did not work.
// Return just the error from running the mvn goal the first time.
return "", errors.Wrap(err, "could not use Maven to list dependencies")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this fallback instead of return an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's nothing really to fall back to, since this is already the fallback method: when Maven cannot take our original buildTarget we try again but with --project=groupId:artifactId and --file=the-pom-file.xml. If we can't identify the groupId or artifactId from the POM file, then there's nothing else we can do.

buildtools/maven/pom_test.go Outdated Show resolved Hide resolved
buildtools/maven/pom_test.go Outdated Show resolved Hide resolved
buildtools/maven/pom_test.go Outdated Show resolved Hide resolved
@dchenk dchenk merged commit 89eb004 into master Apr 30, 2019
@dchenk dchenk deleted the feat/improve-maven-support branch April 30, 2019 21:59
@dchenk dchenk mentioned this pull request Apr 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants