Improve discovery of Maven modules and dependencies #449
Codecov Report
@@ Coverage Diff @@
## master #449 +/- ##
==========================================
+ Coverage 49.92% 50.16% +0.23%
==========================================
Files 86 88 +2
Lines 4467 4645 +178
==========================================
+ Hits 2230 2330 +100
- Misses 2020 2084 +64
- Partials 217 231 +14
adae403 to cb2a87e
Looks good to me 👍 -- a few minor nits. I'll wait for @zlav to approve
Pom file parsing looks great, and I'm really happy you dove into how nested pom files work. I'm convinced that this PR is going to solve more of our Maven analysis than I initially anticipated. Areas I think we can improve:
- I think the optimizations here are a little eager at the cost of increasing the complexity of the code, specifically in relation to `manifestCache`.
- We should have the `maven.buildtools` return a `graph.Deps` object instead of handling that in the analyzer. This gives us the ability to contain the logic for each strategy in its build tool and
log.WithField("path", path).Debug("skipping")
// Don't continue recursing, because anything else is probably a
// subproject.
return filepath.SkipDir
Does removing `filepath.SkipDir` make the call to `maven.Modules()` redundant? I think it does. Previously we assumed that a pom file governed the entire directory, so any modules found through it were added to the list of modules. Removing this line means we are now both scanning for child modules and adding all modules. This solution does a good job of finding all modules in a directory, but if we were to remove the call to `maven.Modules()`, would it do any worse? It looks like we would still get the same results.
To those who weren't on our call: Removing `SkipDir` doesn't make looking at `<modules>` redundant because of projects (such as this) which have a generic `pom.xml` at root but split out their logic into other `.xml` files that are named something other than "pom.xml". The root `pom.xml` file would have, for example: `<module>pom-gwt.xml</module>`. So if we don't look at these `<module>` elements we won't know what the project's dependencies are.
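To make the point about `<module>` elements concrete, here is a minimal sketch of reading them out of a root POM with Go's `encoding/xml`. The `Manifest` type and `parseModules` helper are hypothetical names used only for illustration; they are not taken from this PR.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Manifest is a hypothetical, trimmed-down view of a POM file.
// Only the <modules> section is decoded here.
type Manifest struct {
	Modules []string `xml:"modules>module"`
}

// parseModules returns the <module> entries declared in a POM document.
func parseModules(pom []byte) ([]string, error) {
	var m Manifest
	if err := xml.Unmarshal(pom, &m); err != nil {
		return nil, err
	}
	return m.Modules, nil
}

func main() {
	// A root POM whose second module points at an XML file that is
	// not named "pom.xml", as described in the comment above.
	rootPOM := []byte(`<project>
  <modules>
    <module>core</module>
    <module>pom-gwt.xml</module>
  </modules>
</project>`)

	modules, err := parseModules(rootPOM)
	if err != nil {
		panic(err)
	}
	for _, m := range modules {
		fmt.Println(m)
	}
}
```

A directory walk that only looks for files named `pom.xml` would never see the `pom-gwt.xml` entry, which is why the `<module>` elements still need to be read.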
analyzers/maven/maven.go
Outdated
@@ -141,16 +141,26 @@ func (a *Analyzer) Analyze() (graph.Deps, error) {
	Command: a.Options.Command,
})
if err != nil {
	// Because this was a custom shell command, we do not fall back to any other strategies.
On the same topic as this, can we add a switch statement to handle user-defined strategies? These would be set in `a.Options.Strategy`. Right now we would have `mvn-tree` and `pom`.
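A rough sketch of what such a switch could look like follows. The `Options` struct, the `strategyFunc` type, and the `analyze` function are hypothetical stand-ins, and the behavior for an empty strategy (try the Maven command first, then read the POM file) is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Options mirrors the idea of a.Options.Strategy; the struct is hypothetical.
type Options struct {
	Strategy string
}

// strategyFunc stands in for whatever produces the dependency list.
type strategyFunc func() ([]string, error)

// analyze picks a strategy by name. Unknown names are rejected instead of
// silently falling back, so a typo in the configuration is noticed.
func analyze(opts Options, mvnTree, pomFile strategyFunc) ([]string, error) {
	switch opts.Strategy {
	case "mvn-tree":
		return mvnTree()
	case "pom":
		return pomFile()
	case "":
		// Assumed default: try the Maven command, then the POM file.
		if deps, err := mvnTree(); err == nil {
			return deps, nil
		}
		return pomFile()
	default:
		return nil, fmt.Errorf("unknown strategy %q", opts.Strategy)
	}
}

func main() {
	failingTree := func() ([]string, error) { return nil, errors.New("mvn not available") }
	fromPOM := func() ([]string, error) { return []string{"junit:junit"}, nil }

	deps, err := analyze(Options{}, failingTree, fromPOM)
	fmt.Println(deps, err)
}
```

Naming a strategy explicitly disables the fallback, which matches the idea that a user-defined choice should not be overridden.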
buildtools/maven/maven.go
Outdated
Version string `xml:"version"`

// Scope is where the dependency is used, such as "test" or "runtime".
Scope string `xml:"scope"`
Can we use this to support filtering on test dependencies? For example, gradle has `compile` and `implementation` deps which we look for and ignore all others. Would this be possible? For reference, other analyzers scan for production dependencies by default unless you specify something like `--all-dependencies`.
At first I was trying to not also do this in this PR because of PR #240 which addresses this question. But there's enough that I needed to change that I think that PR may not turn out to be very helpful. What do you think, do it here anyway?
FWIW, srclib-java applies scope to each dependency it returns, and by default, the fossa UI filters to ‘compile’ and ‘runtime’ deps. You can turn other scopes on in the project settings as well. Might be worth a discussion before merging this
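For discussion, a minimal sketch of scope filtering along these lines is below. The `Dependency` struct is a simplified stand-in for the one in the diff, and `isProductionScope` and `filterByScope` are hypothetical helpers. An empty scope is treated as `compile`, which is Maven's default when `<scope>` is omitted.

```go
package main

import "fmt"

// Dependency is a simplified, hypothetical version of the struct in the diff.
type Dependency struct {
	GroupID    string
	ArtifactID string
	Version    string
	Scope      string
}

// isProductionScope reports whether a scope is kept by default.
// Maven treats a missing <scope> as "compile".
func isProductionScope(scope string) bool {
	switch scope {
	case "", "compile", "runtime":
		return true
	default:
		return false
	}
}

// filterByScope keeps only production dependencies unless all is true,
// which plays the role of a flag such as --all-dependencies.
func filterByScope(deps []Dependency, all bool) []Dependency {
	if all {
		return deps
	}
	var kept []Dependency
	for _, d := range deps {
		if isProductionScope(d.Scope) {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	deps := []Dependency{
		{GroupID: "junit", ArtifactID: "junit", Version: "4.12", Scope: "test"},
		{GroupID: "com.google.guava", ArtifactID: "guava", Version: "27.0-jre"},
	}
	for _, d := range filterByScope(deps, false) {
		fmt.Printf("%s:%s (%q)\n", d.GroupID, d.ArtifactID, d.Scope)
	}
}
```

Keeping the scope on each returned dependency, as srclib-java does, would leave the filtering decision to the caller or the UI instead.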
(Keep in mind that I have 0 context on this feature and am just bored so decided to review some PRs)
Looks good! Let's add tests for `ToGraphDeps` once you move all the logic into the build tool. Other than that, this review is mostly style nits and questions about style choices.
case "maven-tree":
	return a.Maven.DependencyTree(a.Module.Dir, a.Module.BuildTarget)
default:
	if a.Options.Command != "" {
I'm not suggesting we do this in this PR, but we should add tests for the Maven shell command similar to how we did for Composer. This would enable us to accurately test the fallback method,
analyzers/maven/maven.go
Outdated
	return graph.Deps{},
		errors.Wrapf(err, "could not identify dependencies; original error: %v", mvnError)
}
return pom.ToGraphDeps(), nil
Can we make `ToGraphDeps` a private function so that `ResolveManifestFromBuildTarget` returns the dependency graph?
We need something like `ResolveManifestFromBuildTarget` to get just the contents of a POM file because we sometimes need things like the `groupId`, `artifactId`, and `parent` from it.
buildtools/maven/maven.go
Outdated
if err != nil {
	return graph.Deps{}, err
}
return graph.Deps{Direct: depsList(imports).toImports(), Transitive: depsMap(deps).toPkgGraph()}, nil
[nit] This might look better if you break the lines out.
graph.Deps{
	Direct:     depsList(imports).toImports(),
	Transitive: depsMap(deps).toPkgGraph(),
}, nil
I'm not a huge fan of type conversions in order to support methods, although this is another area where I don't have a lot of opinions. I think this would be much more clear if we were to make this:
dependencyListToImports(dependencyList)
func dependencyListToImports(imports []Dependency)
This still enforces the type of `[]Dependency` but allows us to remove both the type conversion and the type definition. Why is your instinct here to make this conversion and add a method?
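To compare the two styles side by side, here is a small sketch. The `Dependency` and `Import` structs are simplified, hypothetical stand-ins for the real types, and the way an import name is built from group and artifact is assumed for illustration only.

```go
package main

import "fmt"

// Dependency and Import are simplified, hypothetical stand-ins.
type Dependency struct {
	GroupID    string
	ArtifactID string
	Version    string
}

type Import struct {
	Name    string
	Version string
}

// Style in the diff: a named slice type that exists only to carry a method.
type depsList []Dependency

func (l depsList) toImports() []Import {
	return dependencyListToImports(l)
}

// Suggested style: a plain function over []Dependency. The parameter type is
// still enforced, and neither the extra type nor the conversion is needed.
func dependencyListToImports(deps []Dependency) []Import {
	imports := make([]Import, 0, len(deps))
	for _, d := range deps {
		imports = append(imports, Import{
			Name:    d.GroupID + ":" + d.ArtifactID,
			Version: d.Version,
		})
	}
	return imports
}

func main() {
	deps := []Dependency{{GroupID: "junit", ArtifactID: "junit", Version: "4.12"}}

	fmt.Println(depsList(deps).toImports())    // method on a converted value
	fmt.Println(dependencyListToImports(deps)) // plain function call
}
```

Both calls produce the same result; the difference is only in how much type machinery a reader has to follow at the call site.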
Agreed. This is something I too was going to revise as you suggest after thinking about it some more.
//go:generate bash -c "genny -in=$GOPATH/src/github.com/fossas/fossa-cli/graph/readtree.go gen 'Generic=Dependency' | sed -e 's/package graph/package maven/' > readtree_generated.go"

func ParseDependencyTree(stdin string) ([]Dependency, map[Dependency][]Dependency, error) {
func ParseDependencyTree(stdin string) (graph.Deps, error) {
Awesome! Thanks for taking care of this one too!
if err2 != nil {
	// Using buildTarget as a module ID or as a path to a manifest did not work.
	// Return just the error from running the mvn goal the first time.
	return "", errors.Wrap(err, "could not use Maven to list dependencies")
Should this fall back instead of returning an error?
There's nothing really to fall back to, since this is already the fallback method: when Maven cannot take our original `buildTarget` we try again but with `--projects=groupId:artifactId` and `--file=the-pom-file.xml`. If we can't identify the groupId or artifactId from the POM file, then there's nothing else we can do.
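The retry described here can be sketched roughly as follows. The `runner` and `resolver` function types, the `pomInfo` struct, and `listDependencies` are hypothetical names, and the exact argument order passed to Maven is an assumption; only the overall flow (first attempt, resolve the POM, second attempt, otherwise return the first error) follows the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// errMavenFailed is a placeholder error used by the fake runner in main.
var errMavenFailed = errors.New("mvn exited with an error")

// runner stands in for executing mvn with the given arguments.
type runner func(args ...string) (string, error)

// pomInfo holds the identifiers read from a POM file.
type pomInfo struct {
	GroupID    string
	ArtifactID string
	Path       string
}

// resolver stands in for reading a POM file located via the build target.
type resolver func(buildTarget string) (pomInfo, error)

// listDependencies tries the build target as given. If that fails, it retries
// with groupId:artifactId and an explicit POM path. If the POM cannot be
// resolved, the error from the first attempt is returned.
func listDependencies(run runner, resolve resolver, buildTarget string) (string, error) {
	out, err := run("dependency:list", "--projects", buildTarget)
	if err == nil {
		return out, nil
	}
	pom, err2 := resolve(buildTarget)
	if err2 != nil {
		return "", fmt.Errorf("could not use Maven to list dependencies: %w", err)
	}
	return run("dependency:list",
		"--projects", pom.GroupID+":"+pom.ArtifactID,
		"--file", pom.Path)
}

func main() {
	attempts := 0
	fakeRun := func(args ...string) (string, error) {
		attempts++
		if attempts == 1 {
			return "", errMavenFailed // the first attempt fails
		}
		return fmt.Sprint(args), nil
	}
	fakeResolve := func(string) (pomInfo, error) {
		return pomInfo{GroupID: "com.example", ArtifactID: "app", Path: "pom-gwt.xml"}, nil
	}
	fmt.Println(listDependencies(fakeRun, fakeResolve, "pom-gwt.xml"))
}
```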
Two important things had previously been overlooked:
- Module discovery stopped recursing once it found a `pom.xml` file, yet some codebases have Maven projects within one repository which are not Maven modules of any module or project higher in the tree. So now we will recurse down but keep track of the modules that we have already looked at.
- The `BuildTarget` field of FOSSA modules wasn't as useful as it could've been because it didn't consistently hold just the path to the corresponding Maven module (or "project"). We use it with the `--projects` flag when running commands such as `dependency:list` or `dependency:tree`, which should now take the relative path to a module or a module's POM file. Importantly, we no longer assume that only a "pom.xml" file can be the manifest for a Maven project, since a `<module>` element can point directly to such an XML file with another name.