Detect dependencies from build.gradle files #2822

Merged 2 commits on Mar 1, 2022

pombredanne
Member

* Create parser for dependency info from build.gradle files

Signed-off-by: Jono Yang <jyang@nexb.com>

Contributor

@tardyp tardyp left a comment

Thanks for this. I think it's going to be useful. I can foresee some problems with this:

  • we need tests with some example files, to see what works exactly. I can provide some twisted examples from my twisted Kotlin dev colleagues :)

  • build.gradle can be either Groovy or Kotlin.

  • those are programming languages, so developers factor out their deps. For me it is unreliable to just parse the file; you have to execute it.

  • Gradle supports transitive dependencies, so build.gradle only contains first-level (direct) deps.

  • ORT and WhiteSource do run Gradle to work. This is painful, as you have to provision your build container with both your build toolchain and the scanner toolchain.

For those reasons I advise my devs to use lockfiles like nebula.
Not all of them have made the switch, so this will certainly be useful.
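
To illustrate the point about factored-out deps, here is a minimal sketch (not the parser from this PR) that statically scans a Groovy-style `build.gradle` for string-literal coordinates. The regex, the `find_literal_deps` name, and the sample build script are all assumptions made up for this example. It picks up literal coordinates, but it cannot see a dependency whose version lives in a variable or in a version catalog.

```python
import re

# Matches e.g.  implementation 'group:name:1.0'  or  api("group:name:1.0").
# Only plain string literals are handled; "$" is excluded so that
# interpolated strings such as "g:n:${version}" are deliberately skipped.
DEP_RE = re.compile(
    r"""^\s*(?P<scope>\w+)\s*\(?\s*["'](?P<coords>[^"'$\n]+:[^"'$\n]+)["']""",
    re.MULTILINE,
)


def find_literal_deps(text):
    """Return (scope, group, name, version) tuples for literal coordinates."""
    deps = []
    for match in DEP_RE.finditer(text):
        parts = match.group("coords").split(":")
        if len(parts) not in (2, 3):
            continue
        version = parts[2] if len(parts) == 3 else None
        deps.append((match.group("scope"), parts[0], parts[1], version))
    return deps


# Hypothetical build script: two literal deps, one interpolated, one catalog ref.
SAMPLE = '''
def guavaVersion = '31.0.1-jre'
dependencies {
    implementation 'org.apache.commons:commons-lang3:3.12.0'
    testImplementation "junit:junit:4.13.2"
    implementation "com.google.guava:guava:${guavaVersion}"
    implementation libs.okhttp
}
'''

deps = find_literal_deps(SAMPLE)
for dep in deps:
    print(dep)
```

Only the two literal coordinates are reported here; the guava and okhttp entries are missed, which is the kind of gap that executing the build or reading a lockfile would close.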

from packageurl import PackageURL
from pygments import highlight
from pygments.formatter import Formatter
from pygments.lexers import GroovyLexer
Contributor

interesting use of pygments :)

Member Author

@tardyp the idea of https://github.com/nexB/pygmars/ is to use it in tandem with pygments to quickly create lightweight parsers for programming languages.

This is used for instance to lex bash in https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/bashlex.py and parse it in https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/bashparse.py
In this case it enables lightweight Bash parameter expansion to parse the APKBUILD scripts from Alpine Linux (to collect metadata) in https://github.com/nexB/scancode-toolkit/blob/9ed2cb4e78a8b1b138bcb201f50203b43098febb/src/packagedcode/alpine.py#L161

The same approach will be used whenever there is a need to parse code to extract structured data, as this is easier to read and maintain than a bunch of regexes. The upcoming Yocto/BitBake support can likely benefit from some of it too... and we should also be able to parse Autotools scripts that contain interesting metadata.
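
As a rough illustration of the lex-then-parse idea, the sketch below first turns a Groovy or Kotlin snippet into labelled tokens and then matches a small pattern over the token labels rather than over raw text. This is a stdlib-only stand-in: it does not use pygments or pygmars and it is not the code from this PR. The token names, the `lex` and `parse_deps` helpers, and the sample input are hypothetical.

```python
import re

# Stage 1: a tiny lexer. Each alternative is a named token type.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>//[^\n]*)
  | (?P<STRING>'[^'\n]*'|"[^"\n]*")
  | (?P<NAME>[A-Za-z_][\w.]*)
  | (?P<PUNCT>[{}(),:=])
  | (?P<WS>\s+)
  | (?P<OTHER>.)
""", re.VERBOSE)


def lex(text):
    """Yield (label, value) pairs, dropping whitespace and comments."""
    for match in TOKEN_RE.finditer(text):
        if match.lastgroup not in ("WS", "COMMENT"):
            yield match.lastgroup, match.group()


# Stage 2: a tiny "grammar" over token labels: NAME, optional "(", STRING.
def parse_deps(tokens):
    """Collect (scope, coordinates) where a NAME is followed by a STRING."""
    tokens = list(tokens)
    deps = []
    for i, (label, value) in enumerate(tokens):
        if label != "NAME":
            continue
        j = i + 1
        if j < len(tokens) and tokens[j] == ("PUNCT", "("):
            j += 1  # Kotlin-style call syntax: api("...")
        if j < len(tokens) and tokens[j][0] == "STRING":
            coords = tokens[j][1][1:-1]  # strip the surrounding quotes
            if ":" in coords:
                deps.append((value, coords))
    return deps


SNIPPET = '''
dependencies {
    // implementation 'commented:out:1.0'
    implementation 'org.slf4j:slf4j-api:1.7.36'
    api("com.squareup.okhttp3:okhttp:4.9.3")
}
'''

parsed = parse_deps(lex(SNIPPET))
print(parsed)
```

Because comments are discarded at the token level, the commented-out dependency never reaches the matching stage, which is harder to guarantee when a single regex is run over the raw file.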

Member Author

Note that this is not perfect... and it is still only static. Next up will be to enable dependency resolution on the collected deps. This may require some live network access, but that will be optional, with proxy support for air-gapped environments ;)

@JonoYang
Member

@tardyp Thanks for the review! I've got a script going that uses pygmars to roughly glean what I can from build.gradle files. I'll bring the code over.

    * Create parser for dependency info from build.gradle files

Signed-off-by: Jono Yang <jyang@nexb.com>
    * Move build.gradle parsing code to new source file
    * Create tests for build.gradle parsing

Signed-off-by: Jono Yang <jyang@nexb.com>
@pombredanne
Member Author

@tardyp Some more thoughts in reply to your insightful comments:

> Thanks for this. I think it's going to be useful. I can foresee some problems with this:
>
> * we need tests with some example files, to see what works exactly. I can provide some twisted examples from my twisted Kotlin dev colleagues :)

This would be great if you can attach these.

> * build.gradle can be either Groovy or Kotlin.

This may also work with Kotlin, but the devil is in the details.

> * those are programming languages, so developers factor out their deps. For me it is unreliable to just parse the file; you have to execute it.

Yeah, that's the problem with these executable scripts...

> * Gradle supports transitive dependencies, so build.gradle only contains first-level (direct) deps.
> * ORT and WhiteSource do run Gradle to work. This is painful, as you have to provision your build container with both your build toolchain and the scanner toolchain.

So the plan here would be to:

  1. enable resolving deps WITHOUT running gradle (or in general a package manager) using DependentCode, univers and purl/vers

     This is not going to be a perfect solution in all cases.

  2. also run tools using gradle minimally with containers (possibly borrowing or reusing code from ORT) to collect a full dependency tree

  3. also process the lockfiles...
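
As a small illustration of the first step in that plan, statically collected `group:name:version` coordinates can be turned into Package URLs before any resolution happens. The sketch below is a hand-rolled approximation of the `pkg:maven/...` string layout, written only with the standard library; the `maven_purl` helper is hypothetical, and real code would more likely rely on the packageurl-python library that the diff already imports.

```python
from urllib.parse import quote


def maven_purl(coords):
    """Turn 'group:name[:version]' into a pkg:maven/... string.

    Hand-rolled for illustration: qualifiers, subpaths and Gradle
    classifiers (a fourth ':' segment) are ignored.
    """
    parts = coords.split(":")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a group:name[:version] string: {coords!r}")
    group, name = parts[0], parts[1]
    purl = f"pkg:maven/{quote(group, safe='')}/{quote(name, safe='')}"
    if len(parts) > 2 and parts[2]:
        purl += "@" + quote(parts[2], safe="")
    return purl


print(maven_purl("org.apache.commons:commons-lang3:3.12.0"))
print(maven_purl("junit:junit"))
```

A dependency without a version, as in the second call, yields a purl with no `@version` part; choosing a concrete version for it is the resolution problem the plan describes.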

> For those reasons I advise my devs to use lockfiles like nebula. Not all of them have made the switch, so this will certainly be useful.

which would be super useful to have as a parser and having some test files would be super useful too. Do you think nebula is the most common one? Any other common lockfile format otherwise?
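
For discussion, a lockfile parser could look roughly like the sketch below. It assumes a JSON layout of configuration name, then `group:name`, then an object with a `locked` version, which is my understanding of what the nebula dependency-lock plugin writes to `dependencies.lock`; treat that layout, the sample content, and the `locked_deps` helper as assumptions to be checked against real test files.

```python
import json

# Hypothetical lockfile content following the assumed layout:
# {configuration: {"group:name": {"locked": version, ...}}}
SAMPLE_LOCK = '''
{
  "compileClasspath": {
    "com.google.guava:guava": {"locked": "31.0.1-jre", "requested": "31.+"}
  },
  "testCompileClasspath": {
    "com.google.guava:guava": {"locked": "31.0.1-jre"},
    "junit:junit": {"locked": "4.13.2"}
  }
}
'''


def locked_deps(lock_text):
    """Return sorted unique (group, name, version) tuples across configurations."""
    data = json.loads(lock_text)
    found = set()
    for entries in data.values():
        for coords, info in entries.items():
            version = info.get("locked")
            if version is None:
                continue  # entry without a pinned version: nothing to report
            group, _, name = coords.partition(":")
            found.add((group, name, version))
    return sorted(found)


locked = locked_deps(SAMPLE_LOCK)
for dep in locked:
    print(dep)
```

Unlike the static scan of build.gradle, every entry here carries a pinned version, so no execution or resolution is needed to report it.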

Member Author

@pombredanne pombredanne left a comment

LGTM!

@pombredanne
Member Author

See also #2761

@pombredanne pombredanne merged commit a07a5fb into develop Mar 1, 2022
@pombredanne pombredanne deleted the detect-build-gradle branch March 1, 2022 18:37
@pombredanne
Member Author

pombredanne commented Mar 1, 2022

I am keeping #793 (comment) open until we have explicit tests for Kotlin

@tardyp
Contributor

tardyp commented Mar 2, 2022

> which would be super useful to have as a parser and having some test files would be super useful too. Do you think nebula is the most common one? Any other common lockfile format otherwise?

FWIW, we do have some nebula support code, and we do need to find some slot in our agenda to upstream it :-}

for the rest, sounds like a plan!

@pombredanne
Member Author

> FWIW, we do have some nebula support code, and we do need to find some slot in our agenda to upstream it :-}

Awesome... I look forward to it!
