Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an add-on pipeline for collecting dwarfs from elfs #1068

Merged
merged 15 commits into from
Feb 19, 2024
Merged

Conversation

TG1999
Copy link
Contributor

@TG1999 TG1999 commented Feb 1, 2024

Fixing aboutcode-org/purldb#260 (comment)

Create a PURL a service backed by new SCIO analysis pipelines that takes a PURL for a native binary package as an input, and returns the DWARF debug symbol compilation unit paths when available. This will specifically be worked out for Debian native binary packages.

Copy link
Contributor Author

@TG1999 TG1999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits for my consideration

scanpipe/models.py Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
scanpipe/tests/test_models.py Outdated Show resolved Hide resolved
scanpipe/tests/test_models.py Outdated Show resolved Hide resolved
Copy link
Contributor

@tdruez tdruez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR lacks context.

  • Is it a duplicate of https://github.com/nexB/scancode.io/pull/819/files ?
  • What's the goal of simply collecting dwarf to put on extra_data ?
  • Missing changelog entry
  • Missing pipeline documentation
  • Should this pipeline be included with the default built-in ones?

scanpipe/tests/test_models.py Outdated Show resolved Hide resolved
scanpipe/pipelines/get_dwarfs_from_elfs.py Outdated Show resolved Hide resolved
docs/built-in-pipelines.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@tdruez tdruez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999 You haven't address my concerns:

@pombredanne
Copy link
Member

Is it a duplicate of https://github.com/nexB/scancode.io/pull/819/files ?

#819 was an early end to end prototype. We should reuse and inform the code here from this experiment alright and later close #819 unmerged

@pombredanne
Copy link
Member

What's the goal of simply collecting dwarf to put on extra_data ?
Eventually a d2d for ELFs and other binaries.... there #819 was a proof of concept

Should this pipeline be included with the default built-in ones?

We could start a contrib section?

@pombredanne
Copy link
Member

So #819 is actually doing the d2d and not the data collection. This is the data collection part

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TG1999 TG1999 marked this pull request as ready for review February 19, 2024 05:06
TG1999 and others added 9 commits February 19, 2024 10:37
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Copy link
Contributor

@tdruez tdruez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Shouldn't we use the "ELF" case/syntax everywhere instead of "Elf"?
  2. Why are we reverting to an older version of python-inspector?

The code looks generally fine, but let me re-iter my concern about including those very specialized pipelines along the default ones.

setup.cfg Outdated Show resolved Hide resolved
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@pombredanne
Copy link
Member

The code looks generally fine, but let me re-iter my concern about including those very specialized pipelines along the default ones.

I think this is OK. We need these in first in and we can refactor code at a later stage. The general direction will be eventually to subsume these pipelines in larger binary and source symbols extraction pipelines and pipes. But we need to walk before we run! And if we do not have these handy in one place until then, we are wasting time with extra packaging and dependency management.

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@@ -1948,7 +1948,7 @@ def has_directory_content_fingerprint(self):

def elfs(self):
"""
Resources that are ``files`` and their filetype starts with "elf" and
Resources that are ``files`` and their filetype starts with "ELF" and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not correct, the string is still "elf"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@@ -27,7 +27,7 @@
from scanpipe.pipelines import Pipeline


class InspectElfBinaries(Pipeline):
class InspectELFBinaries(Pipeline):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to update the references to this as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

TG1999 and others added 3 commits February 19, 2024 23:22
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@tdruez tdruez merged commit 0b0fc4a into main Feb 19, 2024
7 checks passed
@tdruez tdruez deleted the dwarfs_from_elfs branch February 19, 2024 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants