Skip to content

Conversation

tromai
Copy link
Member

@tromai tromai commented Mar 22, 2023

Closes #95 .

Tasks:

  • Add the parameters to accept the SBOM path.
  • Extract the dependencies from the SBOM content and run the analysis.
  • Add tests for this feature.
  • Update README.md

@tromai tromai linked an issue Mar 22, 2023 that may be closed by this pull request
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Mar 22, 2023
@tromai tromai changed the title feat: running the analysis with a CycloneDx SBOM feat: support an existing SBOM as input Mar 22, 2023
@tromai tromai force-pushed the 95-accept-an-sbom-as-the-input-for-macaron branch from 7bed406 to 54bffc5 Compare March 22, 2023 04:53
@tromai tromai changed the base branch from main to staging March 22, 2023 05:09
@tromai tromai marked this pull request as ready for review March 22, 2023 07:02
@tromai tromai requested a review from behnazh-w March 22, 2023 07:02
@@ -71,7 +71,9 @@ def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None
# Get user config from yaml file
run_config = YamlLoader.load(analyzer_single_args.config_path)

status_code = analyzer.run(run_config, analyzer_single_args.skip_deps)
# TODO(tromai): Support running the analysis with the SBOM without having to provide the repo path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add your name :)
Also, it would be better to add this item as an issue to GitHub and not as a TODO here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue has been created here - #108. I will remove this TODO line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9d7a5f

def _get_deps_from_dep_analyzer(
self, main_ctx: AnalyzeContext, dep_analyzer: DependencyAnalyzer, working_dirs: Iterable[Path]
) -> dict[str, DependencyInfo]:
"""Get the dependencies by running the Dependency Analyzer for the target repo."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the full docstring. Also not sure why this function starts with _. Are you making assumptions that would make it unsuitable to call the function from other modules?

Same comment for the previous function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9d7a5f

@@ -234,21 +237,22 @@ def generate_reports(self, report: Report) -> None:
for reporter in self.reporters:
reporter.generate(output_target_path, report)

def resolve_dependencies(self, main_ctx: AnalyzeContext) -> dict[str, DependencyInfo]:
def resolve_dependencies(self, main_ctx: AnalyzeContext, sbom_path: str) -> dict[str, DependencyInfo]:
"""Resolve the dependencies of the main target repo.

Parameters
----------
main_ctx : AnalyzeContext
The context of object of the target repository.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for newline here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9d7a5f


return deps_resolved

def _get_deps_from_sbom(self, sbom_path: str) -> dict[str, DependencyInfo]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a unit test for this function.

Copy link
Member Author

@tromai tromai Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I think it is better if I explain my design regarding these two functions. This comment also addresses other feedback:

  • The reason I put these two functions as private functions because they only exist at the moment to support extracting the dependencies from the SBOM if it is given by the user. With regard to Support running the analysis with the SBOM without having to provide the repo path #108, these two will definitely be updated to better handle the SBOM and the analysis of Macaron. Therefore, I think it makes sense to have them as private functions because they are very likely to change and shouldn't be called from other modules for now.
  • From those reasons, it's also makes sense to only have integration tests to capture the behaviors that we want for this feature instead of having a unit tests to cover something that is likely to change (which I think will save us from rewriting that unit test). In additions, the _get_deps_from_sbom function only reuses existing APIs (get_dep_components and convert_components_to_artifacts which already have unit tests for them).

Please let me know your thoughts on this Behnaz.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I put these two functions as private functions because they only exist at the moment to support extracting the dependencies from the SBOM if it is given by the user. With regard to #108, these two will definitely be updated to better handle the SBOM and the analysis of Macaron.

get_deps_from_sbom should be a fairly robust function and callable from external modules if necessary. That's independent of #108 where we need to change/add new workflows entirely to avoid requiring a target repo. That's why a unit test is necessary to make sure we can get final dependencies from this function.

the _get_deps_from_sbom function only reuses existing APIs (get_dep_components and convert_components_to_artifacts which already have unit tests for them).

That's fine. The goal here is to compare the final dependencies.

Copy link
Member Author

@tromai tromai Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fairly robust function and callable from external modules if necessary

In this case, do you think it's better if we put this function inside the cyclonedx.py file to be an API instead ? The Analyzer would then only need to call it when trying to get the dependencies.
As for the function _get_deps_from_dep_analyzer below, it originally exists inside the public function Analyzer.resolve_dependencies() . I extract it out to prevent the code from clustering too much. So should it be a private function because it should be reached only from calling Analyzer.resolve_dependencies()? 🤔 Or it makes more sense to make it public.

Copy link
Member

@behnazh-w behnazh-w Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, do you think it's better if we put this function inside the cyclonedx.py file to be an API instead ?

Yes, that would make sense. It would be just a simple function like this

return convert_components_to_artifacts(get_dep_components(Path(sbom_path)))

You could refactor the tests to call this new function. Currently the unit tests call convert_components_to_artifacts(get_dep_components(Path(sbom_path))) to check SBOM files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the function _get_deps_from_dep_analyzer below, it originally exists inside the public function Analyzer.resolve_dependencies() . I extract it out to prevent the code from clustering too much. So should it be a private function because it should be reached only from calling Analyzer.resolve_dependencies()? thinking Or it makes more sense to make it public.

After having a closer look, I think that _get_deps_from_dep_analyzer is not needed. We only need to add this block to the beginning of Analyzer.resolve_dependencies() function without changing else.

if sbom_path:
            logger.info("Getting the dependencies from the SBOM defined at %s.", sbom_path)
            return get_deps_from_sbom(sbom_path)

Copy link
Member Author

@tromai tromai Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9d7a5f

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be more suitable to be added as a dependency analyzer test and not e2e test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's more related to the dependency analyzer tests. Do you think that we only need to check the dependencies.json file only in this case? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's more related to the dependency analyzer tests. Do you think that we only need to check the dependencies.json file only in this case? thinking

Yes, dependencies.json would be good enough.

Copy link
Member Author

@tromai tromai Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9d7a5f

@tromai tromai force-pushed the 95-accept-an-sbom-as-the-input-for-macaron branch from bffab4f to e9d7a5f Compare March 23, 2023 02:37
README.md Outdated
```

The main input parameters of the `analyze` command:
- `-rp <path_to_target_repo>` specifies the path to the repository to analyze. This path can be both a local path (e.g. `/path/to/repo`) or a remote path (e.g. `git@github.com:organization/repo_name.git` or `https://github.com/organization/repo_name.git`).
- `[-b <branch_name>]`: The name of the branch to analyze. If not specified, Macaron will checkout the default branch (if a remote path is supplied) or use the current branch that HEAD is on (if a local path is supplied).
- `[-d <digest>]`: The hash of the commit to checkout in the current branch. This hash must be in full form. If not specified, Macaron will checkout the latest commit of the current branch.
- `-c <config_path>`: The path to the configuration yaml file. This option cannot be used together with the `-rp` option.
- `[-sbom <sbom_path>]`: The path to the CycloneDX format SBOM for analyzing the dependencies of the target repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `[-sbom <sbom_path>]`: The path to the CycloneDX format SBOM for analyzing the dependencies of the target repo.
- `[-sbom <sbom_path>]`: The path to the CyclondeDX SBOM of the target repo.

README.md Outdated
@@ -97,6 +98,7 @@ The results of the examples above will be stored in ``output/reports/github_com/
- Macaron automatically detects and analyzes direct dependencies for Java Maven projects. This process might take a while during the first run but will be faster during the subsequent runs. To skip analyzing the dependencies you can pass ``--skip-deps`` option.
- If you supply a remote path, the repository is cloned to `git_repos/` before the analysis starts. If the repository has already been cloned to `git_repos/`, Macaron will not clone the repository and proceed to analyze the local repo instead.
- The `branch` and `digest` in the config file or `-b` and `-d` in the CLI are all optional and can be omitted.
- At the moment, if an SBOM is provided with `-sbom`, Macaron uses the dependency data within the SBOM instead of running the dependency plugins against the target repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- At the moment, if an SBOM is provided with `-sbom`, Macaron uses the dependency data within the SBOM instead of running the dependency plugins against the target repo.
- If an SBOM is provided via `--sbom-path` option, Macaron will not detect the dependencies automatically. Note that `--skip-deps` option disables dependency analysis even if an SBOM file is provided.

@tromai tromai merged commit e569bb4 into staging Mar 23, 2023
@tromai tromai deleted the 95-accept-an-sbom-as-the-input-for-macaron branch March 23, 2023 05:20
art1f1c3R pushed a commit that referenced this pull request Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Accept an SBOM as the input for Macaron
2 participants