Refactor Package Manifest Parsing into a Reader Interface #21

abhisek · 2023-02-19T12:19:29Z

Currently different operations are performed to read package manifests from:

Directory
Lockfiles
JSON Dump

In future, we may need to be able to read from SBOM (SPDX, CycloneDX). To be able to ensure separation of concerns, we should

Define a reader interface for reading package manifests
Have implementation for different sources of package manifest

abhisek · 2023-04-06T21:07:46Z

Overview

vet has multiple source of input for packages to be analysed. It can

Scan a directory to read supported package manifests (e.g. pom.xml)
Parse a list of package manifests provided as input (using --lockfile option)
Scan a directory for JSON files (generated with --json-dump-dir option)

It also supports the undocumented feature of reading declared dependencies from a python binary wheel file. In future, it will support binary artifacts such as jar, gem, docker image etc. as source of input.

The current implementation for this package manifest reading capability is strongly coupled with the scanner package and is treated like utility functions. This causes the unnecessary need for scanner package to change whenever we add a new package manifest source. In reality, the package reading capability is large enough to be an independent concern to be used by vet scanner as well as by other tools needing such capability.

Goal

The goal of this feature is to refactor the package manifest reader logic implemented in scanner package into a package of its own with a standardized interface. This ensures separation of concerns, loose coupling and improves developer experience by making it easy to implement new data sources of reading package manifests.

Requirements

Define an interface to standardize the contract for a package manifest reader
Refactor existing manifest reader functions as implementation of the standard interface
Apply dependency inversion principle to decouple scanner from details of package manifest readers

Benefits

Separation of concern (non-functional)
Improved developer experience for package manifest reader development
Re-usability of package manifest readers outside of vet
Improved testability of package manifest readers as independent concerns

Design

Current Design

Proposed Design

Proposed Interface

The readers implementation will work on following core models:

Package Manifest
Package

Following ISP, we will define 2 interfaces:

Package Manifest Reader Interface
Package Reader Interface

[1] will be used to read manifest(s) from a source such as directory or a single binary artifact (e.g. jar). [1] will expose the package manifest entity and a default implementation of [2] to the caller. [2] in turn is a contract for reading packages from a single manifest, hiding the parser and other details.

API Changes

The package manifest reader interface will introduce a new API. This API will be used by implementations (e.g. directory based package manifest loaders). The scanner package will depend on this API to read package manifests for analysis.

Data Model Changes

No data model change is required for this feature. The package manifest interface will work on the existing core models.

UI Changes

No UI change is required for this feature

Implementation

The implementation will extend the reader interface implemented as a stop gap solution for #13

The implementation will hide all details regarding reading a manifest and subsequently packages from the manifest for rest of the system (vet) while honoring global concerns such as exceptions management

Following package manifest readers will be implemented following the reader interface (contract):

Directory based reader
Lockfiles reader
JSON dump reader

The UI adapter for scan command need to change to initialize appropriate package manifest readers based on command line arguments. The list of package manifest readers will subsequently used to initialize the scanner

Testing

Test cases will be implemented following Go Table Driven Testing for each implementation of package manifest readers. The impact of change is limited to package manifest loading.

Deployment

The feature will be first merged with the main branch which will trigger CI/CD workflow for container image release. Subsequently the feature will be rolled out as a versioned release during the next release schedule.

Documentation

No user documentation change is needed. Code documentation will be written following godoc conventions for generating developer documentation as required.

abhisek added the enhancement New feature or request label Feb 19, 2023

abhisek self-assigned this Apr 7, 2023

abhisek linked a pull request Apr 9, 2023 that will close this issue

Refactor Package Readers into a Standard Interface #53

Merged

abhisek mentioned this issue Apr 10, 2023

Refactor Package Readers into a Standard Interface #53

Merged

abhisek closed this as completed in #53 Apr 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor Package Manifest Parsing into a Reader Interface #21

Refactor Package Manifest Parsing into a Reader Interface #21

abhisek commented Feb 19, 2023

abhisek commented Apr 6, 2023 •

edited

Loading

Refactor Package Manifest Parsing into a Reader Interface #21

Refactor Package Manifest Parsing into a Reader Interface #21

Comments

abhisek commented Feb 19, 2023

abhisek commented Apr 6, 2023 • edited Loading

Overview

Goal

Requirements

Benefits

Design

Current Design

Proposed Design

Proposed Interface

API Changes

Data Model Changes

UI Changes

Implementation

Testing

Deployment

Documentation

abhisek commented Apr 6, 2023 •

edited

Loading