feat: add experimental offline mode #183

G-Rath · 2023-02-03T18:31:43Z

Resolves #81

This is based off a lot of the core of the detector - it's not working yet because I need to figure out how to handle passing the queries to the local db given that the detector takes PackageDetails, but really the key thing there is how to handle PURLs, which come from SBOMs that I don't really know how to use 😅 (idk if I'm just dumb or what, but for some reason I've still not been able to figure out how to accurately generate one from a Gemfile.lock, package-lock.json, etc)

If someone could provide some sample SBOMs that would be very useful (I'll also do a PR adding tests using them as fixtures), and also happy to receive feedback on the general approach - there are some smaller bits to discuss, like if fields should be omitted from the JSON output vs an empty array, and the Describe related stuff too.

This is now working, though personally it feels pretty awkward codewise - I know I'm biased but I feel like it would be better to try to bring across the whole database package from the detector, as the API db is pretty much the same and then you'd have support for zips, directories, and the API with extra configs like working directories + an extensive test suite for all three (I don't think it would be as painful as one might first think, even with osvscanner having just been made public, because that's relatively small).

Still, this does work as advertised - there are definitely a few things that could do with some cleaning up (including if fields should be omitted from the JSON output vs an empty array, and the Describe related stuff too) but I'm leaving them for now until I hear what folks think of the general implementation + my above comment.

I've also gone with two boolean flags rather than the url-based flag @oliverchang suggested because I didn't feel comfortable trying to shoehorn that into this PR as well, and now that we're using --experimental it should be fine to completely change these flags in future.
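As a rough illustration of the two-boolean approach, here is a minimal sketch of how the flags could interact, following the commit in this PR titled "offline implies comparing locally, and comparing locally implies… no git". The struct and field names (`ExperimentalOptions`, `CompareLocally`, `CompareOffline`, `SkipGit`) and the `Normalize` helper are assumptions made for the example, not the scanner's actual API.

```go
package main

import "fmt"

// ExperimentalOptions is a hypothetical stand-in for the experimental flags
// discussed in this PR; the field names are assumptions, not the real API.
type ExperimentalOptions struct {
	CompareLocally bool // match against downloaded databases instead of the API
	CompareOffline bool // never make network requests
	SkipGit        bool // skip commit-based matching
}

// Normalize applies the implications named in the commit title:
// offline implies comparing locally, and comparing locally implies no git.
func Normalize(o ExperimentalOptions) ExperimentalOptions {
	if o.CompareOffline {
		o.CompareLocally = true
	}
	if o.CompareLocally {
		o.SkipGit = true
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", Normalize(ExperimentalOptions{CompareOffline: true}))
}
```

Keeping the implications in one place means callers only set the flag they care about, and the flags can still be reshaped later while they remain experimental.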

oliverchang

nice, thanks! some initial comments.

pkg/models/vulnerabilities.go

internal/offline/zip.go

pkg/models/vulnerabilities_test.go

pkg/models/vulnerability.go

cmd/osv-scanner/main.go

oliverchang · 2023-02-07T06:56:17Z

> @oliverchang thanks for the initial review - a few of the things you've flagged I already know about but have deliberately left to show a possible future and their use-case is actually already being introduced in parallel PRs.
>
> For now I'd like to focus on the overall direction, after which I'll adjust the whole PR to match and then you can review the finer points, but until then you don't need to spend time reviewing those :)

Sure! What kind of feedback re direction are you looking for in particular? The general structure? (It is a little hard to tease out all of these given the unrelated changes this seems to be bringing in :) ). That said I'll review this in a bit more detail later this week.

> You've already started to touch on that by asking for a Db interface, but I don't think that's worth doing until there are multiple DBs since it's all internal - having said that, the detector already has three DBs including one for the API, so I'd be happy to port that across in which case you'd have your interface.

That seems like an easy/cheap thing to do today in a way that allows easy future extension, even if these interfaces are all private?
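To make the interface suggestion concrete, here is a minimal sketch of what a small private database interface could look like. All names (`DB`, `Check`, `memoryDB`, `scan`) and the simplified `PackageDetails` and `Vulnerability` types are hypothetical; the toy implementation ignores versions entirely and only shows how calling code can stay independent of where the data comes from.

```go
package main

import "fmt"

// Vulnerability and PackageDetails are simplified, hypothetical types.
type Vulnerability struct{ ID string }

type PackageDetails struct {
	Name, Version, Ecosystem string
}

// DB is a hypothetical private interface: anything that can answer
// "which known vulnerabilities affect this package?".
type DB interface {
	Check(pkg PackageDetails) ([]Vulnerability, error)
}

// memoryDB is a toy implementation backed by a map keyed on "ecosystem/name".
// It deliberately ignores versions to keep the sketch short.
type memoryDB struct {
	byPackage map[string][]Vulnerability
}

func (db memoryDB) Check(pkg PackageDetails) ([]Vulnerability, error) {
	return db.byPackage[pkg.Ecosystem+"/"+pkg.Name], nil
}

// scan depends only on the interface, so a zip-backed, directory-backed, or
// API-backed database could be swapped in without changing it.
func scan(db DB, pkgs []PackageDetails) (map[string][]Vulnerability, error) {
	results := make(map[string][]Vulnerability)
	for _, pkg := range pkgs {
		vulns, err := db.Check(pkg)
		if err != nil {
			return nil, fmt.Errorf("checking %s: %w", pkg.Name, err)
		}
		if len(vulns) > 0 {
			results[pkg.Name] = vulns
		}
	}
	return results, nil
}

func main() {
	db := memoryDB{byPackage: map[string][]Vulnerability{
		"npm/example-pkg": {{ID: "EXAMPLE-0001"}},
	}}
	results, err := scan(db, []PackageDetails{
		{Name: "example-pkg", Version: "1.0.0", Ecosystem: "npm"},
		{Name: "other-pkg", Version: "2.0.0", Ecosystem: "npm"},
	})
	fmt.Println(results, err)
}
```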

G-Rath · 2023-07-02T21:19:07Z

@oliverchang @another-rex I think you can probably start doing some reviewing if you have time this week - I've still got to add some CLI tests and do a (maybe final) review of the whole thing against our design doc, but I don't think there's anything major missing.

If nothing else, I would like to know your thoughts on ScannerActionsExperimental - ideally I'd like to actually land that in its own PR pulling in the existing experimental flags, so if you're happy with that I'll get that done later this week too

pkg/models/vulnerability.go

another-rex

Thanks! Added some initial comments.

Also:

- What is the performance of the localdb when doing the local vuln matching like, is there any need for further optimisations, or is it already pretty fast as it is?
- We could add some logging or a loading indicator when downloading the local zip files.

pkg/models/vulnerability.go

internal/offline/zip.go

G-Rath · 2023-07-03T19:47:44Z

> What is the performance of the localdb when doing the local vuln matching like, is there any need for further optimisations, or is it already pretty fast as it is?

This hasn't passed the @spencerschrock check yet, but generally downloading the databases (which takes at least 5-10 seconds) is slower than the rest of the process so yes it's pretty fast.

I am interested in hearing how it performs against all the scorecard repos though because while I tried to benchmark I could have missed something 🤷

> We could add some logging or a loading indicator when downloading the local zip files.

Whoops yeah adding some logs is another thing I still need to do 😅
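Since downloading dominates the runtime, skipping unnecessary downloads matters, and commits in this PR mention only re-downloading databases when their contents have changed, checking the response status code, and closing the body regardless of status. The sketch below shows those three ideas using a standard HTTP conditional request (`ETag` / `If-None-Match`). That mechanism is an assumption chosen for illustration and is not necessarily how this PR detects changes; `fetchIfChanged` and `newDemoServer` are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchIfChanged downloads url unless the server reports that the copy
// identified by etag is still current. It returns the body, the new ETag,
// and whether a fresh copy was downloaded.
func fetchIfChanged(url, etag string) ([]byte, string, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, "", false, err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, "", false, err
	}
	// Close the body regardless of the status code.
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusNotModified:
		return nil, etag, false, nil // the cached copy is still good
	case http.StatusOK:
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return nil, "", false, err
		}
		return body, resp.Header.Get("ETag"), true, nil
	default:
		return nil, "", false, fmt.Errorf("unexpected status %s fetching %s", resp.Status, url)
	}
}

// newDemoServer serves a fixed payload with a fixed ETag, for demonstration only.
func newDemoServer() *httptest.Server {
	const tag = `"v1"`
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == tag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", tag)
		fmt.Fprint(w, "zip-bytes")
	}))
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	_, tag, changed, err := fetchIfChanged(srv.URL, "")
	fmt.Println("first fetch downloaded:", changed, err)

	_, _, changed, err = fetchIfChanged(srv.URL, tag)
	fmt.Println("second fetch downloaded:", changed, err)
}
```

A log line at the point where a fresh copy is downloaded would be a natural place for the loading indicator suggested above.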

spencerschrock · 2023-07-05T18:12:24Z

> What is the performance of the localdb when doing the local vuln matching like, is there any need for further optimisations, or is it already pretty fast as it is?
>
> This hasn't passed the @spencerschrock check yet, but generally downloading the databases (which takes at least 5-10 seconds) is slower than the rest of the process so yes it's pretty fast.
>
> I am interested in hearing how it performs against all the scorecard repos though because while I tried to benchmark I could have missed something 🤷

From the Scorecard weekly analysis side of things, I imagine a local/cached copy could certainly speed things up and would eliminate a lot of API traffic to osv (assuming we can configure it via our current entry point of DoScan).

I'd be interested in seeing how much the lack of commit-based matching hurts.

oliverchang · 2023-07-10T00:46:14Z

> From the Scorecard weekly analysis side of things, I imagine a local/cached copy could certainly speed things up and would eliminate a lot of API traffic to osv (assuming we can configure it via our current entry point of DoScan).
>
> I'd be interested in seeing how much the lack of commit-based matching hurts.

Today, perhaps not so much. But in a few months we'll have a large chunk of the NVD imported that will be very useful for C/C++ in general where source/commit-based matching is the best we have.

pkg/models/vulnerability.go

another-rex

Mostly looks good! Left some more comments.

cmd/osv-scanner/main.go

internal/offline/check.go

cmd/osv-scanner/main_test.go

internal/offline/check.go

cmd/osv-scanner/main_test.go

G-Rath · 2023-07-29T21:50:33Z

Ok this should be good to go now I think. The main discussion point I'm expecting is the introduction of an option for setting the local db path, which we'd previously decided would only be configurable by an env variable. I've since realised that not having it makes testing harder, not just for the scanner as a CLI but for anyone using the library who wants to write their own tests.

Also I've deliberately not:

- added many CLI tests, because the core behaviour is already covered by more package-specific tests and it's a lot more work to craft CLI tests to cover the same sorts of things
- done anything fancy with messaging output or loading handling, like tracking if a db for an ecosystem failed to load, as I think there are some pros/cons to the different ways to handle it and that this is good enough for a first implementation, so I'm leaving them as follow-up changes if/when needed (which can be based on user feedback)

Finally, I think this has discovered a bug with the sbom parsing, as it looks like it reports the ecosystem for Alpine packages as APK instead of Alpine.
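The mismatch arises because a Package URL names a package type (`apk` for Alpine packages), while OSV databases are keyed by ecosystem name (`Alpine`). A minimal sketch of that translation step follows; the `ecosystemForPURLType` helper and the map are hypothetical, cover only a handful of types, and are not the change that was made to the scanner.

```go
package main

import (
	"fmt"
	"strings"
)

// purlTypeToEcosystem maps a few Package URL types to OSV ecosystem names.
// The list is intentionally incomplete; it exists only to illustrate the idea.
var purlTypeToEcosystem = map[string]string{
	"apk":    "Alpine",
	"cargo":  "crates.io",
	"gem":    "RubyGems",
	"golang": "Go",
	"maven":  "Maven",
	"npm":    "npm",
	"pypi":   "PyPI",
}

// ecosystemForPURLType returns the OSV ecosystem for a PURL type, and false
// when the type is unknown so callers can decide how to report it.
func ecosystemForPURLType(purlType string) (string, bool) {
	eco, ok := purlTypeToEcosystem[strings.ToLower(purlType)]
	return eco, ok
}

func main() {
	for _, t := range []string{"apk", "gem", "unknown"} {
		eco, ok := ecosystemForPURLType(t)
		fmt.Printf("%s -> %q (known: %v)\n", t, eco, ok)
	}
}
```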

another-rex · 2023-07-31T02:16:51Z

Fixed the alpine issue in #457

oliverchang

very nice!! Just mostly have some nits, otherwise this looks pretty good to me.

internal/offline/zip.go

pkg/models/vulnerability.go

pkg/osvscanner/osvscanner.go

internal/offline/zip.go

internal/offline/check.go

internal/offline/zip.go

oliverchang

thanks! LGTM with some final nits.

internal/offline/zip.go

pkg/models/vulnerability.go

internal/offline/check.go

oliverchang · 2023-08-02T01:23:59Z

internal/offline/zip.go

+	// the url that the zip archive was downloaded from
+	ArchiveURL string
+	// whether this database should make any network requests
+	Offline bool


Maybe we should name this ShouldUpdate or something similar? That makes the intent a lot clearer, and as currently named it's a bit confusing in the context of this being inside an "offline" package.

G-Rath · 2023-08-07

What do you think about renaming the package itself to something like local? I'm not too fussed about renaming the property either, I just felt that Offline is pretty clear on what should (or should not) happen and that it is conceivable that other features get added that involve talking to a network.

oliverchang · 2023-08-08

local SGTM.
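To illustrate the intent behind the `Offline` field, here is a minimal sketch in which the flag gates every network request rather than just updates. Only the `ArchiveURL` and `Offline` field names come from the reviewed diff; the `ZipDB` type name, the `load` method, the injected `fetch` function, and the error value are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ZipDB mirrors the two fields from the reviewed diff; everything else in
// this sketch is hypothetical.
type ZipDB struct {
	// the url that the zip archive was downloaded from
	ArchiveURL string
	// whether this database should make any network requests
	Offline bool
}

// ErrOfflineDatabaseNotFound is a hypothetical error for the case where no
// cached archive exists and the network may not be used.
var ErrOfflineDatabaseNotFound = errors.New("no cached database available while offline")

// load returns the archive bytes. When Offline is set it never calls fetch and
// relies solely on the cached copy; otherwise it asks fetch for the archive.
func (db ZipDB) load(cache []byte, fetch func(url string) ([]byte, error)) ([]byte, error) {
	if db.Offline {
		if cache == nil {
			return nil, ErrOfflineDatabaseNotFound
		}
		return cache, nil
	}
	return fetch(db.ArchiveURL)
}

func main() {
	fetch := func(url string) ([]byte, error) {
		return []byte("fresh archive from " + url), nil
	}
	db := ZipDB{ArchiveURL: "https://example.invalid/all.zip", Offline: true}

	data, err := db.load([]byte("cached archive"), fetch)
	fmt.Println(string(data), err)

	_, err = db.load(nil, fetch)
	fmt.Println(err)
}
```

Under this reading, a name like ShouldUpdate would only describe the refresh step, whereas Offline also covers any future feature that might want to talk to a network.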

This pulls across the experimental struct from #183 which I'm assuming folks are happy with given that PR is approved - I think it is good to land this in its own PR as it's technically an allowed breaking change (or "experimental change" if you will), and that way it can be explicitly linked to in changelogs etc without having to understand #183 as much.

G-Rath mentioned this pull request Feb 5, 2023

Add io.Reader variants to lockfile package #176

Closed

another-rex requested a review from oliverchang February 5, 2023 23:52

oliverchang reviewed Feb 6, 2023

G-Rath mentioned this pull request Jun 27, 2023

test: make models tests their own package #423

Merged

another-rex pushed a commit that referenced this pull request Jun 27, 2023

test: make models tests their own package (#423)

d378b92

This reduces the diff with #183 a bit, and is good practice

G-Rath requested review from oliverchang and another-rex July 2, 2023 21:15

G-Rath marked this pull request as ready for review July 2, 2023 21:15

G-Rath commented Jul 2, 2023

another-rex reviewed Jul 3, 2023

oliverchang reviewed Jul 12, 2023

another-rex requested review from another-rex and oliverchang July 25, 2023 03:37

another-rex reviewed Jul 28, 2023

oliverchang reviewed Aug 1, 2023

oliverchang approved these changes Aug 2, 2023

G-Rath mentioned this pull request Aug 2, 2023

refactor: move experimental flags into their own struct #463

Merged

G-Rath added 3 commits August 8, 2023 09:50

feat: add experimental local and offline modes

04816fe

fix: offline implies comparing locally, and comparing locally implies…

c2944f8

… no git

refactor: remove unused WorkingDirectory field

1ec628d

G-Rath added 23 commits August 8, 2023 09:50

feat: store database as a plain zip file rather than JSON

a60ee00

feat: only re-download databases if their contents have changed

4e99005

perf: use cachedregexp

71c286b

fix: ensure custom user agent has a value before using

b8d5b8d

test: add cli cases

ad2cdf9

fix: improve error handling when databases cannot be found

54bb2f8

refactor: use constants

ca90763

refactor: don't use dedicated types to avoid breaking change

fffee83

fix: ensure that events are stored

979454a

feat: add hidden local db path option

7bc2890

fix: check status code of response when there is no existing local db

b1f54f5

feat: mention when a local database has been loaded

e106799

refactor: remove local package name normalization

8acbbe6

refactor: make method private

843a852

fix: invert condition for checking if temp directory has already been…

9742e90

… tried

fix: consider env based-path as explicitly provided paths

989dcbb

refactor: improve setupLocalDBDirectory implementation

a3ebead

refactor: invert condition to be based on implicit path rather than e…

7fdbc16

…xplicit

chore: add comments to struct properties

6f56008

chore: explain why packages without versions are always considered vu…

7f39e13

…lnerable

fix: ensure body is closed regardless of response status code

e29fe7a

refactor: rename offline.Check to offline.MakeRequest

5698e16

refactor: rename offline to local

6d813c2

oliverchang approved these changes Aug 9, 2023

oliverchang merged commit 53107dd into google:main Aug 9, 2023
7 checks passed

G-Rath deleted the support-offline branch August 15, 2023 07:34

G-Rath restored the support-offline branch August 15, 2023 07:34

chenrui333 mentioned this pull request Sep 14, 2023

osv-scanner 1.4.0 Homebrew/homebrew-core#142527

Merged
