-
Notifications
You must be signed in to change notification settings - Fork 570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: only check for db update if local db is old enough #2005
Conversation
Previously, grype would check for a database update on every invocation. However, this was causing slow downs in the CDN. Since new databases are typically made available every 24 hours, change grype to, by default, only check if a new database is available if the local database is more than 12 hours old. Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
cmd/grype/cli/commands/db_check.go
Outdated
@@ -30,6 +30,8 @@ func DBCheck(app clio.Application) *cobra.Command { | |||
} | |||
|
|||
func runDBCheck(opts options.Database) error { | |||
// `grype db check` should _always_ check for updates, regardless of config | |||
opts.MinAgeToCheckForUpdate = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about adding a separate bool "force check" option, or a force check parameter to IsUpdateAvailable and Update. Would that have been preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that not changing the behavior of db check is the right call for now from an options-perspective. We can always add more config items later to opt into force
.
But, this param should be changed within PreRunE
such that the config options are finalized before logging them.
Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
@@ -18,6 +18,7 @@ type Database struct { | |||
AutoUpdate bool `yaml:"auto-update" json:"auto-update" mapstructure:"auto-update"` | |||
ValidateByHashOnStart bool `yaml:"validate-by-hash-on-start" json:"validate-by-hash-on-start" mapstructure:"validate-by-hash-on-start"` | |||
ValidateAge bool `yaml:"validate-age" json:"validate-age" mapstructure:"validate-age"` | |||
MinAgeToCheckForUpdate time.Duration `yaml:"min-age-to-check-for-update" json:"min-age-to-check-for-update" mapstructure:"min-age-to-check-for-update"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestion: what about stale-age
with an attached description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
db:
min-age-to-check-for-update: 12h
vs
db:
stale-age: 12h
The first one seems clearer to me in terms of what it controls. But I agree the name is pretty wordy. How do we communicate to a user what stale-age
does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to other naming suggestions too. Here's how we attach descriptions to fields https://github.com/anchore/syft/blob/9573f557d12756f8c873b98a6af476835f611055/cmd/syft/internal/options/update_check.go#L15-L17
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to be parallel to MaxAllowedBuiltAge
, itself perhaps not an ideal age, since "built" seems a bit of an implementation concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about min-age-to-update
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about something like update-check-interval
? And what happens 12 hours after the db is released? With this behavior do we simply move the busy time to download a database by 12 hours? Should we update the metadata to indicate the last time we checked so we can spread this out more evenly in some sort of meaningful way?
Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
defer f.Close() | ||
err = json.NewEncoder(f).Encode(tt.existingMetadata) | ||
require.NoError(t, err) | ||
getter := newTestGetter(fs, files, dirs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some issues getting this working in all the scenarios I wanted to test because it seemed as though I couldn't mock every call and the getter code was ultimately calling os.Open
, instead of using the afero.FS
. If you happen to run into similar issues, I wrote a mocked but functional server to emulate the listing file / db download.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree testing with a local http server is a cleaner solution. I'll wait for that PR to get in and then rebase and change these tests to use that, assuming we still decide we want this behavior.
This is probably the wrong approach, because once the local DB is old enough, grype will still check every time, so from the CDN's perspective, there will still be a window from when the N hours expires until when the new database is downloaded by most clients that will have the same requests per second as today, so it won't increase reliability during that time window. |
Previously, grype would check for a database update on every invocation. However, this was causing slow downs in the CDN. Since new databases are typically made available every 24 hours, change grype to, by default, only check if a new database is available if the local database is more than 12 hours old.
See #1731