dynamic updaters #196

hdonnay · 2020-07-01T20:41:28Z

This PR provides for having all updaters dynamically constructed upon use.

This allows databases to be discovered at runtime (e.g. RHEL's use of a Pulp repository) and for graceful degradation when distributions fall out of support (e.g. Ubuntu's non-LTS releases).

internal/updater/executor.go

ldelossa

PR looks pretty good.

I have a concern around concurrency control.

Currently concurrency control is happening in Libvuln via a weighted sem.
Once the weighted sem hands work to each executor, that executor runs all update operations in parallel.

Ideally the configuration would reliably say "this is how many update operations will occur in parallel".
I'd be less concerned about this, but fetching, reading, parsing, and batch insertion into the database are our heaviest tasks on the matcher side, and being deterministic about how many of those tasks are running at once will be a key factor in scaling/tuning/troubleshooting.

My concerns also exist in slowing down VulnerabilityReportGeneration when too many background update operations are occurring. IMO, we should design to maintain predictable vuln report generation times, as Quay will be querying this endpoint often.

I'd like to see some control structure which deterministically says "this is the max number of update operations running" in terms of concurrency control.

ldelossa

The flow of control for form Pulp -> RHEL Updaters seems a bit silly.

I have to construct a pulp object in another package, feed it a function in the rhel package, to have it construct updaters in the rhel package.

an init function is used to hardcore the construction of the pulp object anyway.

Can we please collapse this into just the rhel package and not make Pulp a stateful object that just couples itself to the rhel package immediately?

hdonnay · 2020-07-07T18:51:56Z

The way it's implemented, the pulp object isn't coupled to rhel at all. It's the rhel object that's coupled to the pulp object, because that's how the databases are delivered. I like the separate pulp object because it's a nice, self-contained bit of code.

The init function is only used to preserve the "default" behavior. I'll rework that to construct the factories in the libvuln constructor.

ldelossa · 2020-07-07T18:55:31Z

But what's the purpose of self containing it, is what I'm failing to see. All you do is feed it a function implemented somewhere else to construct what you need. It being collapsed would make a lot more sense and pulp being placed inside rhel/ it will never be used outside RHEL.

This commit removes remaining sqlx usage in the postgres package. The uses were mostly as a stdlib database/sql adapter, which could be satisfied by pgx's adapter. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

Signed-off-by: Hank Donnay <hdonnay@redhat.com>

This commit removes the updater.Controller struct and replaces it with updater.Executor, which is similar in purpose but doesn't do any internal looping. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

The rhel updater now reads from a pulp repository by default. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

This was causing problems where two transactions would race on inserting a row, but because of the structure of the statement the loser wouldn't be able to see the committed row and would error out the transaction. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

The new databases have no-entry entries in the affected list. This is a hack to make things work, the most correct solution would be for the generator to not emit `<cpe/>` objects. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

The new databases being read have tests that are just "exists", so we handle those without emitting log spam. Our model works "backwards" from the normal OVAL process: we're matching packages we know to exist. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

This changes the Ubuntu updater to probe database files before creating the updaters. This means that distributions falling out of support won't spam logs with that information. It'd be nice to be able to construct the initial list automatically, but that's another commit. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

ldelossa

LGTM, cool PR.

hdonnay requested review from Allda and ldelossa July 1, 2020 20:41

ldelossa reviewed Jul 1, 2020

View reviewed changes

internal/updater/executor.go Outdated Show resolved Hide resolved

ldelossa reviewed Jul 1, 2020

View reviewed changes

hdonnay force-pushed the hank/pulp branch 5 times, most recently from 81b0a7e to 8bb987a Compare July 2, 2020 18:22

ldelossa suggested changes Jul 7, 2020

View reviewed changes

hdonnay force-pushed the hank/pulp branch 2 times, most recently from 472d999 to 4f444e8 Compare July 7, 2020 18:22

hdonnay force-pushed the hank/pulp branch 4 times, most recently from c259052 to 6088ec1 Compare July 8, 2020 14:15

hdonnay changed the title ~~WIP: dynamic updaters~~ dynamic updaters Jul 8, 2020

hdonnay marked this pull request as ready for review July 8, 2020 14:31

hdonnay requested review from alecmerdler and jzelinskie as code owners July 8, 2020 14:31

hdonnay requested a review from ldelossa July 8, 2020 14:31

hdonnay force-pushed the hank/pulp branch 3 times, most recently from 122334e to f76b64f Compare July 10, 2020 13:50

hdonnay added 4 commits July 10, 2020 11:33

postgres: remove sqlx usage

27359db

This commit removes remaining sqlx usage in the postgres package. The uses were mostly as a stdlib database/sql adapter, which could be satisfied by pgx's adapter. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

distlock: implement interface over pgxpool

24c305f

Signed-off-by: Hank Donnay <hdonnay@redhat.com>

libvuln: use Executor + UpdateSetFactory

b9b5dec

This commit removes the updater.Controller struct and replaces it with updater.Executor, which is similar in purpose but doesn't do any internal looping. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

rhel: use pulp factory

65fae38

The rhel updater now reads from a pulp repository by default. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

hdonnay added 4 commits July 10, 2020 11:35

rhel: handle empty cpes

c3bada8

The new databases have no-entry entries in the affected list. This is a hack to make things work, the most correct solution would be for the generator to not emit `<cpe/>` objects. Signed-off-by: Hank Donnay <hdonnay@redhat.com>

hdonnay force-pushed the hank/pulp branch from f76b64f to c9b6274 Compare July 10, 2020 15:35

ldelossa approved these changes Jul 10, 2020

View reviewed changes

hdonnay merged commit c9b6274 into master Jul 10, 2020

hdonnay deleted the hank/pulp branch July 10, 2020 21:45

Allda mentioned this pull request Aug 10, 2020

Uses Red Hat's Oval v2 instead of v1 #112

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dynamic updaters #196

dynamic updaters #196

hdonnay commented Jul 1, 2020 •

edited

ldelossa left a comment •

edited

ldelossa left a comment

hdonnay commented Jul 7, 2020

ldelossa commented Jul 7, 2020 •

edited

ldelossa left a comment

dynamic updaters #196

dynamic updaters #196

Conversation

hdonnay commented Jul 1, 2020 • edited

ldelossa left a comment • edited

Choose a reason for hiding this comment

ldelossa left a comment

Choose a reason for hiding this comment

hdonnay commented Jul 7, 2020

ldelossa commented Jul 7, 2020 • edited

ldelossa left a comment

Choose a reason for hiding this comment

hdonnay commented Jul 1, 2020 •

edited

ldelossa left a comment •

edited

ldelossa commented Jul 7, 2020 •

edited