Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamic updaters #196

Merged
merged 8 commits into from Jul 10, 2020
Merged

dynamic updaters #196

merged 8 commits into from Jul 10, 2020

Conversation

hdonnay
Copy link
Member

@hdonnay hdonnay commented Jul 1, 2020

This PR provides for having all updaters dynamically constructed upon use.

This allows databases to be discovered at runtime (e.g. RHEL's use of a Pulp repository) and for graceful degradation when distributions fall out of support (e.g. Ubuntu's non-LTS releases).

@hdonnay hdonnay requested review from Allda and ldelossa July 1, 2020 20:41
Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks pretty good.

I have a concern around concurrency control.

Currently concurrency control is happening in Libvuln via a weighted sem.
Once the weighted sem hands work to each executor, that executor runs all update operations in parallel.

Ideally the configuration would reliably say "this is how many update operations will occur in parallel".
I'd be less concerned about this, but fetching, reading, parsing, and batch insertion into the database are our heaviest tasks on the matcher side, and being deterministic about how many of those tasks are running at once will be a key factor in scaling/tuning/troubleshooting.

My concerns also exist in slowing down VulnerabilityReportGeneration when too many background update operations are occurring. IMO, we should design to maintain predictable vuln report generation times, as Quay will be querying this endpoint often.

I'd like to see some control structure which deterministically says "this is the max number of update operations running" in terms of concurrency control.

@hdonnay hdonnay force-pushed the hank/pulp branch 5 times, most recently from 81b0a7e to 8bb987a Compare July 2, 2020 18:22
Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flow of control for form Pulp -> RHEL Updaters seems a bit silly.

I have to construct a pulp object in another package, feed it a function in the rhel package, to have it construct updaters in the rhel package.

an init function is used to hardcore the construction of the pulp object anyway.

Can we please collapse this into just the rhel package and not make Pulp a stateful object that just couples itself to the rhel package immediately?

@hdonnay hdonnay force-pushed the hank/pulp branch 2 times, most recently from 472d999 to 4f444e8 Compare July 7, 2020 18:22
@hdonnay
Copy link
Member Author

hdonnay commented Jul 7, 2020

The way it's implemented, the pulp object isn't coupled to rhel at all. It's the rhel object that's coupled to the pulp object, because that's how the databases are delivered. I like the separate pulp object because it's a nice, self-contained bit of code.

The init function is only used to preserve the "default" behavior. I'll rework that to construct the factories in the libvuln constructor.

@ldelossa
Copy link
Contributor

ldelossa commented Jul 7, 2020

But what's the purpose of self containing it, is what I'm failing to see. All you do is feed it a function implemented somewhere else to construct what you need. It being collapsed would make a lot more sense and pulp being placed inside rhel/ it will never be used outside RHEL.

@hdonnay hdonnay force-pushed the hank/pulp branch 4 times, most recently from c259052 to 6088ec1 Compare July 8, 2020 14:15
@hdonnay hdonnay changed the title WIP: dynamic updaters dynamic updaters Jul 8, 2020
@hdonnay hdonnay marked this pull request as ready for review July 8, 2020 14:31
@hdonnay hdonnay requested a review from ldelossa July 8, 2020 14:31
@hdonnay hdonnay force-pushed the hank/pulp branch 3 times, most recently from 122334e to f76b64f Compare July 10, 2020 13:50
This commit removes remaining sqlx usage in the postgres package. The
uses were mostly as a stdlib database/sql adapter, which could be
satisfied by pgx's adapter.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
Signed-off-by: Hank Donnay <hdonnay@redhat.com>
This commit removes the updater.Controller struct and replaces it with
updater.Executor, which is similar in purpose but doesn't do any
internal looping.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
The rhel updater now reads from a pulp repository by default.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
This was causing problems where two transactions would race on inserting
a row, but because of the structure of the statement the loser wouldn't
be able to see the committed row and would error out the transaction.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
The new databases have no-entry entries in the affected list.

This is a hack to make things work, the most correct solution would be
for the generator to not emit `<cpe/>` objects.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
The new databases being read have tests that are just "exists", so we
handle those without emitting log spam.

Our model works "backwards" from the normal OVAL process: we're matching
packages we know to exist.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
This changes the Ubuntu updater to probe database files before creating
the updaters. This means that distributions falling out of support won't
spam logs with that information.

It'd be nice to be able to construct the initial list automatically, but
that's another commit.

Signed-off-by: Hank Donnay <hdonnay@redhat.com>
Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cool PR.

@hdonnay hdonnay merged commit c9b6274 into master Jul 10, 2020
@hdonnay hdonnay deleted the hank/pulp branch July 10, 2020 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants