
(PDB-4420) partition of resource_events #3027

Merged (16 commits) Nov 13, 2019
Conversation

@robdaemon (Contributor) commented Jul 18, 2019

Prototype of partitioning the resource_events table. This creates a base
table (with the same schema as the existing resource_events table) and
creates 8 partitions in total during migration (this ISO week ± 4).

Partitions are based on ISO 8601 weeks. Inserts into the partitions are
routed by a trigger that directs each row to the appropriate partition.

Partitions are created dynamically by scanning the events to be inserted
and performing a "CREATE TABLE IF NOT EXISTS" for each required
partition. This is akin to the (ensure-certname) code that already
exists.

The inheritance-based implementation was chosen to maintain backwards
compatibility with PostgreSQL 9.x, since the other components of PE are
not ready for PostgreSQL 11.
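
For illustration, here is a minimal sketch of the dynamic partition creation under PostgreSQL 9.x inheritance (table and function names are assumptions for illustration, not the PR's actual identifiers; the insert-routing trigger is omitted):

    (ns example.partitioning
      (:require [clojure.java.jdbc :as jdbc])
      (:import [java.time DayOfWeek ZonedDateTime]
               [java.time.temporal ChronoUnit IsoFields TemporalAdjusters]))

    (defn- iso-week-start
      "Truncates a timestamp to the Monday 00:00 that begins its ISO week."
      ^ZonedDateTime [^ZonedDateTime ts]
      (-> ts
          (.with (TemporalAdjusters/previousOrSame DayOfWeek/MONDAY))
          (.truncatedTo ChronoUnit/DAYS)))

    (defn ensure-week-partition!
      "Creates the ISO-week child table for ts if it does not already exist.
       The CHECK constraint is what lets constraint exclusion skip this
       partition for out-of-range queries."
      [db ^ZonedDateTime ts]
      (let [start (iso-week-start ts)
            end   (.plusWeeks start 1)
            table (format "resource_events_%dw%02d"
                          (.get ts IsoFields/WEEK_BASED_YEAR)
                          (.get ts IsoFields/WEEK_OF_WEEK_BASED_YEAR))]
        (jdbc/execute! db
          [(format (str "CREATE TABLE IF NOT EXISTS %s ("
                        " CHECK (\"timestamp\" >= '%s' AND \"timestamp\" < '%s')"
                        ") INHERITS (resource_events)")
                   table (.toInstant start) (.toInstant end))])))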

@robdaemon robdaemon added the work in progress (...and please don't merge) label Jul 18, 2019
@puppetcla

CLA signed by all contributors.

@robdaemon robdaemon added please review and removed work in progress (...and please don't merge) labels Aug 1, 2019
@austb austb self-requested a review August 5, 2019 16:06
@austb (Contributor) left a comment

This looks good to me so far. I do suspect that when we ask around, they'll want us to partition on the day boundary. The most common setting I've seen for reports-ttl in production is 3d, and if I've wrapped my head around this properly, this change, while making GC cheaper, would significantly increase the size of their reports and resource_events tables.

@robdaemon (Contributor, Author)

I chose ISO weeks because it's the most straightforward approach. Even if we want finer granularity for queries, I think the storage granularity has to stay at the week; otherwise we risk making this either too complicated or a performance problem due to the number of partitions we create (imagine 365.25 tables per year; that's a lot).

@npwalker (Contributor)

I agree that partitioning on week seems like it would miss most of the benefit of partitioning. If you partition on day, then a query that looks for reports in the last 6 hours (anything less than a day is probably the most common report query) only has to look at one smaller partition. If you partition on week, you can still look at less data if your report-ttl is large enough, but the performance profile of your query will change day over day: the day after you make a new partition things will be speedy, and they will get slower as the week progresses.

Why would creating more tables create a performance impact?

@robdaemon (Contributor, Author)

If the average report TTL is < 7 days, then partitioning by day would be better. Do we have details on the actual use of report-ttl in the field?

See: https://www.postgresql.org/docs/9.6/ddl-partitioning.html

All constraints on all partitions of the master table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions.

This is where the impact on the number of partitions comes into play, and one of the driving reasons for using ISO weeks.
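
To make the trade-off concrete, a back-of-the-envelope count (the TTL and headroom figures here are illustrative assumptions): a short TTL with daily partitions stays well under the ~100-partition guidance quoted above; only retention approaching a year gets near 365 tables.

    ;; Rough count of live partitions given a TTL, some pre-created future
    ;; partitions (headroom), and a partition granularity, all in days.
    (defn live-partition-count [ttl-days headroom-days granularity-days]
      (long (Math/ceil (/ (+ ttl-days headroom-days)
                          (double granularity-days)))))

    (live-partition-count 14 4 1) ;=> 18 daily partitions
    (live-partition-count 14 4 7) ;=> 3 weekly partitions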

@robdaemon robdaemon requested a review from a team as a code owner September 4, 2019 21:13
@puppetlabs-jenkins

Tests Failed \o/

@rbrw (Contributor) left a comment

Overall seems pretty close.

    (doall
     (map (fn [week-offset]
            (partitioning/create-resource-events-partition (.plusWeeks now week-offset)))
          weeks))))

If we don't need to keep/return the seq, we might change this to dorun, or just collapse it to a doseq.
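
For reference, the doseq version would look roughly like this (assuming now is a ZonedDateTime and weeks a seq of integer offsets, as in the snippet above):

    (doseq [week-offset weeks]
      (partitioning/create-resource-events-partition (.plusWeeks now week-offset)))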

@austb (Contributor) commented Sep 25, 2019

Looks like Jenkins might actually be lying to us here. The acceptance tests are failing: https://cinext-jenkinsmaster-enterprise-prod-1.delivery.puppetlabs.net/view/puppetdb/view/master/job/enterprise_puppetdb_integration-system-puppetdb_full-master/1947/

I think it's because only the ezbake job is nested in the PR Kickoff job. Might take a bit of cjc wizardry to fix that without duplicating jobs... I'll make a ticket.

@@ -0,0 +1,49 @@
# Partitioning in PuppetDB

I've added some stuff in dev-docs/ related to how to reconcile git commits/Jira tickets for release. This could probably end up there.

@austb austb removed the don't merge label Oct 24, 2019
@austb (Contributor) left a comment

👍

Robert Roland added 10 commits November 13, 2019 10:55
Prototype of partitioning the resource_events table (same description as
the PR body above).

TODO: Migrate existing events to partitioned tables - they are currently
dropped.

TODO: GC by dropping expired partitions.

(PDB-4468) migration test for partitioned tables

Add a migration test for creating the new partitioned tables

Migrate existing resource events data to the new table. Rolls up the
previous migration that creates the event_hash column into this new
migration.

This query returns a different order in pg11 than it does in pg9.6, so
adding this ORDER BY clause makes it behave the same in both.
* Adds a config parameter: database.resource-events-ttl with a default of 14d
* Adds two admin metrics: resource-events-purges and resource-events-purge-time
* Adds documentation for this new feature

This will allow the resource_events table to be cleaned up at a different
interval than the reports table.

Uses calendar year and day of year (1-366) for the partition names.
Perform a GC by dropping tables past our expiration date. Rounded to the
nearest day.
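
A minimal sketch of that naming-plus-GC scheme (names here are illustrative assumptions, not the PR's actual identifiers):

    (ns example.partition-gc
      (:require [clojure.java.jdbc :as jdbc])
      (:import [java.time LocalDate]))

    (defn partition-name
      "Daily partition named by calendar year and day of year (1-366),
       e.g. resource_events_2019_317 for 2019-11-13."
      [^LocalDate day]
      (format "resource_events_%d_%03d" (.getYear day) (.getDayOfYear day)))

    (defn drop-expired-partitions!
      "Drops every known daily partition strictly older than cutoff.
       known-days (a seq of LocalDate) would in practice come from listing
       the base table's children in the system catalogs."
      [db known-days ^LocalDate cutoff]
      (doseq [^LocalDate day known-days
              :when (.isBefore day cutoff)]
        (jdbc/execute! db [(str "DROP TABLE IF EXISTS " (partition-name day))])))
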
Change use of LocalDate/LocalDateTime to ZonedDateTime and use UTC for
all partitioning operations.

Fixed issues found in code review.

Port of #3074 to partitioning work.
Use the formatting string that matches PostgreSQL for outputting UTC
offsets as +00 instead of Z
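
For reference, a small sketch of the offset-format difference (the pattern is an assumption about the approach, not the PR's exact code): java.time's ISO formatters render a zero offset as Z, while PostgreSQL renders timestamptz output with +00, which the pattern letter x matches:

    (import '[java.time ZonedDateTime ZoneOffset]
            '[java.time.format DateTimeFormatter])

    ;; A single lowercase 'x' prints a zero UTC offset as "+00", matching
    ;; PostgreSQL's output; the default ISO formatters would print "Z".
    (def pg-style-formatter
      (DateTimeFormatter/ofPattern "yyyy-MM-dd HH:mm:ss.SSSx"))

    (.format (ZonedDateTime/now ZoneOffset/UTC) pg-style-formatter)
    ;; => e.g. "2019-11-13 18:55:02.037+00"
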
Robert Roland added 6 commits November 13, 2019 10:55
A previous commit that added batched inserts removed the creation of
partitions based on row dates; fixed.

Ensure that new tables are created in the active transaction,
otherwise you get an error:

    org.postgresql.util.PSQLException: ERROR: portal "C_8" does not exist
Remove the fk constraint to reports. Creating it repeatedly will cause us
to use up all available locks in a migration transaction (the default
limit is 64). This fk was to be removed when the reports table is
partitioned anyhow.
Group inserts by day, so they insert directly into the partition
instead of routing through the trigger during migration

Enable rewriteBatchedInserts to additionally speed up bulk inserts

Runtime on the customer dataset on n2:

    2019-10-11 11:35:02,037 INFO  [main] [p.t.internal] Finished shutdown sequence

    real    19m13.106s
    user    14m44.906s
    sys     1m28.030s
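
A minimal sketch of the batching idea (the column set and names are illustrative assumptions, not PuppetDB's actual code); reWriteBatchedInserts is the PgJDBC connection property that rewrites executeBatch calls into multi-row INSERTs:

    (ns example.batch-insert
      (:require [clojure.java.jdbc :as jdbc])
      (:import [java.time Instant ZoneOffset]
               [java.time.temporal ChronoUnit]))

    (def db-spec
      {:connection-uri (str "jdbc:postgresql://localhost/puppetdb"
                            "?reWriteBatchedInserts=true")})

    (defn insert-events-by-day!
      "Groups events by the UTC day of their :timestamp (an Instant) and
       inserts each group as one batch, so every batch lands in exactly one
       daily partition instead of routing row-by-row through the trigger."
      [db events]
      (doseq [[_day group] (group-by (fn [e]
                                       (-> ^Instant (:timestamp e)
                                           (.atZone ZoneOffset/UTC)
                                           (.truncatedTo ChronoUnit/DAYS)))
                                     events)]
        (jdbc/insert-multi! db :resource_events
                            [:certname :timestamp :status]
                            (mapv (juxt :certname :timestamp :status) group))))
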
Applies a floor of one day to the resource-events-ttl: since we partition
based on days, the TTL of resource-events cannot be any less than a day.
Allow rewriteBatchedInserts to be set per connection pool
Moving a doc to dev-docs/

Replacing a use of str with trs for a log message