Add pg_class optimizer and autovacuum columns to postgres schema collection#22617
Add 8 new columns from pg_class to provide comprehensive table metadata:
- has_indexes (relhasindex) - whether the table has indexes
- relation_kind (relkind) - type of relation (r=ordinary table, p=partitioned table, f=foreign table)
- num_columns (relnatts) - number of user columns in the table
- num_check_constraints (relchecks) - number of CHECK constraints
- has_triggers (relhastriggers) - whether the table has triggers
- row_security_enabled (relrowsecurity) - whether row security is enabled
- is_populated (relispopulated) - whether the relation is populated (relevant for materialized views)
- is_partition (relispartition) - whether this is a partition (PG 10+)

These columns complement the existing optimizer and autovacuum columns to provide a complete view of table configuration and characteristics. Updated all 19 test snapshots and the changelog to reflect the changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add two new column-level attributes from pg_attribute:
- is_dropped (attisdropped) - whether the column has been dropped
- storage_type (attstorage) - storage mode (p=plain, e=external, m=main, x=extended)

These attributes provide insight into column lifecycle and storage optimization strategies. Updated all 19 test snapshots and the changelog.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Refactored schema collection query to inline column definitions instead of building them as separate string variables. This makes the code cleaner, reduces repetition, and improves maintainability.
- Replaced three string-building variables with direct inline columns
- Created a single is_pg10_or_newer boolean to avoid repeated version checks
- Used inline conditionals to add the is_partition column for PG 10+
- All schema tests pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Break long lines in schemas.py to comply with the 120-character limit and run the auto-formatter on all files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Fix formatting to use ruff format instead of black to match CI expectations.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
| # TODO: Use the submission debouncer to only send this every 6 hours | ||
| self.health.submit_health_event( | ||
| name=HealthEvent.INITIALIZATION, | ||
| status=HealthStatus.ERROR |
| query += f" AND ({ | ||
| ' OR '.join(f"datname ~ '{include_regex}'" for include_regex in self._config.include_databases) | ||
| })" | ||
| include_or_conditions = ' OR '.join( |
Why were this and the other include blocks changed?
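For context on the question above, here is a minimal sketch of the shape the new code appears to take: the `' OR '.join(...)` is hoisted into a named variable before being interpolated, rather than spanning several lines inside the f-string braces. The helper name `build_include_clause` and its plain-list argument are hypothetical stand-ins for the method and `self._config.include_databases` in the diff; whether this matches the author's reason for the change is an assumption.

```python
def build_include_clause(include_databases):
    """Return an ' AND (...)' SQL fragment matching any of the given regexes.

    Hypothetical helper: mirrors the interpolation style in the diff,
    including the fact that the regex strings are not escaped.
    """
    if not include_databases:
        return ""
    # Build the OR-ed conditions first, then interpolate the single variable.
    include_or_conditions = " OR ".join(
        f"datname ~ '{include_regex}'" for include_regex in include_databases
    )
    return f" AND ({include_or_conditions})"


if __name__ == "__main__":
    print(build_include_clause(["^app", "test$"]))
```

Both forms should produce the same SQL text, so a change like this would be expected to leave the generated query unchanged.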
| tables.frozen_xid, tables.min_mxid, tables.table_options, | ||
| tables.has_indexes, tables.relation_kind, tables.num_columns, | ||
| tables.num_check_constraints, tables.has_triggers, tables.row_security_enabled, | ||
| tables.is_populated{', tables.is_partition' if is_pg10_or_newer else ''} |
This feels fragile and hard to read. Would it be better to pull all these columns out into an array?
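A rough sketch of what the array-based approach suggested here could look like. The column names are taken from the diff lines above; `BASE_TABLE_COLUMNS`, `table_select_list`, and the `alias` parameter are hypothetical names introduced for illustration, not code from this PR.

```python
# Columns that are selected on every supported PostgreSQL version
# (names copied from the diff; the list name itself is hypothetical).
BASE_TABLE_COLUMNS = [
    "frozen_xid",
    "min_mxid",
    "table_options",
    "has_indexes",
    "relation_kind",
    "num_columns",
    "num_check_constraints",
    "has_triggers",
    "row_security_enabled",
    "is_populated",
]


def table_select_list(is_pg10_or_newer, alias="tables"):
    """Build the comma-separated select list, adding version-gated columns."""
    columns = list(BASE_TABLE_COLUMNS)  # copy so the module-level list is not mutated
    if is_pg10_or_newer:
        columns.append("is_partition")
    return ", ".join(f"{alias}.{column}" for column in columns)


if __name__ == "__main__":
    print(table_select_list(is_pg10_or_newer=True))
```

Keeping the version check next to an explicit `append` avoids embedding a conditional expression inside the SQL f-string, and the same list could feed the GROUP BY clause so the two stay in sync.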
| {partition_joins} | ||
| GROUP BY schema_tables.schema_id, schema_tables.schema_name, schema_tables.schema_owner, | ||
| schema_tables.table_id, schema_tables.table_name, schema_tables.table_owner, schema_tables.toast_table | ||
| schema_tables.table_id, schema_tables.table_name, schema_tables.table_owner, schema_tables.toast_table, |
Do we have any sense of whether a GROUP BY this large will cause performance problems?
| ] | ||
| if cursor_row.get("table_name") | ||
| else [], | ||
| "tables": ( |
Why did this whole block shift?
| { | ||
| "id": "16678", | ||
| "name": "personsdup10", | ||
| "id": "16697", |
Why did all of these churn? Without being able to clearly map the snapshot changes to the intended changes, we can't be sure there weren't unintended side effects.
| c.relallvisible AS all_visible_pages, | ||
| c.relfrozenxid::text AS frozen_xid, | ||
| c.relminmxid::text AS min_mxid, | ||
| c.reloptions AS table_options, |
These columns don't match the ones in the description
Summary
Adds 7 new columns from pg_class to the PostgreSQL integration's schema collection to support optimizer suggestions and autovacuum tuning insights.

New Columns
Optimizer Statistics:
- reltuples (float) - Estimated row count used by the query planner
- relpages (int) - Table size in pages (8KB blocks)

Autovacuum Configuration & Tracking:
- reloptions (text[]) - Table-level storage parameters including autovacuum settings
- relallvisible (int) - Pages marked all-visible in the visibility map
- relfrozenxid (text) - Frozen XID threshold for wraparound prevention
- relminmxid (text) - Frozen multixact ID threshold
- relallfrozen (int) - Pages marked all-frozen (PostgreSQL 18+ only)

Motivation
These columns enable several key use cases:
The reloptions field in particular exposes table-level autovacuum parameters like autovacuum_vacuum_threshold, autovacuum_vacuum_scale_factor, autovacuum_freeze_min_age, and 10+ other parameters.

Implementation Details
- relallfrozen is only collected on PostgreSQL 18+ using conditional query logic
- reloptions kept as PostgreSQL text array format
- total=False

🤖 Generated with Claude Code
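Since reloptions is kept in PostgreSQL text array format, a consumer that wants individual autovacuum parameters would need to split the `name=value` entries itself. A minimal sketch of that step, assuming the driver hands the array back as a Python list of strings (or `None` when no options are set); `parse_reloptions` is a hypothetical helper, not part of this PR:

```python
def parse_reloptions(reloptions):
    """Turn a reloptions array like ['fillfactor=70'] into a dict of strings.

    Hypothetical helper. Values are left as strings because the catalog
    stores them as text; entries without '=' are skipped.
    """
    options = {}
    for entry in reloptions or []:
        name, separator, value = entry.partition("=")
        if separator:
            options[name] = value
    return options


if __name__ == "__main__":
    sample = ["autovacuum_vacuum_threshold=50", "autovacuum_vacuum_scale_factor=0.1"]
    print(parse_reloptions(sample))
```

Leaving the values untyped keeps the sketch independent of which parameters are integers, floats, or booleans.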