Add pg_class optimizer and autovacuum columns to postgres schema collection#22617

Open
dujuku wants to merge 11 commits into master from dujuku/pg-schema-columns

@dujuku dujuku commented Feb 11, 2026

Summary

Adds 7 new columns from pg_class to the PostgreSQL integration's schema collection to support optimizer suggestions and autovacuum tuning insights.

New Columns

Optimizer Statistics:

  • reltuples (float) - Estimated row count used by query planner
  • relpages (int) - Estimated table size in pages (8 kB blocks by default)

Autovacuum Configuration & Tracking:

  • reloptions (text[]) - Table-level storage parameters including autovacuum settings
  • relallvisible (int) - Pages marked all-visible in visibility map
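  • relfrozenxid (text) - Transaction ID before which all XIDs in the table have been frozen; used for wraparound prevention
  • relminmxid (text) - Multixact ID before which all multixact IDs in the table have been replaced by a transaction ID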
  • relallfrozen (int) - Pages marked all-frozen (PostgreSQL 18+ only)
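
As a rough illustration of how a consumer might use these values (this is not code from the PR), the sketch below derives an approximate table size and the all-visible page fraction from relpages, relallvisible, and reltuples. It assumes the default 8 kB block size; the function name and the returned keys are hypothetical.

```python
from typing import Dict, Optional, Union

BLOCK_SIZE_BYTES = 8192  # assumption: default PostgreSQL block size


def table_page_stats(
    relpages: int, relallvisible: int, reltuples: float
) -> Dict[str, Union[int, float, None]]:
    """Derive rough size figures from pg_class estimates (hypothetical helper)."""
    size_bytes = relpages * BLOCK_SIZE_BYTES
    # Fraction of pages the visibility map marks all-visible; 0.0 for empty tables.
    visible_fraction = relallvisible / relpages if relpages > 0 else 0.0
    # reltuples can be -1 (row count unknown) on PostgreSQL 14+, so guard against it.
    rows_per_page: Optional[float] = None
    if relpages > 0 and reltuples >= 0:
        rows_per_page = reltuples / relpages
    return {
        "size_bytes": size_bytes,
        "visible_fraction": visible_fraction,
        "rows_per_page": rows_per_page,
    }
```

All three inputs are planner estimates refreshed by VACUUM and ANALYZE, so the derived figures are approximate as well.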

Motivation

These columns enable several key use cases:

  1. Detecting which tables have custom autovacuum tuning
  2. Verifying suggested tuning actions were applied
  3. Correlating autovacuum behavior with table-level settings
  4. Identifying tables that need tuning based on bloat/activity
  5. Supporting query optimizer suggestions

The reloptions field in particular exposes table-level autovacuum parameters like autovacuum_vacuum_threshold, autovacuum_vacuum_scale_factor, autovacuum_freeze_min_age, and 10+ other parameters.
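
To make the reloptions use case concrete, here is a minimal sketch of how a consumer could pull the autovacuum settings out of the array. It assumes the driver hands back reloptions as a list of "key=value" strings; the helper name is hypothetical and not part of this PR.

```python
from typing import Dict, Iterable, Optional


def autovacuum_options(reloptions: Optional[Iterable[str]]) -> Dict[str, str]:
    """Return only the autovacuum_* entries of a reloptions array as a dict."""
    options: Dict[str, str] = {}
    # reloptions is NULL for tables without custom storage parameters.
    for entry in reloptions or []:
        key, sep, value = entry.partition("=")
        # Skip malformed entries and non-autovacuum parameters such as fillfactor.
        if sep and key.startswith("autovacuum_"):
            options[key] = value
    return options
```

Values are left as strings on purpose: the parameters mix integers, floats, and booleans, and converting them is a decision for the caller.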

Implementation Details

  • Version handling: relallfrozen is only collected on PostgreSQL 18+ using conditional query logic
  • Data types: XIDs cast to text for safe JSON serialization, reloptions kept as PostgreSQL text array format
  • Backward compatibility: Works across PostgreSQL versions 9.6-18
  • TypedDict updated: Added all new fields with proper type hints using total=False
  • Payload impact: ~40-240 bytes per table (avg ~100 bytes), negligible for typical databases
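
The TypedDict and version-handling points can be sketched as follows. This is an assumption-laden illustration, not the code in schemas.py: the class name, field names (taken from the description's column names), and builder function are hypothetical.

```python
from typing import Any, List, Mapping, TypedDict


class TableStats(TypedDict, total=False):
    """Hypothetical payload shape; total=False makes every key optional."""

    reltuples: float
    relpages: int
    reloptions: List[str]
    relallvisible: int
    relfrozenxid: str  # XID kept as text
    relminmxid: str  # multixact ID kept as text
    relallfrozen: int  # only present on PostgreSQL 18+


def build_table_stats(row: Mapping[str, Any], pg_major_version: int) -> TableStats:
    """Build the payload entry, adding relallfrozen only where the server has it."""
    stats: TableStats = {
        "reltuples": float(row["reltuples"]),
        "relpages": int(row["relpages"]),
        "relallvisible": int(row["relallvisible"]),
        "relfrozenxid": str(row["relfrozenxid"]),
        "relminmxid": str(row["relminmxid"]),
    }
    if row.get("reloptions") is not None:
        stats["reloptions"] = list(row["reloptions"])
    if pg_major_version >= 18 and row.get("relallfrozen") is not None:
        stats["relallfrozen"] = int(row["relallfrozen"])
    return stats
```

Because the keys are optional, a payload from an older server simply omits relallfrozen instead of carrying a null placeholder.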

🤖 Generated with Claude Code

@dujuku dujuku force-pushed the dujuku/pg-schema-columns branch from f00af7f to baf0a3d Compare February 11, 2026 23:18

datadog-datadog-prod-us1 bot commented Feb 12, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: f31a007

codecov bot commented Feb 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.34%. Comparing base (e218424) to head (f31a007).
⚠️ Report is 2 commits behind head on master.


dujuku and others added 2 commits February 12, 2026 14:57
Add 8 new columns from pg_class to provide comprehensive table metadata:
- has_indexes (relhasindex) - whether table has indexes
- relation_kind (relkind) - type of relation (r=ordinary table, p=partitioned table, f=foreign table)
- num_columns (relnatts) - number of user columns in table
- num_check_constraints (relchecks) - number of CHECK constraints
- has_triggers (relhastriggers) - whether table has triggers
- row_security_enabled (relrowsecurity) - whether row security is enabled
- is_populated (relispopulated) - whether table is populated (for materialized views)
- is_partition (relispartition) - whether this is a partition (PG 10+)

These columns complement the existing optimizer and autovacuum columns
to provide a complete view of table configuration and characteristics.

Updated all 19 test snapshots and changelog to reflect the changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
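
For readers consuming relation_kind, a small lookup like the one below may help. It is a sketch with hypothetical names, using the relkind codes documented for pg_class; it also shows that relkind "p" marks a partitioned parent table, while is_partition flags a child partition.

```python
RELKIND_LABELS = {
    "r": "ordinary table",
    "i": "index",
    "S": "sequence",
    "t": "TOAST table",
    "v": "view",
    "m": "materialized view",
    "c": "composite type",
    "f": "foreign table",
    "p": "partitioned table",
    "I": "partitioned index",
}


def describe_relation(relation_kind: str, is_partition: bool = False) -> str:
    """Turn a relkind code (plus the partition flag) into a readable label."""
    label = RELKIND_LABELS.get(relation_kind, f"unknown ({relation_kind})")
    # A child partition is typically an ordinary table with relispartition set.
    return f"{label} (partition)" if is_partition else label
```
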
Add two new column-level attributes from pg_attribute:

- is_dropped (attisdropped) - whether the column has been dropped
- storage_type (attstorage) - storage mode (p=plain, e=external, m=main, x=extended)

These attributes provide insight into column lifecycle and storage optimization strategies.

Updated all 19 test snapshots and changelog.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
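
A consumer of these two attributes would probably want to skip dropped columns and expand the storage code. The sketch below assumes each column arrives as a dict with name, is_dropped, and storage_type keys; that shape and the helper name are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

ATTSTORAGE_LABELS = {"p": "plain", "e": "external", "m": "main", "x": "extended"}


def live_column_storage(columns: Iterable[Dict[str, object]]) -> List[Tuple[str, str]]:
    """Return (column name, storage label) pairs, skipping dropped columns."""
    result: List[Tuple[str, str]] = []
    for column in columns:
        # Dropped columns stay in pg_attribute but are no longer part of the table.
        if column.get("is_dropped"):
            continue
        code = str(column.get("storage_type", ""))
        result.append((str(column["name"]), ATTSTORAGE_LABELS.get(code, "unknown")))
    return result
```
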
@dujuku dujuku force-pushed the dujuku/pg-schema-columns branch 2 times, most recently from d5fe936 to dc86a87 Compare February 13, 2026 04:13
Refactored schema collection query to inline column definitions instead
of building them as separate string variables. This makes the code
cleaner, reduces repetition, and improves maintainability.

- Replaced three string-building variables with direct inline columns
- Created single is_pg10_or_newer boolean to avoid repeated version checks
- Used inline conditionals to add is_partition column for PG 10+
- All schema tests pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
dujuku and others added 3 commits February 13, 2026 15:46
Break long lines in schemas.py to comply with 120 character limit
and run auto-formatter on all files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fix formatting to use ruff format instead of black to match
CI expectations.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
# TODO: Use the submission debouncer to only send this every 6 hours
self.health.submit_health_event(
name=HealthEvent.INITIALIZATION,
status=HealthStatus.ERROR

Why was this changed?

query += f" AND ({
' OR '.join(f"datname ~ '{include_regex}'" for include_regex in self._config.include_databases)
})"
include_or_conditions = ' OR '.join(

Why was this and the other include blocks changed?

tables.frozen_xid, tables.min_mxid, tables.table_options,
tables.has_indexes, tables.relation_kind, tables.num_columns,
tables.num_check_constraints, tables.has_triggers, tables.row_security_enabled,
tables.is_populated{', tables.is_partition' if is_pg10_or_newer else ''}

This feels fragile and hard to read. Would it be better to pull all these columns out into an array?
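
One possible shape for that suggestion, offered as a sketch rather than the change the author made: keep the column names from the excerpt above in a list, append the version-gated column, and join once. The constant and function names are hypothetical.

```python
from typing import List

# Column names as they appear in the diff excerpt above.
BASE_TABLE_COLUMNS: List[str] = [
    "frozen_xid",
    "min_mxid",
    "table_options",
    "has_indexes",
    "relation_kind",
    "num_columns",
    "num_check_constraints",
    "has_triggers",
    "row_security_enabled",
    "is_populated",
]


def table_select_list(is_pg10_or_newer: bool, alias: str = "tables") -> str:
    """Build the comma-separated, alias-qualified column list for the query."""
    columns = list(BASE_TABLE_COLUMNS)  # copy so the module constant is never mutated
    if is_pg10_or_newer:
        columns.append("is_partition")  # relispartition exists from PostgreSQL 10
    return ", ".join(f"{alias}.{name}" for name in columns)
```

The same list could feed both the SELECT and the GROUP BY clauses, which would keep the two from drifting apart when a column is added.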

{partition_joins}
GROUP BY schema_tables.schema_id, schema_tables.schema_name, schema_tables.schema_owner,
schema_tables.table_id, schema_tables.table_name, schema_tables.table_owner, schema_tables.toast_table
schema_tables.table_id, schema_tables.table_name, schema_tables.table_owner, schema_tables.toast_table,

Do we have any sense of whether a GROUP BY this large will cause performance problems?

]
if cursor_row.get("table_name")
else [],
"tables": (

Why did this whole block shift?

{
"id": "16678",
"name": "personsdup10",
"id": "16697",

Why did all of these churn? Unless we can clearly map each snapshot change to an intended change, we can't be sure there were no unintended side effects.
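
One way to make that mapping checkable, sketched under assumptions: strip the values expected to churn (here only "id", which is a guess about these snapshots) and list the key paths that exist only in the new snapshot. Both helpers are hypothetical and not part of the test suite.

```python
from typing import Any, List

VOLATILE_KEYS = {"id"}  # assumption: OIDs are the only values expected to churn


def strip_volatile(value: Any) -> Any:
    """Recursively drop volatile keys so two snapshots can be compared."""
    if isinstance(value, dict):
        return {k: strip_volatile(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, list):
        return [strip_volatile(v) for v in value]
    return value


def new_key_paths(old: Any, new: Any, path: str = "") -> List[str]:
    """List key paths present in `new` but missing from `old`."""
    paths: List[str] = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(new):
            child = f"{path}.{key}" if path else key
            if key not in old:
                paths.append(child)
            else:
                paths.extend(new_key_paths(old[key], new[key], child))
    elif isinstance(old, list) and isinstance(new, list):
        # Lists are compared position by position; extra trailing items are ignored.
        for index, (old_item, new_item) in enumerate(zip(old, new)):
            paths.extend(new_key_paths(old_item, new_item, f"{path}[{index}]"))
    return paths
```

If the reported paths are exactly the columns this PR adds, the remaining churn comes down to the stripped volatile values and to reordering, which would still need a separate look.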

c.relallvisible AS all_visible_pages,
c.relfrozenxid::text AS frozen_xid,
c.relminmxid::text AS min_mxid,
c.reloptions AS table_options,

These column names don't match the ones in the description.


2 participants