sql: add system.vcpu_hours_audit table#170328

Open
sadaf-crl wants to merge 1 commit into cockroachdb:master from sadaf-crl:CC-35137

Conversation

@sadaf-crl
Contributor

@sadaf-crl sadaf-crl commented May 14, 2026

Add the system.vcpu_hours_audit table to track per-node, per-hour vCPU consumption for license auditing. This table stores:

  • node_id: ID of the node reporting vCPU usage
  • license_id: license identifier under which vCPUs are consumed
  • hour_timestamp: timestamp of the hour bucket for this measurement
  • num_vcpu: number of vCPUs on the node during this hour

The table has a composite primary key on (node_id, license_id, hour_timestamp) and is configured with admin read/write privileges. Data retention is intended to be 30 days via a future GC mechanism.

This is part of the self-hosted license audit feature to enable customers to generate vCPU consumption reports for compliance tracking and renewals.

Changes include:

  • Schema definition and descriptor in systemschema/system.go
  • Table name constant in catconstants/constants.go
  • Bootstrap metadata registration in bootstrap/metadata.go
  • Privilege configuration in catprivilege/system.go
  • Cluster version gate V26_3_AddVcpuHoursAuditTable
  • Upgrade migration for existing clusters
  • SystemDatabaseSchemaBootstrapVersion updated
  • NumSystemTablesForSystemTenant incremented to 71

Epic: CC-35515
Release note: None

@trunk-io
Contributor

trunk-io Bot commented May 14, 2026

Merging to master in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here.

@cockroach-teamcity
Member

This change is Reviewable

@sadaf-crl sadaf-crl force-pushed the CC-35137 branch 4 times, most recently from f0f3f21 to b957903 Compare May 20, 2026 09:32
@blathers-crl

blathers-crl Bot commented May 20, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

@sadaf-crl sadaf-crl force-pushed the CC-35137 branch 6 times, most recently from 918209d to c5855d4 Compare May 21, 2026 16:46
@blathers-crl

blathers-crl Bot commented May 21, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

Add the system.vcpu_hours_audit table to track per-node, per-hour
vCPU consumption for license auditing. This table stores:
- node_id: ID of the node reporting vCPU usage
- license_id: license identifier under which vCPUs are consumed
- hour_timestamp: timestamp of the hour bucket for this measurement
- num_vcpu: number of vCPUs on the node during this hour

The table has a composite primary key on (node_id, license_id,
hour_timestamp) and is configured with admin read/write privileges.

This is part of the self-hosted license audit feature to enable
customers to generate vCPU consumption reports for compliance
tracking and renewals.

Changes include:
- Schema definition and descriptor in systemschema/system.go
- Table name constant in catconstants/constants.go
- Bootstrap metadata registration in bootstrap/metadata.go
- Privilege configuration in catprivilege/system.go
- Backup configuration in backup/system_schema.go (opted out)
- Cluster version gate V26_3_AddVcpuHoursAuditTable
- Upgrade migration for existing clusters
- SystemDatabaseSchemaBootstrapVersion updated
- NumSystemTablesForSystemTenant incremented to 71
- Generated settings documentation updated
- Regenerated golden files for updated system table counts

Epic: CC-35515
Release note: None
@sadaf-crl sadaf-crl marked this pull request as ready for review May 25, 2026 05:43
@sadaf-crl sadaf-crl requested review from a team as code owners May 25, 2026 05:43
@sadaf-crl sadaf-crl requested review from a team, angles-n-daemons, bghal, kev-cao, rahulcrl, vishalv1994 and visheshbardia and removed request for a team May 25, 2026 05:43
Comment on lines +1446 to +1456
// * license_id: the license ID under which vCPUs are consumed (NULL if no license installed).
// * hour_timestamp: the timestamp of the hour bucket for this measurement.
// * node_id: the ID of the node reporting vCPU usage.
// * num_vcpu: the number of vCPUs on the node during this hour.
VcpuHoursAuditTableSchema = `
CREATE TABLE system.vcpu_hours_audit (
license_id STRING,
hour_timestamp TIMESTAMPTZ NOT NULL,
node_id INT8 NOT NULL,
num_vcpu FLOAT NOT NULL,
CONSTRAINT "primary" PRIMARY KEY (license_id, hour_timestamp, node_id),
Contributor

Bug: The comment on line 1446 documents license_id as "NULL if no license installed," but license_id is part of the PRIMARY KEY (line 1456), which makes it implicitly NOT NULL. The bootstrap testdata confirms the column resolves to license_id STRING NOT NULL. Any future insertion with license_id = NULL for unlicensed nodes will fail with a NOT NULL violation.

Suggested fix: either update the comment to document a sentinel value (e.g., empty string '') instead of NULL when no license is installed, or restructure the primary key to exclude license_id if NULL semantics are truly needed.
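For the first option, a minimal Go sketch of what a writer-side sentinel could look like. `NoLicenseSentinel` and `LicenseKey` are hypothetical names, and the choice of the empty string is only the example value mentioned above, not something the PR defines.

```go
package main

// NoLicenseSentinel is a hypothetical stand-in for "no license installed".
// The empty string is only the example value suggested in the review comment.
const NoLicenseSentinel = ""

// LicenseKey maps an optional license ID to a non-NULL primary key value,
// so rows for unlicensed nodes can still be written.
func LicenseKey(licenseID *string) string {
	if licenseID == nil {
		return NoLicenseSentinel
	}
	return *licenseID
}
```

Readers of the table would then need to treat the sentinel as "unlicensed" rather than as a real license ID.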

@github-actions
Contributor

AI Review: Potential Issue Detected

An inline comment has been added to pkg/sql/catalog/systemschema/system.go identifying a contradiction between the license_id column's documented behavior (NULL when no license is installed) and its inclusion in the PRIMARY KEY (which makes it implicitly NOT NULL).


If helpful: add O-AI-Review-Real-Issue-Found label.
If not helpful: add O-AI-Review-Not-Helpful label.

@github-actions github-actions Bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label May 25, 2026
license_id STRING,
hour_timestamp TIMESTAMPTZ NOT NULL,
node_id INT8 NOT NULL,
num_vcpu FLOAT NOT NULL,
Contributor

We are using FLOAT for vCPUs. Can it be fractional?
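One case where a fractional value could arise is a node running under a container CPU limit, where the limit is a quota divided by a period. The PR does not say how num_vcpu is computed, so the Go sketch below is an assumption for illustration only; `VCPUsFromQuota` and its parameters are hypothetical.

```go
package main

// VCPUsFromQuota derives an effective vCPU count from a CFS-style CPU quota.
// Assumption: the node is limited by quotaMicros per periodMicros; a
// non-positive quota or period is treated as "no limit". Hypothetical helper,
// not taken from the PR.
func VCPUsFromQuota(quotaMicros, periodMicros int64, hostCPUs int) float64 {
	if quotaMicros <= 0 || periodMicros <= 0 {
		return float64(hostCPUs)
	}
	limit := float64(quotaMicros) / float64(periodMicros)
	if limit > float64(hostCPUs) {
		// A quota above the host's capacity cannot be used in full.
		return float64(hostCPUs)
	}
	return limit
}
```

For example, a quota of 150000 with a period of 100000 on an 8-CPU host gives 1.5 vCPUs, which an integer column could not represent.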

node_id INT8 NOT NULL,
num_vcpu FLOAT NOT NULL,
CONSTRAINT "primary" PRIMARY KEY (license_id, hour_timestamp, node_id),
FAMILY "primary" (license_id, hour_timestamp, node_id, num_vcpu)
Contributor

@rahulcrl rahulcrl May 25, 2026

With license_id + hour_timestamp as the PK prefix, writes will hotspot on a single range. With one license per cluster and a shared hour bucket, every node's hourly insert has the same key prefix, so IIUC the rows sort adjacent to each other and land in the same range, and one leaseholder serializes the whole cluster's writes. That one range ends up doing all the work, which can be a problem for deployments with a large number of nodes.

Can we use a hash-sharded leading column like statement_statistics does? https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/catalog/systemschema/system.go#L778

// * hour_timestamp: the timestamp of the hour bucket for this measurement.
// * node_id: the ID of the node reporting vCPU usage.
// * num_vcpu: the number of vCPUs on the node during this hour.
VcpuHoursAuditTableSchema = `
Contributor

Query: The PR desc says retention is 30 days "via a future GC mechanism". Are we planning to add a TTL later?

3 participants