[cleanup] Remove unnecessary backward-compat check in DataTable serialization#18002
Conversation
The conditional check on CPU time and memory measurement during DataTable serialization was kept solely for backward compatibility per the inline comment. All current server versions support thread resource usage reporting, making this guard unnecessary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep strong assertions (> 0) for the enabled-measurement case. Use assertNotNull for the disabled case where values are always populated but may be zero. Improve comment accuracy. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #18002 +/- ##
============================================
- Coverage 63.36% 63.30% -0.07%
Complexity 1543 1543
============================================
Files 3200 3200
Lines 194114 194074 -40
Branches 29893 29883 -10
============================================
- Hits 122993 122849 -144
- Misses 61461 61587 +126
+ Partials 9660 9638 -22
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
Removes the conditional “backward-compat” guard so DataTable V4 serialization always includes response serialization CPU/memory metadata, and updates tests to align with always-populated metadata behavior.
Changes:
- Always write
RESPONSE_SER_CPU_TIME_NSandRESPONSE_SER_MEM_ALLOCATED_BYTESmetadata inDataTableImplV4#toBytes() - Update
DataTableSerDeTestexpectations for response-serialization metadata presence when measurement is disabled - Adjust inline test comments to reflect the new behavior
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pinot-core/src/test/java/org/apache/pinot/core/common/datatable/DataTableSerDeTest.java | Updates assertions/comments to reflect metadata that is now always populated. |
| pinot-common/src/main/java/org/apache/pinot/common/datatable/DataTableImplV4.java | Removes conditional checks and always serializes response serialization CPU/memory metadata. |
| // When measurement is enabled, response serialization metadata should have positive values. | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_CPU_TIME_NS.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.SYSTEM_ACTIVITIES_CPU_TIME_NS.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_MEM_ALLOCATED_BYTES.getName())); |
There was a problem hiding this comment.
The updated comment says response serialization metadata should have positive values when measurement is enabled, but the test does not assert anything about RESPONSE_SER_CPU_TIME_NS / RESPONSE_SER_MEM_ALLOCATED_BYTES in this enabled branch. Add assertions that these keys are present and validate the values (e.g., parse as long and assert > 0) or adjust the comment to match what the test actually verifies.
| // Response serialization metadata is always populated, but may be zero when measurement is disabled. | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_CPU_TIME_NS.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.SYSTEM_ACTIVITIES_CPU_TIME_NS.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.RESPONSE_SER_CPU_TIME_NS.getName())); | ||
| Assert.assertNotNull(newDataTable.getMetadata().get(MetadataKey.RESPONSE_SER_CPU_TIME_NS.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_MEM_ALLOCATED_BYTES.getName())); | ||
| Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.RESPONSE_SER_MEM_ALLOCATED_BYTES.getName())); | ||
| Assert.assertNotNull(newDataTable.getMetadata().get(MetadataKey.RESPONSE_SER_MEM_ALLOCATED_BYTES.getName())); |
There was a problem hiding this comment.
Since the intent is that these fields are always populated but might be zero when measurement is disabled, asserting only NotNull is relatively weak. Consider also validating that the values are numeric (parseable as long) and match the expected disabled semantics (e.g., == 0 if guaranteed, or >= 0 if not). This makes the test more robust and ensures the serialization contract is enforced.
| // Add table serialization time and memory metadata. | ||
| getMetadata().put(MetadataKey.RESPONSE_SER_CPU_TIME_NS.getName(), | ||
| String.valueOf(resourceSnapshot.getCpuTimeNs())); | ||
| getMetadata().put(MetadataKey.RESPONSE_SER_MEM_ALLOCATED_BYTES.getName(), | ||
| String.valueOf(resourceSnapshot.getAllocatedBytes())); |
There was a problem hiding this comment.
Now that response serialization metadata is always emitted, the comment should document the expected values when measurement is disabled/unavailable (e.g., zero-filled). Adding that note here (or on the corresponding MetadataKey docs) helps downstream consumers understand whether to treat 0 differently from missing values, especially since this PR changes the old absence-based behavior.
| getMetadata().put(MetadataKey.RESPONSE_SER_MEM_ALLOCATED_BYTES.getName(), | ||
| String.valueOf(resourceSnapshot.getAllocatedBytes())); | ||
| } | ||
| // Add table serialization time and memory metadata. |
There was a problem hiding this comment.
I feel we should keep the current logic and revise the comment. When CPU time/memory usage not collectable, we shouldn't return their value
Per reviewer feedback, the conditional check on CPU time and memory measurement should be kept — when these are not collectable, we should not return their values. Reverted the production code change and updated the comment to accurately describe the intended behavior instead of the old misleading TODO. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Please update the PR title and description
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
pinot-core/src/test/java/org/apache/pinot/core/common/datatable/DataTableSerDeTest.java:287
- The metadata keys RESPONSE_SER_CPU_TIME_NS / RESPONSE_SER_MEM_ALLOCATED_BYTES are defined as LONGs, but the assertions below use Integer.parseInt, which can overflow and make this test flaky on slower/large allocations. Prefer parsing as long (Long.parseLong) and comparing using long values.
// When ThreadCpuTimeMeasurement is enabled, responseSerializationCpuTimeNs should be positive.
Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_CPU_TIME_NS.getName()));
Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.SYSTEM_ACTIVITIES_CPU_TIME_NS.getName()));
Assert.assertNull(newDataTable.getMetadata().get(MetadataKey.THREAD_MEM_ALLOCATED_BYTES.getName()));
Assert.assertTrue(
| // Add table serialization time and memory metadata when the corresponding measurement is enabled. | ||
| // When CPU time/memory usage is not collectable, we omit these values from the metadata. | ||
| if (ThreadResourceUsageProvider.isThreadCpuTimeMeasurementEnabled()) { | ||
| getMetadata().put(MetadataKey.RESPONSE_SER_CPU_TIME_NS.getName(), | ||
| String.valueOf(resourceSnapshot.getCpuTimeNs())); |
There was a problem hiding this comment.
This change only updates the comment, but the code still conditionally writes response serialization CPU/memory metadata based on ThreadResourceUsageProvider.isThread*MeasurementEnabled(). The PR title/description says the backward-compat guard was removed and the values are now always populated—either the implementation needs to be updated to match that behavior or the PR description/title should be corrected.
Summary
Test plan
./mvnw spotless:apply -pl pinot-commonpasses./mvnw checkstyle:check -pl pinot-commonpasses./mvnw license:check -pl pinot-commonpasses🤖 Generated with Claude Code