test(workflow-core): add unit test coverage for ResultSchema by aglinxinyuan · Pull Request #4786 · apache/texera

aglinxinyuan · 2026-05-03T01:51:28Z

What changes were proposed in this PR?

Add ResultSchemaSpec pinning the canonical column layout of ResultSchema:

runtimeStatisticsSchema exposes the canonical columns in order
runtimeStatisticsSchema pins the type of every column (operatorId STRING; time TIMESTAMP; inputTupleCnt/inputTupleSize/outputTupleCnt/outputTupleSize/dataProcessingTime/controlProcessingTime/idleTime LONG; numWorkers/status INTEGER) so a type change in any slot fails the spec
consoleMessagesSchema exposes a single STRING message column

Any related issues, documentation, discussions?

Closes #4785

How was this PR tested?

sbt "WorkflowCore/testOnly org.apache.texera.amber.core.storage.result.ResultSchemaSpec" — 3/3 tests pass.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

Add ResultSchemaSpec pinning the canonical column layout of runtimeStatisticsSchema (column order and per-column types) and consoleMessagesSchema, guarding against silent drift. Closes apache#4785 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR adds a new unit spec for ResultSchema in workflow-core to pin the expected layout of runtime-statistics and console-messages result documents, helping catch schema drift before it breaks downstream consumers.

Changes:

Add ResultSchemaSpec covering the runtime-statistics column order.
Add assertions for selected runtime-statistics column types and the console-messages schema shape.
Introduce focused unit coverage for a schema contract used by result-storage readers.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

codecov-commenter · 2026-05-03T01:54:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 42.69%. Comparing base (a5b8957) to head (743cfb5).
⚠️ Report is 28 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #4786      +/-   ##
============================================
+ Coverage     42.13%   42.69%   +0.56%     
- Complexity     2001     2038      +37     
============================================
  Files           957      957              
  Lines         34094    34094              
  Branches       3753     3753              
============================================
+ Hits          14364    14556     +192     
+ Misses        18952    18748     -204     
- Partials        778      790      +12

Flag	Coverage Δ
access-control-service	`28.12% <ø> (ø)`
amber	`41.56% <ø> (+1.23%)`	⬆️
computing-unit-managing-service	`0.00% <ø> (ø)`
config-service	`0.00% <ø> (ø)`
file-service	`33.24% <ø> (ø)`
workflow-compiling-service	`47.72% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Per Copilot feedback on apache#4786: include `operatorId`, `inputTupleSize`, `outputTupleSize`, `dataProcessingTime`, and `controlProcessingTime` in the type assertions. Downstream readers deserialize this schema positionally and cast every slot, so a type change in any column should fail the spec. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Yicong-Huang · 2026-05-03T07:41:23Z

@@ -0,0 +1,70 @@
+/*


do you think this file needs test? seems it only defined three static schemas. Can we skip it?

let's check the coverage about this change. I kind of don't believe the current numbers shown.

Fair concern. The motivation is that downstream readers deserialize this schema positionally and cast each slot to a concrete type, so a silent reorder/retype would break consumers without any local-file change being obvious in review (the OG Copilot thread above made the same point). Pinning the column list + types here lets a CI red flag catch the drift instead of waiting for a runtime cast failure.

Expanded the spec in 743cfb5 so it goes beyond redeclaring the schema. New runtimeStatisticsSchema coverage: stable name → index mapping for positional readers; unknown-name lookup throws and the error message names the missing column; containsAttribute returns false for unknown names; column names are unique; toRawSchema → fromRawSchema round-trips names + types intact (the cross-language serialization contract that Python and external consumers actually depend on); singleton-val identity. Parallel coverage added for consoleMessagesSchema. 12 tests total, all passing.

Let me know if this changes your mind on closing — happy to either keep it or close if it still feels like ceremony.

Drop the awkward "expose ... canonical ..." phrasing in favor of plain "list its columns in the declared order" / "have a single STRING column" per Yicong-Huang's review note. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Address Yicong-Huang's review on apache#4786: the previous version mostly re-stated the source of truth. Lift the spec from "schema definition parrot" to "schema contract pinning" by exercising real lookup, serialization, and identity behaviors that downstream consumers depend on. Pull the (name, type) layout into a single `runtimeStatsLayout` source of truth and drive multiple tests off it. New behaviors covered for runtimeStatisticsSchema: - name → index mapping is stable for positional readers - getAttribute on an unknown name throws and the error message names it - containsAttribute returns false for an unknown name - column names are unique (no accidental dupes) - toRawSchema → fromRawSchema round-trips names + types intact (the cross-language serialization contract that Python / external consumers actually depend on) - the schema is a singleton val (same instance per access) Parallel coverage added for consoleMessagesSchema: index of `message` is 0, unknown-name lookup throws, toRawSchema equals `{message: STRING}`, and singleton identity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 3, 2026 01:51

github-actions Bot assigned aglinxinyuan May 3, 2026

github-actions Bot added the common label May 3, 2026

Copilot started reviewing on behalf of aglinxinyuan May 3, 2026 01:51 View session

Copilot AI reviewed May 3, 2026

View reviewed changes

Comment thread ...kflow-core/src/test/scala/org/apache/texera/amber/core/storage/result/ResultSchemaSpec.scala Outdated

aglinxinyuan requested a review from Yicong-Huang May 3, 2026 01:59

Yicong-Huang reviewed May 3, 2026

View reviewed changes

Comment thread ...kflow-core/src/test/scala/org/apache/texera/amber/core/storage/result/ResultSchemaSpec.scala Outdated

Yicong-Huang reviewed May 3, 2026

View reviewed changes

aglinxinyuan and others added 2 commits May 3, 2026 01:38

aglinxinyuan requested a review from Yicong-Huang May 4, 2026 04:08

aglinxinyuan closed this May 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(workflow-core): add unit test coverage for ResultSchema#4786

test(workflow-core): add unit test coverage for ResultSchema#4786
aglinxinyuan wants to merge 4 commits into
apache:mainfrom
aglinxinyuan:xinyuan-test-result-schema-spec

aglinxinyuan commented May 3, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

codecov-commenter commented May 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Yicong-Huang May 3, 2026

Uh oh!

Yicong-Huang May 3, 2026

Uh oh!

aglinxinyuan May 3, 2026 •

edited

Loading

Uh oh!

aglinxinyuan May 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

aglinxinyuan commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this PR?

Any related issues, documentation, discussions?

How was this PR tested?

Was this PR authored or co-authored using generative AI tooling?

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

codecov-commenter commented May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Yicong-Huang May 3, 2026

Choose a reason for hiding this comment

Uh oh!

Yicong-Huang May 3, 2026

Choose a reason for hiding this comment

Uh oh!

aglinxinyuan May 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aglinxinyuan May 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aglinxinyuan commented May 3, 2026 •

edited

Loading

codecov-commenter commented May 3, 2026 •

edited

Loading

aglinxinyuan May 3, 2026 •

edited

Loading