
fix: avoid O(N) memory allocation when displaying large binary blobs in result panel #4876

Merged
kunwp1 merged 14 commits into apache:main from kunwp1:chris-fix-table-scan
May 3, 2026

Conversation

@kunwp1
Contributor

@kunwp1 kunwp1 commented May 3, 2026

What changes were proposed in this PR?

When running a workflow that produces large binary fields (e.g. 50 MB blobs), the result panel appeared empty with no error messages. The root cause was that ExecutionResultService converted the entire byte array to a hex string (~3× the blob size in memory) before truncating it for display, causing the web server to take a very long time to fetch results.

The fix slices the byte array down to only the bytes needed for the preview before encoding, so the conversion cost is O(1) with respect to blob size. The display format also changes from a hex dump (AB 12 CD...) to a binary string preview (<binary 0110100101...110, size = 52,428,800 bytes>) that shows the first binaryPreviewLeadingBits (10) bits and the last binaryPreviewTrailingBits (3) bits, which signals opaque binary content more clearly.
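
For readers who want to see the shape of the preview logic, here is a minimal sketch of slice-then-encode for the binary string format. It is not the code merged in this PR: the object name `BinaryPreview`, the `preview` method, and the small-blob fallback are assumptions for illustration. Only the two constant names and their values (10 and 3) come from the description above.

```scala
import java.text.NumberFormat
import java.util.Locale

// Hypothetical helper; not the actual ExecutionResultService code.
object BinaryPreview {
  // Names and values as given in the PR description.
  val binaryPreviewLeadingBits: Int = 10
  val binaryPreviewTrailingBits: Int = 3

  // Render one byte as exactly 8 bits, left-padded with zeros.
  private def toBits(b: Byte): String = {
    val s = Integer.toBinaryString(b & 0xff)
    "0" * (8 - s.length) + s
  }

  def preview(bytes: Array[Byte]): String = {
    val size = NumberFormat.getIntegerInstance(Locale.US).format(bytes.length.toLong)
    val totalBits = bytes.length.toLong * 8
    if (totalBits <= binaryPreviewLeadingBits + binaryPreviewTrailingBits) {
      // Assumption: blobs too small to elide are shown in full.
      s"<binary ${bytes.map(toBits).mkString}, size = $size bytes>"
    } else {
      // Copy only the few bytes that cover the requested bits, so the
      // work done does not depend on the total blob size.
      val leadBytes = (binaryPreviewLeadingBits + 7) / 8
      val trailBytes = (binaryPreviewTrailingBits + 7) / 8
      val leading = bytes.take(leadBytes).map(toBits).mkString.take(binaryPreviewLeadingBits)
      val trailing = bytes.takeRight(trailBytes).map(toBits).mkString.takeRight(binaryPreviewTrailingBits)
      s"<binary $leading...$trailing, size = $size bytes>"
    }
  }
}
```

With the default constants, only the first two bytes and the last byte of the array are ever copied or encoded, whatever the blob size.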

Performance comparison (50 MB blob, averaged over 5 runs):

| Approach | Time |
| --- | --- |
| Before (full hex conversion) | ~5,971 ms |
| After (slice then encode) | ~0.006 ms |

~1,000,000× faster for a 50 MB blob; effectively constant time regardless of blob size.
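
The two rows of the comparison differ only in the order of "encode" and "truncate". The sketch below illustrates that ordering with a hex preview on both sides, to isolate the effect; the merged change also switches the display format, which this sketch leaves out. The object and method names and the `maxChars` parameter are hypothetical, not taken from ExecutionResultService.

```scala
// Hypothetical illustration of the ordering change; not the PR's code.
object HexPreview {
  private def hex(b: Byte): String = f"${b & 0xff}%02X"

  // Before: encode every byte, then keep the first maxChars characters.
  // Allocates a string of roughly 3 characters per input byte.
  def fullHexThenTruncate(bytes: Array[Byte], maxChars: Int): String =
    bytes.map(hex).mkString(" ").take(maxChars)

  // After: keep only the bytes that can contribute to the preview, then encode.
  // n bytes render as 3n - 1 characters, so maxChars / 3 + 1 bytes always suffice.
  def sliceThenHex(bytes: Array[Byte], maxChars: Int): String = {
    val needed = maxChars / 3 + 1
    bytes.take(needed).map(hex).mkString(" ").take(maxChars)
  }
}
```

Both methods return the same string for any input, but the second one touches a number of bytes bounded by `maxChars` rather than by the blob size.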

[Screenshot, 2026-05-03 4:09:47 PM]

Any related issues, documentation, discussions?

Closes #4875

How was this PR tested?

Run the attached workflow and check that the result appears almost immediately:

Untitled workflow (11).json

Was this PR authored or co-authored using generative AI tooling?

Claude Code

@kunwp1 kunwp1 requested a review from aglinxinyuan May 3, 2026 21:53
@kunwp1 kunwp1 self-assigned this May 3, 2026
@github-actions github-actions Bot added the engine label May 3, 2026
@codecov-commenter

codecov-commenter commented May 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.01%. Comparing base (33ea60e) to head (8fdde77).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4876      +/-   ##
============================================
- Coverage     46.68%   44.01%   -2.68%     
- Complexity     2150     2151       +1     
============================================
  Files           843      957     +114     
  Lines         27322    34051    +6729     
  Branches       2533     3753    +1220     
============================================
+ Hits          12756    14986    +2230     
- Misses        13791    18312    +4521     
+ Partials        775      753      -22     
Flag Coverage Δ
amber 42.81% <100.00%> (+0.03%) ⬆️


Contributor

@Yicong-Huang Yicong-Huang left a comment

LGTM. One question inline.

Comment thread amber/src/main/scala/org/apache/texera/web/service/ExecutionResultService.scala Outdated
@Yicong-Huang
Contributor

We had an offline discussion; the backend optimization is good.

For the frontend, though, it may be better to display a non-hex preview.

here is a suggested format:

<binary: abfas12312...adj 52,428,800 bytes>

@Yicong-Huang
Contributor

looks good!

@kunwp1 kunwp1 enabled auto-merge (squash) May 3, 2026 23:19
@kunwp1 kunwp1 merged commit 1bf18e7 into apache:main May 3, 2026
13 checks passed
@kunwp1 kunwp1 deleted the chris-fix-table-scan branch May 3, 2026 23:44
@chenlica
Contributor

chenlica commented May 4, 2026

Great job!

@Yicong-Huang
Contributor

Forgot to ask: does this need to be backported to release/1.1?

Development

Successfully merging this pull request may close these issues.

Result panel shows no results when operator produces large binary blobs

4 participants