feat: route computing-unit metadata over HTTP, off Postgres #5298
Draft
bobbai00 wants to merge 4 commits into
The computing unit runs user-defined functions yet shipped with Postgres credentials (issue apache#5011). Add an opt-in path so the CU performs no direct JDBC access:
- Execution metadata (create execution; runtime-stats / console / result URIs; latest-execution and result-URI lookup) routes to the Dashboard Service via a new /api/internal/execution-metadata/* API and an HTTP client; dataset-path resolution routes to file-service's new /api/dataset/resolve. Both forward the user's JWT.
- Routing is active when SqlServer is uninitialized (the CU) and the user token is present; ComputingUnitMaster skips SqlServer.initConnection and DB cleanup when EXECUTION_METADATA_REMOTE=true.
No behavior change with the flag off: the Dashboard Service keeps the direct-DB path.
Unit-tested both HTTP clients (RemoteDatasetResolver, RemoteExecutionMetadata).
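The routing rule described above can be sketched as two small predicates. This is an illustrative Python sketch, not the PR's Scala code: the function names are hypothetical, and the case-insensitive parsing of the flag is an assumption (the description only states the value `EXECUTION_METADATA_REMOTE=true`).

```python
import os
from typing import Mapping, Optional


def remote_mode_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Startup flag: when set, the CU master skips DB connection init and DB cleanup.

    Assumption: the flag is compared case-insensitively and defaults to off.
    """
    return env.get("EXECUTION_METADATA_REMOTE", "false").strip().lower() == "true"


def should_route_remotely(sql_server_initialized: bool, user_token: Optional[str]) -> bool:
    """Per-call rule from the description: go over HTTP only when the local
    SQL connection was never initialized (the CU case) and a user JWT is
    available to forward."""
    return (not sql_server_initialized) and bool(user_token)
```

With the flag off, the SQL connection is initialized at startup, so `should_route_remotely` is false and the direct-DB path stays in use.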
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5298 +/- ##
============================================
- Coverage 49.17% 49.02% -0.16%
+ Complexity 2386 2380 -6
============================================
Files 1051 1054 +3
Lines 40350 40516 +166
Branches 4279 4313 +34
============================================
+ Hits 19841 19861 +20
- Misses 19352 19476 +124
- Partials 1157 1179 +22
added 3 commits
May 30, 2026 14:40
…ing-service
The computing unit re-compiled logical plans in-process (issue apache#5011), keeping amber's WorkflowCompiler on the execution path. Route compilation to the workflow-compiling-service over HTTP so the CU just runs a ready-made plan:
- WorkflowExecutionService.executeWorkflow POSTs the logical plan to /api/compile (CompilingServiceClient) and runs the returned PhysicalPlan; the runtime no longer needs the logical plan (Workflow holds None). A failed compile now reports the error and returns instead of dereferencing a null workflow. The dead in-process validateWorkflow is removed.
- PhysicalPlan is made JSON-serializable end to end: operator partition logic moves from closures to a DerivePartitionSpec ADT, with custom serdes for OpExecInitInfo, LocationPreference, OutputMode, and per-port PhysicalOp views, registered in a shared PhysicalPlanSerdeModule. WorkflowCompilationResource accepts workflow/execution ids and returns a typed success/failure response.
- Fix a latent LogicalLink round-trip (issue apache#5042): fromOpId / toOpId now serialize as bare strings (the shape the @JsonCreator constructor reads), so a re-serialized plan deserializes again.
Adds round-trip serde tests (hand-built, compiler-produced, and a thorough multi-operator suite) and the LogicalLink round-trip regression.
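The LogicalLink fix amounts to making the writer emit the same shape the reader accepts. The sketch below illustrates that round-trip property in Python. `OperatorId` and `Link` are hypothetical stand-ins (the real LogicalLink is a Scala class handled by Jackson and carries more fields), and the nested-object shape rejected by the reader is only an assumed example of a mismatch.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorId:
    """Hypothetical stand-in for an operator identity wrapper."""
    id: str


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for a logical link between two operators."""
    from_op_id: OperatorId
    to_op_id: OperatorId

    def to_json(self) -> str:
        # Emit the ids as bare strings, the same shape from_json reads.
        return json.dumps({"fromOpId": self.from_op_id.id, "toOpId": self.to_op_id.id})

    @staticmethod
    def from_json(text: str) -> "Link":
        raw = json.loads(text)
        # The reader accepts only bare strings, like a creator constructor
        # that takes string arguments.
        for key in ("fromOpId", "toOpId"):
            if not isinstance(raw[key], str):
                raise ValueError(f"{key} must be a bare string")
        return Link(OperatorId(raw["fromOpId"]), OperatorId(raw["toOpId"]))
```

A regression test for this property only needs to assert that `Link.from_json(link.to_json()) == link` for a representative link.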
amber and the workflow-compiling-service each carried a near-duplicate WorkflowCompiler and
logical-plan model (LogicalPlan, LogicalPlanPojo, LogicalLink) — the duplication was even
flagged in-code ("we should consider merge this compile with WorkflowCompilingService's").
Move the canonical pieces into the shared workflow-operator module:
- org.apache.texera.amber.compiler.model.{LogicalLink, LogicalPlan, LogicalPlanPojo} now live
in workflow-operator as the single copy (the LogicalLink string serializer from apache#5042 is
preserved). The duplicates in amber and workflow-compiling-service are deleted.
- PhysicalPlanExpander.expand holds the logical-to-physical expansion the two compilers shared.
- Each compiler is now a thin wrapper over it: amber's adds result-storage planning and builds
a runtime Workflow (carrying the logical plan) for execution; the compiling-service's adds
output-schema collection and error reporting for the editor.
amber's references migrate to the shared model package. No behavior change: the logical-link
round-trip, compiler-produced physical-plan serde, and storage-port collection specs all pass.
… JWT auth
The client (frontend and agent service) now compiles the workflow against the workflow-compiling-service and ships the ready-to-run PhysicalPlan to the Computing Unit, which runs it directly, with no in-process or HTTP compilation and no JWT authentication, so the CU no longer needs the JWT secret (issue apache#5011).
Frontend:
- ExecuteWorkflowService compiles via WorkflowCompilingService on Run and sends WorkflowExecuteRequest{physicalPlan, opsToViewResult}; on a compile failure it surfaces the error and does not start a run.
- The workflow websocket URL no longer carries an access token.
Computing Unit:
- ComputingUnitMaster drops setupJwtAuth and RolesAllowedDynamicFeature, keeping only the SessionUser value-factory binder so @Auth parameters on co-registered dashboard resources stay injectable; it registers PhysicalPlanSerdeModule.
- ServletAwareConfigurator's single-node handshake no longer parses a token.
- SyncExecutionResource (/run) accepts a PhysicalPlan and drops @Auth/@RolesAllowed.
- WorkflowExecutionService runs request.physicalPlan; InternalExecutionMetadataResource falls back to the metadata caller's uid when the CU sends none.
Cleanup:
- Remove the now-unused CompilingServiceClient and WORKFLOW_COMPILING_SERVICE_ENDPOINT env var left from the abandoned CU-side compilation offload.
Test:
- Add ClientPhysicalPlanRequestSpec covering the physical-plan request round-trip.
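The client-side flow in this commit (compile first, then send the physical plan, and never start a run after a failed compile) can be sketched as follows. This is an illustrative Python sketch rather than the frontend's TypeScript. The callables and the shape of the compile response (`physicalPlan` / `error` keys) are assumptions; only the `physicalPlan` and `opsToViewResult` request fields come from the commit message.

```python
from typing import Any, Callable, Dict, List


def run_workflow(
    logical_plan: Dict[str, Any],
    ops_to_view_result: List[str],
    compile_plan: Callable[[Dict[str, Any]], Dict[str, Any]],
    send_execute_request: Callable[[Dict[str, Any]], None],
    report_error: Callable[[str], None],
) -> bool:
    """Compile on the client, then ship the ready-to-run plan to the computing unit.

    Returns True when an execute request was sent, False when compilation failed.
    """
    result = compile_plan(logical_plan)  # hypothetical call to the compiling service
    physical_plan = result.get("physicalPlan")
    if physical_plan is None:
        # Surface the error and stop: no run is started on a compile failure.
        report_error(result.get("error", "compilation failed"))
        return False
    send_execute_request(
        {"physicalPlan": physical_plan, "opsToViewResult": ops_to_view_result}
    )
    return True
```

Passing the compile, send, and error-reporting steps in as callables keeps the ordering rule testable without a running service.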
What changes were proposed in this PR?
The computing unit (which runs user-defined functions) no longer needs direct Postgres access. Execution-metadata operations route to the Dashboard Service (TexeraWebApplication) via a new /api/internal/execution-metadata/* API (InternalExecutionMetadataResource) and an HTTP client (RemoteExecutionMetadata); dataset-path resolution routes to file-service's new GET /api/dataset/resolve (RemoteDatasetResolver + a one-line dispatch in FileResolver). Routing engages when SqlServer is uninitialized (the CU) and the forwarded user JWT is present; ComputingUnitMaster skips SqlServer.initConnection and DB cleanup when EXECUTION_METADATA_REMOTE=true. No behavior change with the flag off: the Dashboard Service keeps the direct-DB path. Also adds SqlServer.isInitialized and the new env-var names.
Any related issues, documentation, discussions?
Part of #5011
How was this PR tested?
Unit tests for both HTTP clients (RemoteDatasetResolverSpec, RemoteExecutionMetadataSpec) using in-process HTTP servers: positive, 404→None, and error-status paths. Manually end-to-end: launched the full stack with the CU master in EXECUTION_METADATA_REMOTE=true mode (no STORAGE_JDBC_*; verified no Hikari/SqlServer init) against a Lakekeeper REST catalog; the IMDB example workflows (Movies, Iris) and workflow 2568 ran to completion with every execution-metadata and dataset-resolution call served over HTTP (confirmed in the Dashboard Service / file-service request logs).
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (1M context)
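The unit-testing approach described under "How was this PR tested?" (an in-process HTTP server exercising the positive, 404→None, and error-status paths) can be sketched as below. This is an illustrative Python sketch, not the PR's Scala specs. The `path` query parameter, the response body, the `Bearer` header format, and all function names are assumptions; only the `/api/dataset/resolve` route and the JWT forwarding come from the description.

```python
import json
import threading
import urllib.error
import urllib.parse
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Ignore environment-configured proxies so the loopback stub is reached directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def resolve_dataset(base_url: str, path: str, token: str) -> Optional[dict]:
    """Hypothetical client: forwards the user's JWT and maps 404 to None."""
    query = urllib.parse.urlencode({"path": path})  # parameter name is assumed
    request = urllib.request.Request(
        f"{base_url}/api/dataset/resolve?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with _OPENER.open(request, timeout=5) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None  # not found is an expected outcome, not a failure
        raise RuntimeError(f"dataset resolve failed with HTTP {error.code}") from error


class StubHandler(BaseHTTPRequestHandler):
    """In-process stand-in for the remote service, driven by the requested path."""

    def do_GET(self):
        query = urllib.parse.urlparse(self.path).query
        requested = urllib.parse.parse_qs(query).get("path", [""])[0]
        if self.headers.get("Authorization") != "Bearer test-token":
            status, body = 401, {}
        elif requested == "/missing":
            status, body = 404, {}
        elif requested == "/broken":
            status, body = 500, {}
        else:
            status, body = 200, {"resolved": requested}
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep test output quiet


def start_stub_server():
    """Bind to an ephemeral loopback port and serve from a daemon thread."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"
```

A test starts the stub with `start_stub_server()`, calls `resolve_dataset` once per path, and calls `server.shutdown()` when done. Binding to port 0 lets the operating system pick a free port, so tests can run in parallel without port clashes.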