fix: cherry-pick 413 iterative bisection and TaggedOperation refactor into 1.12.5 #27127
…ctor in bulk index sinks (#26347)
* fix: add iterative bisection on 413 and TaggedOperation refactor in bulk index sinks
* fix: align JUnit 5 versions via BOM to fix local test discovery
* fix: pass ReportDataType via constructor to CreateReportDataProcessor
List<?> bulkOps =
    (List<?>) entityProcessor.process(enriched, contextData);
operationsQueue.addAll(bulkOps);
EntityReference ref = entity.getEntityReference();
bulkOps.forEach(op -> opsQueue.add(new TaggedOperation<>(op, ref)));
💡 Quality: Raw-type TaggedOperation loses compile-time type safety
In DataAssetsWorkflow.processSource, the processor returns List<?> and each element is wrapped as TaggedOperation<>(op, ref) where op is Object. The sink field is declared as raw Sink searchIndexSink, so write(batch) compiles without type checking. If the processor ever returns a non-BulkOperation object, the error would only surface at runtime deep inside the sink.
This is a pre-existing raw-type pattern on the Sink field, but the new code adds another layer of type erasure through TaggedOperation<?>. Consider narrowing the types to catch mismatches at compile time.
Suggested fix:
// In DataAssetsWorkflow, type the sink and processor fields properly:
private Sink<List<TaggedOperation<BulkOperation>>, BulkResponse> searchIndexSink;
private Processor<List<BulkOperation>, EntityInterface> entityProcessor;
// Then the forEach becomes type-safe:
List<BulkOperation> bulkOps = entityProcessor.process(enriched, contextData);
bulkOps.forEach(op -> opsQueue.add(new TaggedOperation<>(op, ref)));
Code Review 👍 Approved with suggestions (0 resolved / 1 finding)
Cherry-picks iterative bisection and TaggedOperation refactor into 1.12.5 to address issue #26344. Consider adding type parameters to TaggedOperation usage in DataAssetsWorkflow to maintain compile-time type safety.
💡 Quality: Raw-type TaggedOperation loses compile-time type safety
📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/insights/workflows/dataAssets/DataAssetsWorkflow.java:289-292
📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/insights/workflows/dataAssets/DataAssetsWorkflow.java:347-356
Dropping this for now; will cherry-pick into 1.12.6 once the branch is up.
Cherry-pick of #26347 (da400385d4) from main into 1.12.5.
Changes
* ElasticSearchIndexSink and OpenSearchIndexSink: instead of failing the whole batch, the sink splits it in half recursively until the oversized document is isolated and skipped with a clear error
* TaggedOperation wrapper so entity references are tracked through the bulk pipeline, ensuring errors report the actual entity instead of a positional index
* … 1.12.5-specific test dependencies in openmetadata-service/pom.xml, and dropped junit-jupiter-params, which was removed by the original commit
Motivation
The Data Insights Application on customer environments was failing with 413 errors when bulk indexing large entity batches. This fix was already merged to main but was missing from the 1.12.5 release branch.
Fixes #26344
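The bisection behavior described under Changes can be sketched roughly as follows. This is a minimal illustration, not the code from #26347: the class BisectionSketch, the Tagged wrapper (a stand-in for TaggedOperation), and the BulkSender interface are all hypothetical names, and the real sinks talk to Elasticsearch/OpenSearch clients rather than a boolean callback. The sketch uses an explicit stack instead of recursion, on the assumption that this is what "iterative" in the title refers to.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class BisectionSketch {
  /** Hypothetical stand-in for TaggedOperation: an operation plus the entity it belongs to. */
  static final class Tagged<T> {
    final T op;
    final String entityRef;

    Tagged(T op, String entityRef) {
      this.op = op;
      this.entityRef = entityRef;
    }
  }

  /** Hypothetical transport: returns false when the server would reject the batch with HTTP 413. */
  interface BulkSender<T> {
    boolean send(List<Tagged<T>> batch);
  }

  /**
   * Sends the batch; on a 413 it splits the rejected batch in half and retries each half,
   * until any oversized operation is isolated in a batch of one and skipped.
   * Returns the entity references of the skipped operations.
   */
  static <T> List<String> writeWithBisection(List<Tagged<T>> batch, BulkSender<T> sender) {
    List<String> skipped = new ArrayList<>();
    Deque<List<Tagged<T>>> pending = new ArrayDeque<>();
    if (!batch.isEmpty()) {
      pending.push(batch);
    }
    while (!pending.isEmpty()) {
      List<Tagged<T>> current = pending.pop();
      if (sender.send(current)) {
        continue; // accepted as-is
      }
      if (current.size() == 1) {
        // A single operation is still too large: report it by entity, not by position.
        skipped.add(current.get(0).entityRef);
        continue;
      }
      int mid = current.size() / 2;
      // Push the right half first so the left half is retried first (preserves order).
      pending.push(current.subList(mid, current.size()));
      pending.push(current.subList(0, mid));
    }
    return skipped;
  }
}
```

Because each operation carries its entity reference, the skipped list identifies the offending entity directly, even after the original batch has been split several times and positional indexes no longer line up.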