Skip to content

[WIP] feat(workflows): batch entity processing — entityList-only for automated task nodes#26715

Open
yan-3005 wants to merge 22 commits intomainfrom
ram/workflow-improvements
Open

[WIP] feat(workflows): batch entity processing — entityList-only for automated task nodes#26715
yan-3005 wants to merge 22 commits intomainfrom
ram/workflow-improvements

Conversation

@yan-3005
Copy link
Copy Markdown
Contributor

@yan-3005 yan-3005 commented Mar 24, 2026

Issue: https://github.com/open-metadata/openmetadata-collate/issues/3363

Phase 1 of batch entity processing in governance workflows. All automated task nodes (checkEntityAttributesTask, setEntityAttributeTask, checkChangeDescriptionTask, rollbackEntityTask, sinkTask, dataCompletenessTask) now process a List of entity links via entityList exclusively. The relatedEntity fallback in getEntityList() is removed — batch nodes no longer have that path.

Key changes:

  • PeriodicBatchEntityTrigger (singleExecutionMode=false): each child process now receives global_entityList via ${entityToListMap[relatedEntity]}. FetchEntitiesImpl pre-builds entityToListMap (entity -> List.of(entity)) so the JUEL expression resolves without static class references.

  • Batch node impls (6 files): removed relatedEntity fallback from getEntityList() and the now-unused RELATED_ENTITY_VARIABLE import. entityList is the only input path.

  • BPMN builders: putIfAbsent(ENTITY_LIST_VARIABLE, GLOBAL_NAMESPACE) for all batch-capable nodes; relatedEntity is never added to inputNamespaceMap by builders.

  • v1130 migration (addEntityListToNamespaceMap): updated to also strip relatedEntity from batch node inputNamespaceMaps, covering both fresh upgrades (add entityList + remove relatedEntity) and instances that already ran the previous migration (entityList present, remove relatedEntity). Migration remains idempotent.

  • GlossaryApprovalWorkflow.json: removed relatedEntity from all 13 batch node inputNamespaceMaps. userApprovalTask nodes keep relatedEntity.

  • JSON schemas: entityList added to trigger output and batch node inputNamespaceMap/input definitions.

  • Integration tests: updated to reflect entityList-first structure across all batch-capable workflow node configurations.

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@yan-3005 yan-3005 self-assigned this Mar 24, 2026
@yan-3005 yan-3005 added safe to test Add this label to run secure Github workflows on PRs backend labels Mar 24, 2026
Copilot AI review requested due to automatic review settings March 24, 2026 07:23
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions bot requested a review from a team as a code owner March 24, 2026 07:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Phase 1 of batch-entity processing for governance workflows: moves automated task nodes to consume entities exclusively via entityList (a list of entity-link strings), updates trigger/BPMN wiring to pass lists, and adds a v1130 migration to update stored workflow definitions accordingly.

Changes:

  • Extend trigger outputs and workflow event variable initialization to include entityList.
  • Update automated task node implementations and BPMN builders to read from entityList and to populate entityList in inputNamespaceMap by default.
  • Add v1130 data migration to inject entityList into workflow definitions and remove relatedEntity from batch-node namespace maps; update built-in workflow JSON and integration tests.

Reviewed changes

Copilot reviewed 30 out of 37 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/periodicBatchEntityTrigger.json Updates trigger output schema to include entityList.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/eventBasedEntityTrigger.json Updates trigger output schema to include entityList.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/sinkTask.json Switches sink task schema inputs to entityList.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/setEntityAttributeTask.json Switches set-attribute task schema inputs to entityList.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/rollbackEntityTask.json Switches rollback task schema inputs to entityList.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/dataCompletenessTask.json Adds entityList to schema for completeness task.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/checkEntityAttributesTask.json Switches check-attributes task schema input to entityList and adds output list defaults.
openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/checkChangeDescriptionTask.json Switches check-change-description task schema input to entityList and adds output list defaults.
openmetadata-service/src/main/resources/json/data/governance/workflows/GlossaryApprovalWorkflow.json Updates built-in workflow to wire batch-capable nodes via entityList.
openmetadata-service/src/main/java/org/openmetadata/service/migration/utils/v1130/MigrationUtil.java Adds v1130 migration to add entityList to triggers and batch node namespace maps; strips relatedEntity for batch nodes; redeploys workflows.
openmetadata-service/src/main/java/org/openmetadata/service/migration/postgres/v1130/Migration.java Executes the new workflow migration during v1130 Postgres upgrade.
openmetadata-service/src/main/java/org/openmetadata/service/migration/mysql/v1130/Migration.java Executes the new workflow migration during v1130 MySQL upgrade.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/triggers/impl/FetchEntitiesImpl.java Builds entityToListMap so periodic trigger can pass per-entity entityList without static calls in JUEL.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/triggers/PeriodicBatchEntityTrigger.java Updates periodic trigger call-activity input mapping to always pass entityList (full batch or per-entity list) and adjusts loop cardinality handling.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/triggers/EventBasedEntityTrigger.java Ensures event-based trigger passes entityList into the called workflow.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/sink/SinkTaskDelegate.java Refactors sink delegate to always read entities from entityList.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java Changes set-attribute delegate to iterate over entityList.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java Changes rollback delegate to iterate over entityList and adds BPMN error handling + exception propagation.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java Changes completeness delegate to process entityList and emit per-band lists + results.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java Changes attribute-check delegate to partition entities via entityList.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java Changes change-description-check delegate to partition entities via entityList.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/SetEntityAttributeTask.java Ensures BPMN builder injects entityList namespace mapping by default.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/RollbackEntityTask.java Ensures BPMN builder injects entityList namespace mapping by default.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/DataCompletenessTask.java Ensures BPMN builder injects entityList namespace mapping by default.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/CheckEntityAttributesTask.java Ensures BPMN builder injects entityList namespace mapping by default.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/CheckChangeDescriptionTask.java Ensures BPMN builder injects entityList namespace mapping by default.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/TriggerFactory.java Forces periodic triggers to run with singleExecutionMode=false.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/WorkflowEventConsumer.java Populates both relatedEntity and entityList in event-triggered workflow variables.
openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/Workflow.java Adds a constant for a “false entity list” variable name.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java Updates integration tests to use entityList-based configs and expected outputs.
Comments suppressed due to low confidence (1)

openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/dataCompletenessTask.json:110

  • This schema still requires relatedEntity in inputNamespaceMap (and includes it in the default input), but the Java implementation now reads only from entityList and the v1130 migration strips relatedEntity from batch node inputNamespaceMaps. As written, migrated workflows will fail validation for dataCompletenessTask. Update the schema to make entityList the required input/namespace and remove or deprecate relatedEntity here to match runtime behavior.
    "input": {
      "type": "array",
      "items": { "type": "string" },
      "default": ["relatedEntity", "entityList"],
      "additionalItems": false,
      "minItems": 1
    },
    "inputNamespaceMap": {
      "type": "object",
      "properties": {
        "relatedEntity": {
          "type": "string",
          "default": "global"
        },
        "entityList": {
          "type": "string",
          "default": "global"
        }
      },
      "additionalProperties": false,
      "required": ["relatedEntity"]
    },

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 24, 2026

🟡 Playwright Results — all passed (19 flaky)

✅ 3397 passed · ❌ 0 failed · 🟡 19 flaky · ⏭️ 216 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 452 0 3 2
🟡 Shard 2 602 0 2 32
🟡 Shard 3 606 0 3 27
🟡 Shard 4 596 0 7 47
🟡 Shard 5 586 0 1 67
🟡 Shard 6 555 0 3 41
🟡 19 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Table entity item action after rules disabled (shard 1, 1 retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Api Service entity item action after rules disabled (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/DataProductPersonaCustomization.spec.ts › Data Product - customization should work (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for ApiEndpoint (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Verify duplicate domain creation (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with deeply nested subdomains (3+ levels) verifies FQN propagation (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Verify Domain entity API calls do not include invalid domains field in tag assets (shard 4, 2 retries)
  • Pages/Entity.spec.ts › Delete Container (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › validates visible/hidden tabs and tab content for databaseSchema (shard 5, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

…ted task nodes

Phase 1 of batch entity processing in governance workflows. All
automated task nodes (checkEntityAttributesTask, setEntityAttributeTask,
checkChangeDescriptionTask, rollbackEntityTask, sinkTask,
dataCompletenessTask) now process a List<String> of entity links via
entityList exclusively. The relatedEntity fallback in getEntityList()
is removed — batch nodes no longer have that path.

Key changes:

- PeriodicBatchEntityTrigger (singleExecutionMode=false): each child
  process now receives global_entityList via ${entityToListMap[relatedEntity]}.
  FetchEntitiesImpl pre-builds entityToListMap (entity -> List.of(entity))
  so the JUEL expression resolves without static class references.

- Batch node impls (6 files): removed relatedEntity fallback from
  getEntityList() and the now-unused RELATED_ENTITY_VARIABLE import.
  entityList is the only input path.

- BPMN builders: putIfAbsent(ENTITY_LIST_VARIABLE, GLOBAL_NAMESPACE) for
  all batch-capable nodes; relatedEntity is never added to inputNamespaceMap
  by builders.

- v1130 migration (addEntityListToNamespaceMap): updated to also strip
  relatedEntity from batch node inputNamespaceMaps, covering both fresh
  upgrades (add entityList + remove relatedEntity) and instances that
  already ran the previous migration (entityList present, remove relatedEntity).
  Migration remains idempotent.

- GlossaryApprovalWorkflow.json: removed relatedEntity from all 13 batch
  node inputNamespaceMaps. userApprovalTask nodes keep relatedEntity.

- JSON schemas: entityList added to trigger output and batch node
  inputNamespaceMap/input definitions.

- Integration tests: updated to reflect entityList-first structure across
  all batch-capable workflow node configurations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@yan-3005 yan-3005 force-pushed the ram/workflow-improvements branch from ccd7f2a to 0ec257c Compare March 25, 2026 07:00
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

Copilot AI review requested due to automatic review settings March 25, 2026 07:06
@yan-3005 yan-3005 review requested due to automatic review settings March 25, 2026 07:06
- Remove relatedEntity from checkEntityAttributesTask and dataCompletenessTask
  inputNamespaceMap schemas; make entityList required with default "global"
- Restore hasBatchModeNodes() in TriggerFactory so sink workflows with
  batchMode=true run as a single batch execution instead of N parallel children
- Extract duplicated getEntityList() from 6 impl files into a shared static
  method on WorkflowVariableHandler
- Add migrateInputArray() to MigrationUtil to replace relatedEntity with
  entityList in stored workflow node input arrays
- Fix integration test trigger outputs: remove relatedEntity from pure batch
  workflows; add entityList to approval workflows that were missing it; keep
  relatedEntity for workflows with userApprovalTask nodes (required by
  MainWorkflow.WorkflowGraph validation)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 27, 2026 05:23
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 31 out of 38 changed files in this pull request and generated 4 comments.

…k variables

- Fix falseEntityList vs false_entityList mismatch: schema output defaults
  in checkEntityAttributesTask.json and checkChangeDescriptionTask.json
  declared "falseEntityList" (camelCase) while runtime code writes
  "false_entityList" (underscore via FALSE_ENTITY_LIST_VARIABLE). Update
  schemas and integration tests to match the runtime variable name.
- Remove five write-only dead execution variables from RollbackEntityImpl
  (rollbackAction, rollbackFromVersion, rollbackToVersion, rollbackEntityId,
  rollbackEntityType) — nothing reads them; they only polluted the workflow
  instance state blob. Remove unused DelegateExecution param from
  rollbackEntity() private method.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…*_entityList routing

- MigrationUtil: build incomingEdge (target→{source,condition}) and nodeSubType maps
  from workflow JSON; downstream nodes of check tasks now get condition_entityList
  (e.g. true_entityList, false_entityList, gold_entityList) keyed to the check node's
  namespace, instead of always mapping entityList→global
- WorkflowVariableHandler.getEntityList(): iterate all inputNamespaceMap keys ending
  in _entityList first (covers arbitrary dataCompleteness band names), then fall back
  to plain entityList key

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 27, 2026 13:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 45 out of 52 changed files in this pull request and generated 3 comments.

yan-3005 and others added 2 commits March 27, 2026 20:21
…ismatch

- MigrationUtil.addEntityListToNamespaceMap: start from deepCopy of existing
  inputNamespaceMap and remove relatedEntity, instead of creating a fresh empty
  map — preserves updatedBy and other keys for setEntityAttributeTask/rollbackEntityTask
- MigrationUtilTest: update all addEntityListToNamespaceMap calls to 3-arg
  signature (incoming, nodeSubType); add tests for updatedBy preservation,
  true/false/band conditional routing; fix Phase 2 test to not assert createOrUpdate
  (code now calls WorkflowHandler.deploy directly)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tants, renames

- Add TRUE_ENTITY_LIST_VARIABLE constant; replace "true_" + ENTITY_LIST_VARIABLE literals
- Add resilience4j per-entity retry (3 attempts, 500ms) to all 5 impl nodes and SinkTaskDelegate
  list mode; per-entity failures never throw BpmnError — workflow always continues to completion
- RollbackEntityImpl: add Approved → Draft → Deprecated fallback chain for rollback target;
  remove unused workflow/rollbackToStatus fields
- DataCompletenessImpl: fix band iteration to store all bands (including empty); fix storeFieldList
  to always store List<String>; store counts only in entityResults to reduce variable bloat;
  downgrade per-entity LOG.info → LOG.debug, add single summary LOG.info
- SinkTaskDelegate: fix entityFqn in SinkError to use entityLink.getEntityFQN() not raw link string
- Rename converted → definedNamespaceMap in 6 builder task files
- WorkflowVariableHandler: add warning log when entityList not found
- v1140 migration (MySQL + Postgres): add initializeWorkflowHandler() before migrateWorkflowInputNamespaceMap()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 27, 2026 16:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 46 out of 53 changed files in this pull request and generated 6 comments.

yan-3005 and others added 2 commits March 27, 2026 22:24
- Remove unused ADMIN_USER_NAME constant from MigrationUtil
- Downgrade LOG.warn -> LOG.debug in WorkflowVariableHandler.getEntityList()
- Add true_entityList to output defaults in checkEntityAttributes and checkChangeDescription schemas
- Change additionalProperties: false -> true in all 6 task inputNamespaceMaps to allow conditional routing keys (true_entityList, false_entityList, gold_entityList)
- Fix MigrationUtil incomingEdge map to use List<String[]> to handle multi-edge nodes correctly
- Remove COLLECTION_VARIABLE from PeriodicBatchEntityTrigger; use Workflow.ENTITY_LIST_VARIABLE
- Fix FetchEntitiesImpl to import ENTITY_LIST_VARIABLE from Workflow instead of removed COLLECTION_VARIABLE
- Update rollbackEntityTask.json description to reflect APPROVED -> DRAFT -> DEPRECATED fallback chain
- Update MigrationUtil Javadoc to reference WorkflowHandler#deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

addEntityListToNamespaceMap now takes List<String[]> instead of String[]
to support multi-edge nodes; update test call sites accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 47 out of 54 changed files in this pull request and generated 1 comment.

pr-26715-review-bugs.md was a scratch file tracking review items
and should not be part of the repo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@yan-3005
Copy link
Copy Markdown
Contributor Author

Removed pr-26715-review-bugs.md — it was an internal scratch file that shouldn't have been committed.

@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Mar 27, 2026

Code Review 👍 Approved with suggestions 9 resolved / 10 findings

Batch entity processing refactor for automated task nodes now includes retry logic, consolidated field handling, and migration fixes addressing 9 previously flagged issues. Consider extracting the duplicate RetryConfig across 6 implementation classes into a shared utility to reduce maintenance burden.

💡 Quality: Duplicate retry configuration across 6 impl classes

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:62-69 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java:64-71 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java:63-70 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:49-56 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:51-58 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/sink/SinkTaskDelegate.java:265-272

The identical RetryConfig (maxAttempts=3, waitDuration=500ms, retryExceptions=Exception.class) is copy-pasted across RollbackEntityImpl, SetEntityAttributeImpl, CheckEntityAttributesImpl, CheckChangeDescriptionTaskImpl, DataCompletenessImpl, and SinkTaskDelegate. If retry parameters need tuning (e.g., adding specific exception types per the idempotency finding), all 6 files must be updated in lockstep.

Consider extracting a shared retry factory, e.g., WorkflowRetryConfig.defaultRetry(String name) in the workflows package, so retry policy changes are centralized.

✅ 9 resolved
Bug: RollbackEntity overwrites execution variables in batch loop

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:62-65 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:112-116 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:52-56 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:53-57
When entityList contains multiple entities, rollbackEntity() is called in a loop and each iteration sets the same execution variables (rollbackAction, rollbackFromVersion, rollbackToVersion, rollbackEntityId, rollbackEntityType) via execution.setVariable(). Only the last entity's rollback metadata survives. Other batch implementations (DataCompletenessImpl, SinkTaskDelegate) correctly accumulate per-entity results into maps/lists.

Bug: dataCompletenessTask.json schema still requires relatedEntity

📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/dataCompletenessTask.json:92 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/dataCompletenessTask.json:99-102 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/dataCompletenessTask.json:108-109 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/nodes/automatedTask/checkEntityAttributesTask.json:57-66
The JSON schema for dataCompletenessTask still has "required": ["relatedEntity"] in inputNamespaceMap and includes relatedEntity in the default input array. However, DataCompletenessImpl.java no longer reads relatedEntity — it exclusively uses ENTITY_LIST_VARIABLE. This mismatch means:

  1. Schema validation requires a field the code ignores
  2. additionalProperties: false combined with the required relatedEntity prevents creating workflows without it
  3. Inconsistent with peer schemas (rollbackEntityTask, setEntityAttributeTask, checkChangeDescriptionTask) which were updated to require entityList instead
Quality: Duplicated getEntityList() method across 6 classes

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:67-78 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:77-88 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java:137-149 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:126-137 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java:79-90 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/sink/SinkTaskDelegate.java:297-308 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java:148 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:136 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:66-78 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:76-88 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:125-137 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java:78-90 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/sink/SinkTaskDelegate.java:296-308 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java:74-85 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:85-96 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java:124-135 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:74-85
The exact same getEntityList(inputNamespaceMap, varHandler) method is copy-pasted into CheckChangeDescriptionTaskImpl, CheckEntityAttributesImpl, DataCompletenessImpl, RollbackEntityImpl, SetEntityAttributeImpl, and SinkTaskDelegate. This is a maintenance hazard — any bug fix or behavior change needs to be applied in 6 places.

Bug: Trigger output schema maxItems:2 violated by 3-item outputs

📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/eventBasedEntityTrigger.json:119 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/periodicBatchEntityTrigger.json:97 📄 openmetadata-service/src/main/resources/json/data/governance/workflows/GlossaryApprovalWorkflow.json:23-27 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/eventBasedEntityTrigger.json:113-119 📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/periodicBatchEntityTrigger.json:91-97 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:3953 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:7969 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:8194 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:8424 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:8616 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:8846
Both eventBasedEntityTrigger.json and periodicBatchEntityTrigger.json set maxItems: 2 on the trigger output array, but the GlossaryApprovalWorkflow.json and multiple integration tests define trigger output arrays with 3 items (e.g. ["entityList", "relatedEntity", "updatedBy"]). While JSON Schema validation is not strictly enforced at the Java API layer today, this inconsistency makes the spec inaccurate and will break if validation is tightened, and may affect frontend/codegen consumers of the schema.

Bug: Workflow.FALSE_ENTITY_LIST_VARIABLE unused; hardcoded strings used instead

📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/Workflow.java:15 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:55 📄 openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:56
A constant FALSE_ENTITY_LIST_VARIABLE = "false_entityList" was added in Workflow.java (line 15), but CheckEntityAttributesImpl.java (line 55) and CheckChangeDescriptionTaskImpl.java (line 56) construct the variable name via string concatenation: "false_" + ENTITY_LIST_VARIABLE. While the resulting value is the same today, using two different ways to produce the same string is fragile — if either the constant or the concatenation pattern is changed independently, they'll silently diverge. The constant should be used directly for consistency.

...and 4 more resolved from earlier reviews

🤖 Prompt for agents
Code Review: Batch entity processing refactor for automated task nodes now includes retry logic, consolidated field handling, and migration fixes addressing 9 previously flagged issues. Consider extracting the duplicate RetryConfig across 6 implementation classes into a shared utility to reduce maintenance burden.

1. 💡 Quality: Duplicate retry configuration across 6 impl classes
   Files: openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/RollbackEntityImpl.java:62-69, openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/SetEntityAttributeImpl.java:64-71, openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/DataCompletenessImpl.java:63-70, openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckEntityAttributesImpl.java:49-56, openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/impl/CheckChangeDescriptionTaskImpl.java:51-58, openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/elements/nodes/automatedTask/sink/SinkTaskDelegate.java:265-272

   The identical `RetryConfig` (maxAttempts=3, waitDuration=500ms, retryExceptions=Exception.class) is copy-pasted across `RollbackEntityImpl`, `SetEntityAttributeImpl`, `CheckEntityAttributesImpl`, `CheckChangeDescriptionTaskImpl`, `DataCompletenessImpl`, and `SinkTaskDelegate`. If retry parameters need tuning (e.g., adding specific exception types per the idempotency finding), all 6 files must be updated in lockstep.
   
   Consider extracting a shared retry factory, e.g., `WorkflowRetryConfig.defaultRetry(String name)` in the workflows package, so retry policy changes are centralized.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 46 out of 53 changed files in this pull request and generated 5 comments.

Comment on lines 70 to +80
"inputNamespaceMap": {
"type": "object",
"properties": {
"relatedEntity": {
"entityList": {
"type": "string",
"default": "global"
}
},
"additionalProperties": false,
"required": ["relatedEntity"]
"additionalProperties": true,
"required": ["entityList"]
},
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputNamespaceMap still has required: ["entityList"] even though the migration/engine now support conditional keys like true_entityList and false_entityList for edge-based routing. Workflows that map only true_entityList/false_entityList into this node will be rejected by schema validation. Recommend removing/adjusting the required constraint and using typed additionalProperties (string) instead of true.

Copilot uses AI. Check for mistakes.
Comment on lines 49 to 63
@@ -58,8 +58,8 @@
"default": null
}
},
"additionalProperties": false,
"required": ["relatedEntity"]
"additionalProperties": true,
"required": ["entityList"]
},
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputNamespaceMap still requires the plain entityList key, but edge-routed workflows may pass only conditional keys like true_entityList / false_entityList / <band>_entityList. This will cause schema validation failures for migrated definitions. Recommend loosening the required constraint and typing additionalProperties as { "type": "string" } (namespace values) instead of true.

Copilot uses AI. Check for mistakes.
Comment on lines 150 to 164
@@ -159,8 +159,8 @@
"default": "global"
}
},
"additionalProperties": false,
"required": ["relatedEntity"]
"additionalProperties": true,
"required": ["entityList"]
},
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InputNamespaceMap now allows arbitrary keys via additionalProperties: true but still requires entityList. This conflicts with the new conditional namespace-map keys (e.g. true_entityList, false_entityList, gold_entityList) introduced by the v1140 migration and WorkflowVariableHandler.getEntityList(), where nodes may legitimately omit the plain entityList key. Schema validation will fail for migrated/edge-routed workflows. Suggestion: make additionalProperties a typed schema (e.g. { "type": "string" }) and relax/adjust the requirement so either entityList or a *_entityList key is allowed.

Copilot uses AI. Check for mistakes.
Comment on lines 66 to 80
@@ -75,8 +75,8 @@
"default": null
}
},
"additionalProperties": false,
"required": ["relatedEntity"]
"additionalProperties": true,
"required": ["entityList"]
},
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputNamespaceMap is marked required: ["entityList"] but the new batch routing can legitimately provide only a conditional key (e.g. true_entityList, false_entityList, or gold_entityList) with no plain entityList. As-is, migrated workflows can fail validation. Suggest: relax the required constraint and set additionalProperties to { "type": "string" } (namespace strings) instead of untyped true.

Copilot uses AI. Check for mistakes.
Comment on lines 96 to 106
"inputNamespaceMap": {
"type": "object",
"properties": {
"relatedEntity": {
"entityList": {
"type": "string",
"default": "global"
}
},
"additionalProperties": false,
"required": ["relatedEntity"]
"additionalProperties": true,
"required": ["entityList"]
},
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputNamespaceMap requires entityList, but with band-based routing the downstream node may only have <band>_entityList (and no plain entityList). This makes the schema incompatible with the new routing/migration logic. Suggest relaxing required and constraining additionalProperties to { "type": "string" } so conditional *_entityList keys are still validated as namespace strings.

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link
Copy Markdown

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants