Skip to content

Fix #26570: Add continuous migrations#26571

Merged
pmbrull merged 15 commits intomainfrom
issue-26570
Mar 26, 2026
Merged

Fix #26570: Add continuous migrations#26571
pmbrull merged 15 commits intomainfrom
issue-26570

Conversation

@harshach
Copy link
Copy Markdown
Collaborator

@harshach harshach commented Mar 18, 2026

Describe your changes:

Fixes #26570

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Continuous migrations support:
    • Added isReprocessing() and hasNewStatements() methods to MigrationProcess interface and implementation
    • Modified MigrationWorkflow to detect and reprocess versions with new SQL appended
    • Versions marked for reprocessing are skipped if no new statements detected
  • Migration file enhancements:
    • Added reprocessing flag and copyWithReprocessing() method to MigrationFile and FlywayMigrationFile
    • Implemented hasNewStatements() to identify newly appended SQL statements
  • Test coverage:
    • Added integration test ContinuousMigrationIT with three reprocessing scenarios
    • Added unit tests in MigrationWorkflowReprocessingTest and MigrationProcessImplTest

This will update automatically on new commits.

@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Mar 18, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 18, 2026

🟡 Playwright Results — all passed (20 flaky)

✅ 3411 passed · ❌ 0 failed · 🟡 20 flaky · ⏭️ 209 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 452 0 3 2
🟡 Shard 2 604 0 1 32
🟡 Shard 3 611 0 4 28
🟡 Shard 4 600 0 3 47
✅ Shard 5 587 0 0 67
🟡 Shard 6 557 0 9 33
🟡 20 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Store Procedure entity item action after rules disabled (shard 1, 1 retry)
  • Features/DataAssetRulesDisabled.spec.ts › should allow multiple domain selection for glossary term when entity rules are disabled (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/TestSuiteMultiPipeline.spec.ts › TestSuite multi pipeline support (shard 3, 1 retry)
  • Flow/CustomizeWidgets.spec.ts › Following Assets Widget (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DomainDataProductsRightPanel.spec.ts › Should assign tier for data product from domain context (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tier Add, Update and Remove (shard 4, 1 retry)
  • Pages/Glossary.spec.ts › Change glossary term hierarchy using menu options across glossary (shard 6, 1 retry)
  • Pages/Glossary.spec.ts › Create glossary, change language to Dutch, and delete glossary (shard 6, 1 retry)
  • Pages/Lineage.spec.ts › Lineage creation from ApiEndpoint entity (shard 6, 1 retry)
  • Pages/Login.spec.ts › Refresh should work (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/ServiceEntity.spec.ts › User as Owner Add, Update and Remove (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Comment on lines +268 to +284
List<MigrationFile> availableNativeMigrations =
availableMigrations.stream().filter(migration -> !migration.isExtension).toList();
Optional<String> maxMigration =
executedMigrations.stream().max(MigrationWorkflow::compareVersions);
if (maxMigration.isPresent()) {
return availableNativeMigrations
.filter(migration -> migration.biggerThan(maxMigration.get()))
.toList();
List<MigrationFile> result = new ArrayList<>();
for (MigrationFile migration : availableNativeMigrations) {
if (migration.biggerThan(maxMigration.get())) {
result.add(migration);
} else if (compareVersions(migration.version, maxMigration.get()) == 0) {
migration.setReprocessing(true);
result.add(migration);
}
}
return result;
}
return availableNativeMigrations.toList();
return availableNativeMigrations;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to pull the changes from here #26592

* fix: process migration not executed

* fix: tests for process migration not executed

* fix: O(n) -> O(1)
Copilot AI review requested due to automatic review settings March 21, 2026 23:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements “continuous migrations” so the migration runner can re-evaluate the currently deployed server version and apply newly appended SQL statements without requiring a new version directory.

Changes:

  • Added a reprocessing flag and hasNewStatements() to MigrationFile, and exposed these via MigrationProcess.
  • Updated MigrationWorkflow version-selection and filtering to support reprocessing the current max version and skipping reprocessing when no new SQL exists.
  • Added unit tests covering reprocessing selection and “new SQL appended to same version” scenarios.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
openmetadata-service/src/main/java/org/openmetadata/service/migration/utils/MigrationFile.java Tracks reprocessing state and whether parsing yielded any new SQL statements.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationProcess.java Extends the migration process contract with reprocessing/new-statement signals.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationProcessImpl.java Implements the new MigrationProcess methods by delegating to MigrationFile.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java Adjusts migration selection/execution flow for continuous migrations and reprocessing.
openmetadata-service/src/test/java/org/openmetadata/service/migration/MigrationWorkflowReprocessingTest.java New tests for reprocessing behavior and incremental SQL pickup.
openmetadata-service/src/test/java/org/openmetadata/service/migration/api/MigrationWorkflowTest.java New tests for version filtering rules (native vs extension/collate).

# Conflicts:
#	openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java
#	openmetadata-service/src/test/java/org/openmetadata/service/migration/api/MigrationWorkflowTest.java
harshach and others added 2 commits March 24, 2026 21:01
…gration/api/MigrationWorkflow.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 25, 2026 04:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

pmbrull
pmbrull previously approved these changes Mar 25, 2026
pmbrull and others added 2 commits March 25, 2026 12:20
mohityadav766
mohityadav766 previously approved these changes Mar 25, 2026
Comment on lines +272 to +273
currentMaxMigrationVersion =
executedMigrations.stream().max(MigrationWorkflow::compareVersions);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Edge Case: currentMaxMigrationVersion may resolve to extension version

At line 273, currentMaxMigrationVersion is computed using compareVersions, which strips the -extension suffix. This means "1.12.1" and "1.12.1-collate" compare as equal, and Stream.max() could return either one non-deterministically depending on DB ordering. In contrast, processNativeMigrations and processExtensionMigrations correctly use compareReprocessingCandidates which is deterministic.

The practical impact is low because currentMaxMigrationVersion is only used for context initialization and logging (line 374), where the version string serves as a map key. However, using compareReprocessingCandidates here would be more consistent and predictable.

Suggested fix:

currentMaxMigrationVersion =
    executedMigrations.stream().max(MigrationWorkflow::compareReprocessingCandidates);

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Mar 25, 2026

Code Review 👍 Approved with suggestions 3 resolved / 4 findings

Adds continuous migration reprocessing to re-run Java data migrations and fixes state mutation issues in processNativeMigrations. Consider verifying that currentMaxMigrationVersion correctly handles extension version suffixes in edge cases.

💡 Edge Case: currentMaxMigrationVersion may resolve to extension version

📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:272-273

At line 273, currentMaxMigrationVersion is computed using compareVersions, which strips the -extension suffix. This means "1.12.1" and "1.12.1-collate" compare as equal, and Stream.max() could return either one non-deterministically depending on DB ordering. In contrast, processNativeMigrations and processExtensionMigrations correctly use compareReprocessingCandidates which is deterministic.

The practical impact is low because currentMaxMigrationVersion is only used for context initialization and logging (line 374), where the version string serves as a map key. However, using compareReprocessingCandidates here would be more consistent and predictable.

Suggested fix
currentMaxMigrationVersion =
    executedMigrations.stream().max(MigrationWorkflow::compareReprocessingCandidates);
✅ 3 resolved
Bug: Reprocessing migration re-runs Java data migrations

📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:182 📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:277
When a reprocessing migration has new SQL statements (isReprocessing() && hasNewStatements() is true), the skip at line 182 does not trigger, and the full MigrationProcess lifecycle runs — including runDataMigration(). This means that adding a single new SQL statement to an already-deployed version will cause the Java data migration to execute a second time.

Existing concrete migrations (e.g., v1120, v1130) happen to be informally idempotent because their utility methods check data state before mutating, but this is not enforced by the framework. Any future runDataMigration() override that performs destructive or non-idempotent work (e.g., bulk data transforms, one-time backfills) will silently re-execute and could corrupt data.

Consider either:

  1. Skipping runDataMigration() for reprocessing migrations (only run the new SQL), or
  2. Adding a framework-level guard (similar to checkIfQueryPreviouslyRan for SQL) that tracks whether runDataMigration() has already been executed for a given version.
Quality: Unrelated .claustre session/hook files committed in migration PR

📄 .claustre_session_id 📄 .claude/hooks/_claustre-common.sh 📄 .claude/hooks/stop-hook.sh 📄 .claude/hooks/notification-hook.sh
Several files unrelated to the continuous migration feature were included in this PR: .claustre_session_id, .claude/hooks/_claustre-common.sh, .claude/hooks/stop-hook.sh, .claude/hooks/notification-hook.sh, etc. These appear to be local development tooling artifacts and should not be committed to the repository.

Quality: processNativeMigrations mutates shared MigrationFile objects

📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:311-312 📄 openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:334-336
Both processNativeMigrations and processExtensionMigrations call migration.setReprocessing(true) directly on MigrationFile objects from the availableMigrations list. This side-effect mutation of shared input objects makes the methods non-idempotent — calling getMigrationsToApply twice with the same availableMigrations list would leave stale reprocessing=true flags on migration files. While the current call flow creates fresh MigrationFile objects each time, this is a subtle contract that could break under refactoring.

🤖 Prompt for agents
Code Review: Adds continuous migration reprocessing to re-run Java data migrations and fixes state mutation issues in processNativeMigrations. Consider verifying that currentMaxMigrationVersion correctly handles extension version suffixes in edge cases.

1. 💡 Edge Case: currentMaxMigrationVersion may resolve to extension version
   Files: openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java:272-273

   At line 273, `currentMaxMigrationVersion` is computed using `compareVersions`, which strips the `-extension` suffix. This means "1.12.1" and "1.12.1-collate" compare as equal, and `Stream.max()` could return either one non-deterministically depending on DB ordering. In contrast, `processNativeMigrations` and `processExtensionMigrations` correctly use `compareReprocessingCandidates` which is deterministic.
   
   The practical impact is low because `currentMaxMigrationVersion` is only used for context initialization and logging (line 374), where the version string serves as a map key. However, using `compareReprocessingCandidates` here would be more consistent and predictable.

   Suggested fix:
   currentMaxMigrationVersion =
       executedMigrations.stream().max(MigrationWorkflow::compareReprocessingCandidates);

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add continuous migrations

5 participants