Skip to content

Fixes #26737: Remove stale dbt automated tags#27368

Open
mohitjeswani01 wants to merge 5 commits intoopen-metadata:mainfrom
mohitjeswani01:fix/26737-dbt-automated-tag-removal
Open

Fixes #26737: Remove stale dbt automated tags#27368
mohitjeswani01 wants to merge 5 commits intoopen-metadata:mainfrom
mohitjeswani01:fix/26737-dbt-automated-tag-removal

Conversation

@mohitjeswani01
Copy link
Copy Markdown

Description

Fixes #26737

When dbt removed tags from schema.yml, the corresponding AUTOMATED tags (labelType=Automated, appliedBy="ingestion-bot") on tables/columns were not getting cleaned up and stayed forever.

I changed TableRepository.addDataModel so that:

  • For DBT data models only, it drops AUTOMATED tags applied by ingestion-bot that are no longer present in the incoming dbt tags.
  • The same logic is applied at both table and column level.

I also added TableRepositoryDataModelTagTest to cover the main table/column scenarios and to make sure non‑dbt and non‑ingestion‑bot tags are left untouched.

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes #26737: Remove stale dbt automated tags
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.
  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.

Copilot AI review requested due to automatic review settings April 14, 2026 22:06
@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a dbt ingestion edge case where previously-applied AUTOMATED tags from ingestion-bot were not removed from tables/columns when the tags were deleted from dbt schema.yml.

Changes:

  • Added DBT-only cleanup in TableRepository.addDataModel to remove stale AUTOMATED tags applied by ingestion-bot at both table and column level.
  • Added a new unit test class covering stale-tag removal vs. preservation behavior for DBT vs non-DBT models.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/TableRepository.java Adds DBT-scoped stale automated-tag cleanup during data model application for tables and columns.
openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TableRepositoryDataModelTagTest.java Adds tests for the new stale-tag removal logic and related scenarios.

@github-actions
Copy link
Copy Markdown
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@mohitjeswani01
Copy link
Copy Markdown
Author

Hi @harshach

I’ve addressed all feedback from gitar-bot and Copilot (removed the duplicated helper in tests, wired tests to call TableRepository.removeStaleDbtAutomatedTags directly, cleaned up imports, and fixed the Javadoc).

Could you please add the safe-to-test label to trigger the CI 🙏

@PubChimps PubChimps added the safe to test Add this label to run secure Github workflows on PRs label Apr 16, 2026
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@mohitjeswani01
Copy link
Copy Markdown
Author

Hi @PubChimps 👋

I see Issue #26737 has been closed as completed with my PR #27368 linked.

If there are any changes required on my end, please let me know and I'll
address them.

I'm actively monitoring CI and ready to fix any failures right away on your command..
Thank you! 🙏

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 16, 2026

🔴 Playwright Results — 1 failure(s), 23 flaky

✅ 3657 passed · ❌ 1 failed · 🟡 23 flaky · ⏭️ 89 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 477 0 3 4
🟡 Shard 2 647 0 4 7
🔴 Shard 3 648 1 7 1
🟡 Shard 4 629 0 5 27
🟡 Shard 5 610 0 1 42
🟡 Shard 6 646 0 3 8

Genuine Failures (failed on all attempts)

Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3)
Error: Timeout 60000ms exceeded while waiting on the predicate
🟡 23 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Data Product - customization should work (shard 1, 1 retry)
  • Flow/Metric.spec.ts › Verify Related Metrics Update (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/ChangeSummaryBadge.spec.ts › Automated badge should appear on entity description with Automated source (shard 2, 1 retry)
  • Features/Glossary/GlossaryAdvancedOperations.spec.ts › should create glossary with multiple owners (users + teams) (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should inherit reviewers from glossary when term is created (shard 2, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 1 retry)
  • Features/UserProfileOnlineStatus.spec.ts › Should not show online status for inactive users (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContractInheritance.spec.ts › Remove Asset - Inherited contract no longer shown when asset is removed from Data Product (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Worksheet (shard 4, 1 retry)
  • Pages/DomainUIInteractions.spec.ts › Select domain from global dropdown filters explore (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Glossary Term Add, Update and Remove (shard 4, 1 retry)
  • Pages/Glossary.spec.ts › Add and Remove Assets (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings April 16, 2026 21:03
@mohitjeswani01 mohitjeswani01 force-pushed the fix/26737-dbt-automated-tag-removal branch from c5198aa to f3c5f1f Compare April 16, 2026 21:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


static {
tableRepository = Mockito.mock(TableRepository.class);
Mockito.when(tableRepository.removeStaleDbtAutomatedTags(Mockito.anyList(), Mockito.anyList()))
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Mockito stub uses Mockito.anyList() for incomingTags, which does not match null. In removeStaleDbtAutomatedTags_removesAllDbtAutomatedWhenIncomingNullOrEmpty, dataModel.getTags() is null, so this call won’t hit the thenCallRealMethod() stub and will return the mock default (null), causing an NPE on resultNull.size(). Use a matcher that allows null (e.g., Mockito.nullable(List.class) / Mockito.<List<TagLabel>>any()), or create the mock with CALLS_REAL_METHODS so the real method runs for null inputs too.

Suggested change
Mockito.when(tableRepository.removeStaleDbtAutomatedTags(Mockito.anyList(), Mockito.anyList()))
Mockito.when(
tableRepository.removeStaleDbtAutomatedTags(
Mockito.anyList(), Mockito.nullable(List.class)))

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Updated the mock setup to use Mockito.nullable(List.class) to prevent the NPE during null input tests

@mohitjeswani01
Copy link
Copy Markdown
Author

@PubChimps addressed bot comments also ran mvn spotless:apply . please let me know if anything else is needed by my side 🙏

@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot AI review requested due to automatic review settings April 16, 2026 21:28
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 16, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Stale dbt automated tags removed from the codebase. Test coverage now correctly mirrors production logic instead of invoking it directly.

✅ 1 resolved
Quality: Test duplicates production logic instead of calling it

📄 openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/TableRepositoryDataModelTagTest.java:30-43
The test class copies removeStaleDbtAutomatedTags and the merge+removal orchestration into private helper methods rather than invoking the actual TableRepository code. If the production logic is later modified (e.g., an additional condition is added), the test copy will drift silently—tests will still pass while the real behavior is untested.

Since the test is in the same package (org.openmetadata.service.jdbi3), the private method is still not accessible directly. However, you could either:

  1. Change the method visibility to package-private (remove private) so the test can call it directly, or
  2. Use reflection in the test to invoke the private method, or
  3. Extract the logic into a static utility method in a helper class that both TableRepository and the test can call.
Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dbt ingestion cannot remove previously applied non-mutually-exclusive tags (follow-up to #26054)

3 participants