Fixes #24208: team ownership broken for special-character names and partial name matches #27453

Open
miantalha45 wants to merge 12 commits into open-metadata:main from miantalha45:fix/team-ownership-special-chars

Conversation

@miantalha45

Changes

  • URL-encoded the query_filter parameter so team names with & don't break the ES query
  • Added exact name lookup before fuzzy search to prevent partial matches (e.g. "AI Product" vs "AI Products")
  • Fixed manifest key cleanup to preserve list types, preventing Pydantic validation errors
  • Added unit tests and Playwright E2E tests covering all three cases
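
For readers unfamiliar with the first bullet, here is a minimal sketch of the encoding step. The endpoint path, the filter shape, and the helper name `build_search_path` are assumptions made for illustration and are not taken from the PR's diff; only `urllib.parse.quote` is standard library behaviour.

```python
import json
from urllib.parse import quote, unquote


def build_search_path(team_name: str, from_: int = 0, size: int = 1) -> str:
    """Return a search path whose JSON filter is percent-encoded."""
    # Hypothetical filter shape and endpoint; only the encoding step matters here.
    query_filter = json.dumps({"query": {"term": {"name.keyword": team_name}}})
    # safe="" also encodes "/" so nothing in the filter can act as a delimiter.
    encoded = quote(query_filter, safe="")
    return f"/search/query?query_filter={encoded}&from={from_}&size={size}"


path = build_search_path("Data & Analytics")
raw_filter = path.split("query_filter=")[1].split("&from=")[0]
decoded = json.loads(unquote(raw_filter))
print(decoded["query"]["term"]["name.keyword"])  # prints: Data & Analytics
```

Without the `quote` call, the `&` in the team name would be read as a parameter separator and the filter would be cut off at `Data `.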

Related

Fixes #24208

Type of change:

  • Bug fix

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.

Before

image

After

image

@miantalha45 miantalha45 requested review from a team as code owners April 17, 2026 06:41
Copilot AI review requested due to automatic review settings April 17, 2026 06:41
@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

if not name:
    return None

maybe_team = self.get_by_name(entity=Team, fqn=name)

💡 Edge Case: get_by_name propagates non-404 APIError unlike old ES path

The old code exclusively used _search_by_name → get_entity_from_es, which swallows most exceptions internally. The new code calls self.get_by_name() first (lines 189, 207), which re-raises APIError for any non-404 status (e.g. 500, 503). If the API is temporarily unhealthy, this will now crash the ingestion workflow instead of gracefully falling back.

Since get_reference_by_name is not wrapped in a try/except, a transient server error during the exact-lookup step will bubble up and may abort an entire ingestion run. The old fuzzy-search fallback would have silently returned None.

This is arguably better correctness (fail loudly), but it's a behavior change worth being aware of — especially for large ingestion jobs where a single transient 500 could halt the whole pipeline.

Suggested fix:

try:
    maybe_team = self.get_by_name(entity=Team, fqn=name)
except Exception:
    logger.debug("Exact lookup failed for team %s, falling back to ES", name)
    maybe_team = None
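
A narrower variant of the same idea, offered as a sketch rather than a drop-in patch: it catches only a specific client error instead of every `Exception`, so programming errors still surface. `FakeAPIError`, `resolve_reference`, and the two callables are hypothetical stand-ins, not names from the OpenMetadata client.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


class FakeAPIError(Exception):
    """Hypothetical stand-in for the client's API error type."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def resolve_reference(
    name: str,
    exact_lookup: Callable[[str], Optional[dict]],
    fuzzy_search: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Try the exact lookup first, then fall back to the fuzzy search."""
    if not name:
        return None
    try:
        found = exact_lookup(name)
    except FakeAPIError as exc:
        # Only API errors trigger the fallback; other exceptions propagate.
        logger.debug("Exact lookup failed for %s (%s), falling back", name, exc)
        found = None
    if found is None:
        found = fuzzy_search(name)
    return found


def failing_exact(name: str) -> Optional[dict]:
    raise FakeAPIError(503)


def fuzzy(name: str) -> Optional[dict]:
    return {"name": name, "source": "search"}


result = resolve_reference("AI Product", failing_exact, fuzzy)
```

Whether to fall back at all or to fail loudly remains the judgment call described above; the sketch only shows how the fallback can be limited to one error type.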

Contributor

Copilot AI left a comment

Pull request overview

Fixes team/user ownership resolution when names contain special characters (e.g., &) and when fuzzy search returns partial-name matches, improving dbt ingestion and UI ownership assignment reliability.

Changes:

  • URL-encode the ingestion ES query_filter parameter to prevent special characters from breaking query-string parsing.
  • Prefer exact get_by_name lookups before falling back to ES fuzzy search for both Teams and Users.
  • Add unit tests (ingestion) and Playwright E2E coverage (UI) for special-character ownership and exact-name matching scenarios.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File: Description
  • openmetadata-ui/src/main/resources/ui/playwright/e2e/Features/TeamOwnershipSpecialCharacters.spec.ts: Adds E2E coverage for assigning team owners with & and for avoiding partial-name mismatches.
  • ingestion/tests/unit/metadata/ingestion/ometa/test_user_mixin.py: Adds unit tests for URL-encoding of ES query filters and exact-match preference in reference resolution.
  • ingestion/src/metadata/ingestion/ometa/mixins/user_mixin.py: Encodes ES query_filter and updates reference lookup to try exact get_by_name before fuzzy search, with early return on empty input.

import { TableClass } from '../../support/entity/TableClass';
import { TeamClass } from '../../support/team/TeamClass';
import { getApiContext, redirectToHomePage, uuid } from '../../utils/common';
import { addOwner, removeOwner } from '../../utils/entity';

Copilot AI Apr 17, 2026

removeOwner is imported but never used in this spec. This will trigger lint/TS unused-import checks in the Playwright suite; please remove the unused import (or use it as part of cleanup if intended).

Suggested change
- import { addOwner, removeOwner } from '../../utils/entity';
+ import { addOwner } from '../../utils/entity';

Copilot uses AI. Check for mistakes.
entity=Team, name=name, from_=0, size=1
)
raw_filter = query.split("query_filter=")[1].split("&from=")[0]
assert "&" not in raw_filter

Copilot AI Apr 17, 2026

test_other_special_characters_are_encoded only asserts that & is not present in the encoded query_filter, but the test name/documentation implies broader coverage (e.g., / and whitespace). Either expand assertions to verify the other characters are actually URL-encoded, or rename the test to reflect what it validates.

Suggested change
- assert "&" not in raw_filter
+ assert "&" not in raw_filter
+ assert "/" not in raw_filter
+ assert " " not in raw_filter
+ assert "%26" in raw_filter
+ assert "%2F" in raw_filter
+ assert "%20" in raw_filter
+ decoded = unquote(raw_filter)
+ parsed = json.loads(decoded)
+ assert name in parsed["query"]["query_string"]["query"]

Comment on lines +186 to +193
if not name:
    return None

maybe_team = self.get_by_name(entity=Team, fqn=name)
if maybe_team is None:
    maybe_team = self._search_by_name(
        entity=Team, name=name, from_count=from_count, size=size, fields=fields
    )

Copilot AI Apr 17, 2026

PR description mentions a fix to the dbt manifest key cleanup to preserve list types and avoid Pydantic validation errors, but that change is not present in the current codebase (e.g., DbtServiceSource.remove_manifest_non_required_keys still overwrites all non-required top-level keys with {} regardless of original type). If that fix is required for #24208, it needs to be included in this PR or the description should be updated to match what’s actually changed.
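
To make the type-preservation point concrete, here is a minimal sketch of a cleanup that empties non-required keys while keeping their container type. The function name `clear_non_required_keys`, the sample manifest, and the set of required keys are all hypothetical; this is not the body of `remove_manifest_non_required_keys`.

```python
from typing import Any, Dict, Set


def clear_non_required_keys(
    manifest: Dict[str, Any], required: Set[str]
) -> Dict[str, Any]:
    """Return a copy with non-required values emptied but type-preserved."""
    cleaned: Dict[str, Any] = {}
    for key, value in manifest.items():
        if key in required:
            cleaned[key] = value
        elif isinstance(value, list):
            # A list stays a list, so a model expecting List[...] still validates.
            cleaned[key] = []
        elif isinstance(value, dict):
            cleaned[key] = {}
        else:
            # Scalars and None are left untouched in this sketch.
            cleaned[key] = value
    return cleaned


# Hypothetical manifest shape, for illustration only.
sample = {
    "nodes": {"model.a": {"name": "a"}},
    "docs": {"doc.a": {"text": "..."}},
    "extra_list": [1, 2, 3],
}
cleaned = clear_non_required_keys(sample, required={"nodes"})
print(cleaned)  # {'nodes': {'model.a': {'name': 'a'}}, 'docs': {}, 'extra_list': []}
```

Overwriting `extra_list` with `{}` instead would hand a dict to a field typed as a list, which is the kind of mismatch that produces a Pydantic validation error.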

Collaborator

@miantalha45 this seems like a valid comment, if the issue was also related to dbt?

Author

@pmbrull I have addressed this suggestion.

Copilot AI review requested due to automatic review settings April 17, 2026 08:45

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@github-actions
Contributor

github-actions bot commented Apr 17, 2026

Jest test Coverage

UI tests summary

  • Lines: 63% (coverage badge)
  • Statements: 63.74% (59648/93580)
  • Branches: 43.65% (31396/71913)
  • Functions: 46.74% (9435/20183)

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +64 to +70
test.beforeAll('Create teams and table', async ({ browser }) => {
  const { apiContext, afterAction } = await getApiContext(browser);
  await specialCharTeam.create(apiContext);
  await shortTeam.create(apiContext);
  await longTeam.create(apiContext);
  await table.create(apiContext);
  await afterAction();

Copilot AI Apr 17, 2026

getApiContext expects a Playwright Page (it reads the auth token from the page storage state), but this test passes browser. This should fail TypeScript type-checking and will break the setup/teardown. Use getDefaultAdminAPIContext(browser) (or create a page/context and pass a real Page into getApiContext) to obtain an authenticated apiContext for the fixture.

Comment on lines +73 to +79
test.afterAll('Delete teams and table', async ({ browser }) => {
  const { apiContext, afterAction } = await getApiContext(browser);
  await specialCharTeam.delete(apiContext);
  await shortTeam.delete(apiContext);
  await longTeam.delete(apiContext);
  await table.delete(apiContext);
  await afterAction();

Copilot AI Apr 17, 2026

Same issue as in beforeAll: getApiContext requires a Page, but browser is passed here. This will break teardown and can leave created teams/tables behind. Switch to getDefaultAdminAPIContext(browser) (or build a Page first) for cleanup as well.

@github-actions
Contributor

github-actions bot commented Apr 17, 2026

❌ UI Checkstyle Failed

❌ Playwright — ESLint + Prettier + Organise Imports

One or more Playwright test files have linting or formatting issues.

Affected files
  • openmetadata-ui/src/main/resources/ui/playwright/e2e/Features/TeamOwnershipSpecialCharacters.spec.ts

Fix locally (fast — only checks files changed in this branch):

make ui-checkstyle-changed

Copilot AI review requested due to automatic review settings April 17, 2026 14:58
@gitar-bot

gitar-bot bot commented Apr 17, 2026

Code Review 👍 Approved with suggestions 0 resolved / 1 findings

Implements team ownership fixes for special characters and partial matches to resolve issue #24208. Ensure get_by_name handles non-404 APIErrors consistently to avoid propagating unintended exceptions.

💡 Edge Case: get_by_name propagates non-404 APIError unlike old ES path

📄 ingestion/src/metadata/ingestion/ometa/mixins/user_mixin.py:189 📄 ingestion/src/metadata/ingestion/ometa/mixins/user_mixin.py:207

The old code exclusively used _search_by_name → get_entity_from_es, which swallows most exceptions internally. The new code calls self.get_by_name() first (lines 189, 207), which re-raises APIError for any non-404 status (e.g. 500, 503). If the API is temporarily unhealthy, this will now crash the ingestion workflow instead of gracefully falling back.

Since get_reference_by_name is not wrapped in a try/except, a transient server error during the exact-lookup step will bubble up and may abort an entire ingestion run. The old fuzzy-search fallback would have silently returned None.

This is arguably better correctness (fail loudly), but it's a behavior change worth being aware of — especially for large ingestion jobs where a single transient 500 could halt the whole pipeline.

Suggested fix
try:
    maybe_team = self.get_by_name(entity=Team, fqn=name)
except Exception:
    logger.debug("Exact lookup failed for team %s, falling back to ES", name)
    maybe_team = None

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comment on lines +95 to +103
+ const allHistoricalColumns = useMemo(
+   () => pruneEmptyChildren(currentVersionData.columns) ?? [],
+   [currentVersionData.columns]
+ );

- // Use search API if there's a search query, otherwise use regular pagination
- const response = searchQuery
-   ? await searchTableColumnsByFQN(tableFqn, {
-       q: searchQuery,
-       limit: pageSize,
-       offset: offset,
-       fields: 'tags',
-     })
-   : await getTableColumnsByFQN(tableFqn, {
-       limit: pageSize,
-       offset: offset,
-       fields: 'tags',
-     });
+ const filteredHistoricalColumns = useMemo(() => {
+   if (!searchText) {
+     return allHistoricalColumns;
+   }

Copilot AI Apr 17, 2026

This PR is scoped (title/description) to team ownership fixes in ingestion/UI search, but this component change alters Table Version schema rendering behavior (removing live column API fetch and switching to currentVersionData.columns + client-side paging/search). If this is intentional, it should be called out in the PR description (or split into a separate PR) to keep review/rollback scope clear.

Comment on lines +207 to +211
maybe_user = self.get_by_name(entity=User, fqn=name)
if maybe_user is None:
    maybe_user = self._search_by_name(
        entity=User, name=name, from_count=from_count, size=size, fields=fields
    )

Copilot AI Apr 17, 2026

Same issue for user lookups: the exact get_by_name(entity=User, fqn=name) call does not pass through the fields argument from get_reference_by_name, so callers requesting specific fields may get a different payload depending on whether the match is exact vs. fuzzy. Pass fields to get_by_name here as well for consistent results.
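
The consistency concern can be illustrated with a small sketch in which both lookup paths receive the same `fields` value. `get_reference` and the two recording callables are hypothetical stand-ins for this example; the sketch does not reproduce the real `get_by_name` or `_search_by_name` signatures.

```python
from typing import Callable, List, Optional

Lookup = Callable[[str, Optional[List[str]]], Optional[dict]]


def get_reference(
    name: str,
    fields: Optional[List[str]],
    exact_lookup: Lookup,
    fuzzy_search: Lookup,
) -> Optional[dict]:
    """Forward the same fields to both paths so the payload shape matches."""
    if not name:
        return None
    found = exact_lookup(name, fields)
    if found is None:
        found = fuzzy_search(name, fields)
    return found


calls: List[tuple] = []


def exact_miss(name: str, fields: Optional[List[str]]) -> Optional[dict]:
    calls.append(("exact", fields))
    return None  # simulate no exact match


def fuzzy_hit(name: str, fields: Optional[List[str]]) -> Optional[dict]:
    calls.append(("fuzzy", fields))
    return {"name": name, "fields": fields}


user = get_reference("jane.doe", ["teams", "roles"], exact_miss, fuzzy_hit)
```

With this shape, a caller that asks for specific fields gets them whether the match came from the exact lookup or from the fuzzy fallback.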

Comment on lines +100 to +111
+ const filteredHistoricalColumns = useMemo(() => {
+   if (!searchText) {
+     return allHistoricalColumns;
+   }
+   const lower = searchText.toLowerCase();

-       setTableColumns(pruneEmptyChildren(response.data) || []);
-       handlePagingChange(response.paging);
-     } catch {
-       // Set empty state if API fails
-       setTableColumns([]);
-       handlePagingChange({
-         offset: 1,
-         limit: pageSize,
-         total: 0,
-       });
-     } finally {
-       setColumnsLoading(false);
-     }
-   },
-   [tableFqn, pageSize]
- );
+   return allHistoricalColumns.filter(
+     (col) =>
+       col.name?.toLowerCase().includes(lower) ||
+       col.description?.toLowerCase().includes(lower)
+   );
+ }, [allHistoricalColumns, searchText]);

Copilot AI Apr 17, 2026

filteredHistoricalColumns reimplements column search logic but only checks top-level name/description. This is less capable than the existing searchInColumns logic used inside VersionTable (recursive, includes displayName and dataType), and it can drop matches that exist only in nested children or in displayName/type. Consider reusing the shared searchInColumns(allHistoricalColumns, searchText) to preserve existing search behavior for schema version pages.

total: filteredHistoricalColumns.length,
});
setColumnsLoading(false);
}, [isVersionLoading, filteredHistoricalColumns, currentPage, pageSize]);

Copilot AI Apr 17, 2026

The useEffect that slices columns and updates paging uses handlePagingChange but it is not listed in the dependency array. This will trip react-hooks/exhaustive-deps (and can capture stale references if the hook implementation ever changes). Add handlePagingChange to the dependency array (and, if needed for linting consistency, include other referenced values like setTableColumns/setColumnsLoading).

Suggested change
- }, [isVersionLoading, filteredHistoricalColumns, currentPage, pageSize]);
+ }, [
+   isVersionLoading,
+   filteredHistoricalColumns,
+   currentPage,
+   pageSize,
+   handlePagingChange,
+ ]);

@sonarqubecloud


Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Ownership association with special characters

3 participants