Allow FQN-based lineage APIs #28599

Open
harshach wants to merge 6 commits into main from harshach/fqn-based-lineage-api

Conversation

@harshach
Collaborator

@harshach harshach commented Jun 1, 2026

Describe your changes:

Issue number was not provided in the PR request.

Adds path-based FQN lineage APIs so callers can add, fetch, patch, delete, and delete-by-source lineage without resolving entity UUIDs first. The global EntityReference schema remains unchanged and PUT /v1/lineage continues to use id-based AddLineageRequest references.

Python SQL lineage ingestion now carries table FQNs through OMetaFQNLineageRequest and writes lineage through the name-based API, avoiding extra get_by_name calls just to build UUID-backed lineage edges.

Type of change:

  • Improvement

High-level design:

Path-based APIs are explicit and have no id/FQN precedence ambiguity:

  • PUT /v1/lineage/{fromEntity}/name/{fromFQN}/{toEntity}/name/{toFQN}
  • GET /v1/lineage/getLineageEdge/{fromEntity}/name/{fromFQN}/{toEntity}/name/{toFQN}
  • PATCH /v1/lineage/{fromEntity}/name/{fromFQN}/{toEntity}/name/{toFQN}
  • DELETE /v1/lineage/{fromEntity}/name/{fromFQN}/{toEntity}/name/{toFQN}
  • DELETE /v1/lineage/source/name/{entityType}/{entityFQN}/type/{lineageSource}

The server resolves path FQNs with existing entity name resolution, authorizes lineage edits against the resolved entities, and keeps body-based AddLineageRequest behavior id-based.
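
As a rough illustration of how a client might assemble the edge routes listed above, the sketch below builds the path shared by the PUT, PATCH, and DELETE name routes. The helper name `fqn_lineage_edge_path` and the decision to percent-encode each FQN with `urllib.parse.quote(..., safe="")` are assumptions made for this example, not the PR's actual client code.

```python
from urllib.parse import quote


def fqn_lineage_edge_path(
    from_entity: str, from_fqn: str, to_entity: str, to_fqn: str
) -> str:
    """Build the edge path used by the PUT, PATCH, and DELETE name routes.

    Hypothetical helper: percent-encoding each FQN is an assumption, made so
    that characters such as "/" inside a name cannot alter the path shape.
    """
    return (
        f"/v1/lineage/{from_entity}/name/{quote(from_fqn, safe='')}"
        f"/{to_entity}/name/{quote(to_fqn, safe='')}"
    )


# Example FQNs are made up for illustration.
path = fqn_lineage_edge_path(
    "table", "svc.db.schema.orders", "table", "svc.db.schema.orders_daily"
)
print(path)
```

Because both ends of the edge are named in the path, a caller never has to decide whether an id or an FQN takes precedence.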

Tests:

Use cases covered

  • Path-based add lineage by FQN succeeds.
  • FQN get/patch/delete lineage-edge routes work.
  • FQN delete-by-source works.
  • Ingestion SQL lineage emits FQN lineage requests and writes through the name-based API.
  • Global EntityReference schema and generated UI types are not relaxed.

Unit tests

  • I added/updated unit tests for the new/changed logic.
  • Files added/updated: openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/LineageRepositoryTest.java, ingestion unit tests for FQN lineage request generation.
  • Coverage %: Not measured.

Backend integration tests

  • I added integration tests in openmetadata-integration-tests/ for new/changed API endpoints.
  • Files added/updated: openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/LineageResourceIT.java.

Ingestion integration tests

  • Not applicable.
  • Ingestion behavior is covered through focused request-generation tests plus Python lint and compile checks.

Playwright (UI) tests

  • Not applicable.
  • No UI behavior changes.

Manual testing performed

  1. Ran Python ruff check/format and py_compile for touched ingestion files.
  2. Ran the Playwright lint/format sequence for the touched helper while removing the unrelated UI diff from the PR.
  3. Ran mvn -pl openmetadata-service spotless:apply and mvn -pl openmetadata-service spotless:check.
  4. Ran mvn -T 1C -pl openmetadata-service -Dtest=LineageRepositoryTest test.

UI screen recording / screenshots:

Not applicable; no UI behavior changes.

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • My PR is linked to a GitHub issue via Fixes #<issue-number> above.
  • No additional code comments were needed; changes follow existing lineage patterns.
  • For JSON Schema changes: Not applicable; this PR does not change JSON Schema.
  • For UI changes: Not applicable; this PR does not change UI behavior.
  • I have added tests (unit / integration / Playwright as applicable) and listed them above.

@github-actions github-actions Bot added the backend and safe to test labels Jun 1, 2026
@harshach harshach marked this pull request as ready for review June 1, 2026 16:22
Copilot AI review requested due to automatic review settings June 1, 2026 16:22
@harshach harshach requested review from a team as code owners June 1, 2026 16:22
Comment thread ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py Outdated
Comment thread ingestion/src/metadata/ingestion/sink/metadata_rest.py
Comment thread ingestion/src/metadata/sdk/api/lineage.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

Enables AddLineageRequest (and related lineage endpoints) to accept references by fullyQualifiedName in addition to id. The server resolves references via Entity.getEntityReference, new FQN-suffixed routes are added for get/patch/delete/delete-by-source, the Python ingestion and SDK build FQN-based lineage payloads, and the global EntityReference JSON schema is relaxed to no longer require id. UI generated TypeScript types are regenerated to reflect id?: string.

Changes:

  • Backend: relax EntityReference schema, add resolveLineageReference helper, add FQN-based lineage routes (/getLineageEdge/name/..., PATCH/DELETE name routes, delete-by-source-by-name).
  • Ingestion: OMetaLineageMixin now keys cache and builds edge paths by id or FQN; SQL lineage emits FQN-only references; SDK exposes *_by_name variants.
  • Tests: new LineageRepositoryTest cases for resolve-by-FQN and missing-id+FQN; new IT cases covering FQN-only add, id precedence, FQN-based patch/delete, and delete-by-source-by-name.

Reviewed changes

Copilot reviewed 9 out of 233 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| openmetadata-spec/.../entityReference.json | Drops id from required so FQN-only refs validate. |
| openmetadata-service/.../LineageRepository.java | Adds resolve/validate helpers, FQN variants of getEdge/patchEdge/deleteBySource. |
| openmetadata-service/.../LineageResource.java | New FQN routes and shared authorizeLineageReference; modifies existing delete-by-name path. |
| openmetadata-service/.../LineageRepositoryTest.java | Unit tests for FQN resolution + missing-ref error. |
| openmetadata-integration-tests/.../LineageResourceIT.java | IT coverage for FQN add/patch/delete and delete-by-source. |
| ingestion/.../lineage_mixin.py | Cache keys/paths support id or FQN; add/patch/delete handle both forms. |
| ingestion/.../metadata_rest.py | Dispatch delete-by-source by id or FQN based on reference. |
| ingestion/.../sql_lineage.py | SQL lineage now uses FQN instead of id when building edges. |
| ingestion/src/metadata/sdk/api/lineage.py | New add_lineage_by_name / delete_lineage_by_name / delete_lineage_by_source_by_name (sync+async). |
| openmetadata-ui/.../generated/**/* (many) | Codegen updates: id: string → id?: string on EntityReference. |

Comment thread openmetadata-spec/src/main/resources/json/schema/type/entityReference.json Outdated
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

Jest test Coverage

UI tests summary

| Lines | Statements | Branches | Functions |
| --- | --- | --- | --- |
| Coverage: 62% | 63.07% (66751/105824) | 43.92% (36638/83401) | 46.13% (10852/23524) |

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

🔴 Playwright Results — 3 failure(s), 16 flaky

✅ 4253 passed · ❌ 3 failed · 🟡 16 flaky · ⏭️ 88 skipped

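| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🟡 Shard 1 | 298 | 0 | 1 | 4 |
| 🔴 Shard 2 | 801 | 1 | 5 | 9 |
| 🟡 Shard 3 | 803 | 0 | 4 | 8 |
| 🔴 Shard 4 | 842 | 2 | 2 | 12 |
| 🟡 Shard 5 | 719 | 0 | 1 | 47 |
| 🟡 Shard 6 | 790 | 0 | 3 | 8 |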

Genuine Failures (failed on all attempts)

Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2)
Error: expect(locator).toHaveText(expected) failed

Locator:  locator('[data-row-key*="StatusBadgeTerm1780378541955"]').locator('.status-badge')
Expected: "Draft"
Received: "In Review"
Timeout:  15000ms

Call log:
  - Expect "toHaveText" with timeout 15000ms
  - waiting for locator('[data-row-key*="StatusBadgeTerm1780378541955"]').locator('.status-badge')
    19 × locator resolved to <div class="status-badge inReview" data-testid=""PW%'b2690611.Calm6f1f4034".StatusBadgeTerm1780378541955-status">…</div>
       - unexpected value "In Review"

Pages/DataMarketplacePermissions.spec.ts › Admin sees add buttons and customize button (shard 4)
Error: expect(locator).toContainText(expected) failed

Locator: getByTestId('greeting-text')
Expected substring: "admin"
Received string:    "Hey, Admin 👋"
Timeout: 15000ms

Call log:
  - Expect "toContainText" with timeout 15000ms
  - waiting for getByTestId('greeting-text')
    19 × locator resolved to <h3 class="tw:mb-1 tw:mt-0" data-testid="greeting-text">Hey, Admin 👋</h3>
       - unexpected value "Hey, Admin 👋"

Pages/DataMarketplacePermissions.spec.ts › Data consumer does NOT see add buttons (shard 4)
Error: expect(locator).toContainText(expected) failed

Locator: getByTestId('greeting-text')
Expected substring: "Quick2e5b9d73Deer4512b15b"
Received string:    "Hey, Quick 2 E 5 B 9 D 73 Deer 4512 B 15 B 👋"
Timeout: 15000ms

Call log:
  - Expect "toContainText" with timeout 15000ms
  - waiting for getByTestId('greeting-text')
    19 × locator resolved to <h3 class="tw:mb-1 tw:mt-0" data-testid="greeting-text">Hey, Quick 2 E 5 B 9 D 73 Deer 4512 B 15 B 👋</h3>
       - unexpected value "Hey, Quick 2 E 5 B 9 D 73 Deer 4512 B 15 B 👋"

🟡 16 flaky test(s) (passed on retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Features/ActivityAPI.spec.ts › creates an activity event when the description is updated (shard 2, 1 retry)
  • Features/CustomMetric.spec.ts › Table custom metric (shard 2, 1 retry)
  • Features/DataQuality/TestCaseImportExportBasic.spec.ts › User with ViewAll on TEST_CASE resource can successfully export test cases (shard 2, 1 retry)
  • Features/DataQuality/TestCaseImportExportE2eFlow.spec.ts › Admin: Complete export-import-validate flow (shard 2, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 2, 1 retry)
  • Features/KnowledgeCenter.spec.ts › User Mentions in article and redirect should work of Knowledge Center page (shard 3, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 3, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Advanced Blocks (shard 3, 1 retry)
  • Features/UserProfileOnlineStatus.spec.ts › Should show online status badge on user profile for active users (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Description Rule Is_Not_Set (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel_KnowledgeCenter.spec.ts › Should remove user owner for knowledgeCenter (shard 5, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › Column lineage for dashboardDataModel -> table (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify Impact Analysis service filter selection (shard 6, 1 retry)
  • Pages/LogsViewer.spec.ts › Logs page shows breadcrumb, summary, and log viewer or empty state after opening from bundle suite pipeline tab (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@harshach harshach changed the title from "[codex] Allow FQN-based lineage APIs" to "Allow FQN-based lineage APIs" Jun 1, 2026
Copilot AI review requested due to automatic review settings June 1, 2026 21:04
Comment thread ingestion/src/metadata/ingestion/sink/metadata_rest.py Outdated
Comment thread ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py
@sonarqubecloud

sonarqubecloud Bot commented Jun 2, 2026

Copilot AI review requested due to automatic review settings June 2, 2026 03:13
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated no new comments.

@gitar-bot

gitar-bot Bot commented Jun 2, 2026

Code Review ✅ Approved 5 resolved / 5 findings

Adds path-based FQN lineage APIs to enable direct lineage operations without UUID resolution. This update resolves several issues, including silent lineage failures, inconsistent identifier handling, and crashes during ingestion.

✅ 5 resolved
Edge Case: Mixed id/FQN references cause silent lineage operation failures

📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:74-83 📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:86-94 📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:306-307 📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:388-389
The _lineage_edge_path and _lineage_edge_lookup_path methods require both entities in an edge to either both have id or both have fullyQualifiedName. If one entity has an id and the other only has FQN (a valid combination per the new schema), a ValueError is raised. In delete_lineage_edge (line 388) and patch_lineage_edge (line 306), this ValueError is caught and only logged, causing the operation to silently fail. This can happen when lineage requests are constructed from different sources (e.g., one reference resolved from a DB lookup with id, another built from user input with FQN only).
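
One way to make the mixed case visible, sketched under assumptions: decide the identifier form up front and raise a descriptive error that callers propagate instead of logging and continuing. `Ref` and `edge_identifier_mode` are hypothetical names for this illustration; they are not the mixin's real classes or methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ref:
    """Hypothetical stand-in for an entity reference; not the real model."""

    type: str
    id: Optional[str] = None
    fullyQualifiedName: Optional[str] = None


def edge_identifier_mode(from_ref: Ref, to_ref: Ref) -> str:
    """Return "id" or "fqn", or raise when the two ends share no form."""
    if from_ref.id and to_ref.id:
        return "id"
    if from_ref.fullyQualifiedName and to_ref.fullyQualifiedName:
        return "fqn"
    # Mixed or missing identifiers: fail loudly so the caller can react.
    raise ValueError(
        f"Cannot address edge {from_ref.type} -> {to_ref.type}: "
        "both ends need an id, or both need a fullyQualifiedName"
    )
```

A caller that wants to support mixed references could resolve the missing form before calling this check; whether that is worth an extra lookup is a design choice this sketch leaves open.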

Quality: SDK lineage.py uses excessive None assignments for EntityReference

📄 ingestion/src/metadata/sdk/api/lineage.py:157-166 📄 ingestion/src/metadata/sdk/api/lineage.py:167-176 📄 ingestion/src/metadata/sdk/api/lineage.py:237-248 📄 ingestion/src/metadata/sdk/api/lineage.py:249-256
In add_lineage_by_name and delete_lineage_by_name, EntityReference is constructed with explicit name=None, description=None, displayName=None, deleted=None, inherited=None, href=None. Since these are optional fields with None defaults in the Pydantic model, this verbosity is unnecessary and makes the code harder to maintain.

Edge Case: _delete_lineage_by_source_reference silently skips if no identifier

📄 ingestion/src/metadata/ingestion/sink/metadata_rest.py:444-457
In _delete_lineage_by_source_reference, if entity_reference.id is falsy and entity_reference.fullyQualifiedName is also falsy, the method returns without performing any deletion and without logging a warning. While unlikely in normal flow, this could mask bugs where an EntityReference is constructed without proper identifiers, leading to lineage edges not being cleaned up during override operations.
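
A minimal sketch of the dispatch with an explicit warning on the no-identifier branch. The function signature and the two injected callables are assumptions for illustration; the real method lives on the sink and calls the metadata client directly.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def delete_lineage_by_source_reference(
    entity_type: str,
    entity_id: Optional[str],
    entity_fqn: Optional[str],
    delete_by_id: Callable[[str, str], None],
    delete_by_name: Callable[[str, str], None],
) -> bool:
    """Dispatch to the id or FQN delete; warn instead of returning silently."""
    if entity_id:
        delete_by_id(entity_type, entity_id)
        return True
    if entity_fqn:
        delete_by_name(entity_type, entity_fqn)
        return True
    # Neither identifier is set: nothing can be deleted, so say so.
    logger.warning(
        "Skipping lineage delete for %s: reference has no id or FQN", entity_type
    )
    return False
```

Returning a boolean is optional; the warning alone is enough to surface a badly constructed reference during override operations.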

Bug: write_fqn_lineage crashes with AttributeError when lineage returns None

📄 ingestion/src/metadata/ingestion/sink/metadata_rest.py:447-460 📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:355
In metadata_rest.py, write_fqn_lineage calls self.metadata.add_lineage_by_name(...) which internally returns self.get_lineage_by_name(...) on success (line 355 of lineage_mixin.py). However, get_lineage_by_name can return None (e.g., if the freshly-created lineage can't be retrieved due to a transient API error or eventual consistency). When this happens, line 457 (created_lineage.get("error")) raises AttributeError: 'NoneType' object has no attribute 'get'.

The same pattern exists in write_lineage (line 441-442) which has the same latent issue, but that's pre-existing code. The new write_fqn_lineage should handle None.
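
The caller-side guard could look roughly like the sketch below. `lineage_error` is a hypothetical helper, and treating a `None` result as a reportable problem rather than a success is an assumption; the point is only that `.get("error")` is never called on `None`.

```python
from typing import Any, Dict, Optional


def lineage_error(created_lineage: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return an error message, or None when the write looks successful."""
    if created_lineage is None:
        # Assumption: an absent response is reported rather than ignored.
        return "lineage API returned no response"
    error = created_lineage.get("error")
    return str(error) if error else None
```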

Bug: add_lineage_by_name returns None on success when get_lineage_by_name fails

📄 ingestion/src/metadata/ingestion/ometa/mixins/lineage_mixin.py:339-353
In lineage_mixin.py line 355, add_lineage_by_name returns self.get_lineage_by_name(from_entity_type, from_entity_fqn) after a successful PUT. If the subsequent GET fails (e.g., due to a transient error or the entity not being immediately queryable), this returns None even though the lineage was successfully created. All callers (write_fqn_lineage, add_lineage_by_query) assume the return value is a dict, leading to crashes.

Contrast with the error path (line 353) which correctly returns {"error": ...}. The happy path should similarly guarantee a dict return.
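
On the callee side, one possible shape for a dict-only contract is sketched here. `add_then_fetch` and its two callables are hypothetical, and falling back to an empty dict when the follow-up GET returns `None` is an assumption about what a reasonable fix might be, not the change made in this PR.

```python
from typing import Any, Callable, Dict, Optional


def add_then_fetch(
    put_edge: Callable[[], None],
    fetch_lineage: Callable[[], Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Run the PUT, then the follow-up GET, always returning a dict."""
    try:
        put_edge()
    except Exception as exc:  # error path: mirror the {"error": ...} shape
        return {"error": str(exc)}
    fetched = fetch_lineage()
    # Happy path: the edge was written, so never hand None back to callers.
    return fetched if fetched is not None else {}
```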

@sonarqubecloud

sonarqubecloud Bot commented Jun 2, 2026

Quality Gate failed for 'open-metadata-ingestion'

Failed conditions
Security Review Rating on New Code: E (required ≥ A)

See analysis details on SonarQube Cloud

Labels

backend, safe to test

2 participants