Skip to content

Lineage Improvements#24919

Merged
mohityadav766 merged 10 commits intomainfrom
lineage-imprvements
Dec 21, 2025
Merged

Lineage Improvements#24919
mohityadav766 merged 10 commits intomainfrom
lineage-imprvements

Conversation

@mohityadav766
Copy link
Copy Markdown
Member

@mohityadav766 mohityadav766 commented Dec 19, 2025

Describe your changes:

Fixes

I worked on ... because ...


Summary by Gitar

  • Adaptive lineage strategy pattern:
    • Four size-based strategies (SmallGraphStrategy, MediumGraphStrategy, LargeGraphStrategy, StreamingGraphStrategy) handle graphs from 1K to 500K+ nodes
    • Strategy selector in LineageStrategySelector dynamically chooses optimal approach based on estimated graph size
  • Column-level filtering:
    • New columnFilter and preservePaths API parameters in LineageResource enable filtering by column names, tags, or glossary terms
    • ColumnFilterMatcher and LineagePathPreserver classes implement filtering logic while preserving complete lineage paths
  • Result caching infrastructure:
    • GuavaLineageGraphCache with LRU eviction and 5-minute TTL caches small/medium graphs (<50K nodes)
    • LineageCacheKey provides composite cache keys based on query parameters
  • Code deduplication:
    • New AbstractLineageGraphBuilder base class eliminates 95% code duplication between Elasticsearch and OpenSearch implementations
  • Configuration schema:
    • lineageSettings.json extended with graphPerformanceConfig defining thresholds (5K, 50K, 100K nodes) and batch sizes for each strategy

This will update automatically on new commits.


@github-actions
Copy link
Copy Markdown
Contributor

TypeScript types have been updated based on the JSON schema changes in the PR

@github-actions github-actions Bot requested a review from a team as a code owner December 19, 2025 10:20
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Dec 19, 2025

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 64%
64.3% (50893/79150) 41.9% (24778/59139) 45.38% (7807/17204)

@sonarqubecloud
Copy link
Copy Markdown

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Dec 20, 2025

🔍 CI failure analysis for c6c3863: Maven build failed due to 502 Bad Gateway error from yarn registry while fetching @emotion package; this is a transient infrastructure issue unrelated to PR code changes.

Issue

Maven build failure in openmetadata-ui-core-components during yarn dependency installation.

Root Cause

The yarn package registry (registry.yarnpkg.com) returned a 502 Bad Gateway error when trying to fetch:

@emotion/use-insertion-effect-with-fallbacks@1.2.0

This is a temporary infrastructure/network issue on the registry's side.

Details

Build Context:

  • Build ran for 3 hours 33 minutes before failure
  • All Java tests passed successfully: 7701 tests run, 707 skipped, 0 failed
  • Failure occurred during yarn install --frozen-lockfile phase
  • Error: Request failed "502 Bad Gateway"

Why This Is Unrelated to PR #24919:

  1. This PR only modifies Java lineage graph building code
  2. No dependency changes were made
  3. The failure is a transient registry outage, not a code issue
  4. All actual tests passed - only dependency fetching failed

Combined with Previous Python Test Failure:
Both CI failures (Python SqlFluff parser limitation and Maven yarn registry error) are unrelated to this PR's Java lineage improvements.

Solution

The fix is to retry the CI job. This is a flaky infrastructure issue that should resolve once the yarn registry recovers from the temporary outage. The 502 Bad Gateway indicates the registry service was temporarily unavailable.

Code Review 👍 Approved with suggestions

Comprehensive lineage performance enhancement with column filtering. Well-architected strategy pattern with good test coverage. Three previous findings remain unresolved.

Resolved ✅ 3 resolved
Performance: GuavaLineageGraphCache weight calculation creates redundant lists

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/lineage/GuavaLineageGraphCache.java:52
In GuavaLineageGraphCache.java, the weigher function creates the cache with a weight calculation based on node and edge counts. However, the weigher lambda is called for every put operation but references result.getNodes() and result.getDownstreamEdges()/getUpstreamEdges() which could be null.

While the code handles this with ternary operators, more importantly, the cache doesn't account for the actual memory consumption of the lineage data structures. A more accurate estimate would account for the size of column lineage data within edges.

Suggested fix: Consider using a more refined memory estimation that accounts for column lineage data size, or document that the weight is an approximation based on node count only.

Edge Case: shouldCacheGraph returns false for zero nodes

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/lineage/LineageGraphConfiguration.java:108
In LineageGraphConfiguration.java, the shouldCacheGraph method returns false for nodeCount <= 0. While this makes sense for invalid counts, it also prevents caching of legitimate empty lineage results (where a node exists but has no connected lineage).

Impact: Repeated queries for entities with no lineage will always hit the backend instead of being cached.

Suggested fix: Consider caching valid lineage results even when they have zero nodes/edges, as this is a legitimate state for newly created entities or entities without lineage relationships yet.

Bug: Integer overflow possible in calculateGeometricProgression

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/elasticsearch/ESLineageGraphBuilder.java:1909
In ESLineageGraphBuilder.java, the calculateGeometricProgression method multiplies power *= cappedFanout in a loop. Even with cappedFanout limited to 10, at depth 10 the value power would be 10^10 = 10 billion, which overflows a 32-bit integer (max ~2.1 billion).

The check for total > config.getMaxInMemoryNodes() may not trigger before the overflow occurs since total is accumulated sum, not power.

Impact: For deep lineage queries (depth > 9), the estimation could overflow and return an incorrect (potentially negative or small) value, causing wrong strategy selection.

Suggested fix: Use long type for power and total, or add an early exit check on power itself to prevent overflow. Alternatively, check if power > maxInMemoryNodes before multiplication.

What Works Well

  • Well-designed Strategy pattern (SmallGraph, MediumGraph, LargeGraph, StreamingGraph) for handling lineage graphs of varying sizes
  • Comprehensive test suite covering strategy selection, cache behavior, and edge cases
  • Clean separation of concerns with AbstractLineageGraphBuilder, LineageGraphConfiguration, and strategy implementations
  • Good use of Guava Cache with proper TTL, size limits, and eviction monitoring
  • Integer overflow protection added to calculateGeometricProgression with fanout capping and threshold checks
  • Cache weight calculation simplified by using maximumSize instead of weight-based eviction

Recommendations

  • Consider fixing the remaining issues from previous review:
    • RedisLineageGraphCache: Either make it non-instantiable (abstract class or private constructor) or remove it until implementation is ready
    • StreamingGraphStrategy: Change ERROR level logging to WARN or INFO for normal operation warnings
    • ColumnFilterMatcher: Add validation for empty filter values after parsing (e.g., "columnName:" with no value)

Status of Previous Findings

  • ✅ Integer overflow in calculateGeometricProgression - Fixed with fanout capping
  • ✅ shouldCacheGraph boundary condition - Fixed, now uses <= comparison
  • ✅ GuavaLineageGraphCache weight calculation - Fixed, uses maximumSize instead
  • ❌ RedisLineageGraphCache UnsupportedOperationException - Still present
  • ❌ StreamingGraphStrategy ERROR level logging - Still present
  • ❌ ColumnFilterMatcher input validation - Still needs validation for empty values

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off Gitar will not commit updates to this branch.
✅ Code review is on Gitar will review this change.
Display: compact Hiding non-applicable rules.

Comment with these commands to change:

Auto-apply ✅ Code review Compact
gitar auto-apply:on         
gitar code-review:off         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | This comment will update automatically (Docs)

@sonarqubecloud
Copy link
Copy Markdown

@mohityadav766 mohityadav766 merged commit 9c80936 into main Dec 21, 2025
34 of 36 checks passed
@mohityadav766 mohityadav766 deleted the lineage-imprvements branch December 21, 2025 05:06
ShaileshParmar11 pushed a commit that referenced this pull request Dec 26, 2025
* Lineage Improvements
1. Add path preservation
2. Add Column Filter

* Remove impl doc

* Add Lineage Strategies for efficient loading of graphs

* Update getByEntityCount

* Update generated TypeScript types

* Add Builder

* Fix Build Issue

* Make NodOpProgressTracker have empty constructor

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants