RDF Knowledge Graph epic with requirements including structured RDF modeling, PROV-O lineage representation, idempotent indexing, inference engine support, and scalable indexing architecture. #24911
TypeScript types have been updated based on the JSON schema changes in the PR.
🔍 CI failure analysis for 6509df1: Multiple CI failures: (1) RDF workflow configuration issue, (2) widespread Playwright test failures across 3 shards (2/6, 4/6, 5/6), all unrelated to backend RDF changes

Multiple CI Failures Detected
This PR has multiple distinct CI failures:

Issue 1: RDF Workflow Configuration Failure
maven-postgresql-rdf-ci (Jobs 58471241367, 58472281646)
Root Cause: Missing
Solution: The fix is to add the docker compose -f ./docker/development/docker-compose-postgres-fuseki.yml ps
Status: This configuration issue should be corrected by updating the workflow file.

Issue 2: Widespread Playwright UI Test Failures
Multiple Playwright Shards Failing
Affected Jobs (3 out of 6 shards failing):
Failure Rate: 50% of Playwright shards failing (3/6)
Failed Test Patterns:
Error Signatures:
Critical Assessment: Playwright Failures NOT Related to PR
Evidence Against PR Causation:
Failure Characteristics Point To:

Recommendation
RDF CI Failure: Blocks PR - The workflow configuration issue should be corrected by updating line 108 of the workflow file
Playwright Failures: Should NOT block this PR for these reasons:
Suggested Actions:
Code Review 👍 Approved with suggestions
Substantial RDF improvements for structured properties and lineage support with comprehensive test coverage. Previous URI collision issue resolved, but atomic storage and silent exception concerns remain.
Resolved ✅ 1 resolved
Performance: Using System.nanoTime() for URI generation may cause collisions
What Works Well
Recommendations



Describe your changes:
Summary
This PR implements the RDF Knowledge Graph epic with requirements including structured RDF modeling, PROV-O lineage representation, idempotent indexing, inference engine support, and scalable indexing architecture.
Changes
Structured RDF Modeling for Nested Properties
Remodeled 5 properties from embedded JSON literals to proper RDF triples:
- votes → om:hasVotes with om:Votes class containing om:upVotes, om:downVotes
- changeDescription → om:hasChangeDescription with om:ChangeDescription class
- lifeCycle → om:hasLifeCycle with lifecycle properties
- customProperties → Structured custom property mappings
- extension → Proper RDF extension handling
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SqlMappingContext.java - Added nested mappings
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SparqlBuilder.java - Nested field projection support

PROV-O Lineage Representation
Implemented W3C PROV-O vocabulary for lineage relationships:
- prov:wasDerivedFrom for upstream relationships
- prov:wasInfluencedBy for downstream relationships
- prov:wasGeneratedBy for pipeline associations
- om:hasColumnLineage
- om:sqlQuery
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SqlMappingContext.java - Lineage table mapping with PROV-O
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/RdfRepository.java - addLineageWithDetails() method

Idempotent RDF Indexing
Implemented DELETE/INSERT SPARQL Update pattern for idempotent entity updates:
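One way to make such an update idempotent is to delete every triple currently stored for the entity and insert the fresh set within a single SPARQL Update request. The sketch below only builds that request as a string. The class and method names, the single named graph, and the assumption that all of an entity's triples use its IRI as subject are illustrative and are not taken from JenaFusekiStorage. The OPTIONAL in the WHERE clause keeps one solution even when nothing is stored yet, so the first indexing run still inserts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IdempotentUpdateSketch {

    /** Escapes characters that may not appear raw inside a SPARQL string literal. */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r");
    }

    /**
     * Builds one DELETE/INSERT request that replaces all triples of entityIri.
     * Assumes the IRIs are already valid; values are written as plain string literals.
     */
    static String buildReplaceUpdate(
            String graphIri, String entityIri, Map<String, String> valuesByPredicate) {
        String existing = "GRAPH <" + graphIri + "> { <" + entityIri + "> ?p ?o }";
        StringBuilder fresh = new StringBuilder();
        for (Map.Entry<String, String> e : valuesByPredicate.entrySet()) {
            fresh.append("  <").append(entityIri).append("> <").append(e.getKey())
                    .append("> \"").append(escapeLiteral(e.getValue())).append("\" .\n");
        }
        return "DELETE { " + existing + " }\n"
                + "INSERT { GRAPH <" + graphIri + "> {\n" + fresh + "} }\n"
                + "WHERE { OPTIONAL { " + existing + " } }";
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("http://www.w3.org/2000/01/rdf-schema#label", "orders");
        // Hypothetical graph and entity IRIs, used only for this example.
        System.out.println(
                buildReplaceUpdate("urn:example:graph", "urn:example:table:orders", values));
    }
}
```

Sending the same request twice leaves the graph in the same state, which is what makes re-indexing an entity safe. Triples hanging off intermediate nodes (for example a separate votes resource) would need their own delete pattern.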
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/storage/JenaFusekiStorage.java - DELETE/INSERT pattern

Inference Engine Support
Enabled reasoning/inference capabilities:
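The PR text does not list the rules that InferenceEngine applies, so the following is only a generic illustration of rule-based forward chaining: one hypothetical transitivity rule is applied repeatedly until no new facts appear. The class name, the rule, and the example relation are all assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TransitiveRuleSketch {

    /**
     * Applies the rule (a R b) and (b R c) => (a R c) until a fixpoint is reached.
     * The input maps each subject to the objects it is directly related to.
     */
    static Map<String, Set<String>> closure(Map<String, Set<String>> asserted) {
        Map<String, Set<String>> result = new HashMap<>();
        asserted.forEach((subject, objects) -> result.put(subject, new TreeSet<>(objects)));

        boolean changed = true;
        while (changed) {
            changed = false;
            for (String a : new ArrayList<>(result.keySet())) {
                Set<String> targets = result.get(a);
                // Iterate over copies so that adding inferred facts is safe.
                for (String b : new ArrayList<>(targets)) {
                    Set<String> next = result.getOrDefault(b, Set.of());
                    for (String c : new ArrayList<>(next)) {
                        if (targets.add(c)) {
                            changed = true; // a new fact was inferred, so loop again
                        }
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical "derived from" facts between three datasets.
        Map<String, Set<String>> asserted = new HashMap<>();
        asserted.put("report", Set.of("staging"));
        asserted.put("staging", Set.of("raw"));
        System.out.println(closure(asserted));
    }
}
```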
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/InferenceEngine.java - Rule-based inference
- openmetadata-spec/src/main/resources/json/schema/api/configuration/rdfConfiguration.json - Enabled inference by default

RdfIndexApp Scalability Optimization
Optimized for million-scale environments using SearchIndexApp patterns:
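A common way to avoid issuing one relationship query per entity is to load the relationships for a whole batch of entity IDs in a single lookup. The sketch below uses a hypothetical in-memory store that counts lookups to show the difference. Its class and method names are invented for illustration and do not reflect the signatures of the repository's actual DAO methods.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BatchRelationSketch {

    /** Hypothetical stand-in for a relationship table; each method call counts as one query. */
    static final class RelationStore {
        private final Map<String, List<String>> toByFrom = new HashMap<>();
        int queryCount = 0;

        void add(String fromId, String toId) {
            toByFrom.computeIfAbsent(fromId, k -> new ArrayList<>()).add(toId);
        }

        /** Per-entity lookup: calling this in a loop costs one query per entity. */
        List<String> findTo(String fromId) {
            queryCount++;
            return toByFrom.getOrDefault(fromId, List.of());
        }

        /** Batch lookup: one query returns the relations for every ID in the batch. */
        Map<String, List<String>> findToBatch(Collection<String> fromIds) {
            queryCount++;
            Map<String, List<String>> relations = new HashMap<>();
            for (String id : fromIds) {
                relations.put(id, toByFrom.getOrDefault(id, List.of()));
            }
            return relations;
        }
    }

    public static void main(String[] args) {
        RelationStore store = new RelationStore();
        store.add("table1", "column1");
        store.add("table1", "column2");
        store.add("table2", "column3");

        List<String> batch = List.of("table1", "table2");
        Map<String, List<String>> relations = store.findToBatch(batch);
        System.out.println(relations + " loaded with " + store.queryCount + " lookup(s)");
    }
}
```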
Key optimizations:
- findToBatchWithRelations() and findFromBatch() replacing N+1 queries
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/rdf/RdfIndexApp.java

SQL-to-SPARQL Translation Enhancements
Enhanced SPARQL generation for structured/nested fields:
votes.upVotes)
Files:
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SqlMappingContext.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SparqlBuilder.java

Test Plan
RdfIndexAppTest - 22 tests
SqlToSparqlTranslatorTest - 10 tests
SparqlBuilderNestedFieldsTest - 29 tests
Total: 61 RDF tests passing
Files Changed
Core Implementation
- openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/rdf/RdfIndexApp.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/RdfRepository.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/InferenceEngine.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/storage/JenaFusekiStorage.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SqlMappingContext.java
- openmetadata-service/src/main/java/org/openmetadata/service/rdf/sql2sparql/SparqlBuilder.java
Configuration
- openmetadata-spec/src/main/resources/json/schema/api/configuration/rdfConfiguration.json
Tests
- openmetadata-service/src/test/java/org/openmetadata/service/apps/bundles/rdf/RdfIndexAppTest.java (new)
- openmetadata-service/src/test/java/org/openmetadata/service/rdf/sql2sparql/SparqlBuilderNestedFieldsTest.java (new)
- openmetadata-service/src/test/java/org/openmetadata/service/rdf/sql2sparql/SqlToSparqlTranslatorTest.java (updated)