research: knowledge graph construction ergonomics and pipeline limitations #449

@DecisionNerd

Goal

Hands-on validation of the knowledge-graph-construction use case against GraphForge v0.3.9. Build a small KG from mock LLM extraction output (5–10 entity types, 20+ relationships) and document friction points.
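One way to get a reproducible fixture of that size is to generate it rather than hand-write it. The sketch below is only an illustration: the entity types, the `id`/`label`/`name` fields, and the relationship naming are all assumptions made here, not a format defined by GraphForge or by the docs under test.

```python
# Hypothetical entity types; the issue only asks for 5-10 of them.
ENTITY_TYPES = ["Person", "Organization", "Product", "Location", "Event", "Topic"]


def make_mock_extraction(per_type=4):
    """Return (entities, relationships) shaped like mock LLM extraction output."""
    entities = [
        {"id": f"{label.lower()}-{i}", "label": label, "name": f"{label} {i}"}
        for label in ENTITY_TYPES
        for i in range(per_type)
    ]
    # Link every entity to the next one in the list, wrapping around at the end,
    # so the number of relationships equals the number of entities.
    relationships = [
        {
            "source": a["id"],
            "target": b["id"],
            "type": f"{a['label'].upper()}_TO_{b['label'].upper()}",
        }
        for a, b in zip(entities, entities[1:] + entities[:1])
    ]
    return entities, relationships
```

With the defaults this yields 6 entity types, 24 entities, and 24 relationships, which satisfies the size targets above.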

Scope

  • Run through every code example in docs/use-cases/knowledge-graph-construction.md — produce a pass/fail matrix
  • Document pain points in MERGE-based ingest (e.g., dynamic label support, parameterized label syntax)
  • Schema validation gaps: no constraint enforcement, no programmatic way to list all labels/relationship types
  • Export ergonomics: how much code does it take to get a subgraph into pandas or share it as JSON?
  • Ingestion performance: how does ingest time scale as batches grow to 1K and 10K entities?
  • Recommendations for utility functions or API additions that would reduce friction
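The dynamic-label friction point can be made concrete without touching the library. In openCypher a parameter can stand in for a property value but not for a label, so a label that comes from extraction output has to be spliced into the query text. The sketch below shows that workaround with an identifier check to avoid injection; `merge_node_query` and `group_by_label` are hypothetical helpers, and whether GraphForge v0.3.9 accepts the resulting `MERGE ... SET` statement and its parameters is exactly what the research should verify.

```python
import re

# Conservative identifier check so an extracted label cannot inject Cypher.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def merge_node_query(label, key="id"):
    """Build a MERGE statement with the label spliced into the query text.

    Hypothetical helper: the label cannot be passed as a parameter,
    so it is validated and interpolated instead.
    """
    if not _IDENT.match(label) or not _IDENT.match(key):
        raise ValueError(f"unsafe identifier: {label!r} / {key!r}")
    return f"MERGE (n:{label} {{{key}: ${key}}}) SET n.name = $name"


def group_by_label(entities):
    """Group entity dicts by label so each label needs only one query string."""
    grouped = {}
    for entity in entities:
        grouped.setdefault(entity["label"], []).append(entity)
    return grouped
```

For example, `merge_node_query("Person")` returns `MERGE (n:Person {id: $id}) SET n.name = $name`, while a label such as `"Person) DETACH DELETE (m"` raises `ValueError`. Having to write this helper at all is the kind of friction worth recording.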

Output

docs/research/kg-construction.md — findings document with:

  • Code examples of each friction point (runnable against v0.3.9)
  • Pass/fail matrix for all documented patterns
  • Prioritized list of recommended API additions
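The pass/fail matrix could be produced mechanically instead of by hand. The following is a minimal sketch under several assumptions: the examples in the docs are fenced as `python` blocks, each block can run on its own, and a raised exception is the only failure signal. `run_examples` and `to_matrix` are hypothetical names, not part of GraphForge.

```python
import re

TICKS = "`" * 3  # built at runtime to avoid a literal fence inside this example
FENCE = re.compile(TICKS + r"python\n(.*?)" + TICKS, re.DOTALL)


def run_examples(markdown_text):
    """Execute each python-fenced block and record (index, status, error)."""
    results = []
    for index, match in enumerate(FENCE.finditer(markdown_text), start=1):
        try:
            # Each block gets a fresh namespace, so blocks cannot share state.
            exec(match.group(1), {"__name__": "__example__"})
            results.append((index, "pass", ""))
        except Exception as exc:
            results.append((index, "fail", f"{type(exc).__name__}: {exc}"))
    return results


def to_matrix(results):
    """Render the results as a Markdown table."""
    lines = ["| # | Result | Error |", "|---|--------|-------|"]
    lines += [f"| {i} | {status} | {error} |" for i, status, error in results]
    return "\n".join(lines)
```

If the documented examples build on each other (for instance, a graph created in one block and queried in the next), the fresh-namespace choice will report false failures; in that case a shared namespace dict passed to every `exec` call is the simpler option.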

Acceptance Criteria

  • All code in knowledge-graph-construction.md tested; failures documented
  • Friction points supported by concrete code, not just prose
  • Recommendations are specific (method signature proposals, not vague suggestions)
  • No code changes to the library (findings only)

Metadata

Assignees

No one assigned

    Labels

    core (Core source code changes), question (Further information is requested)
