SHACL-driven transformation: use shapes as transformation specification #325

@ddeboer

Summary

Use SHACL shapes not just for validation after transformation, but as a transformation specification that drives the mapping process itself. This inverts the usual flow: instead of 'transform, then validate,' the target SHACL shapes guide the transformation towards the desired RDF model.

Motivation

Currently, transforming non-RDF data (CSV, JSON, XML) to RDF requires manually writing SPARQL CONSTRUCT queries or RML mappings. The user must understand both the source schema and the target RDF model, and there is no feedback loop between the transformation and the desired output shape.

SHACL shapes already describe what valid RDF should look like – required properties, value types, cardinalities, controlled vocabularies. This is exactly the information needed to guide a transformation, not just check its result.

Proposed approach

  1. Define the target model as SHACL shapes – these become the transformation specification
  2. Analyse source data (CSV/JSON/XML) against the SHACL shapes to identify potential mappings
  3. Generate or suggest SPARQL CONSTRUCT queries that map source fields to the target model
  4. Validate the transformation output against the same SHACL shapes – violations become feedback for iterative refinement
  5. Iterate until the output conforms to the target shapes

This creates a SHACL-driven feedback loop: shapes → mapping → transformation → validation → refined mapping.
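To make the loop concrete, here is a minimal sketch in plain Python. It is not LDE code and does not use a real SHACL engine: `PropertyConstraint`, `suggest_mapping`, `transform`, `validate`, the `ex:` property names and the sample CSV are all hypothetical stand-ins, and the "shape" is reduced to a path, a datatype and a minimum count. A real implementation would presumably read actual SHACL shapes and emit SPARQL CONSTRUCT queries instead of dictionaries.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyConstraint:
    """Hypothetical, heavily simplified stand-in for a SHACL property shape."""
    path: str           # stand-in for sh:path, e.g. "ex:name"
    datatype: str       # stand-in for sh:datatype: "string" or "integer"
    min_count: int = 0  # stand-in for sh:minCount


def suggest_mapping(columns, constraints, hints=None):
    """Steps 2-3: guess which source column feeds which target property.

    Matches a column to a property when the column name equals the local
    part of the path (case-insensitive). User-supplied hints win.
    """
    hints = hints or {}
    mapping = {}
    for constraint in constraints:
        if constraint.path in hints:
            mapping[constraint.path] = hints[constraint.path]
            continue
        local_name = constraint.path.split(":")[-1].lower()
        for column in columns:
            if column.lower() == local_name:
                mapping[constraint.path] = column
                break
    return mapping


def transform(rows, mapping):
    """Apply the mapping: one {property: value} record per source row."""
    return [
        {path: row[column] for path, column in mapping.items() if row.get(column)}
        for row in rows
    ]


def validate(records, constraints):
    """Step 4: report (record index, path, violated constraint) tuples."""
    violations = []
    for index, record in enumerate(records):
        for constraint in constraints:
            value = record.get(constraint.path)
            if value is None:
                if constraint.min_count > 0:
                    violations.append((index, constraint.path, "minCount"))
            elif constraint.datatype == "integer" and not value.lstrip("-").isdigit():
                violations.append((index, constraint.path, "datatype"))
    return violations


# Made-up source data and target shape, for illustration only.
SOURCE = "name,year_built\nBuilding A,1885\nBuilding B,1635\n"
SHAPE = [
    PropertyConstraint("ex:name", "string", min_count=1),
    PropertyConstraint("ex:yearBuilt", "integer", min_count=1),
]

rows = list(csv.DictReader(io.StringIO(SOURCE)))
columns = list(rows[0].keys())

# First pass: automatic mapping misses ex:yearBuilt (column is "year_built").
first_mapping = suggest_mapping(columns, SHAPE)
first_violations = validate(transform(rows, first_mapping), SHAPE)

# Step 5: the minCount violations point at the gap; a hint refines the mapping.
refined_mapping = suggest_mapping(columns, SHAPE, hints={"ex:yearBuilt": "year_built"})
final_violations = validate(transform(rows, refined_mapping), SHAPE)
```

In this sketch the first pass leaves `ex:yearBuilt` unmapped, so validation reports a `minCount` violation per row. Those violations are the feedback that prompts the hint, after which the output conforms.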

Research questions

  • How much of a SHACL shape's constraints can be meaningfully used to drive transformation (vs. only validation)?
  • Can we automatically infer mappings from source data structure + target SHACL shapes?
  • What is the right level of automation vs. user guidance in this loop?
  • How does this interact with LDE's streaming pipeline architecture and backpressure?
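As one possible angle on the first question, some constraints could do more than reject values: a datatype could drive coercion, and a controlled vocabulary could drive normalisation. The sketch below only illustrates that idea in plain Python. `Constraint` and `coerce` are hypothetical names, and the two fields are loose stand-ins for `sh:datatype` and `sh:in` rather than a real SHACL model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Constraint:
    """Hypothetical stand-in for two SHACL constraint components."""
    datatype: str = "string"                   # stand-in for sh:datatype
    allowed: Optional[Tuple[str, ...]] = None  # stand-in for sh:in


def coerce(raw: str, constraint: Constraint):
    """Shape a raw source value to fit the constraint.

    Returns None when no conforming value can be produced, which a
    caller could surface as feedback instead of emitting invalid RDF.
    """
    value = raw.strip()
    if constraint.datatype == "integer":
        try:
            return int(value)
        except ValueError:
            return None
    if constraint.allowed is not None:
        # Case-insensitive lookup that returns the canonical spelling.
        by_folded = {term.casefold(): term for term in constraint.allowed}
        return by_folded.get(value.casefold())
    return value
```

For example, `coerce(" 1885 ", Constraint(datatype="integer"))` yields the integer `1885`, and `coerce("PAINTING", Constraint(allowed=("Painting", "Sculpture")))` yields `"Painting"`. Constraints such as cardinalities have no obvious per-value counterpart, so they would likely remain validation-only in this framing.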

Prior art

To our knowledge, no existing tool uses SHACL as a transformation specification in this way. Related work:

  • SPARQL Anything / Facade-X – handles the technical conversion but requires manual query writing
  • RML – declarative mappings but not SHACL-driven
  • SHACL validation – used post-hoc, not as input to transformation

Context

This idea emerged during preparation of the NLnet grant proposal for LDE. It represents a genuinely novel R&D direction that combines LDE's existing SHACL and SPARQL capabilities.
