gsoc26: Format Conversion Layer (Layer 2) #59

@DhanashreePetare

Feature Request

Describe the feature you'd like
Add Layer 2 (within-class format conversion) to the Databus Python Client download pipeline. Users should be able to convert downloaded RDF and tabular files to a different serialization format in a single command using a new --convert-format flag.

Why is this feature important?
Currently the client downloads files exactly as published. If a dataset is published in Turtle but the consuming application needs N-Triples, or is published as CSV but needed as TSV, the user must convert it manually after downloading. This feature eliminates that step and brings the Python client to feature parity with the Java Databus Client's Layer 2 implementation.

Describe alternatives you've considered
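The Java client uses Apache Jena and Spark for this layer. For the Python client, rdflib is already a project dependency and natively supports the required RDF formats, making it the natural choice without adding new dependencies. The tabular formats (csv, tsv) are outside rdflib's scope and could be handled with Python's standard-library csv module.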

Additional context
Equivalence classes and supported conversions:

| From Class | To Class | Formats |
| --- | --- | --- |
| RDF Triples | RDF Triples | ntriples, turtle, rdf-xml |
| RDF Quads | RDF Quads | nquads, trig, json-ld |
| Tabular | Tabular | csv, tsv |
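The within-class rule in the table could be enforced with a small lookup before any conversion runs. The sketch below uses only the standard library and also shows a CSV/TSV conversion. All names (`EQUIVALENCE_CLASSES`, `class_of`, `check_conversion`, `convert_tabular`) are hypothetical and chosen for illustration.

```python
import csv
import io

# Equivalence classes as listed in the table above.
EQUIVALENCE_CLASSES = {
    "rdf-triples": {"ntriples", "turtle", "rdf-xml"},
    "rdf-quads": {"nquads", "trig", "json-ld"},
    "tabular": {"csv", "tsv"},
}

DELIMITERS = {"csv": ",", "tsv": "\t"}


def class_of(fmt: str) -> str:
    """Return the equivalence class a format belongs to."""
    for name, formats in EQUIVALENCE_CLASSES.items():
        if fmt in formats:
            return name
    raise ValueError(f"unsupported format: {fmt}")


def check_conversion(source: str, target: str) -> str:
    """Allow a conversion only when both formats share one class."""
    src_class, tgt_class = class_of(source), class_of(target)
    if src_class != tgt_class:
        raise ValueError(
            f"cannot convert {source} ({src_class}) to {target} ({tgt_class})"
        )
    return src_class


def convert_tabular(text: str, source: str, target: str) -> str:
    """Convert between csv and tsv by re-writing rows with a new delimiter."""
    if check_conversion(source, target) != "tabular":
        raise ValueError("convert_tabular only handles csv and tsv")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=DELIMITERS[source])
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=DELIMITERS[target], lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(convert_tabular('a,b\n1,"x,y"\n', "csv", "tsv"))
```

Going through the csv module rather than a plain string replace keeps quoted fields intact, so a comma inside a quoted CSV cell survives the switch to tab delimiters.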

Labels: enhancement