feat: expose JSON reader via registerJson and readJson #35

@andygrove

Is your feature request related to a problem or challenge?

DataFusion 53.1 has built-in support for newline-delimited JSON
(SessionContext::read_json / register_json), but the Java binding
currently only exposes Parquet (#18) and CSV (#21). Users wanting to
query JSON files have to fall back to CREATE EXTERNAL TABLE via SQL,
which loses the typed-options ergonomics the Parquet/CSV bindings
already provide.

Describe the solution you'd like

Mirror the existing reader pattern:

  • Add an NdJsonReadOptions value class analogous to
    ParquetReadOptions / CsvReadOptions (file extension, schema,
    schema-infer-max-records, compression, etc.).
  • Add a proto/json_read_options.proto and pass options through the
    established proto-over-JNI convention (see #29, "refactor: pass csv and parquet read options via protobuf").
  • Expose SessionContext.registerJson(name, path[, options]) and
    readJson(path[, options]).
  • Cover with tests in the spirit of SessionContextCsvTest /
    ParquetReadOptionsTest.
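
To make the proposed shape concrete, here is a minimal sketch of what an NdJsonReadOptions value class with a builder could look like. Everything in it is an assumption for discussion: the method names, the Compression enum, and the defaults (".json", 1000 records, uncompressed) are placeholders that are meant to follow the Rust-side defaults and should be checked against the native NdJsonReadOptions. The schema option is left out because it depends on how the existing Parquet/CSV options represent schemas.

```java
import java.util.Objects;

/** Hypothetical sketch of the proposed options class; not the binding's actual API. */
final class NdJsonReadOptions {

    /** Illustrative compression choices; the real set would follow the native side. */
    enum Compression { UNCOMPRESSED, GZIP, BZIP2, XZ, ZSTD }

    private final String fileExtension;
    private final int schemaInferMaxRecords;
    private final Compression compression;

    private NdJsonReadOptions(Builder b) {
        this.fileExtension = b.fileExtension;
        this.schemaInferMaxRecords = b.schemaInferMaxRecords;
        this.compression = b.compression;
    }

    static Builder builder() {
        return new Builder();
    }

    String fileExtension() { return fileExtension; }
    int schemaInferMaxRecords() { return schemaInferMaxRecords; }
    Compression compression() { return compression; }

    static final class Builder {
        // Assumed defaults; verify against the Rust NdJsonReadOptions before relying on them.
        private String fileExtension = ".json";
        private int schemaInferMaxRecords = 1000;
        private Compression compression = Compression.UNCOMPRESSED;

        Builder fileExtension(String fileExtension) {
            this.fileExtension = Objects.requireNonNull(fileExtension, "fileExtension");
            return this;
        }

        Builder schemaInferMaxRecords(int maxRecords) {
            if (maxRecords < 0) {
                throw new IllegalArgumentException("schemaInferMaxRecords must be >= 0");
            }
            this.schemaInferMaxRecords = maxRecords;
            return this;
        }

        Builder compression(Compression compression) {
            this.compression = Objects.requireNonNull(compression, "compression");
            return this;
        }

        NdJsonReadOptions build() {
            return new NdJsonReadOptions(this);
        }
    }
}
```

With something like this in place, a call such as registerJson("events", path, NdJsonReadOptions.builder().compression(NdJsonReadOptions.Compression.GZIP).build()) would read the same way as the existing Parquet/CSV entry points. The immutable-value-plus-builder split keeps the object easy to serialize into the proto message before crossing JNI.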

Describe alternatives you've considered

Users can issue CREATE EXTERNAL TABLE … STORED AS JSON via
SessionContext.sql. This works but bypasses the typed builder pattern
and gives a less discoverable Java API.
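
For comparison, the SQL workaround amounts to assembling a statement string by hand. The helper below is a hypothetical illustration (the class and method names are not part of the binding); it only builds the text that would then be passed to SessionContext.sql.

```java
/** Hypothetical helper showing the SQL fallback; not part of the binding. */
final class JsonExternalTableSql {

    private JsonExternalTableSql() {}

    /**
     * Builds a CREATE EXTERNAL TABLE statement for a JSON file.
     * The table name is inserted as-is, so callers must pass a valid identifier.
     */
    static String createExternalTable(String tableName, String path) {
        // Escape single quotes in the path literal by doubling them.
        String escapedPath = path.replace("'", "''");
        return "CREATE EXTERNAL TABLE " + tableName
                + " STORED AS JSON LOCATION '" + escapedPath + "'";
    }
}
```

Options such as compression or schema inference limits would have to be appended to this string as well, which is where the lack of typed, discoverable options shows.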

Additional context

DataFusion's JSON reader is in the default feature set, so no Cargo
feature flag changes are required on the native side.

Labels: enhancement (New feature or request)
