Streaming CSV/JSON Object Store Read #2935

@tustvold

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Currently CsvOpener and JsonOpener call GetResult::bytes, which downloads the entire file before feeding it to the appropriate arrow reader.

This is not ideal: the whole file has to be buffered in memory before any decoding can begin.

Following on from #2677, we now support streaming responses from object storage.

Describe the solution you'd like

The underlying challenge is to take an arbitrary Stream<Bytes> and convert it into a Stream<Bytes> in which each element contains only complete rows, as delimited by a newline character. Once we have this DelimitedStream, it is trivial to feed each of these byte chunks individually into the corresponding decoder.
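
To make the re-chunking idea concrete, here is a minimal sketch, not the proposed implementation. It is deliberately synchronous and uses only the standard library: an `Iterator<Item = Vec<u8>>` stands in for `Stream<Bytes>`, and the name `DelimitedChunks` is hypothetical. It also assumes every newline byte ends a row, so it does not handle newlines inside quoted CSV fields.

```rust
/// Hypothetical adapter: re-chunks arbitrary byte chunks so that every
/// yielded chunk ends on a newline boundary. The final chunk may lack a
/// trailing newline if the input does not end with one.
pub struct DelimitedChunks<I> {
    input: I,
    /// Bytes of a trailing partial row carried over from earlier chunks.
    remainder: Vec<u8>,
    done: bool,
}

impl<I: Iterator<Item = Vec<u8>>> DelimitedChunks<I> {
    pub fn new(input: I) -> Self {
        Self { input, remainder: Vec::new(), done: false }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for DelimitedChunks<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        if self.done {
            return None;
        }
        loop {
            match self.input.next() {
                Some(chunk) => {
                    self.remainder.extend_from_slice(&chunk);
                    // Split after the last newline seen so far: everything
                    // before it is complete rows, the tail is a partial row.
                    if let Some(pos) = self.remainder.iter().rposition(|b| *b == b'\n') {
                        let tail = self.remainder.split_off(pos + 1);
                        let complete = std::mem::replace(&mut self.remainder, tail);
                        return Some(complete);
                    }
                    // No newline yet: keep buffering and pull the next chunk.
                }
                None => {
                    // Input exhausted: flush whatever partial row remains.
                    self.done = true;
                    if self.remainder.is_empty() {
                        return None;
                    }
                    return Some(std::mem::take(&mut self.remainder));
                }
            }
        }
    }
}
```

Wrapping the incoming chunks with `DelimitedChunks::new(chunks)` yields buffers that can each be handed to a decoder on their own. A streaming version would apply the same buffering logic inside a `Stream` adapter rather than an `Iterator`.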

Describe alternatives you've considered

We could not do this.
