Closed
Labels
enhancement, good first issue, help wanted
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently CsvOpener and JsonOpener call GetResult::bytes, which downloads the entire file before feeding it to the appropriate arrow reader.
This is not ideal:
- Adds decode latency, as the full payload must be buffered before decoding can start
- May read more data than necessary (see Support CSV Limit Pushdown to Object Storage #2930)
Following on from #2677, we now support streaming responses from object storage.
Describe the solution you'd like
The underlying challenge is to take an arbitrary Stream<Bytes> and convert it into a Stream<Bytes> in which each element contains only complete rows, as delimited by a newline character. Once we have this DelimitedStream, it is trivial to feed each of these byte chunks individually into the corresponding decoder.
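To make the re-chunking idea concrete, here is a minimal sketch of the core logic using only the standard library. It is not the proposed implementation: the names LineDelimiter, push, finish and delimit are hypothetical, it works on plain Vec<u8> chunks from an iterator rather than an async Stream<Bytes>, and it treats every newline byte as a row boundary (so a newline inside a quoted CSV field would be split incorrectly).

```rust
/// Hypothetical helper (not an existing DataFusion type): re-chunks arbitrary
/// byte chunks so that every emitted chunk ends on a newline boundary.
#[derive(Default)]
struct LineDelimiter {
    /// Bytes received after the last newline, i.e. an incomplete trailing row.
    remainder: Vec<u8>,
}

impl LineDelimiter {
    /// Feed one input chunk. Returns the complete rows that are now
    /// available, or None if no newline has been seen yet.
    fn push(&mut self, chunk: &[u8]) -> Option<Vec<u8>> {
        // Find the LAST newline in the chunk; everything up to and
        // including it can be emitted together with the buffered remainder.
        match chunk.iter().rposition(|&b| b == b'\n') {
            Some(idx) => {
                let mut out = std::mem::take(&mut self.remainder);
                out.extend_from_slice(&chunk[..=idx]);
                // Keep the partial row that follows the last newline.
                self.remainder.extend_from_slice(&chunk[idx + 1..]);
                Some(out)
            }
            None => {
                // No row boundary in this chunk: keep buffering.
                self.remainder.extend_from_slice(chunk);
                None
            }
        }
    }

    /// Flush whatever is left at end of input (a final row that may lack
    /// a trailing newline).
    fn finish(&mut self) -> Option<Vec<u8>> {
        if self.remainder.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.remainder))
        }
    }
}

/// Synchronous stand-in for the DelimitedStream adapter: maps an iterator
/// of arbitrary chunks to a list of chunks containing only complete rows.
fn delimit<I>(chunks: I) -> Vec<Vec<u8>>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let mut delimiter = LineDelimiter::default();
    let mut out = Vec::new();
    for chunk in chunks {
        if let Some(rows) = delimiter.push(&chunk) {
            out.push(rows);
        }
    }
    if let Some(rest) = delimiter.finish() {
        out.push(rest);
    }
    out
}
```

For example, feeding the chunks "a,b\nc,", "d\ne" and ",f" through delimit yields "a,b\n", "c,d\n" and "e,f". Only the partial row after the last newline is ever buffered, so memory use is bounded by the chunk size plus one row rather than by the size of the file.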
Describe alternatives you've considered
We could choose not to do this.