Support JSON Lines in InfluxDB Edge #24654

Open
mgattozzi opened this issue Feb 8, 2024 · 0 comments

mgattozzi commented Feb 8, 2024

As part of #24616 we added JSON support. However, that implementation has to buffer all of the data in memory before responding, so a sufficiently large query could get the process OOM-killed. We should also offer JSON Lines, where each line is one row of the query result. That would let us stream the data to the user from a SendableRecordBatchStream as we receive it and avoid buffering too much data in memory.

Steps:

  1. Create a JsonLinesFormatter in the influxdb3_server/src/http.rs file that uses the JsonFormat trait to make the output follow JSON Lines, e.g. instead of the JSON:
[ { "foo": "bar" }, { "foo": "baz" } ]

it should be output as:

{ "foo": "bar" }
{ "foo": "baz" }

where each record is on its own line, without the enclosing [].

This should only require writing a '\n' char into the writer in the end_row function; the rest of the trait methods should just need to be stubbed out with Ok(()).
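
As a rough illustration of step 1, here is a minimal, std-only sketch. The `JsonFormat` trait below is a hypothetical local stand-in with assumed method names (`start_stream`, `start_row`, `end_row`, `end_stream`); it is not the real trait used in http.rs, whose exact signatures may differ. `write_rows` is likewise an illustrative helper that plays the role of the Writer, and rows are passed in as already-serialized JSON strings.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the `JsonFormat` trait named in the issue.
/// Method names and signatures here are assumptions for illustration.
trait JsonFormat {
    fn start_stream<W: Write>(&self, w: &mut W) -> io::Result<()>;
    fn start_row<W: Write>(&self, w: &mut W, is_first_row: bool) -> io::Result<()>;
    fn end_row<W: Write>(&self, w: &mut W) -> io::Result<()>;
    fn end_stream<W: Write>(&self, w: &mut W) -> io::Result<()>;
}

/// Emits one JSON object per line, with no enclosing `[]` and no commas.
struct JsonLinesFormatter;

impl JsonFormat for JsonLinesFormatter {
    // Nothing to emit before the first row: no opening `[`.
    fn start_stream<W: Write>(&self, _w: &mut W) -> io::Result<()> {
        Ok(())
    }

    // Nothing to emit between rows: no `,` separator.
    fn start_row<W: Write>(&self, _w: &mut W, _is_first_row: bool) -> io::Result<()> {
        Ok(())
    }

    // The only real work: terminate each row with a newline.
    fn end_row<W: Write>(&self, w: &mut W) -> io::Result<()> {
        w.write_all(b"\n")
    }

    // Nothing to emit after the last row: no closing `]`.
    fn end_stream<W: Write>(&self, _w: &mut W) -> io::Result<()> {
        Ok(())
    }
}

/// Illustrative driver: writes pre-serialized JSON rows through a formatter.
fn write_rows<F: JsonFormat, W: Write>(fmt: &F, w: &mut W, rows: &[&str]) -> io::Result<()> {
    fmt.start_stream(w)?;
    for (i, row) in rows.iter().enumerate() {
        fmt.start_row(w, i == 0)?;
        w.write_all(row.as_bytes())?;
        fmt.end_row(w)?;
    }
    fmt.end_stream(w)
}
```

Calling `write_rows(&JsonLinesFormatter, &mut buf, &rows)` with the two example objects above would leave `buf` holding each object followed by a newline, which is the JSON Lines shape described in this step.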

  2. Using the Writer and the JsonLinesFormatter, write the data into a streaming body in a separate task, while returning the Response for the API from the query routes.
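
The shape of step 2 can be sketched with the standard library alone. This is only an analogy under stated assumptions: a `std::thread` stands in for the async task, a bounded `sync_channel` stands in for the streaming response body, and `Vec<Vec<String>>` stands in for the record batches pulled from a SendableRecordBatchStream. The function name `stream_json_lines` is hypothetical; the real change would presumably use the server's async runtime and HTTP body type instead.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Spawns a producer that turns each batch of pre-serialized JSON rows into
/// one newline-delimited chunk, and immediately returns the receiving end.
/// The caller can hand the receiver back (as the "response body") before
/// any data has been produced.
fn stream_json_lines(batches: Vec<Vec<String>>) -> Receiver<Vec<u8>> {
    // Bounded channel: at most 2 chunks are buffered, so a slow consumer
    // applies backpressure instead of letting memory grow without limit.
    let (tx, rx) = sync_channel::<Vec<u8>>(2);

    thread::spawn(move || {
        for batch in batches {
            let mut chunk = Vec::new();
            for row in batch {
                chunk.extend_from_slice(row.as_bytes());
                chunk.push(b'\n'); // JSON Lines: one row per line
            }
            // If the consumer went away (client disconnected), stop early.
            if tx.send(chunk).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the stream.
    });

    rx
}
```

Because only a couple of chunks are ever in flight, peak memory is bounded by the batch size rather than by the size of the whole query result, which is the point of the request.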