Description
Problem
Current JMESPath implementations typically load entire datasets into memory, which is inefficient and often impossible for large data streams (e.g., logs, sensor data, large API responses). This leads to out-of-memory errors and performance bottlenecks in high-volume or resource-constrained environments.
Proposed Solution
Introduce a streaming JMESPath evaluation capability, for example jmespath.searchStream(expression, largeDataStream), where largeDataStream is a ReadableStream. The data would be processed incrementally and results emitted as they are evaluated, without buffering the entire dataset.
This approach supports:
- Node.js environments, using Node's stream.Readable or stream/web.ReadableStream.
- Browser environments, using the Web Streams API (e.g., Response.body from fetch, File.stream()), which requires a streaming JSON parser.
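To make the idea concrete, here is a minimal Node.js sketch of incremental evaluation. It rests on assumptions that are not part of the proposal: the input is newline-delimited JSON (one record per line) rather than a single large document, and the function name searchNdjsonStream and its evaluate callback are hypothetical. A single large JSON document would still need a true streaming JSON parser.

```javascript
const { Readable, Transform } = require('node:stream');
const { StringDecoder } = require('node:string_decoder');

// Hypothetical helper (not an existing jmespath API): applies `evaluate` to
// each record of a newline-delimited JSON stream and emits non-null results.
function searchNdjsonStream(evaluate, source) {
  const decoder = new StringDecoder('utf8'); // handles multi-byte chars split across chunks
  let buffered = ''; // holds only the current incomplete line, never the whole input

  // Parse one line, evaluate it, and emit the result if there is one.
  const emit = (stream, line) => {
    if (line.trim() === '') return;
    const result = evaluate(JSON.parse(line));
    // In object mode, pushing null would end the stream, so null results are skipped.
    if (result !== null && result !== undefined) stream.push(result);
  };

  const evaluator = new Transform({
    readableObjectMode: true,
    transform(chunk, _encoding, callback) {
      buffered += decoder.write(chunk);
      const lines = buffered.split('\n');
      buffered = lines.pop(); // the last piece may be a partial line
      try {
        for (const line of lines) emit(this, line);
        callback();
      } catch (err) {
        callback(err); // malformed JSON or a failing expression
      }
    },
    flush(callback) {
      buffered += decoder.end();
      try {
        emit(this, buffered); // final record without a trailing newline
        callback();
      } catch (err) {
        callback(err);
      }
    },
  });

  // Note: pipe() does not forward errors raised by `source` itself.
  return source.pipe(evaluator);
}
```

With the jmespath npm package, the callback could be (record) => jmespath.search(record, expression). The returned stream supports on('data') and on('end') as well as for await iteration, and memory use is bounded by the longest single line rather than by the total input size.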
Example Usage
// Proposed API (not yet implemented), shown with Node-style stream events.
const stream = jmespath.searchStream(expression, largeDataReadableStream);
stream.on('data', (chunk) => {
  // Process results incrementally
  console.log(chunk);
});
stream.on('end', () => {
  console.log('Stream finished.');
});
Benefits
- Memory Efficiency: Drastically reduce memory footprint for large datasets.
- Improved Performance: Process data as it arrives, enabling real-time or near real-time analysis.
- Scalability: Better support for big data pipelines, log analysis, and API gateways in cybersecurity and other domains.