perf: optimize memory usage in denoised-rows query by conditionally including processedRows #1177
Conversation
🦋 Changeset detected. Latest commit: a0930d2. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
Stably Runner - Test Suite - 'Smoke Test': 🔴 Failure (1/4 tests failed) [dashboard]
  }
  return undefined;
},
gcTime: isLive ? ms('30s') : ms('5m'), // more aggressive gc for live data, since it can end up holding lots of data
Oh, this is neat. Theoretically we don't need to keep the old page if live tail is enabled.
denoiseResults,
// Only include processed rows if denoising is enabled
// This helps prevent the queryKey from getting extremely large
// and causing memory issues, when it's not used.
...(denoiseResults ? [processedRows] : []),
I’m scratching my head as to why this is relevant. Even if denoise is disabled, react-query still tries to cache the key that blows up the memory?
Yeah, exactly!
You'd think the enabled flag being false would turn all this off. I can file a bug in the tanstack query library when I get back from vacation (and see if one already exists) if we want to help provide feedback.
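As context for the surprise here, a minimal sketch of the behavior being described. The hook shape and names are illustrative, not the app's actual code; the point is that TanStack Query still hashes the full key (with a JSON.stringify-based default) to locate the cache entry even when `enabled` is false:

```ts
import { useQuery } from '@tanstack/react-query';

type Row = Record<string, unknown>;

// Hypothetical hook, for illustration only.
function useDenoisedRows(processedRows: Row[], denoiseResults: boolean) {
  return useQuery({
    // The whole key, including the full processedRows array, is serialized by
    // the default queryKeyHashFn (a JSON.stringify-based hash) so the cache
    // entry can be located, even when the query is disabled.
    queryKey: ['denoised-rows', denoiseResults, processedRows],
    queryFn: async () => processedRows, // placeholder for the real denoising work
    // enabled: false stops the fetch, but not the key hashing or caching.
    enabled: denoiseResults,
  });
}
```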
denoiseResults,
// Only include processed rows if denoising is enabled
// This helps prevent the queryKey from getting extremely large
// and causing memory issues, when it's not used.
...(denoiseResults ? [processedRows] : []),
perf nit: Instead of passing all the rows, one idea is to generate an ID for processedRows. For example, we can take a fixed step to sample the rows and then compute a hash from those samples
Ah, good idea. I spent a large portion of time messing around with their queryKeyHashFn, but I couldn't get a clean & fast solution on the full processedRows dataset. If we're sampling the results and generating a hash from those, then I can definitely improve the perf on that front. This should reduce memory when denoising is enabled.
Let me take a look at this next week when I get back from vacation!
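For illustration, a rough sketch of that sampled-hash idea. The step size, hash function, and helper name are assumptions for the sketch, not something this PR implements:

```ts
type Row = Record<string, unknown>;

// Hypothetical helper: sample every Nth row plus the total length, then fold
// the samples into a short synchronous hash that can stand in for the full
// processedRows array inside the queryKey.
function sampleRowsId(rows: Row[], step = 500): string {
  const samples: string[] = [];
  for (let i = 0; i < rows.length; i += step) {
    samples.push(JSON.stringify(rows[i]));
  }
  // djb2-style string hash; kept synchronous because useQuery's
  // queryKeyHashFn can't await crypto.subtle.digest's promise.
  const input = `${rows.length}:${samples.join('|')}`;
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

// Usage in the key instead of the raw array:
// queryKey: ['denoised-rows', denoiseResults, sampleRowsId(processedRows)]
```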
I don't think we need to go too deep optimizing this path - my preference is that if scaling issues keep coming up for this feature, we re-evaluate pushing denoising down into the clickhouse query itself, as opposed to incremental improvements to the current implementation.
E2E Test Results: ✅ All tests passed • 20 passed • 3 skipped • 184s
When the search row limit is set very high (e.g. the max of 100k), the app quickly consumes all available memory and crashes.
This adds some improvements to help mitigate the problem:
- `queryKey` is generating a ton of extra entries every time `processedRows` changes (which is every 5s in live mode). The queryKey and result are cached regardless of whether `enabled` is true or false. The base hashFn strategy is to stringify the objects, which creates a very large string to be stored in memory. I tried to fix this by providing a custom `queryKeyHashFn` to `useQuery`, but it was too slow, and the faster browser-based hashing fns return a promise, which isn't supported by `useQuery` at this time. The easiest solution I found was to short-circuit the hash generation when we are not denoising.
- `gcTime` - We already set `gcTime` in `useOffsetPaginatedQuery`, so I added that field here too. This helps keep memory usage lower while denoising rows (but memory is still much higher).

The app still uses very high memory, just from the sheer number of rows being captured and processed, but it doesn't crash anymore. There are definitely further optimizations we could make to reduce this. One solution that comes to mind is storing a hash/unique id of each row server side before sending it to the client; then our app can leverage that key instead of a stringified object.
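Put together, the two mitigations combine roughly like this. This is a sketch only: the hook name, queryFn, and enabled flag are assumptions, while the conditional key entry and gcTime values mirror the diff above:

```ts
import { useQuery } from '@tanstack/react-query';
import ms from 'ms';

type Row = Record<string, unknown>;

// Hypothetical hook for illustration; only the conditional key entry and the
// gcTime values come from the actual diff.
function useDenoisedRows(
  processedRows: Row[],
  denoiseResults: boolean,
  isLive: boolean,
) {
  return useQuery({
    queryKey: [
      'denoised-rows',
      denoiseResults,
      // Only include processed rows when denoising is enabled, so the default
      // stringify-based hash stays small when the feature is off.
      ...(denoiseResults ? [processedRows] : []),
    ],
    queryFn: async () => processedRows, // placeholder for the real denoising work
    enabled: denoiseResults,
    // more aggressive gc for live data, since it can end up holding lots of data
    gcTime: isLive ? ms('30s') : ms('5m'),
  });
}
```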
Before (after 1 min):

After (after 5 mins):

Fixes: HDX-2409