Prune charts data #73

Closed
josevalim opened this issue Apr 19, 2020 · 10 comments · Fixed by #135

Comments

@josevalim
Member

So someone reported that keeping the "Metrics" tab open for a long period of time ended up consuming 5GB of memory. This makes sense, given that we keep pushing data to the charts indefinitely. I am wondering if we should put an upper limit on the number of elements in each chart and start pruning old data once we go past N elements, for example, N = 10000.

Hi @leeoniya, if you don't mind, do you have any insights on what we can do here?

@leeoniya

leeoniya commented Apr 19, 2020

> Hi @leeoniya, if you don't mind, do you have any insights on what we can do here?

add more ram :D

i mean, you can't expect a permanently-open live view to show you a boundless amount of data, regardless of the tech you use; the user's browser cannot serve as a replacement for a server-side time-series database, after all. and an overview with 1e9 datapoints would be unwise to feed to the client or to a js chart - especially one that has to be continuously re-rendered.

you could potentially save some memory by using typed arrays for the input data, but that's just kicking the can down the road. re-plotting 1e6 datapoints on every event (which is what uPlot's .setData() does) is going to chew through your CPU. there are some webgl-based options [1] that can draw millions of datapoints, but i don't think that's the correct solution here.

i'm not sure i can give you any terribly insightful advice that isn't obvious.

  1. limit the live charts to a fixed view window, like past 1 hour. there probably won't be a universal time period for this because really you're trying to limit the amount of stored/rendered data, and servers under heavy load can generate millions of points in seconds or minutes. you can make it adaptive using some logic that makes sense to you. if you look at the streaming demo [2], you guys are doing option 2 for live charts, but i would recommend either 1 or 3.

  2. have a long-term overview chart that's separate and gets fed "all" the data at lower frequency and after it has been down-sampled on the server to however many pixels can be physically shown on the current screen (depends on the chart dims). e.g. https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z0000019YLKSA2&l=en-US
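
The down-sampling idea in option 2 could be sketched roughly as follows. This is only an illustration: the thread suggests doing this on the server, the min/max bucketing strategy is an assumption (one common way to preserve visible spikes), and `Point` and `downsampleMinMax` are hypothetical names, not part of uPlot or the dashboard.

```typescript
interface Point { x: number; y: number; }

// Hypothetical helper: reduce `points` to at most 2 * buckets entries by
// keeping the lowest and highest y of each run of consecutive points.
// `buckets` would typically be the chart width in pixels.
function downsampleMinMax(points: Point[], buckets: number): Point[] {
  if (buckets < 1) throw new RangeError("buckets must be >= 1");
  if (points.length <= 2 * buckets) return points; // already small enough
  const size = Math.ceil(points.length / buckets);
  const out: Point[] = [];
  for (let start = 0; start < points.length; start += size) {
    const end = Math.min(start + size, points.length);
    let min = points[start];
    let max = points[start];
    for (let i = start + 1; i < end; i++) {
      if (points[i].y < min.y) min = points[i];
      if (points[i].y > max.y) max = points[i];
    }
    // Emit the two extremes in their original x order.
    if (min === max) out.push(min);
    else if (min.x <= max.x) out.push(min, max);
    else out.push(max, min);
  }
  return out;
}
```

With this shape, the amount of data sent to the client depends on the chart width rather than on how long the system has been running.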

i have a half-baked demo of an overview & detail chart linked together [3], but it's all client-side. you would need to involve the server for it to solve this perf issue.

[1] https://github.com/leeoniya/uPlot#166650-point-bench-httpsleeoniyagithubiouplotbenchuplothtml
[2] https://leeoniya.github.io/uPlot/demos/stream-data.html
[3] https://leeoniya.github.io/uPlot/demos/zoom-ranger.html

@leeoniya

sorry, i think i stopped reading after the first sentence of this issue. ignore my re-explaining of the obvious things you already said.

time for coffee, obviously.

@josevalim
Member Author

Thanks @leeoniya. I think the simplest approach for now is to keep only the last N points or so. We will figure out the best value for N as we move forward. I will take a look at your examples and see if I can come up with something.
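
A minimal sketch of the "keep only the last N points" idea, assuming column-oriented data (one array of x values followed by one array per series, which is the layout uPlot accepts). The helper name `pushAndPrune` and the `maxPoints` parameter are hypothetical, not the dashboard's actual code.

```typescript
// Column-oriented chart data: data[0] holds x values (timestamps),
// data[1..] hold one y-series each. All columns share the same length.
type Columns = number[][];

// Hypothetical helper: append one sample (one value per column), then drop
// the oldest points so no column ever holds more than maxPoints entries.
function pushAndPrune(data: Columns, sample: number[], maxPoints: number): Columns {
  if (maxPoints < 1) throw new RangeError("maxPoints must be >= 1");
  return data.map((column, i) => {
    const next = column.concat(sample[i]);
    // slice(-maxPoints) keeps only the newest maxPoints entries.
    return next.length > maxPoints ? next.slice(-maxPoints) : next;
  });
}
```

The pruned columns could then be handed to the chart on each update (for uPlot, via `.setData()`), so memory use stays bounded by `maxPoints` no matter how long the tab stays open.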

@mcrumm
Member

mcrumm commented Apr 19, 2020

@josevalim We could consider a sliding time window, too. Rather than keep the last N points, we could drop points where z < now() - window_size_in_minutes.
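
A sliding time window along these lines might look like the sketch below. It assumes the first column holds timestamps in milliseconds in ascending order; `lowerBound` and `pruneToWindow` are hypothetical helper names.

```typescript
// Index of the first element in an ascending array that is >= target.
function lowerBound(sorted: number[], target: number): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Hypothetical helper: drop every point whose timestamp is < nowMs - windowMs.
// Assumes data[0] exists and holds ascending timestamps in milliseconds;
// the remaining columns hold one series each, aligned with data[0].
function pruneToWindow(data: number[][], nowMs: number, windowMs: number): number[][] {
  const start = lowerBound(data[0], nowMs - windowMs);
  return start === 0 ? data : data.map((column) => column.slice(start));
}
```

Because the timestamps are sorted, a binary search finds the cut-off index without scanning every point on each update.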

@josevalim
Member Author

@mcrumm my concern with a window size is that it won't bound the data enough for apps doing many req/s. So having an absolute limit works best. However, if the sliding window is easier, then let's do that.
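
To make the concern concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every request produces one chart point; the real dashboard may aggregate differently. `pointsInWindow` is a hypothetical helper.

```typescript
// Assumption for illustration: every request yields one chart point.
function pointsInWindow(requestsPerSecond: number, windowMinutes: number): number {
  return requestsPerSecond * 60 * windowMinutes;
}

const cap = 10000; // the example N from the opening comment
for (const rps of [1, 100, 10000]) {
  const points = pointsInWindow(rps, 60);
  console.log(`${rps} req/s over 60 min: ${points} points, over cap: ${points > cap}`);
}
```

Under that assumption, a one-hour window at 100 req/s already holds 360,000 points, which is why a time window alone does not bound memory the way an absolute cap does.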

@mcrumm
Member

mcrumm commented Apr 19, 2020

Good point. No, max values is likely simpler, so let's start there. 👍

@leeoniya

leeoniya commented Apr 23, 2020

@mcrumm

RE: https://twitter.com/mcrumm/status/1251191099512598528

i'll offer my $0.02. and please don't take this as me trying to keep uPlot in live_dashboard, but simply as my experience with benchmarking more libs than i ever thought i'd need to [1].

i have yet to encounter an SVG-based charting lib that comes anywhere close in performance to Canvas-based libs for large datasets and interactivity. i love SVG more than canvas and would be ecstatic to be proven wrong, but thus far the performance gap is simply too large. i'd be very curious to see how contex-charts performs relative to everything else i've tested, not just in static chart rendering but live streams, too.

another new lib in this space is https://github.com/Rich-Harris/pancake

i think that any server-side charting solution will have to reduce SVG datapoints drastically on the server or output a png rather than xml.

[1] https://github.com/leeoniya/uPlot#performance

@mcrumm
Member

mcrumm commented Apr 23, 2020

Thanks @leeoniya! I just wanted to mention some of the alternatives we considered, while we still had some Twitter hype going :)

I'm extremely happy with what we have so far, and once we set some limits in the dashboard to improve the experience for very large datasets, it will be even better.

Thanks for all your continued help!

@josevalim
Member Author

To be clear: this is an issue about pruning data on the client. Whatever we do in the server is a separate issue from this one.

@josevalim
Member Author

Closing in favor of PR.
