Use Websockets or SSE instead of polling #130

Open
pjlegato opened this Issue Apr 1, 2016 · 20 comments

Contributor

pjlegato commented Apr 1, 2016

Performance would be better, and there would be less impact on the server from monitoring itself, if the data were transmitted to the web client via websockets or SSE rather than by polling the server once a second.

Member

ktsaou commented Apr 1, 2016

You are right!

However, even with websockets, the main chart library used (dygraphs.com) is not capable of just shifting the raster and adding a single point. The only chart libraries I know that are capable of doing this are:

However, I find them a lot less flexible (and incomplete) compared to dygraphs.
This means the charts on the browser would have to be re-drawn.
So, no performance benefit for the browser.

Then, let's consider the communication between client and server: A new protocol (over websockets or SSE) would have to be created to handle the different states of the charts:

  • appending points in auto-refresh
  • zooming in/out
  • panning
  • window resizing (which might change the data aggregation at the server so a full refresh might be triggered)
  • scrolling the page (i.e. switching charts - again full refresh might be needed)
  • data transformation in browser JavaScript to build the dataset in the structure each chart library requires (today this is done in C, and it is simple because there is just one case: send all the visible data to the browser).

Consider also the state machine that would be needed at the netdata server to track what each web client wants to receive and to fire the proper events for updating it.
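
To make the size of that state machine concrete, here is a minimal sketch of what the messages and the per-client state might look like. Every name in it (`ClientMessage`, `ServerMessage`, `ClientState`, the `after`/`before`/`points` fields) is hypothetical and not taken from netdata; it assumes that zooming, panning, resizing and scrolling can all be reduced to "subscribe again with a different window".

```typescript
// Hypothetical protocol sketch; none of these names come from netdata.
type ChartId = string;

// Client -> server. Zooming, panning, resizing and scrolling all reduce to
// "subscribe again with a different window"; scrolling away is "unsubscribe".
type ClientMessage =
  | { kind: "subscribe"; chart: ChartId; after: number; before: number; points: number }
  | { kind: "unsubscribe"; chart: ChartId };

// Server -> client. A new window gets a full refresh; a live window gets appends.
type ServerMessage =
  | { kind: "full"; chart: ChartId; rows: Array<[number, number]> }
  | { kind: "append"; chart: ChartId; row: [number, number] };

interface Subscription {
  after: number;   // window start, unix seconds
  before: number;  // window end, unix seconds
  points: number;  // requested resolution (depends on chart width)
  live: boolean;   // true when the window ends at "now"
}

// Per-client state the server would have to keep.
class ClientState {
  private subs: { [chart: string]: Subscription } = {};

  // Applies one client message and returns how the server should answer it.
  handle(msg: ClientMessage, now: number): "full" | "none" {
    if (msg.kind === "unsubscribe") {
      delete this.subs[msg.chart];
      return "none";
    }
    this.subs[msg.chart] = {
      after: msg.after,
      before: msg.before,
      points: msg.points,
      live: msg.before >= now,
    };
    return "full";
  }

  // Builds the "append" messages for one collection tick: only charts whose
  // window is live and that have a new row get one.
  tick(latest: { [chart: string]: [number, number] }): ServerMessage[] {
    return Object.keys(this.subs)
      .filter((c) => this.subs[c].live && latest[c] !== undefined)
      .map((c): ServerMessage => ({ kind: "append", chart: c, row: latest[c] }));
  }
}
```

Even in this reduced form, the server has to remember a window per chart per client, which is the bookkeeping the polling design avoids.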

This is a huge project by itself!

So yes, I agree. The way it works now is not optimal.
Can we do it the optimal way? Of course!
Does it have any value to do it? I am not sure.
Let's see why:

If we do it, we will gain some CPU cycles at the netdata server, the web browser (if we also write our own chart library) and some bandwidth.
When are we going to benefit from them?
Just for the moments we let the charts auto-refresh, without scrolling the page and without panning or zooming the charts.

Time yourself. How much time do you spend in that condition? I timed myself and found that I don't spend that much time just sitting in front of the screen watching it without interacting with the charts.

Of course, such a solution would be perfect for a TV dashboard that is always showing the same charts in auto-refresh.

Contributor

pjlegato commented Apr 1, 2016

All excellent points. It does seem like a fairly large amount of work for what would be a relatively small gain in the most common cases.

Contributor

pjlegato commented Apr 1, 2016

D3.js or one of its derivatives would be ideal, but would require substantial reworking.

ibnesayeed commented Apr 2, 2016

However, even with websockets, the main chart library used (dygraphs.com) is not capable of just shifting the raster and adding a single point. The only chart libraries I know that are capable of doing this are:

@ktsaou, I was wondering whether you have considered vis.js, and whether it supports shifting and painting the raster incrementally. I find this library quite feature-rich.

Member

ktsaou commented Apr 2, 2016

I was wondering whether you have considered vis.js, and whether it supports shifting and painting the raster incrementally. I find this library quite feature-rich.

No I haven't. I can give it a try and integrate it (all integrated so far: http://netdata.firehol.org/dashboard.html).

@ktsaou ktsaou added the question label Apr 2, 2016

0xpr03 commented Apr 2, 2016

I timed myself and found that I don't spend that much time just sitting in front of the screen watching it without interacting with the charts.

Sorry for bursting in, but I have to disagree. I just let it run so I can see changes while I'm doing something on the system, which creates 40 MB log files from nginx. This matters especially when I'm observing a DDoS and how the system reacts.

Member

ktsaou commented Apr 2, 2016

Of course, that is a proper use of netdata.
Regarding the logs, I can add a configuration option to netdata to just log a line per connection (not every request), logging also the X-Forwarded-For set by nginx for this connection. This way you could disable the nginx log entirely. If you want it, open a github issue for it.

Keep in mind that this is what you would actually get if websockets or SSE were there...

0xpr03 commented Apr 2, 2016

@ktsaou My use case for the logs is to catch failed logins.

ibnesayeed commented Apr 3, 2016

No I haven't. I can give it a try and integrate it (all integrated so far: http://netdata.firehol.org/dashboard.html).

@ktsaou Interesting page. If I may ask, is there a way to know if they are painting the differences incrementally or overwriting the whole canvas on each change, without looking into their code to find out the rendering logic?

Member

ktsaou commented Apr 3, 2016

OK. You have password protection on nginx.
Your problem can also be solved when we add password protection on netdata, #120

Member

ktsaou commented Apr 3, 2016

If I may ask, is there a way to know if they are painting the changes incrementally or overwriting the whole canvas on each change, without looking into their code to find out the rendering logic

which one?

ibnesayeed commented Apr 3, 2016

which one?

vis.js

Member

ktsaou commented Apr 3, 2016

vis.js

In a few demos, I think they are scrolling the canvas. However, netdata requirements are tough. The charting library has to be really fast. If they manipulate data in javascript, they are not going to be fast. I saw a demo on vis.js with 50000 points. It took a few seconds on my PC. If that was processing time, it will not do for netdata. If it was downloading time, it might be ok. I don't know.

Then, there are issues with synchronizing mouse selection on multiple charts, legends, zooming, panning, etc. If you check the details on the dashboard I sent you, you will see that netdata calculates the time in ms each chart takes to be refreshed. Even in dygraphs, I had to disable almost all the animations and effects it supports to make it usable. With 2-3 charts on a page it played nicely. With 10 it was unusable. So I disabled everything, rewrote the legend entirely, etc., to speed it up!

ibnesayeed commented Apr 3, 2016

If they manipulate data in javascript, they are not going to be fast. I saw a demo on vis.js with 50000 points. It took a few seconds on my PC. If that was processing time, it will not do for netdata. If it was downloading time, it might be ok. I don't know.

I tried to profile their 50,000-point example, and it looks like they generate some dummy data using their generateData function and then call the loadDataIntoVis function once with the whole dataset. In my profiling, it took 953.7 ms for loadDataIntoVis (which internally called their DataSet.add method) to execute, load the data, and render it. This measures rendering 50,000 data points at once. However, I did not measure the performance of incremental changes. I can perhaps write a small function and run it in the developer console to append a batch of additional data one point at a time and collect the stats.
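
A minimal sketch of such a measuring function is shown below. It is library-agnostic: `append` is a hypothetical callback that wraps whatever call adds a single point to the chart under test, and `timeAppends` and `AppendStats` are names made up for this illustration. It uses `Date.now` by default, so a higher-resolution clock can be passed in when one is available. Note that it only times the synchronous part of each call; if a library defers painting to a later frame, that cost is not captured here.

```typescript
// Statistics collected over a run of single-point appends.
interface AppendStats {
  count: number;
  totalMs: number;
  meanMs: number;
  maxMs: number;
}

// Calls `append` once per point and records how long each call takes.
// `append` is a stand-in for the library call being measured (assumption);
// `now` is injectable so a finer clock can replace Date.now.
function timeAppends<T>(
  points: T[],
  append: (point: T) => void,
  now: () => number = Date.now
): AppendStats {
  let totalMs = 0;
  let maxMs = 0;
  for (const point of points) {
    const start = now();
    append(point);
    const elapsed = now() - start;
    totalMs += elapsed;
    if (elapsed > maxMs) maxMs = elapsed;
  }
  const count = points.length;
  return { count, totalMs, meanMs: count > 0 ? totalMs / count : 0, maxMs };
}
```

Comparing `meanMs` against the one-second refresh interval, multiplied by the number of charts on the page, would give a first idea of whether incremental appends are cheap enough.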

myroslav commented Apr 16, 2016

Server-Sent Events are usable when netdata is displaying live data, i.e. the right side of the graphs is "now()" and the zoom is 1:1 (when each and every datapoint is to be sent to the web client).

The JavaScript code would need to collect the set of graphs to be displayed or updated from that moment on (e.g. after a page scroll) and request some "feed" endpoint with the set of datasets to fetch and its superset to stream. Each of the sets would go into a separate SSE event, whose data would carry the missing data sequence.

After all of the data for all of the graphs has been written into a single HTTP response, that response would stay open. A second later, once the new data for all of the graphs being "streamed" has been collected, it would be sent as a series of events, one event per graph. Network-wise, that HTTP chunk would probably be gzipped into a single TCP packet, meaning minimal overhead for TCP/HTTP request processing. Additionally, each such "burst" of data can carry an id to be used on connection loss, so SSE reconnection would be quite efficient (the netdata API server would provide only the missing datapoints).

All in all, SSE appears to be the proper solution network-wise. However, I do not fully understand the graph repaint/update issue, which most probably is valid.
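
The burst-and-replay idea described above can be sketched as follows. The wire format (`id:`, `event:`, `data:` lines ending in a blank line) and the `Last-Event-ID` header sent by the browser on reconnect are part of the SSE standard; everything else here (`Burst`, `formatBurst`, `missedBursts`, using the chart name as the event name, a numeric burst id) is a hypothetical choice made for illustration, not netdata's API. A connection dropped in the middle of a burst is not handled in this sketch.

```typescript
// One collection tick: the newest [timestamp, value] row for each streamed chart.
interface Burst {
  id: number;
  charts: { [chart: string]: [number, number] };
}

// Serializes a burst as a series of SSE events, one event per chart.
// Every event carries the burst id, so the browser reports it back in the
// Last-Event-ID header when it reconnects.
function formatBurst(burst: Burst): string {
  let out = "";
  for (const chart of Object.keys(burst.charts)) {
    out += "id: " + burst.id + "\n";
    out += "event: " + chart + "\n";
    out += "data: " + JSON.stringify(burst.charts[chart]) + "\n\n";
  }
  return out;
}

// Picks the bursts a reconnecting client has not seen yet.
// `history` is assumed to be the recent bursts the server still retains.
function missedBursts(history: Burst[], lastEventId: string | null): Burst[] {
  const last = lastEventId === null ? NaN : Number(lastEventId);
  if (isNaN(last)) return history.slice(); // unknown client: send everything retained
  return history.filter((b) => b.id > last);
}
```

On the browser side, a standard `EventSource` would receive these events and resend the last id automatically after a connection loss.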

@ktsaou ktsaou added enhancement and removed question labels Apr 19, 2016

Member

ktsaou commented Apr 19, 2016

I switched this to enhancement, to have a look at it at some point...

zet4 commented May 5, 2016

I would also like to comment on using websockets in combination with services like CloudFlare that simply don't allow websockets; this would require some form of fallback.

Member

ktsaou commented May 5, 2016

I think the first step is to allow incremental updates of the chart data. This will save a lot of bandwidth in cases where the dashboard is left to auto-refresh the charts.

Once we have this, web sockets could also be implemented.
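
As a rough illustration of what an incremental update could look like on the browser side, the sketch below merges newly received rows into the rows a chart already holds and keeps the window at a fixed size. `Row` and `mergeIncremental` are hypothetical names, and the code assumes both arrays are sorted by timestamp; how the new rows are requested from the server is left open.

```typescript
// One data row: [timestamp, value].
type Row = [number, number];

// Appends rows newer than the last one already held, then trims the oldest
// rows so the chart keeps at most `maxPoints` rows.
// Assumes `current` and `incoming` are each sorted by timestamp.
function mergeIncremental(current: Row[], incoming: Row[], maxPoints: number): Row[] {
  const lastTs = current.length > 0 ? current[current.length - 1][0] : -Infinity;
  const fresh = incoming.filter((row) => row[0] > lastTs); // drop rows already held
  const merged = current.concat(fresh);
  return merged.length > maxPoints ? merged.slice(merged.length - maxPoints) : merged;
}
```

With this in place, only the rows after the last known timestamp need to travel over the wire on each refresh, whether they arrive by polling, SSE, or websockets.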

Contributor

pjlegato commented May 5, 2016

@ZetaHunter Oddly enough, CloudFlare just launched WebSockets for all users today: https://blog.cloudflare.com/everybody-gets-websockets/

zet4 commented May 5, 2016

You've got to be kidding. Well, thanks for letting me know. 👍
