Feature: fetch all widgets in one single comm message using the control channel #766

Merged

maartenbreddels
Member

Related to #764

The way we currently fetch widgets from the kernel is to request all comm IDs and then send an 'update' message to each comm to request the widget state. That means that for each widget we send 1 message (update) and receive 3 messages (busy, comm, idle). This leads to bad performance, since there is a per-message overhead in every layer (kernel, Jupyter server, browser). On top of that, we don't know whether the comms are actually widgets.

I think we should have a higher-level protocol that talks 'widget' instead of 'comm', through a single comm object. This will allow us to request the full widget state in 1 request and get it back as 1 reply (or several, if we want to).
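To make the per-message overhead concrete, here is a back-of-the-envelope sketch of the message counts described above. The function names are hypothetical, and the assumption that a control-channel request produces the same busy/reply/idle pattern as a per-widget update is an illustration, not something measured in this PR.

```python
def per_widget_messages(n_widgets: int) -> int:
    """Messages exchanged when every widget is updated through its own comm."""
    sent = 1        # one 'update' request per widget
    received = 3    # busy, comm (the state), idle
    return n_widgets * (sent + received)


def control_channel_messages(n_replies: int = 1) -> int:
    """Messages exchanged when a single comm returns all widget states.

    Assumption: one request, one busy and one idle status message, plus
    n_replies state messages (1 if everything fits in a single reply).
    """
    return 1 + 2 + n_replies


print(per_widget_messages(1000))    # 4000
print(control_channel_messages())   # 4
```

The point is only that the first count grows linearly with the number of widgets while the second stays constant.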

Performance results:
Fetching of 1000 Layout widgets from the kernel

  • Before: 2500 msec
  • After: 500 msec

That's a 5x speedup, and I think it also means much less load on the voila/jupyter_server server.

Note that this PR only downloads the widget data; it does not build the widgets yet.

@maartenbreddels maartenbreddels marked this pull request as draft November 19, 2020 11:47
maartenbreddels added a commit to maartenbreddels/voila that referenced this pull request Nov 19, 2020
This connects to the kernel as soon as possible and tries to create
widgets during execution. This avoids having to get all the widgets after
execution, and is an alternative to:
voila-dashboards#766

Issues:
 * Since the whole Jupyter message handling is 'synchronous' (it awaits
   at the top level and all the way down), get_model needs to be resolved
   without any new message coming from the kernel.
 * Despite making handle_comm_open not return a promise, the whole
   message queue still stalls.

Despite these issues, the loading time is significantly reduced, since
the page renders almost instantly when it succeeds (e.g. when no widgets
are missed).
@jtpio jtpio added the enhancement New feature or request label Jun 23, 2021
@martinRenou
Member

I am rebasing this and exploring it

@martinRenou
Member

Adding the widget model creation to the PR.

Performance results:
Fetching of 400 Button widgets from the kernel

  • Before: 4200 msec
  • After: 1000 msec

@martinRenou
Member

The model creation works now (no support for binary buffers yet), but passing all messages through one comm doesn't seem to work; I need to explore this further.

@maartenbreddels
Member Author

I'm afraid this is more difficult, since we may start listening after, say, 20% of the widgets have been created. If we then want to create a widget that depends on a previously created widget whose creation we missed, we enter a 'deadlock': we await the creation of the widget we depend on, and during that time we do not process any other message, since we 'await through the whole call stack' and the incoming websocket messages just accumulate. This means that we also cannot ask for the state of a previously created widget.
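The stall described above can be reproduced in miniature. The following is a minimal sketch in Python's asyncio rather than the front-end's JavaScript promises; the message shape (`id`, `depends_on`) and all function names are hypothetical, and the timeout exists only so the demonstration terminates instead of hanging.

```python
import asyncio


async def process_sequentially(messages, timeout=0.1):
    """Handle messages strictly one at a time, awaiting each handler fully."""
    loop = asyncio.get_running_loop()
    models = {}  # widget id -> future that resolves once the widget exists

    def get_model(widget_id):
        return models.setdefault(widget_id, loop.create_future())

    async def handle(msg):
        # A widget cannot be created before the widgets it depends on.
        for dep in msg["depends_on"]:
            await get_model(dep)
        fut = get_model(msg["id"])
        if not fut.done():
            fut.set_result(msg)

    handled = []
    for msg in messages:
        try:
            # Nothing else is processed while this handler is awaited.
            await asyncio.wait_for(handle(msg), timeout)
            handled.append(msg["id"])
        except asyncio.TimeoutError:
            return handled, f"stalled on {msg['id']}"
    return handled, "ok"


# We started listening too late and missed the creation of 'layout'.
missed = [
    {"id": "button", "depends_on": ["layout"]},
    {"id": "layout", "depends_on": []},  # would unblock 'button', never reached
]
print(asyncio.run(process_sequentially(missed)))
# ([], 'stalled on button')
```

The second message would resolve the dependency, but it sits in the queue behind the handler that is waiting for it.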

I hope my description makes sense; let me know if you want a video call as a higher-bandwidth channel.

@martinRenou
Member

I am not sure I understand your description 😅 Getting on a call could indeed help.

What I am exploring now is to use your jupyter.widget.control target only for the first message, and then fake a comm-open message for every widget so that all comms get created on the front-end. If I understand correctly, this should create the widget models.
This way the widgets should behave normally, I believe, and we would still use the "one comm = one widget" logic.

@martinRenou
Member

@maartenbreddels it seems to work with the following approach:

  • One "single shot" comm to fetch all models at once
  • Loop over all the received states and:
    • Create model
    • Create comm for this model
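The two steps above can be sketched as follows. This is only an illustration of the loop, written in Python rather than the front-end's own language; `create_model` and `create_comm` are hypothetical callables standing in for whatever the widget manager actually provides, and the shape of `states` (a mapping from model id to state) is an assumption.

```python
def build_models(states, create_model, create_comm):
    """Create one model, then one comm for it, per state received in bulk.

    `states` maps a model id to that widget's state, as returned by the
    single-shot request (assumed shape).
    """
    models, comms = {}, {}
    for model_id, state in states.items():
        model = create_model(model_id, state)
        # Re-establish the usual "one comm = one widget" relationship.
        comms[model_id] = create_comm(model_id, model)
        models[model_id] = model
    return models, comms


# Usage with stand-in factories:
states = {"a1": {"description": "OK"}, "b2": {"description": "Cancel"}}
models, comms = build_models(
    states,
    create_model=lambda model_id, state: {"id": model_id, **state},
    create_comm=lambda model_id, model: ("comm", model_id),
)
```

After the loop, every widget has its own comm again, so later updates flow through the normal per-widget path.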

Performance is still good:
Fetching of 400 Button widgets from the kernel

Before: 4400 msec
After: 800 msec

@maartenbreddels
Member Author

I am also afraid this will go over the 10 MB per websocket message limit. I guess this is still an issue.

@martinRenou
Member

Oh I didn't know this was a limitation. Don't websockets send messages in batches?

@martinRenou
Member

martinRenou commented Sep 24, 2021

@maartenbreddels I went a bit far testing this 😅 I created a grid of 1225 bqplot plots, each plot being a 1000-point line. The entire websocket message was 23_446_769 bytes, and it worked!

[Image: plots]

Time to fetch and create all models
Before this PR: 48110 msec
After this PR: 15118 msec
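For scale, the message above is a bit more than twice the 10 MB figure mentioned earlier. If a single reply ever did become too large, the "multiple replies" option from the PR description could look roughly like the sketch below. It is not what this PR implements; `split_states` is a hypothetical helper, and measuring size via JSON encoding is an assumption about how the payload is serialized.

```python
import json

LIMIT = 10 * 1024 * 1024      # 10 MiB, the per-message figure discussed above
ratio = 23_446_769 / LIMIT    # size of the test message relative to that limit


def split_states(states, max_bytes):
    """Group widget states into batches whose JSON encoding fits in max_bytes.

    Sizes are estimated per entry (a slight over-estimate), so each batch
    stays within the budget as long as every single entry fits on its own.
    """
    batches, current, current_size = [], {}, 2   # 2 bytes for the outer braces
    for widget_id, state in states.items():
        entry_size = len(json.dumps({widget_id: state}).encode("utf-8"))
        if current and current_size + entry_size > max_bytes:
            batches.append(current)
            current, current_size = {}, 2
        current[widget_id] = state
        current_size += entry_size
    if current:
        batches.append(current)
    return batches
```

Each batch could then be sent as its own reply, and the receiving side would merge them back into one mapping.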

@maartenbreddels
Member Author

Haha, cool!
I don't know what the status is now, but this used to be a problem: jupyter/notebook#3468

@martinRenou
Member

Oooh it's a tornado limitation, I see...

Well, the good thing is we can bypass this with c.NotebookApp.tornado_settings = {"websocket_max_message_size": x}. We could maybe document this in the Voila documentation.

catch {
  console.warn('Failed to open "jupyter.widget.control" comm channel, fallback to slow fetching of widgets.');
  // Fallback to the old implementation for old ipywidgets versions (<=7.6)
  return this._build_models_slow();
Member

If the ipywidgets backend does not implement the control channel, we fall back to the old implementation.

@martinRenou
Member

martinRenou commented Nov 10, 2021

This PR could technically be merged already, because it falls back to the previous implementation if the control channel is not responding. But let's wait for its sibling to be merged first: jupyter-widgets/ipywidgets#3021

@martinRenou martinRenou changed the title WIP feat: fetch all widgets in 1 go Feature: fetch all widgets in one single comm message using the control channel Nov 10, 2021
@github-actions
Contributor

github-actions bot commented Nov 10, 2021

Benchmark report

Execution times (in milliseconds) are grouped by test file, test type and browser.
For each case, the following values are computed: min <- [1st quartile - median - 3rd quartile] -> max.
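As a reading aid for that notation, the sketch below produces the same five-number layout from a list of timings. The benchmark tool's actual quartile method is not stated in the report, so the use of `statistics.quantiles` with `method="inclusive"` is an assumption, and the sample timings are made up.

```python
import statistics


def summarize(times_ms):
    """Format timings as: min <- [1st quartile - median - 3rd quartile] -> max."""
    q1, median, q3 = statistics.quantiles(times_ms, n=4, method="inclusive")
    return (
        f"{round(min(times_ms))} <- "
        f"[{round(q1)} - {round(median)} - {round(q3)}] -> "
        f"{round(max(times_ms))}"
    )


print(summarize([100, 110, 120, 130, 140]))
# 100 <- [110 - 120 - 130] -> 140
```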

Results table
Test type: Render, browser: chromium

| Test file | actual | expected |
|---|---|---|
| basics.ipynb | 3296 <- [3428 - 3615 - 4073] -> 4904 | 3379 <- [3442 - 3517 - 3701] -> 3876 |
| bqplot.ipynb | 3000 <- [3053 - 3104 - 3172] -> 3467 | 2976 <- [3227 - 3321 - 3421] -> 3604 |
| dashboard.ipynb | 3304 <- [3358 - 3463 - 3595] -> 3765 | 3608 <- [3623 - 3709 - 3793] -> 3825 |
| gridspecLayout.ipynb | 4215 <- [4299 - 4302 - 4371] -> 4442 | 4453 <- [4453 - 4523 - 4661] -> 4748 |
| interactive.ipynb | 2463 <- [2524 - 2592 - 2668] -> 2824 | 2559 <- [2655 - 2656 - 2660] -> 2674 |
| ipympl.ipynb | 5242 <- [5465 - 5694 - 5937] -> 6685 | 3982 <- [4079 - 4213 - 4356] -> 4743 |
| ipyvolume.ipynb | 11925 <- [11989 - 12062 - 13211] -> 13356 | 12183 <- [18509 - 19553 - 20811] -> 21515 |
| multiple_widgets.ipynb | 14157 <- [14201 - 14227 - 14273] -> 14325 | 15319 <- [15660 - 15796 - 15912] -> 16056 |
| query-strings.ipynb | 1796 <- [1822 - 1836 - 1845] -> 1967 | 1517 <- [1920 - 1997 - 2103] -> 2113 |
| reveal.ipynb | 2933 <- [2967 - 3067 - 3293] -> 3631 | |

❗ Test metadata have changed
--- /dev/fd/63	2021-12-21 15:54:33.209823771 +0000
+++ /dev/fd/62	2021-12-21 15:54:33.209823771 +0000
@@ -4,37 +4,37 @@
     "BENCHMARK_REFERENCE": "actual"
   },
   "browsers": {
-    "chromium": "97.0.4666.0"
+    "chromium": "94.0.4595.0"
   },
   "systemInformation": {
     "cpu": {
-      "brand": "Xeon® E5-2673 v4",
+      "brand": "Xeon® E5-2673 v3",
       "cache": {
         "l1d": 65536,
         "l1i": 65536,
         "l2": 524288,
-        "l3": 52428800
+        "l3": 31457280
       },
       "cores": 2,
       "family": "6",
-      "flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear",
+      "flags": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear",
       "governor": "",
       "manufacturer": "Intel®",
-      "model": "79",
+      "model": "63",
       "physicalCores": 2,
       "processors": 1,
       "revision": "",
       "socket": "",
-      "speed": 2.3,
+      "speed": 2.4,
       "speedMax": null,
       "speedMin": null,
-      "stepping": "1",
+      "stepping": "2",
       "vendor": "GenuineIntel",
       "virtualization": false,
       "voltage": ""
     },
     "mem": {
-      "total": 7289614336
+      "total": 7291699200
     },
     "osInfo": {
       "arch": "x64",
@@ -42,11 +42,11 @@
       "codename": "Focal Fossa",
       "codepage": "UTF-8",
       "distro": "Ubuntu",
-      "kernel": "5.11.0-1022-azure",
+      "kernel": "5.8.0-1040-azure",
       "logofile": "ubuntu",
       "platform": "linux",
       "release": "20.04.3 LTS",
-      "serial": "4a9fc102c841404db9b929f440d77193",
+      "serial": "cfc067bfcb844f35865e279a1b0e66c5",
       "servicepack": "",
       "uefi": false
     }

@martinRenou martinRenou marked this pull request as ready for review November 18, 2021 09:10
@martinRenou
Member

Marking this PR as ready to review.

This can be merged and released before ipywidgets supports the control channel in an official release. This way it will work as soon as ipywidgets 8 or 7.7 is out.

@martinRenou martinRenou force-pushed the refactor_widget_fetching branch 2 times, most recently from 781002b to bce3284 Compare December 20, 2021 15:32
@martinRenou
Member

Thanks for your review!

@martinRenou
Member

The ui-tests are failing but I guess it's only due to the ipympl update: #1048

@martinRenou
Member

Rebased

@trungleduc trungleduc merged commit d3d299c into voila-dashboards:main Dec 22, 2021
@martinRenou
Member

Thanks!!

@maartenbreddels
Member Author

Btw, the reason we don't hit this tornado limit is that it only applies to incoming traffic, so that clients cannot pump the server full of terabytes of data that it is forced to keep in memory. Downloading has no limit, since there the server is more in control.
