Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Layout engine much slower on Chrome #9515

Closed
philippjfr opened this issue Dec 8, 2019 · 27 comments · Fixed by #10113
Closed

[BUG] Layout engine much slower on Chrome #9515

philippjfr opened this issue Dec 8, 2019 · 27 comments · Fixed by #10113

Comments

@philippjfr
Copy link
Contributor

philippjfr commented Dec 8, 2019

When working with complex layouts the layout engine sometimes takes a considerable time to rerender because it has to measure the size of various components. For a long time I thought this was a general issue with the layout engine but it was just pointed out to me that this is much worse in Chrome.

In the contrived example below we have a reasonably complex layout of rows and columns and keep appending new rows with some text and plots.

from bokeh.layouts import Row, Column, Spacer
from bokeh.models.widgets import Div, Slider, Button
from bokeh.plotting import figure
from bokeh.io import curdoc

title = Div(text='<h1>Page title</h1>')

header = Row(children=[title, Spacer(sizing_mode='stretch_width'),
                       Div(text="Some link")],
             height=60, sizing_mode="stretch_width")

main = Column(sizing_mode='stretch_width')

def on_click(event):
    p = figure(sizing_mode='stretch_width')
    p.line([1, 2, 3], [1, 2, 3])
    row = Row(children=[Div(text="Some text") for i in range(10)]+[p],
              sizing_mode='stretch_width')
    main.children.append(row)

widgets = [Slider(start=0, end=10, title='Slider %d' % i) for i in range(10)]
button = Button(label='Click me')
button.on_click(on_click)
widgets.append(button)

nav = Column(children=widgets, width=200)

layout = Column(children=[header, Row(children=[nav, main])])

curdoc().add_root(layout)

The layout engine struggles the more items you add with the button presses in Chrome, while in Firefox it hardly slows down. Here are flame graphs generated by recording the performance in the two browsers.

Chrome

Screen Shot 2019-12-08 at 2 43 10 PM

Firefox

Screen Shot 2019-12-08 at 2 54 30 PM

You will note that it takes <2 seconds in Firefox and ~5 in Chrome. I'm not sure if this is just a Chrome performance issue but it would at least be good to investigate why it's so much slower and if anything can be done to fix it.

Versions:

Bokeh 1.4.0
Chrome Version 78.0.3904.108 (Official Build) (64-bit)
Firefox 70.0.1 (64-bit)

@MarcSkovMadsen
Copy link
Contributor

And I have a Panel App where this is an issue in practice.

@bryevdv
Copy link
Member

bryevdv commented Dec 8, 2019

@philippjfr Thanks for the detailed issue, cc @mattpap

As an immediate step it might be helpful to continue to characterize where this is a problem. I.e. Chrome on all platforms? or only OSX? Are other browsers on other platforms ok? Any difference between Chome versions? Basically just good to narrow the scope of where exactly this happens as much as possible.

@philippjfr
Copy link
Contributor Author

Seeing the same thing on Linux, can try out Windows later on I hope.

@xavArtley
Copy link
Contributor

xavArtley commented Dec 9, 2019

Hello don't know if it helps but there it is performance on windows 10 chrome Version 78.0.3904.108 (Build officiel) (64 bits):
image

Same test on firefox 60.0.1 (64-bit):
image

image

Howver the difference seems mitigate when not using the debugger
I made a rough test on 20 clicks an looking the scroll bar updating. It took me 27 seconds on chrome vs 23 seconds on firefox

@MarcSkovMadsen
Copy link
Contributor

Here is a real world example. The layout engine is slow in Firefox but incredible slow in Chrome.

bootstrap_dashboard_just_never_shows

It's running at https://awesome-panel.org

System Info

  • Panel: 0.7.0
  • Bokeh: 1.4.0
  • Python: 3.7.4
  • OS: Windows 8.1
  • Browser: Chrome

Gallery Code

The code for selecting and showing the app in the gallery is basically

def view() -> Column:
    """The gallery view of awesome-panel.org"""
    app_selection = Select(name="Select app", options=list(APPS.keys()))

    @pn.depends(app_selection.param.value)
    def selected_app(value):
        return APPS[value]()

    return Column(
        Markdown(TEXT),
        app_selection,
        Spacer(height=50),
        selected_app,
        sizing_mode="stretch_both",
        name="Gallery",
    )

@MarcSkovMadsen
Copy link
Contributor

And just for comparison. It's much faster in a Jupyter Notebook. But Firefox is significantly faster than Chrome.

notebook_performance

Im running this notebook https://github.com/MarcSkovMadsen/awesome-panel/blob/master/app.ipynb

and have this log

$ jupyter notebook app.ipynb
[I 04:54:31.256 NotebookApp] Serving notebooks from local directory: C:\repos\private\awesome-panel
[I 04:54:31.256 NotebookApp] The Jupyter Notebook is running at:
[I 04:54:31.257 NotebookApp] http://localhost:8888/?token=5f809af8188893692b9f863c8db8159841e4768130891117
[I 04:54:31.257 NotebookApp]  or http://127.0.0.1:8888/?token=5f809af8188893692b9f863c8db8159841e4768130891117
[I 04:54:31.257 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 04:54:31.891 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///C:/Users/masma/AppData/Roaming/jupyter/runtime/nbserver-7816-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=5f809af8188893692b9f863c8db8159841e4768130891117
     or http://127.0.0.1:8888/?token=5f809af8188893692b9f863c8db8159841e4768130891117
[I 04:54:36.293 NotebookApp] Kernel started: 0b477488-b6e8-433c-be95-42b46e49a125
[W 04:54:41.925 NotebookApp] 404 GET /api/kernels/6ed7af11-6e24-4a43-9c3f-b873b8db42db/channels?session_id=5ac847a8828f46bc8e3e2b7fa8e5955f (::1): Kernel does not exist: 6ed7af11-6e24-4a43-9c3f-b873b8db42db
[W 04:54:41.939 NotebookApp] 404 GET /api/kernels/6ed7af11-6e24-4a43-9c3f-b873b8db42db/channels?session_id=5ac847a8828f46bc8e3e2b7fa8e5955f (::1) 16.01ms referer=None
[I 04:54:46.510 NotebookApp] 302 GET /?token=5f809af8188893692b9f863c8db8159841e4768130891117[I%2004:54:31.257%20NotebookApp]%20Use%20Control-C%20to%20stop%20this%20server%20and%20shut%20down%20all%20kernels%20(twice%20to%20skip%20confir (127.0.0.1) 2.00ms
[I 04:54:46.584 NotebookApp] Saving file at /app.ipynb
[W 04:55:14.917 NotebookApp] Replacing stale connection: 6ed7af11-6e24-4a43-9c3f-b873b8db42db:5ac847a8828f46bc8e3e2b7fa8e5955f
[I 04:55:41.064 NotebookApp] Starting buffering for 0b477488-b6e8-433c-be95-42b46e49a125:fc57efbb9e6b40c3ab18bb4aec6e49f4
[I 04:55:42.401 NotebookApp] Kernel shutdown: 0b477488-b6e8-433c-be95-42b46e49a125
[I 04:55:45.803 NotebookApp] Kernel started: 4c1f0636-7239-43ae-b32b-d7c6a6b7fd19
[I 04:57:46.056 NotebookApp] Saving file at /app.ipynb
[W 05:00:19.905 NotebookApp] Replacing stale connection: 6ed7af11-6e24-4a43-9c3f-b873b8db42db:5ac847a8828f46bc8e3e2b7fa8e5955f

and this error message

Javascript error adding output!
Error: Mismatched anonymous define() module: function(){return function(){return function t(e,r,n){function a(o,s){if(!r[o]){if(!e[o]){var l="function"==typeof require&&require;if(!s&&l)return l(o,!0);if(i)return i(o,!0);var c=new Error("Cannot find module '"+o+"'");throw c.code="MODULE_NOT_FOUND",c}var u=r[o]={exports:{}};e[o][0].call(u.exports,function(t){return a(e[o][1][t]||t)},u,u.exports,t,e,r,n)}return r[o].exports}for(var i="function"==typeof require&&require,o=0;o<n.length;o++)a(n[o]);return a}}()({1:[function(t,e,r){"use strict";var n=t("../src/lib"),a={"X,X div":"direction:ltr;font-family:'Open Sans', verdana, arial, sans-serif;margin:0;padding:0;","X input,X button":"font-family:'Open Sans', verdana, arial, sans-serif;","X input:focus,X button:focus":"outline:none;","X a":"text-decoration:none;","X a:hover":"text-decoration:none;","X .crisp":"shape-rendering:crispEdges;","X .user-select-none":"-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;-o-user-select:none;user-select:none;","X svg":"overflow:hidden;","X svg a":"fill:#447adb;","X svg a:hover":"fill:#3c6dc5;","X .main-svg":"position:absolute;top:0;left:0;pointer-events:none;","X .main-svg .draglayer":"pointer-events:all;","X
....
....
....

@bryevdv
Copy link
Member

bryevdv commented Dec 10, 2019

OK For me (and probably @mattpap) to look into this we really need a pure Bokeh reproducer. I've tried to approximate the structure of this how I would in Bokeh. Does this run OK for you? (It does for me on Chrome and Safari) If so, what to do we need to do to it it get it closer to what Panel does? @philippjfr

"""## The Gallery Page of awesome-panel.org"""
from bokeh.io import curdoc
from bokeh.layouts import *
from bokeh.models import *

TEXT = "Some Header Text"

info = column(Div(text="Info"), sizing_mode="stretch_width")
bootdash = column(Div(text="Bootstrap Dashboard"), sizing_mode="stretch_width")
df = column(Div(text="DataFrame"), sizing_mode="stretch_width")

APPS = {"Info Alert": info, "Bootstrap Dashboard": bootdash, "DataFrame": df}

app_selection = Select(name="Select app", options=list(APPS.keys()), width=100)

layout = column(
    Div(text=TEXT),
    row(app_selection),
    info,
    sizing_mode="stretch_both",
)

def selected_app(attr, old, new):
       layout.children[2] = APPS[new]

app_selection.on_change('value', selected_app)

curdoc().add_root(layout)

@MarcSkovMadsen
Copy link
Contributor

I guess the problem is not so much there for the simple example. The trouble starts when you have an advanced layout with columns and rows inside of each other that are full width, responsive.

If I run the simple Panel version of your code as below the speed is acceptable although not impressive.

standalone_performance_ok

"""## The Gallery Page of awesome-panel.org"""
from panel import Column, depends, Spacer
from panel.widgets import Select
import panel as pn

from awesome_panel.express._pane._panes import Markdown
from awesome_panel.express.bootstrap import InfoAlert
from gallery.bootstrap_dashboard import components, app

TEXT = """\
# Awesome Panel Gallery ![Awesome Badge](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)

I hope this gallery can show case the power of Panel and inspire you as you build awesome analytics apps in Panel.

If you have an awesome tool or app you wan't to show case here you are very welcome.
You can do so via a [pull request](https://github.com/MarcSkovMadsen/awesome-panel/pulls)."""

INFO_TEXT = """\
Please **use FireFox, Safari or Edge** if you can.

Alternatively you can use Chrome - but it's
[slower](https://github.com/bokeh/bokeh/issues/9515).

This page does not render nicely in Internet Explorer and it's not supported.

Please **have patience** as some of the apps can take 10-30 seconds to load.
"""


def info():
    return Column(InfoAlert(text=INFO_TEXT), sizing_mode="stretch_width")


APPS = {"Info Alert": info, "Bootstrap Dashboard": app.main, "DataFrame": components.dataframe_view}


def view() -> Column:
    """The gallery view of awesome-panel.org"""
    app_selection = Select(name="Select app", options=list(APPS.keys()))

    @pn.depends(app_selection.param.value)
    def selected_app(value):
        return APPS[value]()

    return Column(
        Markdown(TEXT),
        app_selection,
        Spacer(height=50),
        selected_app,
        sizing_mode="stretch_both",
        name="Gallery",
    )


view().servable()

@mattpap
Copy link
Contributor

mattpap commented Dec 10, 2019

The layout engine struggles the more items you add with the button presses in Chrome, while in Firefox it hardly slows down.

I can't reproduce this. For me both are slow beyond being usable (after maybe 8-10 clicks). By the nature of things, layout should struggle more when more items are added, and by design it should struggle significantly more when more free form HTML items are added, which is the case here.

Bokeh's layout was designed as a fully managed layout, i.e. all layout computations are done in a JavaScript runtime and don't fundamentally require a CSS engine. We obviously have to interact to some extent with the CSS engine of a web browser to get real measurements (fonts, images, free form HTML, etc.). However, we don't have to or we can choose how much we depend on it. This way layout can work within HTML canvas, as well as (potentially) in environments where there is no DOM (and CSS) access at all (e.g. web workers, WASM), or even outside a web browser (e.g. node with node-canvas).

The side effect of this design is that mixing bokeh's layout with free form HTML is expensive. This is because the layout has to measure HTML fragments in various configurations and then remeasure when adjustments are needed. Each measurement results in a CSS reflow (layout), which is expensive on its own. If you add this up over dozens or hundreds of layout items, then you have performance characteristic as showed in this issue.

Having a big performance difference between FF and Chrome is unexpected. Maybe FF's CSS engine got suddenly better or Chrome has had a recent regression. Maybe there are platform specific differences that affect this. For long running tasks GC may be skewing the results. Long tasks should be manually split to give GC a breather (we already do this during initialization, because otherwise bokehjs would stall on large object graphs). Or the testing methodology is flawed in general (see below).

I suppose that layout could be improved in its current form, but ultimately we will need to look into different designs that would cooperate with CSS layout, instead of fighting against it.

However the difference seems mitigate when not using the debugger (...)

This may be the case as well. Performance analysis in web browsers is a tricky business and comparison of bare numbers between browsers should be done with care.

@philippjfr
Copy link
Contributor Author

Thanks for the detailed answer @mattpap. I understand that there are a lot of constraints on the layout engine and that measuring the size of contents is needed. I think investigating why Chrome is one approach we could pursue but I suspect that may just be a fundamental limitation or indeed some problem in my testing approach.

My other question then is whether we could help the layout engine out in some way, currently any modification to a layout triggers re-measurement of all the components which is obviously very expensive. Is there any chance that components could cache their sizes so that it doesn't have to re-measure all the time. I realize this is tricky because you have to make very sure that if the component is responsive, or the contents or some layout property changes the cache would have to be invalidated but I still think this might be worth investigating. Even manually declaring some part of the layout as fixed after the initial measurement might be sufficient. Let me know if you think if that's even worth investigating or if there are some fundamental reasons this can't work.

@philippjfr
Copy link
Contributor Author

philippjfr commented Dec 15, 2019

@mattpap I've been digging into the measurement code a bit to try to figure out if there are any easy gains to be made and I noticed something strange, it seems that for each call to Layoutable.measure the Layoutable._measure call is run exactly 4 times. In every scenario I've seen so far each of those calls returns exactly the same width and height values which makes it seem quite redundant.

Screen Shot 2019-12-15 at 6 22 38 PM

Is this on purpose? I can't really see where that happens in the code but if I cache the Sizable on the Layoutable on the first call to Layoutable._measure and return that on the subsequent three calls everything stays highly responsive and as far as I can tell the layout is no different. That said you clearly do have to invalidate the cache in certain cases, e.g. when the viewport size changes. It's not even just a 4x speedup, it's in excess of 10x (maybe because of GC).

@philippjfr
Copy link
Contributor Author

philippjfr commented Dec 15, 2019

Actually a slight addendum to my comment above, the number of times Layoutable.measure will call Layoutable._measure depends on the complexity of the layout, here's what it looks like for the example I posted above:

Screen Shot 2019-12-15 at 6 51 20 PM

The 10x speedup in this case is therefore not too surprising.

@philippjfr
Copy link
Contributor Author

Just to see how it'd perform I implemented a CachedVariadicBox here: holoviz/panel#867

It's giving me the performance gains I was hoping for and I've not seen any regressions being caused by it yet (although I suspect there are some in certain scenarios).

@mattpap
Copy link
Contributor

mattpap commented Dec 16, 2019

for each call to Layoutable.measure the Layoutable._measure call is run exactly 4 times.

_measure is called exactly once per measure. In fact the only reason for having _measure is code reuse. Otherwise these could have been one method. Thus I think you're misinterpreting the results, by conflating measure and _measure from different subclasses. That aside, caching seems an obvious way to improve speed, because this way you are carving out most of the purple intervals from the timeline, which represent CSS layout, the expensive part.

Caching measurements isn't a new idea. We already do this in plot layout, by caching side panels' measures, which are expensive to obtain, because canvas measurements are expensive in general and a complete set of necessary measurements actually requires CSS, and results in CSS layout, similarly to VariadicBox's behavior. The difference is that in plot's layout we can invalidate the cache reliably.

With free form HTML there is no good way to know when layout changed, unless we are implementing a CSS engine. Thus the most robust and also inefficient way is to remeasure. What I realize now, is that this makes layout only eventually correct and, what's worse, by proxy, i.e. something has to invalidate the layout for those measurements to take effect. So all the heavy lifting done by VariadicBox is for nothing if e.g. layout is computed only once. With caching, if the HTML isn't self-contained or has other side effects (e.g. JavaScript), we won't get the correct result at all, but at least we will get some result fast. Still, with the current design, caching is the way to go, but we will have to also invalidate the layout in common scenarios affecting CSS layout (lazily loaded images, maybe fonts, etc.).

In fact I originally planned to do exactly this, but complexity and lack of universality of this approach, made me proceed with the current implementation. Though I think that practical usability in this case is more important here than theoretical correctness.

@mattpap mattpap added this to the 2.0 milestone Dec 16, 2019
@philippjfr
Copy link
Contributor Author

philippjfr commented Dec 16, 2019

Yes, I started realizing that it can only have been called once per measure, I'm still not quite sure how to interpret the results. I definitely think there is excessive re-measuring being triggered from somewhere but I can't quite pin it down.

So all the heavy lifting done by VariadicBox is for nothing if e.g. layout is computed only once.

Right I did wonder about that, the layout would have to be invalidated for things to remeasure correctly. Indeed you can observe this by putting an image tag in a Div, laying it out next to another Div containing some text. At first they will overlap but if you resize the browser window the layout is invalidated and it recomputes laying them out correctly. Given that a relayout instantiates a new Layoutable do you think my approach in holoviz/panel#867 is safe given the current limitations?

@mattpap
Copy link
Contributor

mattpap commented Dec 16, 2019

(...) if you resize the browser window the layout is invalidated and it recomputes laying them out correctly. Given that a relayout instantiates a new Layoutable (...)

We need to start from defining the terms, which I've been using incorrectly myself. Layout can be either (re)computed or invalidated. There are other forms like resize, but they boil down to those two. The former doesn't instantiate layouts, whereas the later does. So pretty much everywhere I mentioned layout invalidation, I actually meant recomputation. It doesn't help that e.g. plot canvas uses those terms interchangeably.

This way your approach is correct, assuming no images, etc., as previously mentioned. Layout is invalidated when layout structure or properties change (on models), or when root element's parent HTML element's CSS display changes. Currently layout invalidation is clearly overzealous, but that can be improved on case-by-case basis, if needed. If invalidation happens, it will naturally result in cache invalidation, which is fine when e.g. changing HTML content of a markup widget. It's also suboptimal if it happens due to unrelated changes in the layout. Currently layout is updated naively (destroy the world approach), but that once again can be improved, e.g. by just applying a diff.

@philippjfr
Copy link
Contributor Author

Thanks that's very helpful!

Currently layout invalidation is clearly overzealous

To be clear though, I don't think it's just the layout invalidation that is overzealous. In one of my tests I put a counter on the VariadicBox to see how often _measure is called when you click the button. In my tests every time you clicked the button (and therefore triggered a layout invalidation) it was called 45 times for each of the Divs in the layout. You can see how that quickly adds up in the example since you are adding 10 Divs per click. After 10 clicks there are ~100 Divs on the page, each getting measured 45 times, totaling 4500 measurements.

@mattpap
Copy link
Contributor

mattpap commented Dec 16, 2019

In my tests every time you clicked the button (and therefore triggered a layout invalidation) it was called 45 times for each of the Divs in the layout.

It depends on the structure of a layout. If you have 10 divs (or really any layoutables) in a column, then each is measured 4 times. If you have a nested layout then you have to multiply this accordingly. This may be overzealous indeed, but to understand that, we need to know what kind of layouts we are talking about. Large flat grid layout will have very different performance characteristics from a small deeply nested one.

@mstump
Copy link

mstump commented Dec 19, 2019

I wanted to relay some findings and mitigation strategies from #9529. I've got a complex app that was composed of 4 logical "panels" that were arranged vertically as rows in a master layout and added to the document as a single call to doc.add_root.

Selecting a row in a DataTable would cause several plots to update. This resulted in 4 calls to compute_layout, each taking about 7.5 seconds for a total run time of 28-32 seconds during which the browser was frozen. The chrome profiling output can be found here.

Based on advice from @bryevdv I decomposed the main layout to into 7 columns and added each of these to the document as independent calls to doc.add_root. By doing so it reduced calls to compute_layout from 4 to 2, and those calls now take 1.5 and 1.2 seconds respectively. The chrome profiling output after the layout changes can be found here.

It's not a solution to slow layout computation, but it may provide a mitigation strategy for others.

@MarcSkovMadsen
Copy link
Contributor

MarcSkovMadsen commented Jan 30, 2020

My feedback is also that with the help of @philippjfr I had the basic layout (header, sidebar, main) of my app (awesome-panel.org) layout in a Template. Then the app became ok. And testing with the release candidate for Panel 0.8 I actually think the app is pretty fast.

I'm dynamically adding and removing layouts and panes. Something that I was originally told by @bryevdv that I could not expect to work. But it actually does and that's really awesome. 👍 .

One thing though is that I still don't really understand when the app will be fast and not. So I can just test and experiment to fix things. I have no (mental) model to follow or no docs to guide me.

These feature requests holoviz/panel#1012 and holoviz/panel#1010 would really help me and others (I believe).

Thanks.

@bryevdv
Copy link
Member

bryevdv commented Jan 30, 2020

I'm glad to hear of your successes @MarcSkovMadsen if you have some bandwidth in the next few months it would be really valuable to try and distill some small purposeful examples and documentation sections that could benefit all users.

@bryevdv
Copy link
Member

bryevdv commented Feb 22, 2020

@mattpap do you have an updated ETA for porting the optimization from panel? (Or @philippjfr can you point me at it to take a crack?) This is is the only thing remaining before RC1 can be built.

@philippjfr
Copy link
Contributor Author

The implementation of the CachedVariadicBox is here and is used here.

@mattpap
Copy link
Contributor

mattpap commented Feb 23, 2020

Implementing this isn't a breaking change, so I don't see a point in holding off RC. I will try to get to this tomorrow.

@MarcSkovMadsen
Copy link
Contributor

I'm glad to hear of your successes @MarcSkovMadsen if you have some bandwidth in the next few months it would be really valuable to try and distill some small purposeful examples and documentation sections that could benefit all users.

Hi @bryevdv

I try to list the things I find here https://awesome-panel.readthedocs.io/en/latest/performance.html. I am currently motivated by other things and don't have the band width to distill it nicely. But I think it is important.

@bryevdv bryevdv modified the milestones: 2.0, short-term Feb 26, 2020
@bryevdv
Copy link
Member

bryevdv commented Feb 26, 2020

Implementing this isn't a breaking change, so I don't see a point in holding off RC. I will try to get to this tomorrow.

@mattpap For the sake of improving communication, I will share my interpretation of this statement, which is: "There's no need to hold off on a release candidate but I will have it done tomorrow in time for the actual release (as was agreed in the last meeting)." If you did not intend to have this done for the release I will gently suggest that removing it from the milestone would have been a clear signal.

@bryevdv
Copy link
Member

bryevdv commented Apr 7, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants