plotly on jupyter creates HUGE notebooks file sizes #5056

Open
urirosenberg opened this issue Feb 26, 2025 · 10 comments · May be fixed by #5096
Labels: bug (something broken), P1 (needed for current cycle), performance (something is slow)

Comments

@urirosenberg

Hi, we are seeing an issue with plotly v6.0.0 where running in Jupyter creates HUGE notebook files, resulting in a crashed kernel and an inability to save.

Sample code to reproduce the issue (full example can be found here):

import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np

# Create sample data
dates = pd.date_range(start='2024-01-01', end='2024-12-31', freq='M')
values = np.random.randn(12).cumsum()
categories = ['A', 'B', 'C', 'D']
values_bar = [23, 45, 56, 78]

# Line Chart
fig1 = px.line(x=dates, y=values, 
               title='Monthly Trend 2024',
               labels={'x': 'Date', 'y': 'Value'})
fig1.show()

# Scatter Plot with custom styling
fig2 = go.Figure()
fig2.add_trace(go.Scatter(
    x=np.random.randn(100),
    y=np.random.randn(100),
    mode='markers',
    marker=dict(
        size=10,
        color=np.random.randn(100),
        colorscale='Viridis',
        showscale=True
    )
))
fig2.update_layout(title='Scatter Plot with Color Scale')
fig2.show()

# Bar Chart
fig3 = px.bar(x=categories, y=values_bar,
              title='Category Distribution',
              labels={'x': 'Category', 'y': 'Value'},
              color=values_bar,
              color_continuous_scale='Reds')
fig3.show()

Using plotly version 5.24.1, this generates a >1 MB file.
Using plotly version 6.0.0, this generates a 19 MB file.

Looking at the source JSON of the v6.0.0 notebook file, I see massive amounts of JS code.
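One way to confirm where the bytes go is to walk the notebook's JSON and total the size of each output by MIME type. The following is a minimal sketch, assuming the standard nbformat 4 layout (cells, then outputs, then data); the function name `output_sizes` and the synthetic `demo` notebook are made up for illustration and are not taken from the reporter's file.

```python
import json


def output_sizes(notebook):
    """Return (cell_index, mime_type, size_in_bytes) tuples, largest first."""
    rows = []
    for idx, cell in enumerate(notebook.get("cells", [])):
        # Markdown cells have no "outputs" key, so default to an empty list.
        for out in cell.get("outputs", []):
            # Stream and error outputs have no "data" key.
            for mime, payload in out.get("data", {}).items():
                size = len(json.dumps(payload).encode("utf-8"))
                rows.append((idx, mime, size))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Synthetic stand-in for a saved notebook (an assumption for illustration).
# For a real file, something like:
#   with open("notebook.ipynb", encoding="utf-8") as f:
#       demo = json.load(f)
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {
                        "text/html": ["<script>" + "x" * 1000 + "</script>"],
                        "application/vnd.plotly.v1+json": {"data": [], "layout": {}},
                    },
                }
            ],
        },
    ]
}

for idx, mime, size in output_sizes(demo):
    print(f"cell {idx}: {mime}: {size} bytes")
```

On an affected notebook, the expectation would be that the `text/html` entries dominate, since that is where embedded JS would live.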

@gvwilson added the bug (something broken), P2 (considered for next cycle), and performance (something is slow) labels on Feb 28, 2025
@gvwilson
Contributor

thanks for the report @urirosenberg - I'll try to get someone to dig into this in the next cycle. (And thanks for providing the notebook - that will help a lot.)

@marthacryan
Collaborator

Hey @urirosenberg, thanks for this! I'm not able to reproduce this. Could you send more info about the environment you're running this in?

  • Are you running this in JupyterLab, Jupyter Notebook, VS Code, or somewhere else? And what version of that are you using?
  • Are you running this on a Mac? PC? Linux?
  • What version of Python are you using? Are you in a virtual environment? If so, which one?

@pfebrer

pfebrer commented Mar 7, 2025

I am seeing the same thing in jupyter notebooks with the following environment:

python == 3.9.21
plotly == 6.0.0
notebook == 7.3.2

There are two problems with plotly==6 that I could find:

  • When rendering the first plot, plotly==6 includes the full plotly.js library TWICE.
  • When rendering other plots, plotly==6 includes the full plotly.js library again for each extra plot.

This can be very simply tested in a clean new notebook by running:

import plotly.graph_objects as go

go.Figure()

and then File > Save and Export Notebook As > HTML.

If you do this for plotly==6 and plotly==5.24.1, you can then grep for the header of the plotly.js library and you get:

Image

If you render another empty figure in another cell, you get:

Image
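For anyone without grep at hand, the same check can be sketched in Python. This assumes the marker string `plotly.js v` used in this thread appears once per embedded copy of the library; the helper name `count_plotlyjs_copies` and the placeholder sample file are hypothetical.

```python
import tempfile
from pathlib import Path


def count_plotlyjs_copies(html_path, marker="plotly.js v"):
    """Return (copies, file_size_in_bytes) for an exported HTML file."""
    path = Path(html_path)
    text = path.read_text(encoding="utf-8", errors="replace")
    return text.count(marker), path.stat().st_size


# Tiny placeholder file standing in for a real notebook export.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_export.html"
    sample.write_text(
        "<script>/* plotly.js v0.0.0 placeholder */</script>\n"
        "<script>/* plotly.js v0.0.0 placeholder */</script>\n",
        encoding="utf-8",
    )
    copies, size = count_plotlyjs_copies(sample)
    print(f"{copies} copies of the plotly.js header in {size} bytes")
```

Pointing the function at the HTML exported from each plotly version would give the two counts to compare.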

This is indeed a huge problem, as @urirosenberg says. In our case, we render notebooks with ~40 plotly plots in our documentation site, and with plotly==6.0.0 readthedocs was refusing to host it because some HTML files were 150 MB. We had to downgrade to plotly 5.

@pfebrer

pfebrer commented Mar 7, 2025

For anyone getting here, a temporary workaround if you don't need to see your notebook offline is to change the default renderer at the top of the notebook:

import plotly.io as pio

pio.renderers.default = "notebook_connected"

@pfebrer

pfebrer commented Mar 7, 2025

If it helps with debugging: for the case of plotly==6 with two figures, plotly.js first gets loaded once as <script type="module">, and then it gets loaded again for each figure as <script type="text/javascript">:

grep -B 1 "plotly.js v" plotly6_2emptyfigs.html

Image
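A rough Python equivalent of the `grep -B 1` command above, for reference. It is only a sketch: it assumes, as the grep output suggests, that the opening script tag sits on the line directly before the plotly.js header. The helper name `script_types_before_marker` and the placeholder sample are invented for illustration.

```python
import re


def script_types_before_marker(html_text, marker="plotly.js v"):
    """For each line containing the marker, report the type attribute of a
    <script> tag found on the preceding line (None if there is none)."""
    lines = html_text.splitlines()
    types = []
    for i, line in enumerate(lines):
        if marker in line:
            previous = lines[i - 1] if i > 0 else ""
            match = re.search(r'<script[^>]*\btype="([^"]+)"', previous)
            types.append(match.group(1) if match else None)
    return types


# Placeholder text shaped like the two-figure case described in this comment.
sample = "\n".join(
    [
        '<script type="module">',
        "/* plotly.js v0.0.0 placeholder */",
        "</script>",
        '<script type="text/javascript">',
        "/* plotly.js v0.0.0 placeholder */",
        "</script>",
        '<script type="text/javascript">',
        "/* plotly.js v0.0.0 placeholder */",
        "</script>",
    ]
)

print(script_types_before_marker(sample))
```

For a real export, the file contents would be read into a string and passed in place of `sample`.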

@gvwilson added the P1 (needed for current cycle) label and removed the P2 (considered for next cycle) label on Mar 8, 2025
@gvwilson
Contributor

gvwilson commented Mar 8, 2025

@marthacryan please have a look when you can.

@marthacryan
Collaborator

I think the best solution here is to set plotly_mimetype+notebook_connected as the default renderer in notebook environments. The issue here is that we're using the "offline" mode for the renderer. I can open a PR to do that unless there are objections! cc @emilykl @LiamConnors @gvwilson

@gvwilson
Contributor

no objection from me - thank you

@marthacryan linked a pull request on Mar 17, 2025 that will close this issue
@pfebrer

pfebrer commented Mar 17, 2025

Hi, I can't say whether changing the default to notebook_connected is a good idea or not, but in my opinion, if that is the change applied, this issue shouldn't be closed.

It would be a bit like sweeping the issue under the rug. And the issue itself doesn't look particularly hard to tackle (maybe I'm missing something that makes it more complex than it seems).

@hlvlad

hlvlad commented Mar 19, 2025

I agree with @pfebrer. In my case, I need offline mode to be able to send reports via mail. Pre-6.0.0 versions produced reports of roughly 5 MB, and 6.0.0+ versions produce 24 MB ones, which is an almost 5x regression in size.
