Streamlit startup time could be reduced from 1s to 400ms #6066
Thanks for analyzing this! Some time ago I opened enhancement request #5798 with the same goal in mind (reduce startup time) but mostly focused on the size and quantity of Streamlit core static files. Sharing here because looking at both (python imports and static files) might be worthwhile.
Thanks for the investigation 👍 I did a quick check. I think besides the Plotly theme, all other proposed changes are viable. As far as I remember, in order to apply our Streamlit chart theme in the frontend we need to apply changes to the global Plotly theme before any chart object is created (fyi @willhuang1997). Regarding Altair theme: removing it might break a few apps (that's why we kept it in), but it was never an official feature anyways. So, I assume this isn't a problem.
We can always do this in parts: first the easy ones, then the themes (if possible).
## Describe your changes

The emoji data is the biggest object when running a blank Streamlit app. Compiling the regex is also slightly expensive. However, the emoji data is only required if there is a check for emojis; many apps might not require this. Therefore, this PR makes the emoji module lazy-load, so it is only loaded if it is actually required. This also adds a precheck for emoji checks to make sure that the string contains non-alphanumeric characters at all before using the more expensive emoji regex.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added e2e test to check that some lazy-loaded modules are not imported in an almost blank Streamlit app.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.
## Describe your changes

A small refactoring related to the creation of the plotly theme. Instead of using a module for the plotly theme, the creation is now captured in a method, which makes it behave better once plotly is lazy-loaded. There aren't any other logical changes in here related to the Plotly theme. This refactor also applies a couple of other small refactorings related to the imports.

Related to #6066
## Describe your changes

If a user configures `server.fileWatcherType` to none or poll, there isn't any reason we need to import the watchdog package, even if it is installed on the system. This PR applies a couple of small refactorings to make the `event_based_path_watcher` lazy-loaded (only imported if actually needed).

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added e2e test to make sure that `watchdog` and `streamlit.watcher.event_based_path_watcher` are lazy loaded.
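The config-dependent import described above can be sketched roughly as follows. `select_watcher_class` and `PollingWatcher` are hypothetical names used only for illustration; the one real third-party call is `from watchdog.observers import Observer`, and it is only reached when neither `none` nor `poll` is configured.

```python
from __future__ import annotations


class PollingWatcher:
    """Hypothetical placeholder for a watcher that needs no third-party package."""


def select_watcher_class(watcher_type: str) -> type | None:
    """Pick a watcher class without importing watchdog unless it is needed."""
    if watcher_type == "none":
        return None
    if watcher_type == "poll":
        return PollingWatcher
    # Only this branch pays the import cost of watchdog (and requires it
    # to be installed).
    from watchdog.observers import Observer

    return Observer
```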
## Describe your changes

The vendored pympler module is only used when someone explicitly requests the metrics via the metrics endpoint. This PR makes the module lazy-loaded.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added e2e test to ensure that vendored `pympler` module is lazy loaded.
Removed a couple of dependencies:
And lazy-loaded a couple of other modules:
This makes it possible to reduce the import time to as low as ~200ms. It will also enable significant speed-ups in the loading time of stlite 🥳 We also now have an e2e test to make sure that we don't accidentally import any of the lazy-loaded packages.
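One way such a check could work, as a rough sketch rather than the actual e2e test: run a statement in a fresh interpreter and report which of a list of modules ended up in `sys.modules`. The helper name `modules_loaded_by` is an assumption made for this example.

```python
import subprocess
import sys
from typing import Iterable, Set


def modules_loaded_by(statement: str, candidates: Iterable[str]) -> Set[str]:
    """Run `statement` in a fresh interpreter; return the candidates it imported."""
    probe = (
        "import sys\n"
        f"{statement}\n"
        f"print(' '.join(m for m in {list(candidates)!r} if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe], capture_output=True, text=True, check=True
    )
    # Only the last line is ours; the statement itself may have printed output.
    lines = result.stdout.splitlines()
    return set(lines[-1].split()) if lines else set()
```

For example, `modules_loaded_by("import streamlit", ["pandas", "pyarrow", "matplotlib"])` would be expected to return an empty set if those packages are in fact lazy-loaded in the installed version.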
## Describe your changes

Lazy-load `pandas` and `pyarrow` only when required (e.g. usage of `st.dataframe`). This PR also includes a couple of other small refactorings related to typing and imports.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added e2e test to ensure that `pyarrow` and `pandas` are lazy-loaded.
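A sketch of what lazy-loading `pandas` can look like in practice, assuming two hypothetical helpers (`is_dataframe`, `to_dataframe`) that are not taken from the Streamlit code base. Type annotations stay available through `TYPE_CHECKING`, type checks consult `sys.modules` instead of importing, and the real import happens inside the function that needs it.

```python
from __future__ import annotations

import sys
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime.
    import pandas as pd


def is_dataframe(obj: Any) -> bool:
    """Check for a DataFrame without triggering a pandas import."""
    # If pandas was never imported, obj cannot be a DataFrame.
    pandas = sys.modules.get("pandas")
    return pandas is not None and isinstance(obj, pandas.DataFrame)


def to_dataframe(data: Any) -> pd.DataFrame:
    """Convert data to a DataFrame; pandas is imported on the first call."""
    import pandas as pd  # deferred import: paid only when this code path runs

    return pd.DataFrame(data)
```

The trade-off is that the first call to `to_dataframe` absorbs the import cost that would otherwise be paid at startup.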
## Describe your changes

The deprecation of `runner.fixMatplotlib` and the decision to always use the `Agg` backend made it possible to just configure the matplotlib backend via the config option (also see the previous TODO comment). This prevents an unnecessary import of matplotlib at server start and allows this import to be lazy-loaded.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added unit and e2e tests to make sure that `matplotlib` is properly lazy-loaded.
## Describe your changes

Lazy-load `numpy` and `pillow` only when required (e.g. usage of `st.image`). This PR also includes a couple of other small refactorings related to typing and imports.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Added e2e test to ensure that `numpy` and `pillow` are lazy-loaded.
## Describe your changes

Lazy-load `click` and `toml` dependencies. `click` will always be loaded when Streamlit is started via its CLI, which it will be in most use cases. But Streamlit can also run without having `click` installed if the app is not started via the CLI (e.g. in stlite). This PR also includes a couple of other small refactorings related to typing and imports.

## GitHub Issue Link (if applicable)

Related to #6066

## Testing Plan

- Update unit tests and add `toml` to e2e test to check that it's not loaded yet.
- We cannot do the same for `click`, since the e2e tests use the CLI to start Streamlit.
## Describe your changes

This is the final lazy-loading PR for now. It lazy-loads the following modules:

- `unittest`
- `packaging`
- `streamlit.proto.openmetrics_data_model_pb2`

This PR also includes a couple of other refactorings related to typing and imports.

## GitHub Issue Link (if applicable)

- Closes #6066

## Testing Plan

- Add lazy-loaded modules to e2e test.
Problem
The time between `streamlit run foo.py` and having a blank app load in the browser feels a bit slow nowadays!
Based on a quick analysis, a simple way to cut startup time by 60% would be to lazy-load certain imports.
Methodology
python -X importtime -c 'import streamlit' 2> latency.log
Interestingly, even if you run this multiple times, there's no real change in the numbers.
Machine: M2 Macbook Pro
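The log produced by `-X importtime` can be ranked with a few lines of Python. This is a sketch, not the tooling used for the analysis; `slowest_imports` is a name made up for this example and the sample numbers are invented. It assumes the usual `import time: self [us] | cumulative | imported package` line format that CPython writes to stderr.

```python
import re
from typing import List, Tuple

# Matches data lines such as "import time:       900 |       1000 |   pandas".
# The header line has no digits in the first column, so it is skipped.
_ENTRY = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def slowest_imports(log_text: str, top: int = 10) -> List[Tuple[str, int, int]]:
    """Return (module, self_us, cumulative_us) sorted by cumulative time, descending."""
    entries = []
    for line in log_text.splitlines():
        match = _ENTRY.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            entries.append((name, int(self_us), int(cumulative_us)))
    return sorted(entries, key=lambda entry: entry[2], reverse=True)[:top]


# Invented sample in the importtime format, for illustration only.
SAMPLE = """\
import time: self [us] | cumulative | imported package
import time:       100 |        100 |     pandas._libs
import time:       900 |       1000 |   pandas
import time:        50 |       1050 | streamlit
"""

for name, self_us, cumulative_us in slowest_imports(SAMPLE, top=2):
    print(f"{name}: self={self_us}us cumulative={cumulative_us}us")
```

To run it on a real log, pass the file contents instead of `SAMPLE`, e.g. `slowest_imports(open("latency.log").read())`.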
Results
The result of step 2 above is attached as latency.log.
Also, here's a pretty chart:
Findings
Streamlit takes 1s to initialize:
Major culprits:
Importing pandas inside type_util: 240ms
Importing pandas.style in st.arrow: 209ms
Setting up the plotly theme: 49ms
Importing altair inside of arrow_altair (mostly to set up theme): 48ms
Loading the validators library: 40ms
Importing requests inside streamlit.version: 52ms
String util is slow to import probably due to emoji computation: 27ms
Proposal
`import` out of the file's root scope and into the actual scope where it's used.
`EMOJI_EXTRACTION_REGEX` into a `functools.cache`'d function.
`requests` (52ms) -- but it's unclear whether this actually causes a nontrivial number of people to actually upgrade.

Community voting on feature requests enables the Streamlit team to understand which features are most important to our users.
If you'd like the Streamlit team to prioritize this feature request, please use the 👍 (thumbs up emoji) reaction in response to the initial post.