Import dependencies lazily #1646

Closed
pickfire opened this issue Oct 2, 2023 · 5 comments
Labels
performance Performance related issues/questions/PRs

Comments

@pickfire

pickfire commented Oct 2, 2023

Describe the bug
Importing huge dependencies such as openpyxl (pulled in via tablib) adds a large, unnecessary import cost at startup.

To Reproduce
Steps to reproduce the behavior:

  1. In the sample project, run python -X importtime ./manage.py check 2> import.log
  2. View the log with tuna (https://pypi.org/project/tuna/)
  3. django-import-export imports openpyxl (we used tablib[all]), which accounts for a large chunk of the time spent
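If tuna is not available, the same log can be summarised with a few lines of Python. This is a minimal sketch: the function name `slowest_imports` is hypothetical, and it assumes the log uses CPython's usual `import time: self [us] | cumulative | imported package` layout.

```python
import re

# Matches data lines such as:
#   import time:       140 |        140 |   some.module
# The header line has no digits after "import time:", so it does not match.
LINE_RE = re.compile(r"^import time:\s+(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)\s*$")


def slowest_imports(log_text, top=10):
    """Return up to `top` (cumulative_us, self_us, module) tuples, slowest first."""
    rows = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # header line or unrelated stderr output
        self_us, cumulative_us, name = match.groups()
        rows.append((int(cumulative_us), int(self_us), name))
    rows.sort(reverse=True)
    return rows[:top]
```

Calling `slowest_imports(open("import.log").read())` on the log from step 1 lists the modules with the largest cumulative import time.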

Versions (please complete the following information):

  • Django Import Export: 3.2.0
  • Python 3.11.5
  • Django 4.1.10

Expected behavior

Importing formats should be done lazily and not during application startup.
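A rough sketch of what "lazy" could mean here: the heavy module is only imported the first time a format is actually used. The `LazyFormat` class is hypothetical and is not django-import-export's API. The stdlib module `colorsys` stands in for a heavy dependency such as openpyxl so the example runs without extra packages.

```python
import importlib


class LazyFormat:
    """Hypothetical wrapper that defers importing a backend module until first use."""

    def __init__(self, module_name):
        self.module_name = module_name
        self._module = None  # nothing is imported at construction time

    def get_module(self):
        # The import cost is paid here, on first access, not at startup.
        if self._module is None:
            self._module = importlib.import_module(self.module_name)
        return self._module


# "colorsys" is only a stand-in for a heavy dependency such as openpyxl.
xlsx_like = LazyFormat("colorsys")
```

With this shape, creating the format object at startup is cheap, and only code paths that call `get_module()` trigger the import.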

Screenshots

Taken from our project, django-import-export is taking up the most time during startup.
[Screenshot: import-time profile from our project]

@pickfire pickfire added the bug label Oct 2, 2023
@andrewgy8
Member

Hey! Any idea how to solve this?

Also, I'm not sure this is an issue with Django import export. It seems to be an openpyxl issue, since the other dependencies are not taking that much time. If the other dependencies were equally slow, that would be another story.

Having said that, it might be beneficial to allow users to import the dependencies themselves by overriding the admin class where they are used.

@matthewhegarty matthewhegarty added performance Performance related issues/questions/PRs and removed bug labels Oct 2, 2023
@matthewhegarty
Contributor

matthewhegarty commented Oct 2, 2023

Thanks for raising. I'm interested to know the context where this is an issue. It seems from the graph that the time to load these dependencies is <200ms. Do you have a requirement to shorten this startup time?

Do you have any suggestion as to how we could solve this, or what would be the best outcome for you? Would it work if you could select dependencies as described in #1459?

See also #1459 #1069

@bufke

bufke commented Oct 3, 2023

I'm interested to know the context where this is an issue

Wow, odd timing. I was just looking at why my celery beat instance uses so much RAM and came here. Celery calls django.setup(), which eventually runs this and adds 12 MB. These little things add up.

A workaround for non-web apps loading Django might be to set an environment variable and conditionally add INSTALLED_APPS entries like import-export. Maybe there is a way to change the imports here so that it just never happens. If I come up with a reasonable solution, I will post again. Right now I'm leaning towards just tweaking INSTALLED_APPS as needed for the service type.
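The INSTALLED_APPS workaround could look roughly like this in a settings module. It is a sketch under assumptions: the environment variable name `SERVICE_TYPE`, its values, and the helper `build_installed_apps` are made up for illustration, and the base app list is a placeholder for a project's real one.

```python
import os

# Placeholder for the apps every service type needs.
BASE_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
]

# Apps only needed where the admin / web UI is served.
WEB_ONLY_APPS = ["import_export"]


def build_installed_apps(environ=os.environ):
    """Return the app list for the current service type.

    SERVICE_TYPE is a hypothetical variable name; unset means "web".
    """
    apps = list(BASE_APPS)
    if environ.get("SERVICE_TYPE", "web") == "web":
        apps += WEB_ONLY_APPS
    return apps


INSTALLED_APPS = build_installed_apps()
```

A worker started with `SERVICE_TYPE=worker` would then skip import_export. This only helps if nothing else loaded by that process imports import_export directly.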

@matthewhegarty
Contributor

Thanks for adding this. It helps us understand how folks are using this and the source of the issue. If it helps, I added this PR for our v4 release. It means that only basic dependencies are loaded by default (e.g. csv), and that others have to be explicitly declared during installation.

@matthewhegarty
Contributor

After #1647, dependencies can be selected at library installation
