Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import slowdown after v1 #1127

Closed
layday opened this issue Dec 24, 2019 · 16 comments · Fixed by #1132
Closed

Import slowdown after v1 #1127

layday opened this issue Dec 24, 2019 · 16 comments · Fixed by #1132

Comments

@layday
Copy link
Contributor

layday commented Dec 24, 2019

After upgrading to v1, I've noticed that it takes almost double the time to import Pydantic. Observe:

$ python3 -m pip install -U 'pydantic<1'
[...]
Successfully installed pydantic-0.32.2

$ seq 2 | xargs -I -- python3 -X importtime -c '
  import pydantic
  print("pydantic version:", pydantic.VERSION)
  ' 2>&1 | begin
      head -n 1
      tail -n 2
  end
import time: self [us] | cumulative | imported package
import time:      1058 |      70730 | pydantic
pydantic version: 0.32.2

$ python3 -m pip install -U pydantic
[...]
Successfully installed pydantic-1.3

$ seq 2 | xargs -I -- python3 -X importtime -c '
  import pydantic
  print("pydantic version:", pydantic.VERSION)
  ' 2>&1 | begin
      head -n 1
      tail -n 2
  end
import time: self [us] | cumulative | imported package
import time:       802 |     128663 | pydantic
pydantic version: 1.3

v0.32.2 takes approximately 70ms to import on my MBP; v1.3 takes 130ms.

attrs for comparison takes 36ms:

$ seq 2 | xargs -I -- python3 -X importtime -c '
  import attr
  print("attrs version:", attr.__version__)
  ' 2>&1 | begin
      head -n 1
      tail -n 2
  end
import time: self [us] | cumulative | imported package
import time:       882 |      35659 | attr
attrs version: 19.3.0

I can understand that this might not be something you'd want to spend time on, depending on what your direction for Pydantic is; but slow warmups have been plaguing Python CLIs for a very long time. If you'd like to investigate this further, tuna can help you visualise your dependency tree. Here's what it's showing me for tuna (python3 -X importtime -c 'import pydantic' 2>| psub):

Screenshot_2019-12-24 tuna - var folders 18 r2nky_v137d3fwhtx0644m940000gn T psub WY0ulb7Mci

There is the unfortunate practice in Python of importing the entire package's contents in __init__ and (unfortunately) once you've done that, there's no going back - not without a breaking release anyway.

As always, thanks for all your work on Pydantic.

@layday layday added the bug V1 Bug related to Pydantic V1.X label Dec 24, 2019
@layday
Copy link
Contributor Author

layday commented Dec 24, 2019

(I think the label was added automatically because I selected the 'bug' option when I was opening the issue.)

@samuelcolvin
Copy link
Member

See #1124, we can move more things out of __init__.py in v2.

Very weird that network.py takes so long to load. Can we get any more details on why that is?

@samuelcolvin samuelcolvin added Performance and removed bug V1 Bug related to Pydantic V1.X labels Dec 24, 2019
@layday
Copy link
Contributor Author

layday commented Dec 24, 2019

It appears to be caused, in no small part, by the pre-compilation of this particular regex with a range containing 262,057 characters:

_domain_ending = r'(?P<tld>\.[a-z]{2,63})?\.?'
_int_chunk = r'[_0-9a-\U00040000](?:[-_0-9a-\U00040000]{0,61}[_0-9a-\U00040000])?'
int_domain_regex = re.compile(fr'(?:{_int_chunk}\.)*?{_int_chunk}{_domain_ending}', re.IGNORECASE)

I think the re module might've not been optimised for this particular use case ;)

@layday
Copy link
Contributor Author

layday commented Dec 24, 2019

Just out of curiosity, I tried to compile it using regex and it only took 1ms, compared to ~70ms for re. I wonder if this should be reported upstream.

@samuelcolvin
Copy link
Member

Yes probably, though I doubt you'll get a very helpful response.

I think the solution for us is to compile this regex in a lazy way. Eg, only create it when a field using it is used.

@layday
Copy link
Contributor Author

layday commented Dec 25, 2019

Regexes are compiled and cached on first use by the re module. The issue is that regex flags aren't part of the regex itself in Python so people are drawn to re.compile as a means to attach flags to their regexes. I was wrong - you can actually do this in Python, meaning you can replace

int_domain_regex = re.compile(fr'(?:{_int_chunk}\.)*?{_int_chunk}{_domain_ending}', re.IGNORECASE)

with

int_domain_regex = fr'(?i)(?:{_int_chunk}\.)*?{_int_chunk}{_domain_ending}'

and then use it like re.fullmatch(int_domain_regex, ...).

@samuelcolvin
Copy link
Member

I agree, but I think better to explicitly generate and cache, I'll create a PR and let you check.

@samuelcolvin
Copy link
Member

thanks for letting me know about tuna, very useful.

See #1132 for a fix on most things.

We could also move compiled to utils and wrap it in a function is_compiled() to have improve things a bit more, but won't make a massive difference.

@layday
Copy link
Contributor Author

layday commented Dec 27, 2019

The improvement isn't as dramatic for me - it's shaved off about 40ms - but it's definitely noticeable. Thanks for working on this.

@samuelcolvin
Copy link
Member

Are you comparing it to current master uncompiled?

Could you show me tuna images (or import logs) for before and after #1132?

@layday
Copy link
Contributor Author

layday commented Dec 27, 2019

I'm comparing it against 1.3:

Screenshot_2019-12-27 tuna - var folders 18 r2nky_v137d3fwhtx0644m940000gn T psub jKKXC1dO8r

import-speedup:

Screenshot_2019-12-27 tuna - var folders 18 r2nky_v137d3fwhtx0644m940000gn T psub 5NqqDeCKru

These tend to vary by +/-20 ms. It was a little slower earlier - I think that might've been before your latest commit to the branch.

@samuelcolvin
Copy link
Member

samuelcolvin commented Dec 27, 2019

Okay, but is v1.3 compiled or simple python? You can check with utils.version_info()

@samuelcolvin
Copy link
Member

python -c "import pydantic.utils; print(pydantic.utils.version_info())"

@layday
Copy link
Contributor Author

layday commented Dec 27, 2019

Ah right, no, it wasn't compiled. I installed Cython and built 1.3 and #1132 from sdist and it doesn't look it's made much of a difference - still measuring at about 140 ms and 80 ms respectively.

             pydantic version: 1.3
            pydantic compiled: True
                 install path: [...]
               python version: 3.8.0 (default, Dec 12 2019, 11:47:45)  [Clang 7.1.0 (tags/RELEASE_710/final)]
                     platform: macOS-10.14.6-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

@samuelcolvin
Copy link
Member

Okay weird, I got a bigger difference.

@samuelcolvin
Copy link
Member

see my comments on #1132, comment there with any suggestions on how to improve the change, otherwise I'll merge that.

samuelcolvin added a commit that referenced this issue Jan 2, 2020
* improve pydantic import time, fix #1127, fix #1039

* more tweaks

* tweak utils.py

* defering inspect
andreshndz pushed a commit to cuenca-mx/pydantic that referenced this issue Jan 17, 2020
* improve pydantic import time, fix pydantic#1127, fix pydantic#1039

* more tweaks

* tweak utils.py

* defering inspect
alexdrydew pushed a commit to alexdrydew/pydantic that referenced this issue Dec 23, 2023
Co-authored-by: sydney-runkle <sydneymarierunkle@gmail.com>
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
Co-authored-by: Samuel Colvin <s@muelcolvin.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants