Slow import time (http.server) #6901

Closed
1 task done
rafael-zilberman opened this issue Aug 31, 2022 · 15 comments · Fixed by #6903
Labels: bug, good first issue, Hacktoberfest

Comments

@rafael-zilberman

rafael-zilberman commented Aug 31, 2022

Describe the bug

While using aiohttp in lambda functions, we discovered that importing the aiohttp package takes nearly half a second (it would take even longer, but in our case some of aiohttp's dependencies were already cached).
image
image

To Reproduce

You can reproduce the import time diagram using the following bash commands:

pip install tuna
python -X importtime -c "import aiohttp" 2> import.txt
tuna import.txt

image
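If tuna is not available, the same `-X importtime` output can be ranked with a few lines of Python. This is only a minimal sketch: the names `ImportRecord`, `parse_importtime` and `slowest` are made up for illustration, and the sample lines are copied from the log in this issue.

```python
import re
from typing import List, NamedTuple


class ImportRecord(NamedTuple):
    self_us: int        # time spent in the module itself, in microseconds
    cumulative_us: int  # time including nested imports, in microseconds
    depth: int          # nesting level (two spaces of indent per level)
    module: str


# Matches data lines such as:
# "import time:    238495 |     281554 |       http.server"
LINE_RE = re.compile(r"^import time:\s+(\d+) \|\s+(\d+) \| (\s*)(\S+)\s*$")


def parse_importtime(text: str) -> List[ImportRecord]:
    """Parse the text written to stderr by `python -X importtime`."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # header line or unrelated output
        self_us, cumulative_us, indent, module = match.groups()
        records.append(
            ImportRecord(int(self_us), int(cumulative_us), len(indent) // 2, module)
        )
    return records


def slowest(records: List[ImportRecord], n: int = 5) -> List[ImportRecord]:
    """Return the n records with the largest self time."""
    return sorted(records, key=lambda r: r.self_us, reverse=True)[:n]


# A few lines taken from the log in this issue.
SAMPLE = """\
import time: self [us] | cumulative | imported package
import time:      2082 |       2082 |         socketserver
import time:    238495 |     281554 |       http.server
import time:      5022 |     154480 |     asyncio
import time:      4018 |     732547 | aiohttp
"""

if __name__ == "__main__":
    for record in slowest(parse_importtime(SAMPLE), 3):
        print(f"{record.self_us:>8} us  {record.module}")
```

To use it on a real capture, read the file produced by the command above and pass its contents to `parse_importtime`.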

Expected behavior

Many packages depend on aiohttp (aiobotocore, elasticsearch, and more), and this import time is too long for lambda functions.

Logs/tracebacks

import time: self [us] | cumulative | imported package
import time:       714 |        714 | _frozen_importlib_external
import time:       875 |        875 |   time
import time:       549 |       1423 | zipimport
import time:       141 |        141 |     _codecs
import time:      2323 |       2464 |   codecs
import time:      1547 |       1547 |   encodings.aliases
import time:      3052 |       7062 | encodings
import time:       870 |        870 | encodings.utf_8
import time:       971 |        971 | encodings.cp1252
import time:       103 |        103 | _signal
import time:       892 |        892 | encodings.latin_1
import time:        69 |         69 |     _abc
import time:      1144 |       1212 |   abc
import time:      1388 |       2600 | io
import time:       197 |        197 |       _stat
import time:      1026 |       1222 |     stat
import time:      2129 |       2129 |     _collections_abc
import time:      1095 |       1095 |       genericpath
import time:      2514 |       3608 |     ntpath
import time:      2379 |       9337 |   os
import time:      1138 |       1138 |   _sitebuiltins
import time:       132 |        132 |     _locale
import time:      1163 |       1295 |   _bootlocale
import time:      1224 |       1224 |   types
import time:      1563 |       1563 |       warnings
import time:      2265 |       3827 |     importlib
import time:       997 |        997 |       importlib.machinery
import time:      1677 |       2674 |     importlib.abc
import time:       144 |        144 |           _operator
import time:      1559 |       1703 |         operator
import time:      1055 |       1055 |         keyword
import time:       123 |        123 |           _heapq
import time:      1952 |       2074 |         heapq
import time:       225 |        225 |         itertools
import time:      1192 |       1192 |         reprlib
import time:       966 |        966 |         _collections
import time:      4693 |      11905 |       collections
import time:       104 |        104 |         _functools
import time:      1891 |       1995 |       functools
import time:      2263 |      16162 |     contextlib
import time:      5297 |      27958 |   importlib.util
import time:      1296 |       1296 |     pywin32_system32
import time:      3651 |       4946 |   pywin32_bootstrap
import time:       750 |        750 |   sitecustomize
import time:       701 |        701 |   usercustomize
import time:     10919 |      58264 | site
import time:      1259 |       1259 |     collections.abc
import time:      2946 |       2946 |       enum
import time:       130 |        130 |         _sre
import time:      1535 |       1535 |           sre_constants
import time:      2732 |       4267 |         sre_parse
import time:     14121 |      18517 |       sre_compile
import time:      1429 |       1429 |       copyreg
import time:      4739 |      27630 |     re
import time:      4321 |      33209 |   typing
import time:      1494 |       1494 |       multidict._abc
import time:      4425 |       4425 |         platform
import time:      1085 |       1085 |           multidict._multidict_base
import time:      2352 |       3437 |         multidict._multidict
import time:      1701 |       9561 |       multidict._compat
import time:      2401 |      13456 |     multidict
import time:      2534 |      15990 |   aiohttp.hdrs
import time:      1150 |       1150 |           concurrent
import time:      1212 |       1212 |                     token
import time:      2655 |       3866 |                   tokenize
import time:      1344 |       5210 |                 linecache
import time:      2125 |       7334 |               traceback
import time:      1362 |       1362 |                 _weakrefset
import time:      1848 |       3209 |               weakref
import time:       110 |        110 |                 _string
import time:      2232 |       2342 |               string
import time:      2052 |       2052 |               threading
import time:        92 |         92 |               atexit
import time:      5852 |      20878 |             logging
import time:      2405 |      23283 |           concurrent.futures._base
import time:      1811 |      26243 |         concurrent.futures
import time:      3680 |       3680 |           _socket
import time:       248 |        248 |             math
import time:      1078 |       1078 |             select
import time:      2038 |       3363 |           selectors
import time:       141 |        141 |           errno
import time:      4273 |      11456 |         socket
import time:      1779 |       1779 |           signal
import time:       127 |        127 |           msvcrt
import time:       179 |        179 |           _winapi
import time:      3451 |       5535 |         subprocess
import time:      7550 |       7550 |           _ssl
import time:       137 |        137 |               _struct
import time:      1299 |       1435 |             struct
import time:       146 |        146 |             binascii
import time:      2566 |       4147 |           base64
import time:      6238 |      17934 |         ssl
import time:      1160 |       1160 |         asyncio.constants
import time:        83 |         83 |                 _opcode
import time:      1524 |       1606 |               opcode
import time:      2148 |       3753 |             dis
import time:      4072 |       7825 |           inspect
import time:      1238 |       1238 |             asyncio.format_helpers
import time:      1435 |       2673 |           asyncio.base_futures
import time:       958 |        958 |           asyncio.log
import time:      2191 |      13646 |         asyncio.coroutines
import time:        88 |         88 |             _contextvars
import time:      1316 |       1404 |           contextvars
import time:      1093 |       1093 |           asyncio.exceptions
import time:      1114 |       1114 |             asyncio.base_tasks
import time:      1860 |       2973 |           _asyncio
import time:      3008 |       8476 |         asyncio.events
import time:      1193 |       1193 |         asyncio.futures
import time:      1087 |       1087 |         asyncio.protocols
import time:      1253 |       1253 |           asyncio.transports
import time:      1504 |       2756 |         asyncio.sslproto
import time:      1230 |       1230 |           asyncio.locks
import time:      1252 |       1252 |           asyncio.tasks
import time:      1748 |       4230 |         asyncio.staggered
import time:      1156 |       1156 |         asyncio.trsock
import time:      8261 |     103126 |       asyncio.base_events
import time:      1258 |       1258 |       asyncio.runners
import time:      1177 |       1177 |       asyncio.queues
import time:      1290 |       1290 |       asyncio.streams
import time:      1216 |       1216 |       asyncio.subprocess
import time:      2351 |       2351 |         _overlapped
import time:      1468 |       1468 |         asyncio.base_subprocess
import time:      1656 |       1656 |         asyncio.proactor_events
import time:      1574 |       1574 |         asyncio.selector_events
import time:      1299 |       1299 |                 posixpath
import time:      1328 |       2627 |               fnmatch
import time:       176 |        176 |               zlib
import time:      1217 |       1217 |                 _compression
import time:      1538 |       1538 |                 _bz2
import time:      2114 |       4868 |               bz2
import time:      1268 |       1268 |                 _lzma
import time:      1824 |       3091 |               lzma
import time:       801 |        801 |               pwd
import time:       785 |        785 |               grp
import time:      7217 |      19562 |             shutil
import time:        78 |         78 |                 _bisect
import time:      1313 |       1391 |               bisect
import time:       101 |        101 |               _sha512
import time:       254 |        254 |               _random
import time:      5491 |       7236 |             random
import time:      2134 |      28931 |           tempfile
import time:      1567 |      30498 |         asyncio.windows_utils
import time:      3849 |      41394 |       asyncio.windows_events
import time:      5022 |     154480 |     asyncio
import time:      1588 |       1588 |       _hashlib
import time:       131 |        131 |       _blake2
import time:       128 |        128 |       _sha3
import time:      2500 |       4346 |     hashlib
import time:        96 |         96 |           _json
import time:      1660 |       1756 |         json.scanner
import time:      2492 |       4247 |       json.decoder
import time:      2780 |       2780 |       json.encoder
import time:      4439 |      11465 |     json
import time:      1202 |       1202 |       __future__
import time:       755 |        755 |                 org
import time:       455 |       1210 |               org.python
import time:      1673 |       2883 |             org.python.core
import time:      2185 |       5068 |           copy
import time:       776 |        776 |             _uuid
import time:      2074 |       2850 |           uuid
import time:      1061 |       1061 |           attr._config
import time:      1150 |       1150 |             attr.exceptions
import time:      1251 |       2401 |           attr.setters
import time:      1148 |       1148 |           attr._compat
import time:      8042 |      20567 |         attr._make
import time:      2222 |      22788 |       attr.converters
import time:      1250 |       1250 |       attr.filters
import time:      6467 |       6467 |       attr.validators
import time:      1149 |       1149 |       attr._funcs
import time:      1723 |       1723 |       attr._version_info
import time:      1156 |       1156 |       attr._next_gen
import time:      6857 |      42589 |     attr
import time:      2721 |       2721 |         ipaddress
import time:      1174 |       1174 |           urllib
import time:      2766 |       3940 |         urllib.parse
import time:      1448 |       1448 |           idna.package_data
import time:      1009 |       1009 |             idna.idnadata
import time:      1150 |       1150 |             unicodedata
import time:       901 |        901 |             idna.intranges
import time:      3530 |       6588 |           idna.core
import time:      2111 |      10145 |         idna
import time:       943 |        943 |           yarl._quoting_c
import time:       969 |       1911 |         yarl._quoting
import time:      2984 |      21699 |       yarl._url
import time:      1656 |      23355 |     yarl
import time:      1785 |       1785 |         http
import time:       169 |        169 |           _datetime
import time:      1948 |       2116 |         datetime
import time:       961 |        961 |           email
import time:      1557 |       1557 |               locale
import time:      1529 |       3086 |             calendar
import time:      1094 |       4180 |           email._parseaddr
import time:       739 |        739 |             email.base64mime
import time:       912 |        912 |             email.quoprimime
import time:      1028 |       1028 |             email.errors
import time:      1280 |       1280 |               quopri
import time:      1926 |       3205 |             email.encoders
import time:      4736 |      10618 |           email.charset
import time:      2612 |      18369 |         email.utils
import time:      2022 |       2022 |           html.entities
import time:      1753 |       3775 |         html
import time:      1434 |       1434 |                 email.header
import time:      1076 |       2509 |               email._policybase
import time:      1795 |       4304 |             email.feedparser
import time:      2348 |       6651 |           email.parser
import time:       914 |        914 |             uu
import time:       966 |        966 |             email._encoded_words
import time:       739 |        739 |             email.iterators
import time:      1894 |       4512 |           email.message
import time:      2734 |      13896 |         http.client
import time:      1041 |       1041 |         mimetypes
import time:      2082 |       2082 |         socketserver
import time:    238495 |     281554 |       http.server
import time:      1486 |       1486 |           pathlib
import time:      1131 |       2617 |         aiohttp.typedefs
import time:      1115 |       3731 |       aiohttp.http_exceptions
import time:       704 |        704 |           aiohttp.tcp_helpers
import time:       826 |       1529 |         aiohttp.base_protocol
import time:       852 |        852 |           cgi
import time:       828 |        828 |             shlex
import time:       842 |       1670 |           netrc
import time:       635 |        635 |               urllib.response
import time:       885 |       1520 |             urllib.error
import time:       752 |        752 |             nturl2path
import time:      2676 |       4947 |           urllib.request
import time:      1108 |       1108 |           async_timeout
import time:      1523 |       1523 |           typing_extensions
import time:       698 |        698 |           aiohttp.log
import time:       845 |        845 |           aiohttp._helpers
import time:      7605 |      19245 |         aiohttp.helpers
import time:      2323 |       2323 |             http.cookies
import time:      1486 |       3809 |           aiohttp.abc
import time:      1173 |       1173 |           aiohttp._http_writer
import time:      1522 |       6503 |         aiohttp.http_writer
import time:      1712 |       1712 |         aiohttp.streams
import time:       616 |        616 |         brotli
import time:       613 |        613 |           backports_abc
import time:      1564 |       2177 |         aiohttp._http_parser
import time:      4150 |      35929 |       aiohttp.http_parser
import time:       918 |        918 |         aiohttp._websocket
import time:      9480 |      10397 |       aiohttp.http_websocket
import time:      2135 |     333745 |     aiohttp.http
import time:      1897 |       1897 |     aiohttp.payload
import time:      1381 |       1381 |     aiohttp.client_exceptions
import time:      2132 |       2132 |       aiohttp.multipart
import time:       992 |        992 |       aiohttp.formdata
import time:       636 |        636 |       cchardet
import time:       927 |        927 |             chardet.enums
import time:       901 |        901 |             chardet.charsetprober
import time:      1303 |       3131 |           chardet.charsetgroupprober
import time:       916 |        916 |             chardet.codingstatemachine
import time:       843 |        843 |             chardet.escsm
import time:      1246 |       3003 |           chardet.escprober
import time:       885 |        885 |           chardet.latin1prober
import time:       906 |        906 |               chardet.mbcssm
import time:      1827 |       2733 |             chardet.utf8prober
import time:       857 |        857 |               chardet.mbcharsetprober
import time:      1042 |       1042 |                 chardet.euctwfreq
import time:       908 |        908 |                 chardet.euckrfreq
import time:       945 |        945 |                 chardet.gb2312freq
import time:      1030 |       1030 |                 chardet.big5freq
import time:      1000 |       1000 |                 chardet.jisfreq
import time:      1770 |       6693 |               chardet.chardistribution
import time:      1051 |       1051 |               chardet.jpcntx
import time:      1632 |      10231 |             chardet.sjisprober
import time:       866 |        866 |             chardet.eucjpprober
import time:       874 |        874 |             chardet.gb2312prober
import time:       864 |        864 |             chardet.euckrprober
import time:       846 |        846 |             chardet.cp949prober
import time:      1048 |       1048 |             chardet.big5prober
import time:       890 |        890 |             chardet.euctwprober
import time:      6301 |      24650 |           chardet.mbcsgroupprober
import time:       865 |        865 |             chardet.hebrewprober
import time:      1040 |       1040 |               chardet.sbcharsetprober
import time:      1263 |       2302 |             chardet.langbulgarianmodel
import time:      1050 |       1050 |             chardet.langgreekmodel
import time:      1087 |       1087 |             chardet.langhebrewmodel
import time:      1199 |       1199 |             chardet.langrussianmodel
import time:      1241 |       1241 |             chardet.langthaimodel
import time:      1099 |       1099 |             chardet.langturkishmodel
import time:      3687 |      12528 |           chardet.sbcsgroupprober
import time:      3395 |      47589 |         chardet.universaldetector
import time:       831 |        831 |         chardet.version
import time:      1725 |      50144 |       chardet
import time:      6613 |      60515 |     aiohttp.client_reqrep
import time:       997 |        997 |     aiohttp.client_ws
import time:      1103 |       1103 |       aiohttp.client_proto
import time:       848 |        848 |       aiohttp.locks
import time:       617 |        617 |         aiodns
import time:      2776 |       3393 |       aiohttp.resolver
import time:      3845 |       9187 |     aiohttp.connector
import time:      1037 |       1037 |         _compat_pickle
import time:       166 |        166 |         _pickle
import time:       587 |        587 |             org
import time:       182 |        768 |           org.python
import time:       522 |       1290 |         org.python.core
import time:      2887 |       5379 |       pickle
import time:      2579 |       7957 |     aiohttp.cookiejar
import time:       945 |        945 |           aiohttp._frozenlist
import time:      1175 |       2119 |         aiohttp.frozenlist
import time:      1334 |       3452 |       aiohttp.signals
import time:      9244 |      12696 |     aiohttp.tracing
import time:      7521 |     672127 |   aiohttp.client
import time:      1116 |       1116 |   aiohttp.payload_streamer
import time:       657 |        657 |       gunicorn
import time:       642 |       1298 |     gunicorn.config
import time:      4791 |       6089 |   aiohttp.worker
import time:      4018 |     732547 | aiohttp

Python Version

$ python --version
Python 3.8.9

aiohttp Version

$ python -m pip show aiohttp
3.7.4.post0

multidict Version

$ python -m pip show multidict
5.1.0

yarl Version

$ python -m pip show yarl
1.6.3

OS

Windows 11

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@Dreamsorcerer
Member

Dreamsorcerer commented Aug 31, 2022

Already looked at. Will try to get back to merge it for 3.9: #6591
Should halve the import time (maybe more with your caching).

@rafael-zilberman
Author

@Dreamsorcerer This issue still reproduces even with the fix from the pull request you mentioned.

@Dreamsorcerer
Member

Actually, your chart doesn't seem to show gunicorn using up all the import time. Is that the one you cached?
The rest of the import time is not nearly as significant, and I'm not sure there's any obvious ways to reduce it.

The biggest thing in your chart is importing http.server, which is stdlib, and accounts for nearly 40% of the import time. So, if you want to improve the import time substantially, I'd suggest seeing if you can make some improvements to http.server in cpython. 2nd biggest delay is importing asyncio, which is over 20% of the import time.

So, those 2 libraries from cpython account for ~60% of the import time in your chart. I don't see anything we can do here. You could also retry with a newer version of Python, to see if there are already improvements.
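The percentages quoted above can be checked against the cumulative figures in the log. The sketch below only redoes that arithmetic; the constant names are illustrative and the numbers are copied from the `http.server`, `asyncio` and `aiohttp` lines of the log.

```python
# Cumulative import times in microseconds, copied from the log in this issue.
TOTAL_US = 732547        # aiohttp
HTTP_SERVER_US = 281554  # http.server
ASYNCIO_US = 154480      # asyncio


def share(part_us: int, total_us: int) -> float:
    """Return part_us as a percentage of total_us."""
    return 100 * part_us / total_us


http_server_pct = share(HTTP_SERVER_US, TOTAL_US)
asyncio_pct = share(ASYNCIO_US, TOTAL_US)

print(f"http.server: {http_server_pct:.1f}%")             # 38.4%
print(f"asyncio:     {asyncio_pct:.1f}%")                 # 21.1%
print(f"combined:    {http_server_pct + asyncio_pct:.1f}%")  # 59.5%
```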

@rafael-zilberman
Author

> Actually, your chart doesn't seem to show gunicorn using up all the import time. Is that the one you cached?
> The rest of the import time is not nearly as significant, and I'm not sure there's any obvious ways to reduce it.
>
> The biggest thing in your chart is importing http.server, which is stdlib, and accounts for nearly 40% of the import time. So, if you want to improve the import time substantially, I'd suggest seeing if you can make some improvements to http.server in cpython. 2nd biggest delay is importing asyncio, which is over 20% of the import time.
>
> So, those 2 libraries from cpython account for ~60% of the import time in your chart. I don't see anything we can do here. You could also retry with a newer version of Python, to see if there are already improvements.

No, the one that was already cached in the first graph is asyncio.

If we are only using aiohttp as a client, why do we need to import http.server?

@Dreamsorcerer
Member

It appears to be referenced in 2 places, but 1 of them is here: https://github.com/aio-libs/aiohttp/blob/master/aiohttp/connector.py#L1230

Which looks like a client component.

However, the only thing being used is this:
https://github.com/python/cpython/blob/main/Lib/http/server.py#L626-L629
Which is just created from https://github.com/python/cpython/blob/main/Lib/http/__init__.py#L7

So, we could probably refactor that code to use http.HTTPStatus directly, and then not import http.server anymore.
If you've got time to work on that, it would be great. I doubt myself or the other maintainers will find time to work on it anytime soon.
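To illustrate why `http.server` is not needed for this, here is a small sketch that builds an equivalent lookup table from `http.HTTPStatus` alone. The name `RESPONSES` mirrors the one mentioned in this thread, but the code is an assumption for illustration, not the actual aiohttp source.

```python
from http import HTTPStatus

# Same shape as the table built in http.server: each status maps to a
# (phrase, description) tuple. Only the lightweight `http` package is
# imported, not `http.server`.
RESPONSES = {
    status: (status.phrase, status.description) for status in HTTPStatus
}

# HTTPStatus is an IntEnum, so plain integers work as keys.
phrase, description = RESPONSES[404]
print(phrase)

# For a single lookup the table is not needed at all.
print(HTTPStatus(404).phrase)
```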

Dreamsorcerer reopened this Aug 31, 2022
Dreamsorcerer changed the title from "Slow import time" to "Slow import time (http.server)" Aug 31, 2022
Dreamsorcerer added the good first issue and Hacktoberfest labels Aug 31, 2022
@Zeesky-code
Contributor

Hello, I'd like to be assigned to this issue. 😊
However, just to be clear, is it https://github.com/python/cpython/blob/main/Lib/http/server.py#L626-L629 that needs to be refactored?

@Dreamsorcerer
Member

Well, no, that's cpython, not aiohttp. :P

What we currently use is defined at: https://github.com/aio-libs/aiohttp/blob/master/aiohttp/http.py#L70
I'm not sure we need to define it in that file at all, so I'd suggest just removing that (and the import http.server).
Then find the 2 places in the code which use RESPONSES and refactor them to use HTTPStatus (imported with from http import HTTPStatus).
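A call site refactored along these lines might look roughly like the sketch below. The helper `reason_phrase` and its empty-string fallback are hypothetical choices made for illustration; they are not taken from aiohttp or from the linked pull request.

```python
from http import HTTPStatus


def reason_phrase(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code.

    Hypothetical helper: falls back to an empty string for codes that
    HTTPStatus does not define, where a dict lookup would raise KeyError.
    """
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # HTTPStatus(code) raises ValueError for unknown codes.
        return ""


print(reason_phrase(200))
print(reason_phrase(404))
print(repr(reason_phrase(799)))
```

Whether an unknown code should map to an empty string or raise is a design decision for the real call sites.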

@Zeesky-code
Contributor

Alright, thanks for the clarification.
I'll get right on it.

@Dreamsorcerer
Member

@rafael-zilberman If you could verify that #6903 fixes the import time for you, that would be great.

@rafael-zilberman
Author

rafael-zilberman commented Sep 4, 2022

Yep, much better now, @Dreamsorcerer @Zeesky-code.
0.27s vs 0.8s (~3x faster)
image

@Zeesky-code
Contributor

Happy to help. 😊

@rafael-zilberman
Author

@Dreamsorcerer When can @Zeesky-code's fix be merged?

@webknjaz
Member

@rafael-zilberman there are a few things that need addressing before that can happen.

@Dreamsorcerer
Member

Dreamsorcerer commented Nov 21, 2022

The weird thing is when I try it on my machine, the branch seems to always be a bit slower, but I can't see why.

I can see that email.parser, which seems to be a decent chunk of http.server import time, is still imported via aiohttp.helpers, so some import time is still used by that. But it should still load at least a little faster...

branch:
Screenshot from 2022-11-21 22-22-46
master:
Screenshot from 2022-11-21 22-23-03
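One way to confirm whether a module such as email.parser is still pulled in by another import is to inspect `sys.modules` in a fresh interpreter. The sketch below assumes nothing about aiohttp; the helper name `loaded_after_import` is made up, and the usage line checks the stdlib case mentioned above (http.server bringing in email.parser).

```python
import subprocess
import sys


def loaded_after_import(package: str, module: str) -> bool:
    """Report whether importing `package` also loads `module`.

    Runs in a fresh interpreter so modules already imported by the
    current process do not affect the answer.
    """
    code = f"import {package}, sys; print({module!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the package prints something on import.
    return result.stdout.strip().splitlines()[-1] == "True"


if __name__ == "__main__":
    print(loaded_after_import("http.server", "email.parser"))
```

Calling it with `"aiohttp"` as the package on each branch would show whether a given module is still loaded after the change.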

@Dreamsorcerer
Member

Dreamsorcerer commented Nov 21, 2022

Seems like even things like brotli and urllib.request import slower when I'm on that branch, which just doesn't make any sense to me. I'm inclined to say it's just something weird happening on my machine, but would be great if 1 more person could double check the difference between master and the linked branch. You can use python3 -X importtime -c 'import aiohttp' &> importtime and view with tuna importtime, or alternatively run the test with pytest tests/test_imports.py::test_import_time -s (you should see 3 import times in stdout).

There might be a hint of improvement visible in the CI, but it seems pretty insignificant.
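Since single runs can be noisy, as the odd results above suggest, it may help to repeat the measurement in fresh interpreters and compare medians. This is a rough sketch, not the project's `test_import_time`; the function name `cumulative_import_us` is an assumption, and it relies only on the `-X importtime` output format shown in this issue.

```python
import statistics
import subprocess
import sys
from typing import List


def cumulative_import_us(module: str, runs: int = 5) -> List[int]:
    """Collect the cumulative import time of `module` over several runs.

    Each run starts a fresh interpreter with -X importtime and reads the
    line for `module` from stderr. Times are in microseconds.
    """
    samples = []
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-X", "importtime", "-c", f"import {module}"],
            capture_output=True,
            text=True,
            check=True,
        )
        for line in reversed(result.stderr.splitlines()):
            parts = line.split("|")
            # Data lines look like "import time: <self> | <cumulative> | <name>".
            if len(parts) == 3 and parts[2].strip() == module:
                samples.append(int(parts[1]))
                break
        else:
            raise LookupError(f"no importtime line found for {module!r}")
    return samples


if __name__ == "__main__":
    samples = cumulative_import_us("http.server", runs=3)
    print(f"median: {statistics.median(samples)} us over {len(samples)} runs")
```

Running it with `"aiohttp"` on master and on the branch gives two medians to compare, which should be less sensitive to one-off slowdowns than a single chart.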

Dreamsorcerer added a commit that referenced this issue Nov 28, 2022
## What do these changes do?

Refactor code to use HTTPStatus instead of http.server

Fixes #6901

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sam Bull <aa6bs0@sambull.org>