-
-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict to unique distributions in entry points #281
Conversation
…ique distributions. Fixes #280.
In this latest commit, I add 2b64fa2 against the main branch to test performance of entry_points before the patch and now after. On my 2019 Macbook Pro, I see 9.5ms as the baseline and 28ms after the patch, a 200% increase. |
I'd guess the way to avoid the large performance penalty is along the lines of (simplified here) diff --git i/importlib_metadata/__init__.py w/importlib_metadata/__init__.py
index 4c420a5..2a90b6a 100644
--- i/importlib_metadata/__init__.py
+++ w/importlib_metadata/__init__.py
@@ -590,6 +590,7 @@ class PathDistribution(Distribution):
.joinpath(), __div__, .parent, and .read_text().
"""
self._path = path
+ self._normalized_name = path.stem.partition("-")[0]
def read_text(self, filename):
with suppress(
@@ -648,7 +649,7 @@ def entry_points():
:return: EntryPoint objects for all installed packages.
"""
- unique = functools.partial(unique_everseen, key=operator.attrgetter('name'))
+ unique = functools.partial(unique_everseen, key=operator.attrgetter('_normalized_name'))
eps = itertools.chain.from_iterable(
dist.entry_points for dist in unique(distributions())
) (on top of your PR) Also, orthogonally, entry_points.txt parsing itself can be significantly sped up. I'd guess that best would be a hand-rolled parser as the format is actually much less general than everything ConfigParser supports (I didn't find a spec for entry_points.txt, but can it e.g. ever use https://docs.python.org/3/library/configparser.html#interpolation-of-values?), but a quick fix that already yields large gains (~2x speed up on diff --git i/importlib_metadata/__init__.py w/importlib_metadata/__init__.py
index fac3063..257e19e 100644
--- i/importlib_metadata/__init__.py
+++ w/importlib_metadata/__init__.py
@@ -87,6 +87,10 @@ class EntryPoint(
dist: Optional['Distribution'] = None
+ cp = ConfigParser(delimiters='=')
+ # case sensitive: https://stackoverflow.com/q/1611799/812183
+ cp.optionxform = str
+
def load(self):
"""Load the entry point from its definition. If only a module
is indicated by the value, return that module. Otherwise,
@@ -122,11 +126,9 @@ class EntryPoint(
@classmethod
def _from_text(cls, text):
- config = ConfigParser(delimiters='=')
- # case sensitive: https://stackoverflow.com/q/1611799/812183
- config.optionxform = str
- config.read_string(text)
- return cls._from_config(config)
+ cls.cp.clear()
+ cls.cp.read_string(text)
+ return cls._from_config(cls.cp)
@classmethod
def _from_text_for(cls, text, dist): Meanwhile, a hand-rolled parser such as @classmethod
def _from_text(cls, text):
- config = ConfigParser(delimiters='=')
- # case sensitive: https://stackoverflow.com/q/1611799/812183
- config.optionxform = str
- config.read_string(text)
- return cls._from_config(config)
+ if not text:
+ return
+ group = None
+ for line in filter(None, map(str.strip, text.splitlines())):
+ if line.startswith("["):
+ group = line[1:-1]
+ else:
+ name, value = map(str.strip, line.split("=", 1))
+ yield cls(name, value, group) gives you another nearly 2-fold speedup (it depends on how well you want to handle malformed inputs, of course, and this is assuming we indeed don't need to support interpolation or anything fancy). (edit: actually, half of the hand-rolled implementation could even be shared with the implementation of Distribution._read_sections.) |
Could it fall back to ConfigParser in those cases? |
Yes, but we'd need to actually detect if interpolation is being used (for which I guess |
I'm all but certain that interpolation isn't expected. In fact, that's how pkg_resources parses it. So it's definitely safe not to rely on ConfigParser. For this PR, I'm going to accept it as written, and we can work on optimization techniques subsequently. |
Closes #280.