Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make django-lifecycle much, much faster #64

Merged
merged 3 commits into from Oct 11, 2020
Merged

Conversation

dralley
Copy link
Contributor

@dralley dralley commented Sep 29, 2020

Some of the work that the lifecycle mixin is during the initialization of new model objects is very expensive and unnecessary. It's calculating (and caching) field names and foreign key models per-object, rather than per-class / model. All instances of a model are going to have the same field names and foreign key model types so this work actually only needs to be done once per model type.

Replacing cached methods with cached classmethods yields a very sizable performance improvement when creating a bunch of new model instances.

@dralley dralley force-pushed the master branch 2 times, most recently from eca83db to 194861d Compare September 30, 2020 00:21
@dralley
Copy link
Contributor Author

dralley commented Sep 30, 2020

I didn't measure this but I have to imagine that the deduplicated caches would save a lot of memory as well.

@dralley dralley force-pushed the master branch 3 times, most recently from a62b897 to b258b00 Compare September 30, 2020 02:27
Some of the work that the lifecycle mixin is during the initialization
of new model objects is very expensive and unnecessary. It's calculating
(and caching) field names and foreign key models per-object, rather than
per-class / model. All instances of a model are going to have the same
field names and foreign key model types so this work actually only needs
to be done once per model type.

Replacing cached methods with cached classmethods yields a very sizable
performance improvement when creating a bunch of new model instances.
@dralley
Copy link
Contributor Author

dralley commented Sep 30, 2020

I finally got the tests passing. The changes necessary to do so are in a separate commit in case it's easier to review that way.

The performance difference is pretty significant. Our code does a lot of heavy creating, saving, querying and deleting of model instances, many tens of thousands of them, and our speedup ranges from 40% to 65%. It's doing a bunch of other stuff too so it's not a good pure benchmark, but it shows that there are very tangible benefits in real-world code.

Here's some cProfile graphs showing before and after for different parts of our code.


Before:

before

After:

after


Before:

orphan_before

After:

orphan_after

@dralley
Copy link
Contributor Author

dralley commented Sep 30, 2020

(The graphs don't show every single function called, just the ones significant enough impact to be worth paying attention to. That's why the other blocks disappeared -- they're still being called, but only like <10 times instead of tens/hundreds of thousands of times like before, which means the impact is totally insignificant).

attr = getattr(self, name)
if ismethod(attr) and hasattr(attr, "_hooked"):
attr = getattr(cls, name)
if isfunction(attr) and hasattr(attr, "_hooked"):
Copy link
Contributor Author

@dralley dralley Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor semantic change right here - I think the type of the python object under the hood depends on whether you're looking at it from the instance perspective (where methods are special and self is implicitly passed along) or from the class perspective (where methods are just functions and there's no magic going on).

I don't think it's problematic but I want to call it out for a little extra attention.

@@ -196,15 +199,15 @@ def _run_hooked_methods(self, hook: str) -> List[str]:
if when_field:
if self._check_callback_conditions(when_field, callback_specs):
fired.append(method.__name__)
method()
method(self)
Copy link
Contributor Author

@dralley dralley Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flip side of that thing I just mentioned, it's returning functions not bound methods, so we can't rely on internal Python magic to pass self along behind the scenes. But functionally it's the same thing.

@dralley
Copy link
Contributor Author

dralley commented Oct 6, 2020

I can squash the commits once you're done reviewing also

@rsinger86
Copy link
Owner

This is great. Thank you very much.

@rsinger86 rsinger86 merged commit d2b50e1 into rsinger86:master Oct 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants