Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy initialization of axis default tick list #9727

Conversation

timhoffm
Copy link
Member

@timhoffm timhoffm commented Nov 9, 2017

PR Summary

Improves #6664 by lazily initializing the axis ticks.

Implementation details

Instead of initializing self.majorTicks and self.minorTicks to 1-element default lists, we use LazyDefaultTickList as a proxy, that instantiates the default list and replaces itself with it on access.

This is almost a drop-in replacement. The only thing I didn't get to work transparently is mimicking concatenation. Where self.majorTicks + self.minorTicks works for lists, it does not for the LazyDefaultTickList. There's one occurence of this in the code. It could be replaced by list(self.majorTicks) + list(self.minorTicks), but I chose to rewrite this using iterator

I assume that self.majorTicks and self.minorTicks are considered private (they do not show in the docs) and that they should only be modified using the methods of Axis. Therefore, I think it's ok that self.majorTicks + self.minorTicks does not work with LazyDefaultTickList.

Performace Gain

Instatiation of a plot with many axes has become faster by a factor of three.

before change:
grafik

after change:
grafik

As a nice side effect, the test suite runs 10% faster after the change.

Tests

Since this is an drop-in replacement, I don't think that it needs extra tests. I ran the test suite and saw no difference. Before and after the change I had 4 failing tests, so I suppose their failure is unrelated.

PR Checklist

  • Code is PEP 8 compliant

Other checklist items do not apply.

@dstansby dstansby added this to the v2.2 milestone Nov 9, 2017
QuLogic
QuLogic previously requested changes Nov 9, 2017
Copy link
Member

@QuLogic QuLogic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be targetting master; it's not a bug fix.

Also, it appears you may have forgotten to set your name in your git config.

self._lastNumMajorTicks = 1
self._lastNumMinorTicks = 1
if not isinstance(self.majorTicks, LazyDefaultTickList):
del self.majorTicks[:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need this any more since the list is being removed, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess you're right. I was wondering anyway, why the previous code bothered to del every item from the list and not just created a new list (self.majorTicks=[] would have been my expectation in the original code).

Also, can you tell if there's still a need for self._lastNumMajorTicks? I haven't looked into this so far. It feels like a helper state, that might not be necessary. If you don't know right now, I can check myself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it worked this way so that any use of self.majorTicks would point to the same list. It's interesting that all the tests still pass though, as clearly this new class does not make that same guarantee.

@jklymak
Copy link
Member

jklymak commented Nov 9, 2017

I'm not clear on why this code needs to be wrapped in a class. I'm sure its just my ignorance...

@QuLogic
Copy link
Member

QuLogic commented Nov 9, 2017

You mean why not start with an empty list? Because many things subtly break and it's a giant rabbit hole...

@jklymak
Copy link
Member

jklymak commented Nov 9, 2017

Could we at least add and __add__ attribute so it behaves a bit more like a boring list?

@timhoffm
Copy link
Member Author

@jklymak We want to have a list with always at least one element. But we don't want to actually create that element, unless we really use it. The creation of unused ticks is a significant overhead.

Of course, you can implement that logic without a class, but that would mean, that you'll have conditional logic everytime you use the list, i.e. more complex code and also the checks are still run later when the list is there for sure (probably not relevant performance-wise though).

The class allows to get both, the simple API of a plain list and the lazy initialization.

We could add __add__. But as far as I can see, there's a fundamental limitation. While LazyDefaultTickList() + LzyDefaultTickList() and LazyDefaultTickList + [] can be made to work, I don't think it's possible for [] + LazyDefaultTickList(), because that will invoke list.__add__ over which we have no control. Please correct me, if I'm wrong here.
Therefore I'm not sure that this is a good idea. Moreover, there's no code that uses this functionality. - Except the one occurence I changed, which IMHO is better off with chained iterators anyway. However, I'm open to discussion if __add__ should be added.

@timhoffm
Copy link
Member Author

Thank you all for your feedback so far. This code is intended as a working prototype to get the discussion started. There might still be room for some improvement. I will look into this further on the weekend.

@anntzer
Copy link
Contributor

anntzer commented Nov 10, 2017

If I understand it correctly, LazyDefaultTickList (which should be _LazyDefaultTickList to avoid creating a new public API) should never reach the user, because by the end of the Axes creation, we should have a real list, right? If we can ensure that, then we don't need to worry about the LazyDefaultTickList API.

I wonder if the following approach could be made to work instead: turn majorTicks and minorTicks into custom descriptors (ie, with an appropriately defined __get__), that behave as empty lists (or whatever appropriate) as long as the owner Axis has _in_init set (per #8262), but replace themselves on the instance by a real list (populated accordingly) as soon as they are accessed while _in_init is not set anymore.

@jklymak
Copy link
Member

jklymak commented Nov 10, 2017

Oh, I get it. Sorry for being slow.

OTOH, does this really speed anything up if you actually access the axes? I don't understand how this helps real-world cases, but again, I'm sure I'm just being slow.

EDIT: OK: I checked, and it indeed speeds things up. I'm still not at all clear why. If I do

x = [1., 2.]
y = [1., 2.]
start = time.time()
fig, axs = plt.subplots(10, 10)
axs = axs.flatten()
for ax in axs:
    ax.plot(x, y )
stop = time.time()
print(stop - start)

I get 2.89 s w/o this change and 0.95 with the change.

If I don't plot I get just a bit faster:
2.99s and 0.99s

Like its great this speeds things up so much, but why the heck is the initial tick instantiation so slow? The plotting just adds 0.1 s to the whole thing.

@tacaswell tacaswell changed the base branch from v2.1.x to master November 10, 2017 05:05
@tacaswell tacaswell changed the base branch from master to v2.1.x November 10, 2017 05:05
@tacaswell
Copy link
Member

This should go against master. Tried to re-target it and there are conflicts, I'll do the 2.1.x -> master merge and try again.

Unfortunately many parts of our API that should be private are not and users have found almost all of the API surface... This will just need an api_changes note with any mitigation steps (sounds like just calling list or using the get_* api?) that will work across recent Matplotlib versions.

That is a crazy speed up!

Thanks for working on this @timhoffm !

@jklymak
Copy link
Member

jklymak commented Nov 10, 2017

I think this is really clever and it is great that it speeds things up. OTOH I feel it is a complex solution to the fact that we can't find the recursion in reset_ticks in our already complex code.

@tacaswell tacaswell changed the base branch from v2.1.x to master November 10, 2017 15:19
@tacaswell
Copy link
Member

successfully re-targetted to master.

@efiring
Copy link
Member

efiring commented Nov 10, 2017 via email

@timhoffm timhoffm force-pushed the prevent-unnecessary-axis-ticks-creation branch from 5901e0e to 3288b01 Compare November 12, 2017 17:18
@timhoffm timhoffm force-pushed the prevent-unnecessary-axis-ticks-creation branch from 3288b01 to f494913 Compare November 12, 2017 21:11
@timhoffm
Copy link
Member Author

timhoffm commented Nov 12, 2017

I've followed the suggestion of @anntzer to make this a private class _LazyDefaultTickList.

@jklymak I've looked at __add__ and also other list-like behavior (subclassing Sequence, and MutableSequence). I would advise against it, because it can get a bit tricky. This is a private implementation detail and it's list-like enough for all current purposes. I we should need more later on, it's better to implement and test is when it's needed.

Also I've figured, that self._lastNumMajorTicks is actually not necessary. Its only use is in get_major_ticks(). AFAICS it's only used to iterate over the just appended ticks (please recheck this). The same is true for self._lastNumMinorTicks. Both attributes are removed in the additional commit.

@QuLogic Thanks for targetting master, and for remembering me about my git config.

From my point of view, this is all I want to do with this pull requests. So it's ready to merge. Comments are welcome.

@efiring
Copy link
Member

efiring commented Nov 13, 2017

As @tacaswell commented above, this needs an api_changes note because majorTicks etc. might accessed in user code--someone might be keeping a reference, assuming it is always pointing to the same object. I think it also deserves a whats_new entry.
@QuLogic, do you think that this lack of guaranteed unique identity for majorTicks is enough of a concern to delay merging?
I think you also pointed to the seemingly unnecessary del statements; but they could reduce memory leakage in the event that some user code is holding a reference to the lists. Should they stay, or go?

@timhoffm timhoffm force-pushed the prevent-unnecessary-axis-ticks-creation branch from b28ad0e to 77fa33f Compare November 13, 2017 21:35
@timhoffm
Copy link
Member Author

@efiring @tacaswell I've add the api_changes.

Speaking of api changes, should we make majorTicks and minorTicks explicitly private (_)? It's not documented in the official API doc. The recommended way is using the get_*_ticks() methods.

If you don't want to break this immediately, one could at least change it internally and add a property that restores the original behavior but issues a deprecation warning.

@efiring
Copy link
Member

efiring commented Jan 5, 2018

@QuLogic, what are you thoughts on this now?

can be called multiple times without the overhead of initialzing every
time.

https://github.com/matplotlib/matplotlib/issues/6664
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to consider that checking the git blame is sufficient to see where this comes from, and prefer not adding explicit links to issues (otherwise it's a bit endless). Not a huge deal of course.

@@ -1339,22 +1374,15 @@ def get_major_ticks(self, numticks=None):
numticks = len(self.get_major_locator()())
if len(self.majorTicks) < numticks:
# update the new tick label properties from the old
protoTick = self.majorTicks[0]
for i in range(numticks - len(self.majorTicks)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could even just replace the outer "if len(self.majorTicks) < numticks" by "while len() < numticks" and remove one indent level.

@QuLogic QuLogic dismissed their stale review January 8, 2018 09:57

gitconfig fixed

@timhoffm
Copy link
Member Author

Closed in favor of #10302 and #10303.

@timhoffm timhoffm closed this Jan 24, 2018
@timhoffm timhoffm deleted the prevent-unnecessary-axis-ticks-creation branch January 24, 2018 01:42
@QuLogic QuLogic modified the milestones: needs sorting, v2.2.0 Feb 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants