Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache frame representation info; doubles speed of coordinate generation. #5952

Merged
merged 4 commits into from Oct 28, 2018

Conversation

dmopalmer
Copy link
Contributor

Here's another caching fix. This doubles the speed of SkyCoordinate generation.

%timeit SkyCoord(5,5,unit='deg')
# Before (current master branch)
1000 loops, best of 3: 1.41 ms per loop
# After
1000 loops, best of 3: 689 µs per loop

It also speeds up #3323 Slow iteration over a SkyCoord Object from 2.57 s (already an improvement over the original 19 s) to 1.24 s.

@pllim
Copy link
Member

pllim commented Apr 10, 2017

Your changes broke the tests.

@dmopalmer
Copy link
Contributor Author

So it does.

Do you have any recommendations of where to invalidate the cache when a new subclass of a frame is created? I thought that would be handled by the Metaclass.

       class FakeICRS(ICRS):
            frame_specific_representation_info = {
                'spherical': {'names': ('ra', 'dec', 'distance'),
                              'units': (None, None, None)},
                'unitspherical': {'names': ('ra', 'dec'),
                                  'units': (None, None)}
            }
    
            frame_specific_representation_info = {
                'spherical': [RepresentationMapping('lon', 'ra', u.hourangle),
                              RepresentationMapping('lat', 'dec', None),
                              RepresentationMapping('distance', 'distance')]  # should fall back to default of None unit
            }
            frame_specific_representation_info['unitspherical'] = \
                frame_specific_representation_info['spherical']

@pllim
Copy link
Member

pllim commented Apr 10, 2017

Maybe @eteq knows?

@@ -789,6 +790,11 @@ def _get_representation_info(cls):
# This exists as a class method only to support handling frame inputs
Copy link
Contributor

@mhvk mhvk Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing this, I wondered generally whether this still needed to be a class method. A quick test showed no failures at all when one just moves this to the representation_info property. Hence, it feels like this should be separated correctly in parts that are class and self-specific. (I think the conclusion would be to keep it on the class, but do check.)

if cls._representation_info_cachetick == MetaBaseRepresentation.REPRESENTATION_CLASSES_CACHE_TICK:
return cls._repr_attrs
except:
pass

repr_attrs = {}
Copy link
Contributor

@mhvk mhvk Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This first loop over REPRESENTATION_CLASSES would seem to have no business being on the class at all, as it only deals with representations. Might it make sense to move this to representation.py, perhaps adding a new dict with this information, and updating it on the metaclass? (I should add that I really hope to remove this whole default units stuff from the representations, as it is has no functional meaning there... But that is for another PR...)

mhvk
mhvk previously requested changes Apr 11, 2017
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that you found another big slow-down!

My main comment is the second one: I think it would be good to split off the part that only deals with representations, and move it to the representation base or meta class.

I also think it would be wise to check where representation_info is used -- it seems to be called as self.representation_info[self.reprensentation] more than once, and it might make sense to cache that separately (cannot change that without changing representation, after all, so easy to reset in representation.setter)

Mostly, though, I would try to avoid to add yet another layer of indirection on what is already a steaming pile... Rather bad practice really to have such complicated calculations in properties; maybe we can lessen this somewhat?

@dmopalmer
Copy link
Contributor Author

I don't really understand what it is used for and why. (All I know is that a complex nested loop is called 22 times every time you want to represent the most basic element of positional astronomy.)

Moving it to somewhere else would require a deeper understanding of the code, its structure, and its history and customs than I have time to acquire at the moment.

I have no objection to someone else providing a better (and working) speed-up.

@mhvk
Copy link
Contributor

mhvk commented Apr 12, 2017

@dmopalmer - fair enough; since I also do not understand all that well why this code is needed or called so often, maybe best is to ping someone else... @eteq: Could you have a look?

@astropy-bot
Copy link

astropy-bot bot commented Sep 27, 2017

Hi humans 👋 - this pull request hasn't had any new commits for approximately 5 months. I plan to close this in a month if the pull request doesn't have any new commits by then.

In lieu of a stalled pull request, please close this and open an issue instead to revisit in the future. Maintainers may also choose to add keep-open label to keep this PR open but it is discouraged unless absolutely necessary.

If you believe I commented on this issue incorrectly, please report this here.

@mhvk
Copy link
Contributor

mhvk commented Sep 27, 2017

Ideally, this is still tackled, but in pieces.

@astropy-bot
Copy link

astropy-bot bot commented Nov 21, 2017

⏰ Time's up! ⏰

I'm going to close this pull request as per my previous message. If you think what is being added/fixed here is still important, please remember to open an issue to keep track of it. Thanks!

If this is the first time I am commenting on this issue, or if you believe I closed this issue incorrectly, please report this here.

@astropy-bot astropy-bot bot closed this Nov 21, 2017
@eteq eteq reopened this Nov 21, 2017
@eteq
Copy link
Member

eteq commented Nov 21, 2017

I'm reopening this but milestoning it for 3.1 - slated for that is a major re-write of the SkyCoord initializer in the name of performance. It may be this becomes irrelevant at that time, in which case we should close it.

@eteq eteq added this to the v3.1 milestone Nov 21, 2017
@astropy-bot
Copy link

astropy-bot bot commented Nov 21, 2017

Hi there @dmopalmer 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

I noticed the following issue with this pull request:

  • Changelog entry not present (or pull request number missing) and neither the Affects-dev nor the no-changelog-entry-needed label are set

Would it be possible to fix this? Thanks!

If there are any issues with this message, please report them here.

@eteq
Copy link
Member

eteq commented Nov 21, 2017

To answer the underlying question of why it's called so often: getting the representation info is needed to do all kinds of things like asking "what are the names of the attributes used on this frame". That's done in the SkyCoord initializer because it does a lot of "taking apart" an input object and then "putting it back together again". One of the goals in the v3.1 optimization would be to undo that, possibly making this cache less/unnecessary.

But more generally, the problem with caching this in the manner this PR does is that in theory it can change. I'm not sure if anyone actually uses this, but it might happen in some intermediate manner during class creation? My suspicion is that this is the origin of the failure. But some slightly cleverer caching should still be possible...

@mhvk
Copy link
Contributor

mhvk commented Dec 7, 2017

See #6941 for a PR that improves item access by a factor 50. There is definitely low-hanging fruit that should get picked before we add more layers of caching (though this particular layer does still seem reasonable).

@eteq eteq dismissed mhvk’s stale review October 28, 2018 03:50

overtaken by events (see comment)

@eteq
Copy link
Member

eteq commented Oct 28, 2018

@dmopalmer - while this issue sat for a long while, in #7949 I finally cracked the case and got this caching working. While it was not directly based on this PR (due to a lot of changes required for velocities that made it hard to do so), it was nonetheless inspired by your attempt here. So I just now pushed up a "no-op" merge commit to this branch, which will enable us to merge this PR to record the commits in the git history so that we can be sure you get credit for the effort here that led to #7949 .

Thanks for the work here!

@eteq eteq merged commit 7b3f86c into astropy:master Oct 28, 2018
@dmopalmer dmopalmer deleted the baseframe-cache branch July 8, 2019 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

6 participants