Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure Distribution can be used in Latitude and Longitude #14421

Merged

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Feb 20, 2023

This pull request is to address the fact that one cannot initialize a Longitude with a Distribution, which resulted from an error being raised on comparing a Distribution with a non-void array -- the solution was to override __eq__ and __ne__ (other comparisons are not affected).

EDIT: Furthermore, for wrapping, __getitem__ and __setitem__ needed to be changed to ensure that things like dist[dist < 0] += 360 work. And in Latitude._validate_angles, one needs to be sure that no assumption of a float dtype is made.

This part of a hopefully larger attempt to ensure we can use Distribution also inside SkyCoord, etc.

Note: since this is a bug, backporting to 5.2. But perhaps note to 5.0, since really this late it does not seem helpful to fix LTS for this any more. EDIT: while this is a bug, the change to coordinates makes me feel that it is better to treat it as an "enhancement" and not back-port.

@mhvk mhvk added Bug uncertainty 💤 backport-v5.2.x on-merge: backport to v5.2.x labels Feb 20, 2023
@mhvk mhvk added this to the v5.2.2 milestone Feb 20, 2023
@mhvk mhvk requested a review from eteq February 20, 2023 02:41
@github-actions
Copy link

github-actions bot commented Feb 20, 2023

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • Is a milestone set? Milestone must be set but we cannot check for it on Actions; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@mhvk mhvk changed the title Ensure Distribution can be compared with non-distribution Ensure Distribution can be used in Latitude and Longitude Feb 20, 2023
@mhvk mhvk removed the 💤 backport-v5.2.x on-merge: backport to v5.2.x label Feb 20, 2023
@mhvk mhvk modified the milestones: v5.2.2, v5.3 Feb 20, 2023
@mhvk mhvk force-pushed the distribution-comparison-and-interaction-with-longitude branch 3 times, most recently from e2ce4fe to 5631ab9 Compare February 20, 2023 14:56
@eerovaher
Copy link
Member

It would certainly be good if astropy.coordinates worked better with Distribution, but any implementation must deal with the edge case where the Distribution contains values both above and below the wrapping angle. With 5631ab9 the following occurs:

>>> from astropy import units as u
>>> from astropy.coordinates import Longitude
>>> from astropy.uncertainty import Distribution
>>> d = Distribution([359, 361] * u.deg)
>>> d.pdf_std()
<Quantity 1. deg>
>>> Longitude(d).pdf_std()
<Longitude 179. deg>

@mhvk
Copy link
Contributor Author

mhvk commented Feb 20, 2023

Hmm, yes, that is unavoidable right now... Of course, the problem is that a standard deviation of a Longitude (and really any Angle) just generally doesn't make sense - one should use the circular version. The same holds for all the other .pdf_*.

Maybe we should have an option to use circular for them? And default to those for Angle subclasses? That is quite easy to implement... Let me ping @eteq as well.

p.s. Not sure we have to solve the above for this PR... @eerovaher - what do you think, shall I just open another issue?

@nstarman
Copy link
Member

Maybe we should have an option to use circular for them? And default to those for Angle subclasses? That is quite easy to implement... Let me ping @eteq as well.

That seems the right solution.
Are you thinking a kwarg to switch between linear and circular std?

@eerovaher
Copy link
Member

Maybe we should have an option to use circular for them? And default to those for Angle subclasses?

I agree that Angle and its subclasses should use the circular versions, but I'm not sure giving the option not to use the circular versions would be valuable.

p.s. Not sure we have to solve the above for this PR... @eerovaher - what do you think, shall I just open another issue?

I'd rather have this sorted out in this PR, especially if the implementation is simple. The ability to create a Longitude from a Distribution is not very useful if the resulting object doesn't behave properly.

@mhvk
Copy link
Contributor Author

mhvk commented Feb 20, 2023

I was thinking of a kwarg with a default that depends on either the unit or the class.

@eerovaher - I don't quite agree that LongitudeDistribution as is would not be useful, as it really is meant to be a step to allow Representation and SkyCoord to work - for which pdf_mean can be quite useful. I worry that here we are going to be stuck on implementation details and naming... (and I do think it needs a bit of thought on how exactly to implement it).

@mhvk
Copy link
Contributor Author

mhvk commented Feb 20, 2023

Or, rephrasing, I'd rather have PRs be small and to the point. This one fixes real bugs -- of not being able to compare with == and != and not being able to do in-place operations -- which were exposed by trying to make LongitudeDistribution and LatitudeDistribution, both of which should have worked already.

In contract, having circular pdf_* is a new feature -- one that I agree is necessary, but which feels like a separate issue.

@mhvk
Copy link
Contributor Author

mhvk commented Feb 21, 2023

Hmm, I thought we could simply use routines in astropy.stats.circstats, but looking closer I see that while it has circmean, circstd and circvar, it has no median or [s]mad or implementations for percentiles and histograms. I guess all of those involve a choice of how to sort so are less obvious.

@nstarman
Copy link
Member

https://insidebigdata.com/2021/02/12/circular-statistics-in-python-an-intuitive-intro/
Has a suggestion for the circular median.

@mhvk
Copy link
Contributor Author

mhvk commented Feb 21, 2023

Nice! I opened a feature request for more circular statistics, as well as discussion on how to implement it for Distribution -- see #14427.

Copy link
Member

@eerovaher eerovaher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's proceed in small steps then and limit the scope of the pull request to what has already been implemented.

I do have a question about test setup. What is the advantage of implementing classes with a setup_class() method over using pytest fixtures or perhaps global constants? There is an obvious downside to implementing test classes - they use up a level of indentation.

Comment on lines 3 to 4
``dist[dist<0] *= -1`` work. This also ensures that the ``Angle`` subclasses
``Longitude`` and ``Latitude`` can be initialized with a ``Distribution``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to advertise initializing Angle subclasses with Distribution quite yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, you're right - let's announce that properly when the statistics make sense!

@mhvk mhvk force-pushed the distribution-comparison-and-interaction-with-longitude branch from 5631ab9 to cac6be5 Compare February 21, 2023 17:29
@mhvk
Copy link
Contributor Author

mhvk commented Feb 21, 2023

On setup_class, it is mostly what I'm used too - I do like combining tests that (more or less) logically belong together in classes, and find the pytest fixtures very hard to read because the fixtures feel so unpythonic -- only with pytest.mark.parametrize does it make obvious sense to me what is happening...Also, with fixtures, initialization code tends to move far away from where it is used (like in Table), so I find I never fully understand what exactly is being tested.

Anyway, bottom line: subjective preference!

@mhvk mhvk force-pushed the distribution-comparison-and-interaction-with-longitude branch from cac6be5 to 0b6e26a Compare February 22, 2023 21:22
Copy link
Member

@eerovaher eerovaher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can approve the changes to coordinates, but I am not a maintainer of uncertainty, so I'll avoid merging.

@mhvk
Copy link
Contributor Author

mhvk commented Feb 27, 2023

@eteq - do you have time to look at this PR to ensure Distribution can be used more widely? I think it is relatively straightforward - the bigger question arguably is how to deal with it inside representations, etc.

@mhvk
Copy link
Contributor Author

mhvk commented Mar 29, 2023

@eteq - ping! This PR touches the Distribution class quite a bit, so review by you would be great!

@mhvk
Copy link
Contributor Author

mhvk commented Apr 7, 2023

Since feature-freeze is coming up soon: it would be good if @eteq could review this PR, but failing that I think it should be merged; it does fix some actual bugs.

@pllim
Copy link
Member

pllim commented Apr 18, 2023

There is already approval and no response from @eteq in the past 2 weeks, so I am merging. Thanks!

@pllim pllim merged commit cb4a986 into astropy:main Apr 18, 2023
@mhvk mhvk deleted the distribution-comparison-and-interaction-with-longitude branch April 18, 2023 18:29
Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, should have followed-up here yesterday! @mhvk and I discussed offline I'd prioritize this one so I had a pending review I hadn't finished when this got merged. But I'll just go ahead and make a PR that implements my suggestions here later today and @mhvk can review it there.

invalid_angles = np.any(angles.value < -limit) or np.any(angles.value > limit)
# Ensure ndim>=1 so that comparison is done using the angle dtype.
# Otherwise, e.g., np.array(np.pi/2, 'f4') > np.pi/2 will yield True.
# (This feels like a bug -- see https://github.com/numpy/numpy/issues/23247)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like since this was written there might be a fix in a future numpy for this - might want to add a todo note aobut that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -311,6 +311,9 @@ def view(self, dtype=None, type=None):

# Override __getitem__ so that 'samples' is returned as the sample class.
def __getitem__(self, item):
if isinstance(item, Distribution):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thing it might be better to use duck-typing here - will make a follow-on PR with a proposal

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after looking more closely I realize this is a bigger topic that's not worth worrying about right now, because it would be more consistent to do this with all the isinstance calls in this sub-package, which might have unintended consequences...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. While usually a big fan of duck-typing, I think in this instance, isinstance may actually be OK; e.g., I don't think we can count on other implementations to use .distribution as an attribute... But something to think about nevertheless...

@@ -320,6 +323,30 @@ def __getitem__(self, item):
else:
return result

def __setitem__(self, item, value):
if isinstance(item, Distribution):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same here as above)

cls.q = cls.a << u.deg
cls.dq = Distribution(cls.q)

@pytest.mark.parametrize("angle_cls", [Angle, Longitude, Latitude])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe should add EarthLocation to this list? Although that might be a whole can of worms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EarthLocation will be interesting, since it is already a structured type. That will really need some thought, since presumably each sample then has x,y,z inside, but we still have to be sure el_dist['x'] works... In other words, I'm pretty sure this will not yet work!

@pllim
Copy link
Member

pllim commented Apr 19, 2023

Ooops, my bad on merging too early. I apologize for the extra work.

@mhvk
Copy link
Contributor Author

mhvk commented Apr 19, 2023

No worries, you couldn't know I had just decided to e-mail Erik directly...

@pllim
Copy link
Member

pllim commented Apr 19, 2023

Not yet... 👹

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants