Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make PValue and TestStat behave like Reals #668

Merged
merged 9 commits into from
Mar 13, 2021
Merged

Make PValue and TestStat behave like Reals #668

merged 9 commits into from
Mar 13, 2021

Conversation

palday
Copy link
Member

@palday palday commented Mar 12, 2021

  • Add comparison operator methods for PValue and TestStat
  • Add tests for show(::TestStat)
  • Make PValue a subtype of Real

@palday palday requested a review from nalimilan March 12, 2021 13:29
@@ -460,6 +460,14 @@ struct PValue
end
PValue(p::PValue) = p

float(p::PValue) = float(p.v)

for op in [Symbol("=="), :≈, :<, :≤, :>, :≥]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for op in [Symbol("=="), :, :<, :, :>, :]
for op in [:(==), :, :<, :, :>, :]

Maybe also define isequal and isless?

Actually I wonder whether it would be simpler to define promote_type(::Type{PValue{S}}, y::T) where {S, T<:Real} = promote_type(S, T), making S a type parameter with v::S. Then I think we'll get all operations on numbers defined for free.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You get isequal and isless for free with the symbolic forms (and I use isequal and isless) in the tests.

I was also thinking about whether promote_type might be the easier route. I'll try that later. :)

Copy link
Member

@nalimilan nalimilan Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes so isless falls back to <, but it's not correct for NaN, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh.... good question

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, I know why I didn't do promote_type: that would be technically breaking, both for PValue and TestStat. We're pre 1.0 so another release isn't a big deal. But that might be something to get one more opinion on (e.g. @ararslan).

(I also want PValue and TestStat to have a consistent interface between themselves since they're both essentially just pretty printers for a semantically annotated Real.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean it would be breaking if you do something like [pval, 1.0]? Yes that would be slightly breaking so that would probably warrant tagging a minor release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and adding a type parameter to a struct also breaks (de)serialization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's keep your current approach then, and only use promote_type in another PR that would be merged only once we have other breaking changes to tag.

Copy link
Member Author

@palday palday Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as this one (and CoefTable show method) lands, I'll open that PR 😄

src/statmodels.jl Outdated Show resolved Hide resolved
src/statmodels.jl Outdated Show resolved Hide resolved
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Alex Arslan <ararslan@comcast.net>
src/statmodels.jl Outdated Show resolved Hide resolved
src/statmodels.jl Outdated Show resolved Hide resolved
@palday palday requested a review from ararslan March 12, 2021 19:55
@palday palday merged commit 20411f4 into master Mar 13, 2021
@palday palday deleted the pa/p-is-real branch March 13, 2021 00:22
@nalimilan
Copy link
Member

Damn, this is really tricky. Since we now define isequal, it would be good to check that hash is also consistent. Otherwise putting PValue objects in a dict will give incorrect results.

AFAICT that's the case thanks to the hash(x::Real, h::UInt) fallback, but the only way to be sure is to test it. Defining hash(x::PValue, h::UInt) = hash(x.val, h) would probably be more efficient.

Out of curiosity, in what use case did you feel the need to this? :-) I realize that PValue and TestValue were originally essentially simple wrappers to customize printing internally which were not exposed to users, but we're now making them full-fledged types.

@palday
Copy link
Member Author

palday commented Mar 15, 2021

Damn, this is really tricky. Since we now define isequal, it would be good to check that hash is also consistent. Otherwise putting PValue objects in a dict will give incorrect results.

AFAICT that's the case thanks to the hash(x::Real, h::UInt) fallback, but the only way to be sure is to test it. Defining hash(x::PValue, h::UInt) = hash(x.val, h) would probably be more efficient.

Hahahahah, I'll add this as well. And as soon as we get this and the show method tagged for a patch release, I'll open a PR for the promote-rule which will be breaking but just avoid so much of this hassle. 😄

Out of curiosity, in what use case did you feel the need to this? :-) I realize that PValue and TestValue were originally essentially simple wrappers to customize printing internally which were not exposed to users, but we're now making them full-fledged types.

We had something at the dayjob where there was some conditional behavior based on a p-value threshold (I know, I don't like it either, but so much of the applied stats world still does that). In one instance, the p-values were already wrapped for pretty printing (and expensive to re-compute) and we had to add the filter at the very end. When doing that, I noticed that TestStat already subtyped Real, so I thought it made sense to just say "okay, these are Reals with pretty printing (and bounds checking for PValue), so let's make them behave that way". I guess I'm not the only person who will say "give me all the rows in this CoefTable where p < 0.05", so it will be nice to have comparison operators Just Work™.

@nalimilan
Copy link
Member

Hahahahah, I'll add this as well. And as soon as we get this and the show method tagged for a patch release, I'll open a PR for the promote-rule which will be breaking but just avoid so much of this hassle. smile

Thanks!

We had something at the dayjob where there was some conditional behavior based on a p-value threshold (I know, I don't like it either, but so much of the applied stats world still does that). In one instance, the p-values were already wrapped for pretty printing (and expensive to re-compute) and we had to add the filter at the very end. When doing that, I noticed that TestStat already subtyped Real, so I thought it made sense to just say "okay, these are Reals with pretty printing (and bounds checking for PValue), so let's make them behave that way". I guess I'm not the only person who will say "give me all the rows in this CoefTable where p < 0.05", so it will be nice to have comparison operators Just Work™.

OK. But do you mean you get PValue objects when extracting values from CoefTable objects? I thought it was used only internally for printing, but never returned to users.

@palday
Copy link
Member Author

palday commented Mar 15, 2021

Uh.... I don't know. (We're not using CoefTable at that point in the actual example I was looking at.)

@ararslan
Copy link
Member

...come to think of it, how did we end up with PValues? 🤔

@palday
Copy link
Member Author

palday commented Mar 15, 2021

A custom show method working on a column table where we assumed that the element-level pretty printing was handled via show for that eltype.

And when this comparison wasn't implemented, everything broke.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants