Make PValue and TestStat behave like Reals #668

palday · 2021-03-12T13:06:36Z

Add comparison operator methods for PValue and TestStat
Add tests for show(::TestStat)
Make PValue a subtype of Real

nalimilan · 2021-03-12T13:52:37Z

src/statmodels.jl

@@ -460,6 +460,14 @@ struct PValue
 end
 PValue(p::PValue) = p

+float(p::PValue) = float(p.v)
+
+for op in [Symbol("=="), :≈, :<, :≤, :>, :≥]


Suggested change

-for op in [Symbol("=="), :≈, :<, :≤, :>, :≥]
+for op in [:(==), :≈, :<, :≤, :>, :≥]
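
For readers who have not seen the pattern: a loop like this typically feeds `@eval` to generate one forwarding method per operator. The following is a minimal sketch under that assumption. `SketchP` is a hypothetical stand-in rather than the StatsBase `PValue`, the operator list is shortened, and the method bodies are illustrative rather than the PR's actual code.

```julia
# Hypothetical stand-in for a p-value wrapper; not the StatsBase definition.
struct SketchP <: Real
    v::Float64
end

# Both spellings discussed in the suggestion name the same symbol.
@assert :(==) === Symbol("==")

# Generate forwarding methods for each operator (shortened list for this sketch).
for op in [:(==), :<, :≤]
    @eval begin
        Base.$op(x::SketchP, y::Real) = $op(x.v, y)
        Base.$op(x::Real, y::SketchP) = $op(x, y.v)
        # Disambiguates the two methods above when both sides are wrapped.
        Base.$op(x::SketchP, y::SketchP) = $op(x.v, y.v)
    end
end

SketchP(0.03) < 0.05           # true
0.5 == SketchP(0.5)            # true
SketchP(0.2) ≤ SketchP(0.3)    # true
```

The third method matters because the wrapper is itself a `Real`, so a wrapped-vs-wrapped call would otherwise match both of the mixed methods equally well.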

Maybe also define isequal and isless?

Actually I wonder whether it would be simpler to define promote_type(::Type{PValue{S}}, y::T) where {S, T<:Real} = promote_type(S, T), making S a type parameter with v::S. Then I think we'll get all operations on numbers defined for free.
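
A rough sketch of the promotion idea, with assumptions flagged: `PromoP` is a hypothetical stand-in for a parametric `PValue{S}`, and the rule is written as `promote_rule` on two types because that is the hook Julia documents for extending promotion (`promote_type` consults it). The `Float64` constructor is also an addition of this sketch, since promotion has to convert the wrapper to the plain number type.

```julia
# Hypothetical parametric wrapper; not the StatsBase definition.
struct PromoP{S<:Real} <: Real
    v::S
end

# Mixing a wrapped value with a plain Real promotes to the plain number type.
Base.promote_rule(::Type{PromoP{S}}, ::Type{T}) where {S<:Real, T<:Real} =
    promote_type(S, T)

# Promotion calls convert, which needs a constructor from the wrapper.
# Only Float64 is covered here to keep the sketch short.
Base.Float64(x::PromoP) = Float64(x.v)

promote_type(PromoP{Float64}, Float64)   # Float64
PromoP(0.03) < 0.05                      # true, via the generic Real fallback
PromoP(0.5) + 1.0                        # 1.5
eltype([PromoP(0.5), 1.0])               # Float64
```

With this in place `[pval, 1.0]` becomes a `Vector{Float64}`, which is the behavior change discussed as breaking further down. Wrapper-vs-wrapper operations are not covered by the rule, since promotion leaves two values of the same type unchanged, so those would presumably still need their own methods.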

You get isequal and isless for free with the symbolic forms (and I use isequal and isless in the tests).

I was also thinking about whether promote_type might be the easier route. I'll try that later. :)

Ah yes so isless falls back to <, but it's not correct for NaN, right?
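
To make the NaN concern concrete, here is a small sketch with two hypothetical wrappers (neither is the StatsBase type). `FallbackOnly` defines only `<` and `==`, so `isless` and `isequal` reach it through Base's generic fallbacks, while `Forwarding` forwards `isless` and `isequal` to the wrapped float explicitly.

```julia
# Hypothetical wrapper that relies on Base's fallbacks for isless/isequal.
struct FallbackOnly <: Real
    v::Float64
end
Base.:<(x::FallbackOnly, y::FallbackOnly) = x.v < y.v
Base.:(==)(x::FallbackOnly, y::FallbackOnly) = x.v == y.v

# Hypothetical wrapper that forwards isless/isequal explicitly.
struct Forwarding <: Real
    v::Float64
end
Base.:<(x::Forwarding, y::Forwarding) = x.v < y.v
Base.:(==)(x::Forwarding, y::Forwarding) = x.v == y.v
Base.isless(x::Forwarding, y::Forwarding) = isless(x.v, y.v)
Base.isequal(x::Forwarding, y::Forwarding) = isequal(x.v, y.v)

# Plain floats: isless orders NaN last and isequal treats NaN as equal to itself.
isless(1.0, NaN), 1.0 < NaN        # (true, false)
isequal(NaN, NaN), NaN == NaN      # (true, false)

# The fallbacks go through < and ==, so the NaN handling is lost.
isless(FallbackOnly(1.0), FallbackOnly(NaN))    # false
isequal(FallbackOnly(NaN), FallbackOnly(NaN))   # false

# Explicit forwarding matches the plain-float behavior.
isless(Forwarding(1.0), Forwarding(NaN))        # true
isequal(Forwarding(NaN), Forwarding(NaN))       # true
```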

Uh.... good question

Oh wait, I know why I didn't do promote_type: that would be technically breaking, both for PValue and TestStat. We're pre 1.0 so another release isn't a big deal. But that might be something to get one more opinion on (e.g. @ararslan).

(I also want PValue and TestStat to have a consistent interface between themselves since they're both essentially just pretty printers for a semantically annotated Real.)

You mean it would be breaking if you do something like [pval, 1.0]? Yes that would be slightly breaking so that would probably warrant tagging a minor release.

Yeah, and adding a type parameter to a struct also breaks (de)serialization.

Maybe let's keep your current approach then, and only use promote_type in another PR that would be merged only once we have other breaking changes to tag.

As soon as this one (and CoefTable show method) lands, I'll open that PR 😄

src/statmodels.jl

Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Alex Arslan <ararslan@comcast.net>

nalimilan · 2021-03-15T13:03:44Z

Damn, this is really tricky. Since we now define isequal, it would be good to check that hash is also consistent. Otherwise putting PValue objects in a dict will give incorrect results.

AFAICT that's the case thanks to the hash(x::Real, h::UInt) fallback, but the only way to be sure is to test it. Defining hash(x::PValue, h::UInt) = hash(x.v, h) would probably be more efficient.
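
A minimal sketch of the kind of consistency check meant here, assuming a hypothetical wrapper `HashedP` with a field `v` (it is not the StatsBase type): values that are `isequal` should hash identically, and dictionary lookups should then find them.

```julia
# Hypothetical wrapper; not the StatsBase definition.
struct HashedP <: Real
    v::Float64
end

Base.:(==)(x::HashedP, y::HashedP) = x.v == y.v
Base.isequal(x::HashedP, y::HashedP) = isequal(x.v, y.v)

# Forward hashing to the wrapped value so that it agrees with isequal.
Base.hash(x::HashedP, h::UInt) = hash(x.v, h)

d = Dict(HashedP(0.5) => "kept")

hash(HashedP(0.5)) == hash(0.5)              # true
haskey(d, HashedP(0.5))                      # true
d[HashedP(0.5)]                              # "kept"
length(unique([HashedP(0.5), HashedP(0.5)])) # 1
```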

Out of curiosity, in what use case did you feel the need for this? :-) I realize that PValue and TestStat were originally essentially simple wrappers to customize printing internally which were not exposed to users, but we're now making them full-fledged types.

palday · 2021-03-15T13:17:18Z

> Damn, this is really tricky. Since we now define isequal, it would be good to check that hash is also consistent. Otherwise putting PValue objects in a dict will give incorrect results.
>
> AFAICT that's the case thanks to the hash(x::Real, h::UInt) fallback, but the only way to be sure is to test it. Defining hash(x::PValue, h::UInt) = hash(x.v, h) would probably be more efficient.

Hahahahah, I'll add this as well. And as soon as we get this and the show method tagged for a patch release, I'll open a PR for the promote-rule which will be breaking but just avoid so much of this hassle. 😄

> Out of curiosity, in what use case did you feel the need for this? :-) I realize that PValue and TestStat were originally essentially simple wrappers to customize printing internally which were not exposed to users, but we're now making them full-fledged types.

We had something at the day job where there was some conditional behavior based on a p-value threshold (I know, I don't like it either, but so much of the applied stats world still does that). In one instance, the p-values were already wrapped for pretty printing (and expensive to re-compute) and we had to add the filter at the very end. When doing that, I noticed that TestStat already subtyped Real, so I thought it made sense to just say "okay, these are Reals with pretty printing (and bounds checking for PValue), so let's make them behave that way". I guess I'm not the only person who will say "give me all the rows in this CoefTable where p < 0.05", so it will be nice to have comparison operators Just Work™.
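
The "rows where p < 0.05" use case could look roughly like the sketch below. Everything in it is an assumption for illustration: `RowP` is a hypothetical wrapper, and the rows are plain named tuples rather than an actual `CoefTable`.

```julia
# Hypothetical wrapper; not the StatsBase definition.
struct RowP <: Real
    v::Float64
end

# The one comparison this filter needs: wrapped value against a plain threshold.
Base.:<(x::RowP, y::Real) = x.v < y

# Made-up rows standing in for a coefficient table.
rows = [
    (term = "(Intercept)", p = RowP(0.001)),
    (term = "x1",          p = RowP(0.2)),
    (term = "x2",          p = RowP(0.04)),
]

# Keep the rows whose wrapped p-value is below the threshold.
significant = filter(row -> row.p < 0.05, rows)
[row.term for row in significant]   # ["(Intercept)", "x2"]
```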

nalimilan · 2021-03-15T14:15:05Z

> Hahahahah, I'll add this as well. And as soon as we get this and the show method tagged for a patch release, I'll open a PR for the promote-rule which will be breaking but just avoid so much of this hassle. 😄

Thanks!

> We had something at the day job where there was some conditional behavior based on a p-value threshold (I know, I don't like it either, but so much of the applied stats world still does that). In one instance, the p-values were already wrapped for pretty printing (and expensive to re-compute) and we had to add the filter at the very end. When doing that, I noticed that TestStat already subtyped Real, so I thought it made sense to just say "okay, these are Reals with pretty printing (and bounds checking for PValue), so let's make them behave that way". I guess I'm not the only person who will say "give me all the rows in this CoefTable where p < 0.05", so it will be nice to have comparison operators Just Work™.

OK. But do you mean you get PValue objects when extracting values from CoefTable objects? I thought it was used only internally for printing, but never returned to users.

palday · 2021-03-15T14:16:27Z

Uh.... I don't know. (We're not using CoefTable at that point in the actual example I was looking at.)

ararslan · 2021-03-15T16:35:32Z

...come to think of it, how did we end up with PValues? 🤔

palday · 2021-03-15T16:37:38Z

A custom show method working on a column table where we assumed that the element-level pretty printing was handled via show for that eltype.

And when this comparison wasn't implemented, everything broke.

palday added 5 commits March 12, 2021 14:03

517b278 make PValue and TestStat behave like numbers
de162ed add tests for TestStat show methods
e216699 make PValue real
f33dd5f fix CoefTable show method for PValue <: Real
1e5d001 add TestStat to imports

palday requested a review from nalimilan March 12, 2021 13:29

nalimilan reviewed Mar 12, 2021
