Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change to ValueSupport interface #951

Closed
wants to merge 41 commits into from
Closed

Change to ValueSupport interface #951

wants to merge 41 commits into from

Conversation

richardreeve
Copy link
Contributor

@richardreeve richardreeve commented Aug 4, 2019

This change to the interface removes Discrete and Continuous from the code as aliases for ContiguousSupport{Int} and ContinuousSupport{Float64} - introduced in #945 - from a suggestion by @matbesancon, and to be done after or instead of #945, which it currently includes. All of the Discrete[*]Distribution and Continuous[*]Distribution are removed from the code too, as there was concern expressed in #945 that I was introducing too many type aliases. This is done because most of the distributions that use these aren't actually constrained to have Int or Float64 as their `eltype, and there were problems with breaking some external packages.

This could be made breaking by removing Discrete and Continuous and the *Distribution aliases from src/deprecates.jl where they have been moved to, but this wouldn't affect the Distributions package as it no longer uses them at all (except in test/types.jl). Keeping them in might discourage people from fixing their code. Unfortunately I can't see how to deprecate them as they are used as const aliases for types, and typeof(DeprecatedThing) isn't a type...

The PR would still be mildly breaking even then, though, as ContinuousSupport becomes ContinuousSupport{T<:Number}, and this can't be avoided unless someone has a synonym for continuous. I'm not sure how many people use ContinuousSupport outside the package though - I can't find it on github in any julia code. (UncountableSupport{T} is an option, but at the moment ContinuousSupport requires support to be over a single continuous domain...)

Edit: I'd forgotten that I introduced ContinuousSupport in #945!

richardreeve and others added 18 commits July 31, 2019 00:34
…ypes for countable and continuous support respectively, and discontinuous distributions in general.
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
…amework.

Minor bugfix for MixtureModel with non-fully parameterised types and add testing.
Introduce Integer*Distribution, and parameterise Continuous*Distribution{T} for uncountable T.
Fix lots of bugs where distributions were incorrectly reporting their types.
…where the support is over contiguous integers (as most are).
Fix most countable distributions to use ContiguousSupport.
More ContiguousSupport fixes, and updating testing.
@codecov-io
Copy link

codecov-io commented Aug 22, 2019

Codecov Report

Merging #951 into master will increase coverage by 0.16%.
The diff coverage is 69.07%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #951      +/-   ##
==========================================
+ Coverage   78.05%   78.22%   +0.16%     
==========================================
  Files         112      112              
  Lines        5382     5410      +28     
==========================================
+ Hits         4201     4232      +31     
+ Misses       1181     1178       -3
Impacted Files Coverage Δ
src/multivariate/mvtdist.jl 59.78% <ø> (ø) ⬆️
src/matrix/matrixbeta.jl 97.56% <ø> (ø) ⬆️
src/univariate/continuous/triweight.jl 54.83% <ø> (ø) ⬆️
src/univariate/continuous/inversegamma.jl 95% <ø> (ø) ⬆️
src/univariate/continuous/betaprime.jl 94.28% <ø> (ø) ⬆️
src/univariate/continuous/kolmogorov.jl 20.98% <ø> (ø) ⬆️
src/univariate/continuous/vonmises.jl 86.66% <ø> (ø) ⬆️
src/samplers/exponential.jl 66.66% <ø> (ø) ⬆️
src/multivariate/mvnormal.jl 71.59% <ø> (ø) ⬆️
src/samplers/poisson.jl 90.9% <ø> (ø) ⬆️
... and 99 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c89d557...955c861. Read the comment docs.

@richardreeve richardreeve changed the title WIP: Breaking change to ValueSupport interface - DO NOT MERGE WIP: Possible breaking change to ValueSupport interface Aug 22, 2019
@richardreeve
Copy link
Contributor Author

richardreeve commented Aug 22, 2019

This PR is no longer breaking, because code has been added in to src/deprecates.jl that provides all of the missing types again. However the package itself no longer uses these "deprecated" const aliases.

@richardreeve richardreeve changed the title WIP: Possible breaking change to ValueSupport interface Change to ValueSupport interface Aug 22, 2019
@mschauer
Copy link
Member

Thank you, this is already nicer. This is also much closer in spirit to what I have in mind -- but again we do not need to be breaking: we can just avoid making the eltype of the distributions part of the ValueSupport parameter, c.f. #958. This is important. Needlessly enforcing type restrictions breaks automatic differentiation, static arrays, and others. Making them part of the abstract super type is the same kind of mistake. We have learned this in the past, see #957, #921, #943, #896 and the entire bunch of related issues/prs.

As I keep arguing, eltype is just sufficient to capture the sample type. No need to redo everything with Support{T} if Support and eltype are reach enough. I have not seen convincing arguments to the contrary. I am not against extending the system by Discrete, ContiguousIntegerSupport, Mixture or whatever is needed.

@richardreeve
Copy link
Contributor Author

Can I stress that this is not breaking, and also that it has been here for weeks - #945 was just the type hierarchy split out. I'll look at the other issues later when I have time.

@richardreeve
Copy link
Contributor Author

As far as I can see, your example problems above are cases where people have previously made exactly the class of errors I am proposing to eliminate (#896) - I have fixed 10s of these bugs in this PR - or are fixing times when people have got their types confused. I certainly don't see anything I'm doing as needlessly enforcing type restrictions - I don't believe I add any more restrictions, though it is possible I have inadvertently - I am just correctly parameterising on the types already used to avoid this class of error arising.

There's a lot more work to be done here incidentally. What I'd like to be able to do down the line is to seamlessly handle any non-Real type in Continuous distributions - at the moment the code will not handle correctly when x and x^2 do not have the same units - e.g. Unitful units, so that I can just say that a distribution has a mean of 1.0m and a variance of 2.0m^2. The problem is I believe this either needs something like computed field types, or traits, neither of which are currently in base Julia.

@matbesancon
Copy link
Member

@richardreeve can you get the last bit of coverage up? I'm not always a fan of it but it helps to keep healthy habits for such large code base

src/utils.jl Outdated
@@ -9,6 +9,38 @@ macro check_args(D, cond)
end
end

isuncountable(x) = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be for adding only one of isuncountable, iscountable, given that one is the negation of the other.
iscountablereal should also not be necessary, and can rely on iscountable and type dispatch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there's a problem there.

Nearly all distributions cannot currently handle not-Float64 eltypes, but will be able to handle <: Real eltypes with only a modest amount of work. However, it's much harder to get them to handle (un)countable non-real <: Number eltypes, such as Unitful types. Nonetheless, the aspiration should be to handle all of these types, so I want to be able to move easily from the iscountablereal(T) to iscountable(T) checks which have been created (@check_countable(), etc.).

Similarly, iscountable(T) is not just !isuncountable(T). Non-Number types are rejected by both for now, as they are well beyond the scope of the current series of PRs. I could call them iscountablenumbner() and isuncountablenumber() to make it clear that they are not inverses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS The other option I thought of (as you suggest) was to have type dispatch on the macro for the time being (which is the only user of isuncountablereal(), and this function is also not exported incidentally), but I assume that's still not possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've got rid of this conflict - I've added three level logic - missing - for the cases where we haven't yet resolved whether it's countable or uncountable, then the two are inverses of each other.

@richardreeve
Copy link
Contributor Author

The overall coverage does go up. The problem with the patch coverage is that there is code in the PR that isn’t used yet, because the actual distributions that use it are in the next PR! The easiest way of increasing coverage for that is to add in the new distributions :)

@matbesancon
Copy link
Member

@richardreeve can you fix the two conflicts? The PR that created them was tiny boilerplate, so it shouldn't be a big deal

@matbesancon
Copy link
Member

Nevermind, I fixed the two conflicts

const DiscreteMultivariateDistribution = Distribution{Multivariate, Discrete}
const ContinuousMultivariateDistribution = Distribution{Multivariate, Continuous}
const DiscreteMatrixDistribution = Distribution{Matrixvariate, Discrete}
const ContinuousMatrixDistribution = Distribution{Matrixvariate, Continuous}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Base.@deprecate_binding for these?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea thanks, it preserves compatibility but signals what is changing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I had no idea that existed - I see it isn't exported in fact, so I don't feel too bad. Anyway, I'll definitely do that.

Copy link
Contributor Author

@richardreeve richardreeve Sep 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that it doesn't handle parametric types, so I can't do it for all of the deprecated types:

const DiscreteDistribution{F<:VariateForm}   = Distribution{F,Discrete}
const ContinuousDistribution{F<:VariateForm} = Distribution{F,Continuous}

but I've covered most of them now. However, testing them is another matter:

julia> @test_deprecated DiscreteMatrixDistribution
WARNING: Distributions.DiscreteMatrixDistribution is deprecated, use Distribution{Matrixvariate, Discrete} instead.
  likely near REPL[6]:1
Log Test Failed at REPL[6]:1
  Expression: $(Expr(:escape, :DiscreteMatrixDistribution))
  Log Pattern: (:warn, r"deprecated"i, Ignored(), :depwarn) match_mode = :any
  Captured Logs:

ERROR: There was an error during testing

Is the message returned by @deprecate_binding not a deprecation warning?

@richardreeve
Copy link
Contributor Author

OK @matbesancon. We’re now ahead on all metrics and These issues are dealt with...

abstract type ValueSupport end
struct Discrete <: ValueSupport end
struct Continuous <: ValueSupport end
`S <: Support{T}` specifies the support of sample elements as T,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

support of a measure has a specific conflicting definition (https://en.wikipedia.org/wiki/Support_(measure_theory)) we might want to call this differently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What definition are we going to use. What is the support of a Bernoulli(1.0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go with T being the eltype of samples (see below), we should specify this here.

struct ContinuousSupport{N <: Number} <: Support{N} end
abstract type CountableSupport{C} <: Support{C} end
struct ContiguousSupport{C <: Integer} <: CountableSupport{C} end
struct UnionSupport{N1, N2,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document

src/common.jl Outdated

The default element type of a sample. This is the type of elements of the samples generated
by the `rand` method. However, one can provide an array of different element types to
store the samples using `rand!`.
"""
Base.eltype(s::Sampleable{F,Discrete}) where {F} = Int
Base.eltype(s::Sampleable{F,Continuous}) where {F} = Float64
Base.eltype(::Sampleable{F, <: Support{N}}) where {F, N} = N
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be the type of rand(d) or the eltype of rand(d)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below it is eltype of rand(d).

value_support(::Type{Distribution{VF,VS}}) where {VF<:VariateForm,VS<:ValueSupport} = VS
value_support(::Type{T}) where {T<:Distribution} = value_support(supertype(T))
variate_form(::Type{<:Sampleable{VF, <:Support}}) where {VF<:VariateForm} = VF
value_support(::Type{<:Sampleable{<:VariateForm,VS}}) where {VS<:Support} = VS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be called support now/later?

@@ -116,18 +114,18 @@ const DistributionType{D<:Distribution} = Type{D}
const IncompleteFormulation = Union{DistributionType,IncompleteDistribution}

"""
succprob(d::DiscreteUnivariateDistribution)
succprob(d::UnivariateDistribution{ContiguousSupport{T}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
succprob(d::UnivariateDistribution{ContiguousSupport{T}})
succprob(d::UnivariateDistribution{<:ContiguousSupport})

ccdf(d::DiscreteUnivariateDistribution, x::Integer) = 1.0 - cdf(d, x)
ccdf(d::DiscreteUnivariateDistribution, x::Real) = ccdf(d, floor(Int,x))
ccdf(d::UnivariateDistribution, x) = 1.0 - cdf(d, x)
ccdf(d::UnivariateDistribution{<:ContiguousSupport}, x::Integer) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method not needed?

logcdf(d::DiscreteUnivariateDistribution, x::Real) = logcdf(d, floor(Int,x))
logcdf(d::UnivariateDistribution, x) = log(cdf(d, x))
logcdf(d::UnivariateDistribution{<:ContiguousSupport}, x::Integer) =
log(cdf(d, x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method not needed?

@@ -439,10 +463,13 @@ invlogccdf(d::UnivariateDistribution, lp::Real) = quantile(d, -expm1(lp))

# gradlogpdf

gradlogpdf(d::ContinuousUnivariateDistribution, x::Real) = throw(MethodError(gradlogpdf, (d, x)))
gradlogpdf(d::UnivariateDistribution{<:ContinuousSupport}, x::Real) =
throw(MethodError(gradlogpdf, (d, x)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method needed?

@@ -522,7 +551,7 @@ loglikelihood(d::UnivariateDistribution, X::AbstractArray) = sum(x -> logpdf(d,

macro _delegate_statsfuns(D, fpre, psyms...)
dt = eval(D)
T = dt <: DiscreteUnivariateDistribution ? :Int : :Real
T = eltype(dt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before T was a symbol, does this work as well?

@@ -9,6 +9,34 @@ macro check_args(D, cond)
end
end

isuncountable(x) = missing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc string?

@matbesancon
Copy link
Member

@richardreeve I updated the branch to fix conflicts, can you check @mschauer's comments above and make the corresponding modifications?

@richardreeve
Copy link
Contributor Author

I'll try to get it done as soon as possible, but this is my busiest time of the year for teaching - the summer was when I had time to work on this unfortunately.

end


## a type to indicate zero vector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardreeve I messed up my conflict merge, could you remove things related to ZeroVector in this file, it was deleted in:
#1020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants