
Some comments on the route to Distributions 1.0 #1109

@azev77

  • Make it easy to locate all relevant distributions.
    E.g. suppose I want all continuous univariate distributions with support [0, +∞).
    MLJ.jl makes this easy:
    using MLJ; X, y = @load_boston;
    m = models(): creates a vector of all 132 models.
    m = models(matching(X, y)): a vector of the 53 models that work with the data.
    m = models(matching(X, y), x -> x.prediction_type == :deterministic): a vector of 50 models.
    Distributions.jl doesn't currently have an equivalent.
    Distributions.continuous_distributions comes close, but its entries don't match the type names: arcsine should be Arcsine, etc.
    The following gets us part of the way there:
    filter(!isabstracttype, subtypes(Distribution))
    filter(!isabstracttype, subtypes(UnivariateDistribution))
    filter(!isabstracttype, subtypes(MultivariateDistribution))
    filter(!isabstracttype, subtypes(MatrixDistribution))
    filter(!isabstracttype, subtypes(ContinuousDistribution))
    filter(!isabstracttype, subtypes(ContinuousMultivariateDistribution))
    To find distributions whose support matches the data, we discussed: all(insupport.(dist, data))
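
A rough sketch of how such a search could look today, assuming it is acceptable to build each distribution with its zero-argument constructor and skip the ones that have none. The helpers `default_instance`, `nonnegative_support` and `fits_data` are hypothetical names for illustration, not part of Distributions.jl:

```julia
using Distributions
using InteractiveUtils: subtypes

# Concrete continuous univariate types that are direct subtypes
# (types nested under an intermediate abstract type are not included here).
candidates = filter(!isabstracttype, subtypes(ContinuousUnivariateDistribution))

# Hypothetical helper: try the zero-argument constructor, return `nothing` if it fails.
function default_instance(T)
    try
        return T()
    catch
        return nothing
    end
end

# Hypothetical helper: keep the types whose default instance has support [0, Inf).
function nonnegative_support(types)
    keep = Any[]
    for T in types
        d = default_instance(T)
        d === nothing && continue
        try
            if minimum(d) == 0 && maximum(d) == Inf
                push!(keep, T)
            end
        catch
            # minimum/maximum is not defined for this distribution; skip it
        end
    end
    return keep
end

# Hypothetical helper: does every observation lie in the support of `d`?
fits_data(d, data) = all(x -> insupport(d, x), data)
```

For example, `nonnegative_support(candidates)` should contain `Exponential` but not `Normal`, and `fits_data(Exponential(), [0.5, 1.2])` is `true`. Relying on default parameters is a limitation: distributions whose support depends on their parameters are only judged at their defaults.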

  • It would simplify testing and other things if there were a slightly more structured (almost cookie-cutter) template for adding distributions.
    Some distributions have no default parameters: Chi() throws MethodError: no method matching Chi()
    An undefined or infinite mean is sometimes reported as NaN and sometimes as Inf.
    mean(LogitNormal()) throws an error (perhaps compute it numerically?).
    For some distributions, entropy throws an error instead of returning NaN or Inf.
    Perhaps the error could say: "The entropy for this distribution has not been coded. Please submit a PR."
    If no closed-form entropy is coded or exists, perhaps entropy() should compute it numerically?
    Truncated distributions cannot be fitted; see Feature Request: Fit truncated normal #1108.
    Some distributions don't have quantiles coded: PGeneralizedGaussian, Skellam, VonMises
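
As an illustration of the numerical fallback suggested above, here is a minimal sketch that approximates the differential entropy of a continuous univariate distribution with a midpoint rule. `numerical_entropy` is a hypothetical helper, not an existing Distributions.jl function, and it assumes that `pdf` and `quantile` are implemented for the distribution (which, as noted, is not always the case):

```julia
using Distributions

# Hypothetical helper (not part of Distributions.jl): approximate the differential
# entropy -∫ p(x) log p(x) dx with a midpoint rule between two extreme quantiles.
function numerical_entropy(d::ContinuousUnivariateDistribution; n::Int = 10_000, tail = 1e-10)
    lo, hi = quantile(d, tail), quantile(d, 1 - tail)  # integration limits
    h = (hi - lo) / n                                  # width of each cell
    total = 0.0
    for i in 1:n
        x = lo + (i - 0.5) * h                         # midpoint of cell i
        p = pdf(d, x)
        total += p > 0 ? -p * log(p) : 0.0             # treat 0 * log(0) as 0
    end
    return total * h
end
```

For a well-behaved density this should agree closely with the closed form, e.g. `numerical_entropy(Normal())` against `entropy(Normal())`. A production version would more likely use an adaptive quadrature routine; the fixed grid here only keeps the sketch dependency-free.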

  • A convenient way to loop through all available (non-abstract type) distributions.
    Doing so exposes some inconsistencies, for example in field naming:
    fieldnames(Normal) gives Unicode names: (:μ, :σ)
    fieldnames(Dirichlet) gives ASCII names: (:alpha, :alpha0, :lmnB)
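
One possible way to do such a loop, sketched under the assumption that walking the type tree with `subtypes` is good enough. `concrete_subtypes` and `field_report` are hypothetical helpers, not part of Distributions.jl:

```julia
using Distributions
using InteractiveUtils: subtypes

# Hypothetical helper: recursively collect every non-abstract subtype of `T`,
# descending through intermediate abstract types.
function concrete_subtypes(T)
    out = Any[]
    for S in subtypes(T)
        if isabstracttype(S)
            append!(out, concrete_subtypes(S))
        else
            push!(out, S)
        end
    end
    return out
end

# Hypothetical helper: for each type, record its field names and whether any of
# them contains non-ASCII characters (such as μ or σ).
function field_report(types)
    return [(type = T,
             fields = fieldnames(T),
             has_unicode = any(n -> !isascii(String(n)), fieldnames(T)))
            for T in types]
end

report = field_report(concrete_subtypes(Distribution))
```

Filtering `report` on `has_unicode` then separates the types that use Unicode field names from those that spell them out in ASCII.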

  • Before 1.0, check out @cscherrer's note.

  • This repo is still missing many useful distributions (see the R Task Views).
    Adding them seems like a great job for a student (maybe JSoC, GSoC, or otherwise).
