Output of methods in KernelDensity.jl #126

Open
@jaksle


I do data analysis quite often, and I use KernelDensity.jl, which I think is currently the default kernel density library for Julia, being part of JuliaStats. It works well, but I think some unfortunate design choices were made when the library was first written. They make it less useful than it could be and hard to extend.

The problem is as follows: KernelDensity returns only the numerical values of the estimated pdf on a fixed grid. You can use that for a plot, but not for much else.
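To illustrate the grid-based output, here is a minimal sketch. The toy sample and the seed are made up for the example; `kde` and the `x` and `density` fields are what I understand the current result to expose.

```julia
using KernelDensity, Random

Random.seed!(1)
x = randn(200)        # toy sample, invented for this illustration

k = kde(x)            # kernel density estimate with the default settings

grid = k.x            # grid on which the estimate was evaluated
vals = k.density      # estimated pdf values, one per grid point

println(length(grid) == length(vals))
```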

From the point of view of statistics, kernel density estimation is a method of distribution estimation. What you want is a distribution, a mixture distribution to be precise. So the result of the estimation should be the parameters of this distribution, which are:

  • mode locations: these are just the original sample points,
  • mode pdfs: each is a distribution from the chosen kernel family, scaled by the bandwidth that was found,
  • weights: typically just uniform.
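
Written out, the three ingredients above give the usual weighted kernel density estimate. The symbols are my own notation, not taken from the library: $x_i$ are the sample points, $K$ is the kernel, $h$ is the bandwidth, and $w_i$ are the weights.

```latex
\hat{f}_h(x) = \sum_{i=1}^{n} w_i \, \frac{1}{h} K\!\left(\frac{x - x_i}{h}\right),
\qquad \sum_{i=1}^{n} w_i = 1,
\qquad w_i = \frac{1}{n} \ \text{in the uniform case.}
```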

Once you know these, you can:

  • calculate the estimated pdf on whatever grid you choose,
  • calculate the corresponding cdf and characteristic function (cf) wherever you choose,
  • draw a random sample from this distribution,
  • estimate other quantities, such as moments,
  • transform this distribution.
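
A rough sketch of what this could look like with `MixtureModel` from Distributions.jl. This is not an existing KernelDensity.jl API: `kde_mixture`, `normal_reference_bandwidth` and `mixture_cf` are hypothetical helpers, and the Gaussian kernel with the normal-reference bandwidth rule is an assumption made only to keep the example short.

```julia
using Distributions, Random, Statistics

# Hypothetical bandwidth rule (normal reference rule); assumed for the sketch,
# not necessarily what KernelDensity.jl uses by default.
normal_reference_bandwidth(x) = 1.06 * std(x) * length(x)^(-1/5)

# Hypothetical helper: one Normal component per observation, uniform weights.
function kde_mixture(x::AbstractVector{<:Real};
                     bandwidth::Real = normal_reference_bandwidth(x))
    return MixtureModel([Normal(xi, bandwidth) for xi in x])
end

# Characteristic function of the mixture as the weighted sum of component cfs.
mixture_cf(d, t) = sum(p * cf(c, t) for (p, c) in zip(probs(d), components(d)))

Random.seed!(1)
x = randn(200)                      # toy sample, invented for this illustration
d = kde_mixture(x)

pdf_vals = pdf.(d, -3:0.5:3)        # pdf on any grid you like
cdf_at_0 = cdf(d, 0.0)              # cdf at an arbitrary point
cf_at_1  = mixture_cf(d, 1.0)       # characteristic function at t = 1
sample   = rand(d, 1000)            # random draws from the estimate
m, v     = mean(d), var(d)          # moments of the estimated distribution
```

With uniform weights the mixture mean equals the sample mean, and its variance is the (uncorrected) sample variance plus the squared bandwidth, which is a quick sanity check on the construction.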

The class of mixture distributions is already implemented in Distributions.jl, so the generic return type exists. The question is whether what I described above is important enough to implement. If it were implemented, I see no way to reconcile it with the old interface. One could add the new interface and keep the old one for the sake of compatibility.
