New spire.stats package (take 2) #554

Closed
wants to merge 12 commits into from


andrelfpinto
Contributor

Hello again,
This is a second attempt at an alternative proposal for the random package discussed in #214. To avoid redundancy, see #460 for details on the first attempt. The only difference is that this one complies with the restructuring that took place to add scala-js support.
I'm sorry that I cannot contribute to the project continuously.
As always, feel free to suggest modifications and improvements.

In the present structure, random.Dist is a way of producing arbitrary
values from a Generator. However, a probability distribution should be a
way of assigning a probability to each measurable subset of the possible
outcomes of a random experiment [1], i.e. you should be able to ask it
for its pdf, cdf, mean, etc. The mechanism that produces values according
to a probability distribution should really be a random variable [2].
With the proposed separation it should be possible to characterize a
probability distribution and still be able to modify the production of
values from the Generator (the role of random.Dist would basically go to
stats.RandomVariable).
stats.Distribution and stats.RandomVariable would be immutable.

[1] http://en.wikipedia.org/wiki/Probability_distribution
[2] http://en.wikipedia.org/wiki/Random_variable
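
The separation described above could look roughly like the following minimal sketch. It is not the code in this PR: the trait and method names (`Distribution`, `RandomVariable`, `quantile`, `sample`) are hypothetical, `scala.util.Random` stands in for spire's Generator, and inverse-transform sampling is an assumption made only to keep the example short.

```scala
import scala.util.Random

// Hypothetical sketch: a distribution only answers questions about probabilities.
trait Distribution {
  def pdf(x: Double): Double
  def cdf(x: Double): Double
  def mean: Double
  def quantile(p: Double): Double // inverse cdf, assumed here to enable sampling
}

// Immutable exponential distribution with the given rate.
final case class Exponential(rate: Double) extends Distribution {
  require(rate > 0, "rate must be positive")
  def pdf(x: Double): Double = if (x < 0) 0.0 else rate * math.exp(-rate * x)
  def cdf(x: Double): Double = if (x < 0) 0.0 else 1.0 - math.exp(-rate * x)
  def mean: Double = 1.0 / rate
  def quantile(p: Double): Double = -math.log(1.0 - p) / rate
}

// Hypothetical sketch: the random variable is the part that produces values.
// scala.util.Random is a stand-in for a spire Generator.
final case class RandomVariable(dist: Distribution) {
  def sample(rng: Random): Double = dist.quantile(rng.nextDouble())
}
```

With this split, `Exponential(2.0).mean` can be queried without any generator, while `RandomVariable(Exponential(2.0)).sample(new Random(1L))` draws a value.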
Extract the Anderson-Darling test from random.GaussianTest so that it can
be used elsewhere in the library, with any continuous distribution.
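
As an illustration of why the test generalizes, the Anderson-Darling statistic only needs a sample and the cdf of the hypothesized distribution. The sketch below is an assumption about how an extracted version might look; `AndersonDarling.statistic` is a hypothetical name and is not taken from random.GaussianTest. It computes A² = -n - (1/n) Σ (2i - 1)[ln F(x_i) + ln(1 - F(x_{n+1-i}))] over the sorted sample.

```scala
// Hypothetical sketch of an Anderson-Darling statistic for any continuous cdf.
object AndersonDarling {
  // sample: observed values; cdf: cumulative distribution function under the null hypothesis.
  def statistic(sample: Seq[Double], cdf: Double => Double): Double = {
    val xs = sample.sorted.toVector
    val n = xs.length
    require(n > 0, "sample must not be empty")
    // i is 0-based here, so the weight (2i - 1) of the 1-based formula becomes (2i + 1).
    val s = (0 until n).map { i =>
      (2 * i + 1) * (math.log(cdf(xs(i))) + math.log(1.0 - cdf(xs(n - 1 - i))))
    }.sum
    -n - s / n
  }
}
```

Turning the statistic into a decision still requires critical values, which depend on the distribution being tested and are outside this sketch.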
Create a location-scale family abstraction to simplify the implementation
of such families, e.g. the uniform and normal distributions.
In a location-scale family, a general distribution can be written as
a function of the standard distribution (zero location and unit
scale) and its own location and scale.
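
The relationship described above can be sketched as follows: for location m and scale s > 0, pdf(x) = f0((x - m) / s) / s and cdf(x) = F0((x - m) / s), where f0 and F0 belong to the standard distribution. The names `Standard`, `StandardUniform` and `LocationScale` are hypothetical and chosen only for this example; they are not the types introduced by the PR.

```scala
// Hypothetical sketch: the standard member of a family (zero location, unit scale).
trait Standard {
  def pdf(z: Double): Double
  def cdf(z: Double): Double
}

// Standard uniform distribution on [0, 1].
object StandardUniform extends Standard {
  def pdf(z: Double): Double = if (z >= 0.0 && z <= 1.0) 1.0 else 0.0
  def cdf(z: Double): Double = math.max(0.0, math.min(1.0, z))
}

// Any member of the family is the standard one, shifted and rescaled.
final case class LocationScale(standard: Standard, location: Double, scale: Double) {
  require(scale > 0, "scale must be positive")
  private def standardize(x: Double): Double = (x - location) / scale
  def pdf(x: Double): Double = standard.pdf(standardize(x)) / scale
  def cdf(x: Double): Double = standard.cdf(standardize(x))
}
```

For instance, `LocationScale(StandardUniform, 2.0, 4.0)` describes the uniform distribution on [2, 6].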
Enrich Generator to return the next random value of a specified
type. That way one can ask for next[Double] or next[Int] instead of
nextDouble or nextInt.
However, next clashes with an existing name, so the method is named
next0[A].
Should this method be included in the Generator class itself?
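
One way to get a type-directed `next0[A]` is a small type class plus an enrichment. The sketch below is an assumption, not the PR's implementation: `NextValue` and `GeneratorSyntax` are hypothetical names, and `scala.util.Random` is used as a stand-in for spire's Generator so the example stays self-contained.

```scala
import scala.util.Random

// Hypothetical type class: how to draw the next value of type A.
trait NextValue[A] {
  def apply(rng: Random): A
}

object NextValue {
  implicit val nextInt: NextValue[Int] = new NextValue[Int] {
    def apply(rng: Random): Int = rng.nextInt()
  }
  implicit val nextDouble: NextValue[Double] = new NextValue[Double] {
    def apply(rng: Random): Double = rng.nextDouble()
  }
}

object GeneratorSyntax {
  // Enrichment adding next0[A]; Random stands in for a spire Generator.
  implicit class RichRandom(private val rng: Random) extends AnyVal {
    def next0[A](implicit ev: NextValue[A]): A = ev(rng)
  }
}
```

After `import GeneratorSyntax._`, a call such as `new Random(42L).next0[Double]` resolves the instance from the requested type.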
Discrete distributions need two types: an integral one for the discrete
sample space, used when generating random values, and a fractional one
for the calculation of pdf, cdf, etc.
These changes also propagate to the definition of Distribution.
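
A minimal sketch of the two-type idea, assuming hypothetical names (`DiscreteDistribution`, `DiscreteUniform`) that are not taken from the PR: `N` is the integral type of the outcomes and `F` is the fractional type of the probabilities. The method is called `pmf` here because it is a probability mass function; the description above refers to it as pdf.

```scala
// Hypothetical sketch: N is the type of outcomes, F the type of probabilities.
trait DiscreteDistribution[N, F] {
  def pmf(k: N): F
  def cdf(k: N): F
  def mean: F
}

// Uniform distribution over the integers lo, lo + 1, ..., hi.
final case class DiscreteUniform(lo: Int, hi: Int) extends DiscreteDistribution[Int, Double] {
  require(lo <= hi, "lo must not exceed hi")
  private val n: Int = hi - lo + 1
  def pmf(k: Int): Double = if (k < lo || k > hi) 0.0 else 1.0 / n
  def cdf(k: Int): Double =
    if (k < lo) 0.0
    else if (k >= hi) 1.0
    else (k - lo + 1).toDouble / n
  def mean: Double = (lo + hi) / 2.0
}
```

A fair die is then `DiscreteUniform(1, 6)`, whose outcomes are `Int` while its probabilities are `Double`.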
@denisrosset denisrosset mentioned this pull request Mar 20, 2018
@denisrosset
Collaborator

Grouped in its own issue. This should probably go in a separate module, maybe together with the spire.random subpackage.
@andrelfpinto Are you interested in moving this forward and reopening this PR?

@andrelfpinto
Contributor Author

I will try to. Maybe I will need some help.

@denisrosset denisrosset reopened this Mar 20, 2018
@denisrosset
Collaborator

Are you at nescala, btw? If you do anything, please base your PR on the modularization PR, and put things in a separate module such as spire-stats or spire-random. We hope to get these big changes done as soon as possible, but we may hit roadblocks.

JoeWrightss pushed a commit to JoeWrightss/spire that referenced this pull request Apr 13, 2019
@larsrh
Contributor

larsrh commented Aug 8, 2020

Closing this because it hasn't seen activity in over a year.

@larsrh larsrh closed this Aug 8, 2020
3 participants