ENH: stats: add noncentral hypergeometric distributions (Fisher's and Wallenius') #13330

Merged
merged 52 commits into from
Feb 22, 2021
Commits
2930eb4
ENH: stats: add unmodified BiasedUrn sent by Agner Fog
mdhaber Jan 2, 2021
83b97b4
ENH: stats: wrap Fisher's NCHG - thanks @mckib2!
mdhaber Jan 2, 2021
cade126
ENH: stats: add fnch distribution to SciPy
mdhaber Jan 3, 2021
39f9c39
ENH: stats: override fnch distribution _stats
mdhaber Jan 3, 2021
ae3ea0e
DOC: stats: fix oops in FNCH tutorial
mdhaber Jan 3, 2021
6601eb8
MAINT: stats: add pre_build_hook for BiasedUrn extension
mdhaber Jan 3, 2021
e70b44a
MAINT: stats: fix name of license.txt
mdhaber Jan 3, 2021
ec7471b
MAINT: stats: remove generated cxx file
mdhaber Jan 3, 2021
001584b
ENH: stats: add Wallenius' NCH
mdhaber Jan 3, 2021
a09a7b0
MAINT: stats: vendor bitgen_t definition for NumPy 1.17 support
mdhaber Jan 3, 2021
95fd004
MAINT: stats: ignore macOS warnings to avoid modifying BiasedUrn code
mdhaber Jan 3, 2021
d158ebb
Merge remote-tracking branch 'upstream/master' into biasedurn
mdhaber Jan 3, 2021
db451d7
DOC: stats: fix mistake from merge conflict
mdhaber Jan 3, 2021
ce8bb16
FIX: add compatibility headers for numpy random c api.
mckib2 Jan 3, 2021
b17575f
Merge pull request #51 from mckib2/biasedurn-npyrandom-fix
mdhaber Jan 3, 2021
e667e6b
DOC: stats: appease sphinx
mdhaber Jan 3, 2021
7014cea
FIX: refactor so biasedurn works with numpy<1.17
mckib2 Jan 4, 2021
c2a7d5b
FIX: changes required to cythonize biasedurn for new numpy c api
mckib2 Jan 4, 2021
7b061d6
FIX: correct function signature of func ptr member
mckib2 Jan 4, 2021
9407c7e
FIX: old numpy needs named parameter in function decl
mckib2 Jan 4, 2021
45b8513
Merge pull request #52 from mckib2/biasedurn-crash-fix
mdhaber Jan 4, 2021
7f7c478
MAINT: stats: revert to state that passes tests on most platforms
mdhaber Jan 4, 2021
2ec6579
FIX: fix undefined function call; add c compatibility header; remove …
mckib2 Jan 5, 2021
6246a9c
FIX: use global object to track random state and fix crashes
mckib2 Jan 5, 2021
dfb4d50
Revert "MAINT: stats: revert to state that passes tests on most platf…
mdhaber Jan 5, 2021
83fe68c
Merge branch 'biasedurn-crash-fix' into biasedurn
mdhaber Jan 5, 2021
a56f5aa
FIX: resolve Generator objects if numpy.random has them; check numpy …
mckib2 Jan 13, 2021
9b3b2be
FIX: don't rely on _lib at build time
mckib2 Jan 13, 2021
92fe2e4
Merge pull request #55 from mckib2/biasedurn-numpy
mdhaber Jan 13, 2021
9b58841
Merge branch 'master' into biasedurn
mdhaber Jan 13, 2021
a3ab512
TST: stats: add test of wnch against implementation in mpmath
mdhaber Jan 26, 2021
55d39d0
MAINT: stats: gitignore auto-generated biasedurn.pyx/cxx files
mdhaber Jan 26, 2021
f0989af
MAINT: stats: correct noncentral hypergeometric rvs for ndarray shape…
mdhaber Jan 26, 2021
0e5b5de
MAINT: stats: refactor _vectorize_rvs_over_shapes for clarity
mdhaber Jan 26, 2021
16485d2
MAINT: stats: fix _vectorize_rvs_over_shapes to follow all broadcasti…
mdhaber Jan 29, 2021
39f13af
DOC: stats: eliminate NCH symbol m_2 (replace with N - m_1)
mdhaber Jan 29, 2021
85c3b45
DOC: stats: replace NCH math N with math M
mdhaber Jan 29, 2021
41389fe
DOC: stats: replace NCH math n with math N
mdhaber Jan 29, 2021
f7643be
DOC: stats: replace NCH math m_1 with math n
mdhaber Jan 29, 2021
598e19c
FIX: update NPY_OLD logic; add normal generator; replace c_fnch with …
mckib2 Jan 30, 2021
5517b53
Merge pull request #56 from mckib2/biasedurn-normal-gen
mdhaber Feb 1, 2021
0ea9c91
Merge branch 'master' into biasedurn
mdhaber Feb 1, 2021
1044c55
MAINT: stats: minor adjustments to documentation and comments per review
mdhaber Feb 10, 2021
cd2a59c
MAINT: stats: rename fnch/wnch to nchypergeom_fisher/wallenius
mdhaber Feb 17, 2021
4dee624
MAINT: stats: Don't use raw string prefix.
WarrenWeckesser Feb 17, 2021
a83285b
Merge branch 'master' into biasedurn
WarrenWeckesser Feb 20, 2021
d9564b4
MAINT: stats: add example invalid parameters for nchypergeoms
mdhaber Feb 20, 2021
adb695e
DOC: stats: note that invalid shape parameters are now also required …
mdhaber Feb 20, 2021
bd3d85d
Update doc/source/dev/contributor/adding_new/new_stats_distribution.r…
mdhaber Feb 20, 2021
92add1e
DOC: Fix markup in contributor docs about adding a new distribution
WarrenWeckesser Feb 22, 2021
bd89a93
STY: stats: Fix a few PEP-8 issues.
WarrenWeckesser Feb 22, 2021
1d7a87e
MAINT: stats: Add a blank line (undo an unintentional change in a pre…
WarrenWeckesser Feb 22, 2021
2 changes: 2 additions & 0 deletions .gitignore
@@ -270,6 +270,8 @@ scipy/stats/mvnmodule.c
scipy/stats/statlibmodule.c
scipy/stats/vonmises_cython.c
scipy/stats/_stats.c
scipy/stats/biasedurn.cxx
scipy/stats/biasedurn.pyx
scipy/stats/_sobol.c
scipy/version.py
scipy/special/_exprel.c
@@ -73,6 +73,10 @@ Implementation
list in
`scipy/stats/_distr_params.py <https://github.com/scipy/scipy/blob/master/scipy/stats/_distr_params.py#L5>`_.
These shape parameters are used both for testing and automatic documentation generation.
#. Add the name and an *invalid* set of example shape parameters to the
list in ``invdistcont``, also in
`_distr_params.py <https://github.com/scipy/scipy/blob/master/scipy/stats/_distr_params.py>`_.
These shape parameters are also used for testing.
#. Add a ``TestSquirrel`` class and any specific tests to
`scipy/stats/tests/test_distributions.py <https://github.com/scipy/scipy/blob/master/scipy/stats/tests/test_distributions.py>`_.
#. Run and pass(!) the tests.
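
As a rough illustration of what a valid versus an invalid set of example shape parameters looks like, the sketch below uses the parameter domain documented for the noncentral hypergeometric distributions added in this PR (``M`` a natural number, ``0 <= n <= M``, ``0 <= N <= M``, ``odds > 0``). The names ``nchypergeom_args_valid``, ``valid_example`` and ``invalid_example`` are hypothetical and are not part of SciPy; the real entries live in ``_distr_params.py`` and may differ.

```python
# Hypothetical illustration only; the real lists live in
# scipy/stats/_distr_params.py and are not reproduced here.


def nchypergeom_args_valid(M, n, N, odds):
    """Return True if (M, n, N, odds) lie in the documented parameter domain."""
    # M, n and N are counts, so they must be non-negative integers.
    are_counts = all(isinstance(v, int) and v >= 0 for v in (M, n, N))
    # n and N cannot exceed the population size M; the odds must be positive.
    return are_counts and n <= M and N <= M and odds > 0


# One valid and one invalid set of shape parameters (assumed values).
valid_example = ["nchypergeom_fisher", (20, 7, 12, 0.5)]
invalid_example = ["nchypergeom_fisher", (20, 7, 12, -1.0)]  # odds <= 0

for name, args in (valid_example, invalid_example):
    print(name, args, "valid" if nchypergeom_args_valid(*args) else "invalid")
```

A test built on the invalid set would typically check that the distribution rejects the arguments rather than returning a finite value.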
@@ -94,4 +98,4 @@ References
.. [JKB] Johnson, Kotz, and Balakrishnan, "Continuous Univariate Distributions, Volume 1", Second Edition, John Wiley and Sons,
p. 173 (1994).

.. _WikipediaDistributions: https://en.wikipedia.org/wiki/List_of_probability_distributions
3 changes: 1 addition & 2 deletions doc/source/tutorial/stats.rst
@@ -100,8 +100,7 @@ introspection:
>>> print('number of continuous distributions: %d' % len(dist_continu))
number of continuous distributions: 102
>>> print('number of discrete distributions: %d' % len(dist_discrete))
number of discrete distributions: 17

number of discrete distributions: 19

Common methods
^^^^^^^^^^^^^^
2 changes: 2 additions & 0 deletions doc/source/tutorial/stats/discrete.rst
@@ -264,6 +264,8 @@ Discrete Distributions in `scipy.stats`
discrete_geom
discrete_nbinom
discrete_hypergeom
discrete_nchypergeom_fisher
discrete_nchypergeom_wallenius
discrete_nhypergeom
discrete_zipf
discrete_zipfian
52 changes: 52 additions & 0 deletions doc/source/tutorial/stats/discrete_nchypergeom_fisher.rst
@@ -0,0 +1,52 @@

.. _discrete-nchypergeom-fisher:

Fisher's Noncentral Hypergeometric Distribution
===============================================

A random variable has Fisher's Noncentral Hypergeometric distribution with
parameters

:math:`M \in {\mathbb N}`,
:math:`n \in [0, M]`,
:math:`N \in [0, M]`,
:math:`\omega > 0`,

if its probability mass function is given by

.. math::

    p(x; M, n, N, \omega) = \frac{\binom{n}{x}\binom{M - n}{N-x}\omega^x}{P_0},

for
:math:`x \in [x_l, x_u]`,
where
:math:`x_l = \max(0, N - (M - n))`,
:math:`x_u = \min(N, n)`,

.. math::

    P_k = \sum_{y=x_l}^{x_u} \binom{n}{y} \binom{M - n}{N-y} \omega^y y^k,

and the binomial coefficients are

.. math::

    \binom{n}{k} \equiv \frac{n!}{k! (n - k)!}.

The mean and variance of this distribution are

.. math::
    :nowrap:

    \begin{eqnarray*}
    \mu & = & \frac{P_1}{P_0},\\
    \mu_{2} & = & \frac{P_2}{P_0} - \left(\frac{P_1}{P_0}\right)^2.
    \end{eqnarray*}

References
----------
- Agner Fog, "Biased Urn Theory", https://cran.r-project.org/web/packages/BiasedUrn/vignettes/UrnTheory.pdf
- "Fisher's noncentral hypergeometric distribution", Wikipedia, https://en.wikipedia.org/wiki/Fisher's_noncentral_hypergeometric_distribution

Implementation: `scipy.stats.nchypergeom_fisher`
42 changes: 42 additions & 0 deletions doc/source/tutorial/stats/discrete_nchypergeom_wallenius.rst
@@ -0,0 +1,42 @@

.. _discrete-nchypergeom-wallenius:

Wallenius' Noncentral Hypergeometric Distribution
=================================================

A random variable has Wallenius' Noncentral Hypergeometric distribution with
parameters

:math:`M \in {\mathbb N}`,
:math:`n \in [0, M]`,
:math:`N \in [0, M]`,
:math:`\omega > 0`,

if its probability mass function is given by

.. math::

    p(x; M, n, N, \omega) = \binom{n}{x} \binom{M - n}{N-x}\int_0^1 \left(1-t^{\omega/D}\right)^x\left(1-t^{1/D}\right)^{N-x} dt,

for
:math:`x \in [x_l, x_u]`,
where
:math:`x_l = \max(0, N - (M - n))`,
:math:`x_u = \min(N, n)`,

.. math::

    D = \omega(n - x) + ((M - n)-(N-x)),

and the binomial coefficients are

.. math::

    \binom{n}{k} \equiv \frac{n!}{k! (n - k)!}.

References
----------
- Agner Fog, "Biased Urn Theory", https://cran.r-project.org/web/packages/BiasedUrn/vignettes/UrnTheory.pdf
- "Wallenius' noncentral hypergeometric distribution", Wikipedia, https://en.wikipedia.org/wiki/Wallenius'_noncentral_hypergeometric_distribution

Implementation: `scipy.stats.nchypergeom_wallenius`
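
The integral in the probability mass function can be evaluated numerically. The sketch below is a pure-Python illustration, not the BiasedUrn-based implementation behind `scipy.stats.nchypergeom_wallenius`. It makes two choices that are not part of this page: it substitutes :math:`t = u^D` so that the integrand becomes :math:`(1-u^\omega)^x (1-u)^{N-x} D u^{D-1}`, and it integrates with composite Simpson's rule. To keep that integrand bounded it assumes :math:`\omega \ge 1` and :math:`N < M`, which guarantees :math:`D \ge 1`. The names ``wallenius_pmf`` and ``_simpson`` and the example parameters are hypothetical.

```python
from math import comb


def _simpson(f, a, b, intervals=2000):
    """Composite Simpson's rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def wallenius_pmf(x, M, n, N, odds):
    """PMF of Wallenius' noncentral hypergeometric distribution at integer x.

    Sketch only: assumes odds >= 1 and N < M so that D >= 1.
    """
    if odds < 1 or not (0 <= N < M) or not (0 <= n <= M):
        raise ValueError("this sketch assumes odds >= 1, 0 <= n <= M, 0 <= N < M")
    x_l, x_u = max(0, N - (M - n)), min(N, n)
    if not (x_l <= x <= x_u):
        return 0.0
    D = odds * (n - x) + ((M - n) - (N - x))

    def integrand(u):
        # Original integrand after the change of variable t = u**D.
        return (1 - u**odds) ** x * (1 - u) ** (N - x) * D * u ** (D - 1)

    return comb(n, x) * comb(M - n, N - x) * _simpson(integrand, 0.0, 1.0)


# Example parameters (arbitrary values chosen for illustration).
M, n, N, odds = 10, 4, 5, 2
x_l, x_u = max(0, N - (M - n)), min(N, n)
pmf = [wallenius_pmf(x, M, n, N, odds) for x in range(x_l, x_u + 1)]
print("PMF:", pmf)
print("sum of PMF:", sum(pmf))
```

For ``odds = 1`` the value of :math:`D` no longer depends on :math:`x` and the integral equals :math:`1 / \binom{M}{N}`, so the sketch reduces to the ordinary hypergeometric distribution.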
36 changes: 19 additions & 17 deletions scipy/stats/__init__.py
@@ -174,23 +174,25 @@
.. autosummary::
:toctree: generated/

bernoulli -- Bernoulli
betabinom -- Beta-Binomial
binom -- Binomial
boltzmann -- Boltzmann (Truncated Discrete Exponential)
dlaplace -- Discrete Laplacian
geom -- Geometric
hypergeom -- Hypergeometric
logser -- Logarithmic (Log-Series, Series)
nbinom -- Negative Binomial
nhypergeom -- Negative Hypergeometric
planck -- Planck (Discrete Exponential)
poisson -- Poisson
randint -- Discrete Uniform
skellam -- Skellam
yulesimon -- Yule-Simon
zipf -- Zipf (Zeta)
zipfian -- Zipfian
bernoulli -- Bernoulli
betabinom -- Beta-Binomial
binom -- Binomial
boltzmann -- Boltzmann (Truncated Discrete Exponential)
dlaplace -- Discrete Laplacian
geom -- Geometric
hypergeom -- Hypergeometric
logser -- Logarithmic (Log-Series, Series)
nbinom -- Negative Binomial
nchypergeom_fisher -- Fisher's Noncentral Hypergeometric
nchypergeom_wallenius -- Wallenius' Noncentral Hypergeometric
nhypergeom -- Negative Hypergeometric
planck -- Planck (Discrete Exponential)
poisson -- Poisson
randint -- Discrete Uniform
skellam -- Skellam
yulesimon -- Yule-Simon
zipf -- Zipf (Zeta)
zipfian -- Zipfian

An overview of statistical functions is given below. Many of these functions
have a similar version in `scipy.stats.mstats` which work for masked arrays.