Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: stats.unuran: add a higher level API #8

Closed
wants to merge 30 commits into from

Commits on Jun 9, 2021

  1. ENH: stats: add the UNU.RAN source tree with "uniform" directory removed

    This commit adds UNU.RAN source tree to scipy.stats with Makefiles and configuration
    files removed. The "uniform" directory has also been removed as it contains third-party
    uniform random number generator which we aren't allowed to use. Other files have been
    kept intact.
    tirthasheshpatel committed Jun 9, 2021
    Configuration menu
    Copy the full SHA
    65beb47 View commit details
    Browse the repository at this point in the history
  2. DEP: stats: remove deprecated functionality from UNU.RAN source tree

    This commit removes all the deprecated files from UNU.RAN source tree.
    Declarations of the deprecated functions have been removed from `unuran.h`.
    tirthasheshpatel committed Jun 9, 2021
    Configuration menu
    Copy the full SHA
    d34bf78 View commit details
    Browse the repository at this point in the history
  3. MAINT: stats: replace all the .ch files with .h files

    This commit replaces all the `.ch` files with `.h` files as SciPy's build system
    doesn't understand `.ch` files. The C files including these `.ch` have been edited
    to now include the coressponding `.h` file.
    tirthasheshpatel committed Jun 9, 2021
    Configuration menu
    Copy the full SHA
    c38d637 View commit details
    Browse the repository at this point in the history
  4. REF: stats.unuran: Refactor UNU.RAN URNG to allow NumPy BitGenerator …

    …as default
    
    This commit refactors functions `unur_get_default_urng` and `unur_get_default_urng_aux` to
    allow NumPy BitGenerator as the default URNG. Previously, UNU.RAN defined a macro to get its
    own URNG from the `uniform` directory but as it is now removed, that macro will fail with an
    error. Hence, the macro (present in `unuran_config.h`) has been changed to successfully obtain
    a default URNG (which is set during `scipy.stats` import).
    tirthasheshpatel committed Jun 9, 2021
    Configuration menu
    Copy the full SHA
    d7d3a50 View commit details
    Browse the repository at this point in the history

Commits on Jun 11, 2021

  1. ENH: stats: add a cython wrapper for UNU.RAN and add tests

    This commit adds a Cython wrapper wrapping methods Transformed Density Rejection
    and Discrete Alias-Urn from UNU.RAN. A few tests and docs have been added. I have
    written a small C API for aquiring and releasing Python callbacks during calls to
    UNU.RAN C functions.
    tirthasheshpatel committed Jun 11, 2021
    Configuration menu
    Copy the full SHA
    f9b248b View commit details
    Browse the repository at this point in the history

Commits on Jun 14, 2021

  1. REF, TST: stats: accept a single dist object instead of seperate me…

    …thod parameters.
    
    This refactors the public API of the methods TransformedDensityRejection and DiscreteAliasUrn
    method to accept a single `dist` object that contains all the required methods. This closely
    follows the design of scipygh-13319. So, for example, previously one had to write required functions
    seperately and pass them to the method for setup:
    
        >>> from scipy.stats import TransformedDensityRejection
        >>>
        >>> pdf = lambda x: 1-x*x
        >>> dpdf = lambda x: -2*x
        >>>
        >>> rng = TransformedDensityRejection(pdf, dpdf, domain=(-1,1), seed=123)
    
    But, with the new API, one can just pass a `dist` object with all those methods bundled in it:
    
        >>> class MyDist:
        ...     def pdf(self, x):
        ...         return 1-x*x
        ...     def dpdf(self, x):
        ...         return -2*x
        ...
        >>> dist = MyDist()
        >>> rng = TransformedDensityRejection(dist, domain=(-1,1), seed=123)
    
    Tests have been alterned to use the new API. New tests for the API itself need to be added.
    tirthasheshpatel committed Jun 14, 2021
    Configuration menu
    Copy the full SHA
    e167743 View commit details
    Browse the repository at this point in the history

Commits on Jun 15, 2021

  1. TST: stats: refactor the test suite and write stronger tests

    The tests suite has been refactored to be more readable. Some common tests and data
    has been centralized. Justifications for a few tests in the form of comments have
    been added. Some of the validations in the API have also been centralized.
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    7b4f569 View commit details
    Browse the repository at this point in the history
  2. DOC: stats: add references, update rvs docs, and small changes

    This implements some suggestions from the code review:
      * use UNU.RAN manual references in our doc
      * update documentation for RVS method
      * document default value of center as 0 and c as -0.5
      * remove underscores from rvs numbers
      * mention that the variants have been explained in notes
      * "w.r.t the variate" --> "w.r.t x (i.e. the variate)"
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    bcd404c View commit details
    Browse the repository at this point in the history
  3. MAINT: stats: split arguments in separate lines for readability

    * split arguments in separate lines for readability
    * remove the file * lines from ukraine.pxd
    * remove unused/unvalidated arguments from _validate_args for readability
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    8834be7 View commit details
    Browse the repository at this point in the history
  4. TST: stats: use threshold p-value 0.1 instead of 0.999

    Relax the chi-squared tests to pass if p-value is > 0.1 instead of trying
    to assert if the p-value is 0.999 as the latter case occurs only 0.1% of the
    times which is very rare.
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    4918266 View commit details
    Browse the repository at this point in the history
  5. REF, TST: evaluate the PMF in the domain to report helpful errors

    UNU.RAN was unable to recognize inf, nan and all zeros in the PV and threw unhelpful
    "unknown error". Hence, the DAU method has been refactored to evaluate the PMF during
    argument validation and check the values in the PV to report helpful errors. This is
    possible to do as DAU only works for distribution with finite domain.
    
    The tests have also been changed to incorporate this refactor.
    
    Docs of the DAU method have also been updated.
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    03e7dbc View commit details
    Browse the repository at this point in the history
  6. Configuration menu
    Copy the full SHA
    c146a60 View commit details
    Browse the repository at this point in the history
  7. BUG: stats: fix failing tests

    Two tests failed:
      1. TestDiscreteAliasUrn::test_basic
      2. Both `test_seed` tests
    
    The former was because a non-string distribution name was present in `distdiscrete` which led
    to problems with getattr. The straight-forward fix for this was to skip for non-string `distname`s
    
    The latter was due to a more subtle reason. I used global variable to store the NumPy RNG for sampling
    when `np.__version__ < 1.19`. Because of this, when a new NumPy RNG was set, all the methods would start
    using that RNG and seeding would immediately break. This has been fixed by using a seperate function in the
    base-class `Method` and a global function for default RNG.
    tirthasheshpatel committed Jun 15, 2021
    Configuration menu
    Copy the full SHA
    c00613a View commit details
    Browse the repository at this point in the history
  8. Configuration menu
    Copy the full SHA
    b025c93 View commit details
    Browse the repository at this point in the history

Commits on Jun 17, 2021

  1. Configuration menu
    Copy the full SHA
    eeed862 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    7e4928a View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    1a57c00 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    5232c10 View commit details
    Browse the repository at this point in the history

Commits on Jun 18, 2021

  1. BUG: stats: fix ppf_hat to work for invalid input

    UNU.RAN seg faults when nans are present in the data. The behaviour for quantiles
    less than 0 and greater than 1 is also different than that of SciPy's. So, code
    and tests have been added to match the SciPy's behaviour and handle nans properly.
    tirthasheshpatel committed Jun 18, 2021
    Configuration menu
    Copy the full SHA
    b14f518 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    b6e7d22 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    6ae6b79 View commit details
    Browse the repository at this point in the history
  4. MAINT, TYP: stats: add __init__.pyi and ignore errors in test_samplin…

    …g.py
    
    Stubs were not getting exported as no __init__.pyi file was found by mypy.
    This file has been added and other errors (false positives) have been ignored.
    tirthasheshpatel committed Jun 18, 2021
    Configuration menu
    Copy the full SHA
    d2bcaa7 View commit details
    Browse the repository at this point in the history

Commits on Jun 19, 2021

  1. Configuration menu
    Copy the full SHA
    d19d0d8 View commit details
    Browse the repository at this point in the history
  2. DOC: stats: add tutorial

    This commit adds a tutorial on Universal Non-Uniform Random Number Generators
    in SciPy.
    tirthasheshpatel committed Jun 19, 2021
    Configuration menu
    Copy the full SHA
    42fb8cd View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    878d736 View commit details
    Browse the repository at this point in the history
  4. TST: stats: add 'randint' to basic_fail_dists

    'randint' was failing with an "unknown error" on 32-bit Ubuntu with NumPy 1.16.5.
    The error is occurring on line 723 in file src/methods/dau.c. Looks like it might
    be due to floating-point errors. As this is not inside SciPy (and also only exists
    on a very specific platform and numpy version), I have skipped this test case.
    
    Also, previously, a test was skipped if the distribution name wasn't string. This
    has been corrected to run the test on that test case.
    tirthasheshpatel committed Jun 19, 2021
    Configuration menu
    Copy the full SHA
    95304db View commit details
    Browse the repository at this point in the history

Commits on Jun 20, 2021

  1. TST: stats: add tests from UNU.RAN's test suite

    UNU.RAN tests TDR with some special custom distributions. Those tests have
    been added to SciPy. UNU.RAN's tests DAU with a few random and geometric
    PVs. Our test suite tests for all the discrete distributions and is stronger
    than UNU.RAN's. So, no tests for DAU have been added
    tirthasheshpatel committed Jun 20, 2021
    Configuration menu
    Copy the full SHA
    01cf222 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    ca45d2b View commit details
    Browse the repository at this point in the history
  3. ENH: stats.unuran: add a higher level API

    This adds a higher-level class called `GenericGenerator` which can be used to
    create generators using UNU.RAN's String API. Currently, only strings are allowed
    but it can be extended to accept a object containing a UNU.RAN distribution.
    
    One can simply pass a string with the distribution to sample from and create a
    generator:
    
        >>> rng = GenericGenerator("normal")
        >>> rvs = rng.rvs(100_000)
    
    It is also possible to pass the method and its parameters as keyword arguments.
    
        >>> rng = GenericGenerator("normal", method="tdr", domain=(-5, 5), c=0.,
        ...                        cpoints=(-4.5, -3.5, -2.5, 0.0, 2.5, 3.5, 4.5))
        >>> rvs = rng.rvs(100_000)
    
    It is also possible to change the parameters of the distribution. The below example
    creates a samplers for the normal distribution with mean 6 and standard
    deviation 4.25:
    
        >>> rng = GenericGenerator("normal(6, 4.25)")
    
    Advanced use case includes passing a custom distribution with methods coded as strings.
    
        >>> rng = GenericGenerator("distr = cont; pdf = 'exp(-0.5 * x^2)'")
        >>> rvs = rng.rvs(100_000)
    tirthasheshpatel committed Jun 20, 2021
    Configuration menu
    Copy the full SHA
    cd2bd96 View commit details
    Browse the repository at this point in the history
  4. REF: stats: use a dist object as input and remove string API

    Take a distribution object as input.
    tirthasheshpatel committed Jun 20, 2021
    Configuration menu
    Copy the full SHA
    5823dbf View commit details
    Browse the repository at this point in the history