Multimethods for the random module #73
Can you come up with a reproducer for it?
What happened: Calling da.random.multinomial with a Dask array as pvals raises a TypeError.
Reproducing the error:
In [1]: import numpy as np
In [2]: import dask.array as da
In [3]: pvals = [1/6] * 6
In [4]: da.random.multinomial(20, da.from_array(np.asarray(pvals)))
Traceback:
TypeError Traceback (most recent call last)
<ipython-input-4-eed5bfc93022> in <module>
----> 1 da.random.multinomial(20, da.from_array(np.asarray(pvals)))
~/miniconda3/envs/unumpy/lib/python3.7/site-packages/dask/array/random.py in multinomial(self, n, pvals, size, chunks, **kwargs)
336 size=size,
337 chunks=chunks,
--> 338 extra_chunks=((len(pvals),),),
339 )
340
~/miniconda3/envs/unumpy/lib/python3.7/site-packages/dask/array/random.py in _wrap(self, funcname, size, chunks, extra_chunks, *args, **kwargs)
189 (0,) * len(size),
190 small_args,
--> 191 small_kwargs,
192 )
193
~/miniconda3/envs/unumpy/lib/python3.7/site-packages/dask/array/random.py in _apply_random(RandomState, funcname, state_data, size, args, kwargs)
466 state = RandomState(state_data)
467 func = getattr(state, funcname)
--> 468 return func(*args, size=size, **kwargs)
469
470
mtrand.pyx in numpy.random.mtrand.RandomState.multinomial()
~/miniconda3/envs/unumpy/lib/python3.7/site-packages/dask/array/core.py in __len__(self)
1186 def __len__(self):
1187 if not self.chunks:
-> 1188 raise TypeError("len() of unsized object")
1189 return sum(self.chunks[0])
1190
TypeError: len() of unsized object
What's expected to happen: Return the same result as when passing pvals as a plain list:
In [1]: import numpy as np
In [2]: import dask.array as da
In [3]: pvals = [1/6] * 6
In [4]: da.random.multinomial(20, pvals)
Out[4]: dask.array<multinomial, shape=(6,), dtype=int64, chunksize=(6,), chunktype=numpy.ndarray>
Environment:
The simpler fix would be to not dispatch
More multimethods/classes: Should we include these as well?
Here's the docs for
This is fixed, thanks for the help.
I've added these as well.
class RandomState(metaclass=ClassOverrideMetaWithConstructor):
    pass
We should add methods for these classes as well.
Not sure I understand what you mean.
If you look at https://github.com/Quansight-Labs/unumpy/blob/master/unumpy/_multimethods.py#L244-L295 you will see how the ufunc class defines methods. The same needs to be done here.
Just for RandomState or Generator as well? As I understand it, we just have to write the multimethods again but as class methods, right?
Prioritize Generator, but also RandomState.
> You will see how the ufunc class defines methods. The same needs to be done here.

I tried calling ufunc's accumulate method but it raises a BackendNotImplementedError. I think the methods are broken and need to be rewritten in terms of overridden_class.
Does reduce work?
Yes, it works if reduce is mapped in _implementations in a given backend.
Maybe we need a way to set the domain of classes to numpy.classname.
I have a working implementation that serves as a simple solution (although maybe not the ideal one), which maps the Generator's methods. In the NumPy backend it would be something like this:
for k, v in unumpy.random.Generator.__dict__.items():
    if isinstance(v, _Function):
        _implementations[v] = getattr(np.random.Generator, k)
So for each multimethod in unumpy's Generator class it maps the corresponding method in NumPy's Generator. I think this for loop is at least better than manually adding the methods to the dictionary since they are manifold.
> Maybe we need a way to set the domain of classes to numpy.classname.

This sounds like a more interesting solution since the problem resides in the incomplete name of a given class method (i.e., without the class name prefix). I've looked into __qualname__ which seems to give the full name (without the modules) so we might want to use something like that.
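As a small illustration of the difference, the sketch below compares `__name__` and `__qualname__` for a method and a free function of the same name. The `Generator` class, the `normal` function, the `qualified_key` helper, and the `"numpy.random"` prefix are all hypothetical stand-ins, not unumpy's actual naming scheme.

```python
class Generator:
    """Hypothetical stand-in class with a single method."""

    def normal(self):
        return 0.0


def normal():
    """Free function that shares its short name with Generator.normal."""
    return 0.0


def qualified_key(obj, prefix="numpy.random"):
    """Build a dotted key from an assumed prefix and the object's __qualname__."""
    return f"{prefix}.{obj.__qualname__}"


# __name__ cannot tell the method and the free function apart ...
same_short_name = Generator.normal.__name__ == normal.__name__

# ... whereas __qualname__ keeps the enclosing class as a prefix.
method_key = qualified_key(Generator.normal)
function_key = qualified_key(normal)
```

When these definitions live at module level, `method_key` ends in `Generator.normal` while `function_key` ends in plain `normal`, so the two no longer collide.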
The last commit adds the multimethods for the
Thanks so much, @joaosferreira. Any further comments, @peterbell10?
Thanks, @joaosferreira!
This picks up the work started in #46, which added multimethods for random routines. Most of the multimethods added in the previous PR were revised and, hopefully, corrected. The changes were mostly to the argument replacers and to marking arguments for dispatching. This PR also adds two classes, RandomState and Generator. The multimethods added are manifold and are listed below:
Seeding and State
get_state
set_state
seed
Simple random data
rand
randn
randint
random_integers
random_sample
choice
bytes
Permutations
shuffle
permutation
Distributions
beta
binomial
chisquare
dirichlet
exponential
f
gamma
geometric
gumbel
hypergeometric
laplace
logistic
lognormal
logseries
multinomial
multivariate_normal
negative_binomial
noncentral_chisquare
noncentral_f
normal
pareto
poisson
power
rayleigh
standard_cauchy
standard_exponential
standard_gamma
standard_normal
standard_t
triangular
uniform
vonmises
wald
weibull
zipf
Notes:
random, ranf and sample: although they are documented as aliases for random_sample, they all reference different objects, so they have their own multimethods.
The Generator class is commented out in the tests because I'm not entirely sure what argument I should pass it.
multinomial is failing with a weird error in the Dask backend. I think it needs to be reported upstream, but I'm not sure.