Implementation of random number generator for Myia #301
Conversation
Alternatively, we could require the seed to be a constant and do that processing outside of myia.
I don't get why you need to do inplace updates. There should be a better way than creating an empty array and updating the values one by one. Not to mention that it's going to be super slow.
Hi @abergeron Ok, exceptions seem to work now. I removed string formatting and extra exception arguments, fixed other things, and now it no longer produces errors. About inplace updates: the current generator code produces one scalar at a time, so I need to call the generator for as many values as I need. It's something like:

```python
rstate = ...
values = []
for _ in range(n_elements):
    val, rstate = next_value(rstate)
    values.append(val)
```

But now I found a scan primitive. My problem here is that it seems this scan primitive is not yet used. Is it?
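For illustration, the loop above can be phrased as a generic scan that threads the state functionally, with no in-place updates. This is a sketch: `next_value` here is a toy stand-in (a small LCG step), not Myia's actual generator, and `scan` is a hypothetical helper, not an existing Myia primitive.

```python
# Toy stand-in for the real per-scalar generator step.
def next_value(rstate):
    new_state = (rstate * 1103515245 + 12345) % 2**31
    return new_state / 2**31, new_state

def scan(step, state, n):
    """Apply a state-threading step n times, collecting the outputs."""
    outputs = []
    for _ in range(n):
        out, state = step(state)
        outputs.append(out)
    return outputs, state

values, rstate = scan(next_value, 1234, 5)
```

Because the state is threaded explicitly, the same seed always reproduces the same sequence.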
(force-pushed from 288511e to fd91855)
(force-pushed from 5808102 to 4da0e05)
This is a first review pass. I did not look at the RNG code yet (I will try to do that soon), but I looked at changes in the other parts of Myia.
It looks good for the most part, although I doubt the changes to allow upcasting will generally work, and I'd like to know why they are in this PR (were they needed to make something work at some point?) More importantly, though, the lack of upcasting so far was on purpose, so if we are going to change that policy we would need to discuss it separately (and it should probably be a toggle).
myia/abstract/amerge.py
Outdated
```python
    raise TypeMismatchError(x1, x2)
elif not forced:
    # Try to infer the smallest super-type containing both types x1 and x2.
    super_type = xtype.number_super_set(x1, x2)
```
What is this needed for?
If we are doing upcasting, `amerge` is probably not a good place to do that, because it will not help us put the proper casts in place. For example, if you have:

```python
x = int8(13)
y = int32(4000)
z = x if condition else y
```

`amerge` will be used to check the type of z, and it will say that z has type `int32`, but that's not true, strictly speaking: the type of z would have to be `Union[int8, int32]`. For it to be `int32` we need to wrap x in a cast, but `amerge` will not tell us to do this.
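The distinction can be illustrated with NumPy (used here only as an analogy; `amerge` itself works on Myia's abstract types): type *promotion* answers "what is the smallest common super-type?", while making a value actually carry that type requires a separate, explicit cast.

```python
import numpy as np

# Promotion computes the merged type...
promoted = np.promote_types(np.int8, np.int32)
assert promoted == np.dtype(np.int32)

# ...but the int8 value only takes that type after an explicit cast,
# which is the step that amerge alone cannot insert.
x = np.int8(13)
assert x.dtype == np.dtype(np.int8)
x32 = x.astype(np.int32)
assert x32.dtype == np.dtype(np.int32)
```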
You're right, I needed upcasting because I have to multiply an integer by a floating-point value in the RNG, and the floating-point value could be 16, 32 or 64 bits. But ultimately, I was able to solve the problem by casting the integer to the right floating type using a dictionary.
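A minimal sketch of what such a dictionary-based cast might look like; the names (`_FLOAT_CAST`, `mul_int_by_float`) are hypothetical, not the actual PR code.

```python
import numpy as np

# Hypothetical mapping from a float dtype to the cast applied to the
# integer operand before multiplication.
_FLOAT_CAST = {
    np.dtype(np.float16): np.float16,
    np.dtype(np.float32): np.float32,
    np.dtype(np.float64): np.float64,
}

def mul_int_by_float(i, f):
    # Cast the integer to the float operand's exact precision, then multiply.
    cast = _FLOAT_CAST[np.dtype(type(f))]
    return cast(i) * f

r = mul_int_by_float(3, np.float32(0.5))
```

The result keeps the float operand's dtype instead of relying on implicit promotion.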
myia/abstract/infer.py
Outdated
```
@@ -296,9 +296,19 @@ def get_inferrer_for(self, m: MacroFunction):
        argrefs = [self.ref(node, ctx) for node in n_args]

        if isinstance(fn, AbstractType):
            # Replace abstract type instanciation with
```
Suggested change:

```diff
- # Replace abstract type instanciation with
+ # Replace abstract type instantiation with
```
Done!
myia/abstract/infer.py
Outdated
```python
if isinstance(cls, AbstractScalar):
    newcall = g.apply(P.scalar_cast, *n_args,
                      AbstractScalar({VALUE: ANYTHING,
                                      TYPE: cls.xtype()}))
```
Can't you just pass cls directly here instead of building an AbstractScalar?
Done!
myia/abstract/infer.py
Outdated
```python
                      AbstractScalar({VALUE: ANYTHING,
                                      TYPE: cls.xtype()}))
else:
    newfn = g.apply(P.partial, P.make_record, fn.xvalue())
```
You already have `fn.xvalue()` in `val`, might as well use it :)
Done!
myia/compile/backends/pytorch.py
Outdated
```python
def _pytorch_array_reduce_mul(tshp):
    """Generate implementation for product reduction based on givne axes."""
```
Suggested change:

```diff
- """Generate implementation for product reduction based on givne axes."""
+ """Generate implementation for product reduction based on given axes."""
```
Done!
myia/operations/op_full.py
Outdated
```python
@core
def full(shape, fill_value, dtype):
```
You should be able to have default arguments here.
Suggested change:

```diff
- def full(shape, fill_value, dtype):
+ def full(shape, fill_value, dtype=None):
```
Done!
myia/operations/op_full.py
Outdated
```python
        or a numpy dtype (e.g. np.int32)
    :return: an array
    """
    abstract_scalar_type = to_scalar_type(dtype, typeof(fill_value))
```
I'm not a huge fan of the default argument on `to_scalar_type`. I think it would be better to do it like this:
Suggested change:

```diff
- abstract_scalar_type = to_scalar_type(dtype, typeof(fill_value))
+ if dtype is None:
+     dtype = typeof(fill_value)
+ abstract_scalar_type = to_scalar_type(dtype)
```
Done!
myia/operations/ops_dunder.py
Outdated
```
@@ -164,6 +164,7 @@ def protocol(x, y):
    name='or', lattr='__or__', rattr='__ror__',
    pyop=operator.or_
)
```
Why are you separating xor from its friends? :(
Useless line removed!
tests/operations/test_operations.py
Outdated
```
@@ -51,3 +54,25 @@ def test_bitwise_lshift(a, b):
)
def test_bitwise_rshift(a, b):
    return a >> b


def test_op_full():
```
Use `@mt` for this test and `test_op_prod`.
myia/operations/op_full.py
Outdated
```python
    :param fill_value: a scalar value
    :param dtype: either a string (e.g. 'int32')
        or a numpy dtype (e.g. np.int32)
    :return: an array
```
This isn't the docstyle we've been using so far (not that we've been super consistent about it). You can check in other files.
Hi @breuleux, ultimately I split the PR into 2 branches again: this one with just the code related to the RNG, and another with the operations full and prod: https://github.com/notoraptor/myia/tree/ops-full-and-prod After checking the code, I realized that I don't actually use prod and full in my current RNG implementation, because I ultimately simplified the code so that the RNG just generates one scalar at a time (allowing me to avoid having to deal with any array to update). So neither of these ops is currently useful here. But I still applied your suggestions to the whole code before the split. Also, when trying to test the full op with
@notoraptor Okay, great! Can you create a PR with the full and prod operations?
Rebased, and commits cleaned.
PR updated. Added primitives. Currently it actually works only with the PyTorch backend. My old RNG implementation is still here.
So I kind of feel that [...] On the other hand, in the future, I think we want the RNG to become a Myia graph (once the compiler's in better shape to handle it), so we could also just kick that can down the road.
```python
@standard_prim(P.random_initialize)
async def infer_random_initialize(self, engine, seed):
```
Do we require the seed to be an integer? If so, add a type check.
Yes, seed is the key, and it should be a uint32. So I will add a typecheck.
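A possible shape for that type check, as a sketch; `check_seed` is a hypothetical helper illustrating the uint32 constraint, not code from the PR.

```python
UINT32_MAX = 2**32 - 1

def check_seed(seed):
    """Reject anything that is not a plain int in uint32 range."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(seed, int) or isinstance(seed, bool):
        raise TypeError(f"seed must be an integer, got {type(seed).__name__}")
    if not 0 <= seed <= UINT32_MAX:
        raise ValueError(f"seed must fit in uint32, got {seed}")
    return seed
```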
```python
        rstate: AbstractRandomState,
        low: AbstractScalar,
        high: AbstractScalar,
        size: AbstractTuple):
```
I think we should standardize on the name `shape` for shape arguments.
Ok!
```python
    value_type = AbstractArray(output_scalar_type,
                               {SHAPE: output_shape, TYPE: xtype.NDArray})
else:
    value_type = output_scalar_type
```
It does not seem possible to generate a random array with shape `()`. In general, operations should be as regular as possible, so if the size is a tuple, the output should be an array with that shape. If the user wants a scalar they can use `array_to_scalar`.
OK. So I should check that the given shape is not an empty tuple? Because it's valid in pytorch:

```python
>>> import torch
>>> torch.zeros(())
tensor(0.)
>>> torch.zeros(()).uniform_()
tensor(0.9694)
```
No, it's fine if the shape is the empty tuple, but it should return an AbstractArray in that case (with shape == ()). Basically you can just remove the conditional.
Ok!
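For reference, NumPy shows the same regularity the reviewer describes: an empty shape still yields a 0-d array, and scalar extraction is a separate, explicit step (analogous to `array_to_scalar`).

```python
import numpy as np

a = np.full((), 3.0)   # empty shape -> 0-d array, not a bare scalar
assert isinstance(a, np.ndarray) and a.shape == ()

s = a.item()           # explicit array -> scalar conversion
assert isinstance(s, float) and s == 3.0
```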
myia/pipeline/standard.py
Outdated
```
@@ -86,6 +86,8 @@
    np.tanh: operations.array_tanh,
    np.sum: operations.sum,
    np.prod: operations.prod,
    public_api.myia_random_initialize: operations.random_initialize,
    public_api.myia_random_uniform: operations.random_uniform,
```
I'm not sure what `public_api` is for. `myia.operations` already is a public API that the user is allowed to import from.
Oh, ok, I did not know. So the user could just directly import the operations `random_initialize` and `random_uint` and use them as functions in Python code.
Yes, exactly.
tests/test_public_api.py
Outdated
```python
rstate = run_random_generator(run_relay)
key, counter = rstate
assert key == 12345678
assert counter == 4
```
At least for the Relay implementation, I would like to see an additional test that tests the exact values returned by the RNG with some specific seed. For example, initialize with seed 1234, generate one 2x2 matrix, check the values, generate a second 2x2 matrix, check the values. That way, if something is changed in the RNG implementation, we can verify that the same numbers come out.
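The suggested test pattern can be sketched with NumPy's own Philox generator (illustrative only; the Relay implementation would produce its own values, so the literal expected numbers in the real test would be filled in from a reference run).

```python
import numpy as np

def two_draws(seed):
    """Initialize with a seed and draw two 2x2 matrices in sequence."""
    rng = np.random.Generator(np.random.Philox(seed))
    return rng.random((2, 2)), rng.random((2, 2))

# Same seed -> exactly the same sequence, so hard-coded expected
# values would stay stable across runs and flag any RNG change.
a1, b1 = two_draws(1234)
a2, b2 = two_draws(1234)
```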
Ok!
myia/compile/backends/relay.py
Outdated
```python
assert ref_seed.is_constant(type(None)) or ref_seed.is_constant(int)
seed = ref_seed.value
if seed is None:
    seed = 12345678
```
I don't think there should be a default seed, especially not in the backend implementation.
Ok!
Hi @breuleux I added 2 commits to fix all your points! By the way, @abergeron was right: some tests (not the RNG ones, but others in the test suite) won't pass because of changes in the TVM API. So I guess this PR will need to wait for PR #322 at least!
- use pytorch RNG implementation based on torch.Generator
- implement counter-based RNG Philox2x32 in relay backend
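For context, the core of a Philox2x32-style counter-based generator is tiny. Below is an illustrative sketch using the round constants from the Random123 paper; it is not necessarily the exact code used in the PR's relay implementation.

```python
MASK32 = 0xFFFFFFFF
PHILOX_M2x32 = 0xD256D193   # round multiplier
PHILOX_W32 = 0x9E3779B9     # Weyl increment applied to the key each round

def philox2x32(counter, key, rounds=10):
    """Map a (counter, key) pair to two pseudo-random 32-bit words."""
    c0, c1 = counter
    for _ in range(rounds):
        prod = PHILOX_M2x32 * c0
        hi, lo = (prod >> 32) & MASK32, prod & MASK32
        c0, c1 = hi ^ key ^ c1, lo
        key = (key + PHILOX_W32) & MASK32
    return c0, c1
```

Being a pure function of (counter, key), it needs no mutable state: generating the n-th value only requires bumping the counter, which is what makes the (key, counter) rstate tuple in this PR sufficient.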
@abergeron @breuleux PR rebased!
```python
# Convert torch tensor to numpy tensor.
output = self.to_numpy(v)
# If possible and necessary, cast numpy tensor to expected tensor.
array_type = t.element.xtype()
```
If you ever need to do that, then whatever function produced the array has a bug. All primitives should return the right type.
@abergeron It seems we cannot cast values to `uint16`, `uint32` or `uint64` (unsigned integers other than `uint8`) in Pytorch. This code currently fails:

```python
import numpy as np
from myia import myia
from myia.operations import array_cast

@myia(backend='pytorch')
def f(val):
    return array_cast(val, np.uint32)

def main():
    val = np.arange(4, dtype='float32')
    res = f(val)
    print(res.dtype)

if __name__ == '__main__':
    main()
```

With this error:

```
File "/home/notoraptor/mila/dev/git/myia/myia/compile/backends/pytorch.py", line 178, in pytorch_array_cast
    dt = _type_map[t.value.xtype()]
KeyError: UInt[32]
```

Indeed, `_type_map` does not handle unsigned types very well: https://github.com/mila-iqia/myia/blob/master/myia/compile/backends/pytorch.py#L16

Checking the torch documentation, it seems some torch dtypes are indeed missing: https://pytorch.org/docs/stable/tensor_attributes.html#torch-dtype
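One numpy-side workaround for the missing unsigned dtypes is to carry the uint32 payload in a bit-identical int32 container while the data passes through torch, and reinterpret it afterwards. This is a sketch of the reinterpretation trick only, under the assumption that the consumer tolerates the intermediate signed view; it is not what the PR does.

```python
import numpy as np

# uint32 data that PyTorch's dtype table cannot represent directly.
vals = np.array([1, 2**31 + 5, 7], dtype=np.uint32)

as_signed = vals.view(np.int32)   # same bits, torch-friendly dtype
back = as_signed.view(np.uint32)  # reinterpret to recover the values
```

`view` never copies or converts; both arrays share the same buffer, so the round trip is lossless even for values above 2**31.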
Ok then for this change, but this is just more motivation for me to drop the pytorch backend.
```python
    self.require_constant(e, argnum=f'"3:size[{edx}]"')
    for edx, e in enumerate(shape.elements))
if not output_shape:
    output_shape = (1,)
```
This shouldn't be supported at the primitive level.
Ok, removed!
I ran coverage locally: there are some spots missing.
Update test_rng to test shapes (1,) and (). Test python implementations. Simplify relay_philox.py. Remove unused code.
Code updated! Except for this: #301 (comment)
Hi @abergeron @breuleux
I just make this PR to ease any discussion around the RNG implementation.
Currently, I have implemented the RNG in pure Python, based on Theano code. For reference:
The implementation currently seems to work. I also added a test to compare the output with Theano's output, and it currently produces the same results.
Then, my next step was to try to compile the python implementation into myia, but I am facing many issues:
- [...] the `raise` primitive in operations. I don't yet know why compilation fails.
- [...] `scalar_to_array` + `distribute` ops, but, from what I saw, it seems these operations are currently used in the `zero_like` op (isn't it?), which is not suited here. So I guess I need to create a `zeros` operation that can take just shape and dtype as arguments, and I guess I should use `np.zeros` as the entry point (i.e. the value to be replaced by the op later in compilation).
- [...] `np.asarray` op here to ease this step.
- [...] `nstreams=1`). Thus, in intermediate functions, I could just return the new updated state vector (it contains only 6 values) instead of trying to update a potentially huge matrix in place. So, I removed a place where an in-place update was used, but I will still need to make in-place updates when generating random numbers, as I will need to a) create the output array, and b) update each array value with a generated random scalar.

This is where I am, currently. What do you think?
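The two steps described in the last point, (a) create the output array and (b) fill each cell with a generated scalar, can be sketched as follows. This is an illustration with a toy `next_value` stand-in, not the actual Theano-derived generator from the PR.

```python
import numpy as np

def next_value(state):
    state = (state * 1103515245 + 12345) % 2**31   # toy LCG step
    return state / 2**31, state

def uniform_array(state, shape):
    """(a) allocate the output, then (b) fill it one scalar at a time."""
    out = np.empty(shape, dtype=np.float64)
    flat = out.reshape(-1)       # view on the same buffer, so writes land in out
    for i in range(flat.size):
        flat[i], state = next_value(state)
    return out, state

arr, final_state = uniform_array(1234, (2, 3))
```

Threading the small state vector through the calls keeps the generator itself functional; only the output buffer is written in place.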