
Implementation of random number generator for Myia #301

Merged
merged 2 commits into from Feb 21, 2020

Conversation

notoraptor
Collaborator

Hi @abergeron @breuleux

I just made this PR to ease discussion around the RNG implementation.

Currently, I have implemented the RNG in pure Python, based on Theano code. For reference:

The implementation currently seems to work. I also added a test comparing the output with Theano's output, and it currently produces the same results.

Then, my next step was to try to compile the Python implementation with Myia, but I am facing several issues:

  • The Python implementation checks input seeds, as not all values are allowed, and raises exceptions if bad seeds are given. It seems I cannot currently compile exception raising with Myia, though I saw a raise primitive in operations. I don't yet know why compilation fails.
  • I need to generate the output array and populate it with random values, but I don't yet have a way to create new data. @breuleux told me about the scalar_to_array + distribute ops, but from what I saw, these operations are currently used in the zero_like op (aren't they?), which is not suited here. So I guess I need to create a zeros operation that takes just a shape and dtype as arguments, and I guess I should use np.zeros as the entry point (i.e. the value to be replaced by the op later in compilation).
  • I also need to create the initial random state from a given seed, which can be either an integer or a ready-to-use integer vector with 6 values. I guess I should create something like an np.asarray op here to ease this step.
  • I simplified the random generator so that it uses only one vector as the state to update, instead of a matrix as in Theano (in Theano, this should be equivalent to getting random numbers with parameter nstreams=1). Thus, in intermediate functions, I can just return the new updated state vector (it contains only 6 values) instead of trying to update a potentially huge matrix in place. So I removed one place where in-place updates were used, but I will still need in-place updates when generating random numbers, as I will need to a) create the output array, and b) update each array value with a generated random scalar.

This is where I am, currently. What do you think?
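The seed validation from the first bullet can be sketched as follows. The vector length (6) comes from the discussion above, while the specific moduli and nonzero-half constraints are illustrative assumptions modeled on MRG-style generators, not the exact checks from this PR:

```python
import numpy as np

# Illustrative MRG-style bounds (assumptions, not the PR's exact values).
M1 = 2 ** 31 - 1
M2 = 2 ** 31 - 21069

def check_seed(seed):
    """Validate a 6-value state vector, raising ValueError on bad seeds."""
    state = np.asarray(seed, dtype=np.int64)
    if state.shape != (6,):
        raise ValueError("seed must be a vector of exactly 6 values")
    if np.all(state[:3] == 0) or np.all(state[3:] == 0):
        raise ValueError("each half of the seed must contain a nonzero value")
    if np.any(state < 0) or np.any(state[:3] >= M1) or np.any(state[3:] >= M2):
        raise ValueError("seed values out of range")
    return state
```

Note that each ValueError gets a single plain string with no formatting, which sidesteps the string-formatting limitation discussed in this thread.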

@abergeron
Contributor

  • The Python implementation checks input seeds, as not all values are allowed, and raises exceptions if bad seeds are given. It seems I cannot currently compile exception raising with Myia, though I saw a raise primitive in operations. I don't yet know why compilation fails.

raise should work. However, I see that some exceptions use string formatting, and that doesn't work. Also, you can't have anything other than a single string in the exception args.

Alternatively, we could require the seed to be a constant and do that processing outside of myia.

  • I simplified the random generator so that it uses only one vector as the state to update, instead of a matrix as in Theano (in Theano, this should be equivalent to getting random numbers with parameter nstreams=1). Thus, in intermediate functions, I can just return the new updated state vector (it contains only 6 values) instead of trying to update a potentially huge matrix in place. So I removed one place where in-place updates were used, but I will still need in-place updates when generating random numbers, as I will need to a) create the output array, and b) update each array value with a generated random scalar.

I don't see why you need in-place updates. There should be a better way than creating an empty array and updating the values one by one, not to mention that that would be super slow.

@notoraptor
Collaborator Author

Hi @abergeron

OK, exceptions seem to work now. I removed the string formatting and extra exception arguments, and fixed other things; it no longer produces errors.

About in-place updates: the current generator code produces one scalar at a time, so I need to call the generator once per value I need. It's something like:

rstate = ...  # initial RNG state
values = []
for _ in range(n_elements):
    val, rstate = next_value(rstate)  # one scalar per call, plus the new state
    values.append(val)

But now I found an array_scan implementation in Myia. So, if I use it with something like distribute(scalar_to_array(0)) as the array input and an adequate function, I guess I could avoid creating an array to be updated and just let the scan function generate it for me, right?

My problem here is that this scan primitive does not seem to be used yet. Is it?
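As a plain-Python sketch of that scan idea (array_scan's real signature may differ, and the LCG update here is a stand-in for the actual 6-value state transition), the output can be built as a fold over the state stream instead of filling a preallocated array:

```python
from itertools import accumulate

def next_state(s):
    # Toy LCG transition, standing in for the real 6-value state update.
    return (1103515245 * s + 12345) % (2 ** 31)

def scan_uniform(seed, n):
    """Produce n uniforms in [0, 1) as a scan, with no in-place updates."""
    # states[0] is the state used for the first draw; states[n] is the
    # state to carry forward after the last draw.
    states = list(accumulate(range(n), lambda s, _: next_state(s),
                             initial=next_state(seed)))
    values = [s / float(2 ** 31) for s in states[:n]]
    return values, states[-1]
```

Calling scan_uniform(seed, n) returns the n values together with the state needed to continue the stream, which matches the state-threading style used elsewhere in this PR.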

Member

@breuleux breuleux left a comment

This is a first review pass. I did not look at the RNG code yet (I will try to do that soon), but I looked at changes in the other parts of Myia.

It looks good for the most part, although I doubt the changes to allow upcasting will work in general, and I'd like to know why they are in this PR (were they needed to make something work at some point?). More importantly, the lack of upcasting so far was on purpose, so if we are going to change that policy we would need to discuss it separately (and it should probably be a toggle).

raise TypeMismatchError(x1, x2)
elif not forced:
# Try to infer the smallest super-type containing both types x1 and x2.
super_type = xtype.number_super_set(x1, x2)
Member

What is this needed for?

If we are doing upcasting, amerge is probably not a good place to do that, because it will not help us put the proper casts in place. For example, if you have:

x = int8(13)
y = int32(4000)
z = x if condition else y

amerge will be used to check the type of z and it will say that z has type int32, but that's not true, strictly speaking: the type of z would have to be Union[int8, int32]. For it to be int32 we need to wrap x into a cast, but amerge will not tell us to do this.
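A NumPy analogue of the example (plain NumPy, not Myia) makes the missing cast visible: for z to genuinely have type int32 on both branches, the int8 branch must be wrapped in an explicit cast:

```python
import numpy as np

def merge_branches(condition):
    x = np.int8(13)
    y = np.int32(4000)
    # Without the explicit astype, the two branches would have different
    # dtypes (int8 vs int32), i.e. the static type is really a union.
    return x.astype(np.int32) if condition else y

z_true = merge_branches(True)
z_false = merge_branches(False)
```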

Collaborator Author

You're right. I needed upcasting because I have to multiply an integer by a floating-point value in the RNG, and the floating-point value can be 16, 32, or 64 bits. But ultimately I was able to solve the problem by casting the integer to the right floating type using a dictionary.

@@ -296,9 +296,19 @@ def get_inferrer_for(self, m: MacroFunction):
argrefs = [self.ref(node, ctx) for node in n_args]

if isinstance(fn, AbstractType):
# Replace abstract type instanciation with
Member

Suggested change
# Replace abstract type instanciation with
# Replace abstract type instantiation with

Collaborator Author

Done!

if isinstance(cls, AbstractScalar):
newcall = g.apply(P.scalar_cast, *n_args,
AbstractScalar({VALUE: ANYTHING,
TYPE: cls.xtype()}))
Member

Can't you just pass cls directly here instead of building an AbstractScalar?

Collaborator Author

Done!

AbstractScalar({VALUE: ANYTHING,
TYPE: cls.xtype()}))
else:
newfn = g.apply(P.partial, P.make_record, fn.xvalue())
Member

You already have fn.xvalue() in val, might as well use it :)

Collaborator Author

Done!



def _pytorch_array_reduce_mul(tshp):
"""Generate implementation for product reduction based on givne axes."""
Member

Suggested change
"""Generate implementation for product reduction based on givne axes."""
"""Generate implementation for product reduction based on given axes."""

Collaborator Author

Done!



@core
def full(shape, fill_value, dtype):
Member

You should be able to have default arguments here.

Suggested change
def full(shape, fill_value, dtype):
def full(shape, fill_value, dtype=None):

Collaborator Author

Done!

or a numpy dtype (e.g. np.int32)
:return: an array
"""
abstract_scalar_type = to_scalar_type(dtype, typeof(fill_value))
Member

I'm not a huge fan of the default argument on to_scalar_type. I think it would be better to do it like this:

Suggested change
abstract_scalar_type = to_scalar_type(dtype, typeof(fill_value))
if dtype is None:
dtype = typeof(fill_value)
abstract_scalar_type = to_scalar_type(dtype)

Collaborator Author

Done!

@@ -164,6 +164,7 @@ def protocol(x, y):
name='or', lattr='__or__', rattr='__ror__',
pyop=operator.or_
)

Member

Why are you separating xor from its friends? :(

Suggested change

Collaborator Author

Useless line removed!

@@ -51,3 +54,25 @@ def test_bitwise_lshift(a, b):
)
def test_bitwise_rshift(a, b):
return a >> b


def test_op_full():
Member

Use @mt for this test and test_op_prod.

:param fill_value: a scalar value
:param dtype: either a string (e.g. 'int32')
or a numpy dtype (e.g. np.int32)
:return: an array
Member

This isn't the docstyle we've been using so far (not that we've been super consistent about it). You can check in other files.

@notoraptor
Collaborator Author

Hi @breuleux, ultimately I split the PR again into 2 branches: this one with just the code related to the RNG, and another with the operations full and prod: https://github.com/notoraptor/myia/tree/ops-full-and-prod

After checking the code, I realized that I don't actually use prod and full in my current RNG implementation, because I ultimately simplified the code so that the RNG generates just one scalar at a time (allowing me to avoid dealing with any array to update). So these two ops are not currently useful here. But I still applied your suggestions to the whole code before the split.

Also, when trying to test the full op with @mt, I got some issues: one with the relay backend, and some related to string handling (when passing the dtype as a string value). I don't yet know why I get these errors, but I can work on it later.

@breuleux
Member

@notoraptor Okay, great! Can you create a PR with the full and prod operations?

@notoraptor
Collaborator Author

@breuleux Done here! #310

@notoraptor
Collaborator Author

Rebased, and commits cleaned.

@notoraptor
Collaborator Author

PR updated. Added primitives random_initialize and random_uniform, plus placeholder functions myia_random_initialize and myia_random_uniform that allow generating uniform random tensors.

Currently, it works only with the PyTorch backend.

My old RNG implementation is still here.

@breuleux
Member

So I kind of feel that random_uniform as a primitive may be a bit too high-level. One issue is that it throws away 8 bits of information for each float, and there's no way for the user to recover them, so if e.g. they actually wanted a matrix of random uint32 from 0 to MAXINT, they couldn't do that. Would it be possible to have random_uint instead, and do the conversion to the [0, 1) range using Myia operations? This would require finding out how to generate a max-range uint32 with PyTorch.

On the other hand, in the future, I think we want the RNG to become a Myia graph (once the compiler's in a better shape to handle it), so we could also just kick that can down the road.
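The uint-to-[0, 1) conversion suggested above can be done with ordinary arithmetic. A NumPy sketch follows; the choice of keeping the top 24 bits, so the result fits exactly in a float32 mantissa, is an assumption of this sketch rather than something decided in the PR:

```python
import numpy as np

def uint32_to_unit_float32(u):
    """Map uint32 samples to float32 values in [0, 1) via the top 24 bits."""
    u = np.asarray(u, dtype=np.uint32)
    return (u >> np.uint32(8)).astype(np.float32) * np.float32(1.0 / (1 << 24))

samples = np.array([0, 2 ** 31, 2 ** 32 - 1], dtype=np.uint32)
unit = uint32_to_unit_float32(samples)
```

Dropping the low 8 bits is what makes this exact in float32, which is also where the "throws away 8 bits" observation above comes from.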



@standard_prim(P.random_initialize)
async def infer_random_initialize(self, engine, seed):
Member

Do we require the seed to be an integer? If so, add a type check.

Collaborator Author

Yes, the seed is the key, and it should be a uint32. So I will add a type check.

rstate: AbstractRandomState,
low: AbstractScalar,
high: AbstractScalar,
size: AbstractTuple):
Member

I think we should standardize on the name shape for shape arguments.

Collaborator Author

Ok!

value_type = AbstractArray(output_scalar_type,
{SHAPE: output_shape, TYPE: xtype.NDArray})
else:
value_type = output_scalar_type
Member

It does not seem possible to generate a random array with shape (). In general, operations should be as regular as possible, so if the size is a tuple, the output should be an array with that shape. If the user wants a scalar they can use array_to_scalar.

Collaborator Author

OK. So should I check that the given shape is not an empty tuple? Because it's valid in PyTorch:

>>> import torch
>>> torch.zeros(())
tensor(0.)
>>> torch.zeros(()).uniform_()
tensor(0.9694)

Member

No, it's fine if the shape is the empty tuple, but it should return an AbstractArray in that case (with shape == ()). Basically you can just remove the conditional.

Collaborator Author

Ok!

@@ -86,6 +86,8 @@
np.tanh: operations.array_tanh,
np.sum: operations.sum,
np.prod: operations.prod,
public_api.myia_random_initialize: operations.random_initialize,
public_api.myia_random_uniform: operations.random_uniform,
Member

I'm not sure what public_api is for. myia.operations already is a public API that the user is allowed to import from.

Collaborator Author

Oh, OK, I didn't know. So the user can just directly import the operations random_initialize and random_uint and use them as functions in Python code.

Member

Yes, exactly.

rstate = run_random_generator(run_relay)
key, counter = rstate
assert key == 12345678
assert counter == 4
Member

At least for the Relay implementation, I would like to see an additional test that tests the exact values returned by the RNG with some specific seed. For example, initialize with seed 1234, generate one 2x2 matrix, check the values, generate a second 2x2 matrix, check the values. That way, if something is changed in the RNG implementation, we can verify that the same numbers come out.
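A sketch of such a regression test, using NumPy's generator as a stand-in (the real test would hard-code the exact values produced by the Relay Philox implementation for seed 1234):

```python
import numpy as np

def test_rng_exact_values():
    # Initialize with a fixed seed, draw two 2x2 matrices, and check
    # that a fresh generator with the same seed reproduces them both.
    rng = np.random.default_rng(1234)
    first = rng.random((2, 2))
    second = rng.random((2, 2))

    rng_again = np.random.default_rng(1234)
    assert np.array_equal(first, rng_again.random((2, 2)))
    assert np.array_equal(second, rng_again.random((2, 2)))
    assert not np.array_equal(first, second)
```

With hard-coded expected matrices in place of the second generator, any change to the RNG implementation that alters the output stream would fail this test immediately.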

Collaborator Author

Ok!

assert ref_seed.is_constant(type(None)) or ref_seed.is_constant(int)
seed = ref_seed.value
if seed is None:
seed = 12345678
Member

I don't think there should be a default seed, especially not in the backend implementation.

Collaborator Author

Ok!

@notoraptor
Collaborator Author

Hi @breuleux, I added 2 commits to fix all your points!

By the way, @abergeron was right: some tests (not the RNG ones, but others in the test suite) won't work because of changes in the TVM API. So I guess this PR will need to wait for at least PR #322!

- use pytorch RNG implementation based on torch.Generator
- implement counter-based RNG Philox2x32 in relay backend
@notoraptor notoraptor changed the title [wip][do not merge] Implementation of random number generator for Myia Implementation of random number generator for Myia Feb 20, 2020
@notoraptor
Collaborator Author

@abergeron @breuleux PR rebased!

# Convert torch tensor to numpy tensor.
output = self.to_numpy(v)
# If possible and necessary, cast numpy tensor to expected tensor.
array_type = t.element.xtype()
Contributor

If you ever need to do that, then whatever function produced the array has a bug. All primitives should return the right type.

Collaborator Author

@abergeron It seems we cannot cast values to uint16, uint32, or uint64 (unsigned integer types other than uint8) in PyTorch. This code currently fails:

import numpy as np
from myia import myia
from myia.operations import array_cast


@myia(backend='pytorch')
def f(val):
    return array_cast(val, np.uint32)


def main():
    val = np.arange(4, dtype='float32')
    res = f(val)
    print(res.dtype)


if __name__ == '__main__':
    main()

With this error:

  File "/home/notoraptor/mila/dev/git/myia/myia/compile/backends/pytorch.py", line 178, in pytorch_array_cast
    dt = _type_map[t.value.xtype()]

KeyError: UInt[32]

Indeed, _type_map does not handle unsigned types very well: https://github.com/mila-iqia/myia/blob/master/myia/compile/backends/pytorch.py#L16

Checking the torch documentation, it seems some torch dtypes are indeed missing: https://pytorch.org/docs/stable/tensor_attributes.html#torch-dtype

Contributor

OK then for this change, but this is just more motivation for me to drop the pytorch backend.

self.require_constant(e, argnum=f'"3:size[{edx}]"')
for edx, e in enumerate(shape.elements))
if not output_shape:
output_shape = (1,)
Contributor

This shouldn't be supported at the primitive level.

Collaborator Author

OK, removed!

@abergeron
Contributor

I ran coverage locally:
report.txt

There are some spots missing.

Update test_rng to test shapes (1,) and ().

Test python implementations.

Simplify relay_philox.py.

Remove unused code.
@notoraptor
Collaborator Author

Code updated! Except for this: #301 (comment)

@breuleux breuleux merged commit 0294648 into mila-iqia:master Feb 21, 2020
@notoraptor notoraptor deleted the rng branch March 6, 2020 18:10