Add `distributions.Independent` #6324

Merged: 10 commits, Apr 1, 2019

Conversation
@ganow (Contributor) commented on Feb 20, 2019

Typical DL frameworks that handle probability distributions, such as TensorFlow Probability and PyTorch Distributions, have an Independent distribution class. I implemented the same interface in Chainer.

The Independent distribution reinterprets some of the batch dimensions of a distribution as event dimensions. This is mainly useful for changing the shape of the result of distribution.log_prob(x), and it makes code clearer and simpler. For example, the current official example of the VAE objective function looks like:

...
q_z = D.Normal(loc=mu(x), log_scale=ln_sigma(x))
z = q_z.sample(k)
p_x = D.Bernoulli(logit=h(z))
p_z = D.Normal(zeros_array, ones_array)

reconstr = F.mean(F.sum(p_x.log_prob(
    F.broadcast_to(x[None, :], (k,) + x.shape)), axis=-1))
kl_penalty = F.mean(F.sum(chainer.kl_divergence(q_z, p_z), axis=-1))
...

However, this code assumes that the last dimension of p_x, q_z, and p_z should be treated as an event dimension, so it breaks if we change q_z from D.Normal to D.MultivariateNormal. Using D.Independent, we can rewrite the above code as:

...
q_z = D.Independent(D.Normal(loc=mu(x), log_scale=ln_sigma(x)))
z = q_z.sample(k)
p_x = D.Independent(D.Bernoulli(logit=h(z)), reinterpreted_batch_ndims=1)
p_z = D.Independent(
    D.Normal(zeros_array, ones_array), reinterpreted_batch_ndims=1)

reconstr = F.mean(p_x.log_prob(
    F.broadcast_to(x[None, :], (k,) + x.shape)))
kl_penalty = F.mean(chainer.kl_divergence(q_z, p_z))
...

This version makes no such assumption, and distribution.event_shape is handled appropriately.
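
To illustrate the intended shape semantics, here is a minimal sketch (the array shapes, and the variable x, are illustrative; the behavior follows the semantics described above):

import numpy as np
import chainer.distributions as D

loc = np.zeros((32, 10), dtype=np.float32)    # batch of 32 samples, 10 dims each
scale = np.ones((32, 10), dtype=np.float32)

normal = D.Normal(loc=loc, scale=scale)
# normal.batch_shape == (32, 10), normal.event_shape == ()
# normal.log_prob(x) has shape (32, 10) for x of shape (32, 10)

indep = D.Independent(normal, reinterpreted_batch_ndims=1)
# indep.batch_shape == (32,), indep.event_shape == (10,)
# indep.log_prob(x) has shape (32,): the last axis is summed over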

@toslunar (Member) left a comment

Could you split the addition of the missing params attributes (6f0f368) into a separate PR?

@ganow (Contributor, Author) commented on Feb 26, 2019

> Could you split the addition of the missing params attributes (6f0f368) into a separate PR?

The Independent class needs xp to compute the covariance property, and distribution.xp refers to distribution.params, which means this implementation will not work for distributions that lack the params property. Should I split the PR even so?

@kmaehashi kmaehashi requested a review from toslunar Mar 19, 2019

@toslunar toslunar added this to the v6.0.0rc1 milestone Mar 19, 2019

Two review threads on chainer/distributions/independent.py were marked resolved (outdated).

@kmaehashi kmaehashi changed the title add `distributions.Independent` Add `distributions.Independent` Mar 20, 2019

@chainer-ci (Collaborator) commented on Mar 22, 2019

Can one of the admins verify this patch?

@toslunar toslunar self-requested a review Mar 22, 2019


    def _reduce(self, op, stat):
        range_ = tuple(
            (-1 - numpy.arange(self.reinterpreted_batch_ndims)).tolist())

@toslunar (Member) commented on Mar 22, 2019

How about range_ = tuple(range(-self.reinterpreted_batch_ndims, 0))?
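
For reference, a quick standalone check (not part of the PR) that the two expressions pick the same axes, differing only in order, which is irrelevant for a reduction:

import numpy as np

n = 3  # stands in for self.reinterpreted_batch_ndims

original = tuple((-1 - np.arange(n)).tolist())  # (-1, -2, -3)
suggested = tuple(range(-n, 0))                 # (-3, -2, -1)

# The same set of axes, so the two are interchangeable as reduction axes.
assert set(original) == set(suggested)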

        Note that this relationship holds only if the covariance matrix of the
        original distribution is given analytically.
        '''
        num_repeat = functools.reduce(

@toslunar (Member) commented on Mar 22, 2019

Could you use chainer.utils.size_of_shape?

    def _block_indicator(self):
        num_repeat = functools.reduce(
            operator.mul,
            self.distribution.batch_shape[-self.reinterpreted_batch_ndims:], 1)
        dim = functools.reduce(operator.mul, self.distribution.event_shape, 1)

Several further comments from @toslunar on this hunk have been minimized.
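
With that suggestion, the num_repeat / dim computation above could be simplified roughly as follows (a sketch, assuming chainer.utils.size_of_shape returns the product of a shape's entries):

from chainer.utils import size_of_shape

# Inside _block_indicator, replacing the functools.reduce calls:
num_repeat = size_of_shape(
    self.distribution.batch_shape[-self.reinterpreted_batch_ndims:])
dim = size_of_shape(self.distribution.event_shape)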
        inner_shape, inner_event_shape, reinterpreted_batch_ndims)
    return list(map(
        lambda dicts: dict(dicts[0], **dicts[1]),
        itertools.product(parameter_list, shape_pattern)))

    @property
    def covariance(self):
        '''Returns the covariance of the distribution based on the original

@toslunar (Member) commented on Mar 22, 2019

Could you use double quotes (") and add a summary line?

[H405] Multi line docstrings should start with a one line summary followed by an empty line.

https://docs.openstack.org/hacking/latest/user/hacking.html#docstrings

        return self._reduce(prod.prod, self.distribution.cdf(x))

    def icdf(self, x):
        '''Cumulative distribution function for multivariate variable is not

A comment from @toslunar on this hunk has been minimized.
    def event_shape(self):
        return self.__event_shape

    @property

@toslunar (Member) commented on Mar 22, 2019

Could you use cache.cached_property? The returned variable depends on the enable_backprop config.
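
A rough sketch of that change (assuming chainer.utils.cache.cached_property, the memoizing decorator used elsewhere in Chainer's distributions, which accounts for the enable_backprop config as the reviewer notes):

from chainer.utils import cache


class Independent(distribution.Distribution):
    ...

    @cache.cached_property
    def covariance(self):
        # Computed once and reused on later accesses, rather than
        # rebuilt every time the property is read.
        ...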

        block_indicator = self.xp.reshape(
            self._block_indicator,
            tuple([1] * len(self.batch_shape)) + self._block_indicator.shape)
        return cov * block_indicator
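
For context, a standalone numpy sketch (illustrative only, not the PR's exact code) of the masking trick this hunk implements: tile the per-factor covariances into a dense matrix, then zero out everything outside the diagonal blocks with an indicator:

import numpy as np

n, d = 3, 2  # n independent factors, each with a d x d covariance
covs = np.arange(n * d * d, dtype=np.float64).reshape(n, d, d)

# Repeat each d x d block across all n block-columns ...
expanded = np.broadcast_to(covs[:, None, :, :], (n, n, d, d))
dense = expanded.transpose(0, 2, 1, 3).reshape(n * d, n * d)

# ... then keep only the diagonal blocks via a 0/1 block indicator.
block_indicator = np.kron(np.eye(n), np.ones((d, d)))
block_diag = dense * block_indicator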

@toslunar (Member) commented on Mar 22, 2019

To make a block diagonal matrix, the backward of F.diagonal (or the backward of a unary F.einsum) could be used.

def diag_einsum(
        input_subscripts, output_subscript, *ioperands, **kwargs):
    output_shape, = utils.argument.parse_kwargs(kwargs, ('output_shape', None))
    return einsum.DiagEinSum(
        in_subs=input_subscripts,
        out_sub=output_subscript,
        out_shape=output_shape,
    ).apply(ioperands)[0]


@testing.parameterize(*testing.product_dict(
    [
        {'subscripts': 'i->ij', 'i_shapes': ((3,),), 'o_shape': (3, 4)},
        {'subscripts': '->i', 'i_shapes': ((),), 'o_shape': (3,)},
        {'subscripts': ',i->ij', 'i_shapes': ((), (2,),), 'o_shape': (2, 3)},
        {'subscripts': ',ij->i', 'i_shapes': ((), (3, 4),), 'o_shape': (3,)},
    ],
    [
        {'dtype': numpy.float32},
        {'dtype': numpy.float64},
    ]
))
class TestDiagEinSum(unittest.TestCase):

@toslunar (Member) commented on Apr 1, 2019

Jenkins, test this please.

@chainer-ci (Collaborator) commented on Apr 1, 2019

Jenkins CI test (for commit 5b28e26, target branch master) failed with status FAILURE.
(For contributors, please wait until the reviewer confirms the details of the error.)

@toslunar (Member) left a comment

I'll merge this because

  • the interface LGTM; and
  • the remaining issues are minor, compared to the importance of the feature.

@toslunar toslunar self-assigned this Apr 1, 2019

@toslunar toslunar merged commit 7a59307 into chainer:master Apr 1, 2019

1 of 3 checks passed:

  • Jenkins: Build finished.
  • continuous-integration/travis-ci/pr: The Travis CI build failed.
  • coverage/coveralls: Coverage decreased (-1.7%) to 86.157%.