WIP Refactor Distribution and random variables #2833
Closed
Work in progress!!
Refactoring the Distribution class and the various random variable classes a bit. The goal here is to decouple those, to make things cleaner, and I think it should also make it easier to play with different backends.
It should also make it much easier to implement support for changing the dimension of free variables after model creation (if they depend on a shared variable for instance), and to get rid of the test_value thing in theano.
This changes how we think about the shapes of variables a bit. Previously the shape was set statically by inspecting the test_value of the created theano variable, and then fixed in the distribution. With this PR, each distribution has two shapes: `atom_shape` and `param_shape`. The atom shape is the shape of one observation, so for most distributions this will just be `()`. For the MVNormal, for example, this will be `(n,)`. `param_shape` is the shape implied by the parameters of the distribution. So if we have a `pm.Normal('a', mu=np.zeros(5), sd=1)`, the param shape is `(5,)`. Since parameters can be theano variables, the `param_shape` is not known statically, but can only be computed given values for all previous variables. (The same should probably be true for the `atom_shape`, but I haven't implemented that yet.)
The distribution itself doesn't know the shape of the random variable anymore; this really belongs in the random variable.
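To make the `pm.Normal` example concrete, here is a minimal sketch of how a param shape could be derived from the shapes of the parameters. It assumes NumPy-style broadcasting is what "the shape implied by the parameters" means for a distribution with a scalar atom; `broadcast_shapes` is a hypothetical helper written for this illustration and is not code from this PR.

```python
from itertools import zip_longest


def broadcast_shapes(*shapes):
    """Hypothetical helper: broadcast shape tuples, aligning from the right."""
    result = []
    # Walk the dimensions from the last axis backwards, padding short shapes with 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} do not broadcast")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# pm.Normal('a', mu=np.zeros(5), sd=1): mu has shape (5,), sd is a scalar.
normal_atom_shape = ()  # one observation of a Normal is a scalar
normal_param_shape = broadcast_shapes((5,), ())  # shape implied by mu and sd
```

With these inputs `normal_param_shape` comes out as `(5,)`, matching the description above. Distributions with a non-scalar atom, such as the MVNormal, would need extra handling that this sketch does not attempt.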
Since we do not use the test_values to infer that shape anymore, we need to keep track of the variable shapes in the model. It keeps dicts of shapes and default values (`model._RV_shapes`, `model._default_values`), and uses those shapes and default values to infer the shape of newly created variables.
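As a rough illustration of that bookkeeping, the toy class below keeps the two dicts and computes the shape of each new variable from the default values of the variables created before it. Only the attribute names `_RV_shapes` and `_default_values` come from this PR; `ToyModel`, `add_variable`, `full`, `shape_of`, and the choice to form the variable shape as `param_shape + atom_shape` are assumptions made for the sketch.

```python
def full(shape, value):
    """Build a nested list of the given shape filled with `value`."""
    if not shape:
        return value
    return [full(shape[1:], value) for _ in range(shape[0])]


def shape_of(value):
    """Return the shape of a nested list (a scalar has shape ())."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


class ToyModel:
    """Hypothetical stand-in for the model; not the class from this PR."""

    def __init__(self):
        self._RV_shapes = {}        # variable name -> shape
        self._default_values = {}   # variable name -> default value

    def add_variable(self, name, param_shape_fn, atom_shape=(), default=0.0):
        # The param shape is not known statically: it is computed from the
        # default values of all previously created variables.
        param_shape = tuple(param_shape_fn(self._default_values))
        # Assumption: the variable shape is the param shape followed by the atom shape.
        shape = param_shape + tuple(atom_shape)
        self._RV_shapes[name] = shape
        self._default_values[name] = full(shape, default)
        return shape


model = ToyModel()
model.add_variable("mu", lambda values: (5,))
# 'a' uses 'mu' as a parameter, so its param shape follows mu's default value.
model.add_variable("a", lambda values: shape_of(values["mu"]))
```

After these two calls, `model._RV_shapes` maps both `"mu"` and `"a"` to `(5,)`, without any test_value being inspected.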