heyo,

Really like your implementation, but I noticed the static batch size was causing me all sorts of grief when I wanted to play around with training. After a bit of mucking around I came up with a solution that I feel is a little more elegant.

Basically, the issue arises because during construction there is a call to the instantiated layer. At that point the tensor being passed in as "x" to the sampling layer's call() function has an undefined batch_size. At build time all we need to do is return a tensor with the appropriate shape; we don't actually need to call the K.random_normal() function, which is the only part of this function that needs the batch_size explicitly.
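For illustration, here's a minimal sketch (assuming Keras on a TF 1.x backend, where static dimensions expose a `.value` attribute) of what that undefined batch axis looks like at construction time:

```python
# minimal sketch, assuming Keras with a TF 1.x backend where static
# dimensions expose a .value attribute
from keras.layers import Input

x = Input(shape=(128,))   # symbolic tensor, shape (None, 128)
print(x.shape[0].value)   # prints None: the batch axis is undefined at
                          # construction time, so K.random_normal() with an
                          # explicit batch size cannot be built yet
```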
Long story short, stick this in your Sampling.call() function:
```python
# trick to allow setting batch at train/eval time
if x[0].shape[0].value == None:
    return mean + 0*stddev
```
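As a side note (not part of the fix above): on backends that accept symbolic shapes, such as TensorFlow, another way around a static batch size is to sample with a dynamic shape, so the batch axis is only resolved when the graph actually runs. A sketch, assuming the TensorFlow backend:

```python
from keras import backend as K

def sample(mean, stddev):
    # K.shape() returns a symbolic shape tensor, so the batch axis is
    # resolved at run time instead of build time (TensorFlow backend only)
    epsilon = K.random_normal(shape=K.shape(mean), mean=0., stddev=1.)
    return mean + K.exp(stddev / 2) * epsilon
```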
In context, that is (I made some slight other changes to the function, but you can ignore them; this is just so you can see how my fix would fit into the function):
```python
def call(self, x):
    if len(x) != 2:
        raise Exception('input layers must be a list: mean and stddev')
    if len(x[0].shape) != 2 or len(x[1].shape) != 2:
        raise Exception('input shape is not a vector [batchSize, latentSize]')

    mean = x[0]
    stddev = x[1]

    # trick to allow setting batch at train/eval time
    if x[0].shape[0].value == None:
        return mean + 0*stddev

    if self.reg:
        # kl divergence:
        latent_loss = -0.5 * K.mean(1 + stddev - K.square(mean)
                                    - K.exp(stddev), axis=-1)
        if self.reg == 'bvae':
            # use beta to force less usage of vector space:
            # also try to use <capacity> dimensions of the space:
            latent_loss = self.beta * K.abs(
                latent_loss - self.capacity / self.shape.as_list()[1])
        self.add_loss(latent_loss, x)

    epsilon = K.random_normal(shape=self.shape, mean=0., stddev=1.)

    if self.random:
        # 'reparameterization trick':
        return mean + K.exp(stddev / 2) * epsilon
    else:
        # do not perform random sampling, simply grab the impulse value
        return mean + 0*stddev  # Keras needs the *0 so the gradient is not None
```
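For what it's worth, here is a self-contained sketch of the usage pattern this trick enables: define the architecture with the batch axis left open (the early return fires), then call the same layer again with a concrete batch size at train/eval time. The `DemoSampling` class below is a stripped-down, hypothetical stand-in for the repo's layer (no regularization), assuming Keras 2.2+ with a TF 1.x backend:

```python
# hypothetical, stripped-down stand-in for the repo's sampling layer,
# assuming Keras 2.2+ on a TF 1.x backend where static dims expose .value
from keras import backend as K
from keras.layers import Input, Layer
from keras.models import Model

class DemoSampling(Layer):
    def call(self, x):
        mean, stddev = x
        # construction-time call: batch axis unknown, so just return a
        # tensor of the right shape and skip K.random_normal()
        if mean.shape[0].value is None:
            return mean + 0 * stddev
        epsilon = K.random_normal(shape=K.int_shape(mean),
                                  mean=0., stddev=1.)
        return mean + K.exp(stddev / 2) * epsilon

    def compute_output_shape(self, input_shape):
        return input_shape[0]

latent = 32
sampler = DemoSampling()

# 1) define the architecture with the batch axis left open -- this no
#    longer crashes, because the call() above takes the early return:
z = sampler([Input(shape=(latent,)), Input(shape=(latent,))])

# 2) at train/eval time, call the same layer with whatever batch size
#    you want; now shape[0].value is concrete and real sampling happens:
m = Input(batch_shape=(64, latent))
s = Input(batch_shape=(64, latent))
train_model = Model([m, s], sampler([m, s]))
```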
Ok, thanks for the recommendation! I was not sure how to get around not knowing the batch size until run time; this makes a lot of sense. I will try to update the repo when I have some free time.