I do not think the capacity argument works #3

Closed

alecGraves opened this issue Mar 27, 2019 · 5 comments
Labels
bug Something isn't working

Comments

@alecGraves
Owner

What was past me thinking? I do not know 😄

if self.reg == 'bvae':
    # kl divergence:
    latent_loss = -0.5 * K.mean(1 + stddev
                                - K.square(mean)
                                - K.exp(stddev), axis=-1)
    # use beta to force less usage of vector space:
    # also try to use <capacity> dimensions of the space:
    latent_loss = self.beta * K.abs(latent_loss - self.capacity/self.shape.as_list()[1])
    self.add_loss(latent_loss, x)

I just randomly subtract a constant from my loss?

This is more like it:

if self.reg == 'bvae':
    # kl divergence, per latent dimension (no mean yet):
    latent_losses = -0.5 * (1 + stddev
                            - K.square(mean)
                            - K.exp(stddev))
    # use beta to force less usage of vector space:
    # also try to use <capacity> dimensions of the space:
    # (shapes are passed as 1-tuples)
    bvae_weight = self.beta * K.ones(shape=(self.shape.as_list()[1] - self.capacity,))
    if self.capacity > 0:
        vae_weight = K.ones(shape=(self.capacity,))
        bvae_weight = K.concatenate([vae_weight, bvae_weight], axis=-1)
    latent_loss = K.abs(K.mean(bvae_weight * latent_losses, axis=-1))

    self.add_loss(latent_loss, x)
alecGraves added the bug label on Mar 27, 2019
@beatriz-ferreira

Hi,

I'm using your implementation of the beta-VAE.
You haven't committed this change to the loss function in your code. Should we use the new version you proposed or the old one? Have you tested that the new one works as intended?

Thank you in advance!

@alecGraves
Owner Author

I have not tested anything. I would recommend leaving the capacity argument at the default value of zero for now.

@beatriz-ferreira

Ok, thank you for your reply!
In the beta-VAE paper (https://openreview.net/references/pdf?id=Sy2fzU9gl) there's no capacity parameter. They just play with beta and the size of the latent representation, am I right?

Another suggestion would be to normalize beta, as they do in the paper, to account for the dimension of the input and the dimension of the latent space :-)
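
For reference, a rough sketch of what I mean (the names here are just illustrative, not from this repo). The paper normalizes as beta_norm = beta * M / N, where M is the latent size and N is the flattened input size:

def normalized_beta(beta, latent_dim, input_shape):
    # beta_norm = beta * M / N, with M = latent dimension
    # and N = flattened input dimension
    n = 1
    for d in input_shape:  # e.g. (64, 64, 3) for images
        n *= d
    return beta * latent_dim / n

# e.g. normalized_beta(250, 32, (64, 64, 3)) is about 0.65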

@alecGraves
Owner Author

@beatriz-ferreira Thank you for the suggestion. I missed that detail in the paper, but normalization makes a lot of sense.

@alecGraves
Owner Author

I removed the capacity argument from the layer. The idea behind it was to guide the model toward a representation using a specific number of distributions, while still allowing it to expand the number used if necessary. It is kinda silly, since a standard VAE already lets you specify the number of distributions directly.
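
For comparison, in a plain VAE the number of latent distributions is simply the width of the mean/log-variance heads. A rough sketch (hypothetical architecture, assuming tf.keras, not this repo's model):

from tensorflow.keras import layers

latent_dim = 16  # number of latent distributions, chosen directly

inputs = layers.Input(shape=(64, 64, 3))
h = layers.Flatten()(inputs)
h = layers.Dense(256, activation='relu')(h)
mean = layers.Dense(latent_dim)(h)    # one Gaussian mean per latent dimension
stddev = layers.Dense(latent_dim)(h)  # log-variance head ('stddev' in the snippets above)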

I removed this (nonworking) argument from the sampling layer in a recent commit.

Closing...
