
I do not think the capacity argument works #3

Closed
alecGraves opened this issue Mar 27, 2019 · 5 comments
Labels
bug Something isn't working

Comments

@alecGraves
Owner

What was past me thinking? I do not know 😄

if self.reg == 'bvae':
    # kl divergence:
    latent_loss = -0.5 * K.mean(1 + stddev
                                - K.square(mean)
                                - K.exp(stddev), axis=-1)
    # use beta to force less usage of vector space:
    # also try to use <capacity> dimensions of the space:
    latent_loss = self.beta * K.abs(latent_loss - self.capacity/self.shape.as_list()[1])
    self.add_loss(latent_loss, x)

I just randomly subtract a constant from my loss?
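For reference, the KL term itself can be checked numerically. Here is a minimal NumPy sketch of the same math (note that the layer's `stddev` tensor is actually treated as a log-variance, since the code exponentiates it):

```python
import numpy as np

def kl_per_sample(mean, logvar):
    """KL(q(z|x) || N(0, I)) per sample, averaged over latent dims.

    The layer's `stddev` tensor is passed through exp(), so it behaves
    as a log-variance; it is named `logvar` here to make that explicit.
    """
    return -0.5 * np.mean(1 + logvar - np.square(mean) - np.exp(logvar), axis=-1)

# A standard-normal posterior (mean 0, logvar 0) has zero KL;
# any nonzero mean or variance shift makes it positive.
batch_kl = kl_per_sample(np.zeros((2, 4)), np.zeros((2, 4)))
```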

This is more like it:

if self.reg == 'bvae':
    # kl divergence (per latent dimension):
    latent_losses = -0.5 * (1 + stddev
                            - K.square(mean)
                            - K.exp(stddev))
    # use beta to force less usage of vector space:
    # also try to use <capacity> dimensions of the space:
    bvae_weight = self.beta * K.ones(shape=(self.shape.as_list()[1] - self.capacity,))
    if self.capacity > 0:
        vae_weight = K.ones(shape=(self.capacity,))
        bvae_weight = K.concatenate([vae_weight, bvae_weight], axis=-1)
    latent_loss = K.abs(K.mean(bvae_weight * latent_losses, axis=-1))

    self.add_loss(latent_loss, x)
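To see what this per-dimension weighting actually computes, here is an untested NumPy translation of the idea (illustrative names, not from the repo): the first `capacity` dimensions get weight 1 (plain VAE) and the remaining dimensions get weight `beta`.

```python
import numpy as np

def capacity_weighted_kl(mean, logvar, beta=4.0, capacity=2):
    """Per-sample loss: per-dimension KL terms, with the first `capacity`
    dims weighted 1 (plain VAE) and the rest weighted `beta`."""
    latent_dim = mean.shape[-1]
    # per-dimension KL against N(0, I):
    per_dim_kl = -0.5 * (1 + logvar - np.square(mean) - np.exp(logvar))
    # weight vector: [1, ..., 1, beta, ..., beta]
    weights = np.concatenate([np.ones(capacity),
                              beta * np.ones(latent_dim - capacity)])
    return np.abs(np.mean(weights * per_dim_kl, axis=-1))
```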
@alecGraves alecGraves added the bug Something isn't working label Mar 27, 2019
@beatriz-ferreira

Hi,

I'm using your implementation of the beta-VAE.
You haven't committed this change to the loss function yet. Should we use the new version you proposed or the old one? Have you tested that the new one works as intended?

Thank you in advance!

@alecGraves
Owner Author

I have not tested anything. I would recommend leaving the capacity argument at the default value of zero for now.

@beatriz-ferreira

Ok, thank you for your reply!
In the beta-VAE paper (https://openreview.net/references/pdf?id=Sy2fzU9gl) there's no capacity parameter. They just play with beta and the size of the latent representation, am I right?

Another suggestion would be to normalize beta, as they do in the paper, to account for the dimensions of the input and of the latent space :-)
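For concreteness, the normalization suggested in the beta-VAE paper scales beta by the ratio of latent dimension M to input dimension N. A tiny sketch (the function name and example numbers are illustrative, not from this repo):

```python
def normalized_beta(beta, latent_dim, input_dim):
    """beta_norm = beta * M / N, where M is the latent dimension and
    N is the input dimension (e.g. number of pixels)."""
    return beta * latent_dim / input_dim

# e.g. beta=150 with a 32-dim latent space and 64x64 grayscale inputs:
b = normalized_beta(150, 32, 64 * 64)  # 1.171875
```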

@alecGraves
Owner Author

@beatriz-ferreira Thank you for the suggestion, I missed that detail from the paper, but normalization makes a lot of sense.

@alecGraves
Owner Author

I removed the capacity argument from the layer. The idea behind the capacity argument is to guide the model to learn a representation with a specific number of distributions, but still allow the model to expand the number used if necessary. It is kinda silly, as a standard VAE allows one to specify the number of distributions to use directly.

I removed this (nonworking) argument from the sampling layer in a recent commit.

Closing...
