VAT implementation is wrong #27

Closed
AskAukNuTutor opened this issue Jun 3, 2019 · 2 comments

@AskAukNuTutor

Thank you for your code.

Following the original VAT paper, consistency_func in hparams.py should be reverse_kl for VAT, but your code sets it to forward_kl.

The adversarial noise r in VAT is obtained by maximizing D_KL(p(y|x)||p(y|x+r)). However, when consistency_func=forward_kl, the consistency loss used is D_KL(p(y|x+r)||p(y|x)). I think this matters because KL divergence is asymmetric.
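
To make the asymmetry concrete, here is a minimal sketch in plain Python. It is not code from this repository: the `kl_divergence` helper and the two probability vectors are hypothetical, chosen only to show that swapping the arguments changes the value.

```python
import math


def kl_divergence(p, q):
    """Return D_KL(p || q) for two discrete distributions (lists of probabilities)."""
    # Terms with p_i == 0 contribute nothing, by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical class probabilities, for illustration only.
p_clean = [0.9, 0.05, 0.05]    # stands in for p(y|x)
p_perturbed = [0.5, 0.3, 0.2]  # stands in for p(y|x+r)

# D_KL(p(y|x) || p(y|x+r)): the direction maximized when searching for r.
kl_clean_to_perturbed = kl_divergence(p_clean, p_perturbed)

# D_KL(p(y|x+r) || p(y|x)): the opposite direction.
kl_perturbed_to_clean = kl_divergence(p_perturbed, p_clean)

print(kl_clean_to_perturbed, kl_perturbed_to_clean)
```

The two printed values differ, so the direction in which r is found and the direction used for the consistency loss are not interchangeable in general.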

@craffel
Contributor

craffel commented Jun 3, 2019

We tried both and forward_kl worked better.

@craffel craffel closed this as completed Jun 3, 2019
@AskAukNuTutor
Author

IMO, consistency_func should not be treated as a tunable hyperparameter, since it is part of the VAT model itself. If you do treat consistency_func as a hyperparameter, that should be noted in Table 4 of your NIPS'18 paper.

For instance, I compared VAT+EntMin under the following two settings in the CIFAR10-4000 scenario:
Setting-A: consistency_func=forward_kl and max_cons_multiplier=0.3 (original parameters)
Setting-B: consistency_func=reverse_kl and max_cons_multiplier=1.0 (modified parameters)

In this comparison, VAT+EntMin with Setting-B outperformed Setting-A by about 2 percentage points in test error rate (11.7% vs 13.7%). Of course, this is the result of a single run, so I am not claiming that Setting-B outperforms Setting-A in general.
