This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Is the test() function correct? #47

Open
yunsangq opened this issue Aug 19, 2019 · 10 comments

Comments

@yunsangq

I tested CIFAR-10 (abnormal_class: bird) with the latest commit, which fixes the loss problem.

As a result, I obtained a maximum AUC of 0.579, which differs from the GANomaly bird score of 0.510 reported in the Skip-GANomaly paper.

While checking the source code, I found that self.netg.eval() is missing from the test() function in lib/model.py.

So I added self.netg.eval() at line 197 and retested, which gave an AUC of 0.410.
That is quite different from the results in the paper.

I think it's correct to test with eval() added. What do you think?
And if adding eval() is right, what does that mean for the evaluation results reported in the paper?
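
For reference, here is a minimal sketch of what I mean. This assumes test() in lib/model.py loops over the test dataloader roughly like below; names such as set_input, dataloader and the exact score computation are my assumptions, not the actual repo code:

def test(self):
    with torch.no_grad():
        self.opt.phase = 'test'
        self.netg.eval()  # freeze BatchNorm/Dropout statistics for inference
        for i, data in enumerate(self.dataloader['test'], 0):
            self.set_input(data)
            self.fake = self.netg(self.input)
            # ... compute the reconstruction-based anomaly scores and the AUC as before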

@dsuess

dsuess commented Aug 23, 2019

I got the same result. Also, when you shuffle the test dataset, you get the same lower score. I think here's what's happening:

  • the BatchNorm running statistics are not frozen during inference and keep changing while we run the test set (see the small demo after this list)
  • since the weights are "used to" the original BatchNorm values, we get better reconstructions at the beginning of the evaluation
  • since the "normal" examples are evaluated first and the anomaly score is based on the reconstruction error, the model's evaluation results are skewed in the original implementation
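
A quick way to see the first point is a toy PyTorch check (just an illustration, not code from this repo): in train() mode a forward pass updates the BatchNorm running statistics, while in eval() mode it does not.

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 32, 32)

bn.train()
before = bn.running_mean.clone()
bn(x)  # forward pass in train mode
print(torch.allclose(before, bn.running_mean))  # False: running stats were updated

bn.eval()
before = bn.running_mean.clone()
bn(x)  # forward pass in eval mode
print(torch.allclose(before, bn.running_mean))  # True: running stats are frozen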

@tiandamiao

When testing with the test set shuffle = True, how do you solve it?

@tiandamiao

@dsuess

@dsuess

dsuess commented Aug 30, 2019

Could you please be more specific about what you mean by "solve"?

@tiandamiao

Could you please be more specific about what you mean by "solve"?

I mean that when I shuffle the test set, the AUC value is very low. How can that be solved, and why does it happen? Also, does the test set need to be shuffled during testing? The default is False.
Second, in the test function, do you think it's necessary to add self.netg.eval()? I want to get more reliable results.
My confusion stems entirely from the testing phase. I hope to hear from you. Thanks.

@dsuess

dsuess commented Aug 31, 2019

I think this rather shows that there's a severe problem with this implementation -- the evaluation score shouldn't depend on the order of the test set. As far as I can tell, the only way to reproduce the high scores from the paper is by not shuffling, which biases the results as outlined above. Maybe @samet-akcay could clarify things.

@tiandamiao

@samet-akcay I really need your help.

@oziris

oziris commented Sep 17, 2019

The test function doesn't switch to eval() mode (this influences the behavior of the BatchNorm layers).

While in training:
self.netg.train()
self.netd.train()

While in testing:
self.netg.eval()
self.netd.eval()

In testing only the generator is used, so actually you can do just:
self.opt.phase = 'test'
self.netg.eval()
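
One caveat, and this is an assumption on my part since I haven't checked whether test() is also called between training epochs in this repo: if evaluation does run during training, you probably want to restore train mode afterwards, roughly like:

self.netg.eval()   # inference mode for evaluation
# ... run the evaluation loop ...
self.netg.train()  # switch back so the next training epoch behaves as before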

@ChristianEschen

I would agree that one should use the .eval() function during testing.

@lzzlxxlsz

@samet-akcay I really need your help.

Hi, may I ask how your experimental results turned out on your own dataset?
