
Getting NaN loss values while training PatchCore model on custom dataset #288

Closed
RahaviSelvarajan opened this issue Apr 27, 2022 · 5 comments
@RahaviSelvarajan

Describe the bug

  • I am training the PatchCore model on a custom dataset. During training, the loss value in the progress bar is shown as NaN. Why would this happen?

Screenshots
Epoch 0: 2%|█▋ Aggregating the embedding extracted from the training set.5/218 [00:06<04:30, 1.27s/it, loss=nan]
Creating CoreSet Sampler via k-Center Greedy
Getting the coreset from the main embedding.
Assigning the coreset as the memory bank.
Epoch 1: 2%|███▏ Aggregating the embedding extracted from the training set. | 5/218 [01:34<1:07:23, 18.98s/it, loss=nan]
Creating CoreSet Sampler via k-Center Greedy
Getting the coreset from the main embedding.
Assigning the coreset as the memory bank.
Epoch 2: 2%|███▏ Aggregating the embedding extracted from the training set. | 5/218 [06:39<4:43:55, 79.98s/it, loss=nan]

@alexriedel1
Contributor

This is because you do not update any weights during the "training" of PatchCore; you only save features while running inference on your training data (https://arxiv.org/pdf/2106.08265.pdf). Doing this for multiple epochs will not improve the model unless you apply random image augmentations in each epoch, which is why the default number of training epochs is 1 and you shouldn't change this parameter without a good reason.
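
For intuition, here is a minimal sketch of what a PatchCore "training" epoch amounts to. This is not anomalib's actual code; `train_loader` is a placeholder `DataLoader` over your normal training images, and the backbone/layer choices are just examples.

```python
import torch
from torchvision.models import resnet18
from torchvision.models.feature_extraction import create_feature_extractor

# Frozen, pretrained CNN used purely as a feature extractor.
backbone = resnet18(weights="DEFAULT").eval()
extractor = create_feature_extractor(backbone, return_nodes=["layer2"])

embeddings = []
with torch.no_grad():  # inference only: no backward pass, so there is no loss to report
    for images, _ in train_loader:  # hypothetical DataLoader yielding (image, label) batches
        feats = extractor(images)["layer2"]  # (B, C, H, W) patch features
        embeddings.append(feats.permute(0, 2, 3, 1).reshape(-1, feats.shape[1]))

embedding = torch.cat(embeddings)  # all patch embeddings, collected in a single pass
# Coreset subsampling (k-center greedy) then keeps a fraction of these rows as the
# memory bank; at test time, anomaly scores are nearest-neighbour distances to that
# bank. No weights are updated anywhere, which is why extra epochs change nothing.
```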

@samet-akcay
Contributor

@RahaviSelvarajan, as @alexriedel1 pointed out, the reason is that the CNN is only used for feature extraction, which doesn't produce a loss value. If this looks confusing, we could consider removing it from the progress bar.

@samet-akcay samet-akcay self-assigned this Jun 17, 2022
@ashwinvaidya17 ashwinvaidya17 added this to the Backlog milestone Oct 3, 2022
@nguyenanhtuan1008

@samet-akcay So in this situation, training for 1 epoch gives the same result as training for multiple epochs?

@samet-akcay
Contributor

samet-akcay commented Oct 5, 2022

@nguyenanhtuan1008, yes. Training for a single epoch vs. multiple epochs would yield the same result for all models that use the CNN only for feature extraction.

The exception would be to apply random image augmentations, as @alexriedel1 pointed out above.
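
For illustration only, one generic way to get a different view of each image in every epoch is to put random transforms inside the dataset, so each pass extracts features from slightly different crops and flips. This is a sketch using plain torchvision transforms, not anomalib's built-in transform configuration; `NormalImageFolder` and `image_paths` are placeholders.

```python
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

# Random transforms are re-sampled on every __getitem__ call, i.e. on every epoch.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

class NormalImageFolder(Dataset):
    """Placeholder dataset over a list of paths to normal (defect-free) images."""

    def __init__(self, image_paths):
        self.image_paths = image_paths

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = Image.open(self.image_paths[idx]).convert("RGB")
        return augment(img)  # a new random view every time the sample is loaded
```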

@nixczhou

Hi, regarding this: how do you do augmentation in PatchCore? It seems like PatchCore needs heavy augmentation, right?
