
1 epoch #67

Closed
dk-teknologisk-mlnn opened this issue Jan 10, 2022 · 20 comments · Fixed by #77

@dk-teknologisk-mlnn

Is it meant to be only one epoch in training?
Your config files state 1 epoch; is that just a quick example?
I tried to train PADIM for 10 epochs on MVTec leather and wood and the metrics stay the same anyway, so it seems nothing is gained by training more.
The Lightning module also warns that there is no optimizer, so I guess training only finds the correct thresholds, and that takes 1 epoch.

@samet-akcay
Contributor

Hi @sequoiagrove, the PADIM algorithm doesn't require any CNN-based learning. It rather uses the CNN to extract the features of the training set, which are then used to fit a multivariate Gaussian model. We therefore use 1 epoch to go through the entire training set and extract the features.
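A minimal sketch of that fitting step (illustrative only, not the anomalib implementation; the feature dimensions and the random "features" are placeholders standing in for real CNN activations):

```python
import torch

# Placeholder for CNN features extracted over the training set: (n_samples, n_features).
features = torch.randn(500, 64)

# The one training "epoch" only gathers features; fitting is just estimating
# the mean and (regularized) covariance of a multivariate Gaussian.
mean = features.mean(dim=0)
cov = torch.cov(features.T) + 1e-5 * torch.eye(64)

# At inference, the anomaly score is the Mahalanobis distance to that Gaussian.
delta = features[0] - mean
score = torch.sqrt(delta @ torch.linalg.inv(cov) @ delta)
```

No gradient step appears anywhere, which is why a single pass over the data is enough.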

@samet-akcay
Contributor

The warning that there is no optimiser is also related to the above statement. Since we use the CNN only to extract the features, no optimiser is set for CNN training.

@dk-teknologisk-mlnn
Author

That's what I thought.
To make inference work I had to copy-paste some code snippets from your different branches to get a displayable heatmap that makes sense.
One issue is that there are no "stats" and thresholds in meta_data, only the image size, so I changed it to return the anomaly map unaltered: output, score = inference.predict(image=args.image_path, superimpose=False)
and inference.py takes anomaly_map, image_score. Then I normalize it myself: i = (i - min) / (max - min). Looking good.
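That manual normalization can be written as a small helper (a generic sketch, not anomalib code):

```python
import numpy as np

def min_max_normalize(anomaly_map: np.ndarray) -> np.ndarray:
    """Rescale an anomaly map to [0, 1] for display: i = (i - min) / (max - min)."""
    a_min, a_max = anomaly_map.min(), anomaly_map.max()
    return (anomaly_map - a_min) / (a_max - a_min)
```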
I tried training my own example with only good samples. It highlighted most of my flaws, except some very small, subtle changes.
Is that the expected outcome, and will it be better at finding small anomalies if I provide annotated anomaly images in training? Or do I need to choose one of the other models for such challenges?

Nevertheless, Impressive work :)

@samet-akcay
Contributor

> One issue is that there are no "stats" and thresholds in meta_data, only the image size, so I changed it to return the anomaly map unaltered: output, score = inference.predict(image=args.image_path, superimpose=False)
> and inference.py takes anomaly_map, image_score. Then I normalize it myself: i = (i - min) / (max - min). Looking good.
> I tried training my own example with only good samples. It highlighted most of my flaws, except some very small, subtle changes.

This is a PR we just merged this morning, and haven't thoroughly tested yet. Maybe @ashwinvaidya17 could provide a better insight here.

> is that the expected outcome, and will it be better at finding small anomalies if I provide annotated anomaly images in training? or do I need to choose one of the other models for such challenges?

The models don't use annotated images, so adding them wouldn't help. To find small anomalies, you could either increase the image size or configure tiling from the config file. This is mainly because when a large image is resized to 256x256, small anomalies become even smaller and detecting them becomes almost impossible. Using a larger input or a tiled input image could give better performance.
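The idea behind tiling can be sketched as follows (an illustrative example, not anomalib's tiler; the 1024x1024 input and 256 tile size are placeholders):

```python
import torch

# Instead of resizing a 1024x1024 image down to 256x256 (shrinking anomalies 16x in
# area), split it into non-overlapping 256x256 tiles that keep the original resolution.
image = torch.randn(1, 3, 1024, 1024)  # placeholder input batch
tile = 256

tiles = image.unfold(2, tile, tile).unfold(3, tile, tile)    # (1, 3, 4, 4, 256, 256)
tiles = tiles.contiguous().view(1, 3, -1, tile, tile)        # (1, 3, 16, 256, 256)
tiles = tiles.permute(0, 2, 1, 3, 4).reshape(-1, 3, tile, tile)

# Each of the 16 tiles is now processed like a full image, so a small defect
# keeps its original pixel footprint.
```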

In addition, our hyper-parameter optimisation tool will soon become publicly available, so parameter tuning could also be done to find the best parametrisation for custom datasets.

@dk-teknologisk-mlnn
Author

Ah ok, I had checked out the repo on Friday. I re-checked it out and now I get good maps out of the box, as long as I revert lightning back to 1.3.6, put the number of workers down to a reasonable amount, and add cv2.waitKey() after imshow.

@samet-akcay
Contributor

Yeah, there is a PR that bumps the lightning version up to 1.6.0dev, but there are some breaking changes, and it might take some time to merge.

Good catch on cv2.waitKey(); we'll add it asap.

@dk-teknologisk-mlnn
Author

Also found that line 149 in torch.py (inferencer) has to be:
anomaly_map = anomaly_map.detach().numpy()
in order to run stfpm models.
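The reason the detach() is needed: a map that still carries gradient history cannot be converted directly, since .numpy() refuses tensors that require grad. A generic reproduction (not the inferencer code; the map here is a random placeholder):

```python
import torch

# A non-leaf tensor that requires grad, standing in for a model's anomaly map.
anomaly_map = torch.rand(256, 256, requires_grad=True) * 2.0

try:
    anomaly_map.numpy()  # raises: can't call numpy() on a Tensor that requires grad
except RuntimeError:
    pass

arr = anomaly_map.detach().numpy()  # detach from the graph first, then convert
```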

and patchcore cannot train at the moment due to datatypes:

| Name | Type | Params

0 | image_threshold | AdaptiveThreshold | 0
1 | pixel_threshold | AdaptiveThreshold | 0
2 | training_distribution | AnomalyScoreDistribution | 0
3 | min_max | MinMax | 0
4 | image_metrics | MetricCollection | 0
5 | pixel_metrics | MetricCollection | 0
6 | model | PatchcoreModel | 68.9 M

68.9 M Trainable params
0 Non-trainable params
68.9 M Total params
275.533 Total estimated model params size (MB)
Epoch 0: 6%|███████████▌ | 8/132 [00:26<06:56, 3.36s/it]
Traceback (most recent call last):
File "tools\train.py", line 66, in
train()
File "tools\train.py", line 61, in train
trainer.fit(model=model, datamodule=datamodule)
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 458, in fit
self._run(model)
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 756, in _run
self.dispatch()
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 797, in dispatch
self.accelerator.start_training(self)
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 807, in run_stage
return self.run_train()
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 869, in run_train
self.train_loop.run_training_epoch()
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 566, in run_training_epoch
self.on_train_epoch_end(epoch_output)
File "C:\Anaconda3\envs\anomalib_env\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 606, in on_train_epoch_end
training_epoch_end_output = model.training_epoch_end(processed_epoch_output)
File "d:\projects\anomalib\anomalib\models\patchcore\model.py", line 297, in training_epoch_end
embedding = self.model.subsample_embedding(embedding, sampling_ratio)
File "d:\projects\anomalib\anomalib\models\patchcore\model.py", line 229, in subsample_embedding
random_projector.fit(embedding)
File "d:\projects\anomalib\anomalib\models\patchcore\utils\sampling\random_projection.py", line 124, in fit
self.sparse_random_matrix = self._sparse_random_matrix(n_features=n_features).to(device)
File "d:\projects\anomalib\anomalib\models\patchcore\utils\sampling\random_projection.py", line 85, in _sparse_random_matrix
components[i, c_idx] = data.double()
IndexError: tensors used as indices must be long, byte or bool tensors
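The failing line indexes a tensor with a non-long index tensor. A minimal reproduction of that error and the dtype fix (a generic sketch, not the anomalib code; shapes and values are made up):

```python
import torch

components = torch.zeros(2, 10)
data = torch.ones(3)

c_idx_bad = torch.tensor([1.0, 4.0, 7.0])  # float dtype: invalid as an index tensor
try:
    components[0, c_idx_bad] = data
except IndexError:
    # "tensors used as indices must be long, byte or bool tensors"
    pass

c_idx = c_idx_bad.to(torch.long)  # the fix: cast the indices to long
components[0, c_idx] = data
```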

@samet-akcay
Contributor

Thanks for reporting these!

@ashwinvaidya17
Collaborator

@sequoiagrove Thanks for reporting these 😀
The inference.py does not superimpose anomaly maps. It would be good to add an option for this and make it a part of this issue.
I'll try to reproduce the patchcore issue but it seems to be working in the tests. I'll have a look.

@dk-teknologisk-mlnn
Author

Could it be confusion between torch installs that happened when I struggled to enable my GPU?
From conda list:

pytorch 1.10.1 py3.8_cuda11.3_cudnn8_0 pytorch
pytorch-lightning 1.3.6 pypi_0 pypi
torch 1.8.1 pypi_0 pypi
torch-metrics 1.1.7 pypi_0 pypi
torchaudio 0.10.1 py38_cu113 pytorch
torchmetrics 0.6.2 pypi_0 pypi
torchvision 0.9.1 pypi_0 pypi

@ashwinvaidya17
Collaborator

@sequoiagrove Could be. That's another issue that's been on our list for some time 🙃

@dk-teknologisk-mlnn
Author

Fixed it: in patchcore/utils/sampling/random_projection.py, lines 79-83:

    c_idx = torch.tensor(
        sample_without_replacement(
            n_population=n_features, n_samples=nnz_idx, random_state=self.random_state
        ),
        dtype=torch.long,
    )

@dk-teknologisk-mlnn
Author

dk-teknologisk-mlnn commented Jan 11, 2022

patchcore inference:
File "d:\projects\anomalib\anomalib\utils\normalization\min_max.py", line 31, in normalize
normalized = ((targets - threshold) / (max_val - min_val)) + 0.5
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor'

Mixed datatypes.

Brilliant usage of Union btw :)
I guess at the end of patchcore training it should cast the types consistently?

meta_data is:

{'image_threshold': tensor(2.0865), 'pixel_threshold': tensor(2.8785), 'min': tensor(0.7478), 'max': tensor(4.2167), 'image_shape': (1024, 1024)}

anomaly_map is a tensor and pred_score is an array.

If I run padim, both of them are tensors.
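One way to make that normalize call robust to the mixed inputs is to coerce everything to tensors up front (a sketch based on the signature visible in the traceback, not the actual anomalib implementation):

```python
import numpy as np
import torch

def normalize(targets, threshold, min_val, max_val):
    """Min-max normalize around the threshold, coercing numpy inputs to tensors first."""
    if isinstance(targets, np.ndarray):
        targets = torch.from_numpy(targets)
    threshold = torch.as_tensor(threshold)
    min_val = torch.as_tensor(min_val)
    max_val = torch.as_tensor(max_val)
    return ((targets - threshold) / (max_val - min_val)) + 0.5
```

With this, a numpy pred_score and tensor-valued meta_data (as in the patchcore case above) no longer clash.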

@dk-teknologisk-mlnn
Author

Found the issue. I printed the data types throughout the inference. In model.py, score and map are tensors all the way; it is in deploy/torch.py where you ask isinstance(map, Tensor). Both of them are tensors, but the check is False because predictions is two tensors, not just one. The code path for False converts pred_score to numpy, but anomaly_map is assumed to already be numpy. The metadata is still tensors, so I can't just convert the map to numpy as well.
If I just don't convert the score it works, but I guess that breaks some of the other models.
So we need to handle the special case of getting two tensors.

@dk-teknologisk-mlnn
Author

This works for all three models I trained (padim, patchcore and stfpm):

    if isinstance(predictions, Tensor):
        anomaly_map = predictions
        pred_score = anomaly_map.reshape(-1).max()
    else:
        anomaly_map, pred_score = predictions
        if isinstance(pred_score, Tensor):
            pred_score = pred_score.detach()
        else:
            pred_score = pred_score.detach().numpy()

@dk-teknologisk-mlnn
Author

I tried making a new environment and installing the exact versions in your requirements. I had to make the same fixes as above to get patchcore working.
On the MVTec examples it works well on carpet, wood and leather, but on, for example, screw it is nowhere near the reported performance. Is the MVTec benchmark for all the data categories trained with different hyperparameters?
So far the best model on my own datasets is padim.

DATALOADER:0 TEST RESULTS
{'image_AUROC': 0.44906747341156006,
'image_F1': 0.8561151623725891,
'pixel_AUROC': 0.9092798233032227,
'pixel_F1': 0.03107343800365925}

@ashwinvaidya17
Collaborator

@sequoiagrove It is possible that some metrics have diverged from when we collected the results. There is a plan to re-evaluate all the algorithms. Also, a benchmarking script is in PR state which will help gather results, but merging it is pushed back until after a refactor we are planning. Here is a tip: if you want to log anomaly images, you can change log_images to log_images_to: [local]. It will save the results in the results folder after training completes.

@dk-teknologisk-mlnn
Author

Diverged metrics: dropping from 0.99 to 0.44 and 0.03 is rather critical?
Log images: nice :)
Padim also dropped in performance, but not as drastically.
Here's a patchcore result on "good" parts:
[attached image: 008]

this is padim:
DATALOADER:0 TEST RESULTS
{'image_AUROC': 0.7589669823646545,
'image_F1': 0.8787878751754761,
'pixel_AUROC': 0.9781586527824402,
'pixel_F1': 0.22379672527313232}
[attached image: 008padim]

@dk-teknologisk-mlnn
Author

I tried this other patchcore repo (https://github.com/hcw-00/PatchCore_anomaly_detection) on mvtec/screw and it gives me:

{'img_auc': 0.5911047345767575, 'pixel_auc': 0.9048583939897462}

and rather random anomaly maps as well.

@samet-akcay
Contributor

Thanks for reporting this discrepancy @sequoiagrove. We'll investigate the benchmarks.
