
Tensor Error with Padim in Getting Started Notebook #1807

Closed

ChrisEvo3 opened this issue Mar 2, 2024 · 3 comments · Fixed by #1852

Comments

@ChrisEvo3

Describe the bug

RuntimeError: The size of tensor a (4096) must match the size of tensor b (65536) at non-singleton dimension 2

This error occurs while training the model.

Dataset

MVTec

Model

PADiM

Steps to reproduce the behavior

Run the Getting Started notebook; the same problem occurs in the 501a_training_a_model_with_cubes_from_a_robotic_arm notebook.

OS information

- Anaconda
- Jupyter Notebook

Expected behavior

from anomalib import TaskType  # used below; imported in an earlier notebook cell
from anomalib.engine import Engine
from anomalib.utils.normalization import NormalizationMethod

engine = Engine(
    normalization=NormalizationMethod.MIN_MAX,
    threshold="F1AdaptiveThreshold",
    task=TaskType.CLASSIFICATION,
    image_metrics=["AUROC"],
    accelerator="auto",
    check_val_every_n_epoch=1,
    devices=1,
    max_epochs=1,
    num_sanity_val_steps=0,
    val_check_interval=1.0,
)

engine.fit(model=model, datamodule=datamodule)
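
For context, the snippet above assumes a model and datamodule created in earlier notebook cells. A minimal sketch of that setup under the anomalib v1.x API (the MVTec category is an illustrative assumption, not taken from the notebook):

from anomalib.data import MVTec
from anomalib.models import Padim

# Datamodule for one MVTec category (category choice is hypothetical)
datamodule = MVTec(category="bottle")
model = Padim()

With these defined, engine.fit(model=model, datamodule=datamodule) runs as shown above.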

Screenshots

No response

Pip/GitHub

pip

What version/branch did you use?

No response

Configuration YAML

?

Logs

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..

  | Name                  | Type                     | Params
-------------------------------------------------------------------
0 | model                 | PadimModel               | 11.2 M
1 | normalization_metrics | MinMax                   | 0     
2 | image_threshold       | F1AdaptiveThreshold      | 0     
3 | pixel_threshold       | F1AdaptiveThreshold      | 0     
4 | image_metrics         | AnomalibMetricCollection | 0     
5 | pixel_metrics         | AnomalibMetricCollection | 0     
-------------------------------------------------------------------
11.2 M    Trainable params
0         Non-trainable params
11.2 M    Total params
44.706    Total estimated model params size (MB)
Training: |                                                                                      | 0/? [00:00<…
Validation: |                                                                                    | 0/? [00:00<…
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 17
      2 from anomalib.utils.normalization import NormalizationMethod
      4 engine = Engine(
      5     normalization=NormalizationMethod.MIN_MAX,
      6     threshold="F1AdaptiveThreshold",
   (...)
     14     val_check_interval=1.0,
     15 )
---> 17 engine.fit(model=model, datamodule=datamodule)

File ~\anaconda3\Lib\site-packages\anomalib\engine\engine.py:385, in Engine.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    383     self.trainer.validate(model, val_dataloaders, datamodule=datamodule, ckpt_path=ckpt_path)
    384 else:
--> 385     self.trainer.fit(model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\trainer.py:543, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    541 self.state.status = TrainerStatus.RUNNING
    542 self.training = True
--> 543 call._call_and_handle_interrupt(
    544     self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    545 )

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\call.py:44, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
     42     if trainer.strategy.launcher is not None:
     43         return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
---> 44     return trainer_fn(*args, **kwargs)
     46 except _TunerExitException:
     47     _call_teardown_hook(trainer)

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\trainer.py:579, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    572 assert self.state.fn is not None
    573 ckpt_path = self._checkpoint_connector._select_ckpt_path(
    574     self.state.fn,
    575     ckpt_path,
    576     model_provided=True,
    577     model_connected=self.lightning_module is not None,
    578 )
--> 579 self._run(model, ckpt_path=ckpt_path)
    581 assert self.state.stopped
    582 self.training = False

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\trainer.py:986, in Trainer._run(self, model, ckpt_path)
    981 self._signal_connector.register_signal_handlers()
    983 # ----------------------------
    984 # RUN THE TRAINER
    985 # ----------------------------
--> 986 results = self._run_stage()
    988 # ----------------------------
    989 # POST-Training CLEAN UP
    990 # ----------------------------
    991 log.debug(f"{self.__class__.__name__}: trainer tearing down")

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\trainer.py:1032, in Trainer._run_stage(self)
   1030         self._run_sanity_check()
   1031     with torch.autograd.set_detect_anomaly(self._detect_anomaly):
-> 1032         self.fit_loop.run()
   1033     return None
   1034 raise RuntimeError(f"Unexpected state {self.state}")

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\fit_loop.py:205, in _FitLoop.run(self)
    203 try:
    204     self.on_advance_start()
--> 205     self.advance()
    206     self.on_advance_end()
    207     self._restarting = False

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\fit_loop.py:363, in _FitLoop.advance(self)
    361 with self.trainer.profiler.profile("run_training_epoch"):
    362     assert self._data_fetcher is not None
--> 363     self.epoch_loop.run(self._data_fetcher)

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\training_epoch_loop.py:139, in _TrainingEpochLoop.run(self, data_fetcher)
    137 try:
    138     self.advance(data_fetcher)
--> 139     self.on_advance_end(data_fetcher)
    140     self._restarting = False
    141 except StopIteration:

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\training_epoch_loop.py:287, in _TrainingEpochLoop.on_advance_end(self, data_fetcher)
    283 if not self._should_accumulate():
    284     # clear gradients to not leave any unused memory during validation
    285     call._call_lightning_module_hook(self.trainer, "on_validation_model_zero_grad")
--> 287 self.val_loop.run()
    288 self.trainer.training = True
    289 self.trainer._logger_connector._first_loop_iter = first_loop_iter

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\utilities.py:182, in _no_grad_context.<locals>._decorator(self, *args, **kwargs)
    180     context_manager = torch.no_grad
    181 with context_manager():
--> 182     return loop_run(self, *args, **kwargs)

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\evaluation_loop.py:135, in _EvaluationLoop.run(self)
    133     self.batch_progress.is_last_batch = data_fetcher.done
    134     # run step hooks
--> 135     self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
    136 except StopIteration:
    137     # this needs to wrap the `*_step` call too (not just `next`) for `dataloader_iter` support
    138     break

File ~\anaconda3\Lib\site-packages\lightning\pytorch\loops\evaluation_loop.py:396, in _EvaluationLoop._evaluation_step(self, batch, batch_idx, dataloader_idx, dataloader_iter)
    390 hook_name = "test_step" if trainer.testing else "validation_step"
    391 step_args = (
    392     self._build_step_args_from_hook_kwargs(hook_kwargs, hook_name)
    393     if not using_dataloader_iter
    394     else (dataloader_iter,)
    395 )
--> 396 output = call._call_strategy_hook(trainer, hook_name, *step_args)
    398 self.batch_progress.increment_processed()
    400 if using_dataloader_iter:
    401     # update the hook kwargs now that the step method might have consumed the iterator

File ~\anaconda3\Lib\site-packages\lightning\pytorch\trainer\call.py:309, in _call_strategy_hook(trainer, hook_name, *args, **kwargs)
    306     return None
    308 with trainer.profiler.profile(f"[Strategy]{trainer.strategy.__class__.__name__}.{hook_name}"):
--> 309     output = fn(*args, **kwargs)
    311 # restore current_fx when nested context
    312 pl_module._current_fx_name = prev_fx_name

File ~\anaconda3\Lib\site-packages\lightning\pytorch\strategies\strategy.py:412, in Strategy.validation_step(self, *args, **kwargs)
    410 if self.model != self.lightning_module:
    411     return self._forward_redirection(self.model, self.lightning_module, "validation_step", *args, **kwargs)
--> 412 return self.lightning_module.validation_step(*args, **kwargs)

File ~\anaconda3\Lib\site-packages\anomalib\models\image\padim\lightning_model.py:111, in Padim.validation_step(***failed resolving arguments***)
     96 """Perform a validation step of PADIM.
     97 
     98 Similar to the training step, hierarchical features are extracted from the CNN for each batch.
   (...)
    107     These are required in `validation_epoch_end` for feature concatenation.
    108 """
    109 del args, kwargs  # These variables are not used.
--> 111 batch["anomaly_maps"] = self.model(batch["image"])
    112 return batch

File ~\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~\anaconda3\Lib\site-packages\anomalib\models\image\padim\torch_model.py:145, in PadimModel.forward(self, input_tensor)
    143     output = embeddings
    144 else:
--> 145     output = self.anomaly_map_generator(
    146         embedding=embeddings,
    147         mean=self.gaussian.mean,
    148         inv_covariance=self.gaussian.inv_covariance,
    149     )
    150 return output

File ~\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~\AppData\Roaming\Python\Python311\site-packages\torch\nn\modules\module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~\anaconda3\Lib\site-packages\anomalib\models\image\padim\anomaly_map.py:130, in AnomalyMapGenerator.forward(self, **kwargs)
    127 mean: torch.Tensor = kwargs["mean"]
    128 inv_covariance: torch.Tensor = kwargs["inv_covariance"]
--> 130 return self.compute_anomaly_map(embedding, mean, inv_covariance)

File ~\anaconda3\Lib\site-packages\anomalib\models\image\padim\anomaly_map.py:100, in AnomalyMapGenerator.compute_anomaly_map(self, embedding, mean, inv_covariance)
     81 def compute_anomaly_map(
     82     self,
     83     embedding: torch.Tensor,
     84     mean: torch.Tensor,
     85     inv_covariance: torch.Tensor,
     86 ) -> torch.Tensor:
     87     """Compute anomaly score.
     88 
     89     Scores are calculated based on embedding vector, mean and inv_covariance of the multivariate gaussian
   (...)
     98         Output anomaly score.
     99     """
--> 100     score_map = self.compute_distance(
    101         embedding=embedding,
    102         stats=[mean.to(embedding.device), inv_covariance.to(embedding.device)],
    103     )
    104     up_sampled_score_map = self.up_sample(score_map)
    105     return self.smooth_anomaly_map(up_sampled_score_map)

File ~\anaconda3\Lib\site-packages\anomalib\models\image\padim\anomaly_map.py:48, in AnomalyMapGenerator.compute_distance(embedding, stats)
     46 # calculate mahalanobis distances
     47 mean, inv_covariance = stats
---> 48 delta = (embedding - mean).permute(2, 0, 1)
     50 distances = (torch.matmul(delta, inv_covariance) * delta).sum(2).permute(1, 0)
     51 distances = distances.reshape(batch, 1, height, width)

RuntimeError: The size of tensor a (4096) must match the size of tensor b (65536) at non-singleton dimension 2
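
For reference, the two sizes in the error are spatial element counts: 4096 = 64 × 64 and 65536 = 256 × 256, so the validation embedding and the fitted Gaussian statistics were produced at different spatial resolutions. A minimal sketch of the failing broadcast using plain PyTorch tensors (the channel count of 100 is illustrative):

import torch

batch, channels = 1, 100
# Gaussian statistics fitted over 256x256 feature positions -> 65536 columns
mean = torch.zeros(channels, 256 * 256)
# Validation embedding flattened from a 64x64 feature map -> 4096 columns
embedding = torch.zeros(batch, channels, 64 * 64)

# Mirrors the subtraction in padim/anomaly_map.py shown in the traceback:
# mean broadcasts over the batch dimension, but the last (spatial)
# dimension of both tensors must agree.
delta = embedding - mean  # RuntimeError at non-singleton dimension 2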

Code of Conduct

  • I agree to follow this project's Code of Conduct
@CrisGao32

Hi there,
I've also run into the same problem. Have you found a solution? Thanks.

@samet-akcay
Contributor

Are you sure you are using the latest anomalib version? @djdameln recently fixed the 501a notebook in PR #1852.

I've just checked it with the most recent commit from the main branch, and it works fine.

I'm closing this issue. If you still encounter it, feel free to re-open. Thanks!

@samet-akcay linked a pull request Mar 21, 2024 that will close this issue
@mohblnk

mohblnk commented Apr 3, 2024

I still face this error when installing from source...
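
If anyone else hits this after upgrading, it may help to confirm which copy of anomalib is actually being imported; a quick sanity check (a stale site-packages install can shadow a source checkout):

import anomalib

# Show the version string and the path it is imported from, to rule out
# an old installed copy shadowing the fresh source checkout.
print(anomalib.__version__)
print(anomalib.__file__)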
