Closed
Labels
bug (Something isn't working), waiting on author (Waiting on user action, correction, or update)
Description
Bug description
I don't have a return statement in my training method, but I still get the misconfiguration error for not returning None from `training_epoch_end`. The code breaks after epoch zero. I'm trying to use Lightning to train a pretrained ResNet model on new storm-image data.
How to reproduce the bug
No response
Error messages and logs
---------------------------------------------------------------------------
MisconfigurationException Traceback (most recent call last)
Input In [81], in <cell line: 2>()
1 storm_model = PretrainedWindModel(hparams=hparams)
----> 2 storm_model.fit()
Input In [78], in PretrainedWindModel.fit(self)
107 def fit(self):
108 self.trainer = pl.Trainer(
109 max_epochs=self.max_epochs,
110 default_root_dir=self.output_path,
(...)
123 num_sanity_val_steps=self.hparams.get("val_sanity_checks", 0),
124 )
--> 125 self.trainer.fit(self)
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py:582, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
580 raise TypeError(f"`Trainer.fit()` requires a `LightningModule`, got: {model.__class__.__qualname__}")
581 self.strategy._lightning_module = model
--> 582 call._call_and_handle_interrupt(
583 self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
584 )
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\call.py:38, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
36 return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
37 else:
---> 38 return trainer_fn(*args, **kwargs)
40 except _TunerExitException:
41 trainer._call_teardown_hook()
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py:624, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
617 ckpt_path = ckpt_path or self.resume_from_checkpoint
618 self._ckpt_path = self._checkpoint_connector._set_ckpt_path(
619 self.state.fn,
620 ckpt_path, # type: ignore[arg-type]
621 model_provided=True,
622 model_connected=self.lightning_module is not None,
623 )
--> 624 self._run(model, ckpt_path=self.ckpt_path)
626 assert self.state.stopped
627 self.training = False
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py:1061, in Trainer._run(self, model, ckpt_path)
1057 self._checkpoint_connector.restore_training_state()
1059 self._checkpoint_connector.resume_end()
-> 1061 results = self._run_stage()
1063 log.detail(f"{self.__class__.__name__}: trainer tearing down")
1064 self._teardown()
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py:1140, in Trainer._run_stage(self)
1138 if self.predicting:
1139 return self._run_predict()
-> 1140 self._run_train()
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\trainer\trainer.py:1163, in Trainer._run_train(self)
1160 self.fit_loop.trainer = self
1162 with torch.autograd.set_detect_anomaly(self._detect_anomaly):
-> 1163 self.fit_loop.run()
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\loops\loop.py:200, in Loop.run(self, *args, **kwargs)
198 self.on_advance_start(*args, **kwargs)
199 self.advance(*args, **kwargs)
--> 200 self.on_advance_end()
201 self._restarting = False
202 except StopIteration:
File C:\Users\Nitin_Kumar\anaconda3\lib\site-packages\pytorch_lightning\loops\fit_loop.py:285, in FitLoop.on_advance_end(self)
283 epoch_end_outputs = self.trainer._call_lightning_module_hook("training_epoch_end", epoch_end_outputs)
284 if epoch_end_outputs is not None:
--> 285 raise MisconfigurationException(
286 "`training_epoch_end` expects a return of None. "
287 "HINT: remove the return statement in `training_epoch_end`."
288 )
289 # free memory
290 self._outputs = []
MisconfigurationException: `training_epoch_end` expects a return of None. HINT: remove the return statement in `training_epoch_end`.
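The check that raises this exception lives in `FitLoop.on_advance_end` (visible in the traceback): whatever `training_epoch_end` returns must be `None`. Below is a minimal, hypothetical sketch of that contract; it is not the reporter's code, and `check_epoch_end_return` only mimics the loop's validation for illustration.

```python
def training_epoch_end(outputs):
    """Epoch-end hook: aggregate metrics, but return nothing.

    In a real LightningModule you would call self.log("avg_loss", avg_loss)
    here instead of returning the value.
    """
    avg_loss = sum(o["loss"] for o in outputs) / len(outputs)
    print(f"avg_loss={avg_loss:.4f}")
    # no return statement -> the hook implicitly returns None


def check_epoch_end_return(hook, outputs):
    """Mimics the validation in FitLoop.on_advance_end (see traceback)."""
    result = hook(outputs)
    if result is not None:
        raise RuntimeError(
            "`training_epoch_end` expects a return of None. "
            "HINT: remove the return statement in `training_epoch_end`."
        )
```

Note that the error fires even if the return statement is not in `training_step` itself: any `return` inside the `training_epoch_end` override (including one inherited from a base class or copied from an older example) triggers it.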
Environment
Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 1.10):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
More info
No response