Many bug fixes, added flexibility, parity tests with PyTorch and more
Pre-release
Overview
This release fixes a number of specific issues and improves the development experience by extending the docs, adding type hints, and supporting Python 3.8. Some of the release highlights are:
- Added benchmark for comparing lightning with vanilla implementations
- Extended optimizer support with particular frequency
- Several improvements for loggers, such as representing non-primitive types and supporting hierarchical dictionaries for hyperparameter searchers
- Added model configuration checking before it runs
- Simplify the PL examples structure (shallower and more readable)
- Improved Trainer CLI arguments handling (generalization)
- Two Trainer arguments are deprecated: `print_nan_grads` and `show_progress_bar`
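
The optimizer-frequency highlight can be pictured as a repeating schedule in which each optimizer is used for a number of consecutive batches and then hands over to the next one. The following is a minimal pure-Python sketch under that assumption; `optimizer_index_for_batch` is a hypothetical helper written for illustration and is not part of the Lightning API.

```python
from itertools import accumulate


def optimizer_index_for_batch(batch_idx, frequencies):
    """Return the index of the optimizer active at a global batch index.

    Hypothetical helper (not a Lightning function). Assumption: optimizer i
    is used for frequencies[i] consecutive batches, then the next optimizer
    takes over, and the whole cycle repeats.
    """
    if not frequencies or any(f <= 0 for f in frequencies):
        raise ValueError("frequencies must be a non-empty list of positive ints")
    boundaries = list(accumulate(frequencies))  # e.g. [2, 1] -> [2, 3]
    position = batch_idx % boundaries[-1]       # position inside one cycle
    for opt_idx, boundary in enumerate(boundaries):
        if position < boundary:
            return opt_idx
    raise AssertionError("unreachable: position is always below the last boundary")


# Two optimizers: the first runs for 2 batches, the second for 1, repeating.
schedule = [optimizer_index_for_batch(i, [2, 1]) for i in range(6)]
print(schedule)  # [0, 0, 1, 0, 0, 1]
```

In Lightning itself the frequencies are configured through `LightningModule.configure_optimizers()` (#1269); the exact return format is described in that pull request and is not reproduced here.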
Detailed changes
Added
- Added same step loggers' metrics aggregation (#1278)
- Added parity test between a vanilla MNIST model and lightning model (#1284)
- Added parity test between a vanilla RNN model and lightning model (#1351)
- Added Reinforcement Learning - Deep Q-network (DQN) lightning example (#1232)
- Added support for hierarchical `dict` (#1152)
- Added `TrainsLogger` class (#1122)
- Added type hints to `pytorch_lightning.core` (#946)
- Added support for `IterableDataset` in validation and testing (#1104)
- Added support for non-primitive types in `hparams` for `TensorboardLogger` (#1130)
- Added a check that stops the training when loss or weights contain `NaN` or `inf` values. (#1097)
- Added support for `IterableDataset` when `val_check_interval=1.0` (default), this will trigger validation at the end of each epoch. (#1283)
- Added `summary` method to Profilers. (#1259)
- Added informative errors if user defined dataloader has zero length (#1280)
- Added testing for python 3.8 (#915)
- Added a `training_epoch_end` method which is the mirror of `validation_epoch_end`. (#1357)
- Added model configuration checking (#1199)
- Added support for optimizer frequencies through `LightningModule.configure_optimizers()` (#1269)
- Added option to run without an optimizer by returning `None` from `configure_optimizers`. (#1279)
- Added a warning when the number of data loader workers is small. (#1378)
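
Support for hierarchical `dict` (#1152) means nested hyperparameters can be passed to a logger. One common way to handle such input is to flatten nested keys into delimited names before logging. The sketch below illustrates that idea only; `flatten_dict` and the `/` delimiter are assumptions made for this example, not a statement of how the loggers are implemented.

```python
def flatten_dict(params, delimiter="/"):
    """Flatten a nested dict into a single level with delimited keys.

    Hypothetical helper for illustration; the delimiter is an assumption.
    Note that empty nested dicts produce no entries in the result.
    """
    flat = {}
    for key, value in params.items():
        key = str(key)
        if isinstance(value, dict):
            # Recurse and prefix every nested key with the parent key.
            for sub_key, sub_value in flatten_dict(value, delimiter).items():
                flat[key + delimiter + sub_key] = sub_value
        else:
            flat[key] = value
    return flat


nested = {"lr": 0.01, "model": {"layers": 3, "act": {"name": "relu"}}}
flat = flatten_dict(nested)
print(flat)  # {'lr': 0.01, 'model/layers': 3, 'model/act/name': 'relu'}
```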
Changed
- Changed (renamed and refactored) `TensorRunningMean` -> `TensorRunningAccum`: running accumulations were generalized. (#1278)
- Changed `progress_bar_refresh_rate` trainer flag to disable the progress bar when set to 0. (#1108)
- Enhanced `load_from_checkpoint` to also forward params to the model (#1307)
- Updated references to `self.forward()` to instead use the `__call__` interface. (#1211)
- Changed default behaviour of `configure_optimizers` to use no optimizer rather than Adam. (#1279)
- Allowed uploading models on W&B (#1339)
- On DP and DDP2, unsqueeze is now automated (#1319)
- Reinstantiation no longer always creates a plain DataLoader; it keeps the same type as before (if a subclass of DataLoader) (#1346)
- No longer interferes with a default sampler (#1318)
- Removed default Adam optimizer (#1317)
- Added warnings for unimplemented required lightning methods (#1317)
- Made `evaluate` method private >> `Trainer._evaluate(...)`. (#1260)
- Simplified the PL examples structure (shallower and more readable) (#1247)
- Changed min-max GPU memory to be on their own plots (#1358)
- Removed `.item` which causes sync issues (#1254)
- Changed smoothing in TQDM to decrease variability of time remaining between training/eval (#1194)
- Changed default logger to a dedicated one (#1064)
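
The rename from `TensorRunningMean` to `TensorRunningAccum` (#1278) generalizes a running mean to other accumulations over a fixed window. The sketch below shows the idea with plain Python floats rather than tensors; the class name `RunningAccum` and its methods are hypothetical and chosen for this example only.

```python
from collections import deque


class RunningAccum:
    """Keep the last `window_length` values and accumulate over them.

    Illustrative stand-in using floats; not the Lightning class.
    """

    def __init__(self, window_length):
        if window_length <= 0:
            raise ValueError("window_length must be positive")
        # deque(maxlen=...) silently discards the oldest value when full.
        self._values = deque(maxlen=window_length)

    def append(self, value):
        self._values.append(float(value))

    def mean(self):
        # Average only the values seen so far, so an incomplete window
        # is not diluted by empty slots.
        if not self._values:
            return None
        return sum(self._values) / len(self._values)

    def max(self):
        return max(self._values) if self._values else None

    def min(self):
        return min(self._values) if self._values else None


acc = RunningAccum(window_length=3)
acc.append(1)
acc.append(2)
partial_mean = acc.mean()  # window not full yet: (1 + 2) / 2
acc.append(3)
acc.append(4)              # window now holds 2, 3, 4
full_mean = acc.mean()
```

Dividing by the number of stored values rather than by the window length is the design choice that keeps the average correct while the window is still filling up.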
Deprecated
- Deprecated Trainer argument `print_nan_grads` (#1097)
- Deprecated Trainer argument `show_progress_bar` (#1108)
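
For code that still passes the deprecated arguments, a small shim can translate them before they reach the Trainer. The sketch below is an assumption-laden illustration: `migrate_trainer_kwargs` is a hypothetical helper, it maps `show_progress_bar=False` to `progress_bar_refresh_rate=0` (the behaviour described under #1108), and it simply drops `print_nan_grads` with a warning, on the assumption that the NaN/inf check from #1097 covers that use case.

```python
import warnings


def migrate_trainer_kwargs(kwargs):
    """Return a copy of `kwargs` with the deprecated arguments translated.

    Hypothetical helper for illustration; not part of Lightning.
    """
    updated = dict(kwargs)
    if "show_progress_bar" in updated:
        warnings.warn(
            "show_progress_bar is deprecated; use progress_bar_refresh_rate",
            DeprecationWarning,
            stacklevel=2,
        )
        # A refresh rate of 0 disables the progress bar (#1108).
        if not updated.pop("show_progress_bar"):
            updated["progress_bar_refresh_rate"] = 0
    if "print_nan_grads" in updated:
        warnings.warn(
            "print_nan_grads is deprecated and is ignored by this shim",
            DeprecationWarning,
            stacklevel=2,
        )
        updated.pop("print_nan_grads")  # assumption: no direct replacement here
    return updated


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_kwargs = migrate_trainer_kwargs(
        {"max_epochs": 3, "show_progress_bar": False, "print_nan_grads": True}
    )
print(new_kwargs)  # {'max_epochs': 3, 'progress_bar_refresh_rate': 0}
```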
Removed
- Removed duplicated module `pytorch_lightning.utilities.arg_parse` for loading CLI arguments (#1167)
- Removed wandb logger's `finalize` method (#1193)
- Dropped `torchvision` dependency in tests and added own MNIST dataset class instead (#986)
Fixed
- Fixed `model_checkpoint` when saving all models (#1359)
- Fixed `Trainer.add_argparse_args` classmethod; it now adds a type for the arguments (#1147)
- Fixed bug related to type checking of `ReduceLROnPlateau` lr schedulers (#1114)
- Fixed a bug to ensure lightning checkpoints are backward compatible (#1132)
- Fixed a bug that created an extra dataloader with active `reload_dataloaders_every_epoch` (#1181)
- Fixed all warnings and errors in the docs build process (#1191)
- Fixed an issue where `val_percent_check=0` would not disable validation (#1251)
- Fixed average of incomplete `TensorRunningMean` (#1309)
- Fixed `WandbLogger.watch` with `wandb.init()` (#1311)
- Fixed an issue with early stopping that would prevent it from monitoring training metrics when validation is disabled / not implemented (#1235)
- Fixed a bug that would cause `trainer.test()` to run on the validation set when overloading `validation_epoch_end` and `test_end` (#1353)
- Fixed `WandbLogger.watch` - use of the watch method without importing `wandb` (#1311)
- Fixed `WandbLogger` to be used with 'ddp' - allow reinits in sub-processes (#1149, #1360)
- Made `training_epoch_end` behave like `validation_epoch_end` (#1357)
- Fixed `fast_dev_run` running validation twice (#1365)
- Fixed pickle error from quick patch `__code__` (#1352)
- Fixed memory leak on GPU0 (#1094, #1349)
- Fixed checkpointing interval (#1272)
- Fixed validation and training loops running only part of the dataset (#1192)
- Fixed running `on_validation_end` only on main process in DDP (#1125)
- Fixed `load_spawn_weights` only in proc rank 0 (#1385)
- Fixed `use_amp` issue (#1145)
- Fixed using deprecated `use_amp` attribute (#1145)
- Fixed Tensorboard logger error: lightning_logs directory does not exist in multi-node DDP on nodes with rank != 0 (#1375)
- Fixed `Unimplemented backend XLA` error on TPU (#1387)
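
The `Trainer.add_argparse_args` fix (#1147) concerns attaching a type to each generated CLI argument, so that values arrive as numbers rather than strings. The sketch below shows the general technique with the standard library only; `ToyTrainer` and `add_args_from_signature` are hypothetical stand-ins and do not reproduce the Lightning implementation.

```python
import argparse
import inspect


class ToyTrainer:
    """Hypothetical stand-in for a class whose __init__ defines CLI options."""

    def __init__(self, max_epochs=10, lr=0.1, name="run"):
        self.max_epochs = max_epochs
        self.lr = lr
        self.name = name


def add_args_from_signature(parser, cls):
    """Add one typed --option per keyword argument of cls.__init__.

    The type is inferred from the default value; only int, float and str
    defaults are handled in this sketch (bool and None are skipped).
    """
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self" or param.default is inspect.Parameter.empty:
            continue
        arg_type = type(param.default)
        if arg_type not in (int, float, str):
            continue
        parser.add_argument("--" + name, type=arg_type, default=param.default)
    return parser


parser = add_args_from_signature(argparse.ArgumentParser(), ToyTrainer)
args = parser.parse_args(["--max_epochs", "5", "--lr", "0.01"])
trainer = ToyTrainer(**vars(args))
```

Without the `type=` argument, `argparse` would hand `"5"` to the constructor as a string, which is the kind of mismatch a typed parser avoids.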
Contributors
@alexeykarnachev, @amoudgl, @areshytko, @asafmanor, @awaelchli, @bkkaggle, @bmartinn, @Borda, @borisdayma, @cmpute, @djbyrne, @ethanwharris, @gerardrbentley, @jbschiratti, @jeremyjordan, @justusschock, @monney, @mpariente, @pertschuk, @rmrao, @S-aiueo32, @shubhamagarwal92, @SkafteNicki, @sneiman, @tullie, @vanpelt, @williamFalcon, @xingzhaolee
If we forgot someone due to not matching commit email with GitHub account, let us know :]