on colab pytorch_lightning v1.2 throws valueerror when following setup step !python training/run_experiment.py --max_epochs=3 #9

ravindrabharathi · 2021-02-19T10:07:56Z

When I follow the setup steps for Colab (https://github.com/full-stack-deep-learning/fsdl-text-recognizer-2021-labs/blob/main/setup/readme.md), the pytorch_lightning install step pulls the latest release, v1.2.
This version produces the error below when running !python training/run_experiment.py --max_epochs=3

If pytorch_lightning 1.1.8 is used (!pip install pytorch_lightning==1.1.8), the test step works without issues, as shown in the image in the readme.
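To make the version dependence explicit, here is a small stdlib-only sketch that flags version strings at or above 1.2. The helper names `major_minor` and `affected_by_issue` are hypothetical. The 1.2 cutoff is an assumption based only on the two versions reported in this issue (1.1.8 works, 1.2 fails); other releases have not been checked here.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract the leading (major, minor) pair from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def affected_by_issue(version: str) -> bool:
    """Assumption from this thread only: 1.1.8 works, 1.2 raises the ValueError."""
    return major_minor(version) >= (1, 2)


for v in ("1.1.8", "1.2.0"):
    print(v, affected_by_issue(v))
```

In a notebook, the string to check could come from `pytorch_lightning.__version__` after the install step.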

I haven't explored further to check what is causing the difference between the two versions (or whether it is already a known issue).

Links to colab notebooks with pytorch-lightning v1.2 and v1.1.8
v1.2 : https://colab.research.google.com/drive/1DvfGtym_oZRg2q5R78gWm6997LEZj4Ma?usp=sharing
v1.1.8 : https://colab.research.google.com/drive/1DBjpKEMTJ9w6U3rNltLcHsw976AvNX9j?usp=sharing

  File "training/run_experiment.py", line 90, in <module>
    main()
  File "training/run_experiment.py", line 85, in main
    trainer.fit(lit_model, datamodule=data)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 513, in fit
    self.dispatch()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 553, in dispatch
    self.accelerator.start_training(self)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 111, in start_training
    self._results = trainer.run_train()
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 614, in run_train
    self.run_sanity_check(self.lightning_module)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 863, in run_sanity_check
    _, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/trainer.py", line 732, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/trainer/evaluation_loop.py", line 164, in evaluation_step
    output = self.trainer.accelerator.validation_step(args)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 178, in validation_step
    return self.training_type_plugin.validation_step(*args)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 128, in validation_step
    return self.lightning_module.validation_step(*args, **kwargs)
  File "/content/fsdl-text-recognizer-2021-labs/lab1/text_recognizer/lit_models/base.py", line 61, in validation_step
    self.val_acc(logits, y)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/metric.py", line 152, in forward
    self.update(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/metric.py", line 199, in wrapped_func
    return update(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/classification/accuracy.py", line 139, in update
    preds, target, threshold=self.threshold, top_k=self.top_k, subset_accuracy=self.subset_accuracy
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/functional/accuracy.py", line 25, in _accuracy_update
    preds, target, mode = _input_format_classification(preds, target, threshold=threshold, top_k=top_k)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/classification/helpers.py", line 439, in _input_format_classification
    top_k=top_k,
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/classification/helpers.py", line 296, in _check_classification_inputs
    _basic_input_validation(preds, target, threshold, is_multiclass)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_lightning/metrics/classification/helpers.py", line 74, in _basic_input_validation
    raise ValueError("The `preds` should be probabilities, but values were detected outside of [0,1] range.")
ValueError: The `preds` should be probabilities, but values were detected outside of [0,1] range.
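The message says the accuracy metric received `preds` values outside [0,1], and the traceback shows `self.val_acc(logits, y)` being called with raw logits. The following is a minimal stdlib-only sketch of why a softmax over the logits would satisfy such a range check. It is not the library's code or necessarily the fix that was pushed: `looks_like_probabilities` is a hypothetical stand-in for the check, and the logit values are made up.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Map raw scores to non-negative values that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def looks_like_probabilities(values: List[float]) -> bool:
    """Hypothetical stand-in for the range check named in the error message."""
    return all(0.0 <= v <= 1.0 for v in values)


logits = [2.5, -1.0, 0.3]  # made-up raw model outputs for one sample
probs = softmax(logits)

print(looks_like_probabilities(logits))  # False: 2.5 and -1.0 are outside [0, 1]
print(looks_like_probabilities(probs))   # True
# Softmax preserves the ordering, so the predicted class is unchanged.
print(probs.index(max(probs)) == logits.index(max(logits)))  # True
```

Because softmax does not change which class scores highest, normalizing the logits before they reach the metric should leave the computed accuracy the same while passing the range validation.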


AlexHandy1 · 2021-02-19T15:41:08Z

+1

wayfarerjing · 2021-02-19T22:40:13Z

Same here. Looks like there's a compatibility issue with PL >= 1.2:
Lightning-Universe/lightning-bolts#551

Daniel8hen · 2021-02-20T11:29:40Z

+1

numanai · 2021-02-20T13:44:54Z

+1

Tianqiao-Yvonne · 2021-02-21T16:19:37Z

+1

sergeyk · 2021-02-22T23:07:22Z

Thanks for the reports and the fix! Pushed to main branch, closing.

eng-amrahmed mentioned this issue Feb 21, 2021

Lab 3 on Colab -- python training/run_experiment.py --max_epochs=10 --gpus=1 --num_workers=4 --data_class=EMNISTLines --min_overlap=0 --max_overlap=0 --model_class=LineCNNSimple --window_width=28 --window_stride=28 #10

Closed

sergeyk closed this as completed Feb 22, 2021
