Merged
10 changes: 5 additions & 5 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -16,11 +16,11 @@ assignees: ''
Please reproduce using the BoringModel!

You can use the following Colab link:
-https://colab.research.google.com/github/PytorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/The_BoringModel.ipynb
+https://colab.research.google.com/github/PytorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.ipynb
IMPORTANT: has to be public.

or this simple template:
-https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report_model.py
+https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.py

If you could not reproduce using the BoringModel and still think there's a bug, please post here
but remember, bugs with code are fixed faster!
@@ -46,9 +46,9 @@ python collect_env_details.py
You can also fill out the list below manually.
-->

-- PyTorch Lightning Version (e.g., 1.3.0):
-- PyTorch Version (e.g., 1.8)
-- Python version:
+- PyTorch Lightning Version (e.g., 1.5.0):
+- PyTorch Version (e.g., 1.10):
+- Python version (e.g., 3.9):
- OS (e.g., Linux):
- CUDA/cuDNN version:
- GPU models and configuration:
2 changes: 1 addition & 1 deletion docs/source/advanced/fault_tolerant_training.rst
@@ -134,7 +134,7 @@ Performance Impacts
-------------------

Fault-tolerant Training was tested on common and worst-case scenarios in order to measure the impact of the internal state tracking on the total training time.
-On tiny models like the `BoringModel and RandomDataset <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report_model.py>`_
+On tiny models like the `BoringModel and RandomDataset <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.py>`_
which has virtually no data loading and processing overhead, we noticed up to 50% longer training time with fault tolerance enabled.
In this worst-case scenario, fault-tolerant adds an overhead that is noticeable in comparison to the compute time for dataloading itself.
However, for more realistic training workloads where data loading and preprocessing is more expensive, the constant overhead that fault tolerance adds becomes less noticeable or not noticeable at all.