
Commit

Data fixes and readme update (#1136)
* readme

* batch overfit fix for reproducibility

* batch overfit fix for reproducibility
Scitator committed Mar 25, 2021
1 parent f153d69 commit 5d50190
Showing 6 changed files with 54 additions and 38 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -29,11 +29,13 @@
 </div>
 
 Catalyst is a PyTorch framework for Deep Learning Research and Development.
-Catalyst focuses on reproducibility, rapid experimentation, and codebase reuse
+It focuses on reproducibility, rapid experimentation, and codebase reuse
 so you can create something new rather than write yet another train loop.
-<br/> Break the cycle, use Catalyst!
+<br/> Break the cycle - use the Catalyst!
 
-Read more about our vision in the [Project Manifest](https://github.com/catalyst-team/catalyst/blob/master/MANIFEST.md). Catalyst is a part of the [PyTorch Ecosystem](https://pytorch.org/ecosystem/). [Catalyst Ecosystem](https://docs.google.com/presentation/d/1D-yhVOg6OXzjo9K_-IS5vSHLPIUxp1PEkFGnpRcNCNU/edit?usp=sharing) consists of:
+Read more about our vision in the [Project Manifest](https://github.com/catalyst-team/catalyst/blob/master/MANIFEST.md).
+Catalyst is a part of the [PyTorch Ecosystem](https://pytorch.org/ecosystem/).
+<br/> [Catalyst Ecosystem](https://docs.google.com/presentation/d/1D-yhVOg6OXzjo9K_-IS5vSHLPIUxp1PEkFGnpRcNCNU/edit?usp=sharing) consists of:
 - [Alchemy](https://github.com/catalyst-team/alchemy) - experiments logging & visualization
 - [Catalyst](https://github.com/catalyst-team/catalyst) - accelerated deep learning R&D
 - [Reaction](https://github.com/catalyst-team/reaction) - convenient deep learning model serving
7 changes: 4 additions & 3 deletions catalyst/data/loader.py
@@ -1,4 +1,5 @@
 from typing import Any, Callable, Iterable, Union
+from itertools import tee
 import queue
 import sys
 import threading
@@ -106,7 +107,7 @@ def __init__(self, loader: DataLoader, num_batches: Union[int, float]):
             )
             num_batches = int(len(loader) * num_batches)
 
-        self.iterator = iter(self.origin)
+        self._iterator = iter(self.origin)
         self.iteration_index = 0
         self.num_batches = num_batches
 
@@ -117,7 +118,7 @@ def __iter__(self):
             iterator object
         """
         self.iteration_index = 0
-        self.iterator = iter(self.origin)
+        self._iterator, self.iterator = tee(self._iterator)
         return self
 
     def __next__(self):
@@ -130,7 +131,7 @@ def __next__(self):
             raise StopIteration()
         self.iteration_index += 1
         if self.iteration_index % self.num_batches == 0:
-            self.iterator = iter(self.origin)
+            self._iterator, self.iterator = tee(self._iterator)
         batch = next(self.iterator)
         return batch
 
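This is the "batch overfit fix for reproducibility" from the commit message: calling `iter(self.origin)` on every reset re-shuffles a shuffled `DataLoader`, so a batch-limited loader would serve different batches each time it was restarted. Duplicating a privately held iterator with `itertools.tee` instead replays the same batches on every reset. A minimal, self-contained sketch of that idea (toy names, not the library code):

```python
from itertools import tee
import random


def make_stream():
    """Stand-in for a shuffled DataLoader: every fresh iterator yields a new order."""
    data = list(range(8))
    random.shuffle(data)
    return iter(data)


# Old behaviour: resetting with a fresh iterator (the ``iter(self.origin)`` call)
# reshuffles the data, so the "first two batches" usually differ between resets.
it = make_stream()
old_epoch1 = [next(it) for _ in range(2)]
it = make_stream()  # reset -> new shuffle
old_epoch2 = [next(it) for _ in range(2)]

# New behaviour: keep a private iterator and duplicate it with ``tee`` on every
# reset; the private copy is never advanced, so each duplicate replays the same items.
_iterator = make_stream()

_iterator, iterator = tee(_iterator)
new_epoch1 = [next(iterator) for _ in range(2)]

_iterator, iterator = tee(_iterator)
new_epoch2 = [next(iterator) for _ in range(2)]

assert new_epoch1 == new_epoch2  # deterministic across resets
```
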
36 changes: 36 additions & 0 deletions catalyst/data/tests/test_loader.py
@@ -0,0 +1,36 @@
+# flake8: noqa
+import torch
+from torch.utils.data import DataLoader, TensorDataset
+
+from catalyst.data.loader import BatchLimitLoaderWrapper
+
+
+def test_batch_limit1() -> None:
+    for shuffle in (False, True):
+        num_samples, num_features = int(1e2), int(1e1)
+        X, y = torch.rand(num_samples, num_features), torch.rand(num_samples)
+        dataset = TensorDataset(X, y)
+        loader = DataLoader(dataset, batch_size=4, num_workers=1, shuffle=shuffle)
+        loader = BatchLimitLoaderWrapper(loader, num_batches=1)
+
+        batch1 = next(iter(loader))[0]
+        batch2 = next(iter(loader))[0]
+        batch3 = next(iter(loader))[0]
+        assert all(torch.isclose(x, y).all() for x, y in zip(batch1, batch2))
+        assert all(torch.isclose(x, y).all() for x, y in zip(batch2, batch3))
+
+
+def test_batch_limit2() -> None:
+    for shuffle in (False, True):
+        num_samples, num_features = int(1e2), int(1e1)
+        X, y = torch.rand(num_samples, num_features), torch.rand(num_samples)
+        dataset = TensorDataset(X, y)
+        loader = DataLoader(dataset, batch_size=4, num_workers=1, shuffle=shuffle)
+        loader = BatchLimitLoaderWrapper(loader, num_batches=2)
+
+        batch1 = next(iter(loader))[0]
+        batch2 = next(iter(loader))[0]
+        batch3 = next(iter(loader))[0]
+        batch4 = next(iter(loader))[0]
+        assert all(torch.isclose(x, y).all() for x, y in zip(batch1, batch3))
+        assert all(torch.isclose(x, y).all() for x, y in zip(batch2, batch4))
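The new tests pin down the wrapper's contract: with `num_batches=1` every draw from the wrapped loader yields the same batch, and with `num_batches=2` the two cached batches alternate, whether or not the underlying loader shuffles. A hedged sketch of the typical use case, overfitting on a couple of batches as a pipeline sanity check (the toy model, sizes, and plain training loop below are illustrative, not from the repo):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from catalyst.data.loader import BatchLimitLoaderWrapper

# Toy data and model; sizes and architecture here are illustrative only.
X, y = torch.rand(128, 10), torch.rand(128, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=4, shuffle=True)
loader = BatchLimitLoaderWrapper(loader, num_batches=2)  # replay the same 2 batches

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# If the pipeline is wired correctly, the model should memorize these
# 8 samples and the loss should fall close to zero.
for epoch in range(20):
    for features, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(features), targets)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss={loss.item():.4f}")
```
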
8 changes: 5 additions & 3 deletions docs/index.rst
@@ -10,10 +10,12 @@ PyTorch framework for Deep Learning R&D.
 --------------------------------------------------------------------------------
 
 It focuses on reproducibility, rapid experimentation, and codebase reuse
-so you can **create** something new rather than write another regular train loop.
+so you can **create** something new rather than write yet another train loop.
 Break the cycle - use the Catalyst_!
 
-Project manifest_. Part of `PyTorch Ecosystem`_. Part of `Catalyst Ecosystem`_:
+Read more about our vision in the `Project Manifest`_. Catalyst is a part of the `PyTorch Ecosystem`_.
+
+`Catalyst Ecosystem`_ consists of:
 - Alchemy_ - experiments logging & visualization
 - Catalyst_ - accelerated deep learning R&D
 - Reaction_ - convenient deep learning models serving
@@ -25,7 +27,7 @@ Project manifest_. Part of `PyTorch Ecosystem`_. Part of `Catalyst Ecosystem`_:
 .. _Alchemy: https://github.com/catalyst-team/alchemy
 .. _Catalyst: https://github.com/catalyst-team/catalyst
 .. _Reaction: https://github.com/catalyst-team/reaction
-.. _manifest: https://github.com/catalyst-team/catalyst/blob/master/MANIFEST.md
+.. _`Project Manifest`: https://github.com/catalyst-team/catalyst/blob/master/MANIFEST.md
 .. _Catalyst at AI Landscape: https://landscape.lfai.foundation/selected=catalyst
 
 Getting started
31 changes: 3 additions & 28 deletions examples/README.md
@@ -2,34 +2,9 @@
 
 ## Python API
 
-1. [demo notebook](<./notebooks/demo 21xx.ipynb>) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](<https://colab.research.google.com/github/catalyst-team/catalyst/blob/master/examples/notebooks/demo 21xx.ipynb>)
-    - minimal examples
-    - Runner customization
-    - DL and RL pipelines
-1. [classification tutorial](./notebooks/classification-tutorial.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/catalyst-team/catalyst/blob/master/examples/notebooks/classification-tutorial.ipynb)
-    - dataset preparation (raw images -> train/valid/infer splits)
-    - augmentations usage example
-    - pretrained model finetuning
-    - various classification metrics
-    - metrics visualization
-    - FocalLoss and OneCycle usage examples
-    - class imbalance handling
-    - model inference
-1. [segmentation tutorial](notebooks/segmentation-tutorial.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/catalyst-team/catalyst/blob/master/examples/notebooks/segmentation-tutorial.ipynb)
-    - car segmentation dataset
-    - augmentations with [albumentations](https://github.com/albu/albumentations) library
-    - training in FP16 with [NVIDIA Apex](https://github.com/NVIDIA/apex)
-    - using segmentation models from `catalyst/contrib/models/segmentation`
-    - training with multiple criterion (Dice + IoU + BCE) example
-    - Lookahead + RAdam optimizer usage example
-    - tensorboard logs visualization
-    - predictions visualization
-    - Test-time augmentations with [ttach](https://github.com/qubvel/ttach) library
-1. [Pruning tutorial](notebooks/Pruning.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/catalyst-team/catalyst/blob/master/examples/notebooks/Pruning.ipynb)
-    - Pruning intro
-    - Lottery ticket hypothesis
-    - Catalyst pruning callback
-    - Loading training result from logs
+Catalyst Python API examples can be found in the
+[minimal examples](https://github.com/catalyst-team/catalyst#minimal-examples)
+and [notebook section](https://github.com/catalyst-team/catalyst#notebooks).
 
 ----
 
2 changes: 1 addition & 1 deletion requirements/requirements.txt
@@ -3,7 +3,7 @@ numpy>=1.16.4
 torch>=1.3.0
 
 # Config API
-PyYAML
+PyYAML>=5.1
 
 # for future development:
 # tensorboardX provides tensorboard support for any framework
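The only requirements change raises the Config API dependency from an unpinned `PyYAML` to `PyYAML>=5.1`. One plausible reason (an assumption, not stated in the diff) is the loader API introduced in PyYAML 5.1; a minimal sketch of parsing a YAML experiment config with the 5.1+ `FullLoader` (the config keys below are made up for illustration):

```python
import yaml

# Hypothetical experiment config; the keys are illustrative only.
config_text = """
model_params:
  model: SimpleNet
stages:
  stage1:
    state_params:
      num_epochs: 10
"""

# yaml.FullLoader appeared in PyYAML 5.1, which is consistent with the
# ``PyYAML>=5.1`` pin above.
config = yaml.load(config_text, Loader=yaml.FullLoader)
assert config["stages"]["stage1"]["state_params"]["num_epochs"] == 10
```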
