Clean up transforms docs
willprice committed Jan 7, 2019
1 parent 755f269 commit 2986515
Showing 3 changed files with 154 additions and 142 deletions.
60 changes: 41 additions & 19 deletions docs/source/transforms.rst
@@ -36,7 +36,7 @@ transforms followed by a :class:`CollectFrames` transform and a
Optical flow stored as flattened :math:`(u, v)` pairs like
:math:`(u_0, v_0, u_1, v_1, \ldots, u_n, v_n)` that are then stacked into the channel
dimension would be dealt with like so:
dimension would be dealt with like so:

.. code-block:: python
@@ -58,11 +58,11 @@ Video Datatypes

torchvideo represents videos in a variety of formats:

- PIL video: A list of a PIL Images, this is useful for applying image data
- *PIL video*: A list of PIL Images; this is useful for applying image data
augmentations
- tensor video: A :class:`torch.Tensor` of shape :math:`(C, T, H, W)` for feeding a
- *tensor video*: A :class:`torch.Tensor` of shape :math:`(C, T, H, W)` for feeding a
network.
- NDArray video: A :class:`numpy.ndarray` of shape either :math:`(T, H, W, C)` or
- *NDArray video*: A :class:`numpy.ndarray` of shape either :math:`(T, H, W, C)` or
:math:`(C, T, H, W)`. The reason for the multiple channel shapes is that most
loaders load in :math:`(T, H, W, C)` format, however tensors formatted for input
into a network typically are formatted in :math:`(C, T, H, W)`. Permuting the
@@ -71,81 +71,103 @@ torchvideo represents videos in a variety of formats:
to the other.
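
As a minimal sketch of the two NDArray layouts described above, the following uses only NumPy (not torchvideo) to move between :math:`(T, H, W, C)` and :math:`(C, T, H, W)`. The clip dimensions and variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical clip: 8 frames of 32x48 RGB, laid out as (T, H, W, C)
thwc = np.zeros((8, 32, 48, 3), dtype=np.uint8)
thwc[2, 5, 7, 1] = 255  # frame 2, row 5, column 7, green channel

# Move the channel axis to the front: (T, H, W, C) -> (C, T, H, W)
cthw = np.transpose(thwc, (3, 0, 1, 2))

# The inverse permutation recovers the loader layout: (C, T, H, W) -> (T, H, W, C)
back = np.transpose(cthw, (1, 2, 3, 0))
```

The marked pixel moves from index ``[2, 5, 7, 1]`` to ``[1, 2, 5, 7]``, since only the order of the axes changes, not the data.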


----

Transforms on PIL Videos
------------------------

These transforms all take an iterator/iterable of :class:`PIL.Image.Image` and produce
an iterator of :class:`PIL.Image.Image`. To draw image out of the transform you should
compose your sequence of PIL Video transforms with :class:`CollectFrames`.
an iterator of :class:`PIL.Image.Image`. To materialize the iterator you should
compose your sequence of PIL video transforms with :class:`CollectFrames`.
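
The lazy behaviour described above can be illustrated in plain Python. This is only a sketch: ``lazy_flip`` and ``collect`` are hypothetical stand-ins (strings play the role of frames) and are not torchvideo's implementation; ``collect`` assumes that materializing simply means draining the iterator into a list.

```python
from typing import Iterable, Iterator, List


def lazy_flip(frames: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in for a PIL video transform: yields frames one at a time."""
    for frame in frames:
        yield frame[::-1]  # "flip" each frame (here: reverse the string)


def collect(frames: Iterable[str]) -> List[str]:
    """Hypothetical stand-in for the materializing step: drain the iterator into a list."""
    return list(frames)


video = ["ab", "cd", "ef"]
lazy = lazy_flip(video)   # a generator; no frame has been processed yet
frames = collect(lazy)    # consumes the generator and holds every frame in memory
leftover = list(lazy)     # the generator is now exhausted
```

Because the generator can be consumed only once, the materializing step belongs at the end of the chain of iterator-based transforms.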


CenterCropVideo
~~~~~~~~~~~~~~~

.. autoclass:: CenterCropVideo
:special-members: __call__

RandomCropVideo
~~~~~~~~~~~~~~~
.. autoclass:: RandomCropVideo
:special-members: __call__

RandomHorizontalFlipVideo
~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: RandomHorizontalFlipVideo
:special-members: __call__

ResizeVideo
~~~~~~~~~~~
.. autoclass:: ResizeVideo
:special-members: __call__

MultiScaleCropVideo
~~~~~~~~~~~~~~~~~~~
.. autoclass:: MultiScaleCropVideo
:special-members: __call__

RandomResizedCropVideo
~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: RandomResizedCropVideo
:special-members: __call__

TimeApply
~~~~~~~~~
.. autoclass:: TimeApply
:special-members: __call__

----


Transforms on Torch.\*Tensor videos
-----------------------------------

The input to these transforms should be a tensor of shape :math:`(C, T, H, W)`
These transforms are applicable to `torch.*Tensor` videos only. The input to these
transforms should be a tensor of shape :math:`(C, T, H, W)`.

NormalizeVideo
~~~~~~~~~~~~~~
.. autoclass:: NormalizeVideo
:special-members: __call__

TimeToChannel
~~~~~~~~~~~~~
.. autoclass:: TimeToChannel
:special-members: __call__

----


Conversion transforms
---------------------

These transforms are for converting between different video representations. Typically
your transformation pipeline will operate on iterators of ``PIL`` images which
will then be aggregated by ``CollectFrames`` and then converted to a tensor via
``PILVideoToTensor``.


CollectFrames
~~~~~~~~~~~~~
.. autoclass:: CollectFrames
:special-members: __call__

PILVideoToTensor
~~~~~~~~~~~~~~~~
.. autoclass:: PILVideoToTensor
:special-members: __call__

NDArrayToPILVideo
~~~~~~~~~~~~~~~~~
.. autoclass:: NDArrayToPILVideo
:special-members: __call__

----


Functional Transforms
---------------------

Functional transforms give you fine-grained control of the transformation pipeline. As
opposed to the transformations above, functional transforms don’t contain a random
number generator for their parameters.

.. currentmodule:: torchvideo.transforms.functional


normalize
~~~~~~~~~
.. autofunction:: normalize
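
The channel-wise formula :math:`t'_c = \frac{t_c - M_c}{\Sigma_c}` can be sketched with NumPy as a stand-in for torch. ``normalize_sketch`` is a hypothetical helper written for illustration, not the library function, and it assumes a floating-point input of shape :math:`(C, T, H, W)`.

```python
import numpy as np


def normalize_sketch(video, mean, std, inplace=False):
    """Channel-wise normalise a float array of shape (C, T, H, W) (illustrative only)."""
    channel_count = video.shape[0]
    if len(mean) != channel_count or len(std) != channel_count:
        raise ValueError("mean and std need one entry per channel")
    out = video if inplace else video.copy()
    # Reshape the statistics to (C, 1, 1, 1) so they broadcast over T, H and W
    m = np.asarray(mean, dtype=out.dtype).reshape(-1, 1, 1, 1)
    s = np.asarray(std, dtype=out.dtype).reshape(-1, 1, 1, 1)
    out -= m
    out /= s
    return out


video = np.ones((2, 3, 4, 4), dtype=np.float32)
video[1] *= 3  # channel 0 is all ones, channel 1 is all threes
result = normalize_sketch(video, mean=[1.0, 1.0], std=[1.0, 2.0])
```

With ``inplace=False`` the input array is left untouched; passing ``inplace=True`` in this sketch would mutate ``video`` and return the same object.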

time_to_channel
~~~~~~~~~~~~~~~
.. autofunction:: time_to_channel
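
A small NumPy sketch of folding the time dimension into the channel dimension follows. It assumes a plain row-major reshape, which may differ in detail from what the library does; the array sizes are illustrative.

```python
import numpy as np

# Hypothetical video with C=2 channels, T=3 frames, 4x5 pixels
C, T, H, W = 2, 3, 4, 5
video = np.arange(C * T * H * W).reshape(C, T, H, W)

# Fold time into channels: (C, T, H, W) -> (C * T, H, W)
flat = video.reshape(C * T, H, W)

# Under a row-major reshape, frame t of channel c lands at index c * T + t
c, t = 1, 2
plane = flat[c * T + t]
```

Under this assumption all frames of channel 0 come first, followed by all frames of channel 1, and so on.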
22 changes: 11 additions & 11 deletions src/torchvideo/transforms/functional.py
@@ -5,22 +5,22 @@

def normalize(
tensor: torch.Tensor, mean: Sequence, std: Sequence, inplace: bool = False
):
"""Normalize a tensor video with mean and standard deviation
.. note::
This transform acts in-place, i.e., it mutates the input tensor.
) -> torch.Tensor:
r"""Channel-wise normalize a tensor video of shape :math:`(C, T, H, W)` with mean
and standard deviation
See :class:`~torchvideo.transforms.NormalizeVideo` for more details.
Args:
tensor: Tensor video of size :math:`(C, T, H, W)` to be normalized.
mean: Sequence of means for each channel :math:`c`
std: Sequence of standard deviations for each channel :math:`c`.
mean: Sequence of means, :math:`M`, for each channel :math:`c`.
std: Sequence of standard deviations, :math:`\Sigma`, for each channel
:math:`c`.
inplace: Whether to normalise the tensor without cloning or not.
Returns:
Tensor: Normalised Tensor video.
Channel-wise normalised tensor video,
:math:`t'_c = \frac{t_c - M_c}{\Sigma_c}`
"""
channel_count = tensor.shape[0]
@@ -41,15 +41,15 @@ def normalize(
return tensor


def time_to_channel(tensor: torch.Tensor):
"""Reshape video tensor of shape :math:`(C, T, H, W)` into
def time_to_channel(tensor: torch.Tensor) -> torch.Tensor:
r"""Reshape video tensor of shape :math:`(C, T, H, W)` into
:math:`(C \times T, H, W)`
Args:
tensor: Tensor video of size :math:`(C, T, H, W)`
Returns:
Tensor of shape :math:`(T \times C, H, W)`
Tensor of shape :math:`(C \times T, H, W)`
"""
tensor_ndim = len(tensor.size())