Clean up transforms docs

torchvideo · Jan 7, 2019 · 2986515
1 parent 755f269
commit 2986515

Showing 3 changed files with 154 additions and 142 deletions.
diff --git a/docs/source/transforms.rst b/docs/source/transforms.rst
@@ -36,7 +36,7 @@ transforms followed by a :class:`CollectFrames` transform and a
 
 Optical flow stored as flattened :math:`(u, v)` pairs like
 :math:`(u_0, v_0, u_1, v_1, \ldots, u_n, v_n)` that are then stacked into the channel
- dimension would be dealt with like so:
+dimension would be dealt with like so:
 
 .. code-block:: python
 
@@ -58,11 +58,11 @@ Video Datatypes
 
 torchvideo represents videos in a variety of formats:
 
-- PIL video: A list of a PIL Images, this is useful for applying image data
+- *PIL video*: A list of PIL Images; this is useful for applying image data
   augmentations
-- tensor video: A :class:`torch.Tensor` of shape :math:`(C, T, H, W)` for feeding a
+- *tensor video*: A :class:`torch.Tensor` of shape :math:`(C, T, H, W)` for feeding a
   network.
-- NDArray video: A :class:`numpy.ndarray` of shape either :math:`(T, H, W, C)` or
+- *NDArray video*: A :class:`numpy.ndarray` of shape either :math:`(T, H, W, C)` or
   :math:`(C, T, H, W)`. The reason for the multiple channel shapes is that most
   loaders load in :math:`(T, H, W, C)` format, however tensors formatted for input
   into a network typically are formatted in :math:`(C, T, H, W)`. Permuting the
@@ -71,81 +71,103 @@ torchvideo represents videos in a variety of formats:
   to the other.
 
 
+----
 
 Transforms on PIL Videos
 ------------------------
 
 These transforms all take an iterator/iterable of :class:`PIL.Image.Image` and produce
-an iterator of :class:`PIL.Image.Image`. To draw image out of the transform you should
-compose your sequence of PIL Video transforms with :class:`CollectFrames`.
+an iterator of :class:`PIL.Image.Image`. To materialize the iterator you should
+compose your sequence of PIL video transforms with :class:`CollectFrames`.
 
 
 CenterCropVideo
 ~~~~~~~~~~~~~~~
 
 .. autoclass:: CenterCropVideo
-    :special-members: __call__
 
 RandomCropVideo
 ~~~~~~~~~~~~~~~
 .. autoclass:: RandomCropVideo
-    :special-members: __call__
 
 RandomHorizontalFlipVideo
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 .. autoclass:: RandomHorizontalFlipVideo
-    :special-members: __call__
 
 ResizeVideo
 ~~~~~~~~~~~
 .. autoclass:: ResizeVideo
-    :special-members: __call__
 
 MultiScaleCropVideo
 ~~~~~~~~~~~~~~~~~~~
 .. autoclass:: MultiScaleCropVideo
-    :special-members: __call__
 
 RandomResizedCropVideo
 ~~~~~~~~~~~~~~~~~~~~~~
 .. autoclass:: RandomResizedCropVideo
-    :special-members: __call__
 
 TimeApply
 ~~~~~~~~~
 .. autoclass:: TimeApply
-    :special-members: __call__
+
+----
 
 
 Transforms on Torch.\*Tensor videos
 -----------------------------------
 
-The input to these transforms should be a tensor of shape :math:`(C, T, H, W)`
+These transforms are applicable to ``torch.*Tensor`` videos only. The input to these
+transforms should be a tensor of shape :math:`(C, T, H, W)`.
 
 NormalizeVideo
 ~~~~~~~~~~~~~~
 .. autoclass:: NormalizeVideo
-    :special-members: __call__
 
 TimeToChannel
 ~~~~~~~~~~~~~
 .. autoclass:: TimeToChannel
-    :special-members: __call__
+
+----
+
 
 Conversion transforms
 ---------------------
 
+These transforms are for converting between different video representations. Typically
+your transformation pipeline will operate on iterators of ``PIL`` images which
+will then be aggregated by ``CollectFrames`` and then converted to a tensor via
+``PILVideoToTensor``.
+
+
 CollectFrames
 ~~~~~~~~~~~~~
 .. autoclass:: CollectFrames
-    :special-members: __call__
 
 PILVideoToTensor
 ~~~~~~~~~~~~~~~~
 .. autoclass:: PILVideoToTensor
-    :special-members: __call__
 
 NDArrayToPILVideo
 ~~~~~~~~~~~~~~~~~
 .. autoclass:: NDArrayToPILVideo
-    :special-members: __call__
+
+----
+
+
+Functional Transforms
+---------------------
+
+Functional transforms give you fine-grained control of the transformation pipeline. As
+opposed to the transformations above, functional transforms don’t contain a random
+number generator for their parameters.
+
+.. currentmodule:: torchvideo.transforms.functional
+
+
+normalize
+~~~~~~~~~
+.. autofunction:: normalize
+
+time_to_channel
+~~~~~~~~~~~~~~~
+.. autofunction:: time_to_channel
diff --git a/src/torchvideo/transforms/functional.py b/src/torchvideo/transforms/functional.py
@@ -5,22 +5,22 @@
 
 def normalize(
     tensor: torch.Tensor, mean: Sequence, std: Sequence, inplace: bool = False
-):
-    """Normalize a tensor video with mean and standard deviation
-
-    .. note::
-        This transform acts in-place, i.e., it mutates the input tensor.
+) -> torch.Tensor:
+    r"""Channel-wise normalize a tensor video of shape :math:`(C, T, H, W)` with mean
+    and standard deviation
 
     See :class:`~torchvideo.transforms.NormalizeVideo` for more details.
 
     Args:
         tensor: Tensor video of size :math:`(C, T, H, W)` to be normalized.
-        mean: Sequence of means for each channel :math:`c`
-        std: Sequence of standard deviations for each channel :math:`c`.
+        mean: Sequence of means, :math:`M`, for each channel :math:`c`.
+        std: Sequence of standard deviations, :math:`\Sigma`, for each channel
+            :math:`c`.
         inplace: Whether to normalise the tensor without cloning or not.
 
     Returns:
-        Tensor: Normalised Tensor video.
+        Channel-wise normalised tensor video,
+        :math:`t'_c = \frac{t_c - M_c}{\Sigma_c}`
 
     """
     channel_count = tensor.shape[0]
@@ -41,15 +41,15 @@ def normalize(
     return tensor
 
 
-def time_to_channel(tensor: torch.Tensor):
-    """Reshape video tensor of shape :math:`(C, T, H, W)` into
+def time_to_channel(tensor: torch.Tensor) -> torch.Tensor:
+    r"""Reshape video tensor of shape :math:`(C, T, H, W)` into
     :math:`(C \times T, H, W)`
 
     Args:
         tensor: Tensor video of size :math:`(C, T, H, W)`
 
     Returns:
-        Tensor of shape :math:`(T \times C, H, W)`
+        Tensor of shape :math:`(C \times T, H, W)`
 
     """
     tensor_ndim = len(tensor.size())
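
Note on the two docstrings changed above: the sketch below illustrates the behaviour they describe, channel-wise normalisation :math:`t'_c = \frac{t_c - M_c}{\Sigma_c}` and the :math:`(C, T, H, W) \to (C \times T, H, W)` reshape. It is not the code from this commit. The names ``normalize_sketch`` and ``time_to_channel_sketch`` are hypothetical, the video is assumed to be a floating-point tensor, and the reshape is assumed to be a plain row-major ``reshape``.

```python
import torch


def normalize_sketch(video, mean, std, inplace=False):
    """Channel-wise (t_c - M_c) / Sigma_c for a (C, T, H, W) tensor video.

    Hypothetical stand-in for the documented behaviour, assuming a
    floating-point input tensor.
    """
    if video.dim() != 4:
        raise ValueError(f"expected a (C, T, H, W) tensor, got {tuple(video.shape)}")
    channel_count = video.shape[0]
    if len(mean) != channel_count or len(std) != channel_count:
        raise ValueError("mean and std need one entry per channel")
    # Shape (C, 1, 1, 1) so each statistic broadcasts over T, H and W.
    mean_t = torch.as_tensor(mean, dtype=video.dtype, device=video.device).view(-1, 1, 1, 1)
    std_t = torch.as_tensor(std, dtype=video.dtype, device=video.device).view(-1, 1, 1, 1)
    if (std_t == 0).any():
        raise ValueError("std contains a zero, so the division is undefined")
    # Clone unless the caller explicitly asked to mutate the input.
    out = video if inplace else video.clone()
    out.sub_(mean_t).div_(std_t)
    return out


def time_to_channel_sketch(video):
    """Fold the time axis into the channel axis: (C, T, H, W) -> (C * T, H, W)."""
    if video.dim() != 4:
        raise ValueError(f"expected a (C, T, H, W) tensor, got {tuple(video.shape)}")
    return video.reshape(-1, *video.shape[-2:])


# Toy video with C=2, T=3, H=2, W=2.
video = torch.arange(24, dtype=torch.float32).reshape(2, 3, 2, 2)
normalized = normalize_sketch(video, mean=[1.0, 2.0], std=[2.0, 4.0])
stacked = time_to_channel_sketch(video)
print(normalized.shape, stacked.shape)
```

Under the row-major assumption, frame ``t`` of channel ``c`` ends up at index ``c * T + t`` of the stacked tensor, which matches the :math:`(C \times T, H, W)` ordering that the corrected ``Returns`` line documents.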