In [1]:
import torch

In [2]:
img_t = torch.randn(3, 5, 5) # shape [channels, rows, columns]
weights = torch.tensor([0.2126, 0.7152, 0.0722])

- The weights above are the Rec. 709 luma coefficients for R, G, and B; see https://en.wikipedia.org/wiki/Luma_(video)

In [3]:
batch_t = torch.randn(2, 3, 5, 5) # shape [batch, channels, rows, columns]

- In both tensors the RGB channels are in dimension -3 (the third from the end), so the lazy, unweighted mean can be written the same way for each:

In [4]:
img_gray_naive = img_t.mean(-3)
batch_gray_naive = batch_t.mean(-3)
img_gray_naive.shape, batch_gray_naive.shape

(torch.Size([5, 5]), torch.Size([2, 5, 5]))
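- Counting the channel dimension from the end is what lets the same call work with and without a batch dimension. A minimal sketch, using random stand-in tensors with the same shapes as above:

```python
import torch

img_t = torch.randn(3, 5, 5)       # [channels, rows, columns]
batch_t = torch.randn(2, 3, 5, 5)  # [batch, channels, rows, columns]

# Dimension -3 is dimension 0 for the single image and dimension 1 for the batch.
same_for_img = torch.equal(img_t.mean(-3), img_t.mean(0))
same_for_batch = torch.equal(batch_t.mean(-3), batch_t.mean(1))
print(same_for_img, same_for_batch)
```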

- But now we have the weights.
- PyTorch lets us multiply tensors of the same shape, as well as **tensors where one operand has size 1 in a given dimension**.
- It also **adds leading dimensions of size 1 automatically**.
- This feature is called *broadcasting*.
- **`batch_t` of shape (2, 3, 5, 5) is multiplied by `unsqueezed_weights` of shape (3, 1, 1)**, resulting in a tensor of shape (2, 3, 5, 5), which we can then sum over the third dimension from the end (the three channels) (**important**):

In [5]:
# Add two trailing size-1 dims: (3,) -> (3, 1) -> (3, 1, 1); `weights` itself stays (3,)
unsqueezed_weights = weights.unsqueeze(-1).unsqueeze_(-1)
img_weights = (img_t * unsqueezed_weights)
batch_weights = (batch_t * unsqueezed_weights)
img_gray_weighted = img_weights.sum(-3)
batch_gray_weighted = batch_weights.sum(-3)
batch_weights.shape, batch_t.shape, unsqueezed_weights.shape

(torch.Size([2, 3, 5, 5]), torch.Size([2, 3, 5, 5]), torch.Size([3, 1, 1]))
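- The cell above mixes `unsqueeze` and `unsqueeze_`, then relies on broadcasting. A minimal sketch of both steps in isolation; `w_col` is a name introduced here only for illustration:

```python
import torch

weights = torch.tensor([0.2126, 0.7152, 0.0722])

# unsqueeze (no trailing underscore) returns a new tensor; `weights` is untouched.
w_col = weights.unsqueeze(-1)
print(weights.shape, w_col.shape)   # torch.Size([3]) torch.Size([3, 1])

# unsqueeze_ (trailing underscore) changes the tensor it is called on, in place.
w_col.unsqueeze_(-1)
print(w_col.shape)                  # torch.Size([3, 1, 1])

# Broadcasting lines shapes up from the right and treats missing leading dims as 1:
#   batch_t : (2, 3, 5, 5)
#   w_col   :    (3, 1, 1)  -> behaves like (1, 3, 1, 1)
batch_t = torch.randn(2, 3, 5, 5)
print((batch_t * w_col).shape)      # torch.Size([2, 3, 5, 5])
```

- In PyTorch, a trailing underscore on a method name marks the in-place variant of an operation.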

- I don't yet understand the difference between `unsqueeze` and `unsqueeze_`.
- The PyTorch function `einsum` can do the same thing:

In [6]:
img_gray_weighted_fancy = torch.einsum('...chw,c->...hw', img_t, weights)
batch_gray_weighted_fancy = torch.einsum('...chw,c->...hw', batch_t, weights)
batch_gray_weighted_fancy.shape

torch.Size([2, 5, 5])

- I don't fully understand the meaning of `einsum`'s first argument yet.
- As often in Python, broadcasting (a form of summarizing unnamed things) is done using three dots `...`; but don't worry too much about `einsum`, because we will not use it in the following.
- The problem of losing track of what each dimension means is still unsolved: **this has caught the eye of practitioners, and so it has been suggested that the dimension be given a name instead**.
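- One way to read the index string `'...chw,c->...hw'`: each letter labels one dimension of an input, and a letter that is missing from the output is summed over. A minimal sketch that compares `einsum` with the broadcasting version; `gray_einsum` and `gray_manual` are names introduced here for illustration:

```python
import torch

batch_t = torch.randn(2, 3, 5, 5)
weights = torch.tensor([0.2126, 0.7152, 0.0722])

# '...chw' labels the last three dims of batch_t as c, h, w; '...' covers any leading dims.
# 'c' labels the single dim of weights.
# '->...hw' keeps the leading dims plus h and w; c is absent from the output, so it is summed.
gray_einsum = torch.einsum('...chw,c->...hw', batch_t, weights)

# The same computation written out with broadcasting.
gray_manual = (batch_t * weights[:, None, None]).sum(-3)

print(gray_einsum.shape)
print(torch.allclose(gray_einsum, gray_manual, atol=1e-5))
```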

### Named tensors in practice

In [7]:
weights_named = torch.tensor([0.2126, 0.7152, 0.0722], names=['channels'])
weights_named

tensor([0.2126, 0.7152, 0.0722], names=('channels',))

In [8]:
img_named = img_t.refine_names(..., 'channels', 'rows', 'columns')
batch_named = batch_t.refine_names(..., 'channels', 'rows', 'columns')
print("img named:", img_named.shape, img_named.names)
print("batch named:", batch_named.shape, batch_named.names)

img named: torch.Size([3, 5, 5]) ('channels', 'rows', 'columns')
batch named: torch.Size([2, 3, 5, 5]) (None, 'channels', 'rows', 'columns')


- `align_as` returns a tensor with **missing dimensions added** and **existing ones permuted to the right order**:

In [9]:
weights_aligned = weights_named.align_as(img_named)
weights_aligned.shape, weights_aligned.names

(torch.Size([3, 1, 1]), ('channels', 'rows', 'columns'))
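- The example above only shows dimensions being added. A small sketch of the permuting side of `align_as`, using a hypothetical tensor `t` whose dimensions are in a different order and which has no `rows` dimension:

```python
import torch

img_named = torch.randn(3, 5, 5).refine_names('channels', 'rows', 'columns')

# Hypothetical tensor: 'columns' comes first and 'rows' is missing entirely.
t = torch.randn(5, 3).refine_names('columns', 'channels')

# align_as reorders the existing dims to match img_named and inserts
# a size-1 dim for the missing name.
aligned = t.align_as(img_named)
print(aligned.shape, aligned.names)
```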

In [10]:
gray_named = (img_named * weights_aligned).sum('channels')
gray_named.shape, gray_named.names

(torch.Size([5, 5]), ('rows', 'columns'))
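- The same aligned weights should also work for the batched tensor, whose leading dimension is unnamed. A sketch under that assumption; `gray_batch_named` is a name introduced here:

```python
import torch

weights_named = torch.tensor([0.2126, 0.7152, 0.0722], names=['channels'])
img_named = torch.randn(3, 5, 5).refine_names('channels', 'rows', 'columns')
batch_named = torch.randn(2, 3, 5, 5).refine_names(..., 'channels', 'rows', 'columns')

weights_aligned = weights_named.align_as(img_named)

# Names are matched from the right; the unnamed batch dim has no counterpart
# in weights_aligned and is simply kept.
gray_batch_named = (batch_named * weights_aligned).sum('channels')
print(gray_batch_named.shape, gray_batch_named.names)
```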

- If we try to combine dimensions with different names, we get an error:

In [11]:
gray_named = (img_named[..., :3] * weights_named).sum('channels')

RuntimeError: Error when attempting to broadcast dims ['channels', 'rows', 'columns'] and dims ['channels']: dim 'columns' and dim 'channels' are at the same position from the right but do not match.
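- The error comes from the names, not the sizes. A minimal sketch that reproduces it and then fixes it with `align_as`; the 3-column image is a hypothetical stand-in for `img_named[..., :3]`:

```python
import torch

weights_named = torch.tensor([0.2126, 0.7152, 0.0722], names=['channels'])
# Hypothetical 3-column image, so the sizes would broadcast and only the names clash.
img_named = torch.randn(3, 5, 3).refine_names('channels', 'rows', 'columns')

try:
    (img_named * weights_named).sum('channels')
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# Aligning first moves 'channels' to the matching position, so the product is valid.
gray_named = (img_named * weights_named.align_as(img_named)).sum('channels')
print(raised, gray_named.shape, gray_named.names)
```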

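- The `weights_named` above has not yet had the `rows` and `columns` dimensions added (unlike `weights_aligned`), so its `channels` dimension lines up with `columns`.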
- **If we want to use tensors outside functions that operate on named tensors, we need to drop the names by renaming them to None**:

In [12]:
gray_plain = gray_named.rename(None)
gray_plain.shape, gray_plain.names

(torch.Size([5, 5]), (None, None))
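- Dropping names is reversible. A short sketch of the round trip, with a random stand-in for `gray_named`; `gray_again` is a name introduced here:

```python
import torch

gray_named = torch.randn(5, 5).refine_names('rows', 'columns')

# Drop the names before handing the tensor to code that expects plain tensors.
gray_plain = gray_named.rename(None)
print(gray_plain.names)   # (None, None)

# The names can be attached again later if needed.
gray_again = gray_plain.refine_names('rows', 'columns')
print(gray_again.names)   # ('rows', 'columns')
```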

- Named tensors are still experimental, so the book keeps using unnamed tensors.