
# Ch 3. The tensor data structure
---
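Neural networks ultimately process information as floating-point numbers. Input data therefore has to be encoded into a form that can be computed on, and the results have to be decoded into a form that can be interpreted.   
   
The basic data structure in PyTorch is the tensor. A tensor looks similar to a Python list, but it works very differently internally.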
![image](https://user-images.githubusercontent.com/76675506/188612947-117b191f-b8e3-48df-8aaf-6c08caa8c010.png)

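As the figure above shows, the elements of a tensor are laid out in one contiguous block of memory. A Python list, in contrast, holds references to objects that are each allocated separately in memory.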
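
To make the difference concrete, here is a minimal sketch that uses only the standard library rather than PyTorch. It assumes CPython, where `array.array('f', ...)` packs 32-bit floats into a single contiguous buffer (loosely comparable to a tensor's storage), while a list stores references to separate float objects. The variable names are illustrative only.

```python
from array import array

values = [4.0, 1.0, 5.0]

# A list holds references; each float is its own Python object elsewhere in memory.
as_list = list(values)

# An array packs raw 32-bit floats back to back in a single buffer.
as_array = array('f', values)

print(as_array.itemsize)        # bytes per element
print(len(as_array.tobytes()))  # total bytes in the contiguous buffer
print(type(as_list[0]))         # each list element is a full Python object
```

Three 32-bit floats occupy exactly twelve bytes in the array, whereas the list carries one pointer plus one boxed float object per element.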
  

In [63]:
import torch

a = torch.ones(3)  # one-dimensional tensor holding three 1.0 values
a

tensor([1., 1., 1.])

In [68]:
points = torch.tensor([4.0, 1.0, 5.0, 3.0, 2.0, 1.0])
points

tensor([4., 1., 5., 3., 2., 1.])

In [69]:
float(points[0]), float(points[1])

(4.0, 1.0)

In [70]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])

In [71]:
points.shape

torch.Size([3, 2])

In [72]:
points = torch.zeros(3, 2)
points

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

In [73]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])

In [74]:
points[0, 1]

tensor(1.)

In [75]:
points[0]

tensor([4., 1.])

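### Named tensors
---
When working with tensors, you have to remember the order of the dimensions in order to index them. As a tensor passes through more operations, however, it is easy to lose track of which dimension holds which data.   
PyTorch lets you give each dimension a name by passing the `names` argument when you create a tensor.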

In [76]:
import torch
_ = torch.tensor([0.2126, 0.7152, 0.0722], names=['c'])

  


In [77]:
img_t = torch.randn(3, 5, 5) # [channels, rows, columns]
weights = torch.tensor([0.2126, 0.7152, 0.0722])

In [78]:
batch_t = torch.randn(2, 3, 5, 5) # [batch, channels, rows, columns]

In [79]:
img_gray_naive = img_t.mean(-3)
batch_gray_naive = batch_t.mean(-3)
img_gray_naive.shape, batch_gray_naive.shape

(torch.Size([5, 5]), torch.Size([2, 5, 5]))

In [80]:

unsqueezed_weights = weights.unsqueeze(-1).unsqueeze_(-1) # weights: (3,) -> (3, 1) -> (3, 1, 1)
img_weights = (img_t * unsqueezed_weights) # (3, 5, 5) * (3, 1, 1) broadcasts to (3, 5, 5)
batch_weights = (batch_t * unsqueezed_weights) # (2, 3, 5, 5) * (3, 1, 1) broadcasts to (2, 3, 5, 5)
img_gray_weighted = img_weights.sum(-3) # sum over channels -> (5, 5)
batch_gray_weighted = batch_weights.sum(-3) # sum over channels -> (2, 5, 5)
batch_weights.shape, batch_t.shape, unsqueezed_weights.shape

(torch.Size([2, 3, 5, 5]), torch.Size([2, 3, 5, 5]), torch.Size([3, 1, 1]))
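
The broadcasted multiply followed by a sum over the channel dimension is a per-pixel weighted sum. The sketch below spells that out with nested lists in plain Python. The function name `weighted_gray` is hypothetical, and the weights are chosen as exact binary fractions so the result is easy to verify by hand; they are not the luminance weights used above.

```python
def weighted_gray(img, weights):
    """Weighted sum over the channel axis of a [channels][rows][columns] nested list."""
    rows, cols = len(img[0]), len(img[0][0])
    gray = [[0.0] * cols for _ in range(rows)]
    for channel, w in zip(img, weights):  # one weight per channel, reused for every pixel
        for r in range(rows):
            for c in range(cols):
                gray[r][c] += w * channel[r][c]
    return gray

# Tiny 3-channel, 1x2 image so the result can be checked by hand.
img = [[[1.0, 0.0]], [[1.0, 0.0]], [[1.0, 2.0]]]
weights = [0.25, 0.5, 0.25]
gray = weighted_gray(img, weights)
print(gray)  # [[1.0, 0.5]]
```

Reusing one weight for every pixel of a channel is what the `(3, 1, 1)` shape expresses: the size-1 dimensions are stretched across rows and columns.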

In [82]:
weights_named = torch.tensor([0.2126, 0.7152, 0.0722], names=['channels'])
weights_named

tensor([0.2126, 0.7152, 0.0722], names=('channels',))

In [83]:
img_named = img_t.refine_names(..., 'channels', 'rows','columns')
batch_named = batch_t.refine_names(..., 'channels', 'rows', 'columns')
print("img named:", img_named.shape, img_named.names)
print("batch named:", batch_named.shape, batch_named.names)

img named: torch.Size([3, 5, 5]) ('channels', 'rows', 'columns')
batch named: torch.Size([2, 3, 5, 5]) (None, 'channels', 'rows', 'columns')


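`align_as` returns a tensor whose dimensions follow the name order of another tensor: existing dimensions are permuted into that order and missing ones are added with size 1.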

In [84]:
weights_aligned = weights_named.align_as(img_named)
weights_aligned.shape, weights_aligned.names

(torch.Size([3, 1, 1]), ('channels', 'rows', 'columns'))
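
The shape produced by this alignment can be reasoned about without PyTorch. The following is a simplified model, not PyTorch's implementation; `align_shape` is a hypothetical helper that works only on shapes and names.

```python
def align_shape(shape, names, target_names):
    """Return the shape `shape` would take once laid out in `target_names` order.

    Dimensions that are missing from `names` are inserted with size 1.
    """
    size_by_name = dict(zip(names, shape))
    return tuple(size_by_name.get(name, 1) for name in target_names)

aligned = align_shape((3,), ('channels',), ('channels', 'rows', 'columns'))
print(aligned)  # (3, 1, 1)
```

The result `(3, 1, 1)` is exactly the shape that was built by hand earlier with two `unsqueeze` calls.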

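Functions that take a dimension argument, such as `sum`, also accept the name of a dimension, so an operation can be applied along a dimension chosen by name.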

In [85]:
gray_named = (img_named * weights_aligned).sum('channels')
gray_named.shape, gray_named.names # 'channels' is gone because sum reduces over that dimension and removes it

(torch.Size([5, 5]), ('rows', 'columns'))
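
A reduction such as `sum` collapses the dimension it runs over, so that dimension and its name no longer appear in the result. A minimal shape-level sketch in plain Python, with `reduce_dim` as a hypothetical helper name:

```python
def reduce_dim(shape, names, dim):
    """Shape and names left after reducing (for example summing) over the named dimension."""
    axis = names.index(dim)
    return shape[:axis] + shape[axis + 1:], names[:axis] + names[axis + 1:]

new_shape, new_names = reduce_dim((3, 5, 5), ('channels', 'rows', 'columns'), 'channels')
print(new_shape, new_names)  # (5, 5) ('rows', 'columns')
```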

In [86]:
kk = img_named * weights_aligned
kk.shape, kk.names # check: before the sum, the 'channels' dimension is still present

(torch.Size([3, 5, 5]), ('channels', 'rows', 'columns'))

In [87]:
print(kk, '\n', gray_named)

tensor([[[-1.7495e-01,  1.1685e-01,  4.8922e-01, -1.0347e-01, -2.1159e-01],
         [-1.5507e-02,  2.1725e-01,  1.4960e-01,  5.9206e-02,  2.2728e-01],
         [-1.2239e-01, -4.8214e-02, -6.1721e-02, -2.4194e-01,  3.6431e-01],
         [ 2.9162e-01,  2.5637e-01,  1.8303e-01, -4.9120e-01, -1.4423e-01],
         [ 1.5832e-01, -1.8739e-01, -2.8693e-02, -4.6539e-01, -1.7598e-01]],

        [[ 2.3943e-01,  6.4769e-01, -8.8357e-01,  4.6026e-01,  3.7912e-01],
         [-8.6857e-01,  2.1674e-01, -4.1043e-02, -3.4069e-01, -4.7513e-01],
         [-3.4466e-01, -4.9491e-01, -5.4171e-01,  1.0127e+00,  1.3811e+00],
         [ 5.1091e-01,  8.9465e-01,  9.8213e-01, -2.9651e-01,  8.2737e-01],
         [-5.5427e-01, -9.7001e-01,  1.2455e+00, -8.0214e-02, -2.1663e-01]],

        [[-1.3702e-02,  2.0164e-02,  1.3150e-01,  1.1523e-01,  1.0444e-01],
         [-2.7445e-02, -6.4883e-02, -8.9699e-02, -6.8444e-02,  2.7753e-02],
         [-7.0511e-02,  5.7304e-02,  1.8403e-02,  7.1753e-02,  8.0925e-02],
        

Combining dimensions that have different names raises an error like the one below.

In [89]:
gray_named = (img_named[..., :3] * weights_named).sum('channels')

RuntimeError: ignored
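
Why the names clash here can be seen by lining them up from the rightmost dimension, the way broadcasting lines up shapes. The sketch below is a simplified model of that rule, assuming that two names are compatible when they are equal or when either one is `None`; `first_name_mismatch` is a hypothetical helper, not a PyTorch function.

```python
def first_name_mismatch(names_a, names_b):
    """Compare names from the rightmost dimension; return the first conflicting pair or None."""
    for a, b in zip(reversed(names_a), reversed(names_b)):
        if a is not None and b is not None and a != b:
            return a, b
    return None

img_names = ('channels', 'rows', 'columns')
print(first_name_mismatch(img_names, ('channels',)))  # ('columns', 'channels')
print(first_name_mismatch(img_names, img_names))      # None
```

The one-dimensional weights are matched against the last dimension of the image, so `'channels'` meets `'columns'`. Aligning the weights first, as `align_as` does, avoids the conflict.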

# Tensors from the storage point of view
---


In [None]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points.storage()

In [None]:
points_storage = points.storage()
points_storage[0]

In [None]:
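a = torch.ones(3, 2)  # start from ones so the effect of zero_() is visible

a.zero_()  # the trailing underscore marks an in-place operation

a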

In [None]:
points.storage()[1]

In [None]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_storage = points.storage()
points_storage[0] = 2.0  # writing to the storage also changes points, which is a view over it
points

In [None]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1]
second_point.storage_offset()  # 2: the second point starts two elements into the storage

In [None]:
second_point.size()

In [None]:
second_point.shape

In [None]:
points.stride()

In [None]:
second_point = points[1]
second_point.size()

In [None]:
second_point.storage_offset()

In [None]:
second_point.stride()
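
Size, storage offset, and stride together determine where each element lives in the flat storage: the position is the offset plus the sum of index times stride over all dimensions. A plain-Python sketch of that lookup, where the list `storage` stands in for the tensor's storage and `element` is a hypothetical helper:

```python
storage = [4.0, 1.0, 5.0, 3.0, 2.0, 1.0]  # flat, row-major contents of points

def element(storage, offset, strides, index):
    """Look up one element: offset plus the sum of index * stride per dimension."""
    position = offset + sum(i * s for i, s in zip(index, strides))
    return storage[position]

# points has offset 0 and stride (2, 1)
print(element(storage, 0, (2, 1), (1, 0)))  # 5.0
# second_point = points[1] has offset 2 and stride (1,)
print(element(storage, 2, (1,), (0,)))      # 5.0
```

Both lookups land on the same storage position, which is why a subtensor can be created without copying any data.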

In [None]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1]
second_point[0] = 10.0  # second_point shares storage with points, so points changes too
points

In [None]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
second_point = points[1].clone()  # clone() copies the data, so points stays untouched below
second_point[0] = 10.0
points

In [90]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points

tensor([[4., 1.],
        [5., 3.],
        [2., 1.]])

In [91]:
points_t = points.t()
points_t

tensor([[4., 5., 2.],
        [1., 3., 1.]])

In [92]:
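points.storage().data_ptr() == points_t.storage().data_ptr()  # same underlying memory; comparing id() of temporary storage wrappers is not a reliable test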

True

In [93]:
points.stride()

(2, 1)

In [94]:
points_t.stride()

(1, 2)

In [95]:
some_t = torch.ones(3, 4, 5)
transpose_t = some_t.transpose(0, 2)
some_t.shape

torch.Size([3, 4, 5])

In [96]:
transpose_t.shape

torch.Size([5, 4, 3])

In [97]:
some_t.stride()

(20, 5, 1)

In [98]:
transpose_t.stride()

(1, 5, 20)

In [99]:
points.is_contiguous()

True

In [100]:
points_t.is_contiguous()

False
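
Whether a tensor is contiguous can be read off its strides: a row-major layout has stride 1 in the last dimension, and each earlier stride is the product of the sizes that follow it. The sketch below is a simplified check in plain Python with hypothetical helper names; it ignores edge cases such as size-1 dimensions, which PyTorch treats more leniently.

```python
def contiguous_strides(shape):
    """Strides of a row-major (C-contiguous) layout for the given shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))

def is_contiguous(shape, strides):
    """True when the strides match the row-major layout for this shape."""
    return tuple(strides) == contiguous_strides(shape)

print(contiguous_strides((3, 4, 5)))  # (20, 5, 1)
print(is_contiguous((3, 2), (2, 1)))  # True
print(is_contiguous((2, 3), (1, 2)))  # False
```

The last two calls use the shapes and strides of `points` and `points_t` and reproduce the results above.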

In [101]:
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
points_t = points.t()
points_t

tensor([[4., 5., 2.],
        [1., 3., 1.]])

In [102]:
points_t.storage()

 4.0
 1.0
 5.0
 3.0
 2.0
 1.0
[torch.storage._TypedStorage(dtype=torch.float32, device=cpu) of size 6]

In [103]:
points_t.stride()

(1, 2)

In [104]:
points_t_cont = points_t.contiguous()
points_t_cont

tensor([[4., 5., 2.],
        [1., 3., 1.]])

In [105]:
points_t_cont.stride()

(3, 1)

In [106]:
points_t_cont.storage()

 4.0
 5.0
 2.0
 1.0
 3.0
 1.0
[torch.storage._TypedStorage(dtype=torch.float32, device=cpu) of size 6]
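
The reordering that `contiguous()` performs can be modelled as walking the strided view in row-major order and copying each element into a fresh flat list. This is a plain-Python sketch for the two-dimensional case only, and `make_contiguous` is a hypothetical helper name.

```python
def make_contiguous(storage, shape, strides, offset=0):
    """Copy a 2D strided view into a new flat list in row-major order."""
    rows, cols = shape
    return [storage[offset + r * strides[0] + c * strides[1]]
            for r in range(rows) for c in range(cols)]

storage = [4.0, 1.0, 5.0, 3.0, 2.0, 1.0]  # storage shared by points and points_t
new_storage = make_contiguous(storage, (2, 3), (1, 2))  # shape and stride of points_t
print(new_storage)  # [4.0, 5.0, 2.0, 1.0, 3.0, 1.0]
```

The new list matches the storage of `points_t_cont` shown above, and the original storage is left unchanged.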

In [107]:
double_points = torch.ones(10, 2, dtype=torch.double)
short_points = torch.tensor([[1, 2], [3, 4]], dtype=torch.short)

In [108]:
short_points.dtype

torch.int16

In [109]:
double_points = torch.zeros(10, 2).double()
short_points = torch.ones(10, 2).short()

In [110]:
double_points = torch.zeros(10, 2).to(torch.double)
short_points = torch.ones(10, 2).to(dtype=torch.short)

In [111]:
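points_64 = torch.rand(5, dtype=torch.double)  # rand draws values in [0, 1)
points_short = points_64.to(torch.short)  # casting to int16 truncates every value to 0
points_64 * points_short  # mixed dtypes are promoted to float64 (works from PyTorch 1.3 onwards)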

tensor([0., 0., 0., 0., 0.], dtype=torch.float64)
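
The all-zero product comes from the cast, not from the multiplication: converting a value in [0, 1) to an integer type truncates it toward zero. A small standard-library sketch of the same effect, where `array('h', ...)` stands in for a 16-bit integer tensor and the input values are made up for illustration:

```python
from array import array

values = [0.12, 0.5, 0.99]  # illustrative values in [0, 1)

# int() truncates toward zero, so anything in [0, 1) becomes 0.
as_short = array('h', [int(v) for v in values])  # 'h' is a signed 16-bit integer
products = [v * s for v, s in zip(values, as_short)]

print(list(as_short))  # [0, 0, 0]
print(products)        # [0.0, 0.0, 0.0]
```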

In [112]:
# reset points back to original value
points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

In [113]:
some_list = list(range(6))
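some_list[:]     # all elements in the list
some_list[1:4]   # from element 1 inclusive to element 4 exclusive
some_list[1:]    # from element 1 inclusive to the end of the list
some_list[:4]    # from the start of the list to element 4 exclusive
some_list[:-1]   # from the start of the list to one before the last element
some_list[1:4:2] # from element 1 inclusive to element 4 exclusive, in steps of 2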

[1, 3]

In [114]:
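points[1:]       # all rows after the first; implicitly all columns
points[1:, :]    # all rows after the first; all columns
points[1:, 0]    # all rows after the first; first column only
points[None]     # adds a leading dimension of size 1, like unsqueeze(0)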

tensor([[[4., 1.],
         [5., 3.],
         [2., 1.]]])

In [115]:
points = torch.ones(3, 4)
points_np = points.numpy()
points_np

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]], dtype=float32)

In [116]:
points = torch.from_numpy(points_np)

In [117]:
import os

os.makedirs('../data/p1ch3', exist_ok=True)  # torch.save does not create missing directories
torch.save(points, '../data/p1ch3/ourpoints.t')

In [None]:
with open('../data/p1ch3/ourpoints.t','wb') as f:
    torch.save(points, f)

In [None]:
points = torch.load('../data/p1ch3/ourpoints.t')

In [None]:
with open('../data/p1ch3/ourpoints.t','rb') as f:
    points = torch.load(f)

In [None]:
import h5py

f = h5py.File('../data/p1ch3/ourpoints.hdf5', 'w')
dset = f.create_dataset('coords', data=points.numpy())
f.close()

In [None]:
f = h5py.File('../data/p1ch3/ourpoints.hdf5', 'r')
dset = f['coords']
last_points = dset[-2:]

In [118]:
last_points = torch.from_numpy(dset[-2:])
f.close()

NameError: ignored

In [119]:
points_gpu = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]], device='cuda')

In [120]:
points_gpu = points.to(device='cuda')

In [121]:
points_gpu = points.to(device='cuda:0')

In [122]:
points = 2 * points  # multiplication performed on the CPU
points_gpu = 2 * points.to(device='cuda')  # multiplication performed on the GPU

In [123]:
points_gpu = points_gpu + 4

In [124]:
points_cpu = points_gpu.to(device='cpu')

In [125]:
points_gpu = points.cuda()  # shorthand for .to(device='cuda'); uses the default GPU
points_gpu = points.cuda(0)
points_cpu = points_gpu.cpu()

In [126]:
a = torch.ones(3, 2)
a_t = torch.transpose(a, 0, 1)

a.shape, a_t.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

In [127]:
a = torch.ones(3, 2)
a_t = a.transpose(0, 1)

a.shape, a_t.shape

(torch.Size([3, 2]), torch.Size([2, 3]))