### Tensor Creation
- Commonly used functions are `rand`, `zeros`, and `ones`; the first argument of each is the shape (dimensions)

- The shape can be passed either as a tuple or as unpacked positional integers

- `torch.zeros` : returns a tensor filled with the scalar value 0, with the shape defined by the variable argument `size`

- `torch.zeros_like` : returns a tensor filled with the scalar value 0, with the same size as argument `input`

- `torch.ones` : returns a tensor filled with the scalar value 1, with the shape defined by the variable argument `size`

- `torch.arange` : returns a 1D tensor of size ceil((end - start) / step) with values from the half-open interval [start, end)

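- `torch.linspace` : behaves like `np.linspace`, returning a 1D tensor of `steps` evenly spaced values from `start` to `end`, both inclusive (the default dtype is float32 rather than NumPy's float64)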

- `torch.eye` : returns a 2D tensor with ones on the diagonal and zeros elsewhere

- `torch.empty` : returns a tensor filled with uninitialized data

- `torch.full` : returns a tensor of `size` filled with `fill_value`

- `torch.permute` : returns a view of the original tensor `input` with its dimensions permuted

- `torch.rand` : returns a tensor filled with random numbers from a uniform distribution on the interval [0, 1)

- `torch.randint` : returns a tensor filled with random integers drawn uniformly from the half-open interval [`low`, `high`)

- `torch.randn` : returns a tensor filled with random numbers from a standard normal distribution

- `torch.randperm` : returns a random permutation of integers from 0 to n - 1

In [15]:
import torch
import numpy as np

x = torch.rand((3, 4))
x

tensor([[0.0451, 0.8324, 0.2319, 0.9732],
        [0.9537, 0.0203, 0.4733, 0.1110],
        [0.1005, 0.9470, 0.2391, 0.9604]])

In [6]:
ex = torch.zeros(2, 4)
ex

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [7]:
ex = torch.zeros_like(x)
ex

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [10]:
ex = torch.arange(0, 10, 1)
ex

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [14]:
ex = torch.linspace(0, 10, 10)
ex

tensor([ 0.0000,  1.1111,  2.2222,  3.3333,  4.4444,  5.5556,  6.6667,  7.7778,
         8.8889, 10.0000])

In [16]:
ex = np.linspace(0, 10, 10)
ex

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

In [17]:
ex = torch.eye(3)
ex

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [18]:
ex = torch.full((2, 2), 10)
ex

tensor([[10, 10],
        [10, 10]])

In [19]:
ex = torch.randperm(5)
ex

tensor([4, 1, 3, 0, 2])
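
Several functions listed above (`ones`, `randint`, `randn`, `empty`, `permute`) have no cell of their own. The following is a minimal sketch of how they might be called; the shapes and variable names are chosen only for illustration and are not from the original notebook.

```python
import math
import torch

# The shape may be given as a tuple or as unpacked integers.
a = torch.ones((2, 3))
b = torch.ones(2, 3)

# arange excludes `end`; the length is ceil((end - start) / step).
r = torch.arange(0, 10, 3)              # values 0, 3, 6, 9
expected_len = math.ceil((10 - 0) / 3)

# randint draws integers from [low, high); randn draws from N(0, 1).
ints = torch.randint(low=0, high=5, size=(2, 4))
normal = torch.randn(2, 4)

# empty allocates memory without initializing it, so its values are arbitrary.
e = torch.empty(2, 2)

# permute reorders dimensions and returns a view, not a copy.
p = torch.zeros(2, 3, 4).permute(2, 0, 1)   # shape (4, 2, 3)
```

Because `empty` leaves its memory uninitialized, its contents should not be read before being written.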

### Tensor Data Types
- `torch.Tensor` vs `torch.tensor` : `torch.tensor` infers the `dtype` from the data, while `torch.Tensor` always uses the default dtype (a `torch.FloatTensor` unless the default has been changed)

- `torch.Tensor` vs `torch.cuda.Tensor` : `.Tensor` occupies CPU memory while `.cuda.Tensor` occupies GPU memory

- `torch.tensor(data, dtype, device, requires_grad)` : Constructs a tensor with data -- the `data` argument must be array-like (list, tuple, np.ndarray)

In [79]:
a = torch.tensor((1, 2))
a

tensor([1, 2])

In [80]:
b = torch.Tensor(1, 2) # passing integers like this requests a tensor of size 1 x 2 (its values are uninitialized)
b

tensor([[0., 0.]])

In [81]:
c = torch.Tensor((1, 2)) # a tuple, however, is treated as data, just as with torch.tensor (but the dtype is float)
c

tensor([1., 2.])

In [20]:
# create a Float tensor of size 2 x 3

torch.cuda.FloatTensor(2, 3)

tensor([[0., 0., 0.],
        [0., 0., 0.]], device='cuda:0')

In [21]:
# convert a list into a Float tensor

torch.cuda.FloatTensor([2, 3])

tensor([2., 3.], device='cuda:0')

In [22]:
# dtype conversion

x = torch.cuda.FloatTensor([2, 3])
x.type_as(torch.cuda.IntTensor())

tensor([2, 3], device='cuda:0', dtype=torch.int32)

In [25]:
# type_as also seems to convert freely to a CPU tensor type

x = torch.cuda.FloatTensor([2, 3])
x_cpu = x.type_as(torch.IntTensor())
x_cpu

tensor([2, 3], dtype=torch.int32)
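
The dtype rules in this section can also be checked on a machine without a GPU. The sketch below is not part of the original cells; it assumes the default dtype has not been changed and uses `Tensor.to`, which is one common way to convert dtypes.

```python
import torch

a = torch.tensor([1, 2])         # dtype inferred from the data: int64
b = torch.tensor([1.0, 2.0])     # dtype inferred from the data: float32
c = torch.Tensor([1, 2])         # always the default dtype (float32 here)

# An explicit dtype can be given at construction time.
d = torch.tensor([1, 2], dtype=torch.float64)

# Converting an existing tensor returns a new tensor; x itself is unchanged.
x = torch.tensor([2.7, 3.2])
x_int = x.to(torch.int32)        # the fractional part is truncated
x_same = x.type_as(d)            # takes the dtype of d
```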

### Numpy <-> Tensor

In [28]:
x_np = np.ndarray(shape=(2, 3), dtype=int, buffer=np.array([1, 2, 3, 4, 5, 6]))
x_np

array([[1, 2, 3],
       [4, 5, 6]])

In [29]:
x_ts = torch.from_numpy(x_np)
x_ts

tensor([[1, 2, 3],
        [4, 5, 6]])

In [31]:
x_ts[0, 0] = 0
x_ts

tensor([[0, 2, 3],
        [4, 5, 6]])

In [32]:
x_np

array([[0, 2, 3],
       [4, 5, 6]])

- As shown above, changing a value after converting to a tensor also changes the value in the NumPy array

In [33]:
x_np_ = x_ts.numpy()
x_np_

array([[0, 2, 3],
       [4, 5, 6]])

In [34]:
x_np_[0, 0] = 1
x_np_

array([[1, 2, 3],
       [4, 5, 6]])

In [35]:
x_ts

tensor([[1, 2, 3],
        [4, 5, 6]])

In [36]:
x_np

array([[1, 2, 3],
       [4, 5, 6]])

In [39]:
print(f'Numpy : {id(x_np)} | Tensor : {id(x_ts)} | Numpy from Tensor : {id(x_np_)}')

Numpy : 139912717178416 | Tensor : 139912716822480 | Numpy from Tensor : 139912717279568


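- As shown above, when the tensor created from a NumPy array is converted back with `.numpy()` and a value is changed, the value changes in all three objects: they have different `id`s but share the same underlying memory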
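
Since `id()` only distinguishes Python objects, a more direct check of the sharing is sketched below. `np.shares_memory` and the copy via `torch.tensor` are additions for illustration, not part of the original cells.

```python
import numpy as np
import torch

x_np = np.array([[1, 2, 3], [4, 5, 6]])
x_ts = torch.from_numpy(x_np)      # shares memory with x_np
x_back = x_ts.numpy()              # shares the same memory as well

# Different Python objects, same underlying buffer.
same_object = x_back is x_np
shared = np.shares_memory(x_np, x_back)

# torch.tensor(...) copies the data instead of sharing it.
x_copy = torch.tensor(x_np)
x_ts[0, 0] = 0                     # visible through x_np and x_back, not x_copy
```

When an independent tensor is needed, copying with `torch.tensor(...)` or `Tensor.clone()` avoids these side effects.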

### CPU tensor <-> GPU tensor
- `torch.Tensor.cuda()` : Returns a copy of the object in CUDA memory

- `torch.Tensor.cpu()` : Returns a copy of the object in CPU memory

In [40]:
x = torch.Tensor([
    [1, 2, 3], [4, 5, 6]
    ])

print(x)
x_gpu = x.cuda()

print(x_gpu)


tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[1., 2., 3.],
        [4., 5., 6.]], device='cuda:0')


In [41]:
# modify a value in x

x[0, 0] = 10
print(x)
print(x_gpu)

tensor([[10.,  2.,  3.],
        [ 4.,  5.,  6.]])
tensor([[1., 2., 3.],
        [4., 5., 6.]], device='cuda:0')


In [42]:
x_cpu = x_gpu.cpu()
print(x_cpu)
print(x_gpu)

tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[1., 2., 3.],
        [4., 5., 6.]], device='cuda:0')


In [43]:
x_gpu[0, 0] = 9
print(x_cpu)
print(x_gpu)

tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[9., 2., 3.],
        [4., 5., 6.]], device='cuda:0')


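- As shown above, `.cuda()` and `.cpu()` allocate new memory on the GPU or CPU for a new tensor with the same values as the original object (when the tensor is not already on that device)

- Therefore, changing the values of the original object does not change the tensor that was moved to the GPU or CPU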
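
The cells above require a CUDA device. A device-agnostic variant might look like the sketch below; `Tensor.to` and the `device` variable are not used in the original notebook and are shown here as one possible approach.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
x_dev = x.to(device)

# With a CUDA device, x_dev is a new tensor in GPU memory.
# On a CPU-only machine, .to() returns x itself because nothing needs to change.
returned_self = x_dev is x

x_back = x_dev.cpu()   # back in CPU memory
```

Note that when `.to()` returns the same object, in-place changes to `x` remain visible through `x_dev`; the independence described above holds only when an actual transfer takes place.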

### Checking Tensor Size
- `torch.Tensor.size(dim=None)` : Returns the size of the self tensor

In [44]:
x = torch.cuda.FloatTensor(10, 3, 3)
print(x.size())
print(x.size(dim=0))
print(x.size(dim=1))
print(x.size(dim=2))

torch.Size([10, 3, 3])
10
3
3
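
A few related accessors, sketched on a CPU tensor so the cell runs without a GPU. These are additions for illustration and do not appear in the original notebook.

```python
import torch

x = torch.zeros(10, 3, 3)

full = x.size()        # torch.Size([10, 3, 3])
same = x.shape         # attribute form of the same information
last = x.size(-1)      # negative dims count from the end
ndim = x.dim()         # number of dimensions
count = x.numel()      # total number of elements
```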


### Indexing, Masking
- `torch.index_select()` exists for indexing, but it is cumbersome to use

- Masking is done with `torch.masked_select(input, mask)`; it is used in models such as BERT

In [45]:
x = torch.rand(4, 3)
print(x)

torch.index_select(x, 0, torch.LongTensor([0, 2])) 
# select the first and third entries of x along dimension 0
# equivalent to taking x[0, :] and x[2, :] in Python

tensor([[0.0882, 0.3231, 0.7715],
        [0.2158, 0.0334, 0.8595],
        [0.6967, 0.0745, 0.3257],
        [0.6830, 0.5659, 0.7472]])


tensor([[0.0882, 0.3231, 0.7715],
        [0.6967, 0.0745, 0.3257]])

In [52]:
print(x)
print(x[:, 0])
print(x[0, :])
print(x[0:3, 0:2])

tensor([[0.0882, 0.3231, 0.7715],
        [0.2158, 0.0334, 0.8595],
        [0.6967, 0.0745, 0.3257],
        [0.6830, 0.5659, 0.7472]])
tensor([0.0882, 0.2158, 0.6967, 0.6830])
tensor([0.0882, 0.3231, 0.7715])
tensor([[0.0882, 0.3231],
        [0.2158, 0.0334],
        [0.6967, 0.0745]])


- `torch.masked_select()` : Returns a new 1D tensor which indexes the input tensor according to the boolean mask

In [54]:
x = torch.randn(2, 3)
print(x)
mask = torch.BoolTensor([[0, 0, 1], [0, 1, 0]])
torch.masked_select(x, mask)

tensor([[ 0.5593,  0.6676,  1.7465],
        [-1.8597, -1.3053, -0.3348]])


tensor([ 1.7465, -1.3053])
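
In practice the mask usually comes from a comparison rather than being written by hand. The sketch below uses fixed values so the result is predictable; boolean indexing and `masked_fill` are shown as related options and are not covered in the original notes.

```python
import torch

x = torch.tensor([[0.5, -1.0, 2.0],
                  [-3.0, 4.0, -0.5]])

mask = x > 0                              # boolean tensor, same shape as x
selected = torch.masked_select(x, mask)   # 1D tensor of the positive entries
indexed = x[mask]                         # boolean indexing gives the same values

# masked_fill replaces the masked positions instead of extracting them.
filled = x.masked_fill(~mask, 0.0)
```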

### Joining Tensors
- `torch.cat(tensors, dim)` : Concatenates the given sequence of tensors in the given dimension (all tensors must have the same shape except in the concatenating dimension)

- `torch.stack(tensors, dim)` : Concatenates a sequence of tensors along a new dimension

- `torch.dstack(tensors)` : Stack tensors in sequence depthwise

- `torch.hstack(tensors)` : Stack tensors in sequence horizontally (column-wise)

- `torch.vstack(tensors)` : Stack tensors in sequence vertically (row-wise)

In [56]:
# torch.cat

x = torch.cuda.FloatTensor([
    [1, 2, 3], [4, 5, 6]
])
y = torch.cuda.FloatTensor([
    [-1, -2, -3], [-4, -5, -6]
])

print(f'x : {x.size()} | y : {y.size()}')

z1 = torch.cat([x, y], dim=0) # concatenating along dimension 0: ('2' x 3) + ('2' x 3) -> (4 x 3)
print(z1.size())
print(z1)

x : torch.Size([2, 3]) | y : torch.Size([2, 3])
torch.Size([4, 3])
tensor([[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [-1., -2., -3.],
        [-4., -5., -6.]], device='cuda:0')


In [57]:
z2 = torch.cat([x, y], dim=1) # -> 2 x 6
print(z2.size())
print(z2)

torch.Size([2, 6])
tensor([[ 1.,  2.,  3., -1., -2., -3.],
        [ 4.,  5.,  6., -4., -5., -6.]], device='cuda:0')


In [60]:
# torch.stack

x = torch.rand(2, 3)
print(x.size())
print(x)

x_stack = torch.stack([x, x, x], dim=0)
print(x_stack.size())
print(x_stack)

torch.Size([2, 3])
tensor([[0.4671, 0.3110, 0.8043],
        [0.9557, 0.7416, 0.6666]])
torch.Size([3, 2, 3])
tensor([[[0.4671, 0.3110, 0.8043],
         [0.9557, 0.7416, 0.6666]],

        [[0.4671, 0.3110, 0.8043],
         [0.9557, 0.7416, 0.6666]],

        [[0.4671, 0.3110, 0.8043],
         [0.9557, 0.7416, 0.6666]]])


In [63]:
# torch.dstack 

x = torch.rand(3)
y = torch.rand(3)
print(x, x.size())
print(y, y.size())
print('after dstack')
z = torch.dstack((x, y))
print(z, z.size())

tensor([0.1254, 0.3987, 0.8328]) torch.Size([3])
tensor([0.4943, 0.9332, 0.7796]) torch.Size([3])
after dstack
tensor([[[0.1254, 0.4943],
         [0.3987, 0.9332],
         [0.8328, 0.7796]]]) torch.Size([1, 3, 2])


In [64]:
x = torch.rand(3, 1)
y = torch.rand(3, 1)
print(x, x.size())
print(y, y.size())
z = torch.dstack((x, y))
print(z, z.size())

tensor([[0.3629],
        [0.2877],
        [0.5181]]) torch.Size([3, 1])
tensor([[0.6605],
        [0.2257],
        [0.5281]]) torch.Size([3, 1])
tensor([[[0.3629, 0.6605]],

        [[0.2877, 0.2257]],

        [[0.5181, 0.5281]]]) torch.Size([3, 1, 2])


In [65]:
# torch.hstack, torch.vstack

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
print(x)
print(y)
z = torch.hstack((x, y))
print(z, z.size())

tensor([1, 2, 3])
tensor([4, 5, 6])
tensor([1, 2, 3, 4, 5, 6]) torch.Size([6])


In [66]:
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
print(x)
print(y)
z = torch.vstack((x, y))
print(z, z.size())

tensor([1, 2, 3])
tensor([4, 5, 6])
tensor([[1, 2, 3],
        [4, 5, 6]]) torch.Size([2, 3])
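
The relationship between these functions can be checked directly. The following sketch, using small fixed tensors chosen for illustration, shows `stack` expressed through `cat`, and `vstack`/`hstack` on 2D inputs.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
y = -x

# stack inserts a new dimension; cat over unsqueezed inputs gives the same result.
stacked = torch.stack([x, y], dim=0)                            # (2, 2, 3)
via_cat = torch.cat([x.unsqueeze(0), y.unsqueeze(0)], dim=0)

# For 2D inputs, vstack matches cat on dim 0 and hstack matches cat on dim 1.
v = torch.vstack((x, y))                                        # (4, 3)
h = torch.hstack((x, y))                                        # (2, 6)
```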


### Slicing
- `torch.chunk(input, chunks)` : Splits a tensor into a specific number of chunks -- you specify how many chunks to create

- `torch.split(tensor, split_size)` : Splits the tensor into chunks -- you specify the size of each chunk

In [67]:
x = torch.arange(10).reshape(5, 2)
print(x)

torch.split(x, 2) # split along dimension 0 into chunks of size 2

tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])


(tensor([[0, 1],
         [2, 3]]),
 tensor([[4, 5],
         [6, 7]]),
 tensor([[8, 9]]))

In [68]:
torch.split(x, [1, 4]) # split along dimension 0 into chunks of size 1 and size 4

(tensor([[0, 1]]),
 tensor([[2, 3],
         [4, 5],
         [6, 7],
         [8, 9]]))
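
The cells above only exercise `torch.split`. A matching sketch for `torch.chunk` on the same tensor follows; it is an addition for comparison, not part of the original notebook.

```python
import torch

x = torch.arange(10).reshape(5, 2)

# chunk: ask for a number of chunks; each has ceil(5 / 3) = 2 rows except the last.
chunks = torch.chunk(x, 3)
chunk_rows = [c.size(0) for c in chunks]

# split: ask for a chunk size; the last chunk may be smaller.
parts = torch.split(x, 2)
part_rows = [p.size(0) for p in parts]

# Both accept dim to split along another dimension.
cols = torch.chunk(x, 2, dim=1)     # two tensors of shape (5, 1)
```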

### Squeezing
- `torch.squeeze(input, dim)` : Returns a tensor with all the dimensions of `input` of size 1 removed -- passing `dim` squeezes only that dimension

- `torch.unsqueeze(input, dim)` : Returns a new tensor with a dimension of size one inserted at the specified position

In [70]:
# torch.unsqueeze

x = torch.tensor([1, 2, 3, 4])
print(x, x.size())

z = torch.unsqueeze(x, 0) # insert a dimension of size 1 at position 0
print(z, z.size())

tensor([1, 2, 3, 4]) torch.Size([4])
tensor([[1, 2, 3, 4]]) torch.Size([1, 4])


In [71]:
print(x, x.size())

z = torch.unsqueeze(x, 1) # insert a dimension of size 1 at position 1
print(z, z.size())

tensor([1, 2, 3, 4]) torch.Size([4])
tensor([[1],
        [2],
        [3],
        [4]]) torch.Size([4, 1])


In [72]:
# torch.squeeze
# input : (A x 1 x B x C x 1) -> output : (A x B x C)

x = torch.rand(2, 1, 2, 1, 2)

print(x, x.size())

z = torch.squeeze(x)
print(z, z.size())

tensor([[[[[0.8429, 0.4166]],

          [[0.3356, 0.1121]]]],



        [[[[0.1181, 0.3395]],

          [[0.0169, 0.7799]]]]]) torch.Size([2, 1, 2, 1, 2])
tensor([[[0.8429, 0.4166],
         [0.3356, 0.1121]],

        [[0.1181, 0.3395],
         [0.0169, 0.7799]]]) torch.Size([2, 2, 2])


In [73]:
z = torch.squeeze(x, 1)
print(z, z.size())

tensor([[[[0.8429, 0.4166]],

         [[0.3356, 0.1121]]],


        [[[0.1181, 0.3395]],

         [[0.0169, 0.7799]]]]) torch.Size([2, 2, 1, 2])
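
A few edge cases of `squeeze` and `unsqueeze`, sketched with a small fixed tensor. Indexing with `None` is shown as an alternative spelling and is not mentioned in the original notes.

```python
import torch

x = torch.tensor([1, 2, 3, 4])

row = x.unsqueeze(0)         # shape (1, 4)
col = x.unsqueeze(-1)        # shape (4, 1); negative dims count from the end
also_row = x[None, :]        # indexing with None inserts a dimension as well

# squeeze on a dimension whose size is not 1 leaves the shape unchanged.
unchanged = col.squeeze(0)   # still (4, 1)
back = col.squeeze(1)        # shape (4,)
```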


### Initializing Tensors from Distributions
- `torch.nn.init.uniform_(tensor, a, b)` : fills the input tensor with values drawn from the uniform distribution $U(a, b)$

- `torch.nn.init.normal_(tensor, mean, std)` : fills the input tensor with values drawn from the normal distribution $N(\mu, \sigma ^2)$, where $\mu$ is `mean` and $\sigma$ is `std`

- `torch.nn.init.constant_(tensor, val)` : fills the input tensor with the value `val`

- `torch.nn.init.orthogonal_(tensor, gain)` : fills the input tensor with a (semi-)orthogonal matrix scaled by `gain`

- The trailing underscore marks these as in-place functions; the variants without it are deprecated

In [75]:
import torch.nn.init as init 

x1 = init.uniform_(torch.Tensor(3, 4), a=0, b=10)
x1

tensor([[3.5851, 5.1769, 0.1231, 1.4692],
        [5.1343, 5.6181, 1.4939, 3.2051],
        [4.5952, 9.1087, 1.3015, 9.8075]])

In [76]:
x2 = init.normal_(torch.Tensor(5, 5), mean=0, std=2)
x2

tensor([[ 0.7395,  2.5433,  0.5734, -1.9829,  3.3812],
        [-2.1930, -1.5088,  1.6891,  0.3959, -3.2373],
        [-0.1134,  1.1003,  0.8385,  2.4468, -3.4867],
        [-0.8667,  0.4876, -1.0692, -0.8668,  2.3320],
        [-2.1749, -0.4572, -0.4527, -1.0853, -3.2933]])

In [78]:
x3 = init.constant_(torch.Tensor(1, 4), val=10)
x3

tensor([[10., 10., 10., 10.]])

In [83]:
x4 = init.orthogonal_(torch.Tensor(3, 3), gain=1)
x4

tensor([[-0.4484,  0.2523, -0.8575],
        [-0.8012,  0.3120,  0.5107],
        [ 0.3964,  0.9160,  0.0623]])

In [85]:
# So the gain argument of init.orthogonal_ sets the norm of each row and column vector (here gain=1).

print(0.4484**2 + 0.2523**2 + 0.8575**2)

print(0.4484**2 + 0.8012**2 + 0.3964**2)


1.0000241
1.0001169600000002
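
The hand computation above can be generalized. The sketch below checks the row and column norms for a different `gain`; the value `2.0` and the use of `torch.empty` are choices made for illustration.

```python
import torch
import torch.nn.init as init

gain = 2.0
w = init.orthogonal_(torch.empty(3, 3), gain=gain)

row_norms = w.norm(dim=1)    # norm of each row, approximately gain
col_norms = w.norm(dim=0)    # norm of each column, approximately gain
gram = w @ w.T               # approximately gain**2 on the diagonal, 0 elsewhere
```

For a square matrix, `gram` being a multiple of the identity confirms that the rows are mutually orthogonal, not just of equal length.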


### Math Operation 기본
- `torch.mm(input, mat2)` : matrix multiplication -- for two matrices A and B, `A @ B` also works

- `torch.bmm(input, mat2)` : batched matrix multiplication -- for inputs of shape (batch size x a x b) and (batch size x b x c), computes all products efficiently in one call

- `torch.ceil(input)` : Returns a new tensor with the ceil of the elements of input, the smallest integer greater than or equal to each element -- i.e. rounds up; use `torch.floor()` to round down

- `torch.clamp(input, min, max)` : Clamps all elements in input into the range [min, max] and returns the resulting tensor

- `torch.eq(input, other)` : Returns a boolean tensor that is True where `input` is equal to `other` and False elsewhere -- compares the values at the same positions in the two tensors

- `torch.equal(input, other)` : Returns True if two tensors have the same size and elements, False otherwise -- checks whether the two tensors match in both size and values

- `torch.dot(input, tensor)` : Computes the dot product of two 1D tensors

- `torch.mv(input, vec)` : Computes the matrix-vector product

- `torch.matmul(input, other)` : Matrix product of two tensors -- chooses the equivalent of mm, mv, or dot based on the dimensions of the arguments
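
How `torch.matmul` behaves for different argument dimensions can be sketched as follows. The tensors and variable names are chosen for illustration; the batched case is included because `matmul` also handles 3D inputs.

```python
import torch

v = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([4.0, 5.0, 6.0])
A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = torch.arange(12, dtype=torch.float32).reshape(3, 4)
X = torch.randn(5, 2, 3)
Y = torch.randn(5, 3, 4)

vec_vec = torch.matmul(v, w)    # 1D @ 1D -> scalar, like torch.dot
mat_vec = torch.matmul(A, v)    # 2D @ 1D -> vector, like torch.mv
mat_mat = torch.matmul(A, B)    # 2D @ 2D -> matrix, like torch.mm
batched = torch.matmul(X, Y)    # 3D @ 3D -> batched product, like torch.bmm
```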

In [86]:
# torch.mm 

x = torch.rand(2, 2)
y = torch.rand(2, 3)

print(x)
print(y)
print(torch.mm(x, y))
print(x@y)

tensor([[0.3211, 0.8334],
        [0.4580, 0.5006]])
tensor([[0.3253, 0.3294, 0.9299],
        [0.7600, 0.8916, 0.5764]])
tensor([[0.7379, 0.8488, 0.7789],
        [0.5295, 0.5973, 0.7145]])
tensor([[0.7379, 0.8488, 0.7789],
        [0.5295, 0.5973, 0.7145]])


In [90]:
import time
# torch.bmm

x = torch.randn(128, 1024, 256)
y = torch.randn(128, 256, 1024)

start_time = time.perf_counter()
z = torch.matmul(x, y)
print(z.size())
end_time = time.perf_counter()
print(f'{int(round((end_time - start_time) * 1000))} ms')

start_time = time.perf_counter()
z = torch.bmm(x, y)
print(z.size())
end_time = time.perf_counter()
print(f'{int(round((end_time - start_time) * 1000))} ms')

torch.Size([128, 1024, 1024])
279 ms
torch.Size([128, 1024, 1024])
268 ms


In [92]:
# torch.ceil(), torch.floor(), torch.round()

x = torch.randn(1, 7)
print(x)
print(f'round : {torch.round(x)}')
print(f'ceil : {torch.ceil(x)}')
print(f'floor : {torch.floor(x)}')

tensor([[-0.5512, -2.2915,  0.9664,  0.0504, -0.0793, -0.1433,  0.4839]])
round : tensor([[-1., -2.,  1.,  0., -0., -0.,  0.]])
ceil : tensor([[-0., -2.,  1.,  1., -0., -0.,  1.]])
floor : tensor([[-1., -3.,  0.,  0., -1., -1.,  0.]])


In [95]:
# torch.clamp(), torch.clip() -- the two are the same function; clip is an alias of clamp

x = torch.randn(8)
print(x)
print(torch.clip(x, min=-1, max=1))
print(torch.clamp(x, min=-1.1, max=0))

tensor([-0.8317, -0.6619, -0.6736,  1.2548,  0.0311,  1.9451, -0.1193,  0.7805])
tensor([-0.8317, -0.6619, -0.6736,  1.0000,  0.0311,  1.0000, -0.1193,  0.7805])
tensor([-0.8317, -0.6619, -0.6736,  0.0000,  0.0000,  0.0000, -0.1193,  0.0000])


In [None]:
# torch.eq(), torch.equal() -- eq returns a boolean for each element, equal checks whether the tensors as a whole are the same
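
The cell above was left empty. A minimal sketch of the difference, using small hand-picked tensors, might look like this:

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[1, 0], [3, 4]])

elementwise = torch.eq(a, b)            # boolean tensor, same shape as the inputs
same_as_op = a == b                     # the == operator gives the same result
whole = torch.equal(a, b)               # a single Python bool
identical = torch.equal(a, a.clone())   # a copy with the same values compares equal
```

Here `elementwise` is False only at position (0, 1), where the two tensors differ, so `whole` is False.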

