# PyTorch Tensors

An in-depth introduction to ```torch.Tensor```.

In [2]:
import torch 
import math

# Creating tensors 

The simplest way to create a tensor:

In [3]:
x = torch.empty(3,4)
print(type(x))
print(x)

<class 'torch.Tensor'>
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])


- This tensor is 2-dimensional, with 3 rows and 4 columns.
- ```torch.Tensor``` is an alias for ```torch.FloatTensor```. PyTorch tensors are populated with 32-bit floating point numbers by default.
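
One caveat about the cell above: ```torch.empty()``` only allocates memory and does not initialize it, so the zeros in the output are incidental and other values may show up on another run. The following is a minimal sketch that checks only the shape and the default dtype, which do not depend on the contents:

```python
import torch

x = torch.empty(3, 4)             # allocated but NOT initialized; contents are arbitrary

print(x.shape)                    # 3 rows, 4 columns
print(x.dtype)                    # float32 unless the default dtype has been changed
print(torch.get_default_dtype())  # the dtype used when none is specified
```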

**Terminology**: 

- 1-dimensional tensor = vector 

- 2-dimensional tensor = *matrix*

- anything with more than two dimensions = tensor 
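
As a small illustration of the terminology above, the number of dimensions can be read from the ```ndim``` attribute. The variable names here are made up for the example:

```python
import torch

vector = torch.zeros(3)           # 1 dimension
matrix = torch.zeros(2, 3)        # 2 dimensions
tensor3d = torch.zeros(2, 2, 3)   # 3 dimensions

for name, t in [("vector", vector), ("matrix", matrix), ("tensor3d", tensor3d)]:
    print(name, t.ndim, tuple(t.shape))
```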

Initializing tensors with zeros, ones, or random values:

In [4]:
zeros = torch.zeros(2,3)
print(zeros)

ones = torch.ones(2, 3)
print(ones)

torch.manual_seed(1729)
random = torch.rand(2, 3)
print(random)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])


# Random Tensors & Seeding 

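For reproducibility, we can seed the random number generator. Re-seeding with the same value replays the same sequence, while a call made without re-seeding (```random4``` below) continues the sequence and produces different values: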

In [7]:
torch.manual_seed(1729)
random1 = torch.rand(2, 3)
print(random1)

torch.manual_seed(1729)
random2 = torch.rand(2, 3)
print(random2)

torch.manual_seed(1729)
random3 = torch.rand(2, 3)
print(random3)

random4 = torch.rand(2,3)
print(random4)

tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])
tensor([[0.3126, 0.3791, 0.3087],
        [0.0736, 0.4216, 0.0691]])
tensor([[0.2332, 0.4047, 0.2162],
        [0.9927, 0.4128, 0.5938]])
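
The same behaviour can be checked programmatically with ```torch.equal()``` instead of comparing printed values by eye. This sketch also uses a separate ```torch.Generator```, which is one way to get reproducible numbers without resetting the global seed. The variable names are illustrative only:

```python
import torch

torch.manual_seed(1729)
first = torch.rand(2, 3)
second = torch.rand(2, 3)         # continues the stream, so it differs from `first`

torch.manual_seed(1729)
replay = torch.rand(2, 3)         # same seed, same position in the stream

print(torch.equal(first, replay))   # True
print(torch.equal(first, second))   # False

# A local generator carries its own state and leaves the global RNG untouched.
gen = torch.Generator().manual_seed(1729)
local = torch.rand(2, 3, generator=gen)
print(local.shape)
```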


# Tensor Shapes 

Often we need two or more tensors with the same *shape*, that is, the same number of dimensions and the same number of cells in each dimension. The ```torch.*_like()``` methods achieve this:

In [11]:
x = torch.empty(2,2,3)
print(x.shape)
print(x)

empty_like_x = torch.empty_like(x)
print(empty_like_x.shape)
print(empty_like_x)

zeros_like_x = torch.zeros_like(x)
print(zeros_like_x.shape)
print(zeros_like_x)

ones_like_x = torch.ones_like(x)
print(ones_like_x.shape)
print(ones_like_x)

rand_like_x = torch.rand_like(x)
print(rand_like_x.shape)
print(rand_like_x)

torch.Size([2, 2, 3])
tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])
torch.Size([2, 2, 3])
tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])
torch.Size([2, 2, 3])
tensor([[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]])
torch.Size([2, 2, 3])
tensor([[[1., 1., 1.],
         [1., 1., 1.]],

        [[1., 1., 1.],
         [1., 1., 1.]]])
torch.Size([2, 2, 3])
tensor([[[0.6128, 0.1519, 0.0453],
         [0.5035, 0.9978, 0.3884]],

        [[0.6929, 0.1703, 0.1384],
         [0.4759, 0.7481, 0.0361]]])
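
Besides the shape, the ```*_like()``` functions also carry over the source tensor's dtype unless one is passed explicitly. A short sketch, using ```float64``` only to make the inheritance visible:

```python
import torch

x = torch.empty(2, 2, 3, dtype=torch.float64)

zeros_like_x = torch.zeros_like(x)                    # inherits shape and dtype
ones_int = torch.ones_like(x, dtype=torch.int32)      # same shape, dtype overridden

print(zeros_like_x.shape, zeros_like_x.dtype)
print(ones_int.shape, ones_int.dtype)
```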


# Tensor Data Types 

Setting the datatype of a tensor: 

In [12]:
a = torch.ones((2,3), dtype=torch.int16)
print(a)

b = torch.rand((2,3), dtype=torch.float64) * 20
print(b)

c = b.to(torch.int32)
print(c)

tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int16)
tensor([[ 0.9956,  1.4148,  5.8364],
        [11.2406, 11.2083, 11.6692]], dtype=torch.float64)
tensor([[ 0,  1,  5],
        [11, 11, 11]], dtype=torch.int32)
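
Note from the output above that converting floats to an integer dtype drops the fractional part rather than rounding. A minimal sketch with hand-picked values (not taken from the notebook) that includes negative numbers:

```python
import torch

f = torch.tensor([-1.7, -0.2, 0.9, 11.67], dtype=torch.float64)
i = f.to(torch.int32)             # truncates toward zero, does not round

print(i)                          # tensor([-1,  0,  0, 11], dtype=torch.int32)
print(torch.round(f).to(torch.int32))   # round first if rounding is what you want
```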


# Math & Logic with PyTorch Tensors

Basic arithmetic with PyTorch tensors, starting with how tensors interact with simple scalars:

In [16]:
ones = torch.zeros(2,2) + 1
twos = torch.ones(2,2) * 2
threes = (torch.ones(2,2) * 7 - 1) / 2
fours = twos ** 2
sqrt2s = twos ** 0.5

print(ones)
print(twos)
print(threes)
print(fours)
print(sqrt2s)

tensor([[1., 1.],
        [1., 1.]])
tensor([[2., 2.],
        [2., 2.]])
tensor([[3., 3.],
        [3., 3.]])
tensor([[4., 4.],
        [4., 4.]])
tensor([[1.4142, 1.4142],
        [1.4142, 1.4142]])


More arithmetic, this time between two tensors. The operations are applied element-wise:

In [19]:
powers2 = twos ** torch.tensor([[1, 2], [3, 4]])
print(powers2)

fives = ones + fours 
print(fives)

dozens = threes * fours 
print(dozens)

tensor([[ 2.,  4.],
        [ 8., 16.]])
tensor([[5., 5.],
        [5., 5.]])
tensor([[12., 12.],
        [12., 12.]])


# Tensor Broadcasting

In general, binary operations cannot be performed on tensors of different shapes. The exception to this is *broadcasting*:

In [20]:
rand = torch.rand(2,4)
doubled = rand * (torch.ones(1,4) * 2)

print(rand)
print(doubled)


tensor([[0.6146, 0.5999, 0.5013, 0.9397],
        [0.8656, 0.5207, 0.6865, 0.3614]])
tensor([[1.2291, 1.1998, 1.0026, 1.8793],
        [1.7312, 1.0413, 1.3730, 0.7228]])


The (1, 4) tensor is multiplied by *both* rows of the (2, 4) tensor.

This situation shows up in Deep Learning when we need to multiply a tensor of learning weights by a *batch* of input tensors.

Rules for broadcasting:
- Each tensor must have at least one dimension (no empty tensors).
- Comparing the dimension sizes of the two tensors, *going from last to first*:
  - each dimension must be equal, *or*
  - one of the dimensions must be of size 1, *or*
  - the dimension does not exist in one of the tensors.
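
The dimension-comparison rules above can be sketched in plain Python. The helper ```broadcast_shape``` below is hypothetical, written only to illustrate the rules; it is not part of PyTorch. Its result is compared against the built-in ```torch.broadcast_shapes()```:

```python
import torch

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the rules above, comparing dims from last to first."""
    result = []
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1   # a missing dim acts like size 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"incompatible sizes {dim_a} and {dim_b}")
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))

print(broadcast_shape((4, 3, 2), (3, 1)))               # (4, 3, 2)
print(tuple(torch.broadcast_shapes((4, 3, 2), (3, 1)))) # same answer from PyTorch
```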

Examples of broadcasting: 

In [22]:
a = torch.ones(4, 3, 2)

b = a * torch.rand(  3, 2) # 3rd & 2nd dims are identical to a, dim 1 is absent
print(b)

c = a * torch.rand(  3, 1) # 3rd dim = 1, 2nd dim is identical to a
print(c)

d = a * torch.rand(  1, 2) # 3rd dim is identical to a, 2nd dim = 1
print(d)
print(d)

tensor([[[0.0381, 0.2138],
         [0.5395, 0.3686],
         [0.4007, 0.7220]],

        [[0.0381, 0.2138],
         [0.5395, 0.3686],
         [0.4007, 0.7220]],

        [[0.0381, 0.2138],
         [0.5395, 0.3686],
         [0.4007, 0.7220]],

        [[0.0381, 0.2138],
         [0.5395, 0.3686],
         [0.4007, 0.7220]]])
tensor([[[0.8217, 0.8217],
         [0.2612, 0.2612],
         [0.7375, 0.7375]],

        [[0.8217, 0.8217],
         [0.2612, 0.2612],
         [0.7375, 0.7375]],

        [[0.8217, 0.8217],
         [0.2612, 0.2612],
         [0.7375, 0.7375]],

        [[0.8217, 0.8217],
         [0.2612, 0.2612],
         [0.7375, 0.7375]]])
tensor([[[0.8328, 0.8444],
         [0.8328, 0.8444],
         [0.8328, 0.8444]],

        [[0.8328, 0.8444],
         [0.8328, 0.8444],
         [0.8328, 0.8444]],

        [[0.8328, 0.8444],
         [0.8328, 0.8444],
         [0.8328, 0.8444]],

        [[0.8328, 0.8444],
         [0.8328, 0.8444],
         [0.8328, 0.8444]]])
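
For contrast, here is a sketch of shapes that break the rules. The shapes are chosen for illustration; in both cases the trailing dimensions differ and neither is 1, so PyTorch raises a ```RuntimeError```:

```python
import torch

a = torch.ones(4, 3, 2)
failures = []

for shape in [(4, 3), (2, 3)]:    # last dim is 3, but a's last dim is 2
    try:
        a * torch.rand(*shape)
    except RuntimeError as err:
        failures.append(shape)
        print(f"shape {shape} cannot be broadcast with {tuple(a.shape)}: {err}")
```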


# More Math with Tensors 

Major categories of tensor operations:

In [26]:
# common functions
a = torch.rand(2, 4) * 2 - 1
print("common functions: ")
print(torch.abs(a))
print(torch.ceil(a))
print(torch.floor(a))
print(torch.clamp(a, -0.5, 0.5))

# trig functions and their inverses
angles = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
sines = torch.sin(angles)
inverses = torch.asin(sines)
print('\nSine and arcsine:')
print(angles)
print(sines)
print(inverses)

# bitwise operations
print('\nBitwise XOR:')
b = torch.tensor([1, 5, 11])
c = torch.tensor([2, 7, 10])
print(torch.bitwise_xor(b, c))

# comparisons:
print('\nBroadcasted, element-wise, equality comparisons:')
d = torch.tensor([[1., 2.], [3., 4.]])
e = torch.ones(1, 2) 
print(torch.eq(d, e)) # returns tensor of type bool

# reductions:
print('\nReduction ops:')
print(torch.max(d)) # returns single element tensor
print(torch.max(d).item()) # extracts the value from the returned tensor 
print(torch.mean(d)) # average
print(torch.std(d)) # standard deviation
print(torch.prod(d)) # product of all numbers 
print(torch.unique(torch.tensor([1, 2, 1, 2, 1, 2]))) # filter unique elements

# vector and linear alg operations 
v1 = torch.tensor([1., 0., 0.])     # x unit vector 
v2 = torch.tensor([0., 1., 0.])     # y unit vector
m1 = torch.rand(2, 2)               # random matrix
m2 = torch.tensor([[3., 0.], [0., 3.]]) # three times identity matrix

print('\nVectors & Matrices:')
print(torch.cross(v2, v1)) # negative of z unit vector (v1 x v2 == -v2 x v1)
print(m1)
m3 = torch.matmul(m1, m2)
print(m3)                          # 3 times m1
print(torch.svd(m3))               # singular value decomposition 

common functions: 
tensor([[0.2648, 0.8928, 0.9773, 0.0365],
        [0.9614, 0.3090, 0.1713, 0.8608]])
tensor([[1., 1., -0., 1.],
        [1., 1., -0., -0.]])
tensor([[ 0.,  0., -1.,  0.],
        [ 0.,  0., -1., -1.]])
tensor([[ 0.2648,  0.5000, -0.5000,  0.0365],
        [ 0.5000,  0.3090, -0.1713, -0.5000]])

Sine and arcsine:
tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7854, 1.5708, 0.7854])

Bitwise XOR:
tensor([3, 2, 1])

Broadcasted, element-wise, equality comparisons:
tensor([[ True, False],
        [False, False]])

Reduction ops:
tensor(4.)
4.0
tensor(2.5000)
tensor(1.2910)
tensor(24.)
tensor([1, 2])

Vectors & Matrices:
tensor([ 0.,  0., -1.])
tensor([[0.4648, 0.4491],
        [0.6265, 0.9411]])
tensor([[1.3944, 1.3473],
        [1.8796, 2.8234]])
torch.return_types.svd(
U=tensor([[-0.4918, -0.8707],
        [-0.8707,  0.4918]]),
S=tensor([3.8902, 0.3610]),
V=tensor([[-0.5970, -0.8023],
        [-0.8023,  0.5970]]))


Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/Cross.cpp:67.)
  print(torch.cross(v2, v1)) # negative of z unit vector (v1 x v2 == -v2 x v1)
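
The deprecation warning printed above offers two ways out: pass ```dim``` explicitly or use ```torch.linalg.cross```. A minimal sketch of both, assuming a PyTorch version recent enough to include ```torch.linalg.cross```:

```python
import torch

v1 = torch.tensor([1., 0., 0.])   # x unit vector
v2 = torch.tensor([0., 1., 0.])   # y unit vector

via_linalg = torch.linalg.cross(v2, v1)     # dim defaults to -1
via_dim = torch.cross(v2, v1, dim=-1)       # explicit dim silences the warning

print(via_linalg)                 # negative of the z unit vector
print(via_dim)
```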


Tensors can also be modified in place, which avoids allocating a new tensor and can save memory. In-place functions end with an underscore (```_```):

In [27]:
a = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
print(f'a: {a}')
print(torch.sin(a))         # this operation creates a new tensor in memory
print(a)                    # a has not changed 

b = torch.tensor([0, math.pi / 4, math.pi / 2, 3 * math.pi / 4])
print(f'\nb: {b}')
print(torch.sin_(b))       # b has changed
print(b)                   
 

a: tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7854, 1.5708, 2.3562])

b: tensor([0.0000, 0.7854, 1.5708, 2.3562])
tensor([0.0000, 0.7071, 1.0000, 0.7071])
tensor([0.0000, 0.7071, 1.0000, 0.7071])


Arithmetic operations work similarly:

In [30]:
a = torch.ones(2, 2)
b = torch.rand(2, 2)

print('before')
print(a)
print(b)
print('\nAfter adding:')
print(a.add_(b))
print(a)
print(b)
print('\nAfter Multiplying:')
print(b.mul_(b))
print(b)

before
tensor([[1., 1.],
        [1., 1.]])
tensor([[0.7765, 0.3534],
        [0.7016, 0.6826]])

After adding:
tensor([[1.7765, 1.3534],
        [1.7016, 1.6826]])
tensor([[1.7765, 1.3534],
        [1.7016, 1.6826]])
tensor([[0.7765, 0.3534],
        [0.7016, 0.6826]])

After Multiplying:
tensor([[0.6030, 0.1249],
        [0.4923, 0.4660]])
tensor([[0.6030, 0.1249],
        [0.4923, 0.4660]])


In-place arithmetic functions are methods on the ```torch.Tensor``` object, not functions attached to the ```torch``` module like many others (e.g. ```torch.sin()```).

As ```a.add_(b)``` shows, the *calling* tensor is the one that gets changed in place.
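
One way to confirm which tensor is modified is to check object identity and the address of the underlying data. This is a small sketch; ```data_ptr()``` returns the memory address of the tensor's first element:

```python
import torch

a = torch.ones(2, 2)
b = torch.rand(2, 2)
ptr_before = a.data_ptr()

result = a.add_(b)                # in place: returns the calling tensor itself
print(result is a)                # True
print(a.data_ptr() == ptr_before) # True, same memory as before

c = a + b                         # out of place: allocates a new tensor
print(c is a)                     # False
```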

There is another option for placing the result of a computation in an existing, already allocated tensor. Many of the methods and functions seen so far, including creation methods, have an ```out``` argument that lets us specify a tensor to receive the output. If the ```out``` tensor has the correct shape and ```dtype```, this can happen without a new memory allocation:

In [9]:
a = torch.rand(2, 2)
b = torch.rand(2, 2)
c = torch.rand(2, 2)
old_id = id(c)

print(f'original c: {c}')
d = torch.matmul(a, b, out=c)
print(f'changed c: {c}')        # contents of c changed

assert c is d                   # c and d are the same object, not just equal in value
assert id(c) == old_id          # c is still the same object as before the matmul

torch.rand(2, 2, out=c)         # out also works for creation functions
print(f'changed c again: {c}')  # c has changed again
assert id(c) == old_id          # still the same object

original c: tensor([[0.5627, 0.1740],
        [0.4151, 0.4759]])
changed c: tensor([[0.5866, 0.4727],
        [0.6570, 0.3235]])
changed c again: tensor([[0.1544, 0.3413],
        [0.5993, 0.6036]])
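
Comparing ```id()``` values only shows that the Python object is unchanged. To check that the underlying memory is reused as well, one option is to compare ```data_ptr()``` before and after. A sketch using ```torch.add``` as the example operation:

```python
import torch

a = torch.rand(2, 2)
b = torch.rand(2, 2)
c = torch.zeros(2, 2)             # pre-allocated, with matching shape and dtype
buffer_before = c.data_ptr()

torch.add(a, b, out=c)            # result is written into c's existing memory

print(c.data_ptr() == buffer_before)   # True, no new allocation for c
print(torch.equal(c, a + b))           # True, c holds the sum
```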


# Copying Tensors 
Like any Python object, assigning a tensor to a variable makes the variable a *label* for the tensor and does not copy it. For example:

In [13]:
a = torch.ones(2, 2)
b = a 

a[0][1] = 561           # make a change to tensor a 
print(f'change present in python variable b:\n{b}')                # b is also altered 

change present in python variable b:
tensor([[  1., 561.],
        [  1.,   1.]])


To get a separate copy of the data, use ```clone()```:

In [18]:
a = torch.ones(2, 2)
b = a.clone()

assert b is not a           # different objects in memory
print(torch.eq(a,b))        # b has same contents as a 

a[0][1] = 561               # a changes 
print(f'tensor variable b remains unchanged:\n{b}')

tensor([[True, True],
        [True, True]])
tensor variable b remains unchanged:
tensor([[1., 1.],
        [1., 1.]])
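
The difference between a label and a clone can also be seen side by side by comparing identity and data addresses. The names ```alias``` and ```copy``` are illustrative:

```python
import torch

a = torch.ones(2, 2)
alias = a                         # second label for the same tensor
copy = a.clone()                  # new tensor with its own memory

print(alias is a, alias.data_ptr() == a.data_ptr())   # True True
print(copy is a, copy.data_ptr() == a.data_ptr())     # False False

a[0, 1] = 561
print(alias[0, 1].item())         # 561.0, the alias sees the change
print(copy[0, 1].item())          # 1.0, the clone does not
```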


**Important note when using ```clone()```:** if the source tensor has autograd enabled, then so will the clone. In many cases this is what we want. For example, if the model has multiple computation paths in its ```forward()``` method, and *both* the original tensor and its clone contribute to the model's output, then to enable learning we want autograd turned on for both tensors.

On the other hand, if we're doing a computation where *neither* the original tensor nor its clone needs to track gradients, then as long as the source tensor has autograd turned off, we are fine.

There is a *third case*: we are performing a computation in the model's ```forward()``` function, where gradients are turned on for everything by default, but we want to pull out some values mid-stream to generate metrics. In this case we *don't* want the cloned copy to track gradients, since performance improves with autograd's history tracking turned off. For this we can use the ```.detach()``` method on the source tensor:

In [9]:
a = torch.rand(2, 2, requires_grad=True)         # turn on autograd 
print(f'original tensor:\n{a}')

b = a.clone()
print(f'cloned tensor:\n{b}')

c = a.detach().clone()
print(f'cloned tensor detached:\n{c}')

print(f'original tensor:\n{a}')

original tensor:
tensor([[0.8553, 0.8856],
        [0.5618, 0.9886]], requires_grad=True)
cloned tensor:
tensor([[0.8553, 0.8856],
        [0.5618, 0.9886]], grad_fn=<CloneBackward0>)
cloned tensor detached:
tensor([[0.8553, 0.8856],
        [0.5618, 0.9886]])
original tensor:
tensor([[0.8553, 0.8856],
        [0.5618, 0.9886]], requires_grad=True)


### Explanation
- ```a``` is created with ```requires_grad=True```, which turns on autograd and computation-history tracking. The property is shown when ```a``` is printed.

- We clone ```a``` and label the result ```b```. Printing ```b``` shows that it is tracking its computation history (```grad_fn=<CloneBackward0>```): it has inherited ```a```'s autograd settings, and the clone operation has been added to the history.

- ```c``` is created with ```a.detach().clone()```. Printing it shows no computation history and no ```requires_grad=True```, because ```.detach()``` separates the tensor from its history before the clone is made.
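
The points above can be verified directly through the ```requires_grad``` attribute rather than by reading printed output. The sketch below also shows why ```.clone()``` is chained after ```.detach()```: on its own, ```.detach()``` returns a tensor that shares memory with the source. The name ```view``` is illustrative:

```python
import torch

a = torch.rand(2, 2, requires_grad=True)
b = a.clone()                     # tracked by autograd
c = a.detach().clone()            # independent copy, not tracked
view = a.detach()                 # not tracked, but shares memory with a

print(a.requires_grad, b.requires_grad, c.requires_grad)   # True True False
print(view.data_ptr() == a.data_ptr())                     # True
print(c.data_ptr() == a.data_ptr())                        # False

b.sum().backward()                # gradients flow back through the clone to a
print(a.grad)                     # all ones
```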