**PyTorch**

PyTorch is especially effective if you have a GPU available and CUDA installed. Large matrix calculations (e.g. matrix multiplications) then run much faster than they would on the CPU.



In [5]:
import torch
import numpy as np

if torch.cuda.is_available():
    dev="cuda:0"
    print("cuda available")
    # These queries are only valid when CUDA is present.
    print(torch.cuda.get_device_name(0))
    t = torch.cuda.get_device_properties(0).total_memory   # total device memory in bytes
    c = torch.cuda.memory_reserved(0)                      # bytes reserved by the caching allocator
    a = torch.cuda.memory_allocated(0)                     # bytes currently occupied by tensors
    print(t)
    print(c)
    print(a)
else:
    dev="cpu"
    print("cuda not available")

cuda not available

**Tensors**

In PyTorch, tensors are the basic objects to work with. They are essentially multidimensional arrays.

In [2]:
T=torch.tensor([[1.,2.,3.],[4.,5.,6.]])
print(T)
print(T.size())

T=torch.tensor([[[1.,2.,3.],[4.,5.,6.]],[[7.,8.,9.],[10.,11.,12.]]])
print(T)
print(T.size())

print(T[0,1,0])

tensor([[1., 2., 3.],
        [4., 5., 6.]])
torch.Size([2, 3])
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],

        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor(4.)
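
The shape, number of dimensions, and element type of a tensor can be inspected directly. The sketch below reuses the 2 x 2 x 3 tensor from the cell above; the names row and column are illustrative only, and the float32 result assumes PyTorch's default dtype has not been changed.

```python
import torch

T = torch.tensor([[[1., 2., 3.], [4., 5., 6.]],
                  [[7., 8., 9.], [10., 11., 12.]]])

print(T.shape)   # same information as T.size()
print(T.dim())   # number of dimensions
print(T.dtype)   # Python floats become torch.float32 by default

# Indexing follows NumPy conventions: block 0, row 1, column 0.
print(T[0, 1, 0].item())   # .item() converts a one-element tensor to a Python number

# Indexing with fewer indices, or with slices, returns smaller tensors.
row = T[1, 0]          # first row of the second block, shape (3,)
column = T[:, :, 2]    # last column of every block, shape (2, 2)
print(row)
print(column)
```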


**Basic tensor vector operations**

Tensors of the same shape can be added elementwise.

In [3]:
import numpy as np
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)

L2=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T2=torch.tensor(L2)

print(T1)
print(T1.size())

print(T2)
print(T2.size())

T3=T1+T2
print(T3)

tensor([[[3., 2.],
         [4., 6.]],

        [[2., 6.],
         [5., 5.]],

        [[8., 9.],
         [5., 6.]]], dtype=torch.float64)
torch.Size([3, 2, 2])
tensor([[[8., 8.],
         [1., 8.]],

        [[4., 5.],
         [3., 6.]],

        [[3., 2.],
         [6., 8.]]], dtype=torch.float64)
torch.Size([3, 2, 2])
tensor([[[11., 10.],
         [ 5., 14.]],

        [[ 6., 11.],
         [ 8., 11.]],

        [[11., 11.],
         [11., 14.]]], dtype=torch.float64)
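
The dtype=torch.float64 in the output above comes from NumPy: its float arrays are double precision, and torch.tensor keeps the dtype of the array it copies. A minimal sketch of this behaviour follows; the small array A is made up for illustration.

```python
import numpy as np
import torch

A = np.array([[1., 2.], [3., 4.]])            # NumPy floats are float64

T64 = torch.tensor(A)                         # copies the data and keeps float64
T32 = torch.tensor(A, dtype=torch.float32)    # copies and converts to float32
print(T64.dtype, T32.dtype)

# Addition is elementwise.
S = T64 + T64
print(S)

# Mixing float32 and float64 tensors promotes the result to float64.
print((T64 + T32).dtype)
```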


We can perform scalar multiplication.

In [4]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)
print(T1)
T2=5.*T1
print(T2)

tensor([[[1., 2.],
         [1., 8.]],

        [[6., 4.],
         [6., 7.]],

        [[1., 4.],
         [3., 2.]]], dtype=torch.float64)
tensor([[[ 5., 10.],
         [ 5., 40.]],

        [[30., 20.],
         [30., 35.]],

        [[ 5., 20.],
         [15., 10.]]], dtype=torch.float64)


**Coordinatewise multiplication**

In [5]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T2=torch.tensor(L2)
print(T2)

T3=T1*T2
print(T3)

tensor([[[4., 3.],
         [1., 5.]],

        [[2., 2.],
         [6., 4.]],

        [[2., 6.],
         [4., 4.]]], dtype=torch.float64)
tensor([[[5., 9.],
         [1., 3.]],

        [[5., 6.],
         [1., 7.]],

        [[1., 0.],
         [5., 6.]]], dtype=torch.float64)
tensor([[[20., 27.],
         [ 1., 15.]],

        [[10., 12.],
         [ 6., 28.]],

        [[ 2.,  0.],
         [20., 24.]]], dtype=torch.float64)


**Matrix multiplication**

A matrix is a 2-d tensor, and torch.mm performs matrix multiplication: an M x N matrix times an N x P matrix gives an M x P matrix.

In [6]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(2,4))
T2=torch.tensor(L2)
print(T2)

T3=torch.mm(T1,T2)
print(T3)

tensor([[4., 9.],
        [6., 2.],
        [8., 7.]], dtype=torch.float64)
tensor([[2., 8., 8., 6.],
        [1., 7., 9., 8.]], dtype=torch.float64)
tensor([[ 17.,  95., 113.,  96.],
        [ 14.,  62.,  66.,  52.],
        [ 23., 113., 127., 104.]], dtype=torch.float64)
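
As a check on the output above, a single entry of the product can be recomputed by hand. The sketch below copies the two printed matrices (as float32 rather than float64, which does not matter for these small integers) and compares entry (0, 0) with the dot product of row 0 of the first matrix and column 0 of the second.

```python
import torch

# The matrices printed in the output above.
A = torch.tensor([[4., 9.], [6., 2.], [8., 7.]])          # 3 x 2
B = torch.tensor([[2., 8., 8., 6.], [1., 7., 9., 8.]])    # 2 x 4

C = torch.mm(A, B)                                        # 3 x 4

# Entry (0, 0) is row 0 of A dotted with column 0 of B: 4*2 + 9*1 = 17.
entry = torch.dot(A[0], B[:, 0])
print(entry.item(), C[0, 0].item())

# The inner dimensions must agree; torch.mm(B, A) would raise a RuntimeError
# because B has 4 columns while A has 3 rows.
print(C.shape)
```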


**Batch matrix multiplication**

Batch matrix multiplication refers to matrix multiplication of two "batches" of matrices.

A batch of K M x N matrices is a tensor of size K x M x N.

We can batch multiply it by another batch, a K x N x P tensor, using torch.bmm. The result is a K x M x P tensor.
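
To make the shape rule concrete, the following sketch checks that torch.bmm gives the same result as multiplying the matrices one pair at a time. The sizes and the random inputs are arbitrary choices for illustration.

```python
import torch

K, M, N, P = 5, 3, 2, 4
torch.manual_seed(0)                 # fixed seed so the run is repeatable
A = torch.rand(K, M, N)              # batch of K matrices, each M x N
B = torch.rand(K, N, P)              # batch of K matrices, each N x P

C = torch.bmm(A, B)                  # batch of K matrices, each M x P
print(C.shape)

# bmm multiplies matching slices: C[k] = A[k] @ B[k] for every k.
looped = torch.stack([torch.mm(A[k], B[k]) for k in range(K)])
print(torch.allclose(C, looped, atol=1e-6))
```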

In [7]:
L1=np.random.choice([float(i) for i in range(10)],size=(5,3,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(5,2,4))
T2=torch.tensor(L2)
print(T2)

T3=torch.bmm(T1,T2)
print(T3)

tensor([[[2., 6.],
         [2., 1.],
         [6., 1.]],

        [[0., 9.],
         [6., 2.],
         [3., 8.]],

        [[6., 5.],
         [3., 7.],
         [6., 8.]],

        [[3., 0.],
         [2., 2.],
         [7., 1.]],

        [[3., 6.],
         [8., 3.],
         [7., 7.]]], dtype=torch.float64)
tensor([[[9., 4., 1., 7.],
         [0., 4., 5., 1.]],

        [[4., 9., 1., 5.],
         [4., 1., 5., 3.]],

        [[9., 8., 9., 6.],
         [4., 1., 3., 7.]],

        [[1., 5., 0., 6.],
         [7., 5., 0., 1.]],

        [[1., 6., 3., 4.],
         [8., 5., 1., 1.]]], dtype=torch.float64)
tensor([[[18., 32., 32., 20.],
         [18., 12.,  7., 15.],
         [54., 28., 11., 43.]],

        [[36.,  9., 45., 27.],
         [32., 56., 16., 36.],
         [44., 35., 43., 39.]],

        [[74., 53., 69., 71.],
         [55., 31., 48., 67.],
         [86., 56., 78., 92.]],

        [[ 3., 15.,  0., 18.],
         [16., 20.,  0., 14.],
         [14., 40.,  0., 43.]],

        [[51., 48., 15., 18.],
         [32., 63., 27., 35.],
         [63., 77., 28., 35.]]], dtype=torch.float64)

**matmul**

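torch.matmul allows for more general matrix products. It treats the last two dimensions of each argument as matrices and broadcasts over the remaining (batch) dimensions.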

In [3]:
L1=np.random.choice([float(i) for i in range(10)],size=(5,3,2))
T1=torch.tensor(L1)
L2=np.random.choice([float(i) for i in range(10)],size=(5,2,4))
T2=torch.tensor(L2)
T3=torch.matmul(T1,T2)
print(T3.size())

L1=np.random.choice([float(i) for i in range(10)],size=(10,5,3,2,4))
T1=torch.tensor(L1)
L2=np.random.choice([float(i) for i in range(10)],size=(4,10))
T2=torch.tensor(L2)

T3=torch.matmul(T1,T2)
print(T3.size())

torch.Size([5, 3, 4])
torch.Size([10, 5, 3, 2, 10])
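
The second result above comes from broadcasting: the single 4 x 10 matrix is multiplied against every 2 x 4 matrix in the 10 x 5 x 3 batch. The sketch below repeats this with tensors of ones, so the shapes are easy to follow, and adds the vector cases that torch.matmul also handles.

```python
import torch

A = torch.ones(10, 5, 3, 2, 4)   # batch dimensions (10, 5, 3), matrices 2 x 4
B = torch.ones(4, 10)            # a single 4 x 10 matrix

# B is broadcast across all batch dimensions of A.
AB = torch.matmul(A, B)
print(AB.shape)                  # batch (10, 5, 3), matrices 2 x 10

# 1-D operands are treated as vectors.
v = torch.ones(4)
Av = torch.matmul(A, v)          # matrix-vector product for every batch entry
vv = torch.matmul(v, v)          # two vectors give a scalar dot product
print(Av.shape)
print(vv)

# The @ operator is shorthand for torch.matmul.
print(torch.equal(A @ B, AB))
```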


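We can time a large matrix product to compare NumPy on the CPU with PyTorch on the device dev selected earlier. The comparison only shows the advantage of the GPU when dev is a CUDA device; if CUDA is not available, the second timing is simply PyTorch running on the CPU.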

In [4]:
import time
import torch
import numpy as np

K=2500
L=1000
M=1000

X=np.random.normal(0,1,(K,L))
Y=np.random.normal(0,1,(L,M))

start_time=time.perf_counter()
np.matmul(X,Y)
end_time=time.perf_counter()
print(end_time-start_time)

XT=torch.tensor(X)
YT=torch.tensor(Y)

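# Move the operands to the device chosen earlier (dev is "cuda:0" or "cpu").
XG=XT.to(dev)
YG=YT.to(dev)

# CUDA kernels run asynchronously, so synchronize before reading the clock.
if XG.is_cuda:
    torch.cuda.synchronize()
start_time=time.perf_counter()
ZG=torch.matmul(XG,YG)
if XG.is_cuda:
    torch.cuda.synchronize()
end_time=time.perf_counter()
print(end_time-start_time)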

0.12348449998535216
2.861968517303467
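
A single timed call can be misleading: the first operation on a CUDA device includes one-time initialization, and CUDA kernels are launched asynchronously, so the clock may stop before the work has finished. The sketch below is one possible way to get steadier numbers. The helper time_matmul and its repeats parameter are hypothetical names introduced here, not part of PyTorch.

```python
import time
import torch

def time_matmul(a, b, repeats=5):
    """Average wall-clock seconds for torch.matmul(a, b) (hypothetical helper)."""
    on_gpu = a.is_cuda
    torch.matmul(a, b)               # warm-up run, excluded from the timing
    if on_gpu:
        torch.cuda.synchronize()     # wait for queued GPU work to finish
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if on_gpu:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

device = "cuda:0" if torch.cuda.is_available() else "cpu"
x = torch.randn(2500, 1000, dtype=torch.float64)
y = torch.randn(1000, 1000, dtype=torch.float64)

print("cpu:", time_matmul(x, y))
if device != "cpu":
    print(device + ":", time_matmul(x.to(device), y.to(device)))
```

The transfers with .to(device) happen outside the timed region, so the numbers measure only the multiplication itself.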


**Outer product**

In [89]:
v1 = torch.arange(1, 4)    # Size 3
v2 = torch.arange(1, 3)    # Size 2
r = torch.outer(v1, v2)    # torch.ger is a deprecated alias of torch.outer
print(v1)
print(v2)
print(r)

tensor([1, 2, 3])
tensor([1, 2])
tensor([[1, 2],
        [2, 4],
        [3, 6]])
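
The outer product has entries r[i, j] = v1[i] * v2[j], so it can also be written as an elementwise product of a column against a row. A minimal sketch of that equivalence, using the same vectors as above (the name b is illustrative):

```python
import torch

v1 = torch.arange(1, 4)          # tensor([1, 2, 3])
v2 = torch.arange(1, 3)          # tensor([1, 2])

r = torch.outer(v1, v2)          # r[i, j] = v1[i] * v2[j]

# The same result via broadcasting: a 3 x 1 column times a 1 x 2 row.
b = v1[:, None] * v2[None, :]

print(r)
print(torch.equal(r, b))
```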
