**PyTorch**

PyTorch is especially effective if you have a GPU available and CUDA installed. Large-scale matrix calculations (e.g. matrix multiplications) can then be done more quickly than on the CPU.



In [1]:
import torch
import numpy as np

if torch.cuda.is_available():
    dev="cuda:0"
    print("cuda available")
    # These queries need a CUDA device, so they only run when one exists.
    print(torch.cuda.get_device_name(0))
    t = torch.cuda.get_device_properties(0).total_memory  # total memory in bytes
    c = torch.cuda.memory_reserved(0)   # bytes reserved by the caching allocator
    a = torch.cuda.memory_allocated(0)  # bytes currently occupied by tensors
    print(t)
    print(c)
    print(a)
else:
    dev="cpu"
    print("cuda not available")

cuda not available
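
Once a device has been chosen, tensors have to be placed on it before they can take part in a calculation there. The following is a minimal sketch, assuming nothing beyond a standard PyTorch install; the names `device`, `a`, `b` and `c` are only illustrative and are not used elsewhere in this notebook.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Create one tensor directly on the chosen device and move another one there.
a = torch.ones(2, 3, device=device)
b = torch.zeros(2, 3).to(device)

# Both operands live on the same device, and so does the result.
c = a + b
print(c.device.type)
```

The same code runs on machines with and without a GPU, which is why the fallback to "cpu" is useful.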

**Tensors**

In PyTorch, tensors are the basic objects to work with. A tensor is a multidimensional array.

In [2]:
T=torch.tensor([[1.,2.,3.],[4.,5.,6.]])
print(T)
print(T.size())

T=torch.tensor([[[1.,2.,3.],[4.,5.,6.]],[[7.,8.,9.],[10.,11.,12.]]])
print(T)
print(T.size())

print(T[0,1,0])

tensor([[1., 2., 3.],
        [4., 5., 6.]])
torch.Size([2, 3])
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],

        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
torch.Size([2, 2, 3])
tensor(4.)
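
As a small illustration of the "multidimensional array" view, the sketch below reuses the 2 x 2 x 3 tensor from the cell above and takes a few slices of it. The variable names `first_block`, `last_column` and `value` are made up for this example.

```python
import torch

T = torch.tensor([[[1., 2., 3.], [4., 5., 6.]],
                  [[7., 8., 9.], [10., 11., 12.]]])

print(T.dim())    # number of axes
print(T.shape)    # same information as T.size()

first_block = T[0]            # the first 2 x 3 matrix
last_column = T[:, :, -1]     # last entry of every row, shape 2 x 2
value = T[0, 1, 0].item()     # a single entry as a Python float

print(first_block)
print(last_column)
print(value)
```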


**Basic tensor vector operations**

We can add tensors of the same shape; the sum is taken entry by entry.

In [5]:
import numpy as np
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)

L2=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T2=torch.tensor(L2)

print(T1)
print(T1.size())

print(T2)
print(T2.size())

T3=T1+T2
print(T3)

tensor([[[6., 7.],
         [9., 3.]],

        [[8., 9.],
         [6., 4.]],

        [[5., 7.],
         [9., 5.]]], dtype=torch.float64)
torch.Size([3, 2, 2])
tensor([[[8., 0.],
         [9., 0.]],

        [[0., 0.],
         [8., 3.]],

        [[7., 3.],
         [4., 8.]]], dtype=torch.float64)
torch.Size([3, 2, 2])
tensor([[[14.,  7.],
         [18.,  3.]],

        [[ 8.,  9.],
         [14.,  7.]],

        [[12., 10.],
         [13., 13.]]], dtype=torch.float64)
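
The `dtype=torch.float64` in the output appears because the tensors were built from NumPy arrays of Python floats, which are double precision. The sketch below (with illustrative names `from_list` and `from_numpy`) shows what happens when such a tensor is added to one built from a plain Python list, assuming PyTorch's default dtype has not been changed.

```python
import numpy as np
import torch

from_list = torch.tensor([[1., 2.], [3., 4.]])                  # default dtype
from_numpy = torch.tensor(np.array([[10., 20.], [30., 40.]]))   # NumPy's float64

# The + operator and torch.add do the same entrywise sum.
total = from_list + from_numpy

print(from_list.dtype, from_numpy.dtype)
print(total)
print(total.dtype)   # the sum is promoted to the wider of the two dtypes
```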


We can perform scalar multiplication.

In [6]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)
print(T1)
T2=5.*T1
print(T2)

tensor([[[6., 5.],
         [6., 9.]],

        [[3., 1.],
         [4., 7.]],

        [[2., 1.],
         [4., 5.]]], dtype=torch.float64)
tensor([[[30., 25.],
         [30., 45.]],

        [[15.,  5.],
         [20., 35.]],

        [[10.,  5.],
         [20., 25.]]], dtype=torch.float64)


**Coordinatewise multiplication**

In [7]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(3,2,2))
T2=torch.tensor(L2)
print(T2)

T3=T1*T2
print(T3)

tensor([[[3., 6.],
         [7., 4.]],

        [[3., 9.],
         [4., 3.]],

        [[7., 7.],
         [4., 6.]]], dtype=torch.float64)
tensor([[[5., 4.],
         [0., 5.]],

        [[1., 3.],
         [5., 5.]],

        [[4., 1.],
         [5., 9.]]], dtype=torch.float64)
tensor([[[15., 24.],
         [ 0., 20.]],

        [[ 3., 27.],
         [20., 15.]],

        [[28.,  7.],
         [20., 54.]]], dtype=torch.float64)
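
Coordinatewise multiplication is easy to confuse with the matrix product. A minimal sketch on two small square matrices (the values are chosen only for illustration) shows that the two operations give different results:

```python
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
B = torch.tensor([[5., 6.], [7., 8.]])

elementwise = A * B        # same as torch.mul(A, B): multiply matching entries
matrix_product = A @ B     # same as torch.mm(A, B): rows times columns

print(elementwise)
print(matrix_product)
```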


**Matrix multiplication**

A matrix is a 2-d tensor, and torch.mm is used for matrix multiplication: an M x N matrix times an N x P matrix gives an M x P matrix.

In [8]:
L1=np.random.choice([float(i) for i in range(10)],size=(3,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(2,4))
T2=torch.tensor(L2)
print(T2)

T3=torch.mm(T1,T2)
print(T3)

tensor([[9., 4.],
        [7., 3.],
        [0., 5.]], dtype=torch.float64)
tensor([[6., 0., 2., 3.],
        [1., 4., 7., 7.]], dtype=torch.float64)
tensor([[58., 16., 46., 55.],
        [45., 12., 35., 42.],
        [ 5., 20., 35., 35.]], dtype=torch.float64)
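
To check one entry of the product by hand, the sketch below copies the values printed above and recomputes the top-left entry as the dot product of row 0 of the first matrix with column 0 of the second. The name `entry` is only illustrative.

```python
import torch

# Values copied from the printed tensors above.
T1 = torch.tensor([[9., 4.], [7., 3.], [0., 5.]])
T2 = torch.tensor([[6., 0., 2., 3.], [1., 4., 7., 7.]])

T3 = torch.mm(T1, T2)

# Entry (0, 0) by hand: 9*6 + 4*1.
entry = (T1[0, :] * T2[:, 0]).sum().item()
print(entry, T3[0, 0].item())
```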


**Batch matrix multiplication**

Batch matrix multiplication refers to matrix multiplication of two "batches" of matrices.

A batch of K M x N matrices is a tensor that is K x M x N.

We can batch multiply it by another batch, which would be a K x N x P tensor. torch.bmm multiplies the matrices pairwise and returns a K x M x P tensor.

In [15]:
L1=np.random.choice([float(i) for i in range(10)],size=(5,3,2))
T1=torch.tensor(L1)
print(T1)
L2=np.random.choice([float(i) for i in range(10)],size=(5,2,4))
T2=torch.tensor(L2)
print(T2)

T3=torch.bmm(T1,T2)
print(T3)

tensor([[[4., 3.],
         [1., 3.],
         [5., 9.]],

        [[0., 9.],
         [3., 5.],
         [1., 8.]],

        [[6., 0.],
         [6., 9.],
         [5., 8.]],

        [[0., 8.],
         [3., 0.],
         [8., 4.]],

        [[2., 0.],
         [2., 7.],
         [2., 9.]]], dtype=torch.float64)
tensor([[[1., 0., 1., 8.],
         [1., 8., 0., 4.]],

        [[2., 0., 4., 4.],
         [0., 6., 9., 9.]],

        [[0., 1., 0., 6.],
         [5., 2., 5., 3.]],

        [[6., 8., 9., 8.],
         [8., 6., 7., 2.]],

        [[2., 1., 1., 3.],
         [5., 2., 7., 3.]]], dtype=torch.float64)
tensor([[[  7.,  24.,   4.,  44.],
         [  4.,  24.,   1.,  20.],
         [ 14.,  72.,   5.,  76.]],

        [[  0.,  54.,  81.,  81.],
         [  6.,  30.,  57.,  57.],
         [  2.,  48.,  76.,  76.]],

        [[  0.,   6.,   0.,  36.],
         [ 45.,  24.,  45.,  63.],
         [ 40.,  21.,  40.,  54.]],

        [[ 64.,  48.,  56.,  16.],
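         [ 18.,  24.,  27.,  24.],
         [ 80.,  88., 100.,  72.]],

        [[  4.,   2.,   2.,   6.],
         [ 39.,  16.,  51.,  27.],
         [ 49.,  20.,  65.,  33.]]], dtype=torch.float64)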
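
One way to read torch.bmm is as a loop of torch.mm calls, one per matrix pair in the batch. The sketch below checks this on random data; the seed and the names `batched` and `looped` are arbitrary choices for the example.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so that reruns give the same data
A = torch.randint(0, 10, (5, 3, 2)).double()   # batch of 5 matrices, each 3 x 2
B = torch.randint(0, 10, (5, 2, 4)).double()   # batch of 5 matrices, each 2 x 4

batched = torch.bmm(A, B)

# The same product, computed one pair at a time and stacked back together.
looped = torch.stack([torch.mm(A[k], B[k]) for k in range(A.size(0))])

print(batched.shape)
print(torch.allclose(batched, looped))
```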

**matmul**

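torch.matmul allows for more general matrix products. When an argument has more than two dimensions, its last two dimensions are treated as a matrix and the leading dimensions are broadcast as batch dimensions.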

In [9]:
L1=np.random.choice([float(i) for i in range(10)],size=(5,3,2))
T1=torch.tensor(L1)
L2=np.random.choice([float(i) for i in range(10)],size=(5,2,4))
T2=torch.tensor(L2)
T3=torch.matmul(T1,T2)
print(T3.size())

L1=np.random.choice([float(i) for i in range(10)],size=(10,5,3,2,4))
T1=torch.tensor(L1)
L2=np.random.choice([float(i) for i in range(10)],size=(4,10))
T2=torch.tensor(L2)

T3=torch.matmul(T1,T2)
print(T3.size())

torch.Size([5, 3, 4])
torch.Size([10, 5, 3, 2, 10])
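
The second shape above comes from broadcasting: the 4 x 10 matrix is applied to every 2 x 4 matrix in the larger tensor. The sketch below shows the same rule on smaller shapes, and also the vector cases that torch.matmul handles. All names and sizes here are chosen only for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so that reruns give the same data
A = torch.randint(0, 10, (2, 3, 4, 5)).double()   # a 2 x 3 grid of 4 x 5 matrices
B = torch.randint(0, 10, (5, 6)).double()         # a single 5 x 6 matrix
v = torch.randint(0, 10, (5,)).double()           # a vector of length 5

C = torch.matmul(A, B)      # B is reused for every matrix in the grid
Av = torch.matmul(A, v)     # matrix-vector product for every matrix in the grid
vv = torch.matmul(v, v)     # two vectors give their dot product

print(C.shape, Av.shape, vv.shape)
print(torch.allclose(C[1, 2], torch.mm(A[1, 2], B)))
```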


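We can do some timings to see the advantages of doing calculations on the GPU. The first number printed is the NumPy time on the CPU and the second is the PyTorch time on dev, both in seconds. CUDA operations are launched asynchronously, so on a GPU the second number reflects only the time needed to queue the multiplication unless torch.cuda.synchronize() is called before the clock is read.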

In [17]:
import time
import torch
import numpy as np

K=2500
L=10000
M=1000

X=np.random.normal(0,1,(K,L))
Y=np.random.normal(0,1,(L,M))

start_time=time.time()
np.matmul(X,Y)
end_time=time.time()
print(end_time-start_time)

XT=torch.tensor(X)
YT=torch.tensor(Y)

XG=XT.to(dev)
YG=YT.to(dev)

start_time=time.time()
ZG=torch.matmul(XG,YG)
end_time=time.time()
print(end_time-start_time)

0.6701779365539551
0.0
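
A reading of 0.0 for the second timing is what one would expect if the clock is stopped before the GPU has finished, since CUDA kernels run asynchronously. The following is a hedged sketch of a more careful measurement, not the timing code used above: `timed_matmul` is a hypothetical helper, the matrix sizes are smaller than in the cell above, and no timings are claimed for it.

```python
import time
import torch

def timed_matmul(x, y):
    """Return (product, seconds), waiting for the GPU to finish if one is used."""
    if x.is_cuda:
        torch.cuda.synchronize()   # finish pending work before starting the clock
    start = time.perf_counter()
    z = torch.matmul(x, y)
    if x.is_cuda:
        torch.cuda.synchronize()   # wait until the multiplication has completed
    return z, time.perf_counter() - start

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x = torch.randn(500, 2000, device=device)
y = torch.randn(2000, 300, device=device)

z, seconds = timed_matmul(x, y)
print(z.shape, seconds)
```

time.perf_counter() is used here instead of time.time() because it has a higher resolution for short intervals. The first GPU call may also include one-time start-up costs, so repeating the measurement a few times gives a fairer comparison.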


**Outer product**

In [89]:
v1 = torch.arange(1, 4)    # Size 3
v2 = torch.arange(1, 3)    # Size 2
r = torch.outer(v1, v2)    # torch.ger is a deprecated alias of torch.outer
print(v1)
print(v2)
print(r)

tensor([1, 2, 3])
tensor([1, 2])
tensor([[1, 2],
        [2, 4],
        [3, 6]])
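
The outer product has entry (i, j) equal to v1[i] * v2[j]. A minimal sketch, using the same two vectors, shows that this is what broadcasting a column against a row produces, and that it agrees with torch.outer (the current name for torch.ger). The name `by_broadcast` is only illustrative.

```python
import torch

v1 = torch.arange(1, 4)    # tensor([1, 2, 3])
v2 = torch.arange(1, 3)    # tensor([1, 2])

# A 3 x 1 column times a 1 x 2 row broadcasts to a 3 x 2 table of products.
by_broadcast = v1[:, None] * v2[None, :]

print(by_broadcast)
print(torch.equal(by_broadcast, torch.outer(v1, v2)))
```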
