# Preparation
## Uploading or Using a Notebook
You need to sign up and apply for access before you can start using Google Colab.
Once you have access, you can either upload your own notebook using File → Upload Notebook or simply enter your code in the cells.
## Activating GPU
To enable the GPU backend for your notebook, go to Edit → Notebook Settings and set Hardware accelerator to GPU.


## Installing PyTorch
We are going to use PyTorch for tensor operations on the GPU. Install it with the following command; running it once per session is sufficient.

In [2]:
# http://pytorch.org/
!pip install torch


Looking in indexes: https://pypi.org/simple, https://legacy.pypi.org/simple
Collecting torch
  Downloading https://files.pythonhosted.org/packages/df/a4/7f5ec6e9df1bf13f1881353702aa9713fcd997481b26018f35e0be85faf7/torch-0.4.0-cp27-cp27mu-manylinux1_x86_64.whl (484.0MB)
    100% |████████████████████████████████| 484.0MB 25kB/s
tcmalloc: large alloc 1073750016 bytes == 0x55aead97c000 @  0x7f951e5aa1c4 0x55ae51a3e0d8 0x55ae51b27d5d 0x55ae51a5177a 0x55ae51a56462 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a4eb3a 0x55ae51a56e1f 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a56462 0x55ae51a56462 0x55ae51a4eb3a 0x55ae51a56e1f 0x55ae51a56462 0x55ae51a4eb3a 0x55ae51a56e1f 0x55ae51a4eb3a 0x55ae51a56e1f 0x55ae51a4eb3a 0x55ae51a5682e 0x55ae51a4eb3a 0x55ae51a7f50f 0x55ae51a7a202
Installing collected packages: torch
Successfully installed torch-0.4.0
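
Before moving on, it can help to confirm that the install worked and that the GPU runtime is actually active. The following is a minimal sketch, not part of the original notebook: the helper name `describe_torch_environment` is hypothetical, and it only relies on `torch.__version__`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def describe_torch_environment():
    """Return a small dict describing the PyTorch install, if there is one."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
        "gpu_name": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this runtime.
        return info
    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["cuda_available"] = bool(torch.cuda.is_available())
    if info["cuda_available"]:
        # Name of the first visible GPU (device index 0).
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


print(describe_torch_environment())
```

If `cuda_available` comes back as `False`, the hardware accelerator setting described under Activating GPU is the first thing to recheck.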


# Variable Initialization
We are going to initialize a large matrix on the CPU and copy it to the GPU, so that both devices hold a matrix of the same size.

In [3]:
import torch
import time
import numpy as np
from torch.autograd import Variable

x_cpu = np.random.rand(10000,10000)
x_gpu = Variable(torch.from_numpy(x_cpu)).cuda(0)
print 'GPU matrix size:',x_gpu.shape
print 'CPU matrix size:',x_cpu.shape


GPU matrix size: torch.Size([10000, 10000])
CPU matrix size: (10000, 10000)
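
As a side note on size: `np.random.rand` returns `float64` values and `torch.from_numpy` keeps that dtype, so each copy of the matrix occupies a sizeable chunk of memory. The sketch below works out the footprint; the helper name `matrix_megabytes` is hypothetical and not part of the original notebook. Many GPUs are considerably faster in `float32`, so casting with `astype(np.float32)` before the transfer may be worth trying, although the timings in this notebook were taken with `float64`.

```python
import numpy as np


def matrix_megabytes(rows, cols, dtype):
    """Memory needed for a dense rows x cols matrix of the given dtype, in MB (10**6 bytes)."""
    return rows * cols * np.dtype(dtype).itemsize / 1e6


print(matrix_megabytes(10000, 10000, np.float64))  # 800.0
print(matrix_megabytes(10000, 10000, np.float32))  # 400.0

# A small stand-in matrix: rand() yields float64, astype() halves the footprint.
small = np.random.rand(4, 4)
small32 = small.astype(np.float32)
print("{} -> {}".format(small.dtype, small32.dtype))  # float64 -> float32
```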


# CPU vs. GPU Comparison for Matrix Multiplication

In [15]:
# Compute in CPU
oldtime = time.time()
z_cpu = x_cpu.dot(x_cpu.T)
print 'Matrix-Matrix product time in CPU:',time.time()-oldtime,'seconds'

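# Compute in GPU
# Note: CUDA kernels are launched asynchronously, so without a call to
# torch.cuda.synchronize() this mostly times the launch, not the full product.
oldtime = time.time()
z_gpu = torch.matmul(x_gpu,torch.t(x_gpu))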
print 'Matrix-Matrix product time in GPU:',time.time()-oldtime,'seconds'

Matrix-Matrix product time in CPU: 36.3066530228 seconds
Matrix-Matrix product time in GPU: 0.000468969345093 seconds
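
The GPU figure above should be read with care. A 10000 x 10000 product needs about 2·10^12 floating point operations, which no GPU completes in under half a millisecond, so the timer almost certainly stopped while the kernel was still running. CUDA calls return as soon as the work is queued. A fairer measurement waits for the device with `torch.cuda.synchronize()` before reading the clock. The sketch below is one way to do that; the helper `time_call` and its parameters are hypothetical names, the matrix is smaller than the one used above, and the GPU part is skipped when no CUDA device is present.

```python
import time


def time_call(fn, sync=None, repeats=3):
    """Best wall-clock time of fn() over `repeats` runs.

    `sync` is an optional callable invoked before the clock starts and
    again before it stops, e.g. torch.cuda.synchronize for GPU work.
    """
    best = float("inf")
    for _ in range(repeats):
        if sync is not None:
            sync()  # drain any previously queued work
        start = time.time()
        fn()
        if sync is not None:
            sync()  # wait until the work queued by fn has really finished
        best = min(best, time.time() - start)
    return best


try:
    import torch
    has_cuda = torch.cuda.is_available()
except ImportError:
    has_cuda = False

if has_cuda:
    x = torch.rand(2000, 2000, device="cuda")  # assumption: smaller than the matrix above
    launch_only = time_call(lambda: torch.matmul(x, x.t()))
    synced = time_call(lambda: torch.matmul(x, x.t()), sync=torch.cuda.synchronize)
    print("without synchronize: {:.6f} s".format(launch_only))
    print("with synchronize:    {:.6f} s".format(synced))
else:
    print("No CUDA device found; skipping the GPU timing.")
```

Taking the best of several runs also keeps one-off costs, such as CUDA context creation on the first call, out of the reported number.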


# CPU vs. GPU Comparison for Random Row-Column Multiplication

In [8]:
from itertools import izip

m,n = x_cpu.shape
idx_a = np.random.choice(np.arange(m),50000)
idx_b = np.random.choice(np.arange(m),50000)

# Compute in CPU
oldtime = time.time()
for i,j in izip(idx_a,idx_b):
  row = x_cpu[i,:][None,:]
  col = x_cpu[:,j][:,None]
  z_cpu = row.dot(col)
print "Random row-column multiplication in CPU:",time.time()-oldtime,'seconds'

# Compute in GPU
oldtime = time.time()
for i,j in izip(idx_a,idx_b):
  row = x_gpu[i,:].unsqueeze(0)
  col = x_gpu[:,j].unsqueeze(1)
  z_gpu = torch.matmul(row,col)
print "Random row-column multiplication (unsqueeze) in GPU:",time.time()-oldtime,'seconds'

# Compute in GPU, reshaping with view instead of unsqueeze
oldtime = time.time()
for i,j in izip(idx_a,idx_b):
  row = x_gpu[i,:].view(1,-1)
  col = x_gpu[:,j].view(-1,1)
  z_gpu = torch.matmul(row,col)
print "Random row-column multiplication (view) in GPU:",time.time()-oldtime,'seconds'
print "View is a bit slower"

# Compute in GPU, using torch.mm instead of torch.matmul
oldtime = time.time()
for i,j in izip(idx_a,idx_b):
  row = x_gpu[i,:].unsqueeze(0)
  col = x_gpu[:,j].unsqueeze(1)
  z_gpu = torch.mm(row,col)
print "Random row-column multiplication (mm) in GPU:",time.time()-oldtime,'seconds'
  


Random row-column multiplication in CPU: 10.2146840096 seconds
Random row-column multiplication (unsqueeze) in GPU: 7.92783904076 seconds
Random row-column multiplication (view) in GPU: 8.39130401611 seconds
View is a bit slower
Random row-column multiplication (mm) in GPU: 7.79804587364 seconds
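
The GPU gains little in this test, most likely because every iteration issues one tiny operation and the per-call overhead outweighs the arithmetic. One common alternative, not used in the original notebook, is to gather many rows and columns at once and compute all the dot products in a single vectorized call. The sketch below shows the idea in NumPy on a small matrix; the function name `batched_row_col_products` and the `chunk` parameter are hypothetical, and it assumes a square matrix as in the notebook. Chunking keeps the gathered blocks small, since gathering all 50000 rows of the full matrix at once would need about 4 GB.

```python
import numpy as np


def batched_row_col_products(x, idx_a, idx_b, chunk=1024):
    """Dot product of row idx_a[k] with column idx_b[k] of x, for every k.

    Assumes x is square, so rows and columns have the same length.
    """
    out = np.empty(len(idx_a), dtype=x.dtype)
    for start in range(0, len(idx_a), chunk):
        a = idx_a[start:start + chunk]
        b = idx_b[start:start + chunk]
        rows = x[a, :]    # shape (k, n): the selected rows
        cols = x[:, b].T  # shape (k, n): the selected columns, one per row
        # Row-wise dot products: sum over j of rows[i, j] * cols[i, j].
        out[start:start + chunk] = np.einsum("ij,ij->i", rows, cols)
    return out


# Small demonstration, checked against the one-pair-at-a-time loop.
rng = np.random.RandomState(0)
x_small = rng.rand(50, 50)
ia = rng.choice(np.arange(50), 200)
ib = rng.choice(np.arange(50), 200)
looped = np.array([x_small[i, :].dot(x_small[:, j]) for i, j in zip(ia, ib)])
batched = batched_row_col_products(x_small, ia, ib, chunk=64)
print(np.allclose(looped, batched))  # True
```

The same gather-then-reduce pattern carries over to PyTorch tensors on the GPU, where it replaces tens of thousands of small launches with a handful of large ones.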
