
In [1]:
import numpy as np 
import torch

In [2]:
# Use the GPU if one is available; otherwise fall back to the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 1 GPU(s) available.
We will use the GPU: Tesla P100-PCIE-16GB


In [24]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cuda')

In [3]:
%%time
a = np.random.randn(1000,1000)
result = np.matmul(a,a)
del a, result

CPU times: user 146 ms, sys: 10.3 ms, total: 156 ms
Wall time: 107 ms


In [4]:
%%time
z = torch.randn(1000,1000)
result = torch.matmul(z,z)
del z, result

CPU times: user 37.8 ms, sys: 223 µs, total: 38 ms
Wall time: 89.8 ms


In [5]:
%%time
b = np.random.randn(10000,10000)
result = np.matmul(b,b)
del b, result

CPU times: user 1min 37s, sys: 555 ms, total: 1min 37s
Wall time: 51.8 s


In [6]:
%%time
y = torch.randn(10000,10000)
result = torch.matmul(y,y)
del y, result

CPU times: user 23.4 s, sys: 349 µs, total: 23.4 s
Wall time: 23.4 s


In [25]:
%%time
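# Timed here: CPU-side random generation, the host-to-device copy, and the matmul launch.
# CUDA kernels run asynchronously, so the matmul itself may not be fully counted.
x = torch.randn(10000,10000).to(device)
result = torch.matmul(x,x)
del x, result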

CPU times: user 1.02 s, sys: 2 ms, total: 1.02 s
Wall time: 1.03 s
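
The GPU cells in this notebook time tensor creation on the CPU, the copy to the device, and the kernel launch together. As a minimal sketch of how the multiplication alone could be timed, the helper below creates the tensor directly on the target device and calls `torch.cuda.synchronize()` before reading the clock. The name `timed_matmul` is hypothetical and not part of the original notebook, and the numbers it prints will differ from the `%%time` outputs shown here.

```python
import time
import torch

def timed_matmul(n, device):
    """Time only an n x n matmul on `device`, excluding creation and transfer."""
    # Allocate directly on the target device so no host-to-device copy is timed.
    x = torch.randn(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # make sure setup work has finished
    start = time.perf_counter()
    result = torch.matmul(x, x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous kernel to complete
    elapsed = time.perf_counter() - start
    return elapsed, result.shape

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
elapsed, shape = timed_matmul(1000, device)
print(f"{tuple(shape)} matmul on {device}: {elapsed * 1000:.1f} ms")
```

On a CPU-only machine the synchronize calls are skipped, so the same helper can be used for both sides of the comparison.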


In [26]:
%%time
x = torch.randn(10000,10000).cuda()
result = torch.matmul(x,x)
del x, result

CPU times: user 1.02 s, sys: 1.91 ms, total: 1.02 s
Wall time: 1.02 s


In [8]:
%%time
w = torch.randn(10000,10000).cuda()
result = torch.matmul(w,w)
del w, result


CPU times: user 998 ms, sys: 155 ms, total: 1.15 s
Wall time: 1.15 s


In [9]:
%%time
v = torch.randn(20000,20000).cuda()
result = torch.matmul(v,v)
del v, result


CPU times: user 3.91 s, sys: 511 ms, total: 4.42 s
Wall time: 4.42 s


In [10]:
%%time
v = torch.randn(30000,30000).cuda()
result = torch.matmul(v,v)
del v, result

CPU times: user 8.69 s, sys: 806 ms, total: 9.5 s
Wall time: 9.5 s


In [11]:
%%time
v = torch.randn(40000,40000).cuda()
result = torch.matmul(v,v)
del v, result

CPU times: user 16 s, sys: 2.46 s, total: 18.5 s
Wall time: 18.8 s
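
One way to read the four GPU timings above (10000 through 40000) is to compare their growth with what quadratic and cubic scaling would predict. The sketch below uses only the wall times recorded in this notebook; the helper names `scaling_table` and `float32_gib` are hypothetical. A dense matmul does on the order of n^3 work, while generating and copying the matrix is on the order of n^2, so growth that tracks n^2 would suggest the measurement is dominated by setup and transfer rather than by the multiplication.

```python
# Wall times in seconds, copied from the GPU cells above, keyed by matrix size n.
wall_times = {10000: 1.15, 20000: 4.42, 30000: 9.5, 40000: 18.8}

def scaling_table(times, base_n):
    """Return (n, observed ratio, n^2 prediction, n^3 prediction) relative to base_n."""
    rows = []
    for n, t in sorted(times.items()):
        k = n / base_n
        rows.append((n, t / times[base_n], k ** 2, k ** 3))
    return rows

def float32_gib(n):
    """Size in GiB of one n x n float32 tensor (4 bytes per element)."""
    return n * n * 4 / 2 ** 30

for n, observed, quad, cubic in scaling_table(wall_times, 10000):
    print(f"n={n}: observed x{observed:.2f}, n^2 predicts x{quad:.0f}, "
          f"n^3 predicts x{cubic:.0f}, one tensor is {float32_gib(n):.2f} GiB")
```

The memory column is included because the input and the result both have to fit on the device at the same time, which limits how far this sweep can be pushed on a 16 GB card.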


In [12]:
import tensorflow as tf

# Get the GPU device name.
device_name = tf.test.gpu_device_name()

# The device name should look like the following:
if device_name == '/device:GPU:0':
    print('Found GPU at: {}'.format(device_name))
else:
    raise SystemError('GPU device not found')

Found GPU at: /device:GPU:0


In [18]:
%%time
g1 = tf.random.Generator.from_seed(1)
v = g1.normal(shape=[10000, 10000])
result = tf.matmul(v, v)
del v, result

CPU times: user 2.93 ms, sys: 0 ns, total: 2.93 ms
Wall time: 4.53 ms


In [22]:
%%time
with tf.device('/gpu:0'):
  g1 = tf.random.Generator.from_seed(1)
  v = g1.normal(shape=[10000, 10000])
  result = tf.matmul(v, v)
  del v, result

CPU times: user 3.14 ms, sys: 5 µs, total: 3.15 ms
Wall time: 2.93 ms


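The two timings above are nearly identical with and without an explicit /gpu:0 placement, which suggests that TensorFlow places the computation on the GPU by default when one is available. Note that a wall time of a few milliseconds for a 10000 x 10000 multiplication likely reflects only the time needed to dispatch the work: GPU ops can run asynchronously, and nothing in these cells reads the result back.

Let's see the execution time when the computation is explicitly pinned to the CPU.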


In [28]:
%%time
with tf.device('/cpu:0'):
  g1 = tf.random.Generator.from_seed(1)
  v = g1.normal(shape=[10000, 10000])
  result = tf.matmul(v, v)
  del v, result

CPU times: user 54.9 s, sys: 57.6 ms, total: 55 s
Wall time: 28 s
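
The timings above only hint at where each op ran. A more direct check, sketched here under the assumption of TensorFlow 2 in eager mode, is to read the `device` attribute of the result tensor. The helper `matmul_device` is hypothetical, and it uses `tf.random.normal` rather than the notebook's generator to keep the example short.

```python
import tensorflow as tf

def matmul_device(n, device=None):
    """Return the device string of an n x n matmul result, optionally pinned to `device`."""
    if device is None:
        # Let TensorFlow choose the placement.
        v = tf.random.normal([n, n])
        return tf.matmul(v, v).device
    with tf.device(device):
        # Force creation and multiplication onto the requested device.
        v = tf.random.normal([n, n])
        return tf.matmul(v, v).device

print("default placement:", matmul_device(100))
print("pinned to CPU:    ", matmul_device(100, "/cpu:0"))
```

If the default placement string ends in a GPU device on a GPU runtime, that supports the conclusion drawn from the timings without relying on them.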
