In [1]:
import torch

x = torch.randn(10000, 10000).to("cuda")
w = torch.randn(10000, 10000).to("cuda")
# ensure that context initialization finishes before we start measuring time
torch.cuda.synchronize()

%time y = x.mm(w.t()); torch.cuda.synchronize()

CPU times: user 351 ms, sys: 304 ms, total: 654 ms
Wall time: 666 ms


In [2]:
x = torch.randn(10000, 10000).to("cpu")
w = torch.randn(10000, 10000).to("cpu")

%time y = x.mm(w.t())

CPU times: user 16.3 s, sys: 185 ms, total: 16.5 s
Wall time: 16.5 s


## NumPy Tutorial


In [3]:
import numpy as np

The simplest object we can create is a vector. np.arange(12) creates a one-dimensional array holding the 12 integers from 0 to 11.

In [4]:
x = np.arange(12)
x

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])


We can get the shape of an array through its shape attribute.

In [5]:
x.shape

(12,)

We can also get the total number of elements in an array through its size attribute.

In [6]:
x.size

12

The reshape function changes the shape of the vector x to (3, 4), which is a matrix of 3 rows and 4 columns.

In [7]:
x = x.reshape((3, -1))
print(x)
print(x.shape)
print(x.size)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(3, 4)
12


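We can pass -1 for one dimension to let NumPy infer it from the total number of elements and the remaining dimensions. For our 12-element array, x.reshape((3, 4)) is equivalent to x.reshape((-1, 4)) and x.reshape((3, -1)).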
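
To make the role of -1 concrete, here is a minimal sketch that reshapes the same 12-element vector three ways and checks that the results agree. The variable names v, a, b and c are ours and used only for illustration.

```python
import numpy as np

v = np.arange(12)

# -1 asks NumPy to infer that dimension from the total element count.
a = v.reshape((3, 4))
b = v.reshape((-1, 4))   # 12 / 4 = 3 rows are inferred
c = v.reshape((3, -1))   # 12 / 3 = 4 columns are inferred

print(a.shape, b.shape, c.shape)
print(np.array_equal(a, b) and np.array_equal(a, c))

# The remaining dimensions must divide the element count evenly.
try:
    v.reshape((5, -1))
    failed = False
except ValueError as err:
    failed = True
    print('reshape failed:', err)
```

Only one dimension may be given as -1, since otherwise the inferred sizes would be ambiguous.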


Often we want an array initialized with all zeros. The following creates a tensor of shape (2, 3, 4):

In [None]:
np.zeros((2, 3, 4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

We can also specify the value of each element in the array that needs to be created through a Python list.

In [None]:
y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
y

array([[2, 1, 4, 3],
       [1, 2, 3, 4],
       [4, 3, 2, 1]])

In some cases, we need to randomly generate the value of each element in the array. This is especially common when we intend to use the array as a parameter in a neural network. The following creates an array with a shape of (3, 4). Each of its elements is randomly sampled from a normal distribution with zero mean and unit variance.

In [8]:
np.random.normal(0, 1, size=(3, 4))

array([[ 0.22638064, -1.49421897, -0.8972499 , -0.12045226],
       [ 0.90210785,  0.7828864 ,  0.65972897, -0.00487247],
       [ 0.99485529, -0.55444371,  1.59524717,  1.21401386]])
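
Because the values above change on every run, it can help to fix a seed when we need reproducible results. The sketch below uses NumPy's Generator interface (np.random.default_rng) rather than the np.random.normal call shown above; the seed value 0 and the names rng, rng2, w and w2 are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator yields the same sequence every time it is created.
rng = np.random.default_rng(seed=0)
w = rng.normal(loc=0.0, scale=1.0, size=(3, 4))
print(w.shape)

# A second generator with the same seed reproduces the same samples.
rng2 = np.random.default_rng(seed=0)
w2 = rng2.normal(loc=0.0, scale=1.0, size=(3, 4))
print(np.array_equal(w, w2))
```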

### Operations

The standard arithmetic operators (+, -, /, \*, \*\*) have all been lifted to element-wise operations for identically-shaped arrays.

In [9]:
x = np.array([1, 2, 4, 8])
y = np.ones_like(x) * 2
print('x =', x)
print('y =', y)
print('x ** y', x ** y) 
print('x + y', x + y)
print('x - y', x - y)
print('x * y', x * y)
print('x / y', x / y)

x = [1 2 4 8]
y = [2 2 2 2]
x ** y [ 1  4 16 64]
x + y [ 3  4  6 10]
x - y [-1  0  2  6]
x * y [ 2  4  8 16]
x / y [0.5 1.  2.  4. ]


Many more operations can be applied element-wise, such as exponentiation:

In [10]:
np.exp(x)

array([2.71828183e+00, 7.38905610e+00, 5.45981500e+01, 2.98095799e+03])

In addition to element-wise computations, we can also use the dot function for matrix operations. To perform matrix multiplication we define x as a matrix of 3 rows and 4 columns, and transpose y into a matrix of 4 rows and 3 columns, so the product has 3 rows and 3 columns.

In [11]:
x = np.arange(12).reshape((3,4))
y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
np.dot(x, y.T)

array([[ 18,  20,  10],
       [ 58,  60,  50],
       [ 98, 100,  90]])
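
For two-dimensional arrays, the @ operator computes the same matrix product as np.dot. A short sketch with the same x and y as above (p_dot and p_at are names we introduce here):

```python
import numpy as np

x = np.arange(12).reshape((3, 4))
y = np.array([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

p_dot = np.dot(x, y.T)   # (3, 4) times (4, 3) gives (3, 3)
p_at = x @ y.T           # operator form of the same matrix product

print(p_at.shape)
print(np.array_equal(p_dot, p_at))
```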

We can also merge multiple arrays. For that, we need to tell the system along which dimension to merge. The example below merges two (4, 1) matrices along dimension 0 (along rows); passing axis=1 instead would merge them along dimension 1 (along columns).

In [16]:
x = np.array([1, 2, 4, 8])
y = np.ones_like(x) * 2
x = x.reshape((4,1))
y = y.reshape((4,1))
np.concatenate((x, y), axis=0)

array([[1],
       [2],
       [4],
       [8],
       [2],
       [2],
       [2],
       [2]])

In [17]:
x

array([[1],
       [2],
       [4],
       [8]])
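
As a sketch of how the axis argument changes the result, the snippet below concatenates the same two (4, 1) arrays along both dimensions. The names rows and cols are ours.

```python
import numpy as np

x = np.array([1, 2, 4, 8]).reshape((4, 1))
y = np.ones_like(x) * 2

rows = np.concatenate((x, y), axis=0)  # stack vertically: shape (8, 1)
cols = np.concatenate((x, y), axis=1)  # place side by side: shape (4, 2)

print(rows.shape, cols.shape)
print(cols)
```

All dimensions other than the one being merged must match, which is why both arrays are reshaped to (4, 1) first.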

We can construct boolean arrays from a logical statement. Take x == y as an example. If x and y are equal for some entry, the new array has the value True at the same position; otherwise, it is False.

In [18]:
x == y

array([[False],
       [ True],
       [False],
       [False]])
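
If we need 0/1 values rather than booleans, we can convert the mask, and summing it counts the matching entries. A minimal sketch with the same values as above (mask, as_int and n_equal are illustrative names):

```python
import numpy as np

x = np.array([1, 2, 4, 8]).reshape((4, 1))
y = np.ones_like(x) * 2

mask = (x == y)             # boolean array, same shape as x
as_int = mask.astype(int)   # True becomes 1, False becomes 0
n_equal = mask.sum()        # True counts as 1 when summed

print(as_int.ravel())
print(n_equal)
```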

Summing over all elements of the array yields a single scalar, and mean returns their average.

In [19]:
x.sum()

15

In [20]:
x.mean()

3.75
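
Reductions such as sum and mean also accept an axis argument when we want one result per row or per column instead of a single scalar. A short sketch on a 3 x 4 matrix (the names m, total, col_sums and row_means are ours):

```python
import numpy as np

m = np.arange(12).reshape((3, 4))

total = m.sum()              # reduce over all elements
col_sums = m.sum(axis=0)     # collapse the rows: one sum per column
row_means = m.mean(axis=1)   # collapse the columns: one mean per row

print(total)
print(col_sums)
print(row_means)
```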

### Broadcast Mechanism

If the shapes of two arrays differ, NumPy applies its broadcasting mechanism: first, the elements are copied appropriately so that both arrays have the same shape, then the operation is carried out element-wise.

In [21]:
a = np.arange(3).reshape((3, 1))
b = np.arange(2).reshape((1, 2))
a, b

(array([[0],
        [1],
        [2]]), array([[0, 1]]))

Since a and b are (3x1) and (1x2) matrices respectively, their shapes do not match up if we want to add them. NumPy addresses this by 'broadcasting' the entries of both matrices into a larger (3x2) matrix as follows: for matrix a it replicates the columns, for matrix b it replicates the rows, before adding up both element-wise.

In [22]:
a + b

array([[0, 1],
       [1, 2],
       [2, 3]])
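
To see what broadcasting does conceptually, we can expand both operands to the common shape ourselves with np.broadcast_to and then add them. This is only an illustration of the rule; the names a_full, b_full and explicit are ours.

```python
import numpy as np

a = np.arange(3).reshape((3, 1))
b = np.arange(2).reshape((1, 2))

# Expand both operands to the common (3, 2) shape by hand.
a_full = np.broadcast_to(a, (3, 2))   # the single column of a is repeated
b_full = np.broadcast_to(b, (3, 2))   # the single row of b is repeated

explicit = a_full + b_full
print(a_full)
print(b_full)
print(np.array_equal(explicit, a + b))
```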

### Indexing and Slicing

Elements in an array can be accessed by their index. In good Python tradition the first element has index 0 and ranges are specified to include the first but not the last. By this logic 1:3 selects the second and third element. Let's set up a matrix to try this out.

In [23]:
x = np.arange(12).reshape(3,4)
x

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
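
The 1:3 range described above can be sketched as follows. We use a fresh matrix m so the example is self-contained; the names rows, elem and last_col are ours.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

rows = m[1:3]        # second and third rows (index 3 is excluded)
elem = m[1, 2]       # single element: row 1, column 2
last_col = m[:, -1]  # every row, last column

print(rows)
print(elem)
print(last_col)
```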

Beyond reading we can also write elements of a matrix.

In [24]:
x[1, 2] = 9
x

array([[ 0,  1,  2,  3],
       [ 4,  5,  9,  7],
       [ 8,  9, 10, 11]])


If we want to assign multiple elements the same value, we simply index all of them and then assign them the value.

In [26]:
x[0:2, :] = 12
x

array([[12, 12, 12, 12],
       [12, 12, 12, 12],
       [ 8,  9, 10, 11]])

### Saving Memory

In the previous examples, every operation allocated new memory to hold its result. For example, if we write y = x + y, we will dereference the matrix that y used to point to and instead point it at the newly allocated memory. After running y = y + x, we'll find that id(y) points to a different location. That's because Python first evaluates y + x, allocating new memory for the result, and then redirects y to point at this new location.

In [28]:
x = np.array([1, 2, 4, 8])
y = np.ones_like(x) * 2

before = id(y)
y = y + x
id(y) == before
print(id(y))
print(before)

140271585506928
140271444828688



In-place operations in NumPy are easy. We can assign the result of an operation to a previously allocated array with slice notation, e.g.,



In [29]:
z = np.zeros_like(y)
print('id(z):', id(z))
z[:] = x + y
print('id(z):', id(z))

id(z): 140271444829248
id(z): 140271444829248
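
Note that with slice assignment the right-hand side x + y is still computed into a temporary array before being copied into z. One way to avoid that temporary, sketched here under the same setup, is the out argument that NumPy's universal functions such as np.add accept.

```python
import numpy as np

x = np.array([1, 2, 4, 8])
y = np.ones_like(x) * 2
z = np.zeros_like(y)

before = id(z)
np.add(x, y, out=z)   # write the sum directly into z's existing buffer

print(id(z) == before)
print(z)
```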


If the value of x is not reused in subsequent computations, we can also use x[:] = x + y or x += y to reduce the memory overhead of the operation.

In [30]:
before = id(x)
x += y
id(x) == before

True
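
One consequence worth keeping in mind is that an in-place update is visible through every name that refers to the same array, whereas x = x + y only rebinds x. A small sketch of the difference (alias is a hypothetical second name we introduce for illustration):

```python
import numpy as np

x = np.array([1, 2, 4, 8])
y = np.ones_like(x) * 2

alias = x             # a second name for the same array object
x += y                # in place: the shared buffer is updated
same_after_inplace = alias is x
print(same_after_inplace, alias)

x = x + y             # rebinding: x now names a newly allocated array
same_after_rebind = alias is x
print(same_after_rebind, alias, x)
```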