In [1]:
%config InteractiveShell.ast_node_interactivity = 'all'

In [2]:
import numpy as np

# Array creation

NumPy arrays are n-dimensional arrays (**ndarray**) whose elements all share the same dtype. The most common dtypes are **float** and **int**.

In [3]:
x = np.array([1, 2, 3], dtype=int)

In [4]:
x.ndim
x.shape
x.dtype

1

(3,)

dtype('int64')

In [5]:
y = np.array([1, 2, 3], dtype=float)
y.ndim
y.shape
y.dtype

1

(3,)

dtype('float64')

NumPy arrays can have any number of dimensions. We can construct simple 1-dimensional arrays (vectors), 2-dimensional arrays (matrices), and arrays with more dimensions.

In [6]:
v = np.array([1, 2, 3])
v
v.ndim

array([1, 2, 3])

1

In [7]:
A = np.array([[1, 2, 3], [4, 5, 6]])
A
A.ndim # number of dimensions
A.shape # number of elements along each dimension
A.size # total number of elements (product of the shape)

array([[1, 2, 3],
       [4, 5, 6]])

2

(2, 3)

6

In [8]:
nd = np.array([
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [10, 11, 12]
    ]
])
nd.ndim
nd.shape
nd.size

3

(2, 2, 3)

12

To declare an array, use np.array([]). A good way to interpret a 3-dimensional array is as a list of matrices. The first index selects the matrix in the list, the second index selects the row of that matrix, and the third index selects the column. Because indices start at 0, the element in the 2nd row and 3rd column of the 2nd matrix (which is 12) is written

In [9]:
nd[1, 1, 2]

12
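
To make the "list of matrices" reading concrete, the sketch below rebuilds the same `nd` array and applies one index at a time. The names `second_matrix`, `second_row` and `element` are introduced here only for illustration.

```python
import numpy as np

# Same 3-dimensional array as above: a "list" of two 2x3 matrices
nd = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
])

second_matrix = nd[1]      # first index picks the matrix -> shape (2, 3)
second_row = nd[1, 1]      # second index picks the row   -> shape (3,)
element = nd[1, 1, 2]      # third index picks the column -> a single value

print(second_matrix.shape, second_row.shape, element)  # (2, 3) (3,) 12
```

Each extra index removes one dimension from the result, which is why three indices are needed to reach a single value.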

Here are some useful functions to create arrays:

+ np.linspace(start, stop, num) ==> returns num evenly spaced values between start and stop. By default, stop is included.
+ np.reshape(a, newshape) ==> gives a new shape to an array without changing its data. The new shape must be compatible with the original shape (i.e. same size). If newshape is an integer, the result is a 1-D array of that length. If it is a tuple, it specifies the length of each dimension. One shape dimension can be -1, in which case its value is inferred from the size of the array and the remaining dimensions.
+ np.arange(start, stop, step) ==> returns evenly spaced values within the half-open interval [start, stop). Contrary to np.linspace(), this function does not include the stop value.
+ np.zeros(shape) ==> returns an ndarray of zeros with the specified shape.
+ np.ones(shape) ==> same for ones.
+ np.random.randint(low, high, size) ==> returns an ndarray of random integers with shape size. If high is unspecified, the integers are drawn from [0, low). If high is specified, they are drawn from [low, high) (excluding high).
+ np.random.rand(d0, d1, ...) ==> creates an array of the given shape (passed as separate arguments, not as a tuple) and populates it with random samples from a uniform distribution over [0, 1).
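
The cells that follow do not cover np.zeros, np.ones or the -1 shortcut of np.reshape, so here is a minimal sketch of those functions. The variable names and the chosen shapes are arbitrary.

```python
import numpy as np

zeros = np.zeros((2, 3))       # 2x3 array of 0.0 (float64 by default)
ones = np.ones(4)              # 1-D array of four 1.0 values

# arange excludes the stop value, linspace includes it (by default)
ar = np.arange(0, 10, 2)       # 0, 2, 4, 6, 8
ls = np.linspace(0, 10, 6)     # 0, 2, 4, 6, 8, 10 as floats

# -1 lets NumPy infer the missing dimension: 12 elements / 3 rows = 4 columns
inferred = np.reshape(np.arange(12), (3, -1))

print(zeros.shape, ones.shape, inferred.shape)  # (2, 3) (4,) (3, 4)
print(ar)
print(ls)
```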

In [10]:
# Example of a 1000x1000 matrix sampled from a uniform distribution over [0, 1)
A = np.random.rand(1000, 1000)
print(np.min(A), np.max(A), np.mean(A), sep=' ; ')

3.086842231825315e-07 ; 0.9999972863071543 ; 0.5005174364890724


In [11]:
# Example of 1000x1000 matrix with integers between 1 (inclusive) and 10 (exclusive)
A = np.random.randint(1, 10, (1000, 1000))
print(np.min(A), np.max(A), np.mean(A), sep=' ; ')

1 ; 9 ; 5.00146


In [12]:
# Example of 1000x1000 matrix with integers between 0 (inclusive) and 15 (exclusive)
A = np.random.randint(low=15, size=(1000, 1000))
print(np.min(A), np.max(A), np.mean(A), sep=' ; ')

0 ; 14 ; 7.007531


In [13]:
# Example of a vector of 1000 evenly spaced numbers between 0 (inclusive) and 10 (inclusive)
v = np.linspace(0, 10, 1000)
print(np.min(v), np.max(v), np.size(v), sep=' ; ')

0.0 ; 10.0 ; 1000


In [14]:
# 5x2 matrix with consecutive elements from 1 to 10
A = np.arange(1, 11, 1).reshape((5, 2)) # use np.arange() then ndarray.reshape()
A

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

# Understanding axes

You can aggregate an array to compute means, standard deviations, etc. along a chosen dimension. The axis you specify is the dimension ACROSS which the indicator is computed, i.e. the dimension that is collapsed.

For instance, suppose you want the mean of the elements across the 3rd dimension, i.e. across the columns of each matrix. With keepdims=True you get an array of shape (2, 2, 1): for each matrix and each row, the mean across the columns. Without it, the result is reduced to a 2x2 matrix whose first row contains the results for the first matrix and whose second row contains the results for the second matrix. By default, results are reduced!

In [15]:
mu = np.mean(nd, axis=2, keepdims=True)
mu

mu2 = np.mean(nd, axis=2)
mu2

array([[[ 2.],
        [ 5.]],

       [[ 8.],
        [11.]]])

array([[ 2.,  5.],
       [ 8., 11.]])

In [16]:
# If you do not specify the axis, it computes the mean over the flattened array (i.e. across all
# the elements)
np.mean(nd)

6.5

In [17]:
# Across axis=0 means you average the matching elements of the two matrices, like
# M(1,1) = [A1(1,1) + A2(1,1)]/2
np.mean(nd, axis=0)
# You obtain a single 2x3 matrix; with keepdims=True you would get a list of one matrix, shape (1, 2, 3)

array([[4., 5., 6.],
       [7., 8., 9.]])

What if you want to compute the mean of each matrix (i.e. the mean of all elements of matrix 1 and the mean of all elements of matrix 2)? You can specify a tuple of axes.

In [18]:
np.mean(nd, axis=(1, 2)) 
# Matrix 1 : (1+2+3+4+5+6)/6 = 21/6 = 3.5

array([3.5, 9.5])

Thus np.mean(nd, axis=(0, 1, 2)) <==> np.mean(nd) without specifying the axis, i.e. the mean over the flattened array.

In [19]:
np.mean(nd, axis=(0, 1, 2)) 
np.mean(nd)

6.5

6.5
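
A quick way to check which axis was collapsed is to look at the shape of the result. The sketch below rebuilds the same values as `nd` with np.arange and prints the shape obtained for each choice of axis. The variable names are chosen here for illustration.

```python
import numpy as np

# Same values as the nd array above, shape (2, 2, 3)
nd = np.arange(1, 13).reshape((2, 2, 3))

# The reduced axes disappear from the shape...
over_matrices = np.mean(nd, axis=0)        # shape (2, 3)
over_columns = np.mean(nd, axis=2)         # shape (2, 2)
per_matrix = np.mean(nd, axis=(1, 2))      # shape (2,)

# ...unless keepdims=True, which keeps them with length 1
kept = np.mean(nd, axis=2, keepdims=True)  # shape (2, 2, 1)

print(over_matrices.shape, over_columns.shape, per_matrix.shape, kept.shape)
# (2, 3) (2, 2) (2,) (2, 2, 1)
```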

# Reshaping arrays

We have already seen the ndarray.reshape() method. There are also ways to expand the dimensions of an array.

In [23]:
v = np.array([1, 2, 3, 4, 5, 6])
print(v.ndim, v.shape, v.size)
# v is a 1D array of 6 elements
v

1 (6,) 6


array([1, 2, 3, 4, 5, 6])

In [24]:
# We can convert it to a column vector
v2 = v.reshape((v.size, 1))
print(v2.ndim, v2.shape, v2.size)
v2

2 (6, 1) 6


array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])

In [29]:
# We could have done the same with np.newaxis
v3 = v[:, np.newaxis] # Here you are adding a second dimension
print(v3.ndim, v3.shape, v3.size)
v3
# Technically, here you are saying: take all the elements along the first dimension of the 1D array (v),
# so 1, 2 etc., and add a new second dimension of length 1, which gives (1, .), (2, .) etc. i.e. a column vector.

2 (6, 1) 6


array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])

In [34]:
# We could have built row vectors as well
rv1 = v[np.newaxis, :]
rv2 = v.reshape((1, v.size))
print(rv1.shape, rv2.shape)
rv1

(1, 6) (1, 6)


array([[1, 2, 3, 4, 5, 6]])

In [35]:
# We could have achieved the same with np.expand_dims()
rv3 = np.expand_dims(v, axis=0)
print(rv3.shape)
rv3

(1, 6)


array([[1, 2, 3, 4, 5, 6]])

In [36]:
cv1 = np.expand_dims(v, axis=1)
cv1

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])
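
As a sanity check, the three ways of building a column vector shown in this section can be compared directly. This is only a sketch: the names `col_a`, `col_b` and `col_c` are made up here, and reshape uses the -1 shortcut so the length does not have to be written explicitly.

```python
import numpy as np

v = np.array([1, 2, 3, 4, 5, 6])

col_a = v.reshape((-1, 1))          # -1 lets NumPy infer the 6
col_b = v[:, np.newaxis]            # new axis of length 1 in second position
col_c = np.expand_dims(v, axis=1)   # same thing, with the position given by axis

print(col_a.shape, col_b.shape, col_c.shape)  # (6, 1) (6, 1) (6, 1)
print(np.array_equal(col_a, col_b) and np.array_equal(col_b, col_c))  # True
```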

# Operations with arrays

If arrays have the same shape, the arithmetic operators *, /, +, - work element-wise in NumPy. For instance:

In [43]:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[10, 20, 30], [1, 2, 3]])

In [44]:
a*b

array([[10, 40, 90],
       [ 4, 10, 18]])

In [45]:
a/b

array([[0.1, 0.1, 0.1],
       [4. , 2.5, 2. ]])

In [46]:
a+b

array([[11, 22, 33],
       [ 5,  7,  9]])

In [47]:
a-b

array([[ -9, -18, -27],
       [  3,   3,   3]])

**Broadcasting**

There are times when you might want to carry out an operation between an array and a single number (an operation between a vector and a scalar) or between arrays of two different shapes. For example, your array (we’ll call it “data”) might contain distances in miles that you want to convert to kilometers. NumPy understands that the multiplication should happen with each cell. That concept is called broadcasting: a mechanism that allows NumPy to perform operations on arrays of different shapes. The dimensions of your arrays must be compatible, for example, when the dimensions of both arrays are equal or when one of them is 1. If the dimensions are not compatible, you will get a ValueError.
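
The miles-to-kilometers example can be sketched as follows. The distances in `data` are made-up values, and the constant name `MILES_TO_KM` is introduced here for illustration.

```python
import numpy as np

data = np.array([1.0, 2.0, 26.2])   # distances in miles (made-up values)
MILES_TO_KM = 1.609344              # kilometers in one international mile

km = data * MILES_TO_KM             # the scalar is applied to every element
print(km)
```

The result has the same shape as `data`: the scalar behaves as if it had been copied into an array of that shape.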

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when:

+ they are equal, or
+ one of them is 1

If these conditions are not met, a ValueError: operands could not be broadcast together exception is raised, indicating that the arrays have incompatible shapes. Along each axis, the result takes the size that is not 1 among the inputs. Arrays do not need to have the same number of dimensions. For example, if you have a 256x256x3 array of RGB values and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values. When either of the dimensions compared is 1, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other.
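
The RGB example and the failure case can be sketched as below. The array `image` is a stand-in filled with ones rather than a real picture, and the scaling factors are arbitrary.

```python
import numpy as np

image = np.ones((256, 256, 3))       # stand-in for a 256x256 RGB image
scale = np.array([0.5, 1.0, 2.0])    # one factor per color channel, shape (3,)

# Shapes are compared from the right: (256, 256, 3) vs (3,) -> compatible
scaled = image * scale
print(scaled.shape)                  # (256, 256, 3)

# (256, 256, 3) vs (256,): the trailing dimensions are 3 and 256,
# they differ and neither is 1, so broadcasting fails
failed = False
try:
    image * np.ones(256)
except ValueError as err:
    failed = True
    print("ValueError:", err)
```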

In [52]:
a * 3 # Multiplication by a scalar

array([[ 3,  6,  9],
       [12, 15, 18]])

In [57]:
a.shape
z = np.array([100, 200, 300]) # z is a 1-D array of 3 elements: shape = (3,)
z.shape
# z is broadcast to a 2-D array of shape (2, 3) whose two rows are both equal to z
a*z

# equivalent to
z2 = np.array([[100, 200, 300], [100, 200, 300]])
z2.shape
a*z2

(2, 3)

(3,)

array([[ 100,  400,  900],
       [ 400, 1000, 1800]])

(2, 3)

array([[ 100,  400,  900],
       [ 400, 1000, 1800]])

In [66]:
a = np.arange(0, 31, 10)
b = np.arange(1, 4, 1)
a
b

array([ 0, 10, 20, 30])

array([1, 2, 3])

In [67]:
# a has shape (4,); a[:, np.newaxis] has shape (4, 1)
# (4, 1) + (3,) => (4, 3)
# a[:, np.newaxis] is stretched to (4, 3): 3 columns of 0, 10, 20, 30
# b is stretched to (4, 3): 4 rows of 1, 2, 3
# The sum is:
#  1  2  3
# 11 12 13
# 21 22 23
# 31 32 33
a[:, np.newaxis] + b

array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23],
       [31, 32, 33]])
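
To see the two stretched arrays explicitly, one option is np.broadcast_to, which returns a read-only view with the requested shape. This is only an illustration of what broadcasting does conceptually. The names `col`, `a_full` and `b_full` are introduced here.

```python
import numpy as np

a = np.arange(0, 31, 10)     # shape (4,)
b = np.arange(1, 4, 1)       # shape (3,)
col = a[:, np.newaxis]       # shape (4, 1)

# Explicitly stretched versions of both operands
a_full = np.broadcast_to(col, (4, 3))   # 3 columns of 0, 10, 20, 30
b_full = np.broadcast_to(b, (4, 3))     # 4 rows of 1, 2, 3

print(a_full)
print(b_full)
print(np.array_equal(a_full + b_full, col + b))  # True
```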

In [87]:
# Matrix multiplication can be performed using @
v = np.arange(1, 13, 1).reshape((12, 1))
w = v.copy()
z = v.T @ w # (1, 12) @ (12, 1) returns a 2-D array of shape (1, 1)
z
type(z)
z.shape
zscalar = z.item() # extracts the Python scalar from a size-1 ndarray
zscalar
type(zscalar)

array([[650]])

numpy.ndarray

(1, 1)

650

int

In [88]:
A = v @ w.T
A.shape
A.size
A

(12, 12)

144

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])
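
The two products above can be cross-checked against element-wise operations. The sketch below assumes the same `v` and `w`. The names `inner`, `elementwise_sum` and `outer` are chosen here for illustration.

```python
import numpy as np

v = np.arange(1, 13).reshape((12, 1))
w = v.copy()

inner = (v.T @ w).item()               # (1, 12) @ (12, 1) -> (1, 1), then a scalar
elementwise_sum = int(np.sum(v * w))   # same value: sum of element-wise products
outer = v @ w.T                        # (12, 1) @ (1, 12) -> (12, 12)

print(inner, elementwise_sum)                 # 650 650
print(np.array_equal(outer, np.outer(v, w)))  # True
```

Note that `*` stays element-wise (with broadcasting), while `@` follows the usual row-by-column rule of matrix multiplication.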