NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering.

The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional ndarray, and a large library of functions that operate efficiently on these data structures.

How to import NumPy

In [1]:
import numpy as np

Reading the example code


In [2]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

In [3]:
a.shape

(2, 3)

NumPy arrays have some restrictions.

    All elements of the array must be of the same type of data.

    Once created, the total size of the array can’t change.

    The shape must be “rectangular”, not “jagged”; e.g., each row of a two-dimensional array must have the 
    same number of columns.
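
The three restrictions above can be checked with a short sketch. The variable names are illustrative only, and the jagged case passes dtype=int explicitly so that the failure shows up as a ValueError.

```python
import numpy as np

# Mixed input types are coerced to a single common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# "Growing" an array really builds a new one; the original keeps its size.
small = np.array([1, 2, 3])
bigger = np.append(small, 4)
print(small.size, bigger.size)  # 3 4

# Rows of unequal length cannot form a rectangular integer array.
try:
    np.array([[1, 2, 3], [4, 5]], dtype=int)
    jagged_ok = True
except ValueError:
    jagged_ok = False
print(jagged_ok)  # False
```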

Note

As with built-in Python sequences, NumPy arrays are “0-indexed”: the first element of the array is accessed using index 0, not 1.

In [4]:
a = np.array([1, 2, 3, 4, 5, 6])
a

array([1, 2, 3, 4, 5, 6])

In [5]:
a[0]

1

In [6]:
a[6]

IndexError: index 6 is out of bounds for axis 0 with size 6

In [None]:
a[6] = 7  # raises IndexError: the array has 6 elements, and assignment cannot grow it

In [None]:
a[0] = 10
a

One major difference is that slice indexing of a list copies the elements into a new list, but slicing an array returns a view: an object that refers to the data in the original array. The original array can be mutated using the view.

In [7]:
b = a[3:]
b

array([4, 5, 6])

In [8]:
id(b) == id(a[3:])

False

In [9]:
b[0] = 40
a

array([ 1,  2,  3, 40,  5,  6])

In [10]:
id(b[:]) == id(a[3:])

True

In [11]:
b

array([40,  5,  6])

In [12]:
a[3:]

array([40,  5,  6])

In [13]:
id(b[1]) == id(a[3])

True

In [14]:
id(a[3])

140444629159696

In [15]:
id(b[0])

140444629158736

In [16]:
id(b[1])

140444629160048

In [17]:
id(a[4])

140444629156688

In [18]:
id(b[0]) == id(a[4])

True

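These id() comparisons are not a reliable test for shared data. Each indexing expression such as a[3] or b[0] creates a new, temporary NumPy scalar object; once it is discarded, CPython can reuse its memory address for the next temporary. That is why id(b[0]) == id(a[4]) can evaluate to True even though the ids printed one at a time all differ. Because b is a view of a[3:], b[0] refers to the same data as a[3] (not a[4]), which is what the assignment b[0] = 40 changing a[3] demonstrates. To check whether two arrays share data, use np.shares_memory(a, b) or test whether b.base is a.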
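
A more dependable way to see the view relationship is sketched below. It rebuilds the original array, and the extra name c is introduced only for contrast with an explicit copy.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = a[3:]          # basic slicing returns a view of a
c = a[3:].copy()   # an explicit copy owns its own data

print(np.shares_memory(a, b))  # True
print(b.base is a)             # True
print(np.shares_memory(a, c))  # False

b[0] = 40          # writes through to a[3]
print(a)           # [ 1  2  3 40  5  6]
```

Unlike id(), np.shares_memory inspects the underlying data buffers, so it does not depend on how the interpreter happens to reuse object addresses.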

In [19]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [20]:
a[1, 3]

8

Another difference between an array and a list of lists is that an element of the array can be accessed by specifying the index along each axis within a single set of square brackets, separated by commas. For instance, the element 8 is in row 1 and column 3.
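
As a small side-by-side sketch (the name nested is illustrative), the same element can be reached through chained brackets on a list of lists and through a single bracket pair on the array. The comma form also extends to whole columns, which nested lists cannot do directly.

```python
import numpy as np

nested = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
a = np.array(nested)

print(nested[1][3])  # 8  (list of lists: one bracket pair per level)
print(a[1, 3])       # 8  (array: one index per axis in a single bracket pair)
print(a[1][3])       # 8  (also works, but first builds the intermediate row a[1])
print(a[:, 3])       # [ 4  8 12]  (all rows, column 3)
```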

Array attributes

This section covers the ndim, shape, size, and dtype attributes of an array.

The number of dimensions of an array is contained in the ndim attribute.

In [21]:
a.ndim

2

The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension.

In [22]:
a.shape

(3, 4)

In [23]:
len(a.shape) == a.ndim

True

The fixed, total number of elements in an array is contained in the size attribute.

In [24]:
a.size

12

In [25]:
import math
a.size == math.prod(a.shape)

True

Arrays are typically “homogeneous”, meaning that they contain elements of only one “data type”. The data type is recorded in the dtype attribute.

In [26]:
a.dtype

dtype('int64')

How to create a basic array

This section covers np.zeros(), np.ones(), np.empty(), np.arange(), and np.linspace().

Besides creating an array from a sequence of elements, you can easily create an array filled with 0’s:

In [27]:
np.zeros(2)

array([0., 0.])

Or an array filled with 1’s:

In [28]:
np.ones(2)

array([1., 1.])

Or even an empty array! The function empty creates an array whose initial content is arbitrary and depends on the state of the memory, so the values shown below may differ on your machine. The reason to use empty over zeros (or something similar) is speed; just make sure to fill every element afterwards!

In [29]:
np.empty(2)

array([1., 1.])

You can create an array with a range of elements

In [30]:
np.arange(4)

array([0, 1, 2, 3])

You can also use np.linspace() to create an array with values that are spaced linearly in a specified interval

In [31]:
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
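
A few variations on the creation functions above, as a sketch with illustrative variable names: np.arange with a start, stop, and step; np.linspace with the endpoint excluded; and np.empty followed by an explicit fill.

```python
import numpy as np

# arange(start, stop, step): the stop value itself is excluded.
evens = np.arange(2, 9, 2)
print(evens)      # [2 4 6 8]

# linspace includes the endpoint by default; endpoint=False leaves it out.
half_open = np.linspace(0, 10, num=5, endpoint=False)
print(half_open)  # [0. 2. 4. 6. 8.]

# empty() leaves arbitrary values in place, so assign every element before use.
buf = np.empty(3)
buf[:] = 7.0
print(buf)        # [7. 7. 7.]
```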

Adding, removing, and sorting elements

In [32]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

In [33]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])

In [574]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [575]:
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [581]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

In [582]:
np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

Can you reshape an array?

In [583]:
a = np.arange(6)

In [584]:
b = a.reshape(3, 2)
b

array([[0, 1],
       [2, 3],
       [4, 5]])

With np.reshape, you can specify a few optional parameters:

In [40]:
# Pass the target shape positionally; the newshape= keyword is deprecated in recent NumPy releases.
np.reshape(a, (1, 6), order='C')

array([[0, 1, 2, 3, 4, 5]])
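
To make the effect of the order parameter concrete, here is a minimal sketch comparing C-like (row-major) and Fortran-like (column-major) filling for the same six elements. It also shows -1, which asks NumPy to infer one dimension from the total size.

```python
import numpy as np

a = np.arange(6)

c_order = np.reshape(a, (3, 2), order='C')  # fill row by row
f_order = np.reshape(a, (3, 2), order='F')  # fill column by column

print(c_order.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(f_order.tolist())  # [[0, 3], [1, 4], [2, 5]]

# -1 lets NumPy work out the missing dimension (6 elements / 2 rows = 3 columns).
inferred = a.reshape(2, -1)
print(inferred.shape)    # (2, 3)
```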

How to convert a 1D array into a 2D array (how to add a new axis to an array)

You can use np.newaxis and np.expand_dims to increase the dimensions of your existing array.

In [41]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [42]:
a2 = a[np.newaxis, :]
a2.shape

(1, 6)

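Placing np.newaxis in the first position, as above, turns the 1D array into a row vector. Placing it in the second position produces a column vector: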

In [43]:
col_vector = a[:, np.newaxis]
col_vector.shape

(6, 1)

You can also expand an array by inserting a new axis at a specified position with np.expand_dims

In [44]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [45]:
b = np.expand_dims(a, axis=1)
b.shape

(6, 1)

In [46]:
c = np.expand_dims(a, axis=0)
c.shape

(1, 6)

Indexing and slicing

In [47]:
data = np.array([1, 2, 3])

In [48]:
data[1]

2

In [49]:
data[0:2]

array([1, 2])

In [50]:
data[1:]

array([2, 3])

In [51]:
data[-2:]

array([2, 3])

In [52]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [53]:
print(a[a < 5])

[1 2 3 4]


In [54]:
five_up = (a >= 5)
print(a[five_up])

[ 5  6  7  8  9 10 11 12]


In [55]:
divisible_by_2 = a[a%2==0]
print(divisible_by_2)

[ 2  4  6  8 10 12]


In [56]:
c = a[(a > 2) & (a < 11)]
print(c)

[ 3  4  5  6  7  8  9 10]


In [161]:
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]


In [162]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [166]:
b = np.nonzero(a < 5)
print(b)

(array([0, 0, 0, 0]), array([0, 1, 2, 3]))


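As the example above shows, you can use np.nonzero() to get the indices of elements that satisfy a condition, for example, those that are less than 5. It returns a tuple of arrays, one for each dimension.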

The first array represents the row indices where these values are found, and the second array represents the column indices where the values are found.
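
The row and column index arrays can be paired up into coordinates, or passed straight back into the array to retrieve the matching values. The sketch below uses illustrative names (rows, cols, coords).

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

rows, cols = np.nonzero(a < 5)

# Pair each row index with its column index to get (row, column) coordinates.
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)         # [(0, 0), (0, 1), (0, 2), (0, 3)]

# Indexing with both arrays returns the elements at those coordinates.
print(a[rows, cols])  # [1 2 3 4]
```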

How to create an array from existing data

In [180]:
a = np.array([1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [194]:
arr1 = a[3:8]
arr1

array([4, 5, 6, 7, 8])

In [195]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3],
               [4, 4]])

In [196]:
np.vstack((a1, a2))

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

In [197]:
np.hstack((a1, a2))

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])

In [208]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3, 4]])

In [257]:
x = np.arange(1, 25).reshape(2, 12)
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [281]:
np.hsplit(x, 3)

[array([[ 1,  2,  3,  4],
        [13, 14, 15, 16]]),
 array([[ 5,  6,  7,  8],
        [17, 18, 19, 20]]),
 array([[ 9, 10, 11, 12],
        [21, 22, 23, 24]])]

In [309]:
np.hsplit(x, (3, 4))

[array([[ 1,  2,  3],
        [13, 14, 15]]),
 array([[ 4],
        [16]]),
 array([[ 5,  6,  7,  8,  9, 10, 11, 12],
        [17, 18, 19, 20, 21, 22, 23, 24]])]

Using the copy method will make a complete copy of the array and its data (a deep copy). To use this on your array, you could run:

In [311]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [314]:
b2 = a.copy()

In [320]:
b2

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
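
One way to confirm that the copy is independent is to modify it and check the original, then contrast that with a view. This is a sketch; the name view is introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

b2 = a.copy()     # independent data
view = a[0, :]    # a view onto the first row of a

b2[0, 0] = 99     # changing the copy leaves a alone
print(a[0, 0])    # 1

view[0] = -1      # changing the view changes a
print(a[0, 0])    # -1

print(np.shares_memory(a, b2))  # False
```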

Basic array operations

In [353]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)
data + ones

array([2, 3])

In [354]:
data

array([1, 2])

In [355]:
ones

array([1, 1])

In [356]:
data.shape

(2,)

In [361]:
ones.shape

(2,)

In [365]:
ones.ndim

1

In [366]:
data - ones

array([0, 1])

In [374]:
data * data

array([1, 4])

In [383]:
data / data

array([1., 1.])

In [384]:
a = np.array([1, 2, 3, 4])

a.sum()

10

In [385]:
b = np.array([[1, 1], [2, 2]])

In [386]:
b.sum(axis=0)

array([3, 3])

In [401]:
b.sum(axis=1)

array([2, 4])

Broadcasting

In [428]:
data = np.array([1.0, 2.0])
data * 1.6

array([1.6, 3.2])
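
Broadcasting is not limited to scalars. As a sketch with illustrative names, a 1D array can be stretched across the rows of a 2D array, and a column can be stretched across its columns, provided the trailing dimensions either match or equal 1.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,): applied to every row
col = np.array([[100], [200]])             # shape (2, 1): applied to every column

print((matrix + row).tolist())  # [[11, 22, 33], [14, 25, 36]]
print((matrix + col).tolist())  # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) do not line up on the last axis, so this fails.
try:
    matrix + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```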

How to get unique items and counts

In [429]:
a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20])

In [430]:
unique_values = np.unique(a)
print(unique_values)

[11 12 13 14 15 16 17 18 19 20]


In [431]:
unique_values, indices_list = np.unique(a, return_index=True)
print(indices_list)

[ 0  2  3  4  5  6  7 12 13 14]


In [432]:
unique_values, occurrence_count = np.unique(a, return_counts=True)
print(occurrence_count)

[3 2 2 2 1 1 1 1 1 1]


In [435]:
a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])

In [457]:
unique_values = np.unique(a_2d)
print(unique_values)

[ 1  2  3  4  5  6  7  8  9 10 11 12]


In [462]:
unique_rows = np.unique(a_2d, axis=0)
print(unique_rows)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [463]:
unique_rows, indices, occurrence_count = np.unique(
     a_2d, axis=0, return_counts=True, return_index=True)
print(unique_rows)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [466]:
print(indices)

[0 1 2]


In [476]:
print(occurrence_count)

[2 1 1]


Transposing

In [482]:
arr = np.arange(6).reshape((2, 3))
arr

array([[0, 1, 2],
       [3, 4, 5]])

In [483]:
arr.transpose()

array([[0, 3],
       [1, 4],
       [2, 5]])

In [484]:
arr.T

array([[0, 3],
       [1, 4],
       [2, 5]])

How to reverse an array

In [488]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

In [500]:
reversed_arr = np.flip(arr)
print('Reversed Array: ', reversed_arr)

Reversed Array:  [8 7 6 5 4 3 2 1]


In [501]:
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [527]:
reversed_arr = np.flip(arr_2d)
print(reversed_arr)

[[12 11 10  9]
 [ 8  7  6  5]
 [ 4  3  2  1]]


In [528]:
reversed_arr_rows = np.flip(arr_2d, axis=0)
print(reversed_arr_rows)

[[ 9 10 11 12]
 [ 5  6  7  8]
 [ 1  2  3  4]]


In [529]:
reversed_arr_columns = np.flip(arr_2d, axis=1)
print(reversed_arr_columns)

[[ 4  3  2  1]
 [ 8  7  6  5]
 [12 11 10  9]]


In [530]:
arr_2d[1] = np.flip(arr_2d[1])
print(arr_2d)

[[ 1  2  3  4]
 [ 8  7  6  5]
 [ 9 10 11 12]]


Reshaping and flattening multidimensional arrays

In [535]:
x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

There are two popular ways to flatten an array: .flatten() and .ravel(). The primary difference between the two is that flatten() always returns a copy, while the new array created using ravel() is, whenever possible, a reference to the parent array (i.e., a “view”). This means that any changes to the new array will affect the parent array as well. Since ravel does not create a copy in that case, it’s memory efficient.

In [536]:
x.flatten()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [537]:
a1 = x.flatten()
a1[0] = 99
print(x)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [565]:
print(a1)

[99  2  3  4  5  6  7  8  9 10 11 12]


In [566]:
a2 = x.ravel()
a2[0] = 98
print(x)

[[98  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [567]:
print(a2)

[98  2  3  4  5  6  7  8  9 10 11 12]
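
The difference can also be checked without modifying anything, using np.shares_memory. The sketch below adds one caveat: ravel() can only return a view when the flattened result can be read straight out of the existing memory layout. For a transposed array it has to fall back to a copy.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print(np.shares_memory(x, x.ravel()))    # True: ravel returned a view
print(np.shares_memory(x, x.flatten()))  # False: flatten always copies

# x.T is not stored row by row, so flattening it in C order needs a copy.
print(np.shares_memory(x, x.T.ravel()))  # False
```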
