In [1]:
import numpy as np

### The NumPy Array Object
A full list of attributes with descriptions is available in the ndarray docstring, which can be accessed by calling `help(np.ndarray)` in the Python interpreter or `np.ndarray?` in an IPython console.
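As a quick illustration, the following sketch prints a handful of commonly used ndarray attributes for a small array. The dtype is set explicitly to `np.int64` here (an assumption made only so the byte counts are the same on every platform).

```python
import numpy as np

# A 3x2 array with an explicit 8-byte integer type
data = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int64)

print(data.shape)     # (3, 2)  number of elements along each axis
print(data.size)      # 6       total number of elements
print(data.ndim)      # 2       number of axes
print(data.itemsize)  # 8       bytes per element
print(data.nbytes)    # 48      total bytes used by the data
print(data.dtype)     # int64   element data type
```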

<img src="images/ndarray.png" width="500">

In [3]:
data = np.array([[1, 2], [3, 4], [5, 6]])
data

array([[1, 2],
       [3, 4],
       [5, 6]])

In [4]:
data.shape

(3, 2)

In [5]:
data = np.array([1, 2, 3], dtype=float)
data

array([1., 2., 3.])

In [8]:
data.ndim

1

In [9]:
data.shape

(3,)

In [6]:
np.sqrt(np.array([-1, 0, 1]))

array([nan,  0.,  1.])

In [7]:
np.sqrt(np.array([-1, 0, 1], dtype=complex))

array([0.+1.j, 0.+0.j, 1.+0.j])

***
### Order of Array Data in Memory
Multidimensional arrays are stored as contiguous data in memory, and there is some freedom in how the elements are arranged in that memory segment. Consider a two-dimensional array with rows and columns: one way to store it as a consecutive sequence of values is to store the rows one after another, and an equally valid way is to store the columns one after another. The former is called row-major format, and the latter column-major format. Which one to use is a matter of convention: the C programming language uses row-major format, and Fortran uses column-major format. A NumPy array can be stored in row-major format with the keyword argument `order='C'`, or in column-major format with `order='F'`, when the array is created or reshaped. The default format is row-major. The 'C' or 'F' ordering of a NumPy array is particularly relevant when the array is passed to software written in C or Fortran, which is often required in numerical computing with Python.
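A minimal sketch of the two layouts, assuming a small 2 × 3 example array (the variable names are illustrative only). Both arrays hold the same logical values; only the arrangement in memory differs, which can be inspected through the `flags` attribute and by reading the elements in memory order with `ravel(order="K")`.

```python
import numpy as np

values = [[1, 2, 3], [4, 5, 6]]
c_arr = np.array(values, dtype=np.int32, order="C")  # row-major
f_arr = np.array(values, dtype=np.int32, order="F")  # column-major

# Same logical content, regardless of memory layout
print(np.array_equal(c_arr, f_arr))        # True

# The flags attribute reports which layout each array uses
print(c_arr.flags["C_CONTIGUOUS"], c_arr.flags["F_CONTIGUOUS"])  # True False
print(f_arr.flags["C_CONTIGUOUS"], f_arr.flags["F_CONTIGUOUS"])  # False True

# order="K" reads the elements in the order they occur in memory
print(c_arr.ravel(order="K"))  # [1 2 3 4 5 6]
print(f_arr.ravel(order="K"))  # [1 4 2 5 3 6]
```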

Row-major and column-major ordering are special cases of strategies for mapping the index used to address an element to the offset for the element in the array’s memory segment. In general, the NumPy array attribute `ndarray.strides` defines exactly how this mapping is done. The `strides` attribute is a tuple of the same length as the number of axes (dimensions) of the array. Each value in `strides` is the factor by which the index for the corresponding axis is multiplied when calculating the memory offset (in bytes) for a given index expression.

For example, consider a C-order array A with shape (2, 3), that is, a two-dimensional array with two elements along the first dimension and three along the second. If the data type is int32, each element uses 4 bytes, so the total memory buffer for the array uses 2 × 3 × 4 = 24 bytes. The strides attribute of this array is (4 × 3, 4 × 1) = (12, 4): each increment of m in A[n, m] increases the memory offset by one item, or 4 bytes, and each increment of n increases it by three items, or 12 bytes (because the second dimension of the array has length 3). If the same array were instead stored in 'F' order, the strides would be (4, 8).

Using strides to describe the mapping from array index to memory offset is clever because it can express different mapping strategies, and many common array operations, such as the transpose, can be implemented by simply changing the strides attribute, which eliminates the need to move data around in memory. Operations that only require changing the strides attribute result in new ndarray objects that refer to the same data as the original array. Such arrays are called views. For efficiency, NumPy strives to create views rather than copies when applying operations to arrays. This is generally a good thing, but it is important to know which array operations return views rather than new independent arrays, because modifying the data of a view also modifies the data of the original array. Several examples of this behavior are presented later in this chapter.
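The stride arithmetic in this example can be checked directly. The sketch below assumes the (2, 3) int32 array is filled with the values 0 to 5 (the contents are an arbitrary choice) and uses `np.asfortranarray` to obtain the 'F'-ordered counterpart.

```python
import numpy as np

A = np.arange(6, dtype=np.int32).reshape(2, 3)  # C order by default
F = np.asfortranarray(A)                        # same values, F order

print(A.strides)  # (12, 4)
print(F.strides)  # (4, 8)

# Byte offset of A[n, m] is n * strides[0] + m * strides[1]
n, m = 1, 2
offset = n * A.strides[0] + m * A.strides[1]

# Read the int32 stored at that offset in the raw buffer of A
value = np.frombuffer(A.tobytes(), dtype=np.int32, count=1, offset=offset)[0]
print(offset, value, A[n, m])  # 20 5 5

# The transpose only swaps the strides; no data is moved
T = A.T
print(T.strides)               # (4, 12)
print(np.shares_memory(A, T))  # True
```

The last two lines show the transpose as a view: its strides are those of `A` in reverse order, and it refers to the same memory buffer.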

### Creating Arrays
The previous section looked at NumPy’s basic data structure for representing arrays, the ndarray class and the basic attributes of this class. This section focuses on functions from the NumPy library that can create ndarray instances.

Arrays can be generated in several ways, depending on their properties and the applications they are used for. For example, as we saw in the previous section, one way to initialize an ndarray instance is to apply the `np.array` function to an explicitly defined Python list. However, this method is practical only for small arrays. In many situations, it is necessary to generate arrays whose elements follow some given rule, such as constant values, increasing integers, uniformly spaced numbers, or random numbers. In other cases, we might need to create arrays from data stored in a file. The requirements are many and varied, and the NumPy library provides a comprehensive set of functions for creating arrays in a variety of ways. This section looks at many of these functions in more detail. For a complete list, see the NumPy reference manual or the available docstrings by typing `help(np)` or using the autocompletion `np.<TAB>`. A summary of frequently used array-generating functions is given in Table 2-3.

Table 2-3. Summary of NumPy Functions for Generating Arrays

<img src="images/numpy-generating-function.png" width="600" height="900">

#### Arrays Filled with Constant Values

In [10]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [11]:
np.ones(4)

array([1., 1., 1., 1.])

In [12]:
np.full(10, 5.4)

array([5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4])

In [13]:
x1 = np.empty(5)
x1.fill(3.0)
x1

array([3., 3., 3., 3., 3.])

In [14]:
x2 = np.full(5, 3.0)
x2

array([3., 3., 3., 3., 3.])

#### Arrays Filled with Incremental Sequences

In [15]:
np.arange(0.0, 11, 1)

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [16]:
np.linspace(0, 10, 11)

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

#### Creating Uninitialized Arrays

In [17]:
np.empty(3, dtype=float)

array([1., 0., 0.])

#### Creating Arrays with Properties of Other Arrays

In [18]:
def f(x):
    y = np.ones_like(x)
    # compute with x and y
    return y

#### Creating Matrix Arrays

In [19]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [20]:
np.eye(3, k=1)

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

In [21]:
np.eye(3, k=-1)

array([[0., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.]])

In [22]:
np.diag(np.arange(0, 20, 5))

array([[ 0,  0,  0,  0],
       [ 0,  5,  0,  0],
       [ 0,  0, 10,  0],
       [ 0,  0,  0, 15]])

***
## Indexing and Slicing
***

In [24]:
a = np.arange(0, 11)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [39]:
a[-1] # the last element

np.int64(10)

In [30]:
a[1:-1]

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [31]:
a[1:-1:2]

array([1, 3, 5, 7, 9])

In [32]:
a[:5]

array([0, 1, 2, 3, 4])

In [33]:
a[-5:]

array([ 6,  7,  8,  9, 10])

In [34]:
a[::-2]

array([10,  8,  6,  4,  2,  0])

***
## Multidimensional Arrays

In [36]:
f = lambda m, n: n + 10 * m
A = np.fromfunction(f, (6, 6), dtype=int)
A

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [37]:
A[:, 1]  # the second column

array([ 1, 11, 21, 31, 41, 51])

In [38]:
A[1, :]  # the second row

array([10, 11, 12, 13, 14, 15])

In [40]:
A[:3, :3]  # upper left diagonal block matrix

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22]])

In [41]:
A[3:, :3]  # lower left off-diagonal block matrix

array([[30, 31, 32],
       [40, 41, 42],
       [50, 51, 52]])

In [42]:
A[::2, ::2]  # every second element starting from 0, 0

array([[ 0,  2,  4],
       [20, 22, 24],
       [40, 42, 44]])

In [43]:
A[1::2, 1::3]  # every second row and every third column, starting from 1, 1

array([[11, 14],
       [31, 34],
       [51, 54]])

***
## Views
Subarrays extracted from arrays using slice operations are alternative views of the same underlying array data. More specifically, they are arrays that refer to the same data in the memory as the original array, but with a different strides configuration. When elements in a view are assigned new values, the original array’s values are updated.

In [45]:
B = A[1:5, 1:5]
B

array([[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34],
       [41, 42, 43, 44]])

In [47]:
B[:, :] = 0
A

array([[ 0,  1,  2,  3,  4,  5],
       [10,  0,  0,  0,  0, 15],
       [20,  0,  0,  0,  0, 25],
       [30,  0,  0,  0,  0, 35],
       [40,  0,  0,  0,  0, 45],
       [50, 51, 52, 53, 54, 55]])

In [49]:
C = B[1:3, 1:3].copy()
C

array([[0, 0],
       [0, 0]])

In [50]:
C[:, :] = 1  # this does not affect B since C is a copy of the view B[1:3, 1:3]
C

array([[1, 1],
       [1, 1]])

In [51]:
B

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
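
Whether two arrays refer to the same memory can also be checked programmatically. The following sketch uses `np.shares_memory` on a freshly created 6 × 6 array (its contents are an arbitrary choice for illustration) and compares strides to show that a slice is the same buffer read with a different strides configuration.

```python
import numpy as np

A = np.arange(36).reshape(6, 6)
B = A[1:5, 1:5]         # slice: a view of A
C = B[1:3, 1:3].copy()  # explicit copy: independent data
D = A[::2, ::2]         # strided slice: also a view of A

print(np.shares_memory(A, B))  # True
print(np.shares_memory(B, C))  # False

# A unit-step slice keeps the strides of A; a step of 2 doubles them
print(B.strides == A.strides)                             # True
print(D.strides == (2 * A.strides[0], 2 * A.strides[1]))  # True
```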

***
## Fancy Indexing and Boolean-Valued Indexing
With fancy indexing, an array can be indexed with another NumPy array, a Python list, or a sequence of integers whose values select elements in the indexed array.

In [54]:
A = np.linspace(0, 1, 11)
A

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [55]:
A[np.array([0, 2, 4])]

array([0. , 0.2, 0.4])

In [56]:
A[[0, 2, 4]]  # The same thing can be accomplished by indexing with a Python list

array([0. , 0.2, 0.4])

Another variant of indexing NumPy arrays is to use Boolean-valued index arrays. In this case, each element (with value True or False) indicates whether or not to select the element with the corresponding index from the indexed array. That is, if element n in the Boolean index array is True, then element n is selected from the indexed array, and if it is False, element n is not selected. This indexing method is handy for filtering elements of an array. For example, to select all the elements of the array A (defined above) that exceed the value 0.5, we can combine a comparison operator applied to a NumPy array with indexing by the resulting Boolean-valued array.

In [57]:
A > 0.5

array([False, False, False, False, False, False,  True,  True,  True,
        True,  True])

In [58]:
A[A > 0.5]

array([0.6, 0.7, 0.8, 0.9, 1. ])
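
The selection rule can be spelled out with an explicit loop. This is only a sketch to illustrate the semantics (the names `mask` and `manual` are illustrative); in practice the Boolean index expression is the idiomatic and much faster form.

```python
import numpy as np

A = np.linspace(0, 1, 11)
mask = A > 0.5  # Boolean array with the same shape as A

# Keep element n of A exactly when element n of the mask is True
manual = np.array([A[n] for n in range(len(A)) if mask[n]])

print(np.array_equal(manual, A[mask]))  # True
```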

In [59]:
A = np.arange(10)
indices = [2, 4, 6]
B = A[indices]
B[0] = -1  # this does not affect A
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [60]:
A[indices] = -1  # this alters A
A

array([ 0,  1, -1,  3, -1,  5, -1,  7,  8,  9])

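The same holds for Boolean-valued indexing: the indexed result is a new independent array rather than a view, so modifying it leaves the original array unchanged, whereas assigning through the index expression modifies the original array.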

In [61]:
A = np.arange(10)
B = A[A>5]
B[0] = -1 # this does not affect A
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [62]:
A[A > 5] = -1  # this alters A
A

array([ 0,  1,  2,  3,  4,  5, -1, -1, -1, -1])
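
The difference between slicing on the one hand and fancy or Boolean-valued indexing on the other can be confirmed with `np.shares_memory`. A minimal sketch, with illustrative variable names:

```python
import numpy as np

A = np.arange(10)

sliced = A[2:7:2]     # slice: a view of A
fancy = A[[2, 4, 6]]  # integer-list index: a new array
masked = A[A > 5]     # Boolean mask: a new array

print(np.shares_memory(A, sliced))  # True
print(np.shares_memory(A, fancy))   # False
print(np.shares_memory(A, masked))  # False
```

Here `sliced` and `fancy` select the same three elements, but only the slice refers to the memory of `A`.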

<img src="images/array_slicing.png" width="800" height="600">