# Arithmetic with NumPy Arrays


Arrays are important because they enable you to express batch operations on data
without writing any for loops. NumPy users call this vectorization. Any arithmetic
operation between equal-size arrays applies the operation element-wise:

In [1]:
import numpy as np
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [2]:
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [3]:
arr + arr

array([[ 2.,  4.,  6.],
       [ 8., 10., 12.]])

In [4]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])
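
As a rough illustration of what vectorization replaces, the sketch below squares every element twice: once with explicit Python loops and once with a single array expression. The names `values`, `looped`, and `vectorized` are made up for this example and are not part of the notebook above.

```python
import numpy as np

values = np.array([[1., 2., 3.], [4., 5., 6.]])

# Explicit loops: visit every (row, column) position by hand.
looped = np.empty_like(values)
for i in range(values.shape[0]):
    for j in range(values.shape[1]):
        looped[i, j] = values[i, j] * values[i, j]

# Vectorized: one expression, applied element-wise by NumPy.
vectorized = values * values

print(np.array_equal(looped, vectorized))
```

Both versions produce the same array; the vectorized form is shorter and avoids the per-element Python overhead.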

Arithmetic operations with scalars propagate the scalar argument to each element in
the array:

In [5]:
1 / arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [6]:
arr ** 0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

Comparisons between arrays of the same size yield boolean arrays:

In [7]:
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])
arr2

array([[ 0.,  4.,  1.],
       [ 7.,  2., 12.]])

In [8]:
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [9]:
arr2 > arr

array([[False,  True, False],
       [ True, False,  True]])

Operations between differently sized arrays are called broadcasting and will be discussed
in more detail in Appendix A. Having a deep understanding of broadcasting is
not necessary for most of this book.
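
For a first impression of broadcasting, here is a minimal sketch, assuming a two-dimensional array combined with a one-dimensional array whose length matches the number of columns. The names `grid` and `row_offsets` are illustrative only.

```python
import numpy as np

grid = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
row_offsets = np.array([10., 20., 30.])         # shape (3,)

# The trailing dimensions match, so the 1-D array is added to every row.
shifted = grid + row_offsets
print(shifted)
```

Each row of `grid` receives the same three offsets, and the result keeps the shape of the larger array.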

# Basic Indexing and Slicing

NumPy array indexing is a rich topic, as there are many ways you may want to select
a subset of your data or individual elements. One-dimensional arrays are simple; on
the surface they act similarly to Python lists:

In [10]:
arr = np.arange(10)

In [11]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]:
arr[5]

5

In [13]:
arr[5:8]

array([5, 6, 7])

In [14]:
arr[5:8] = 12

In [15]:
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

As you can see, if you assign a scalar value to a slice, as in arr[5:8] = 12, the value is
propagated (or broadcast henceforth) to the entire selection. An important first distinction
from Python’s built-in lists is that array slices are views on the original array.
This means that the data is not copied, and any modifications to the view will be
reflected in the source array.

In [16]:
#To give an example of this, I first create a slice of arr:
arr_slice = arr[5:8]
arr_slice


array([12, 12, 12])

In [17]:
arr_slice[1] = 12345

In [18]:
arr

array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,
           9])

The “bare” slice [:] will assign to all values in an array:

In [19]:
arr_slice[:] = 64
arr

array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])

If you are new to NumPy, you might be surprised by this, especially if you have used
other array programming languages that copy data more eagerly. As NumPy has been
designed to be able to work with very large arrays, you could imagine performance
and memory problems if NumPy insisted on always copying data.

If you want a copy of a slice of an ndarray instead of a view, you
will need to explicitly copy the array—for example,
arr[5:8].copy().
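
The difference between a view and an explicit copy can be checked directly. The sketch below uses `np.shares_memory` for that; the names `base`, `view`, and `copied` are chosen for this example.

```python
import numpy as np

base = np.arange(10)

view = base[5:8]            # a slice: shares memory with base
copied = base[5:8].copy()   # an explicit copy: owns its own data

view[0] = -1     # visible through base
copied[1] = -2   # not visible through base

print(base)
print(np.shares_memory(base, view))
print(np.shares_memory(base, copied))
```

Only the write through `view` shows up in `base`; the write to `copied` leaves it alone.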

With higher dimensional arrays, you have many more options. In a two-dimensional
array, the elements at each index are no longer scalars but rather one-dimensional
arrays:

In [20]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Thus, individual elements can be accessed recursively. But that is a bit too much
work, so you can pass a comma-separated list of indices to select individual elements.
So these are equivalent:

In [21]:
arr2d[0][2]

3

In [22]:
arr2d[0,2]

3

In multidimensional arrays, if you omit later indices, the returned object will be a
lower dimensional ndarray consisting of all the data along the higher dimensions. So
in the 2 × 2 × 3 array arr3d:

In [23]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [24]:
arr3d[0] #arr3d[0] is a 2 × 3 array

array([[1, 2, 3],
       [4, 5, 6]])

In [25]:
#Both scalar values and arrays can be assigned to arr3d[0]:
old_values = arr3d[0].copy()
arr3d[0] = 42
arr3d

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [26]:
arr3d[0] = old_values
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [27]:
#Similarly, arr3d[1, 0] gives you all of the values whose indices start with (1, 0),
#forming a 1-dimensional array:
arr3d[1, 0]

array([7, 8, 9])

In [28]:
x = arr3d[1]
x


array([[ 7,  8,  9],
       [10, 11, 12]])

In [29]:
x[0]

array([7, 8, 9])

Note that in all of these cases where subsections of the array have been selected, the
returned arrays are views.

# Indexing with slices

In [30]:
#Slicing means extracting part of the array, e.g. arrayname[start:stop:step]

In [31]:
arr

array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])

In [32]:
arr[2:3]

array([2])

In [33]:
arr[2::-1] #start at index 2 and step backward (step -1) to the beginning of the array

array([2, 1, 0])

In [34]:
arr[::-1]

array([ 9,  8, 64, 64, 64,  4,  3,  2,  1,  0])
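
The examples above use a step of -1 only. As a small sketch of the general start:stop:step form, the following uses a fresh array named `seq` (an illustrative name) with steps other than 1 and -1.

```python
import numpy as np

seq = np.arange(10)

every_other = seq[1:8:2]   # indices 1, 3, 5, 7
backwards = seq[8:2:-2]    # indices 8, 6, 4 (the stop index 2 is excluded)

print(every_other)
print(backwards)
```

As with Python lists, the stop index is never included, whichever direction the step runs.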

# Slicing in 2D array

In [35]:
arr2d


array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [36]:
arr2d[0:2,0:2]

array([[1, 2],
       [4, 5]])

In [37]:
arr2d[:,0:2]

array([[1, 2],
       [4, 5],
       [7, 8]])

Note: If row indices are not specified, all the rows are considered.
Likewise, if column indices are not specified, all the columns are considered.

When slicing like this, you always obtain array views of the same number of dimensions.

In [38]:
arr2d[1:, :2]

array([[4, 5],
       [7, 8]])

In [39]:
#By mixing integer indexes and slices, you get lower dimensional slices.
#For example, you can select the second row but only the first two columns like so:
arr2d[1, :2] #1 is the row index and :2 is the slice range for the columns

array([4, 5])



In [None]:
#Similarly, you can select the third column but only the first two rows like so:
arr2d[:2, 2]


In [None]:
#assigning to a slice expression assigns to the whole selection:
arr2d[:2, 1:] = 0
arr2d
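
Whether a result keeps its dimensions depends on whether each axis is indexed with a slice or an integer. The sketch below compares the shapes; `grid2d`, `both_slices`, and `mixed` are names invented for this example.

```python
import numpy as np

grid2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

both_slices = grid2d[1:2, :2]   # slice on both axes: result stays 2-D
mixed = grid2d[1, :2]           # integer plus slice: result drops to 1-D

print(both_slices.shape)
print(mixed.shape)
```

Both selections contain the values 4 and 5, but only the all-slice version keeps a row axis of length one.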

# Boolean Indexing

Let’s consider an example where we have some data in an array and an array of names
with duplicates:

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)

In [None]:
names

In [None]:
data

Suppose each name corresponds to a row in the data array and we wanted to select
all the rows with corresponding name 'Bob'. Like arithmetic operations, comparisons
(such as ==) with arrays are also vectorized. Thus, comparing names with the
string 'Bob' yields a boolean array:


In [None]:
names == 'Bob' #the output array is boolean array 

In [None]:
#This boolean array can be passed when indexing the array:
data[names == 'Bob'] #Rows 0 and 3 are returned, where the boolean array is True

The boolean array must be of the same length as the array axis it’s indexing. You can
even mix and match boolean arrays with slices or integers:

In [None]:
data[names == 'Bob', 2:]

In [None]:
data[names == 'Bob', 3] # 3 is column index here

To select everything but 'Bob', you can either use != or negate the condition using ~:

In [None]:
names != 'Bob'

In [None]:
data[~(names == 'Bob')]

The ~ operator can be useful when you want to invert a general condition:

In [None]:
cond = names == 'Bob'

In [None]:
data[~cond]

In [None]:
#To select two of the three names by combining multiple boolean conditions, use
#boolean arithmetic operators like & (and) and | (or):
mask = (names == 'Bob') | (names == 'Will')
mask

In [None]:
data[mask]

Selecting data from an array by boolean indexing always creates a copy of the data,
even if the returned array is unchanged.

The Python keywords and and or do not work with boolean arrays.
Use & (and) and | (or) instead.
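
To see why the keywords fail, consider this small sketch. It uses a short array called `labels` (a stand-in for names) and catches the error that `or` raises when it is given whole arrays.

```python
import numpy as np

labels = np.array(['Bob', 'Joe', 'Will', 'Bob'])
is_bob = labels == 'Bob'
is_will = labels == 'Will'

# Element-wise OR works on boolean arrays.
either = is_bob | is_will

# The keyword `or` asks for a single truth value, which a
# multi-element array cannot provide, so NumPy raises ValueError.
try:
    is_bob or is_will
    keyword_failed = False
except ValueError:
    keyword_failed = True

print(either)
print(keyword_failed)
```

The parentheses used earlier around each comparison matter as well, because `&` and `|` bind more tightly than `==`.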

In [None]:
#Setting values with boolean arrays works in a common-sense way. To set all of the
#negative values in data to 0 we need only do:
data[data < 0] = 0

In [None]:
data
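
Because data is random, its output changes from run to run. The sketch below repeats the idea on a small fixed array, `readings` (a hypothetical name), and also contrasts assignment through a mask with modifying the copy that boolean selection returns.

```python
import numpy as np

readings = np.array([[1.5, -0.2], [-3.0, 4.0]])

# Boolean selection returns a copy, so changing it leaves readings alone.
negatives = readings[readings < 0]
negatives[:] = 99.0
smallest_after_copy_edit = readings.min()

# Assigning through the mask modifies readings in place.
readings[readings < 0] = 0

print(smallest_after_copy_edit)
print(readings)
```

The negative entries survive the edit to `negatives` and disappear only after the masked assignment.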

# Fancy Indexing

Fancy indexing is a term adopted by NumPy to describe indexing using integer arrays.

In [43]:
arr = np.empty((8, 4))

In [44]:
for i in range(8):
    arr[i] = i
arr

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

To select out a subset of the rows in a particular order, you can simply pass a list or
ndarray of integers specifying the desired order:

In [45]:
arr[[4, 3, 0, 6]]

array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

In [46]:
#Using negative indices selects rows from the end:
arr[[-3, -5, -7]]

array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])

In [47]:
#Passing multiple index arrays does something slightly different; it selects a one dimensional
#array of elements corresponding to each tuple of indices:
arr = np.arange(32).reshape((8, 4))
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [48]:
arr[[1, 5, 7, 2], [0, 3, 1, 2]] #selects row 1 column 0, row 5 column 3, row 7 column 1, and row 2 column 2

array([ 4, 23, 29, 10])

Here the elements (1, 0), (5, 3), (7, 1), and (2, 2) were selected. When you pass one
one-dimensional index array per axis (here, 2), the result of fancy indexing is
one-dimensional, regardless of how many dimensions the array has.


In [None]:
#The behavior of fancy indexing in this case is a bit different from what some users might have expected,
#which is the rectangular region formed by selecting a subset of the matrix’s rows and columns.

In [64]:
#Here is one way to get that:
arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

In [65]:
arr[[1, 5, 7, 2]][:,[0,3]]

array([[ 4,  7],
       [20, 23],
       [28, 31],
       [ 8, 11]])

Keep in mind that fancy indexing, unlike slicing, always copies the data into a new
array.
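
Another way to select the rectangular region, not used in the notebook above, is NumPy's `np.ix_` helper, which builds index arrays that broadcast against each other. The sketch below checks that it matches the chained indexing and that the result is a copy; `table`, `rows`, and `cols` are names chosen for this example.

```python
import numpy as np

table = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

block = table[np.ix_(rows, cols)]   # rectangular selection in one step
chained = table[rows][:, cols]      # the two-step form shown above
same_result = np.array_equal(block, chained)

# Fancy indexing copies, so writing to block leaves table untouched.
block[0, 0] = -1

print(same_result)
print(table[1, 0])
```

The one-step form avoids building the intermediate array that the chained version creates.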

# Transposing Arrays and Swapping Axes

Transposing is a special form of reshaping that similarly returns a view on the underlying
data without copying anything. Arrays have the transpose method and also the
special T attribute:

In [66]:
arr = np.arange(15).reshape((3, 5))
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [67]:
arr.T

array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])

In [68]:
#When doing matrix computations, you may do this very often—for example, when
#computing the inner matrix product using np.dot:
arr = np.random.randn(6, 3)
arr

array([[ 0.4135002 ,  0.49910321,  0.6256253 ],
       [-0.3467565 , -2.18787199,  1.06881388],
       [-0.95208857, -0.26304967,  1.74002566],
       [ 1.21955746, -1.30409417, -0.85660995],
       [ 0.73562463,  0.4584042 , -0.20620752],
       [ 0.93634389, -0.60462572,  0.84335535]])

In [71]:
arr.T

array([[ 0.4135002 , -0.3467565 , -0.95208857,  1.21955746,  0.73562463,
         0.93634389],
       [ 0.49910321, -2.18787199, -0.26304967, -1.30409417,  0.4584042 ,
        -0.60462572],
       [ 0.6256253 ,  1.06881388,  1.74002566, -0.85660995, -0.20620752,
         0.84335535]])

In [81]:
arr.shape

(6, 3)

In [82]:
arr.T.shape

(3, 6)

In [69]:
np.dot(arr.T, arr)

array([[ 4.102899  , -0.60385727, -2.17528626],
       [-0.60385727,  7.38145129, -1.9712302 ],
       [-2.17528626, -1.9712302 ,  6.0490098 ]])

In [79]:
#Example to show how the dot product works
a = np.array([[1, 2], [3, 4]])
print(a)
b = np.array([[11, 12], [13, 14]])
print(b)
np.dot(a, b)    #[[1*11+2*13, 1*12+2*14],[3*11+4*13, 3*12+4*14]]

[[1 2]
 [3 4]]
[[11 12]
 [13 14]]


array([[37, 40],
       [85, 92]])
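
For two-dimensional arrays, the `@` operator gives the same matrix product as `np.dot`. The sketch below confirms that on the same numbers and also forms the product of a matrix's transpose with the matrix itself; the names `left`, `right`, and `gram` are illustrative.

```python
import numpy as np

left = np.array([[1, 2], [3, 4]])
right = np.array([[11, 12], [13, 14]])

via_dot = np.dot(left, right)
via_operator = left @ right   # infix matrix multiplication

# Transpose times the matrix itself always gives a symmetric matrix.
gram = left.T @ left

print(np.array_equal(via_dot, via_operator))
print(gram)
```

The symmetry of `gram` mirrors the symmetric result of np.dot(arr.T, arr) shown earlier.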

In [83]:
#For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute
#the axes (for extra mind bending):
arr = np.arange(16).reshape((2, 2, 4))

In [73]:
arr

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

In [84]:
arr.transpose((1, 0, 2))

array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],

       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])

In [90]:
arrt = np.arange(16).reshape((2, 4, 2))
arrt

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]]])

In [92]:
arrt.transpose((1, 0, 2))

array([[[ 0,  1],
        [ 8,  9]],

       [[ 2,  3],
        [10, 11]],

       [[ 4,  5],
        [12, 13]],

       [[ 6,  7],
        [14, 15]]])

Here, the axes have been reordered with the second axis first, the first axis second,
and the last axis unchanged.
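
One way to read an axis permutation is through shapes and individual elements. The sketch below uses an array with three different axis lengths, named `cube` for this example, so the reordering is easy to see.

```python
import numpy as np

cube = np.arange(24).reshape((2, 3, 4))

permuted = cube.transpose((1, 0, 2))   # swap the first two axes
reversed_axes = cube.T                 # .T reverses the order of all axes

print(permuted.shape)
print(reversed_axes.shape)

# Element [j, i, k] of the permuted array is element [i, j, k] of the original.
print(permuted[2, 1, 3] == cube[1, 2, 3])
```

With distinct axis lengths, each permutation produces a visibly different shape.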

In [None]:
#Simple transposing with .T is a special case of swapping axes. ndarray has the method
#swapaxes, which takes a pair of axis numbers and switches the indicated axes to rearrange
#the data:
arr

In [94]:
res = arr.swapaxes(1, 2)
res

array([[[ 0,  4],
        [ 1,  5],
        [ 2,  6],
        [ 3,  7]],

       [[ 8, 12],
        [ 9, 13],
        [10, 14],
        [11, 15]]])

swapaxes similarly returns a view on the data without making a copy.
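
The view behavior can be confirmed by writing through the swapped array. This is a minimal sketch; `cube` and `swapped` are names picked for the example.

```python
import numpy as np

cube = np.arange(16).reshape((2, 2, 4))
swapped = cube.swapaxes(1, 2)   # shape (2, 4, 2), same underlying data

# Position [0, 3, 1] in the swapped array is position [0, 1, 3] in cube.
swapped[0, 3, 1] = -1

print(swapped.shape)
print(cube[0, 1, 3])
print(np.shares_memory(cube, swapped))
```

The write made through `swapped` is visible in `cube`, which is what you would expect from a view rather than a copy.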