# NumPy - Numerical Python 

The NumPy ndarray is a multidimensional array object that serves as a fast, flexible container for large datasets in Python. It is a generic container for homogeneous data; i.e., all of its elements must be of the same type.

In [1]:
#importing numpy
import numpy as np

In [2]:
#generate some random data
data = np.random.randn(2,3)
data

array([[ 0.08666557, -0.98470085, -0.58026601],
       [-0.34110838,  0.37064544, -0.20878278]])

In [3]:
#performing some mathematical operations with data
data * 10
#all the elements will be multiplied by 10

array([[ 0.86665566, -9.84700846, -5.80266014],
       [-3.4110838 ,  3.70645436, -2.08782784]])

In [4]:
data + data
#the corresponding values in each cell will be added to each other.

array([[ 0.17333113, -1.96940169, -1.16053203],
       [-0.68221676,  0.74129087, -0.41756557]])

In [5]:
#every array has a shape, a tuple indicating the size of each dimension.
data.shape

(2, 3)

In [6]:
#dtype is an object describing the data type of the array.
data.dtype

dtype('float64')

In [7]:
#creating ndarrays
#the easiest way to create an array is to use the array function.
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

array([6. , 7.5, 8. , 0. , 1. ])

In [8]:
#nested sequences, like a list of equal-length lists, will be converted into a multidimensional array.
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [9]:
arr2.ndim

2

In [10]:
arr2.shape

(2, 4)

In [11]:
arr1.dtype

dtype('float64')

In [12]:
arr2.dtype

dtype('int64')

zeros and ones create arrays of 0s or 1s, respectively, with a given length or shape.

empty creates an array without initializing its values to any particular value.

In [13]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [14]:
# to create a higher dimensional array, pass a tuple as the shape.
np.zeros((3, 6))

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [17]:
np.empty((2, 3, 2))
# np.empty will not always return an array of all zeros. In some cases, it may return uninitialized garbage values.

array([[[4.6538621e-310, 0.0000000e+000],
        [0.0000000e+000, 0.0000000e+000],
        [0.0000000e+000, 0.0000000e+000]],

       [[0.0000000e+000, 0.0000000e+000],
        [0.0000000e+000, 0.0000000e+000],
        [0.0000000e+000, 0.0000000e+000]]])

In [18]:
#arange is an array-valued version of the built-in Python range function
np.arange(15)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

array - convert input data (list, tuple, array, or other sequence type) to an ndarray; copies the input data by default.

asarray - convert input to ndarray, but do not copy if the input is already an ndarray.

arange - like the built-in range but returns an ndarray instead of a list.

ones - produce an array of all 1s with given shape and dtype.

ones_like - takes another array and produces a ones array of the same shape and dtype.

zeros, zeros_like - like ones and ones_like, but producing arrays of 0s.

empty, empty_like - create new arrays by allocating new memory, but do not populate them with any values, unlike ones and zeros.

full - produces an array of the given shape and dtype with all values set to the indicated fill value.

full_like - takes another array and produces a filled array of the same shape and dtype.

eye, identity - create a square N x N identity matrix (1s on the diagonal and 0s elsewhere).
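A few of the creation functions listed above can be compared side by side. This is a minimal sketch; the variable names (`base`, `copied`, `aliased`, and so on) are illustrative and not part of the original notes.

```python
import numpy as np

base = np.array([[1, 2, 3], [4, 5, 6]])

copied = np.array(base)      # np.array copies its input by default
aliased = np.asarray(base)   # np.asarray returns the input itself if it is already an ndarray

ones_same = np.ones_like(base)   # same shape and dtype as base, filled with 1s
sevens = np.full((2, 2), 7.0)    # every element set to the fill value 7.0
ident = np.eye(3)                # 3 x 3 identity matrix

print(aliased is base)   # the same object, no copy made
print(copied is base)    # a different object
print(ones_same)
print(sevens)
print(ident)
```

The `is` checks make the copy versus no-copy distinction between `array` and `asarray` visible without inspecting memory directly.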

In [19]:
#datatypes
arr1 = np.array([1, 2, 3], dtype=np.float64)
arr1.dtype

dtype('float64')

In [20]:
arr2 = np.array([1, 2, 3], dtype=np.int32)
arr2.dtype

dtype('int32')

In [21]:
#You can explicitly convert or cast an array from one dtype to another using ndarray’s astype method:
arr = np.array([1, 2, 3, 4, 5])
arr.dtype

dtype('int64')

In [22]:
float_arr = arr.astype(np.float64)
float_arr.dtype

dtype('float64')

In [23]:
#if floating-point numbers are cast to an integer dtype, the decimal part is truncated.
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
arr

array([ 3.7, -1.2, -2.6,  0.5, 12.9, 10.1])

In [24]:
arr.astype(np.int32)

array([ 3, -1, -2,  0, 12, 10], dtype=int32)

In [25]:
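#an array of strings representing numbers can be converted to numeric form using astype
#np.bytes_ is the fixed-width byte string type; np.string_ was an alias for it that NumPy 2.0 removed
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)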
numeric_strings.astype(float)

array([ 1.25, -9.6 , 42.  ])

In [27]:
#one can also use another array's dtype attribute
int_array = np.arange(10)
calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
int_array.astype(calibers.dtype)

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

Calling astype always creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.
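The copy behaviour of astype can be checked directly. The sketch below uses illustrative names (`original`, `converted`) and relies on `np.shares_memory` to test whether two arrays overlap in memory.

```python
import numpy as np

original = np.array([1.0, 2.0, 3.0])
converted = original.astype(np.float64)  # same dtype, but still a new array

converted[0] = 99.0                      # modify the converted array only

print(original)                               # the original data is untouched
print(np.shares_memory(original, converted))  # the two arrays do not overlap
```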

Vectorization - expressing batch operations on data without writing any for loops.
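As a rough illustration of what vectorization replaces, the same computation can be written with an explicit loop and as a single array expression. The names `values`, `looped`, and `vectorized` are made up for this sketch.

```python
import numpy as np

values = np.arange(5, dtype=np.float64)

# Explicit Python loop over the elements
looped = np.empty_like(values)
for i in range(len(values)):
    looped[i] = values[i] * 2 + 1

# Vectorized form: one expression applied to the whole array
vectorized = values * 2 + 1

print(np.array_equal(looped, vectorized))
```

Both forms produce the same result; the vectorized one is shorter and avoids per-element work in the Python interpreter.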

In [28]:
#arithmetic with numpy arrays
#any arithmetic operation between equal-size arrays applies the operation element-wise.
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [29]:
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [30]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])

In [31]:
#Arithmetic operations with scalars propagate the scalar argument to each element in the array:
1 / arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [32]:
arr ** 0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

In [33]:
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])
arr2

array([[ 0.,  4.,  1.],
       [ 7.,  2., 12.]])

In [34]:
arr2 > arr

array([[False,  True, False],
       [ True, False,  True]])

Operations between differently sized arrays are called broadcasting.
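A small sketch of broadcasting, assuming a 4 x 3 matrix and a length-3 array of column means (both made up for illustration): the smaller array is stretched across the rows of the larger one.

```python
import numpy as np

matrix = np.arange(12, dtype=np.float64).reshape((4, 3))
col_means = matrix.mean(axis=0)   # shape (3,)

# The (3,) array is broadcast across each of the 4 rows
demeaned = matrix - col_means

print(demeaned.shape)
print(demeaned.mean(axis=0))      # each column now averages to zero
```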

In [36]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [37]:
arr[5]#accesses the 6th element in the array.

5

In [38]:
arr[5:8]#gives elements indexed 5, 6, 7.

array([5, 6, 7])

In [40]:
arr[5:8] = 12#if you assign a scalar value to a slice, the value is propagated (or broadcast) to the entire selection.
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

An important first distinction from Python’s built-in lists is that array slices are views on the original array.
This means that the data is not copied, and any modifications to the view will be
reflected in the source array.

In [41]:
arr_slice = arr[5:8]
arr_slice

array([12, 12, 12])

In [42]:
arr_slice[1] = 12345
arr

array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,
           9])

In [43]:
#The “bare” slice [:] will assign to all values in an array:
arr_slice[:] = 64
arr

array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])

If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the array—for example, arr[5:8].copy() .
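The difference between a view and an explicit copy can be sketched as follows; the names `view` and `independent` are illustrative.

```python
import numpy as np

arr = np.arange(10)
view = arr[5:8]                # a view: shares memory with arr
independent = arr[5:8].copy()  # an explicit copy: owns its own data

independent[:] = -1            # does not touch arr
view[0] = 100                  # writes through to arr[5]

print(arr)
print(np.shares_memory(arr, view))
print(np.shares_memory(arr, independent))
```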

In [44]:
#In a two-dimensional array, the elements at each index are no longer scalars but rather one-dimensional arrays:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[2]

array([7, 8, 9])

In [45]:
arr2d[0][2]

3

In [46]:
arr2d[0, 2]

3

arr2d[0][2] and arr2d[0, 2] are equivalent.

In [47]:
#in a two-dimensional array, axis 0 indexes the rows
#and axis 1 indexes the columns

In [48]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [49]:
arr3d.shape

(2, 2, 3)

In [50]:
arr3d[0].shape

(2, 3)

In [51]:
arr3d[0]

array([[1, 2, 3],
       [4, 5, 6]])

In [52]:
arr3d[1, 0]

array([7, 8, 9])

In [55]:
x = arr3d[1]
x

array([[ 7,  8,  9],
       [10, 11, 12]])

In [56]:
x[0]

array([7, 8, 9])

In [57]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [58]:
arr2d[:2]#select the first two rows of arr2d.

array([[1, 2, 3],
       [4, 5, 6]])

In [59]:
arr2d[:2, 1:]#first two rows and all columns except first.

array([[2, 3],
       [5, 6]])

In [60]:
arr2d[1, :2]#second row but first two columns.

array([4, 5])

In [61]:
arr2d[:2, 2]#first two rows and third column.

array([3, 6])

In [62]:
#a colon by itself means to take the entire axis
arr2d[:, :1]

array([[1],
       [4],
       [7]])

In [64]:
arr2d[:2, 1:]=0
arr2d

array([[1, 0, 0],
       [4, 0, 0],
       [7, 8, 9]])

In [79]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)
names

array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')

In [80]:
data

array([[ 0.70744193, -1.4629836 , -0.97147573,  0.96351147],
       [ 0.22735242,  1.55866792,  0.05567429, -0.22373537],
       [-1.09284067, -1.7758624 , -1.11624678,  0.34889049],
       [-2.16291165,  0.75352094,  0.41728717, -0.91640816],
       [-0.64330659,  0.0800317 ,  0.12010307, -0.84001274],
       [-1.54554073,  1.05602725,  1.04564658,  1.47878201],
       [-0.71901137,  0.01588749, -0.14845706, -2.07658033]])

Suppose each name corresponds to a row in the data array and we wanted to select all the rows with corresponding name 'Bob'. Like arithmetic operations, comparisons (such as == ) with arrays are also vectorized. Thus, comparing names with the string 'Bob' yields a boolean array:

In [81]:
names == 'Bob'

array([ True, False, False,  True, False, False, False])

In [82]:
data[names == 'Bob']

array([[ 0.70744193, -1.4629836 , -0.97147573,  0.96351147],
       [-2.16291165,  0.75352094,  0.41728717, -0.91640816]])

In [83]:
data[names=='Bob',2:]

array([[-0.97147573,  0.96351147],
       [ 0.41728717, -0.91640816]])

In [84]:
data[names=='Bob',3]

array([ 0.96351147, -0.91640816])

In [85]:
names!='Bob'

array([False,  True,  True, False,  True,  True,  True])

In [86]:
data[~(names == 'Bob')]

array([[ 0.22735242,  1.55866792,  0.05567429, -0.22373537],
       [-1.09284067, -1.7758624 , -1.11624678,  0.34889049],
       [-0.64330659,  0.0800317 ,  0.12010307, -0.84001274],
       [-1.54554073,  1.05602725,  1.04564658,  1.47878201],
       [-0.71901137,  0.01588749, -0.14845706, -2.07658033]])

Selecting data from an array by boolean indexing always creates a copy of the data.

Use & and | for and and or, respectively, when combining boolean arrays; the Python keywords and and or do not work with boolean arrays.
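Combining conditions and the copy semantics of boolean indexing can be sketched together. The `values` array here is a stand-in for the data rows and is not from the original notes.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
values = np.arange(7)

# Parentheses are required because | binds more tightly than ==
mask = (names == 'Bob') | (names == 'Will')
selected = values[mask]

selected[0] = -1   # selected is a copy, so values is not modified

print(mask)
print(selected)
print(values)
```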

In [89]:
data[data<0]=0
data

array([[0.70744193, 0.        , 0.        , 0.96351147],
       [0.22735242, 1.55866792, 0.05567429, 0.        ],
       [0.        , 0.        , 0.        , 0.34889049],
       [0.        , 0.75352094, 0.41728717, 0.        ],
       [0.        , 0.0800317 , 0.12010307, 0.        ],
       [0.        , 1.05602725, 1.04564658, 1.47878201],
       [0.        , 0.01588749, 0.        , 0.        ]])

Fancy indexing - indexing using integer arrays. Unlike slicing, it always copies the data into a new array.
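One way to see the contrast with slicing is to check memory sharing for both kinds of selection. This is a sketch with made-up names (`picked`, `sliced`).

```python
import numpy as np

arr = np.arange(8)
picked = arr[[4, 3, 0]]   # fancy indexing: data copied into a new array
sliced = arr[3:5]         # slicing: a view on arr

print(picked)
print(np.shares_memory(arr, picked))
print(np.shares_memory(arr, sliced))
```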

In [90]:
arr = np.empty((8,4))

In [91]:
for i in range(8):
    arr[i]=i

In [92]:
arr

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [94]:
arr[[4,3,0,6]]#selecting subset of rows in a particular order

array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

In [95]:
arr[[-3,-5,-7]]#negative indices select rows from the end.

array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])

In [97]:
arr=np.arange(32).reshape((8,4))
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [98]:
arr[[1,5,7,2],[0,3,1,2]]#arr[(1,0)],arr[(5,3)],arr[(7,1)],arr[(2,2)]

array([ 4, 23, 29, 10])

In [100]:
arr[[1,5,7,2]][:,[0,3,1,2]]#selects rows 1, 5, 7, 2, then reorders the columns of each row as 0, 3, 1, 2

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

Transposing is a special form of reshaping that similarly returns a view on the underlying data without copying anything.
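A common use of the transpose is computing the inner matrix product of an array with itself. The sketch below, with the illustrative name `gram`, also checks that the transpose is a view rather than a copy.

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))
transposed = arr.T

# The transpose shares memory with the original array
print(np.shares_memory(arr, transposed))

# Inner matrix product: (3 x 2) times (2 x 3) gives a 3 x 3 matrix
gram = np.dot(arr.T, arr)
print(gram)
```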

In [102]:
arr=np.arange(15).reshape((3,5))
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [103]:
arr.T #transpose

array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])

For higher dimensional arrays, transpose accepts a tuple of axis numbers to permute the axes.

In [109]:
arr = np.arange(16).reshape((2,2,4))
arr

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

In [110]:
arr.transpose((1,0,2)) #Here, the axes have been reordered with the second axis first, the first axis second, and the last axis unchanged.

array([[[ 0,  1,  2,  3],
        [ 8,  9, 10, 11]],

       [[ 4,  5,  6,  7],
        [12, 13, 14, 15]]])

In [111]:
# simple transposing is a special case of swapping axes.
arr

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]]])

In [116]:
arr.swapaxes(1,2) #takes a pair of axis numbers and switches the indicated axes to rearrange the data
#swapaxes similarly returns a view on the data without making a copy.

array([[[ 0,  4],
        [ 1,  5],
        [ 2,  6],
        [ 3,  7]],

       [[ 8, 12],
        [ 9, 13],
        [10, 14],
        [11, 15]]])

A universal function, or ufunc, is a function that performs element-wise operations on data in ndarrays.

In [119]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [121]:
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [122]:
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

sqrt and exp are unary ufuncs.
Others, such as add or maximum, take two arrays and are therefore called binary ufuncs.

In [123]:
x=np.random.randn(8)
y=np.random.randn(8)

In [124]:
x

array([-0.14593151,  0.61579908, -1.10553818, -0.9857591 ,  1.49345962,
        1.21726956,  1.84969786, -0.77075162])

In [125]:
y

array([-1.47091326,  0.96454953,  2.21823795,  0.83521654, -1.07499621,
        0.91922295,  0.74171973, -1.02548903])

In [127]:
np.maximum(x,y) # Here, numpy.maximum computed the element-wise maximum of the elements in x and y

array([-0.14593151,  0.96454953,  2.21823795,  0.83521654,  1.49345962,
        1.21726956,  1.84969786, -0.77075162])

In [129]:
# while not common, a ufunc can return multiple arrays. modf is an example: it returns the fractional and integral parts of a floating-point array.
arr = np.random.randn(7)*5
arr

array([-1.62874376,  7.79481205, -5.17656222, -4.77006978, -6.90885729,
        2.63734591, -0.27730611])

In [130]:
remainder, whole_part=np.modf(arr)

In [131]:
remainder

array([-0.62874376,  0.79481205, -0.17656222, -0.77006978, -0.90885729,
        0.63734591, -0.27730611])

In [132]:
whole_part

array([-1.,  7., -5., -4., -6.,  2., -0.])

In [133]:
arr

array([-1.62874376,  7.79481205, -5.17656222, -4.77006978, -6.90885729,
        2.63734591, -0.27730611])

In [134]:
np.sqrt(arr)#negative inputs yield nan (NumPy also emits a RuntimeWarning)

array([       nan, 2.79191906,        nan,        nan,        nan,
       1.62399074,        nan])

In [135]:
np.sqrt(arr, arr)#the optional second argument is the output array, so the result is written into arr in place

array([       nan, 2.79191906,        nan,        nan,        nan,
       1.62399074,        nan])
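The second positional argument of a ufunc is its optional `out` array. A minimal sketch with non-negative inputs (the `values` array is made up here) shows the in-place behaviour without the nan results:

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0])
result = np.sqrt(values, out=values)  # the result is written into values itself

print(result is values)   # the ufunc returns the out array
print(values)
```

Passing `out` avoids allocating a new array, at the cost of overwriting the input.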

The np.meshgrid function takes two 1D arrays and produces two 2D matrices corresponding to all pairs of (x, y) in the two arrays. 
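With very small inputs the structure of the two output matrices is easier to see. The arrays `x` and `y` below are illustrative.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])
xs, ys = np.meshgrid(x, y)

print(xs)  # each row repeats x
print(ys)  # each column repeats y
```

Pairing `xs[i, j]` with `ys[i, j]` enumerates every (x, y) combination, which is why an expression such as `xs ** 2 + ys ** 2` evaluates a function over the whole grid.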

In [136]:
points = np.arange(-5, 5, 0.01)#1000 equally spaced points
xs, ys = np.meshgrid(points, points)

In [137]:
xs

array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       ...,
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])

In [138]:
ys

array([[-5.  , -5.  , -5.  , ..., -5.  , -5.  , -5.  ],
       [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],
       [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],
       ...,
       [ 4.97,  4.97,  4.97, ...,  4.97,  4.97,  4.97],
       [ 4.98,  4.98,  4.98, ...,  4.98,  4.98,  4.98],
       [ 4.99,  4.99,  4.99, ...,  4.99,  4.99,  4.99]])

In [139]:
z = np.sqrt(xs ** 2 + ys ** 2)
z

array([[7.07106781, 7.06400028, 7.05693985, ..., 7.04988652, 7.05693985,
        7.06400028],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,
        7.05692568],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,
        7.04985815],
       ...,
       [7.04988652, 7.04279774, 7.03571603, ..., 7.0286414 , 7.03571603,
        7.04279774],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,
        7.04985815],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,
        7.05692568]])

The numpy.where function is a vectorized version of the ternary expression x if condition else y.

In [140]:
xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
# Suppose we wanted to take a value from xarr whenever the corresponding value in cond is True , and otherwise take the value from yarr . A list comprehension doing this might look like:
result = [(x if c else y)
         for x, y, c in zip(xarr, yarr, cond)]
result

[1.1, 2.2, 1.3, 1.4, 2.5]

In [141]:
result = np.where(cond, xarr, yarr)
result

array([1.1, 2.2, 1.3, 1.4, 2.5])

Suppose you had a matrix of randomly generated data and you wanted to replace all positive values with 2 and all negative values with -2. This is very easy to do with np.where:

In [142]:
arr = np.random.randn(4, 4)
arr

array([[-0.51322132,  1.00573657,  1.6371297 ,  2.40084088],
       [ 1.6220982 , -0.21667768,  2.49015211, -0.45971814],
       [-1.60496199, -0.07156599,  0.3169373 ,  0.53704311],
       [ 0.86750941,  0.3023636 , -1.48033741, -1.32960185]])

In [143]:
arr > 0

array([[False,  True,  True,  True],
       [ True, False,  True, False],
       [False, False,  True,  True],
       [ True,  True, False, False]])

In [144]:
np.where(arr > 0, 2, -2)

array([[-2,  2,  2,  2],
       [ 2, -2,  2, -2],
       [-2, -2,  2,  2],
       [ 2,  2, -2, -2]])

In [145]:
np.where(arr > 0, 2, arr)# set only positive values to 2

array([[-0.51322132,  2.        ,  2.        ,  2.        ],
       [ 2.        , -0.21667768,  2.        , -0.45971814],
       [-1.60496199, -0.07156599,  2.        ,  2.        ],
       [ 2.        ,  2.        , -1.48033741, -1.32960185]])

In [147]:
# some aggregate statistics
arr = np.random.randn(5, 4)
arr

array([[-0.09610427, -1.1532312 , -0.4981938 ,  1.15747554],
       [-0.00880928,  1.00144274,  0.04039338, -1.05613097],
       [ 0.80600022, -1.17782397,  0.88261726, -1.08451453],
       [-2.05648025, -0.01868621,  0.53645864, -0.25730216],
       [ 1.10841329,  2.64608266, -1.8370062 , -2.1057782 ]])

In [148]:
arr.mean()

-0.15855886545724257

In [149]:
np.mean(arr)

-0.15855886545724257

In [150]:
arr.sum()

-3.171177309144851

In [152]:
# functions like mean and sum take optional axis argument that computes the statistic over the given axis, resulting in an array with one fewer dimension.
arr.mean(axis=1)# compute mean across the columns

array([-0.14751343, -0.00577603, -0.14343026, -0.44900249, -0.04707211])

In [154]:
arr.sum(axis=0)# compute sum down the rows

array([-0.24698029,  1.29778401, -0.87573071, -3.34625032])

In [156]:
# methods like cumsum and cumprod do not aggregate, instead producing an array of the intermediate results
arr = np.arange(8)
arr

array([0, 1, 2, 3, 4, 5, 6, 7])

In [158]:
arr.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28])

In [163]:
# In multidimensional arrays, accumulation functions like cumsum return an array of the same size, but with the partial aggregates computed along the indicated axis.
arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
arr

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [164]:
arr.cumsum(axis=0)

array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15]])

In [165]:
arr.cumprod(axis=1)

array([[  0,   0,   0],
       [  3,  12,  60],
       [  6,  42, 336]])

Method - Description

sum - Sum of all the elements in the array or along an axis; zero-length arrays have sum 0

mean - Arithmetic mean; zero-length arrays have NaN mean

std, var - Standard deviation and variance, respectively, with optional degrees of freedom adjustment (default denominator n)

min, max - Minimum and maximum

argmin, argmax - Indices of minimum and maximum elements, respectively

cumsum - Cumulative sum of elements starting from 0

cumprod - Cumulative product of elements starting from 1
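The degrees of freedom adjustment mentioned for std and var is controlled by the `ddof` argument. The sketch below uses a small made-up array to compare the default denominator n with n - 1, and also shows argmin and argmax.

```python
import numpy as np

arr = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

population_std = arr.std()        # default: denominator n
sample_std = arr.std(ddof=1)      # denominator n - 1

print(population_std)
print(sample_std)
print(arr.argmin(), arr.argmax())  # positions of the smallest and largest values
```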

In [166]:
arr = np.random.randn(100)
(arr > 0).sum()# number of positive values

52

In [167]:
bools = np.array([False, False, True, False])
bools.any()# tests whether one or more values in an array is True

True

In [168]:
bools.all()# checks if every value is True

False

In [170]:
arr = np.random.randn(6)
arr

array([ 1.85724984, -0.00432034, -0.43025072,  0.39853188, -0.79271719,
        0.77656557])

In [172]:
arr.sort()
arr

array([-0.79271719, -0.43025072, -0.00432034,  0.39853188,  0.77656557,
        1.85724984])

In [173]:
# you can sort each one-dimensional section of values in a multidimensional array in-place along an axis by passing the axis number to sort
arr = np.random.randn(5, 3)
arr

array([[-0.97001347,  1.01170077, -0.98111389],
       [ 0.41317794, -0.35445073,  1.03535004],
       [-1.0274559 ,  0.1965272 ,  0.57737057],
       [-0.63131558, -0.97727312,  0.78982811],
       [-1.00234743, -0.8900657 ,  0.82331641]])

In [177]:
arr.sort(0)
arr

array([[-1.0274559 , -0.97001347,  0.57737057],
       [-1.00234743, -0.8900657 ,  0.78982811],
       [-0.98111389, -0.63131558,  0.82331641],
       [-0.97727312,  0.1965272 ,  1.01170077],
       [-0.35445073,  0.41317794,  1.03535004]])
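The sort method works in place, while the top-level function np.sort returns a sorted copy and leaves its input alone. A short sketch with a made-up array:

```python
import numpy as np

arr = np.array([[9, 1, 8], [3, 7, 2]])

row_sorted = np.sort(arr, axis=1)   # new array with each row sorted
unchanged = arr.copy()              # snapshot showing arr was not modified by np.sort

arr.sort(axis=0)                    # in place: each column of arr is now sorted

print(row_sorted)
print(unchanged)
print(arr)
```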

In [178]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

In [179]:
np.unique(names)# returns the sorted unique values in an array

array(['Bob', 'Joe', 'Will'], dtype='<U4')

In [180]:
ints = np.array([3, 3, 3, 2, 2, 1, 1, 4, 4])

In [181]:
np.unique(ints)

array([1, 2, 3, 4])

In [182]:
# pure Python alternative to np.unique
sorted(set(names))

['Bob', 'Joe', 'Will']

In [183]:
# np.in1d tests membership of the values in one array in another, returning a boolean array (np.isin is the recommended replacement; NumPy 2.0 deprecates np.in1d)
values = np.array([6, 0, 6, 3, 2, 5, 6])
np.in1d(values, [2, 3, 6])

array([ True, False,  True,  True,  True, False,  True])
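The same membership test can be written with np.isin, and related set routines such as np.intersect1d and np.union1d follow the same pattern. This is a sketch reusing the `values` array from the cell above.

```python
import numpy as np

values = np.array([6, 0, 6, 3, 2, 5, 6])

membership = np.isin(values, [2, 3, 6])       # boolean array, same shape as values
common = np.intersect1d(values, [2, 3, 6])    # sorted elements present in both
combined = np.union1d(values, [7])            # sorted union of the elements

print(membership)
print(common)
print(combined)
```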

np.save and np.load are the two workhorse functions for efficiently saving and loading array data on disk. Arrays are saved by default in an uncompressed binary format with the file extension .npy.

In [184]:
arr = np.arange(10)
np.save('some_array', arr)

In [185]:
np.load('some_array.npy')

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
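Several arrays can be stored in one uncompressed archive with np.savez. The sketch below writes to a temporary directory so that it leaves no files behind; the file name `array_archive.npz` and the keys `a` and `b` are hypothetical.

```python
import os
import tempfile

import numpy as np

a = np.arange(3)
b = np.ones((2, 2))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'array_archive.npz')
    np.savez(path, a=a, b=b)            # keyword names become the archive keys
    with np.load(path) as archive:      # loading an .npz gives a dict-like object
        loaded_a = archive['a']
        loaded_b = archive['b']

print(loaded_a)
print(loaded_b)
```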

In [187]:
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1, 7], [8, 9]])

In [188]:
x

array([[1., 2., 3.],
       [4., 5., 6.]])

In [189]:
y

array([[ 6., 23.],
       [-1.,  7.],
       [ 8.,  9.]])

In [190]:
x.dot(y)

array([[ 28.,  64.],
       [ 67., 181.]])

In [191]:
# the above expression is equivalent to 
np.dot(x, y)

array([[ 28.,  64.],
       [ 67., 181.]])

In [192]:
np.dot(x, np.ones(3))

array([ 6., 15.])

In [194]:
x @ np.ones(3)# the symbol @ acts as an infix operator that performs matrix multiplication

array([ 6., 15.])

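numpy.linalg provides a standard set of matrix decompositions along with operations such as inverse and determinant. Commonly used linear algebra functions are listed here; diag, dot, and trace can be called from the top-level numpy namespace (np.diag, np.dot, np.trace), while det, eig, inv, solve, and lstsq are in numpy.linalg: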

diag - Return the diagonal (or off-diagonal) elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal.

dot - matrix multiplication

trace - compute the sum of the diagonal elements

det - compute the matrix determinant

eig - compute the eigenvalues and eigenvectors of a square matrix

inv - compute the inverse of the matrix

solve - solve the linear system Ax = b for x, where A is a square matrix

lstsq - compute the least-squares solution to Ax = b
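A few of these functions can be tried on a small system. The 2 x 2 matrix `A` and right-hand side `b` below are made up for the sketch.

```python
import numpy as np
from numpy.linalg import det, inv, solve

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = solve(A, b)                  # solves A x = b
determinant = det(A)
A_inv = inv(A)

print(x)
print(np.allclose(A @ x, b))             # the solution satisfies the system
print(np.allclose(A @ A_inv, np.eye(2))) # A times its inverse is the identity
print(determinant)
print(np.trace(A))                       # sum of the diagonal elements
```

Results are compared with np.allclose rather than == because floating-point arithmetic introduces small rounding errors.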

In [196]:
#you can get a 4 × 4 array of samples from the standard normal distribution using normal :
samples = np.random.normal(size=(4,4))
samples

array([[-1.94167055, -1.06095182,  0.52449785,  0.43160437],
       [-0.54856977,  0.5477222 , -0.59794556, -0.55935688],
       [ 0.43882416, -0.66297966,  1.89723171,  1.09975076],
       [-1.1138544 ,  0.26696461,  0.56458895,  1.59528987]])

functions in numpy.random

seed - Seed the random number generator

permutation - Return a random permutation of a sequence, or return a permuted range

shuffle - Randomly permute a sequence in-place

rand - Draw samples from a uniform distribution

randint - Draw random integers from a given low-to-high range

randn - Draw samples from a normal distribution with mean 0 and standard deviation 1 (MATLAB-like interface)

binomial - Draw samples from a binomial distribution

normal - Draw samples from a normal (Gaussian) distribution

beta - Draw samples from a beta distribution

chisquare - Draw samples from a chi-square distribution

gamma - Draw samples from a gamma distribution

uniform - Draw samples from a uniform [0, 1) distribution
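Seeding makes the generated numbers reproducible. The sketch below uses an arbitrary seed value (1234) and also creates a separate np.random.RandomState generator, which keeps its own state independent of the global one.

```python
import numpy as np

# Re-seeding the global generator reproduces the same draws
np.random.seed(1234)
first = np.random.randn(3)
np.random.seed(1234)
second = np.random.randn(3)
print(np.array_equal(first, second))

# A private generator that does not affect the global state
rng = np.random.RandomState(1234)
private_draws = rng.randn(3)
print(private_draws.shape)

# shuffle permutes in place; permutation returns a new permuted array
sequence = np.arange(5)
permuted = np.random.permutation(sequence)
print(sorted(permuted.tolist()))
```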