# Vectors, matrices and multidimensional arrays

[NumPy manual (latest version, ReadTheDocs)](https://numpy.readthedocs.io/en/latest/index.html)

In [1]:
import numpy as np

### NumPy arrays
* **NOT THE SAME AS PYTHON LISTS**. All elements of an array have the same data type, and an array has a fixed size: elements can be modified in place, but changing the number of elements means creating a new array.

Attributes:
- _shape_: tuple; contains # of elements for each axis of the array
- _size_: total # of elements
- _ndim_: number of dimensions (axes)
- _nbytes_: number of bytes used for storage
- _dtype_: datatype

In [2]:
data = np.array([[1, 2], [3, 4], [5, 6]])
type(data)

numpy.ndarray

In [3]:
data.ndim, data.shape, data.size, data.dtype, data.nbytes

(2, (3, 2), 6, dtype('int64'), 48)

In [4]:
data

array([[1, 2],
       [3, 4],
       [5, 6]])

### Data types:
* int (integer: 8, 16, 32, 64)
* uint (unsigned integer: 8, 16, 32, 64)
* bool (boolean)
* float (floating-point: 16, 32, 64, 128)
* complex (complex floating-point: 64, 128, 256)

In [5]:
np.array([1, 2, 3], dtype=int)  # the np.int alias was removed in NumPy 1.24

array([1, 2, 3])

In [6]:
np.array([1, 2, 3], dtype=float)

array([1., 2., 3.])

In [7]:
np.array([1, 2, 3], dtype=complex)

array([1.+0.j, 2.+0.j, 3.+0.j])

In [8]:
data = np.array([1, 2, 3], dtype=float)
data.dtype, data

(dtype('float64'), array([1., 2., 3.]))

In [9]:
data = np.array([1, 2, 3], dtype=int)
data.dtype, data

(dtype('int64'), array([1, 2, 3]))

In [10]:
data = np.array([1, 2, 3], dtype=float)
data.dtype, data

(dtype('float64'), array([1., 2., 3.]))

Once created, the dtype of an array cannot be changed in place.
Create a copy with typecast values using __astype()__:

In [11]:
data.astype(int)

array([1, 2, 3])

Data types can get "promoted" to support operations:

In [12]:
d1 = np.array([1, 2, 3], dtype=float)
d2 = np.array([1, 2, 3], dtype=complex)
(d1+d2).dtype

dtype('complex128')

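Some operations need arrays created with an appropriate data type. By default, np.sqrt on integer or real input returns a real (float) result, so negative values give nan (with a RuntimeWarning); the same values in a complex array give the complex roots.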



In [13]:
np.sqrt(np.array([-1, 0, 1]))

array([nan,  0.,  1.])

In [14]:
np.sqrt(np.array([-1, 0, 1], dtype=complex))

array([0.+1.j, 0.+0.j, 1.+0.j])

### Real and imaginary parts
* All NumPy arrays (not just complex-valued ones) have `real` and `imag` attributes.

In [15]:
data = np.array([1, 2, 3], dtype=complex)

In [16]:
print(data,"\n",data.real,"\n",data.imag)

[1.+0.j 2.+0.j 3.+0.j] 
 [1. 2. 3.] 
 [0. 0. 0.]


### Array data in memory
* Array data is stored as one contiguous block in memory. For 2D arrays there are two layout options:
    - __Row-major__ (row-wise storage; C std.)
    - __Column-major__ (column-wise storage; Fortran std.)
* The NumPy default is __row-major__.
* To specify the layout, use the keyword argument *order='C'* or *order='F'* when creating the array.
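
The difference between the two layouts can be inspected through an array's `flags` and `strides` attributes. The sketch below is only an illustration: the array values and the explicit `int64` dtype are assumptions chosen so that the stride values are predictable.

```python
import numpy as np

# The same 2x3 values stored in row-major (C) and column-major (Fortran) order.
c_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='C')
f_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='F')

# The arrays compare equal element by element; only the memory layout differs.
print(np.array_equal(c_arr, f_arr))

# flags reports which layout an array uses.
print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])

# strides = number of bytes to step along each axis (8 bytes per int64 element).
print(c_arr.strides)  # (24, 8): the next row is 3 elements away, the next column 1
print(f_arr.strides)  # (8, 16): the next row is 1 element away, the next column 2
```

Indexing behaves identically for both arrays; the layout mainly matters for performance and when passing data to C or Fortran code.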

### Creating Arrays
![array-gen-funcs](pics/array-gen-funcs.png)
![array-gen-funcs2](pics/array-gen-funcs2.png)

### Arrays created from lists and other array-like objects

In [17]:
data = np.array([1, 2, 3, 4])

In [18]:
data.ndim, data.shape

(1, (4,))

In [19]:
data = np.array([[1, 2], [3, 4]])

In [20]:
data.ndim, data.shape

(2, (2, 2))

### Arrays filled with constant values

In [21]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [22]:
data = np.ones(4)
data

array([1., 1., 1., 1.])

In [24]:
x1 = 5.4 * data; x1

array([5.4, 5.4, 5.4, 5.4])

numpy __full()__: creates an array filled with the desired value directly, instead of creating an array of ones and multiplying it by the fill value.

In [26]:
x2 = np.full(10, 5.4)
x2

array([5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4, 5.4])

numpy __empty()__: uninitialized data (the contents are arbitrary until assigned)

In [29]:
x1 = np.empty(5); x1

array([6.94147542e-310, 6.94147542e-310, 1.58101007e-322, 0.00000000e+000,
       2.37151510e-322])

In [30]:
x1.fill(3.0); x1

array([3., 3., 3., 3., 3.])

### Arrays filled with sequences

In [31]:
# np.arange (3rd arg = increment)
np.arange(0.0, 10, 1)

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [32]:
# np.linspace (3rd arg = #total points)
np.linspace(0, 10, 11)

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [33]:
# np.logspace
np.logspace(0, 2, 4)  # 4 data points between 10**0=1 to 10**2=100

array([  1.        ,   4.64158883,  21.5443469 , 100.        ])

### Mesh-grid arrays
* Given two 1D coordinate arrays, generate 2D coordinate array.
* Often used when plotting function over two variables (ex: contour plots).

In [34]:
x = np.array([-1, 0, 1])
y = np.array([-2, 0, 2])


In [35]:
X, Y = np.meshgrid(x, y)

In [36]:
X

array([[-1,  0,  1],
       [-1,  0,  1],
       [-1,  0,  1]])

In [37]:
Y

array([[-2, -2, -2],
       [ 0,  0,  0],
       [ 2,  2,  2]])

In [38]:
(X + Y) ** 2

array([[9, 4, 1],
       [1, 0, 1],
       [1, 4, 9]])

In [40]:
np.mgrid?

Type:        MGridClass
String form: <numpy.lib.index_tricks.MGridClass object at 0x7fc7f813bfd0>
File:        ~/anaconda3/lib/python3.6/site-packages/numpy/lib/index_tricks.py
Docstring:
`nd_grid` instance which returns a dense multi-dimensional "meshgrid".

An instance of `numpy.lib.index_tricks.nd_grid` which returns an dense
(or fleshed out) mesh-grid when indexed, so that each returned argument
has the same shape.  The dimensions and number of the output arrays are
equal to the number of indexing dimensions.  If the step length is not a
complex number, then the stop is not inclusive.

However, if the step length is a **complex number** (e.g. 5j), then
the integer part of its magnitude is interpreted as specifying the
number of points to create between the start and stop values, where
the stop value **is inclusive**.

Returns
----------
mesh-grid `ndarrays` all of the same dimensions

See Also
--------
numpy.lib.index_tricks.nd_grid : c

In [41]:
np.ogrid?

Type:        OGridClass
String form: <numpy.lib.index_tricks.OGridClass object at 0x7fc7f80fee48>
File:        ~/anaconda3/lib/python3.6/site-packages/numpy/lib/index_tricks.py
Docstring:
`nd_grid` instance which returns an open multi-dimensional "meshgrid".

An instance of `numpy.lib.index_tricks.nd_grid` which returns an open
(i.e. not fleshed out) mesh-grid when indexed, so that only one dimension
of each returned array is greater than 1.  The dimension and number of the
output arrays are equal to the number of indexing dimensions.  If the step
length is not a complex number, then the stop is not inclusive.

However, if the step length is a **complex number** (e.g. 5j), then
the integer part of its magnitude is interpreted as specifying the
number of points to create between the start and stop values, where
the stop value **is inclusive**.

Returns
----------
mesh-grid `ndarrays` with only one dimension :math:`\neq 1`
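
The two docstrings above can be tried out with a short sketch. The slice ranges and variable names here are arbitrary choices for illustration, not taken from the notes.

```python
import numpy as np

# Integer step: the stop value is excluded, as with np.arange.
X, Y = np.mgrid[0:3, 0:2]
print(X.shape, Y.shape)    # both dense grids have shape (3, 2)

# Complex "step" 5j: five points from 0 to 1 with the stop included,
# i.e. 0, 0.25, 0.5, 0.75, 1 (the same points as np.linspace(0, 1, 5)).
pts = np.mgrid[0:1:5j]
print(pts)

# ogrid returns open grids: each array has length 1 along all but one axis.
ox, oy = np.ogrid[0:3, 0:2]
print(ox.shape, oy.shape)  # (3, 1) and (1, 2)

# Broadcasting the open grids gives the same result as the dense grids.
print(np.array_equal(ox + oy, X + Y))
```

Open grids use less memory than dense ones, which can matter for large coordinate ranges.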



### Uninitialized arrays
* Use __np.empty__ instead of np.zeros if wanting to avoid initialization step. (Saves time when building large arrays.)

In [42]:
np.empty(3, dtype=float)

array([1., 2., 3.])

### Creating arrays with properties of other arrays

Typical use case: a function that takes arrays of unspecified type & size as arguments & requires working arrays of the same type & size.

* __np.ones_like__
* __np.zeros_like__
* __np.full_like__
* __np.empty_like__

In [43]:
def f(x):
    y = np.ones_like(x)
    return y

a = [1,2,3,4]
f(a)

array([1, 1, 1, 1])

### Creating matrix arrays

* __np.identity()__: creates square matrix with ones on diagonal, zero elsewhere.

In [44]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

* __np.eye()__: ones on diagonal, optionally offset

In [45]:
np.eye(3, k=1)

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

In [46]:
np.eye(3, k=-1)

array([[0., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.]])

* __np.diag()__: places an arbitrary 1D array on the diagonal of a matrix

In [47]:
np.diag(np.arange(0, 20, 5))

array([[ 0,  0,  0,  0],
       [ 0,  5,  0,  0],
       [ 0,  0, 10,  0],
       [ 0,  0,  0, 15]])

## Indexing and slicing

### One-dimensional arrays
![array-slice-funcs](pics/array-slice-funcs.png)

In [48]:
a = np.arange(0, 11)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [49]:
a[0]  # the first element

0

In [50]:
a[-1] # the last element

10

In [51]:
a[4]  # the fifth element, at index 4

4

In [52]:
a[1:-1]

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [53]:
a[1:-1:2]

array([1, 3, 5, 7, 9])

In [54]:
a[:5] # first five elements

array([0, 1, 2, 3, 4])

In [55]:
a[-5:] # last five elements

array([ 6,  7,  8,  9, 10])

In [56]:
a[::-2] # every 2nd value in reverse order

array([10,  8,  6,  4,  2,  0])

### Multidimensional arrays

In [57]:
f = lambda m, n: n + 10 * m

In [58]:
A = np.fromfunction(f, (6, 6), dtype=int)
A

array([[ 0,  1,  2,  3,  4,  5],
       [10, 11, 12, 13, 14, 15],
       [20, 21, 22, 23, 24, 25],
       [30, 31, 32, 33, 34, 35],
       [40, 41, 42, 43, 44, 45],
       [50, 51, 52, 53, 54, 55]])

In [59]:
A[:, 1]  # the second column

array([ 1, 11, 21, 31, 41, 51])

In [60]:
A[1, :]  # the second row

array([10, 11, 12, 13, 14, 15])

In [61]:
A[:3, :3]  # upper left 3x3

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22]])

In [62]:
A[3:, :3]  # lower left 3x3

array([[30, 31, 32],
       [40, 41, 42],
       [50, 51, 52]])

In [63]:
A[::2, ::2]  # every second element starting from 0, 0

array([[ 0,  2,  4],
       [20, 22, 24],
       [40, 42, 44]])

In [64]:
A[1::2, 1::3]  # every (2nd,3rd) element starting from 1, 1

array([[11, 14],
       [31, 34],
       [51, 54]])

### Views
* Subarrays extracted with slice operations are views of the same underlying data. (They refer to the same memory, but use a different shape, offset, and "strides".)

In [65]:
B = A[1:5, 1:5]
B

array([[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34],
       [41, 42, 43, 44]])

In [66]:
B[:, :] = 0
A

array([[ 0,  1,  2,  3,  4,  5],
       [10,  0,  0,  0,  0, 15],
       [20,  0,  0,  0,  0, 25],
       [30,  0,  0,  0,  0, 35],
       [40,  0,  0,  0,  0, 45],
       [50, 51, 52, 53, 54, 55]])

In [67]:
C = B[1:3, 1:3].copy()
C

array([[0, 0],
       [0, 0]])

In [68]:
C[:, :] = 1
C

array([[1, 1],
       [1, 1]])

In [69]:
B

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
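
Whether two arrays share data can also be checked directly rather than inferred from side effects. The following is a minimal sketch using `np.shares_memory`; the names `sub_view` and `sub_copy` are made up for this illustration.

```python
import numpy as np

A = np.arange(36).reshape(6, 6)
sub_view = A[1:5, 1:5]          # slicing returns a view
sub_copy = A[1:5, 1:5].copy()   # an explicit copy owns its own data

print(np.shares_memory(A, sub_view))  # True
print(np.shares_memory(A, sub_copy))  # False

# The view keeps the parent's strides; only shape and start offset differ.
print(A.strides == sub_view.strides)  # True

sub_view[0, 0] = -1
print(A[1, 1])         # -1: writing through the view changed A
print(sub_copy[0, 0])  # 7: the copy is unaffected
```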

### Fancy indexing and Boolean-valued indexing
* Arrays can be indexed using another array, a list, or sequence of integers.

In [70]:
A = np.linspace(0, 1, 11)
A

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [71]:
A[np.array([0, 2, 4])]

array([0. , 0.2, 0.4])

In [72]:
A[[0, 2, 4]]

array([0. , 0.2, 0.4])

Boolean-based indexing - great for filtering!

In [73]:
A > 0.5 

array([False, False, False, False, False, False,  True,  True,  True,
        True,  True])

In [74]:
A[A > 0.5]

array([0.6, 0.7, 0.8, 0.9, 1. ])

* Arrays returned by fancy and Boolean indexing are new, independent arrays, not views of the existing array.

In [75]:
A = np.arange(10)
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [76]:
indices = [2, 4, 6]
B = A[indices]
B

array([2, 4, 6])

In [77]:
B[0] = -1  # this does not affect A
B

array([-1,  4,  6])

In [78]:
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [79]:
A[indices] = -1
A

array([ 0,  1, -1,  3, -1,  5, -1,  7,  8,  9])

In [80]:
A = np.arange(10)

In [81]:
B = A[A > 5]

In [82]:
B[0] = -1  # this does not affect A

In [83]:
A

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [84]:
A[A > 5] = -1

In [85]:
A

array([ 0,  1,  2,  3,  4,  5, -1, -1, -1, -1])

### Reshaping and resizing ops
![reshape](pics/reshape-ops.png)
![indexing viz](pics/indexing-viz.png)

Reshaping a contiguous array doesn't modify or copy the underlying data; it only changes how the data is interpreted, by redefining the array's shape and *strides*.

In [86]:
data = np.array([[1, 2], [3, 4]])

In [87]:
np.reshape(data, (1, 4))

array([[1, 2, 3, 4]])

In [88]:
data.reshape(4)

array([1, 2, 3, 4])

In [89]:
data

array([[1, 2],
       [3, 4]])

In [90]:
data.flatten()

array([1, 2, 3, 4])

In [91]:
data.flatten().shape

(4,)

__np.ravel()__ = special case of reshape. It collapses all array dimensions and returns a flattened 1D array whose length equals the total number of elements in the original array, as a view whenever possible. __flatten()__ does the same thing, but always returns a copy.

In [92]:
data.ravel()

array([1, 2, 3, 4])
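
The view-versus-copy distinction between `ravel()` and `flatten()` can be checked with `np.shares_memory`. This is a small sketch with made-up variable names (`r`, `f`, `t`), not part of the original session.

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])
r = data.ravel()     # a view, because data is contiguous
f = data.flatten()   # always a copy

print(np.shares_memory(data, r))  # True
print(np.shares_memory(data, f))  # False

r[0] = 99
print(data[0, 0])    # 99: the write went through the view
f[1] = -5
print(data[0, 1])    # 2: the copy does not affect data

# ravel falls back to a copy when no view is possible,
# for example when flattening a transposed array in C order.
t = data.T.ravel()
print(np.shares_memory(data, t))  # False
```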

__np.newaxis__ = indexing keyword (not a function) that inserts a new axis of length 1 into an existing array.

In [94]:
data = np.arange(0, 5)
data

array([0, 1, 2, 3, 4])

In [95]:
data[:, np.newaxis]

array([[0],
       [1],
       [2],
       [3],
       [4]])

In [96]:
data[np.newaxis, :]

array([[0, 1, 2, 3, 4]])

__np.hstack()__ = horizontal stacking

__np.vstack()__ = vertical stacking rows into a matrix

__np.concatenate()__ = similar to stack, but accepts an _axis_ keyword arg.

In [97]:
data = np.arange(5)
data

array([0, 1, 2, 3, 4])

In [98]:
np.vstack((data, data, data)) # stack vertically along axis 0

array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

In [99]:
np.hstack((data, data, data)) # for 1D inputs, hstack concatenates along axis 0

array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4])

In [100]:
data = data[:, np.newaxis]
data

array([[0],
       [1],
       [2],
       [3],
       [4]])

In [101]:
np.hstack((data, data, data))

array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])

### Vectorized expressions

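Designed to avoid the need for "*for*" loops. __Broadcasting__ = applying an elementwise operation to operands of different but compatible shapes: a scalar, or any axis of length 1, is effectively stretched to match the shape of the other operand.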

![broadcasting](pics/broadcasting.png)
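
A minimal sketch of the broadcasting rule, using arbitrary shapes chosen for illustration: axes of length 1 are stretched, and a 1D array is aligned with the trailing axis of the other operand.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Both length-1 axes are stretched, so the result has shape (3, 4).
total = col + row
print(total.shape)

# np.broadcast_to makes the stretched operand explicit (as a read-only view).
col_full = np.broadcast_to(col, (3, 4))
row_full = np.broadcast_to(row, (3, 4))
print(np.array_equal(total, col_full + row_full))  # True

# A 1D array of length 4 lines up with the last axis of a (3, 4) array.
m = np.ones((3, 4))
scaled = m * np.array([1, 2, 3, 4])
print(scaled.shape)  # (3, 4)
```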

### Arithmetic operations
![arithmetic-ops](pics/arithmetic-ops.png)

In [102]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
x+y

array([[ 6,  8],
       [10, 12]])

In [103]:
y-x

array([[4, 4],
       [4, 4]])

In [104]:
x*y

array([[ 5, 12],
       [21, 32]])

In [105]:
y/x

array([[5.        , 3.        ],
       [2.33333333, 2.        ]])

In [106]:
x*2

array([[2, 4],
       [6, 8]])

In [107]:
2**x

array([[ 2,  4],
       [ 8, 16]])

In [108]:
y/2

array([[2.5, 3. ],
       [3.5, 4. ]])

In [109]:
(y/2).dtype

dtype('float64')

If a math operation is performed on incompatible (size or shape) arrays, a __ValueError__ is raised.

In [110]:
x = np.array([1, 2, 3, 4]).reshape(2,2)
x

array([[1, 2],
       [3, 4]])

In [111]:
z = np.array([1, 2, 3, 4])
z

array([1, 2, 3, 4])

In [112]:
# incompatible size/shape
x / z

ValueError: operands could not be broadcast together with shapes (2,2) (4,) 

An example of successful broadcasting to a correct shape.

In [113]:
z = np.array([[2, 4]])

In [114]:
z.shape

(1, 2)

In [115]:
x/z

array([[0.5, 0.5],
       [1.5, 1. ]])

In [116]:
zz = np.concatenate([z, z], axis=0)
zz

array([[2, 4],
       [2, 4]])

In [117]:
x/zz

array([[0.5, 0.5],
       [1.5, 1. ]])

In [118]:
z = np.array([[2], [4]])
z.shape

(2, 1)

In [119]:
x/z

array([[0.5 , 1.  ],
       [0.75, 1.  ]])

In [120]:
zz = np.concatenate([z, z], axis=1)
zz

array([[2, 2],
       [4, 4]])

In [121]:
x/zz

array([[0.5 , 1.  ],
       [0.75, 1.  ]])

In [122]:
x = np.array([[1, 3], [2, 4]])
x = x + y
x

array([[ 6,  9],
       [ 9, 12]])

In [123]:
x = np.array([[1, 3], [2, 4]])
x += y
x

array([[ 6,  9],
       [ 9, 12]])

### Elementwise functions
![element-wise-math](pics/element-wise-math-functs.png)

In [124]:
x = np.linspace(-1, 1, 11)
x

array([-1. , -0.8, -0.6, -0.4, -0.2,  0. ,  0.2,  0.4,  0.6,  0.8,  1. ])

In [125]:
y = np.sin(np.pi * x)
y

array([-1.22464680e-16, -5.87785252e-01, -9.51056516e-01, -9.51056516e-01,
       -5.87785252e-01,  0.00000000e+00,  5.87785252e-01,  9.51056516e-01,
        9.51056516e-01,  5.87785252e-01,  1.22464680e-16])

In [126]:
np.round(y, decimals=4) # round FP numbers to 4 decimals

array([-0.    , -0.5878, -0.9511, -0.9511, -0.5878,  0.    ,  0.5878,
        0.9511,  0.9511,  0.5878,  0.    ])

In [127]:
np.add(np.sin(x)**2, np.cos(x)**2)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [128]:
np.sin(x)**2 + np.cos(x)**2

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

![element-wise-math](pics/element-wise-math.png)

* Sometimes need to define new functions that use NumPy arrays element-by-element. __vectorize()__ may help; it transforms a (usually scalar) function.

In [129]:
def heaviside(x):
    return 1 if x > 0 else 0

heaviside(-1), heaviside(1.5)

(0, 1)

In [130]:
# won't work
heaviside(np.linspace(-5, 5, 11))

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [131]:
heaviside = np.vectorize(heaviside) # vectorize to the rescue!

In [132]:
# works, but relatively slow.
# better to use boolean-valued arrays (to be discussed later)
# use as quick-n-dirty check
heaviside(x)

array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
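
As the comment in the cell above suggests, a Boolean array usually does the same job faster than `np.vectorize`. A sketch of that alternative, with the hypothetical helper name `heaviside_bool`:

```python
import numpy as np

def heaviside_bool(x):
    """Step function: 1 where x > 0, else 0 (hypothetical helper name)."""
    x = np.asarray(x)
    # The comparison yields a Boolean array; astype(int) maps False/True to 0/1.
    return (x > 0).astype(int)

x = np.linspace(-1, 1, 11)
print(heaviside_bool(x))

# Same result as the np.vectorize version, without a Python-level loop.
reference = np.vectorize(lambda v: 1 if v > 0 else 0)(x)
print(np.array_equal(heaviside_bool(x), reference))  # True
```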

### Aggregate functions

Accept array inputs, return scalar outputs.

![aggregate-funcs](pics/aggregate-funcs.png)

In [134]:
data = np.random.normal(size=(15,15))
data

array([[-0.10086431,  0.76224538, -1.12926943,  0.53121256, -0.36535128,
         0.50170388,  0.67737089, -0.51205381, -2.26685089, -1.09827727,
        -0.32990823,  0.76602886, -0.72579968, -1.46045147,  0.17239457],
       [-0.59683783, -1.55160052, -1.32889603, -0.90940541,  0.8091943 ,
         0.31540003,  0.31799856, -0.3079141 , -0.2436456 ,  0.47722968,
         0.80746319,  0.20932876, -0.45709807, -0.80435794,  0.16019832],
       [-0.41403936, -0.41930597,  1.03435818,  1.45002598,  0.115924  ,
         0.25127537,  0.00867024,  0.00769048,  0.49350609, -0.52187048,
        -0.90028103, -0.22814053, -0.23964803, -0.44615822,  3.0310091 ],
       [-0.0619028 ,  0.56783293,  0.11267067,  1.24496783,  0.23888096,
         0.25630935, -0.17890026,  0.71160918, -0.63097631,  1.44766332,
         0.51714557,  1.50226782, -0.31427504, -0.04847244, -1.3951009 ],
       [-1.24615831,  0.67954663,  1.37497039,  1.67306127, -0.29005785,
        -1.77186938, -0.58214599, -0.753768  , 

In [135]:
np.mean(data), data.mean()

(0.07099110732333659, 0.07099110732333659)

In [137]:
data = np.random.normal(size=(5, 10, 15))

In [138]:
data.sum(axis=0).shape

(10, 15)

In [139]:
data.sum(axis=(0, 2)).shape

(10,)

In [140]:
data.sum()

-12.360001099344707

In [141]:
data = np.arange(1,10).reshape(3,3)
data

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [142]:
data.sum()

45

In [143]:
data.sum(axis=0)

array([12, 15, 18])

In [144]:
data.sum(axis=1)

array([ 6, 15, 24])

![sum by axis](pics/sum-by-axis.png)

### Boolean arrays and vectorized conditional expressions
* Enables you to avoid using if statements. Winning!

In [145]:
a = np.array([1, 2, 3, 4])
b = np.array([4, 3, 2, 1])
a<b

array([ True,  True, False, False])

In [146]:
np.all(a<b), np.any(a<b)

(False, True)

In [147]:
if np.all(a < b):
    print("All a's are smaller than corresponding b's")
elif np.any(a < b):
    print("Some a's are smaller than corresponding b's")
else:
    print("All b's are smaller than corresponding a's")

Some a's are smaller than corresponding b's


In [148]:
x = np.array([-2, -1, 0, 1, 2])
x>0

array([False, False, False,  True,  True])

In [149]:
1*(x>0)

array([0, 0, 0, 1, 1])

In [150]:
x*(x>0)

array([0, 0, 0, 1, 2])

Conditional computing - for example, defining piecewise functions.

In [154]:
x = np.linspace(-5, 5, 11)
x

array([-5., -4., -3., -2., -1.,  0.,  1.,  2.,  3.,  4.,  5.])

In [158]:
def pulse(x, position, height, width):
    return height * (x >= position) * (x <= (position + width))

# expression is a multiplication of two Boolean arrays,
# so multiplication acts as an elementwise AND operator.

In [156]:
pulse(x, position=-2, height=1, width=5)

array([0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0])

In [157]:
pulse(x, position=1, height=1, width=5)

array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

### Conditional / Logical Functions
![conditionals-logicals](pics/conditional-logical-funcs.png)

In [159]:
def pulse(x, position, height, width):
    return height * np.logical_and(x >= position, x <= (position + width))

In [160]:
x = np.linspace(-4, 4, 9)
x

array([-4., -3., -2., -1.,  0.,  1.,  2.,  3.,  4.])

In [161]:
np.where(x < 0, x**2, x**3) # select elements from two arrays

array([16.,  9.,  4.,  1.,  0.,  1.,  8., 27., 64.])

In [163]:
np.select([x < -1, x < 2, x >= 2],
          [100, 200, 300]) # choose val from list of conditions.

array([100, 100, 100, 200, 200, 200, 300, 300, 300])

In [166]:
np.choose([0, 0, 0, 1, 1, 1, 2, 2, 2], 
          [x**2, x**3, x**4]) # choose val from list of arrays.

array([ 16.,   9.,   4.,  -1.,   0.,   1.,  16.,  81., 256.])

In [167]:
np.nonzero(abs(x)>2) # returns tuple of indices

(array([0, 1, 7, 8]),)

In [168]:
x[np.nonzero(abs(x)>2)] # array accessed by tuple of indices

array([-4., -3.,  3.,  4.])

### Set operations
* Manages __unordered collections__ of unique objects.
![set-funcs](pics/set-funcs.png)

In [169]:
a = np.unique([1,2,3,3])
b = np.unique([2,3,4,4,5,6,5])

In [170]:
np.isin(a, b) # test each element of a for membership in b (np.in1d is deprecated)

array([False,  True,  True])

In [171]:
1 in a # testing for single element presence

True

In [172]:
1 in b

False

In [173]:
np.all(np.isin(a, b)) # a = subset of b?

False

In [174]:
np.union1d(a, b)

array([1, 2, 3, 4, 5, 6])

In [175]:
np.intersect1d(a, b)

array([2, 3])

In [176]:
np.setdiff1d(a, b)

array([1])

In [177]:
np.setdiff1d(b, a)

array([4, 5, 6])

### Operations on arrays

Operations that act upon arrays __as a single entity__, and return transformed arrays of the same size.

![array-funcs](pics/array-funcs.png)

In [184]:
data = np.arange(9).reshape(3, 3)
data

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [185]:
np.transpose(data) # transpose also exists as special method "T"

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

In [186]:
data.T

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

In [190]:
np.fliplr(data)

array([[2, 1, 0],
       [5, 4, 3],
       [8, 7, 6]])

In [191]:
np.flipud(data)

array([[6, 7, 8],
       [3, 4, 5],
       [0, 1, 2]])

### Matrix and vector operations
* Matrix multiplication of arrays is done with __np.dot()__ (or the array method `.dot()`); the `*` operator multiplies elementwise.
![matrix-funcs](pics/matrix-funcs.png)

In [192]:
A = np.arange(1, 7).reshape(2, 3)
A

array([[1, 2, 3],
       [4, 5, 6]])

In [193]:
B = np.arange(1, 7).reshape(3, 2)
B

array([[1, 2],
       [3, 4],
       [5, 6]])

In [194]:
np.dot(A, B)

array([[22, 28],
       [49, 64]])

In [195]:
np.dot(B, A)

array([[ 9, 12, 15],
       [19, 26, 33],
       [29, 40, 51]])

In [196]:
A = np.arange(9).reshape(3, 3)
A

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [197]:
x = np.arange(3)
x

array([0, 1, 2])

In [198]:
# dot operation can also be used for matrix-vector multiplication
np.dot(A, x), A.dot(x)

(array([ 5, 14, 23]), array([ 5, 14, 23]))

Matrix multiplication expressions can quickly get VERY cumbersome. 
Below is an example (similarity transform): A'=BAB^-1:

In [201]:
A = np.random.rand(3,3)
B = np.random.rand(3,3)
A,B

(array([[0.32760441, 0.34538058, 0.53439761],
        [0.51206286, 0.34866607, 0.74527783],
        [0.17163767, 0.24599314, 0.82985868]]),
 array([[0.79568767, 0.42103214, 0.49618383],
        [0.54366265, 0.76413561, 0.59282686],
        [0.32665571, 0.86223813, 0.32895252]]))

In [202]:
Ap = np.dot(B, np.dot(A, np.linalg.inv(B)))
Ap

array([[-1.2469943 ,  4.5196653 , -2.76598763],
       [-1.42611611,  5.31564657, -3.31857753],
       [-0.97087076,  4.0734219 , -2.56252312]])

In [203]:
Ap = B.dot(A.dot(np.linalg.inv(B)))

NumPy provides __matrix__ data structure as an easier-to-read alternative.

In [207]:
A = np.matrix(A)
A

matrix([[0.32760441, 0.34538058, 0.53439761],
        [0.51206286, 0.34866607, 0.74527783],
        [0.17163767, 0.24599314, 0.82985868]])

In [208]:
B = np.matrix(B)
B

matrix([[0.79568767, 0.42103214, 0.49618383],
        [0.54366265, 0.76413561, 0.59282686],
        [0.32665571, 0.86223813, 0.32895252]])

In [209]:
Ap = B*A*B.I # I = inverse matrix
Ap

matrix([[-1.2469943 ,  4.5196653 , -2.76598763],
        [-1.42611611,  5.31564657, -3.31857753],
        [-0.97087076,  4.0734219 , -2.56252312]])

Unfortunately __matrix__ has some disadvantages & is usually discouraged. The primary problem is that expressions like A * B are context dependent, which causes readability issues.

In some cases consider explicitly casting arrays to matrices before computation, then explicitly casting the result back to ndarray.

In [212]:
A = np.asmatrix(A); A

matrix([[0.32760441, 0.34538058, 0.53439761],
        [0.51206286, 0.34866607, 0.74527783],
        [0.17163767, 0.24599314, 0.82985868]])

In [211]:
B = np.asmatrix(B); B

matrix([[0.79568767, 0.42103214, 0.49618383],
        [0.54366265, 0.76413561, 0.59282686],
        [0.32665571, 0.86223813, 0.32895252]])

In [213]:
Ap = B*A*B.I

In [214]:
Ap = np.asarray(Ap); Ap

array([[-1.2469943 ,  4.5196653 , -2.76598763],
       [-1.42611611,  5.31564657, -3.31857753],
       [-0.97087076,  4.0734219 , -2.56252312]])
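
Another way to keep such expressions readable, without the matrix class, is the `@` operator, which performs matrix multiplication on ordinary arrays (Python 3.5+ with NumPy 1.10+). The sketch below uses small fixed matrices chosen for illustration instead of the random ones above.

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)
# An invertible matrix picked for this example (its determinant is 18).
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# Similarity transform written with nested np.dot calls ...
Ap_dot = np.dot(B, np.dot(A, np.linalg.inv(B)))

# ... and with the @ operator on plain ndarrays.
Ap_at = B @ A @ np.linalg.inv(B)

print(np.allclose(Ap_dot, Ap_at))  # True
print(type(Ap_at))                 # still a numpy.ndarray, not a matrix

# Sanity check: a similarity transform preserves the trace.
print(np.isclose(np.trace(Ap_at), np.trace(A)))  # True
```

With `@`, `*` keeps its elementwise meaning, so the expression is not context dependent.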

In [215]:
np.inner(x,x)

5

In [216]:
np.dot(x,x)

5

__np.inner__ expects two inputs with the same dimension.
__np.dot__ can take input vectors of shape _1xN_ & _Nx1_ respectively.

In [217]:
y = x[:, np.newaxis]; y

array([[0],
       [1],
       [2]])

In [218]:
np.dot(y.T, y)

array([[5]])

__np.outer__ maps two vectors to a matrix.

In [219]:
x = np.array([1,2,3]); x

array([1, 2, 3])

In [220]:
np.outer(x,x) 

array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])

__np.kron__ computes the Kronecker product, a generalized outer product: for inputs of shape (M,N) and (P,Q) it returns an output array of shape (M*P, N*Q).

In [221]:
np.kron(x,x) 

array([1, 2, 3, 2, 4, 6, 3, 6, 9])

In [222]:
np.kron(x[:, np.newaxis], x[np.newaxis, :])

array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])

In [223]:
np.kron(np.ones((2,2)), np.identity(2))

array([[1., 0., 1., 0.],
       [0., 1., 0., 1.],
       [1., 0., 1., 0.],
       [0., 1., 0., 1.]])

In [224]:
np.kron(np.identity(2), np.ones((2,2)))

array([[1., 1., 0., 0.],
       [1., 1., 0., 0.],
       [0., 0., 1., 1.],
       [0., 0., 1., 1.]])

__np.einsum__ expresses common array operations using Einstein's summation convention. Its first argument is __an index expression__ (a string with comma-separated indices); an index that is repeated is summed over.

In [225]:
x = np.array([1, 2, 3, 4])

In [226]:
y = np.array([5, 6, 7, 8])

In [227]:
np.einsum("n,n", x, y)

70

In [228]:
np.inner(x,y)

70

In [231]:
A = np.arange(9).reshape(3, 3); A

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [232]:
B = A.T; B

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

In [233]:
np.einsum("mk,kn", A, B)

array([[  5,  14,  23],
       [ 14,  50,  86],
       [ 23,  86, 149]])

In [234]:
np.all(np.einsum("mk,kn", A, B) == np.dot(A, B))

True
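
A few more index expressions, as a sketch of how the same convention covers other common operations. The arrays are arbitrary examples; the explicit `->` output form is used where the result keeps one or more indices.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
x = np.array([1, 2, 3])

trace = np.einsum("ii", A)           # repeated index on one operand: trace
transposed = np.einsum("ij->ji", A)  # reordered output indices: transpose
matvec = np.einsum("ij,j->i", A, x)  # sum over j: matrix-vector product
outer = np.einsum("i,j->ij", x, x)   # no summed index: outer product

print(trace)  # 12
print(np.array_equal(transposed, A.T))
print(np.array_equal(matvec, np.dot(A, x)))
print(np.array_equal(outer, np.outer(x, x)))
```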