# Numpy Reference Guide
*A guide to useful methods I've discovered over time, along with some standard ones; compiled by Andrea Walker, February 2021.*

# Basics

## Creating Arrays and Matrices

### np.array(), .ones(), .zeros(), .empty()

In [2]:
import numpy as np

a = np.array([3, 6, 9])   # create a general array from a list
ones = np.ones(3)         # array of ones; pass an integer (length of a 1-D array) or a tuple (shape of an N-D array)
zeros = np.zeros((2, 3))  # same syntax as ones
empty = np.empty(3)       # same syntax as ones; the contents are uninitialized (arbitrary values)

### numpy.full(shape, fill_value, dtype=None, order='C', *, like=None)
> Return a new array of given shape and type, filled with fill_value.\
> https://numpy.org/doc/stable/reference/generated/numpy.full.html

#### *This is useful for creating an array of booleans*

In [3]:
np.full((3,4),True)

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

## Generating Series of numbers
A general comment on the two main NumPy functions for generating series of numbers, arange and linspace (if this block of text is too long, jump down to the TL;DR).

### np.linspace(start, stop, num=50)
> Linspace returns an array of num elements evenly spaced over the closed interval [start, stop]. More verbosely, linspace requires two arguments (start and stop) and returns an array of length num (default is 50) containing evenly spaced numbers beginning with start, up to and including stop.\
> https://numpy.org/doc/stable/reference/generated/numpy.linspace.html

### np.arange([start, ]stop, [step, ])
> Arange returns an array of evenly spaced elements over the half-open interval [start, stop). The number of elements is determined by step, which defaults to 1. More verbosely, arange requires one argument (stop) and returns an array containing evenly spaced numbers beginning with start (defaults to 0), at intervals of step (defaults to 1), up to but not including stop.\
> https://numpy.org/doc/stable/reference/generated/numpy.arange.html

TL;DR: To create an array of evenly spaced numbers, use linspace if you care about the number of points in the array; use arange if you care about the distance between the points.

In [4]:
lnspace = np.linspace(1,10,20)
arng = np.arange(1,10,1.5)
print(lnspace,'\n',arng)

[ 1.          1.47368421  1.94736842  2.42105263  2.89473684  3.36842105
  3.84210526  4.31578947  4.78947368  5.26315789  5.73684211  6.21052632
  6.68421053  7.15789474  7.63157895  8.10526316  8.57894737  9.05263158
  9.52631579 10.        ] 
 [1.  2.5 4.  5.5 7.  8.5]
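To make the TL;DR concrete, here is a minimal sketch that builds the same interval both ways. The variable names (`by_count`, `by_step`, `spacing`) are illustrative only.

```python
import numpy as np

# linspace: fix the number of points; the stop value is included by default
by_count = np.linspace(0.0, 1.0, num=5)

# arange: fix the step; the stop value is excluded
by_step = np.arange(0.0, 1.0, 0.25)

# retstep=True also returns the spacing that linspace chose
_, spacing = np.linspace(0.0, 1.0, num=5, retstep=True)

print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(by_step)   # [0.   0.25 0.5  0.75]
print(spacing)   # 0.25
```

With a non-integer step, the length of an arange result can be hard to predict because of floating-point rounding, which is another reason to reach for linspace when the point count matters.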


## Matrices 
### numpy.identity(n, dtype=None, *, like=None)
>Return the identity array. \
>The identity array is a square array with ones on the main diagonal.\
>Parameters: n (int) := Number of rows (and columns) in n x n output.

> https://numpy.org/doc/stable/reference/generated/numpy.identity.html

In [5]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

### numpy.eye(N, M=None, k=0, dtype=<class 'float'>, order='C', *, like=None)
> Return a 2-D array with ones on the diagonal and zeros elsewhere.\
> Parameters: N (int) :=Number of rows in the output.\
> M (int), optional := Number of columns in the output. If None, defaults to N.

> https://numpy.org/doc/stable/reference/generated/numpy.eye.html

In [6]:
np.eye(3,4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])
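The signature above also lists a `k` argument, which selects which diagonal receives the ones. A small sketch, with illustrative variable names:

```python
import numpy as np

# k=1 puts the ones on the first superdiagonal, k=-1 on the first subdiagonal
upper = np.eye(3, k=1)
lower = np.eye(3, k=-1)

# with default arguments, eye(n) gives the same result as identity(n)
same = np.array_equal(np.eye(3), np.identity(3))

print(upper)
print(lower)
print(same)  # True
```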

### numpy.atleast_2d(*arys)
> View inputs as arrays with at least two dimensions.\
> Parameters: arys1, arys2, … (array_like) := One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have two or more dimensions are preserved.\
> Returns: res, res2, … (ndarray) := An array, or list of arrays, each with a.ndim >= 2. Copies are avoided where possible, and views with two or more dimensions are returned.

>https://numpy.org/doc/stable/reference/generated/numpy.atleast_2d.html

In [7]:
x= np.arange(5)
np.atleast_2d(x)

array([[0, 1, 2, 3, 4]])

# Operations on Arrays and Matrices

## Maximum and Minimum values, indices

**Note: for N-D arrays (N>1), all of the following flatten the input array unless an axis is specified.**

### numpy.amax(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
>Return the maximum of an array or maximum along an axis. \
>https://numpy.org/doc/stable/reference/generated/numpy.amax.html
    
### numpy.argmax(a, axis=None, out=None)
>Returns the indices of the maximum values along an axis. \
> https://numpy.org/doc/stable/reference/generated/numpy.argmax.html
    
### numpy.amin(a, axis=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
>Return the minimum of an array or minimum along an axis.\
>https://numpy.org/doc/stable/reference/generated/numpy.amin.html
    
### numpy.argmin(a, axis=None, out=None)
>Returns the indices of the minimum values along an axis. \
>https://numpy.org/doc/stable/reference/generated/numpy.argmin.html
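A short sketch of how the four functions behave on a 2-D array, with and without an axis. The array `a` is made up for illustration, and `np.unravel_index` is used here as one way to turn a flattened index back into a (row, column) pair.

```python
import numpy as np

a = np.array([[1, 9, 3],
              [7, 2, 8]])

overall_max = np.amax(a)       # 9, taken over the flattened array
col_max = np.amax(a, axis=0)   # [7 9 8], maximum of each column
row_min = np.amin(a, axis=1)   # [1 2], minimum of each row

flat_idx = np.argmax(a)        # 1, an index into the flattened array
row_col = np.unravel_index(flat_idx, a.shape)  # (0, 1) as (row, column)
col_argmin = np.argmin(a, axis=0)  # [0 1 0], row index of each column's minimum

print(overall_max, col_max, row_min)
print(flat_idx, row_col, col_argmin)
```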

## Rounding 

### numpy.rint(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'rint'>
>Round elements of the array to the nearest integer. \
>Parameters: x (array_like) := Input array.\
>https://numpy.org/doc/stable/reference/generated/numpy.rint.html



In [8]:
x = np.array([1.2,3.5,5.7])
np.rint(x)

array([1., 4., 6.])
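One detail worth knowing: for values exactly halfway between two integers, rint rounds to the nearest even integer, and the result keeps a floating-point dtype. A minimal sketch with illustrative names:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# ties go to the nearest even integer: [0. 2. 2. 4.]
rounded = np.rint(halves)

# rint keeps the float dtype; cast explicitly if integers are needed
as_ints = rounded.astype(int)

print(rounded)
print(as_ints)
```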

## Multiply / Add elements (along axis, or all)

### numpy.sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
> Sum of array elements over a given axis. \
> Parameters: a (array_like) := Elements to sum. \
> axis (None or int or tuple of ints, optional) := Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis. If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.\
> https://numpy.org/doc/stable/reference/generated/numpy.sum.html
    

### numpy.prod(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
> Return the product of array elements over a given axis. \
> Parameters: a (array_like) := Input data. \
> axis (None or int or tuple of ints, optional) := Axis or axes along which a product is performed. The default, axis=None, will calculate the product of all the elements in the input array. If axis is negative it counts from the last to the first axis. If axis is a tuple of ints, a product is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.\
> https://numpy.org/doc/stable/reference/generated/numpy.prod.html

In [9]:
print(np.sum([1,2,3,4]))
print(np.prod([2,2,2,2]))

10
16
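The axis argument described above is easier to see on a 2-D array. A small sketch (the array and variable names are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(a)                  # 21, all elements (axis=None)
col_sums = np.sum(a, axis=0)       # [5 7 9], collapse the rows
row_sums = np.sum(a, axis=-1)      # [ 6 15], a negative axis counts from the end
row_prods = np.prod(a, axis=1)     # [  6 120]
both_axes = np.sum(a, axis=(0, 1)) # 21, a tuple sums over several axes at once

print(total, col_sums, row_sums, row_prods, both_axes)
```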


## Sorting

### numpy.argsort(a, axis=-1, kind=None, order=None)
> Returns the indices that would sort an array.\
> Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.

>https://numpy.org/doc/stable/reference/generated/numpy.argsort.html

In [10]:
x = np.array([3,4,1,5,2])
np.argsort(x)

array([2, 4, 0, 1, 3])
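The returned indices are mostly useful for reordering the array itself, or a second array that is paired with it. A minimal sketch, where `labels` is a made-up companion array:

```python
import numpy as np

x = np.array([3, 4, 1, 5, 2])
labels = np.array(['c', 'd', 'a', 'e', 'b'])  # hypothetical data paired with x

order = np.argsort(x)            # [2 4 0 1 3]
x_sorted = x[order]              # [1 2 3 4 5]
labels_sorted = labels[order]    # labels reordered to follow x
x_descending = x[order[::-1]]    # reverse the index array for descending order

print(x_sorted, labels_sorted, x_descending)
```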

## Histogramming / binning

### numpy.histogram(a, bins=10, range=None, normed=None, weights=None, density=None)
> Compute the histogram of a set of data. \
> Parameters: a (array_like) := Input data. The histogram is computed over the flattened array. \
> bins (int or sequence of scalars or str, optional)\
> https://numpy.org/doc/stable/reference/generated/numpy.histogram.html
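A minimal usage sketch with made-up data. It passes only `bins`, `range`, and `density`, and shows that the function returns both the counts and the bin edges.

```python
import numpy as np

data = np.array([0.5, 1.5, 1.7, 2.2, 2.9, 3.0])

# three equal-width bins over [0, 3]; the last bin includes its right edge
counts, edges = np.histogram(data, bins=3, range=(0, 3))

# density=True normalizes so that the histogram integrates to 1
density, _ = np.histogram(data, bins=3, range=(0, 3), density=True)
area = np.sum(density * np.diff(edges))

print(counts)  # [1 2 3]
print(edges)   # [0. 1. 2. 3.]
print(area)
```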

### numpy.histogram2d(x, y, bins=10, range=None, normed=None, weights=None, density=None)
>Compute the bi-dimensional histogram of two data samples. \
>Parameters: x (array_like, shape (N,)) := An array containing the x coordinates of the points to be histogrammed. \
>y (array_like, shape (N,)) := An array containing the y coordinates of the points to be histogrammed. \
>bins (int or array_like or [int, int] or [array, array], optional) := The bin specification:
> * If int, the number of bins for the two dimensions (nx=ny=bins).
> * If array_like, the bin edges for the two dimensions (x_edges=y_edges=bins).
> * If [int, int], the number of bins in each dimension (nx, ny = bins).
> * If [array, array], the bin edges in each dimension (x_edges, y_edges = bins).
> * A combination [int, array] or [array, int], where int is the number of bins and array is the bin edges.
>
> https://numpy.org/doc/stable/reference/generated/numpy.histogram2d.html
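A small sketch of the [int, int] form of `bins`, using five made-up points in the unit square. Note that the first index of the result runs along x and the second along y.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.4, 0.6, 0.9])
y = np.array([0.2, 0.1, 0.8, 0.3, 0.7])

# two bins per dimension over the unit square
H, x_edges, y_edges = np.histogram2d(x, y, bins=[2, 2], range=[[0, 1], [0, 1]])

print(H)        # H[i, j] counts points in x-bin i and y-bin j
print(x_edges)  # [0.  0.5 1. ]
print(y_edges)  # [0.  0.5 1. ]
```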

## Returning a subset of an array

### Based on a condition 


#### Using conditional filters like in Pandas:
>Source: Stack Overflow \
> https://stackoverflow.com/questions/3030480/how-do-i-select-elements-of-an-array-given-condition

> Suppose I have a numpy array x = [5, 2, 3, 1, 4, 5], y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements in y corresponding to elements in x that are greater than 1 and less than 5.

In [11]:
x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f','o','o','b','a','r'])
output = y[(x > 1) & (x < 5)] # desired output is ['o','o','a']
print(output)

['o' 'o' 'a']


Or, using a mask:


#### numpy.compress(condition, a, axis=None, out=None)
> Return selected slices of an array along given axis.\
> When working along a given axis, a slice along that axis is returned in output for each index where condition evaluates to True. When working on a 1-D array, compress is equivalent to extract.\
> https://numpy.org/doc/stable/reference/generated/numpy.compress.html

In [12]:
#(one example: mask)
mask = np.full(5,True)
mask[2]=False
arr = np.arange(5)
new_arr = np.compress(mask,arr)
print("Mask: {} \n Original array: {} \n Masked array: {}".format(mask,arr,new_arr))

Mask: [ True  True False  True  True] 
 Original array: [0 1 2 3 4] 
 Masked array: [0 1 3 4]


In [13]:
#compress based on condition
arr = np.arange(10)
new_arr = np.compress(arr>5,arr)
print(arr,'\n',new_arr)

[0 1 2 3 4 5 6 7 8 9] 
 [6 7 8 9]


### numpy.take(a, indices, axis=None, out=None, mode='raise')
>Take elements from an array along an axis. \
>When axis is not None, this function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis. A call such as np.take(arr, indices, axis=3) is equivalent to arr[:,:,:,indices,...]. \
> The linked documentation also spells out the equivalent explicit loop using ndindex.\
> https://numpy.org/doc/stable/reference/generated/numpy.take.html

In [14]:
ind = np.arange(0,20,step=2)
powers_of_two = np.array([2**x for x in range(25)])  # 2**0 through 2**24

print(ind)
print(powers_of_two)
print(np.take(powers_of_two,ind))

[ 0  2  4  6  8 10 12 14 16 18]
[       1        2        4        8       16       32       64      128
      256      512     1024     2048     4096     8192    16384    32768
    65536   131072   262144   524288  1048576  2097152  4194304  8388608
 16777216]
[     1      4     16     64    256   1024   4096  16384  65536 262144]
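The example above uses the default axis=None. A short sketch of the axis and mode arguments from the signature, on a made-up 3 x 4 array:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

# axis=1 picks whole columns, the same as m[:, [0, 3]]
cols = np.take(m, [0, 3], axis=1)
same = np.array_equal(cols, m[:, [0, 3]])

# axis=None (the default) indexes into the flattened array
flat = np.take(m, [0, 5, 11])

# mode='wrap' wraps out-of-range indices around instead of raising
wrapped = np.take(m, [4, 5], axis=1, mode='wrap')  # columns 0 and 1

print(cols)
print(flat)
print(wrapped)
```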


## Conditionally populate

### numpy.where(condition[, x, y])
>Return elements chosen from x or y depending on condition.\
>https://numpy.org/doc/stable/reference/generated/numpy.where.html

In [15]:
x = np.random.normal(size=10)
x_relu = np.where(x>0,x,0)
print(x,'\n',x_relu)

[ 0.92719407 -1.68623154 -0.57333817  1.31506883  1.17210266 -0.87073216
  0.70888598  0.69615845 -0.45259     0.52158578] 
 [0.92719407 0.         0.         1.31506883 1.17210266 0.
 0.70888598 0.69615845 0.         0.52158578]
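Because the cell above uses random input, here is a deterministic sketch of the same idea. It also shows the one-argument form, which returns the indices where the condition holds rather than choosing between x and y.

```python
import numpy as np

x = np.array([-2.0, 0.5, -0.1, 3.0])

# three-argument form: take x where the condition holds, 0 elsewhere
relu = np.where(x > 0, x, 0)

# the two choices can be any broadcastable values, for example strings
signs = np.where(x > 0, 'pos', 'neg')

# one-argument form: a tuple of index arrays, one per dimension
(positive_idx,) = np.where(x > 0)

print(relu)
print(signs)
print(positive_idx)  # [1 3]
```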


## Comparing two arrays

### numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
>Returns a boolean array where two arrays are element-wise equal within a tolerance. \
> The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.\
> https://numpy.org/doc/stable/reference/generated/numpy.isclose.html

In [16]:
arr1 = np.ones(5)
arr2 = np.ones(5)
close = np.isclose(arr1,arr2,atol=1e-7)
print(close)

[ True  True  True  True  True]


In [17]:
arr1 = 0.5*np.ones(5)
arr2 = np.ones(5)
print(np.isclose(arr1,arr2,atol=0.6))
print(np.isclose(arr1,arr2,atol=0.4))

[ True  True  True  True  True]
[False False False False False]
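The tolerance rule quoted above amounts to the test abs(a - b) <= atol + rtol * abs(b). A minimal sketch that checks this by hand against isclose, using made-up values; `np.allclose` is shown as the single-boolean companion.

```python
import numpy as np

a = np.array([1.0, 1.0 + 1e-6, 1.1])
b = np.ones(3)
rtol, atol = 1e-5, 1e-8  # the default tolerances

# the comparison written out by hand
manual = np.abs(a - b) <= atol + rtol * np.abs(b)
builtin = np.isclose(a, b, rtol=rtol, atol=atol)

# allclose reduces the element-wise result to a single boolean
everything_close = np.allclose(a, b)

print(manual)   # [ True  True False]
print(builtin)  # [ True  True False]
print(everything_close)  # False
```

Since the relative term is scaled by b only, isclose(a, b) and isclose(b, a) can disagree in borderline cases.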


## Boolean Evaluation

### numpy.isinf(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) 
>Test element-wise for positive or negative infinity. \
> Returns a boolean array of the same shape as x, True where x == +/-inf, otherwise False.\
>Parameters: x (array_like, Input values) \
>https://numpy.org/doc/stable/reference/generated/numpy.isinf.html

In [18]:
print(np.isinf(np.inf))
print(np.isinf([3, np.inf, -np.inf]))

True
[False  True  True]


### numpy.isnan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) 
>Test element-wise for NaN and return result as a boolean array.\
>Parameters := x, array_like, Input array.\
>https://numpy.org/doc/stable/reference/generated/numpy.isnan.html

In [19]:
print(np.isnan(np.inf))
print(np.isnan([3, np.inf, -np.inf, np.nan]))

False
[False False False  True]
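In practice these boolean arrays are often used as masks to drop bad values. A small sketch with made-up data; `np.isfinite` and `np.nanmean` are related NumPy functions not covered above.

```python
import numpy as np

x = np.array([1.0, np.nan, np.inf, 4.0])

# drop NaN only: invert the isnan mask with ~
no_nan = x[~np.isnan(x)]

# drop both NaN and +/-inf in one step
finite = x[np.isfinite(x)]

# remove the infinities, then let nanmean ignore the remaining NaN
mean_ignoring_nan = np.nanmean(x[~np.isinf(x)])

print(no_nan)
print(finite)             # [1. 4.]
print(mean_ignoring_nan)  # 2.5
```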


### numpy.all(a, axis=None, out=None, keepdims=<no value>, *, where=<no value>)
>Test whether all array elements along a given axis evaluate to True.\
>https://numpy.org/doc/stable/reference/generated/numpy.all.html

In [20]:
temp = [True,True,False]
print(np.all(temp))
print(np.all(temp[0:2]))

False
True


### numpy.any(a, axis=None, out=None, keepdims=<no value>, *, where=<no value>)
>Test whether any array element along a given axis evaluates to True.\
> Returns a single boolean when axis is None.\
>https://numpy.org/doc/stable/reference/generated/numpy.any.html

In [21]:
temp = [False,False,True]
print(np.any(temp))
print(np.any(temp[0:2]))

True
False
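A short sketch of all and any with an axis, and combined with a comparison, which is how they are typically used. The arrays are made up for illustration.

```python
import numpy as np

m = np.array([[True, False],
              [True, True]])

all_overall = np.all(m)          # False, at least one element is False
all_by_col = np.all(m, axis=0)   # [ True False], one result per column
any_by_row = np.any(m, axis=1)   # [ True  True], one result per row

# typical use: reduce an element-wise comparison to one answer
x = np.array([1, 2, 3])
all_positive = np.all(x > 0)     # True
any_above_two = np.any(x > 2)    # True

print(all_overall, all_by_col, any_by_row)
print(all_positive, any_above_two)
```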


## Matrix math and Linear algebra

### Matrix inverses

### numpy.linalg.inv(a)
> Compute the (multiplicative) inverse of a matrix.\
> Given a square matrix a, return the matrix ainv satisfying dot(a, ainv) = dot(ainv, a) = eye(a.shape[0]).

> https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html
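A minimal sketch that inverts a made-up 2 x 2 matrix and checks the defining property quoted above. The last lines use `np.linalg.solve`, a related function not covered in this guide, which is usually the better choice when the goal is to solve A x = b rather than to obtain the inverse itself.

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

A_inv = np.linalg.inv(A)

# check the defining property up to floating-point error
is_inverse = np.allclose(A @ A_inv, np.eye(2)) and np.allclose(A_inv @ A, np.eye(2))

# solving A x = b directly, without forming the inverse
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)

print(A_inv)
print(is_inverse)  # True
print(x)
```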

### numpy.linalg.pinv(a, rcond=1e-15, hermitian=False)
>Compute the (Moore-Penrose) **pseudo-inverse** of a matrix.\
>Calculate the generalized inverse of a matrix using its singular-value decomposition (SVD) and including all large singular values.\
> Changed in version 1.14: Can now operate on stacks of matrices \
>Parameters: a(…, M, N), array_like, := Matrix or stack of matrices to be pseudo-inverted.

>https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html
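The pseudo-inverse is useful when the matrix is not square, so inv does not apply. A small sketch with a made-up 3 x 2 matrix; the comparison against `np.linalg.lstsq` is included only as a sanity check and is not part of the guide above.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])  # 3 x 2, so there is no ordinary inverse

A_pinv = np.linalg.pinv(A)  # shape (2, 3)

# one of the Moore-Penrose conditions: A @ A_pinv @ A reproduces A
reproduces_A = np.allclose(A @ A_pinv @ A, A)

# the pseudo-inverse gives the least-squares solution of A x = b
b = np.array([1.0, 2.0, 3.0])
x = A_pinv @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(A_pinv.shape)  # (2, 3)
print(reproduces_A)  # True
print(x, x_lstsq)
```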

# Appendix: Non-NumPy tips and tricks

## Printing and String formatting

### String number formatting: 
https://mkaz.blog/code/python-string-format-cookbook/

### Printing Greek Letters and special characters:
https://pythonforundergradengineers.com/unicode-characters-in-python.html

### Printing the ± symbol:

In [22]:
print(u"\u00B1")
print("\u00B1")

±
±


## Python Lambda Functions
https://realpython.com/python-lambda/

## Jupyter notebook 

### Keyboard shortcuts
https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_Jupyter.html

### iPython magic commands
https://ipython.readthedocs.io/en/stable/interactive/magics.html

### Saving output of Jupyter notebook cell to a file:
https://stackoverflow.com/questions/27994137/how-to-save-the-output-of-a-cell-in-ipython-notebook