Useful functions for NumPy: 

https://numpy.org/doc/stable/reference/index.html 


<!-- # Quick Lookup

This part is intended for including a list of functionalities numpy has for quick lookups.  -->

# Array Creation 

### From array: 

np.array([input array])

np.zeros((row, col))

np.eye(N=<font  color='green'>Rows of matrix</font>, M=<font  color='green'>number of columns; "$N$" by default</font>, k=<font  color='green'>diagonal shift; "$k > 0$" refers to upper diagonal</font>)

np.full((row, col), <font  color='green'>value to fill in</font>)
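A minimal sketch of the creation helpers above; the variable names are illustrative only and the shapes are arbitrary choices.

```python
import numpy as np

zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
upper = np.eye(3, k=1)        # ones on the first diagonal ABOVE the main one
wide = np.eye(2, 3)           # non-square: N=2 rows, M=3 columns
filled = np.full((2, 2), 7)   # 2x2 array where every entry is 7

print(upper)
print(filled)
```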

### From file: 
np.fromfile(file_source, count=<font  color='green'>how many elements to take</font>, sep=<font  color='green'>use what to separate each element</font>)

### From interval: 

np.arange(stop), np.arange(start, stop), np.arange(start, stop, interval) -> lst[:stop], lst[start:stop], lst[start:stop:interval]
> create an array of evenly spaced values from "start" (inclusive) to "stop" (exclusive), with an optional "interval" (step) between consecutive values. Integer arguments give an integer array. 

<br>

np.linspace(<font  color='green'>start</font>, <font  color='red'>stop</font>, <font  color='orange'>num=50</font>)
> generate an array of "num" numbers with constant separation; "stop" is included by default. <br> Note: the generated numbers are floats

- <font  color='green'>start</font>: the beginning element of the list
- <font  color='red'>stop</font>: the ending element of the list
- <font  color='orange'>num</font>: the amount of numbers in the generated list.  

<br>

np.logspace(start, stop, num=50, <font  color='green'>endpoint=True</font>, <font  color='orange'>base=10.0</font>)    
> generate an array of numbers whose logarithms (in the given base) are evenly spaced between "start" and "stop", i.e. $base^{start}, \dots, base^{stop}$. <br>
np.logspace(1, 3, 3): outputs {10, 100, 1000} as log10(10)=1, log10(100)=2, log10(1000)=3. 
- <font  color='green'>endpoint</font>: {True} if "stop" is included as the last element, and {False} otherwise. 
- <font  color='orange'>base</font>: the base of the logarithm. Default is 10.0.


<br>

np.geomspace(start, stop, num=50, endpoint=True)   
> generate an array of numbers forming a geometric progression from "start" to "stop": the ratio between consecutive elements is constant. <br> 
the ratio is not passed explicitly; it follows from "start", "stop" and "num". <br>
np.geomspace(1, 32, 6): outputs {1, 2, 4, 8, 16, 32}; the common ratio works out to 2. 
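The four interval functions can be compared side by side. This is a small sketch with arbitrarily chosen arguments; the variable names are not part of NumPy.

```python
import numpy as np

ints = np.arange(2, 10, 3)           # [2 5 8]; "stop" is excluded
lin = np.linspace(0.0, 1.0, num=5)   # [0. 0.25 0.5 0.75 1.]; "stop" is included
log = np.logspace(1, 3, num=3)       # [10. 100. 1000.], i.e. 10**1, 10**2, 10**3
geo = np.geomspace(1, 32, num=6)     # [1. 2. 4. 8. 16. 32.]

print(geo[1:] / geo[:-1])            # constant ratio between neighbours
```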

### Diagonal entries: 

np.diag(1D_array), np.diag(2D_array, <font  color='green'>k=0</font>): 
> if input array is one dimensional: return a 2D diagonal array whose entries are those in 1D array. <br>
if input array is 2D: return 1D array of 2D input array's diagonal elements. 
- <font  color='green'>k</font>: the diagonal adjustment factor. When $k>0$, diagonal will be shifted upward. 
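A short sketch of both directions of "np.diag" (1D to 2D and 2D to 1D), with small hand-picked inputs:

```python
import numpy as np

m = np.diag([1, 2, 3])            # 1D input -> 3x3 matrix with 1, 2, 3 on the main diagonal
main = np.diag(m)                 # 2D input -> its main diagonal, [1 2 3]
shifted = np.diag([1, 2], k=1)    # 3x3 matrix with 1, 2 on the first diagonal above the main one
above = np.diag(np.arange(9).reshape(3, 3), k=1)  # [1 5]

print(shifted)
print(above)
```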

<br>

np.diagonal(ndarray, <font  color='green'>offset=0</font>, <font  color='orange'>axis1=0</font>, <font  color='orange'>axis2=1</font>)
> returns an ($n-1$)D array: handles diagonal element extraction for higher dimensional arrays. 
- <font  color='green'>offset</font>: if $>0$, the extracted diagonal is shifted upward (above the main diagonal)
- <font  color='orange'>axis1, axis2</font>: the two axes that form the 2D matrices from which diagonal elements are extracted. 

In [None]:
import numpy as np

# below is a conventional way for calculating dot products
a = np.arange(24).reshape(3, 4, 2) # an array representing 3 layers of 4 vectors, each has length 2; a1, a2, a3, a4
b = np.arange(8).reshape(4, 2) # 4 vectors each with length 2; b1, b2, b3, b4
c = np.dot(a, b.T) # has shape [3, 4, 4]
v_comp = np.diagonal(c, 0, axis1=-1, axis2=-2) # diagonal elements are correct dot products: a1&b1, a2&b2, a3&b3, a4&b4
print("v_comp: ")
print(v_comp)
print("c: ")
print(c)

# Array indexing: 

### using slices: 
Slicing is equivalent to python list slicing, but applied on multiple dimensions, separated by comma ",". 

In [None]:
import numpy as np
test_indexing = np.arange(1024).reshape(4, 8, 4, 8)
print(test_indexing[0:2:1].shape) # this only slices at axis=0
print(test_indexing[..., 3:6:2, :].shape) # this slices only at third axis; "..." stands for all the remaining (leading) axes
                             # warning: "..." may appear at most once in an index, otherwise numpy won't know which dimensions it covers!!!
print(test_indexing[:, 2:6:2, 1:3, :].shape) # multiple axis slicing
print(test_indexing[:, 2:7:2, 1:3, :].shape) # multiple axis slicing
print(test_indexing[:, 2:-1:2, 1:3, :].shape) # multiple axis slicing

### Advanced indexing: 
np.indices(shape_tuple) -> np.array
> return an array of index grids, used for indexing. <br> Specifically, the returned array has shape $[n, shape_1, shape_2, \dots, shape_n]$, where "n" is the number of axes in "shape_tuple"; the $i$-th sub-array has shape "shape_tuple", and each of its elements is that position's index along the $i$-th axis. 
See below code demo for how to use it appropriately
- shape_tuple: a python tuple giving the shape of the array to produce indices for. 

Using coordinate indexes to extract specific elements. 

In [None]:
# visualize indices
import numpy as np
print(np.indices((2, 3))) # the upper array indicates the index of corresponding element at axis 0, and lower is the same for axis 1


In [None]:
import numpy as np

test_indexing = np.arange(1024).reshape(4, 8, 4, 8)
indices = np.indices((4, 8, 4, 8)) # this gives an array of shape (4, 4, 8, 4, 8): one index grid per axis

indices_taking = test_indexing[indices[0], indices[1], 
                        indices[2], indices[3]] # can print the result
print(np.count_nonzero(indices_taking == test_indexing)) # booleans are representable by "0, 1"
    # this line shows indices can help reproducing original array. 
indices_taking = test_indexing[indices[0], indices[2], 
                        indices[2], indices[3]]
print(np.count_nonzero(indices_taking == test_indexing)) # this shows the ordering of indices matters

test_indexing = np.arange(1024).reshape(32, 32)
print(test_indexing[[2, 4, 1, 15], [2, 1, 10, 19]]) # this gives an array of shape [4], 
            # where elements are respectively: test_indexing[2, 2], test_indexing[4, 1], 
            # test_indexing[1, 10] and test_indexing[15, 19]
print(test_indexing[2, 2], test_indexing[15, 19])

print(test_indexing[[[2, 4, 1, 15], [2, 4, 1, 15]], [[2, 1, 10, 19], [2, 4, 1, 15]]])
    # creates an array of size [2, 4]; realize when indexing, the shape of indices must be 
    # broadcastable along all dimensions!!!

# broadcasting
print("broadcast: \n" + str(test_indexing[[[2, 4, 1, 15], [2, 4, 1, 15]], [[2, 1, 10, 19]]]))
print(test_indexing[[[2, 4, 1, 15], [2, 4, 1, 15]], [[2], [4]]])
# shape inconsistency (kept commented out so that the cell runs to the end)
# print(test_indexing[[[2, 4, 1, 15], [2, 4, 1, 15]], [[2, 3]]]) # raises IndexError: shapes (2, 4) and (1, 2) cannot be broadcast together

# Array Operations: 

### Shape and Axis: 




np.array.shape
> a tuple whose length is the number of dimensions of the array; "shape[0]" gives the outer-most dimension's size. 

<br>

np.reshape(np.array, shape), np.array.reshape(shape)
> reshape "np.array" into given new shape. RETURNS a new array object (a view on the same data when possible); the input array itself keeps its shape. <br>
- shape: a tuple where each element at index "i" corresponds to the size at dimension "i". 
    > Warning: the product of the new "shape" must match the number of elements in "np.array"

<br>


np.swapaxes(np.array, axis1, axis2), np.array.swapaxes(axis1, axis2)
> swap "np.array"'s two axes specified by "axis1, axis2". RETURNS a new array.
- axis1, axis2: the two axes to swap; must be integers less than "len(np.array.shape)". 

<br>

np.transpose(np.array, new_axes), np.array.transpose(new_axes)
> permute the axes according to given "new_axes". RETURNS a new array. See below code for one example
- new_axes: a tuple of length "len(np.array.shape)", containing distinct integers all in range  $0\leq integer \leq len(np.array.shape) - 1$
>Warning: "np.transpose" and "np.swapaxes" act differently than "np.reshape"; <br>
"np.reshape" keeps the flattened element order and only regroups it into the new shape. However "np.transpose" and "np.swapaxes" move whole axes, so every element keeps its index along each original axis. See demo below to see their difference. 

<br>


In [None]:
import numpy as np

test_arr = np.arange(120).reshape(2, 4, 3, 5)
print(test_arr.shape)
print(test_arr.reshape(2, 4, 5, 3).shape) 
print(test_arr.shape) # these two lines show that reshape returns a new array and leaves "test_arr" unchanged!!!
a = test_arr.swapaxes(0, 2).swapaxes(1, 2)
b = test_arr.transpose(2, 0, 1, 3)
print(np.count_nonzero(a == b)) # transpose is more powerful than swapaxes when handling multiple axes swapping. 



# below demonstrates reshaping and transposing are not the same, even when resulting in the same shape
print(np.count_nonzero(test_arr.flatten() == test_arr.reshape(3, 2, 4, 5).flatten()))
print(np.count_nonzero(test_arr.flatten() == b.flatten())) # realize both arrays have the same shape!!!
# you may put more code and use debug console to see what the real arrays look like. 
test_arr_small = np.arange(32).reshape(4, 8)
print(test_arr_small)
print("reshape result: ")
print(test_arr_small.reshape(8, 4))
print("transpose result: ")
print(test_arr_small.transpose(1, 0))
# above visualization should make most things clear

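np.ravel(np.array), np.array.flatten()
> Convert an "np.array" into a 1D array. "flatten()" always returns a copy, while "np.ravel" returns a view when possible. Note: there is no top-level "np.flatten" function. 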

<br>

np.array.T
> reverse the order of ALL axes of "np.array"; for a 2D array this is the usual matrix transpose. 

<br>

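np.squeeze(np.array, axis=None)
> remove dimensions of size 1; "axis" restricts the removal to the given dimensions. 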

<br>

np.expand_dims(np.array, axis)
> insert a new dimension of size 1 at each position listed in "axis". 
- axis: an integer or a tuple of integers giving the positions of the new dimensions in the expanded shape. 
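The shape helpers of this part can be checked quickly by printing shapes. A minimal sketch on an arbitrary (2, 3, 4) array; passing a tuple as "axis" to "np.expand_dims" assumes a reasonably recent NumPy (1.18 or later).

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

flat = arr.flatten()                          # 1D copy, shape (24,)
reversed_axes = arr.T                         # all axes reversed, shape (4, 3, 2)
expanded = np.expand_dims(arr, axis=(0, 2))   # new size-1 axes at positions 0 and 2
squeezed = np.squeeze(expanded)               # all size-1 axes removed again

print(flat.shape, reversed_axes.shape, expanded.shape, squeezed.shape)
```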

### Convert to array: 


np.asarray(list_like)
- list_like: can be anything ranging from python list, tuples, numpy array etc.

<br>

np.asmatrix(list_like)
> input "list_like" must be convertible to a 2D array -> matrix. Otherwise lead to error. 

<br>

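np.asscalar(list_like)
> "list_like" can have multiple dimensions, but when "flattened", should have size "1" only. <br>
Note: "np.asscalar" was deprecated in NumPy 1.16 and removed in NumPy 1.23; use "np.asarray(list_like).item()" instead. 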

<br>

np.diagonal()
> see reference in "Array creation - Diagonal entries" section. 



### Array combination and splitting

np.concatenate((ndarray1, ndarray2, …), axis=0) -> np.array
> All ndarrays must have the same “side length” along ALL axes other than the concatenation axis; 
- axis: default is 0. “axis=None” gives flattened combination of all input arrays; <br>
“axis=0” gives concatenation at outer-most dimension [concat, …]; <br>“axis=-1” gives concatenation at inner-most dimension […, concat]; 
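A small sketch of the three "axis" cases, using arrays whose shapes differ only along the concatenation axis (shapes chosen arbitrarily):

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((4, 3))    # same length as "a" along axis 1
c = np.ones((2, 5))    # same length as "a" along axis 0

rows = np.concatenate((a, b), axis=0)      # shape (6, 3)
cols = np.concatenate((a, c), axis=-1)     # shape (2, 8)
flat = np.concatenate((a, b), axis=None)   # both flattened first, shape (18,)

print(rows.shape, cols.shape, flat.shape)
```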

<br>


np.stack(Union[ndarray, (ndarray1, ndarray2)], axis=0) -> ndarray <br>
>input1=ndarray: treated as the sequence of sub-arrays "ndarray[i]", which are re-joined along “axis”; <br>
input2=(ndarray1, ndarray2): join the ndarrays along a NEW dimension placed at position “axis”; 
- Input1 elaboration: suppose “axis=n”; then is equivalent to “ndarray.transpose(1, 2, ..., n, 0, n+1, ...)”, i.e. axis 0 is moved to position "n". 
- Input2 elaboration: first pile the ndarrays to form a new leading dimension, then treat the new array as “input1” and perform same operation as above. Because of the newly added dimension, “axis” may be as large as the inputs' number of dimensions. 
> Warning: this method always piles up data to form a new dimension; 
	it requires all inputs to have exactly the same shape, and “axis” only decides where the new dimension appears in the output shape; 
	Above warning doesn’t apply to hstack, vstack or dstack, which join along an existing dimension. 
- Note: for arrays with at least 3 dimensions, "np.vstack((nparray1, nparray2))" is a shorthand for concatenating along axis=0, "np.hstack((nparray1, nparray2))" along axis=1, and "np.dstack((nparray1, nparray2))" along axis=2 (the third dimension). 

<br>

np.split(ndarray, Union[int, array[int]], axis=0) -> List
> split the array into either equal lengths or by specified intervals, along axis. 
- args[1]: <br>
if “int” is given, will split the array evenly into “int many” parts, but requires divisibility (the axis length must be a multiple of “int”); <br>
if "array[int]", then will split the array at each “index in array[int]”. A repeated index results in an empty ndarray along the given dimension (size is 0). An index exceeding the array length also produces an empty ndarray. 

<br>

np.array_split(ndarray, Union[int, array[int]], axis=0)
> same as np.split(ndarray, int, axis=0), except the ‘divisibility’ criterion is no longer required: when the length does not divide evenly, the leading parts get one extra element. 

In [9]:
import numpy as np

# demo of np.stack
test_1 = np.arange(30).reshape(2, 3, 5)
test_2 = np.arange(30, 60).reshape(2, 3, 5)
res1 = np.stack((test_1, test_2), 1)
res2 = np.stack((test_1, test_2), 0)
res3 = np.stack((test_1, test_2), 3)
res4 = np.stack(res2, 3)
# print("res1: ")
# print(res1.shape)
# print("res2: ")
# print(res2.shape)
# print("res3: ")
# print(res3.shape) 
# print("res4: ")
# print(res4.shape)# this is also a demo for input 1; the output shape is the same as res3; 

# demo of np.split, empty ndarray case; 
split_res = np.split(test_1, [2, 2, 6], axis=2) # slicing at same index result in empty array; 
# print(split_res)
# print("split result shape:")
# print(split_res[0].shape, split_res[1].shape, split_res[2].shape)

array_split_res = np.array_split(test_1, 3, axis=2)
# print(array_split_res)
# print(array_split_res[0].shape, array_split_res[1].shape, array_split_res[2].shape)
# invalid_split = np.split(test_1, 2, axis=2)

### Adding and Modifying Array Elements


np.delete(ndarray, obj, axis=None)   ->   np.array
- obj: an int or an array of ints, used as the index (or indices) to delete.     
- axis: when “None”, delete “flattened” array’s elements (returns “flattened” array)
> Warning: If index is out of bound at given axis, will raise error!!!
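A brief sketch of "np.delete" with and without "axis", on an arbitrary 3x4 array. Note that the input array is never modified; a new array is returned each time.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

no_row = np.delete(arr, 1, axis=0)         # drop row 1 -> shape (2, 4)
no_cols = np.delete(arr, [0, 2], axis=1)   # drop columns 0 and 2 -> shape (3, 2)
flat = np.delete(arr, [0, 11])             # axis=None: works on the flattened array

print(no_row)
print(no_cols)
print(flat)
```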

<br>

np.insert(ndarray, ind, value, axis=None) ->  np.array
- ind: index to insert before (the new element takes this index after insertion), a scalar or int array. 
    > ind cannot exceed the number of elements along the axis (if it equals this number, the result is the same as appending along that axis)
- value: if a scalar, will insert a row/col/one value depending on shape of ndarray; if array-like, will insert according to “axis” (requiring shape-consistency). 
- axis=None will first “flatten” ndarray, and return the “flattened” result. 
> np.insert casts "value" to the dtype of "ndarray", e.g. float -> int or boolean -> int when "ndarray" holds ints. <br><br>
> Note: “value” is inserted at every index in "ind" along that axis, and is broadcast to fill the remaining dimensions, so it must be broadcastable to the shape of the inserted slice!<br><br>Also note: if “ind” is a one-element sequence such as "[2]", all of “value” is inserted at that position, with the 1st element taking the “ind” index. If “ind” has more than one index, the number of values along the axis must be consistent with the number of indices, otherwise an error is raised. 

<br>


In [28]:
import numpy as np
b = np.arange(8).reshape(2, 4)
# res1 = np.insert(b, [2], [10, 11], axis=0) # demo of shape inconsistency
res2 = np.insert(b, [2], [10, 11, 12, 13], axis=0)
# print(res2)
res3 = np.insert(b, [2], [[10], [11]], axis=1)
# print(res3)

np.append(ndarray, value, axis=None) -> np.array
> Append "value" to the end of input "ndarray". RETURNS a new array.
- "value"'s shape must be the same as "ndarray"'s, except along the axis to append to.  
- "axis" is "None": will flatten both "ndarray" and "value" to perform append operation. 

<br>
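A minimal sketch of "np.append" along each axis of an arbitrary 2x3 array, plus the flattening behaviour when "axis" is omitted:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

more_rows = np.append(arr, [[6, 7, 8]], axis=0)   # value has shape (1, 3) -> result (3, 3)
more_cols = np.append(arr, [[9], [9]], axis=1)    # value has shape (2, 1) -> result (2, 4)
flat = np.append(arr, [6, 7])                     # axis=None: both are flattened -> (8,)

print(more_rows)
print(more_cols)
print(flat)
```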

np.unique(ndarray, return_index=False, return_inverse=False, return_counts=False, axis=None) -> (unique_array, ...)
> Return an array of unique elements in SORTED ORDER, and FLATTENED (when "axis=None")
- return_index: if "True", will also return the index of each unique element’s first appearance in the FLATTENED array. 
- return_inverse: if "True", will also return an index array with one entry per element of the input "ndarray" (flattened or input-shaped, depending on the NumPy version). 
    > Each entry is the index of the corresponding element in the returned unique array. "unique_array[index]" gives back the original values of “ndarray”
- return_counts: if "True", will also return an array “count” where “unique_array[i]” appears in “ndarray” “count[i]” many times. 
- axis: if given, uniqueness is determined for whole sub-arrays along that axis instead of single elements. 


In [None]:
# Code demo of np.unique
import numpy as np

test_arr = np.array([[1, 2, 1, 3], 
                    [2, 1, 3, 4], 
                    [4, 2, 5, 1]])

unique_output = np.unique(test_arr, return_index=True, return_inverse=True)
print(unique_output[0], unique_output[1], unique_output[2]) # prints out "unique element array", "indexing array of first appearance", "inverse_indexing for reconstruction"
print(unique_output[0][unique_output[2]].reshape((3, 4)))

np.flip(ndarray, axis=None) -> np.array
> Reverse the order of elements along the given axis
- axis: if "None", will reverse along all axes, which equals reversing the flattened array (the returned array has the same shape as the input)
    > If an int, just reverse along the indicated axis. <br>If a tuple of ints, will reverse along all those axes. 

<br>


np.flipud(ndarray) -> np.array  
> up and down, same as np.flip(ndarray, axis=0)
<br>

np.fliplr(ndarray) -> np.array  
> left and right, same as np.flip(ndarray, axis=1)<br><br>
> Note: Both "flipud" and "fliplr" also work on higher dimensional arrays, always flipping the axis shown above (axis 0 and axis 1 respectively). 
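The three flipping functions can be compared on one tiny array. A sketch with an arbitrary 2x3 input:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

ud = np.flipud(arr)     # rows reversed:    [[3 4 5], [0 1 2]]
lr = np.fliplr(arr)     # columns reversed: [[2 1 0], [5 4 3]]
both = np.flip(arr)     # axis=None, every axis reversed: [[5 4 3], [2 1 0]]

print(ud)
print(lr)
print(both)
```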

<br>

np.reshape(ndarray, (new_shape[0], new_shape[1], new_shape[2], ...), order='C')
> Requires the new shape to have the “SAME” number of elements as the input. <br>
Conceptually flattens the input "ndarray" first, then re-constructs it according to the given new shape -> not equal to “transpose” (see explanation of "np.transpose" for further explanation)
- (int, int, ...) as new shape. First int is outer-most dimension's shape. 
- order: "C": filling inner-layers first; <br>
step 1: [[a, b]]; step 2: [[a, b], [c, d]]; <br>
"F": filling outer-layers first; <br> step 1: [[a], [b]]; step 2: [[a, c], [b, d]]

<br>
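The effect of the "order" argument of "np.reshape" is easiest to see on four elements. In this sketch the values 0, 1, 2, 3 play the role of a, b, c, d:

```python
import numpy as np

arr = np.arange(4)   # a, b, c, d = 0, 1, 2, 3

c_order = arr.reshape((2, 2), order='C')   # fill along the last axis first:  [[0 1], [2 3]]
f_order = arr.reshape((2, 2), order='F')   # fill along the first axis first: [[0 2], [1 3]]

print(c_order)
print(f_order)
```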



# Sorting, Searching

np.sort(ndarray, axis=-1, kind=None, order=None) -> np.array
> Return a sorted copy of “ndarray”
- kind: {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}; "None" means ‘quicksort’
- axis: if "-1": sort along last axis; if "None": flatten to sort
	> Sorting along the last axis is faster and uses less space than sorting along any other axis. 

<br>

np.argsort(ndarray, axis=-1, kind=None, order=None) -> np.array
> Return INDICES that would sort “ndarray”. See below demo for how to use this method


In [None]:
import numpy as np

test_arr = np.array([[1, 2, 1, 3], 
                    [2, 1, 3, 4], 
                    [4, 2, 5, 1]])

indice = np.argsort(test_arr, axis=None)
flatten_sorted = test_arr.flatten()[indice]
print(flatten_sorted)

<br>

np.array.sort(axis=-1, kind=None, order=None) -> None: 
> IN-PLACE sorting of “ndarray”; this is the array METHOD, as opposed to the function "np.sort", which returns a sorted copy. 

<br>

np.partition(ndarray, kth, axis=-1, kind='introselect') -> np.array
> Return a new array with partition as follows: <br>
	&emsp; The element at index “kth” of the output array is the element that would be at index “kth” if “ndarray” were sorted; <br> 
	&emsp; All elements at “index” < “kth” are less than or equal to it, and all elements after it are greater than or equal to it; the order inside each side is undefined; <br>
	&emsp; Same as one “partition step” in quicksort (quickselect) <br>
- ‘introselect’: the algorithm for performing partition; 
	> Basically a quickselect that falls back to a slower but safer selection strategy when it makes poor progress, which bounds the worst-case running time. 
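A minimal sketch that checks the partition property on an arbitrary array. Only the element at index "k" is guaranteed to be in its sorted position; the order of the other elements may differ between NumPy versions, so it is not printed as an expected result.

```python
import numpy as np

arr = np.array([7, 1, 5, 3, 9, 2])
k = 2
part = np.partition(arr, k)

print(part[k])                       # 3, the same as np.sort(arr)[k]
print(np.all(part[:k] <= part[k]))   # everything before index k is not larger
print(np.all(part[k:] >= part[k]))   # everything from index k on is not smaller
```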


np.(nan)argmax(ndarray, axis=None, out=None, ...) -> np.array
> Return the INDICES of maximum values along an “axis” for “ndarray”. 
- axis:	if “axis” is none, will flatten “ndarray” and return the index of the max in the flattened array
- out: if given, the output of this method will be filled into “out” ndarray. 
    > It must have the appropriate shape to contain the result. 
> With the “nan” prefix ("np.nanargmax"), “nan” values are ignored, and “ValueError” is raised if a slice is all-nan; without it ("np.argmax"), a “nan” counts as the maximum. 

<br>

np.(nan)argmin(ndarray, axis=None, out=None, ...) -> np.array
> Return the INDICES of MINimal values along an “axis” for “ndarray”. <br> See "np.argmax"'s parameter explanation. 
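The difference between the plain and the "nan" variants shows up as soon as the data contains NaN. A small sketch with hand-made data:

```python
import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [np.nan, 2.0, 0.5]])

plain_max = np.argmax(arr, axis=1)     # NaN is treated as the maximum -> [1 0]
nan_max = np.nanargmax(arr, axis=1)    # NaN ignored -> [2 1]
nan_min = np.nanargmin(arr, axis=1)    # NaN ignored -> [0 2]

print(plain_max, nan_max, nan_min)
```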




In [None]:
# demo of np.argmax
import numpy as np

test_arr = np.array([[1, 2, 1, 3], 
                    [2, 1, 3, 4], 
                    [4, 2, 5, 1]])
print(np.argmax(test_arr, axis=0))
print(np.argmax(test_arr, axis=1))
print(np.argmax(test_arr, axis=None))

np.nonzero(ndarray) -> tuple of np.array: 
> Return the indices of all non-zero elements, in the form: (dim0_indices, dim1_indices, ….), one index array per dimension. <br> See below demo for how to use returned values. 

In [None]:
# demo of np.nonzero output usage

import numpy as np
test_arr = np.array([[1, 2, 1, 3], 
                    [2, 1, 3, 4], 
                    [4, 2, 5, 1],
                    [0, 1, 0, 2]])
indices = np.array(np.nonzero(test_arr))
print(test_arr[indices[0], indices[1]])
print(test_arr.flatten())

np.where(condition, x, y) -> np.array
> Returns a new "ndarray": wherever “condition” is satisfied at index "i", the output at index "i" is “x”’s corresponding element; otherwise it is “y”’s corresponding element. <br>
Below demo shows what "condition" can be: boolean expression or a True/False array (useful for masking). 

<br>

np.clip(ndarray, a_min, a_max, out=None) -> np.array
> Return a new "ndarray" where all values in the array are within "[a_min, a_max]". 
- out: "np.array" instance. If provided, the outputs will be placed into this array. If "out=ndarray", this will be in-place operation. 
> Sometimes "np.clip()" is more convenient than "np.where()" when the conditions are easy. See below demo

In [None]:
# demo of np.where and np.clip
import numpy as np

# task: clamp all elements into [0, 10] (values below 0 become 0, values above 10 become 10)
test_arr = np.array([[-1, -2, 0, 3, 1, 5, 10, 12], [10, 2, 1, -3, 99, 100, 21, 0.01]])
res_where = np.where(test_arr > 0, test_arr, np.zeros_like(test_arr))
res_where = np.where(res_where <= 10, res_where, np.ones_like(test_arr) * 10)
print(res_where)

res_clip = np.clip(test_arr, 0, 10) # realize both intervals are "inclusive!!!"
print(res_clip)

# Functional Programming: 
Apply a function to all elements of an array, possibly along an axis

np.apply_along_axis(func, axis, ndarray, func_args) -> np.array
> “func” must accept 1D array as input; once this method is called, it will extract 1D slices of ndarray along axis and feed each of them to “func”. <br>
If “func” returns a higher dimensional array for each slice, the dimensions of that array replace the position of “axis” in the returned array. 
- axis: int
- func_args: additional arguments passed on to "func"


<br>

np.apply_over_axes(func, ndarray, axis) -> np.array
> applies "func" over the axes listed in “axis”, one by one. <br>
NOT equivalent to applying "np.apply_along_axis" along multiple axes CUMULATIVELY, as the constraints on "func" are different!!! (see below demo) 
- “func” is called as "func(ndarray, axis)", taking a numpy array and an integer axis as its only two inputs; its output must have either the same number of dimensions as "ndarray" or one less. 
- axis: an array of integers, specifying axes to apply "func"


In [8]:
import numpy as np

def test_func(arr, multiplier):
    """raise the dimension of 'arr' by 1, and repeat 'multiplier' many times."""
    return np.repeat(arr[None], multiplier, axis=0)

test_arr = np.arange(6).reshape(2, 3)
res = np.apply_along_axis(test_func, 1, test_arr, 3) # 3 is "multiplier", the function's argument
print(test_arr.shape, res.shape) # realize the extra "3" of res.shape, which raise dimension up; 

res2 = np.apply_over_axes(np.sum, test_arr, [0, 1]) # usually quite convenient for applying an operation
                                                    # CUMULATIVELY among multiple axes
print(res2)
# res2 = np.apply_over_axes(test_func, test_arr, [0]) # leads to an error as 
#                                                     # dimension of output is increased


(2, 3) (2, 3, 3)
[[15]]


<br>

np.vectorize(func)  ->   A function "func_v" which could be applied on arrays; 
- “func” itself only needs to work on scalars. To use "func_v", simply call it and give ndarrays or scalars as input. 
    > if “func" takes multiple inputs, then each input is either a scalar, or an ndarray, ALL with BROADCASTABLE shapes (see code demo below)
> Note: the vectorized function is essentially a for loop over the elements, provided for convenience and not for performance;


In [None]:
import numpy as np

def myfunc(a, b):
    if a > b: 
        return a - b
    else: 
        return a + b

func_v = np.vectorize(myfunc)
print(func_v([1, 2, 3, 4], 3))
print(func_v([1, 2, 3, 4], [2, 1, 3, 3])) # exactly the same shape: "myfunc" is applied element-wise
print(func_v([[1, 2, 3, 4], [2, 3, 4, 5]], [2, 1, 3, 3])) # broadcastable shape, apply 2nd input to multiple rows of 1st input. 
# print(func_v([1, 2, 3, 4], [4, 5])) # lead to an error

np.piecewise(ndarray, [boolean-condition1, boolean-condition2, …], [func1, func2,…]) -> np.array
> returned array has same shape as input "ndarray"; "func_i" is applied to the elements where "boolean-condition_i" is True. 

> Each boolean-condition must be a boolean array based on the input "ndarray", usually built from inequalities; <br>
    (useful for creating activation/piecewise function)
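As an illustration of the activation-function use case, here is a sketch of a leaky ReLU built with "np.piecewise". The slope 0.1 and the sample points are arbitrary choices for this example.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5])

# 0.1 * x where x < 0, x itself where x >= 0
leaky = np.piecewise(x, [x < 0, x >= 0], [lambda v: 0.1 * v, lambda v: v])

print(leaky)   # [-0.2 -0.05 0. 1.5]
```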


# Advanced Indexing using Indices and Masks

### Indexing: 

np.where(condition, array1, array2) -> np.array
> refer to previous explanation in "sorting, searching" section. 

<br>

np.indices((shape1, shape2, shape3, …shapen)) -> np.array
> refer to previous section: "Array indexing: Advanced indexing" for explanation. 

<br>


np.ravel_multi_index(ndarray, shape) -> np.array
> Return the indices into the flattened version of an array with shape "shape". This method treats "ndarray" as a set of index arrays, one per axis (the same layout as the output of "np.indices(shape)", as described in the 'array-indexing' part), and returns the corresponding indices of the flattened array. See code demo below. 
- shape: a tuple specifying array's shape. 

> For the example below: 22 = 3x6+4, [3,4]; 41=6x6+5, [6,5], … <br>
One constraint is that every index must be smaller than the corresponding axis length of "shape"; e.g., [8, 6] fails because 8>6, where 6 is the largest valid index along the first axis (length 7). 


In [None]:
import numpy as np

high_dim_index = np.array([[3, 6, 6], [4, 5, 1]])
test_arr = np.arange(42).reshape(7, 6) # if flattened, this will be the index of 1 dimensional array with 42 elements. 
flattened_index = np.ravel_multi_index(high_dim_index, (7, 6))
print(flattened_index) # 
print(test_arr[high_dim_index[0], high_dim_index[1]]) # you could see the elements printed out are exactly the same!!!

<br>

np.unravel_index(1Darray, shape) -> tuple of np.array
> return a tuple of index arrays, one per axis, for arrays with shape "shape". <br>
This method does exactly the opposite of "np.ravel_multi_index"
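A short round-trip sketch, reusing the (7, 6) shape and the flat indices 22, 41 and 37 from the "np.ravel_multi_index" example:

```python
import numpy as np

flat_index = np.array([22, 41, 37])

rows, cols = np.unravel_index(flat_index, (7, 6))   # one index array per axis
back = np.ravel_multi_index((rows, cols), (7, 6))   # and back to flat indices

print(rows, cols)   # [3 6 6] [4 5 1]
print(back)         # [22 41 37]
```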

<br>



np.mask_indices(mask_shape, mask_function, k) -> tuple of np.array
> Returns indices to access the values selected by the mask (where the mask is non-zero). <br>Used for generating masked-indexes for array creation. 
- “mask_shape”: an int; the generated mask has shape (mask_shape, mask_shape)
- "mask_function": func(input_array, k) returns an array whose non-zero entries mark the selected positions, e.g. "np.triu" or "np.tril". 
    > mask_indices applies "mask_function" on an "np.ones()" array, and returns the non-zero elements' indices -> "np.nonzero(mask_function(np.ones((n, n)), k))". Thus any function with this signature could work. 
- k: argument for "mask_function". 
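A minimal sketch using "np.triu" as the mask function, once with the default "k" and once with "k=1", on an arbitrary 3x3 array:

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

upper = np.mask_indices(3, np.triu)       # upper triangle, main diagonal included
strict = np.mask_indices(3, np.triu, 1)   # k=1: upper triangle without the main diagonal

print(arr[upper])    # [0 1 2 4 5 8]
print(arr[strict])   # [1 2 5]
```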


<br>

np.diag_indices(int, optional[ndim]) -> tuple of np.array
> Return a tuple of "ndim" index arrays (2 by default), each equal to "np.arange(int)". <br>
Used for indexing the main diagonal elements of an array with "ndim" dimensions, each of length "int", i.e. shape (int, int, ..., int). 
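A small sketch that writes to the main diagonal through "np.diag_indices", first for a 2D array and then for a 3D one (sizes chosen arbitrarily):

```python
import numpy as np

idx = np.diag_indices(3)            # tuple of 2 arrays, each [0 1 2]
arr = np.zeros((3, 3), dtype=int)
arr[idx] = [1, 2, 3]                # writes arr[0, 0], arr[1, 1], arr[2, 2]

idx3 = np.diag_indices(2, ndim=3)   # tuple of 3 arrays, each [0 1]
cube = np.zeros((2, 2, 2), dtype=int)
cube[idx3] = 1                      # writes cube[0, 0, 0] and cube[1, 1, 1]

print(arr)
print(cube)
```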


### Extracting elements according to advanced indexing

np.take(ndarray, indices, axis=None) -> np.array
> Return a new ndarray being indexed by given “indices”. <br>
Equivalent to ndarray[:, :, …:, indices, :, :…] where “indices” locates at “axis” position. <br>
When "axis=None", "ndarray" will be flattened before using. <br>
See below for a code demo. 

In [None]:
import numpy as np

test_take = np.arange(120).reshape(1, 2, 3, 4, 5)
print(test_take.shape)
take_res = np.take(test_take, [0, 1], axis=2)
print(take_res.shape)
take_res = np.take(test_take, [[0, 1, 2], [2, 3, 1]], axis=3)
print(take_res.shape)
print(take_res[..., 1, 1, :] == test_take[..., 3, :]) # this demonstrates the indexing's technique; 
        # 1, 1: gives "3" in the indexing array [[0, 1, 2], [2, 3, 1]]; 
take_res = np.take(test_take, [[0, 1, 4, 2, 1], [2, 3, 1, 1, 3]], axis=4)
print(take_res.shape)

np.choose(indices, array_choices) -> np.array
> Return a newly created array with elements from "array_choices", based on "indices". <br>
Each element of “indices” selects WHICH sub-array "array_choices[i]" the value is taken from; the position inside that sub-array is the element's own (broadcast) position. <br>
Output has the shape obtained by broadcasting “indices” against "array_choices[i]" <br>


> First dimension of "array_choices" is always treated as the set of arrays to choose from; <br>
All elements in "indices" must be smaller than "array_choices.shape[0]"!<br>
Shape broadcastability between "indices.shape" and "array_choices.shape[1:]" (excluding the first dimension) is demonstrated in the code below. <br>

> Note: this function is not easy to understand; if you feel stuck, skip it for now or run more experiments. 


In [None]:
import numpy as np

indices_arr = np.array([[0, 1, 2], [1, 2, 3]])
array_choices = np.arange(12).reshape(4, 3)
print(np.choose(indices_arr, array_choices))
print(np.choose(indices_arr, array_choices) == array_choices[indices_arr, [[0, 1, 2], [0, 1, 2]]])
# above shows: choices are taken by treating each column as a complete array for choosing elements from. 


array_choices = np.arange(24).reshape(4, 2, 3)
print(np.choose(indices_arr, array_choices))
print(np.choose(indices_arr, array_choices) == array_choices[
    indices_arr, [[0, 0, 0], [1, 1, 1]], [[0, 1, 2], [0, 1, 2]] # this shows shape broadcasting occurred at axis 1: [[0, 0, 0], [1, 1, 1]]
])
print(np.choose([2, 1, 1], array_choices) == array_choices[
    [[2, 1, 1], [2, 1, 1]], [[0, 0, 0], [1, 1, 1]], [[0, 1, 2], [0, 1, 2]] # this demonstrates shape broadcasting at axis 0
])

np.select(condlist, choicelist, default=0) -> np.array   
> Return a new array whose elements are drawn from the arrays in "choicelist", according to "condlist". 
- "condlist" is a list of boolean arrays; "condlist[i]" decides where "choicelist[i]" may be used. 
- "choicelist" is a list of ndarrays, with the same length as "condlist". The ordering of the lists' elements affects the ordering of 
evaluation. 
- "default" is the value to fill in at positions where no condition in 
"condlist" is True. 

> In practice, for output index "[0]": if "condlist[0][0]" is False, will move 
on to "condlist[1][0]", … and so on. As soon as a "True" is found at "condlist[i][0]", the output at index "[0]" is filled with "choicelist[i][0]", and the search moves on to output index "[1]"...

> Broadcastability: all arrays in "condlist" and "choicelist" must be broadcastable to one common shape, which is the shape of the output. 


In [None]:
import numpy as np

choice_list = np.arange(9).reshape([3, 3])
cond_list = np.array([[False, False, False], [True, False, False], [True, True, True]])
print(np.select(cond_list, choice_list))
# prints [3, 7, 8] instead of [6, 7, 8]: due to occurrence of first "True" at cond_list[1, 0]
# print(np.select(cond_list[:-2], choice_list)) # leads to error: condlist and choicelist must have the same length

np.compress(condition, ndarray, axis=None) -> np.array
> Select slices of "ndarray" along "axis" wherever the "condition" array is True. 
- "condition": MUST BE 1D !!! a boolean “1darray”; 
    > if condition is shorter than ndarray.shape[axis], only the first “len(condition)” slices along “axis” are considered. 
- "axis": the axis where condition would be applied to extract elements out; 
    >"axis=None": will flatten ndarray first


In [None]:
import numpy as np

test_arr = np.arange(9).reshape(3, 3)
condition = np.array([False, True, True])
print(test_arr)
print(np.compress(condition, test_arr, axis=0))
print(np.compress(condition, test_arr, axis=1))
print(test_arr.flatten())
print(np.compress(condition, test_arr, axis=None)) 

### Adding & replacing elements according to advanced indexing


np.place(ndarray, mask, vals) -> None
> Replaces "ndarray"’s elements IN PLACE with "vals" wherever "mask" is True;
- "mask" must be a boolean array (e.g. a boolean expression applied on “ndarray”) of the SAME size as “ndarray”
- "vals" will be FLATTENED, and used consecutively and cyclically
    > 1st occurrence of “True” in mask: replace with vals[0], 2nd with vals[1], …, starting over from "vals[0]" once all elements in “vals” have been used. 
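A minimal sketch of the cyclic use of "vals" in "np.place"; the mask selects four elements while only two replacement values are supplied:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)    # [[0 1 2], [3 4 5]]
np.place(arr, arr > 1, [10, 20])    # the four masked elements receive 10, 20, 10, 20

print(arr)   # [[ 0  1 10], [20 10 20]]
```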


<br>

np.put(ndarray, indices, vals) -> None
> Replaces the elements of "ndarray" at "indices" with “vals”, IN PLACE
- "indices" is an integer indexing array (for boolean masks use "np.place" or "np.putmask")
    > warning: indices are for the flattened array only!!!   
- "vals" will be flattened, and used consecutively and cyclically

In [None]:
import numpy as np

input_array = np.arange(9).reshape([3, 3])
print(input_array)
np.put(input_array, [3, 5], [2, 1])  # recall [3, 5] is the indices array for FLATTENED array!
print(input_array)

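np.putmask(ndarray, mask, values) -> None
> Replace elements in "ndarray" with "values" wherever "mask" is "True", in place. 
- "mask":  boolean array, must have SAME shape as "ndarray"
    > An element is replaced only where the corresponding mask entry is “True” 
- "values": can be a scalar or array-like; 
    > if "values" is smaller than "ndarray", it is repeated to the array's size, and the element at flat position "n" is replaced by the repeated "values[n]" (unlike "np.place", which takes "vals" in order of “True” occurrences). 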
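
The difference between "np.place" and "np.putmask" only shows up when the replacement values are shorter than the array. The following is a minimal sketch with illustrative variable names ("placed", "masked"), not a definitive reference:

```python
import numpy as np

# np.place takes the replacement values in order, one per True entry in the mask.
placed = np.arange(5)
np.place(placed, placed > 2, [-33, -44])
print(placed)  # [  0   1   2 -33 -44]

# np.putmask repeats the values to the array's size and picks them by position:
# flat index 3 gets values[3 % 2] = -44, flat index 4 gets values[4 % 2] = -33.
masked = np.arange(5)
np.putmask(masked, masked > 2, [-33, -44])
print(masked)  # [  0   1   2 -44 -33]
```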


# np.ma package: mask operations; <br> import numpy.ma as ma

<center><img src="reference_img/mask_properties.png" width=300></center>
<!---
<center><img src="mask_properties.png" width=300><img src="mask_properties.png" width=500></center>
above inserts 2 images, and justs both to center. 
-->

The image above shows some properties of a masked array OBJECT/INSTANCE. 
- Elements shown as “--” in “data” are where the mask is applied (“True” in mask). 
- “fill_value” indicates which value replaces the masked elements when converting back to an ordinary array. 

<br>

Most operations that can be applied on an ordinary "np.array" can also be applied directly on an "ma.array". Below is a non-exhaustive list of methods also applicable to "ma.array": 
- Creation: ma.empty(), ma.empty_like(), ma.ones(), ma.ones_like(), ma.zeros(), ma.zeros_like()
- Inspection: ma.nonzero(), ma.shape()
- Shape-related: ma.reshape(), ma.MaskedArray.flatten(), ma.MaskedArray.reshape(), ma.swapaxes(), ma.transpose(), ma.MaskedArray.swapaxes(), ma.MaskedArray.transpose(); <br> ma.expand_dims(), ma.squeeze(), ma.stack()/dstack()/hstack()/vstack(), ma.concatenate()
- Sorting and Locating: ma.argmax(), ma.max(), ma.argsort(), ma.sort()



### Array Creation and Examination
ma.array(data, mask=False, fill_value=None) -> MaskedArray
> Creates a masked array object. 
- "mask": must have the same shape as "data", or be a single value. "True" indicates that the corresponding element is masked. 
- "fill_value": the value used to fill the masked positions when unmasking (e.g. with "filled()"). 

<br>

ma.MaskedArray(data, mask=False, fill_value=None) -> MaskedArray
> Returns a masked array with all properties shown at beginning of this section. 

<br>

ma.masked_all(shape) -> MaskedArray
> Return a masked array with shape “shape” where all elements are masked. 
- "shape" is a tuple of integers. 
> The function itself takes no "fill_value" argument!!!

<br>

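ma.MaskedArray.all() -> Boolean
> Returns "True" if all non-masked elements in "ma.MaskedArray" evaluate to "True" (masked elements are skipped). <br> To test whether every element is masked, use "ma.getmaskarray(x).all()". 

ma.MaskedArray.any() -> Boolean
> Returns "True" if at least one non-masked element in "ma.MaskedArray" evaluates to "True". <br> To test whether at least one element is masked, use "ma.getmaskarray(x).any()".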

<br>

ma.count_masked(ma.array, axis=None)   ->  Union[int, np.array[int]]
> Count how many elements are masked. 
- "axis": the axis along which to count. 
    > If "axis=None", counts over the flattened array and returns a scalar.<br>
    Otherwise, returns a “ndarray” where each element is the number of masked elements of “ma.array” along that axis. 
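
As a minimal sketch of how the truth tests differ from the mask tests (the array "x" and the result names are illustrative only):

```python
import numpy as np
import numpy.ma as ma

x = ma.array([[1, 0, 3], [4, 5, 6]],
             mask=[[False, True, False], [False, False, True]])

# .all() looks at the VALUES of the non-masked elements (1, 3, 4, 5 are all truthy).
all_truthy = bool(x.all())

# To ask questions about the mask itself, inspect the mask array.
any_masked = bool(ma.getmaskarray(x).any())
all_masked = bool(ma.getmaskarray(x).all())

# Counting masked entries: over the whole array, then per column.
total_masked = ma.count_masked(x)
masked_per_column = ma.count_masked(x, axis=0)

print(all_truthy, any_masked, all_masked)  # True True False
print(total_masked, masked_per_column)     # 2 [0 1 1]
```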

<br>



### Mask out values

ma.mask_cols/rows/rowcols(ma.array) -> ma.array
> Return a modified ma_ndarray as follows: <br>
&emsp; If any element of a column/row is masked, then all entries of that column/row/row&col are masked. See below demo for better understanding.
- "ma_array" can ONLY be 2D!

In [None]:
import numpy as np
import numpy.ma as ma

mask_array = ma.array(np.arange(9).reshape(3, 3), mask=[[False, False, False], [True, False, True], [False, False, False]])
col_mask = ma.mask_cols(mask_array)
print(col_mask) # the 1st and 3rd columns each contain at least one True in the mask
row_mask = ma.mask_rows(mask_array)
print(row_mask) # the 2nd row contains at least one True in the mask
rowcol_mask = ma.mask_rowcols(mask_array)
print(rowcol_mask) # combination of both masks above

ma.masked_invalid(np.array) -> ma.array
> mask every location of “np.array” where "np.nan" or "np.inf" (positive or negative) occurs

<br>



ma.masked_equal/greater/greater_equal/less/less_equal/not_equal(np.array, value) -> ma.array
> Return a "ma.array" according to the following: <br> 
&emsp; An element of “np.array” is masked whenever it is “equal to / greater than / greater than or equal to / less than / less than or equal to / not equal to” the “value”. 

<br>

ma.masked_values(input_array, value, copy=True) -> ma.array
> mask entries of “input_array” that are equal to “value”; for floating-point data the comparison is approximate (within a tolerance). 
- "value": can be either a scalar, or an “np.array”. 
    > if "np.array", the shape must be BROADCASTABLE with the 'last few dimensions' of "input_array". 
- "copy": whether to return a copy of "input_array". If "False", the result is a view on "input_array" instead of a copy. 


In [None]:
import numpy as np
import numpy.ma as ma

input_arr = np.array([[0, 0, 2], [100, 200, 300]])
res1 = ma.masked_values(input_arr, [100, 200, 300])
print(res1)
res2 = ma.masked_values(input_arr, 0)
print(res2)
res3 = ma.masked_greater(input_arr, 5)
print(res3)

ma.masked_inside/outside(ndarray, v1, v2) -> ma.array 
> Return a "ma.array" as follows:  <br> 
&emsp; If the current element in “ndarray” is inside/outside the interval “[v1, v2]” (endpoints count as inside), it will be masked.  
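
A small sketch of the two interval masks, using an illustrative integer array "x"; the masks are read back with "ma.getmaskarray":

```python
import numpy as np
import numpy.ma as ma

x = np.arange(6)  # [0 1 2 3 4 5]

inside = ma.masked_inside(x, 2, 4)    # masks 2, 3, 4 (endpoints included)
outside = ma.masked_outside(x, 2, 4)  # masks 0, 1, 5

print(inside)   # [0 1 -- -- -- 5]
print(outside)  # [-- -- 2 3 4 --]
```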

<br>


ma.masked_where(condition, ndarray) -> ma.array
> Return a ma_ndarray as follows: 
<br> &emsp;	An element of “ndarray” is masked wherever the corresponding entry of “condition” is "True". 

- “condition”: a boolean array with the SAME shape as “ndarray”, usually an expression producing "True/False". 
	> Masking an array 'conditioned' on another array of the same shape is possible. See below demonstration. 


In [None]:
import numpy as np
import numpy.ma as ma

a = np.array([1, 2, 4, 100, 200])
b = np.array([10, 11, 19, 12, 0])
print(ma.masked_where(b > 12, b))
print(ma.masked_where(a >= 100, b)) # note: "b" is masked at the indexes where the condition on "a" holds. 
c = np.repeat(b[None], repeats=3, axis=0)
# print(ma.masked_where(a >= 100, c)) # this gives a shape error: both inputs of masked_where must have the same shape!!!

# Linear Algebra <br> import numpy.linalg as lin


### Multiplications

np.dot(a, b) -> np.array
> Return dot product of two arrays as a new array. 
- “a” and “b” must satisfy shape restriction for matrix/vector multiplication. 
    > For N-D inputs: sums products over the last axis of “a” and the second-to-last axis of “b”
> Note: using “@” (a @ b) is preferred for “Matrix multiplication”<br>
&emsp; ("np.dot" also accepts scalars, in which case it is plain multiplication; “@” does not)

<br>

np.vdot(a, b) -> scalar
> Return dot product of two vectors (if “a” is complex, its complex conjugate is used) <br> Doesn’t work on matrix-vector multiplication
<br>

> Note: if “a, b” have same shape but not 1D: will flatten both “a, b” to vectors!!!


<br>

np.inner(a, b)   ->  np.array
> Return inner product of two arrays, i.e., $a_1 b_1 + a_2 b_2 + \dots$ <br> For N-D inputs, the sum is taken over the last axis of both arrays. 

<br>

np.outer(a, b)   -> np.array 
> Returns outer product of two vectors    (two vectors   ->  one matrix); inputs that are not 1D are flattened first<br>
	Useful for creating a COVARIANCE MATRIX, where both input vectors hold the differences between each random value and its mean. 

<br>

np.matmul(a, b)    -> np.array
> Returns matrix product of two arrays. <br>
Require "a", "b" satisfy shape constraints, and will perform matrix multiplication. 
<br>

> Equivalent to “a @ b”

<br>

np.multiply(a, b)  ->  np.array
> Return elementwise multiplication of two matrix/vector/…<br>
Requires “a, b” to have BROADCASTABLE shapes (same shape is the simplest case)
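
To make the differences between these products concrete, here is a small sketch on a 2x3 matrix and a length-3 vector (names "A" and "v" are illustrative):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]
v = np.array([1, 2, 3])

mat_vec = A @ v               # matrix-vector product: [8 26]
same_as_dot = np.dot(A, v)    # identical result for 2D @ 1D
flat_dot = np.vdot(A, A)      # both inputs flattened: 0+1+4+9+16+25 = 55
elementwise = np.multiply(A, v)  # v is broadcast over the rows of A

print(mat_vec, same_as_dot, flat_dot)
print(elementwise)            # [[0 2 6], [3 8 15]]
```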


In [None]:
# this is a demo for multi-dimensional linear algebra methods. 
# np.inner treats the last dimension of each input as the vector and returns the dot product
# of every pair of vectors; np.outer flattens its inputs before taking the outer product.
import numpy as np
import numpy.linalg as lin

a = np.arange(8).reshape(4, 2)
print(np.inner(a, a)) # note: the diagonal holds each row vector's dot product with itself. 
print(np.diag(np.inner(a, a))) # squared norms of the rows; useful for calculating vector norms. 
print(np.outer(a, a)) # note: the diagonal elements are perfect squares; 
print(np.diag(np.outer(a, a))) # useful for determining variance, instead of covariance
# use np.diag to extract desired outputs out. 


lin.multi_dot(arrays)   ->   np.array
> Return the chained matrix product of all arrays in “arrays” in a single call; the evaluation order is chosen automatically for speed
- "arrays": a sequence of vectors/matrices to multiply
    > All arrays in "arrays" must be 2D, except the 1st and last, which may be 1D and are then treated as row-vec and col-vec respectively

<br>

lin.matrix_power(a, n)    -> np.array  
> Return “a” raised to the integer power “n”, i.e., the matrix product of “n” copies of “a”. 
- “a” is required to be a square matrix
- If “n” is 0, returns the identity; if “n” is negative, the inverse of “a” is raised to the power |n|
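
A minimal sketch of both functions on a small shear matrix (the names "A", "row" and "col" are illustrative):

```python
import numpy as np
import numpy.linalg as lin

A = np.array([[1, 1], [0, 1]])

chained = lin.multi_dot([A, A, A])   # A @ A @ A
powered = lin.matrix_power(A, 3)     # same product, written as a power
identity = lin.matrix_power(A, 0)    # n = 0 gives the identity

# 1D first and last arguments act as a row vector and a column vector.
row = np.array([1, 2])
col = np.array([3, 4])
scalar = lin.multi_dot([row, A, col])  # (row @ A) = [1 3]; [1 3] . [3 4] = 15

print(chained)  # [[1 3], [0 1]]
print(scalar)   # 15
```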

### Special_values

lin.eig(a) -> (eigenvalues, eigenvectors)
- “a” has shape (..., n, n)
- "eigenvalues" has dimension (..., n)
- "eigenvectors" has dimension (..., n, n), and are normalized to unit length. 
> Note: the column "eigenvectors[..., :, i]" corresponds to "eigenvalues[..., i]" 
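
The column convention is easy to get wrong, so here is a short sketch that checks $A v = \lambda v$ for each eigenpair of an illustrative symmetric matrix:

```python
import numpy as np
import numpy.linalg as lin

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues are 1 and 3
eigenvalues, eigenvectors = lin.eig(A)

checks = []
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]  # the i-th COLUMN pairs with eigenvalues[i]
    checks.append(np.allclose(A @ v, eigenvalues[i] * v))

# Each eigenvector column has unit length.
column_norms = lin.norm(eigenvectors, axis=0)
print(checks, column_norms)
```

The order of the returned eigenvalues is not guaranteed, so sort them before comparing against known values.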

<br>

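lin.norm(a, ord=None, axis=None)   ->   Union[float, np.array]
> Return norm of a matrix / vector, depending on input configurations. 
- axis: if an integer: calculate vector norm along given axis; <br> &emsp; if 2 ints in a tuple: calculate the matrix norm over those two axes; <br> &emsp; if 'None': a vector norm is returned for 1D "a" and a matrix norm for 2D "a". 
- ord: the order of the norm; the default ('None') gives the 2-norm for vectors and the Frobenius norm for matrices. 
- a: input multi-dimensional array. 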
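
A small sketch of how "axis" changes what is computed, using the default norm order and an illustrative 2x2 array:

```python
import numpy as np
import numpy.linalg as lin

a = np.array([[3.0, 4.0], [6.0, 8.0]])

row_norms = lin.norm(a, axis=1)         # vector 2-norm of each row: [5, 10]
frobenius = lin.norm(a)                 # whole matrix: sqrt(9 + 16 + 36 + 64)
matrix_norm = lin.norm(a, axis=(0, 1))  # same Frobenius norm, axes given explicitly

print(row_norms, frobenius, matrix_norm)
```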

<br>

lin.det(a)   -> np.array 
> Return square matrix “a”’s determinant. 
- "a" must have shape (..., n, n). 
- Output "np.array" will have shape (...)

<br>

lin.matrix_rank(M)  -> np.array
>   Return rank of matrix M
- "M": has shape (..., m, n)
- Output "np.array" has shape (...) 

<br>

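np.trace(a, offset=0, axis1=0, axis2=1)   ->  np.array
> Return the sum along the diagonal of the array (definition of matrix trace). <br> Note: with this signature the function is "np.trace" in the top-level namespace. 
- "offset": if $>0$: take the upper-shifted diagonal; if $<0$: the lower-shifted one. 
- "axis1"&"axis2": the 2 axes that define the 2D matrices whose diagonals are summed. 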

<br>

lin.inv(a)   ->   np.array
> compute inverse of matrix “a”. 
- "a" must have shape (..., n, n)
- Output "np.array" has shape (..., n, n)
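
The following sketch runs the determinant, rank, trace and inverse on one illustrative 2x2 matrix; the trace is taken with the top-level "np.trace":

```python
import numpy as np
import numpy.linalg as lin

A = np.array([[4.0, 7.0], [2.0, 6.0]])

determinant = lin.det(A)           # 4*6 - 7*2 = 10
rank = lin.matrix_rank(A)          # 2: the rows are linearly independent
trace = np.trace(A)                # 4 + 6 = 10
upper_trace = np.trace(A, offset=1)  # diagonal shifted up by one: 7
inverse = lin.inv(A)               # (1/10) * [[6, -7], [-2, 4]]

print(determinant, rank, trace, upper_trace)
print(A @ inverse)                 # identity, up to rounding
```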


### Linear Solver
Recall linear equations can be written in matrix multiplication form. 

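lin.solve(a, b)   ->  np.array
> solve equation ax=b and return “x”
- “a, b” are the coefficient matrix and the right-hand-side vector (or matrix), respectively. 
> Need to ensure shape consistency of the matrix equation: "a" must be a non-singular square matrix of shape (..., n, n), and "b" must have a matching number of rows. Stacked (batched) inputs follow the broadcasting rules described in the "numpy.linalg" documentation. 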
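
As a minimal sketch, the system $3x + y = 9$, $x + 2y = 8$ can be solved like this (the coefficients are made up for illustration):

```python
import numpy as np
import numpy.linalg as lin

a = np.array([[3.0, 1.0], [1.0, 2.0]])  # coefficient matrix
b = np.array([9.0, 8.0])                # right-hand side

x = lin.solve(a, b)
print(x)                      # [2. 3.]
print(np.allclose(a @ x, b))  # True: the solution satisfies the equations
```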

<br>

lin.lstsq(a, b, rcond=None)    ->   (solution, residuals, rank, singular_values)
> Compute the least-squares solution of “ap = b”. Typical use: find a best fit line for y=mx+c, as follows: <br>&emsp;
Given a set of points {(x, y)}, generate “a” as follows: a = [x 1] (last col contains only 1),
<br>&emsp;
and solve equation  “y = ap”   where p is: [[m], [c]], by passing “a” = [x 1], “b” = y. <br>

>The first returned item is “p”, containing “m” and “c” for the best fit line. 
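
The line-fitting recipe above can be sketched as follows, assuming made-up points that lie exactly on $y = 2x + 1$:

```python
import numpy as np
import numpy.linalg as lin

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix [x 1]: one column for the slope, one column of ones for the intercept.
a = np.column_stack([x, np.ones_like(x)])

solution, residuals, rank, singular_values = lin.lstsq(a, y, rcond=None)
m, c = solution
print(m, c)  # approximately 2.0 and 1.0
```

Note that the fitted parameters are only the first element of the returned tuple.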


# Mathematical operations: 


### Element-wise: 


- Trigonometry: <br> np.sin(np.array), np.cos(np.array), tan, arcsin, arccos, arctan, sinh, cosh, tanh, arcsinh, arccosh, arctanh.<br>

- Exponential & logarithmic: <br> np.exp(), exp2(), np.log(), log10(), log2()<br>

- Degree & radian conversion: <br> np.degrees(), np.radians()<br>

- Floor & Ceil: <br> np.floor(), np.ceil()<br>

- Reciprocal: np.reciprocal(): returns 1/x, elementwise. <br>

- np.sqrt(): square root <br>

- np.absolute(): absolute value; <br>

- np.sign(): "-1" for negative values, "0" for zero and "1" for positive values, elementwise <br>

### Single Array Operations: 
Usually lead to dimension reduction. 

np.(nan)max(np.array, axis=None, keepdims=False), np.(nan)min(np.array, axis=None, keepdims=False) -> np.array
> Extract the max/min along "axis" of "np.array". The "nan" variants ignore "nan" values; the plain ones return "nan" if any is present.
- axis: can be an integer OR a tuple of integers, making multi-dimensional reductions possible
- keepdims: if set to "True", the reduced "axis" is kept with size "1". If 'False', the "axis" is removed entirely. 
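
A short sketch of the "nan" handling and of "keepdims" on an illustrative array:

```python
import numpy as np

arr = np.array([[1.0, np.nan, 3.0],
                [4.0, 5.0, 6.0]])

plain = np.max(arr, axis=1)                   # nan propagates: [nan, 6.]
nan_safe = np.nanmax(arr, axis=1)             # nan ignored:    [3., 6.]
kept = np.nanmax(arr, axis=1, keepdims=True)  # shape (2, 1) instead of (2,)
overall = np.nanmin(arr, axis=(0, 1))         # reduce over both axes: 1.0

print(plain, nan_safe, kept.shape, overall)
```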


np.sum(np.array, axis=None, keepdims=False), np.prod(np.array, axis=None, keepdims=False) -> np.array
> perform summation or product along "axis". 
<br> &emsp; if any "nan" is present, the result will be "nan". Use nansum/nanprod instead: see below. 

<br>

np.cumsum(np.array, axis), np.cumprod(np.array, axis) -> np.array
> perform cumulative sum/product along axis: <br> &emsp; every element along the required "axis" in the resulting array will be the sum/product of all elements up to and including that position. See below demo. 

<br>

np.nansum(np.array, axis=None, keepdims=False), np.nanprod(np.array, axis=None, keepdims=False) -> np.array
> perform exactly the same operation as above two functions, except "nan" values will be ignored. 

<br>

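np.nancumsum(np.array, axis), np.nancumprod(np.array, axis) -> np.array
> cumulative versions of the two functions above: "nan" values are treated as 0 (sum) or 1 (product). 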
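
A minimal sketch of the "nan"-aware reductions on an illustrative 1D array:

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

plain_sum = np.sum(values)              # nan
safe_sum = np.nansum(values)            # 1 + 3 = 4
safe_prod = np.nanprod(values)          # 1 * 3 = 3
running_sum = np.nancumsum(values)      # [1., 1., 4.]
running_prod = np.nancumprod(values)    # [1., 1., 3.]

print(plain_sum, safe_sum, safe_prod, running_sum, running_prod)
```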

In [None]:
# demo of np.cumsum and np.cumprod
import numpy as np

test_arr = np.arange(8).reshape(2, 4)
print(np.cumsum(test_arr, axis=1))
# 9=4+5, 15=4+5+6, ...
print(np.cumprod(test_arr, axis=1))
# 20=4*5, 120=4*5*6

### Multiple Array Arithmetic Operations: 
Require shape broadcastability between two arrays. 

np.add/multiply/divide(array1, array2) -> np.array
> Adding/Multiplying/Dividing array1 and array2's elements ELEMENTWISE. <br> &emsp; Require shape broadcastability

<br>

np.power(base_array, power_array) -> np.array
> Create a new array where each element of "base_array" is raised to the corresponding element of "power_array". <br> &emsp; Applied elementwise, following shape broadcastability. 
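
Broadcasting is easiest to see with a column of bases and a row of exponents; the names below are illustrative:

```python
import numpy as np

bases = np.array([[1], [2], [3]])  # shape (3, 1)
exponents = np.array([0, 1, 2])    # shape (3,)

# The shapes broadcast to (3, 3): row i holds bases[i] ** [0, 1, 2].
powers = np.power(bases, exponents)
sums = np.add(bases, exponents)

print(powers)  # [[1 1 1], [1 2 4], [1 3 9]]
print(sums)    # [[1 2 3], [2 3 4], [3 4 5]]
```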


# Random Package

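np.random.seed(x) -> None
> seed the legacy global random state with integer "x". <br> &emsp; This does not affect "Generator" instances created with "np.random.default_rng()"; pass the seed to "default_rng" instead. 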

<br>

np.random.Generator.integers(low, high=None, size=None, endpoint=False) -> np.array
> Generate integers following a discrete uniform distribution over the range ["low", "high"); <br> &emsp; called on a "Generator" instance, e.g. rng = np.random.default_rng()
- "high": if “high” is not given, integers will be generated from zero (inclusive) to “low” (exclusive); 
- "size": tuple of output ndarray shape; will generate enough integers to fill the ndarray; 
    > if not given, will generate a single integer; 
- "endpoint": if True, “high” itself can be generated; in other words, the range is inclusive. 

<br>

np.random.Generator.random(size=None) -> np.array
> Generate floats in the interval [0.0, 1.0), following a uniform distribution; <br> &emsp; can use shift-and-scale to get any desired interval; 
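
Both methods are called on a "Generator" instance. A minimal sketch, assuming an arbitrary seed of 42 and illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

ints = rng.integers(0, 10, size=(2, 3))           # values in [0, 10)
dice = rng.integers(1, 6, size=5, endpoint=True)  # values in [1, 6], 6 included
floats = rng.random(size=4)                       # values in [0.0, 1.0)
scaled = 5.0 + 3.0 * floats                       # shift-and-scale to [5.0, 8.0)

# A second generator with the same seed reproduces the same first draw.
repeat = np.random.default_rng(seed=42).integers(0, 10, size=(2, 3))
print(np.array_equal(ints, repeat))  # True
```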

# Statistics: 

np.median(input_arr, axis=None, overwrite_input=False, keepdims=False) -> np.array
> return the median of "input_arr" along "axis". 
- "axis": as before, can also be a tuple of integers. 
- "keepdims": as before, if "True", the axes listed in "axis" are kept with size 1, instead of being removed completely. 

<br>

np.average(input_arr, axis=None, weights=None, keepdims=False) -> np.array
> Calculate the average, weighted by the optional "weights".
- weights: for calculating weighted average. 
	> Can be a 1D array or an array with the same shape as “input_arr”. Note that a 1D "weights" must have length “input_arr.shape[axis]”. 

<br>

np.mean(input_arr, axis=None, keepdims=False, where=True) -> np.array
> Calculate equally-weighted average of “input_arr”. 
- "where": a boolean np.array whose shape is broadcastable to that of "input_arr"; by default every element is included. 
	> If an element is 'False' in "where", the element will NOT be included in calculation. 

<br>

np.std(input_arr, axis=None, ddof=0, keepdims=False, where=True) -> np.array
> Calculate the standard deviation along one or multiple "axis" of "input_arr"
- ddof: when calculating standard deviation, the denominator has the form “N-ddof”, where “N” is the number of samples. A common choice is “ddof=1”, giving “N-1” (sample standard deviation); 

<br>

np.cov(input_arr, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None) -> np.array
> Calculate the covariance matrix of "input_arr", by treating either each row or each column as a random variable. 
- "input_arr": 1D or 2D array. 
- "rowvar": When “rowvar” is True, will treat each row as a single variable, and each element in the row (column-wise) as an observation. (if False, then opposite.)
- "bias": if 'True', the denominator when calculating the covariance is 'N', instead of 'N-1' when "bias=False"
- "fweights": 1D vector of frequency weights; (currently not commonly used)
- "aweights": 1D vector of observation weights; (currently not commonly used)
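
The statistics functions above can be compared side by side on a small made-up data set (two variables in rows, three observations each):

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [2.0, 4.0, 6.0]])

medians = np.median(data, axis=0)                      # per column: [1.5, 3., 4.5]
weighted = np.average(data, axis=1, weights=[1, 1, 2])  # per row: [2.25, 4.5]
sample_std = np.std(data, axis=1, ddof=1)              # denominator N-1: [1., 2.]
population_std = np.std(data, axis=1)                  # denominator N (ddof=0)
covariance = np.cov(data)                              # rows are variables: [[1, 2], [2, 4]]
biased_cov = np.cov(data, bias=True)                   # denominator N instead of N-1

print(medians, weighted, sample_std)
print(covariance)
```

The second row is exactly twice the first, so the off-diagonal covariance equals twice the variance of the first row.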
