### [numpy](https://numpy.org/doc/stable/index.html)

#### edited version from [@KeithGalli](https://github.com/KeithGalli/NumPy)


In [75]:
import numpy as np

# Overview

## Array Creation
`arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, linspace, logspace, mgrid, ogrid, ones, ones_like, r_ , zeros, zeros_like`

## Conversions
`ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat`

## Manipulations
`array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit, hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack`

## Questions
`all, any, nonzero, where`

## Ordering
`argmax, argmin, argsort, max, min, ptp, searchsorted, sort`

## Operations
`choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum`

## Basic Statistics
`cov, mean, std, var`

## Basic Linear Algebra
`cross, dot, outer, linalg.svd, vdot`

### The Basics


In [76]:
a = np.array([1, 127, 3], dtype='int8')  # 8-bit integer (2's complement)
print(a)


[  1 127   3]


In [77]:
# 2d array
b = np.array([[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]])
print(b)


[[9. 8. 7.]
 [6. 5. 4.]]


In [78]:
# Get Dimension
a.ndim


1

In [79]:
# Get Shape
b.shape  # ex. (2,3) is 2 rows and 3 columns


(2, 3)

In [80]:
# Get Shape
a.shape  # ex. (3,) is a 1-D array (a single vector) with 3 elements


(3,)

In [81]:
# Get Type
a.dtype


dtype('int8')

In [82]:
# Get Size
a.itemsize  # length of each element in bytes


1

In [83]:
# Get total size (in bytes)
a.nbytes


3

In [84]:
# Get number of elements
a.size


3

### Accessing/Changing specific elements, rows, columns, etc


In [85]:
a = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])
print(a)


[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]]


In [86]:
# Get a specific element [r, c]
print("v1 :", a[1, -2])
print("v2: ", a[1][-2])  # chained indexing, as with nested Python lists


v1 : 13
v2:  13


In [87]:
# Get a specific row (here every second element of row 0; a[0, :] gives the whole row)
a[0, ::2]


array([1, 3, 5, 7])

In [88]:
# Get a specific column
a[:, 2]


array([ 3, 10])

In [89]:
# Getting a little more fancy [startindex:endindex:stepsize]
a[0, 1::2]  # or a[0,1:-1:2]


array([2, 4, 6])

In [90]:
a[1, 5] = 20

a[:, 2] = [1, 2]
print(a)


[[ 1  2  1  4  5  6  7]
 [ 8  9  2 11 12 20 14]]


\*3-d example


In [91]:
b = np.array([[[1, 2, 1], [3, 4, 2]], [[5, 6, 3], [7, 8, 4]]])
print(b)
b.shape


[[[1 2 1]
  [3 4 2]]

 [[5 6 3]
  [7 8 4]]]


(2, 2, 3)

In [92]:
# Get specific element (work outside in)
b[0, 1, 1]


4

In [93]:
print(b[0, 1, :-1:])


[3 4]


In [94]:
# replace
b[0, 1, :-1:] = [0, 0]
b


array([[[1, 2, 1],
        [0, 0, 2]],

       [[5, 6, 3],
        [7, 8, 4]]])

### Initializing Different Types of Arrays


In [95]:
# All 0s matrix
np.zeros((2, 3))  # use shape as a tuple


array([[0., 0., 0.],
       [0., 0., 0.]])

In [96]:
# All 1s matrix
np.ones((4, 2, 2), dtype='int32')


array([[[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]]])

In [97]:
# Any other number
np.full((2, 2), 99)


array([[99, 99],
       [99, 99]])

In [98]:
a = np.array([[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, 14]])


In [99]:
# Any other number (full_like)
np.full_like(a, 99)


array([[99, 99, 99, 99, 99, 99, 99],
       [99, 99, 99, 99, 99, 99, 99]])

In [100]:
# Random decimal numbers, uniform over [0, 1)
np.random.rand(4, 2)


array([[0.99901665, 0.77558042],
       [0.72170968, 0.61757221],
       [0.60367237, 0.15773637],
       [0.55332889, 0.76321466]])

In [101]:
# Random integer values in [-4, 8) (the upper bound is excluded)
np.random.randint(-4, 8, size=(3, 3))


array([[ 3, -3, -3],
       [ 4, -1,  7],
       [ 4,  7,  2]])
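
The legacy `np.random.rand` and `np.random.randint` calls above still work. As a rough sketch, the same kind of draws with the newer Generator API (the one this notebook uses later through `np.random.default_rng`) might look like the following. The seed `0` and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, only for reproducibility

floats = rng.random((4, 2))              # uniform floats in [0, 1)
ints = rng.integers(-4, 8, size=(3, 3))  # integers in [-4, 8), upper bound excluded

print(floats.shape, ints.shape)
print(ints.min() >= -4, ints.max() < 8)
```

Note that `rng.random` takes the shape as a single tuple, unlike `np.random.rand`, which takes the dimensions as separate arguments.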

In [102]:
# The identity matrix
np.identity(5)


array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [103]:
# Repeat an array
arr = np.array([[1, 2, 3]])
r1 = np.repeat(arr, 3, axis=0)
print(r1)


[[1 2 3]
 [1 2 3]
 [1 2 3]]


In [104]:
arr2 = np.array([[1, 2],
                [3, 4]])
r2 = np.repeat(arr2, 2, axis=0)
print("axis=0:\n", r2)
r2 = np.repeat(arr2, 2, axis=1)
print("axis=1:\n", r2)


axis=0:
 [[1 2]
 [1 2]
 [3 4]
 [3 4]]
axis=1:
 [[1 1 2 2]
 [3 3 4 4]]


##### Axis Concept in Numpy

The axis of a NumPy array is the dimension along which an operation runs. In a 2D array:

- `axis 0` runs down the rows (vertically, from one row to the next).
- `axis 1` runs across the columns (horizontally, within a row).

Consider the 2D array `arr2 = [[1, 2], [3, 4]]` from the cell above.

###### Axis 0

> With `axis=0`, `np.repeat` steps from row to row and duplicates each whole row. With `repeats=2`, the first row `[1 2]` appears twice, followed by two copies of `[3 4]`.

###### Axis 1

> With `axis=1`, `np.repeat` steps along each row and duplicates every element in place. The first row becomes two 1's and two 2's, resulting in `[1 1 2 2]`.
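
To make the axis idea concrete, here is a minimal sketch that applies both a reduction (`np.sum`) and `np.repeat` along each axis of the same 2x2 array used above. The variable names are illustrative only.

```python
import numpy as np

arr2 = np.array([[1, 2],
                 [3, 4]])

# axis=0 runs down the rows, so summing over it collapses the rows
col_sums = np.sum(arr2, axis=0)
print(col_sums)   # [4 6]

# axis=1 runs across the columns, so summing over it collapses the columns
row_sums = np.sum(arr2, axis=1)
print(row_sums)   # [3 7]

# repeat along axis=0 duplicates whole rows;
# repeat along axis=1 duplicates the elements inside each row
rows_repeated = np.repeat(arr2, 2, axis=0)
cols_repeated = np.repeat(arr2, 2, axis=1)
print(rows_repeated.shape, cols_repeated.shape)   # (4, 2) (2, 4)
```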


In [105]:
output = np.ones((5, 5))
print(output)

z = np.zeros((3, 3))
z[1, 1] = 9
print(z)

output[1:-1, 1:-1] = z
print(output)


[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[0. 0. 0.]
 [0. 9. 0.]
 [0. 0. 0.]]
[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


##### Be careful when copying arrays!!!


In [106]:
print("deep copy")
a = np.array([1, 2, 3])
b = a.copy()
b[0] = 100

print("a:", a)
print("b:", b)

# but plain assignment makes no copy at all: b is only a second name for the same array
print("shallow copy")
a = np.array([1, 2, 3])
b = a
b[0] = 100

print("a:", a)
print("b:", b)


deep copy
a: [1 2 3]
b: [100   2   3]
shallow copy
a: [100   2   3]
b: [100   2   3]


### Mathematics


In [107]:
a = np.array([1, 2, 3, 4], dtype='int32')
print(a)


[1 2 3 4]


In [108]:
a + 2


array([3, 4, 5, 6])

In [109]:
a - 2


array([-1,  0,  1,  2])

In [110]:
a * 2


array([2, 4, 6, 8])

In [111]:
a / 2


array([0.5, 1. , 1.5, 2. ])

In [112]:
# a= np.array([1,2,3,4])
b = np.array([1, 0, 1, 0])
a + b


array([2, 2, 4, 4])

In [113]:
a ** 2  # a^2


array([ 1,  4,  9, 16])

In [114]:
import math
a = np.array([math.pi/4])
# Take the trigonometric functions (sin, cos, tan) of pi/4
print(f"sqrt(2)/2 = {math.sqrt(2)/2:.8f}")
np.set_printoptions(precision=8)  # control the number of decimal places
print(np.sin(a))
print(np.cos(a))
print(np.tan(a))


sqrt(2)/2 = 0.70710678
[0.70710678]
[0.70710678]
[1.]


In [115]:
# For a lot more (https://docs.scipy.org/doc/numpy/reference/routines.math.html)


##### Linear Algebra


In [116]:
a = np.ones((2, 3))
print(a)

b = np.full((3, 2), 2)
print(b)

np.matmul(a, b)


[[1. 1. 1.]
 [1. 1. 1.]]
[[2 2]
 [2 2]
 [2 2]]


array([[6., 6.],
       [6., 6.]])

In [117]:
# Find the determinant
c = np.identity(3)
np.linalg.det(c)


1.0

In [118]:
# Reference docs (https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)

# Determinant
# Trace
# Singular Value Decomposition
# Eigenvalues
# Matrix Norm
# Inverse
# Etc...
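
As a hedged illustration of the operations listed above, the sketch below runs each of them on a small diagonal matrix. The matrix `m` is made up so that every result is easy to check by hand; it is not part of the original notebook.

```python
import numpy as np

# small diagonal matrix, chosen so the results are easy to verify by hand
m = np.array([[2.0, 0.0],
              [0.0, 3.0]])

det = np.linalg.det(m)           # determinant: 2 * 3 = 6
tr = np.trace(m)                 # trace: 2 + 3 = 5
u, s, vt = np.linalg.svd(m)      # singular value decomposition, m = u @ diag(s) @ vt
eigvals = np.linalg.eigvals(m)   # eigenvalues (their order is not guaranteed)
fro = np.linalg.norm(m)          # Frobenius norm by default: sqrt(4 + 9)
inv = np.linalg.inv(m)           # inverse

print(det, tr, fro)
print(s)
print(inv)
# sanity check: m @ inv should be numerically equal to the identity
print(np.allclose(m @ inv, np.eye(2)))
```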


##### Statistics


In [119]:
stats = np.random.randint(0, 100, size=(2, 3))
print(stats)


[[ 8 13 29]
 [55 70  6]]


In [120]:
np.min(stats)


6

In [121]:
np.max(stats, axis=0)


array([55, 70, 29])

In [122]:
np.sum(stats, axis=0)


array([63, 83, 35])
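
The overview also lists `mean`, `std` and `var`, which take the same `axis` argument as `min`, `max` and `sum`. A minimal sketch, using a fixed array instead of random values so that the numbers can be checked by hand (the array and variable names are illustrative assumptions):

```python
import numpy as np

# fixed values rather than random ones, so the results can be checked by hand
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

overall_mean = np.mean(data)                   # 3.5
col_means = np.mean(data, axis=0)              # [2.5 3.5 4.5]
row_means = np.mean(data, axis=1)              # [2. 5.]
row_var = np.var(data, axis=1)                 # population variance (ddof=0) of each row
row_std_sample = np.std(data, axis=1, ddof=1)  # sample standard deviation of each row

print(overall_mean, col_means, row_means)
print(row_var, row_std_sample)
```

By default `np.var` and `np.std` divide by the number of elements; passing `ddof=1` divides by one less, which gives the sample statistic.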

### Reorganizing Arrays


In [123]:
before = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print("before:\n", before)

after = before.reshape((2, 2, 2))
print("after:\n", after)

# the condition is that the number of elements must be the same
# thus 2*4 = 2*2*2


before:
 [[1 2 3 4]
 [5 6 7 8]]
after:
 [[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
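
The element-count condition can be sketched as follows: passing `-1` lets NumPy infer one dimension from the total count, while a shape with the wrong number of elements raises an error. The names `inferred` and `error_message` are illustrative.

```python
import numpy as np

before = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# -1 asks NumPy to infer that dimension from the element count (8 / 2 = 4)
inferred = before.reshape((-1, 2))
print(inferred.shape)   # (4, 2)

# a shape whose product is not 8 cannot work and raises ValueError
error_message = None
try:
    before.reshape((3, 3))
except ValueError as err:
    error_message = str(err)
print("reshape failed:", error_message)
```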


In [124]:
# Vertically stacking vectors
v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])

np.vstack([v1, v2, v1, v2])


array([[1, 2, 3, 4],
       [5, 6, 7, 8],
       [1, 2, 3, 4],
       [5, 6, 7, 8]])

In [125]:
# Horizontal stack
h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

np.hstack((h1, h2))


array([[1., 1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0., 0.]])

### Miscellaneous

##### Load Data from File


In [126]:
filedata = np.genfromtxt('testNumpy.txt', delimiter=',')
filedata = filedata.astype('int64')
print(filedata)


[[  2  33  44  55 666  22   3   2   1   1   2   3   4   5   6   7   8   9]
 [  3  44  55  66 777  33   4   3   2   2   3   4   5   6   7   8   9   0]
 [  4  55  66  77 888  44   5   4   3   3   4   5   6   7   8   9   0   1]]


##### Boolean Masking and Advanced Indexing


In [127]:
(~((filedata > 2) & (filedata < 50)))


array([[ True, False, False,  True,  True, False, False,  True,  True,
         True,  True, False, False, False, False, False, False, False],
       [False, False,  True,  True,  True, False, False, False,  True,
         True, False, False, False, False, False, False, False,  True],
       [False,  True,  True,  True,  True, False, False, False, False,
        False, False, False, False, False, False, False,  True,  True]])
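
The cell above only builds a boolean mask. As a sketch of what such a mask is typically used for, the example below selects elements with it and also shows integer ("fancy") indexing and `np.where`. The array `data` is a stand-in for the start of the first row of `filedata`, since `testNumpy.txt` may not be available.

```python
import numpy as np

# stand-in for part of filedata, since testNumpy.txt may not be available
data = np.array([2, 33, 44, 55, 666, 22, 3])

mask = (data > 2) & (data < 50)   # element-wise AND; the parentheses are required
selected = data[mask]             # keep the elements where mask is True
rejected = data[~mask]            # ~ inverts the mask
print(selected)                   # [33 44 22  3]
print(rejected)                   # [  2  55 666]

# any / all answer yes-or-no questions about a whole mask
print(np.any(data > 100), np.all(data > 0))   # True True

# integer ("fancy") indexing picks elements by position
picked = data[[0, 2, -1]]
print(picked)                     # [ 2 44  3]

# np.where with three arguments chooses between two values per element
capped = np.where(data > 50, 50, data)
print(capped)                     # [ 2 33 44 50 50 22  3]
```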

##### Learn more

In [1]:
# !RUN THIS CELL!
import numpy as np
import math
rg = np.random.default_rng(20230608)

In [129]:
a = np.linspace(0, math.pi, 10)
print(a)


[0.         0.34906585 0.6981317  1.04719755 1.3962634  1.74532925
 2.0943951  2.44346095 2.7925268  3.14159265]


In [130]:
b = np.fromfunction(lambda i, j: 10 * i + j, (5, 4), dtype=int)
print(b)


[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]


In [131]:
e = np.arange(12).reshape(2, 3, 2)
print(e.flat) # 1d iterator over the array
print(e.ravel()) # 1d array
for i in e.flat:
    print(i)

<numpy.flatiter object at 0x000002430F07B250>
[ 0  1  2  3  4  5  6  7  8  9 10 11]
0
1
2
3
4
5
6
7
8
9
10
11


In [132]:
a = np.floor(10*rg.random((3,4)))
print(a)
print(a.ravel()) # returns the array, flattened
print(a.T)  # returns the array, transposed
print(a.T @ a) # matrix multiplication

[[0. 9. 8. 8.]
 [2. 6. 1. 1.]
 [0. 5. 7. 7.]]
[0. 9. 8. 8. 2. 6. 1. 1. 0. 5. 7. 7.]
[[0. 2. 0.]
 [9. 6. 5.]
 [8. 1. 7.]
 [8. 1. 7.]]
[[  4.  12.   2.   2.]
 [ 12. 142. 113. 113.]
 [  2. 113. 114. 114.]
 [  2. 113. 114. 114.]]


In [133]:
a = np.floor(10*rg.random((5)))
b = np.floor(10*rg.random((5)))
print(a)
print(b)
print(a[:,np.newaxis]) # adds a new axis, turning the 1d array into a column; explanation: https://stackoverflow.com/questions/29241056/how-does-numpy-newaxis-work-and-when-to-use-it
print(np.column_stack((a,b))) # stack 1d arrays as columns into 2d array (if 2d arrays, stack as 2d array)
print(np.vstack((a,b))) # stack the arrays as rows; np.row_stack is an alias of vstack that is deprecated as of NumPy 2.0

[6. 5. 3. 3. 9.]
[0. 5. 0. 4. 6.]
[[6.]
 [5.]
 [3.]
 [3.]
 [9.]]
[[6. 0.]
 [5. 5.]
 [3. 0.]
 [3. 4.]
 [9. 6.]]
[[6. 5. 3. 3. 9.]
 [0. 5. 0. 4. 6.]]


In [134]:
# splitting one array into several smaller ones
a = np.floor(10*rg.random((4,12)))
print(a,"\n")
print(np.hsplit(a,3),"\n") # split a into 3 equal blocks of columns
print(np.vsplit(a,(2,3)),"\n") # split a after the second and the third row

[[1. 3. 9. 2. 9. 3. 2. 5. 0. 0. 1. 1.]
 [8. 7. 6. 8. 8. 5. 3. 4. 5. 1. 1. 2.]
 [3. 3. 3. 0. 0. 0. 3. 8. 4. 0. 3. 2.]
 [0. 3. 8. 7. 5. 3. 8. 8. 3. 3. 4. 4.]] 

[array([[1., 3., 9., 2.],
       [8., 7., 6., 8.],
       [3., 3., 3., 0.],
       [0., 3., 8., 7.]]), array([[9., 3., 2., 5.],
       [8., 5., 3., 4.],
       [0., 0., 3., 8.],
       [5., 3., 8., 8.]]), array([[0., 0., 1., 1.],
       [5., 1., 1., 2.],
       [4., 0., 3., 2.],
       [3., 3., 4., 4.]])] 

[array([[1., 3., 9., 2., 9., 3., 2., 5., 0., 0., 1., 1.],
       [8., 7., 6., 8., 8., 5., 3., 4., 5., 1., 1., 2.]]), array([[3., 3., 3., 0., 0., 0., 3., 8., 4., 0., 3., 2.]]), array([[0., 3., 8., 7., 5., 3., 8., 8., 3., 3., 4., 4.]])] 



Shallow copy

In [135]:
a = np.arange(12)
b = a # no new object is created
print("b is a :",b is a)
b.shape = 3,4 # changes the shape of a
print("a.shape :",a.shape)
print("a :",a)
c = a.view() # creates a view of the array with the same data
print("a.view() is a :",c is a)
# The view method creates a new array object that looks at the same data.
print("but c is a view of the data owned by a :",c.base is a) # c is a view of the data owned by a
# slicing an array returns a view of it
s = a[1:3,1:3]
print("a[1:3,1:3] :\n",s)
s[:] = 10 # writes into the data that s shares with a; s = 10 would only rebind the name s
print("a change when use s[:] = 10 because s[:] is a view of s : \n",a) # a's elements change because s is a view into a

b is a : True
a.shape : (3, 4)
a : [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
a.view() is a : False
but c is a view of the data owned by a : True
a[1:3,1:3] :
 [[ 5  6]
 [ 9 10]]
a change when use s[:] = 10 because s[:] is a view of s : 
 [[ 0  1  2  3]
 [ 4 10 10  7]
 [ 8 10 10 11]]


Deep copy

In [136]:
a = np.arange(int(1e8))
print(a)
b = a[:100].copy()
print(b is a)
print(b.base is a) # b doesn't share anything with a

[       0        1        2 ... 99999997 99999998 99999999]
False
False
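
The three cases covered so far (plain assignment, view, copy) can be told apart with `np.shares_memory`. The following is a minimal sketch on a small array; the names `alias`, `view` and `copied` are illustrative.

```python
import numpy as np

a = np.arange(6)

alias = a               # plain assignment: same object, nothing is copied
view = a[:3]            # basic slicing: new array object, shared data
copied = a[:3].copy()   # copy: new array object with its own data

print(alias is a)                    # True
print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False

view[0] = 100    # writes through to a
copied[1] = -1   # leaves a untouched
print(a)         # [100   1   2   3   4   5]
```

Copying a small slice, as in the cell above, also means the large original array can be freed once nothing else refers to it.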


# Identity matrix

### What is the difference between eye and identity?
- `numpy.eye(N, M=None, k=0, dtype=<class 'float'>, order='C')` : This function returns a 2-D array with ones on the diagonal and zeros elsewhere. It is a more versatile function because it can create non-square matrices (if M is not None and M != N) and also shift the location of the ones (if k is not 0). Here, 'N' is the number of rows, 'M' is the number of columns, and 'k' is the index of the diagonal.

> order : {‘C’, ‘F’}. 'C' means a C-style row-major array; 'F' means a Fortran-style column-major array.

> The two differ in how the elements are laid out in memory: in 'C' order the element stored after a[0,0] is a[0,1], while in 'F' order it is a[1,0].

- `numpy.identity(n, dtype=None)` : This function returns a square array with ones on the main diagonal. It is less flexible because it can only create square matrices and the ones can only be on the main diagonal. Here, 'n' is the number of rows (and also columns) in the output.

In [137]:
# example how to use eye
a = np.eye(2) # Create a 2x2 identity matrix
print(a)
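b = np.eye(3, k=1) # 3x3 matrix with ones on the first diagonal above the main one (k=1)
print(b)
c = np.eye(3, k=-1) # 3x3 matrix with ones on the first diagonal below the main one (k=-1)
print(c)
# k is the diagonal index: 0 is the main diagonal, positive is above it, negative is below it
# Why the name eye? It sounds like "I", the usual symbol for the identity matrix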

[[1. 0.]
 [0. 1.]]
[[0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 0.]]
[[0. 0. 0.]
 [1. 0. 0.]
 [0. 1. 0.]]
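
The remaining `eye` parameters described above (`M` and `order`) can be sketched in the same way. This is only an illustration; the variable names are made up.

```python
import numpy as np

# N=2 rows, M=3 columns: a non-square matrix with ones on the main diagonal
rect = np.eye(2, 3)
print(rect)

# for the square, main-diagonal case the two functions agree
same = np.array_equal(np.eye(3), np.identity(3))
print(same)   # True

# order only changes the memory layout, not the values
c_order = np.eye(2, 3, order='C')
f_order = np.eye(2, 3, order='F')
print(np.array_equal(c_order, f_order))                               # True
print(c_order.flags['C_CONTIGUOUS'], f_order.flags['F_CONTIGUOUS'])   # True True
```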


# Less Basic

## Broadcasting rule

### What is broadcasting?
- Broadcasting is a powerful mechanism that allows NumPy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

> Broadcasting solves the problem of arithmetic between arrays of differing shapes by, in effect, stretching the smaller array along every dimension where its size is 1 (or where the dimension is missing) so that it matches the larger one. No data is actually copied.

### What are the rules of broadcasting?
- Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays:

> Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

> Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

> Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

So, after padding, the sizes in each dimension must either be equal or one of them must be 1.

In [11]:
import numpy as np
a = np.array([1.0, 2.0, 3.0])
b = 2.0
print(a * b)

[2. 4. 6.]
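
Multiplying by a scalar is the simplest case. As a sketch of how the three rules play out on two real arrays, the example below adds a column of shape `(3, 1)` to a row of shape `(3,)` and then triggers the Rule 3 error on purpose. The arrays and names are illustrative.

```python
import numpy as np

col = np.array([[0.0], [10.0], [20.0]])   # shape (3, 1)
row = np.array([1.0, 2.0, 3.0])           # shape (3,)

# Rule 1: row's shape (3,) is padded on the left to (1, 3)
# Rule 2: (3, 1) and (1, 3) are both stretched to (3, 3)
total = col + row
print(total.shape)   # (3, 3)
print(total)

# np.broadcast_shapes applies the same rules to the shapes alone
result_shape = np.broadcast_shapes((3, 1), (3,))
print(result_shape)  # (3, 3)

# Rule 3: the trailing sizes 3 and 2 disagree and neither is 1, so this raises ValueError
broadcast_failed = False
try:
    np.ones((2, 3)) + np.ones((2,))
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```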
