<a href="https://colab.research.google.com/github/jbpost2/ST-554-Big-Data-With-Python-Course-Notes/blob/main/01_Programming_in_python/12_Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# `NumPy`

One of the most widely used modules for statistics is `numpy`.

- A library for scientific computing

    + Provides the multidimensional array object (such as vectors and matrices) and methods for manipulating them

- Bring in the module with `import` and the convention is to reference it as `np`

In [None]:
import numpy as np

---

# Plan

Go through common data types

- Learn how to create
- Consider commonly used functions and methods
- See control flow and other tricks along the way

This topic, compound objects:
- **numpy array**

Recall: **functions** & **methods** act on objects. We'll see how to obtain **attributes** here as well!

---

# Creating an Array

- Arrays are like lists, but every element shares one data type, so operations on them run much faster

- To create an `ndarray` object, pass a list, tuple, or any array-like object to `np.array()`

--

.left45[

In [None]:
a = np.array(1)
a
type(a)
a.shape

]

.right45[

In [None]:
b = np.array([1, 2, 3])
b
type(b)
b.shape

]
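
As a minimal sketch of the "list, tuple, or any array-like object" point above, the cell below builds arrays from three kinds of input. The variable names are illustrative and not part of the original notes.

```python
import numpy as np

from_list = np.array([1, 2, 3])      # from a list
from_tuple = np.array((1.5, 2.5))    # from a tuple
from_range = np.array(range(4))      # from another array-like object

print(from_list.shape)     # (3,)
print(from_tuple.dtype)    # float64
print(from_range)          # [0 1 2 3]

# Unlike a list, arithmetic applies to every element at once
doubled = from_list * 2
print(doubled)             # [2 4 6]
```

With a plain list, `[1, 2, 3] * 2` would repeat the list rather than double each value.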


---

# Array Dimension

- A 0D array acts like a scalar (sort of... [see here for discussion](https://stackoverflow.com/questions/773030/why-are-0d-arrays-in-numpy-not-considered-scalar))
- 1D arrays are vectors
- 2D arrays are matrices
- 3D and up are just called arrays

- The `.shape` attribute gives the dimensions of an array as a tuple

--

.left45[

In [None]:
c = np.array([1, "a", True])
c
type(c)
c.shape

]

.right45[

In [None]:
d = np.array([
  [1, 2, 3],
  [4, 5, 6]]
  )
type(d)
d.shape

]
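
The sketch below prints the number of dimensions (`.ndim`) next to `.shape` for 0D, 1D, and 2D arrays. It also shows what happens to a mixed-type input like `c` above: NumPy converts every element to a common type, here a string. Variable names are illustrative.

```python
import numpy as np

scalar_like = np.array(1)                    # 0D
vector = np.array([1, 2, 3])                 # 1D
matrix = np.array([[1, 2, 3], [4, 5, 6]])    # 2D

for arr in (scalar_like, vector, matrix):
    print(arr.ndim, arr.shape)
# 0 ()
# 1 (3,)
# 2 (2, 3)

mixed = np.array([1, "a", True])
print(mixed.dtype.kind)   # U  (a unicode string dtype)
print(mixed)              # ['1' 'a' 'True']
```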

---

# Functions for Creating Arrays

Creating a vector or matrix of all zeros

- Row vector (strictly, a 1D array with shape `(4,)`)

In [None]:
A0 = np.zeros(4)
A0

- Column vector

In [None]:
A0 = np.zeros((4,1))
A0

---

# Functions for Creating Arrays

- Matrix of zeros

In [None]:
A = np.zeros((4,2))
A
A.shape

---

# Functions for Creating Arrays

- Row (1D array) of all ones

In [None]:
b = np.ones(4)
b

- Matrix of all ones

In [None]:
B = np.ones((2,3))
B

---

# Functions for Creating Arrays

- Matrix of 10s

In [None]:
C = np.ones((2, 3)) * 10
C

- `np.full()` does this automatically

In [None]:
C = np.full((2,3), 10)
C

---

# Functions for Creating Arrays

- Be careful! `C` is an integer-valued array, so assigning `6.5` stores the truncated value `6`

In [None]:
import numpy as np

In [None]:
C = np.full((2,3), 10)
C[0,0] = 6.5
C

- Avoid by creating with a float

In [None]:
C = np.full((2,3), 10.0)  # or C = np.ones((2, 3)) * 10.0
C[0,0] = 6.5
C
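
To make the truncation visible, this sketch checks the `dtype` before and after. Passing `dtype=float` to `np.full()` and converting with `.astype()` are two further options beyond those shown above. The names `C_int`, `C_float`, and `C_converted` are illustrative.

```python
import numpy as np

C_int = np.full((2, 3), 10)
print(C_int.dtype.kind)      # i  (an integer dtype)
C_int[0, 0] = 6.5            # silently truncated to fit the integer dtype
print(C_int[0, 0])           # 6

C_float = np.full((2, 3), 10, dtype=float)   # request floats explicitly
C_float[0, 0] = 6.5
print(C_float[0, 0])         # 6.5

C_converted = C_int.astype(float)   # astype returns a new float array
print(C_converted.dtype)     # float64
```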

---

# Functions for Creating Arrays

- Create an identity matrix with `np.eye()`

In [None]:
D = np.eye(3)
D

- Create a random matrix (uniform values in the interval [0, 1)) with `np.random.random()`

In [None]:
E = np.random.random((3,5))
E

- [Many more ways to create!](https://numpy.org/doc/stable/reference/routines.array-creation.html)
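
Random arrays differ on every run. If reproducible values are needed, one option (not used in these notes) is a seeded generator from `np.random.default_rng()`. The seed value below is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=554)   # arbitrary seed, chosen for illustration
E = rng.random((3, 5))                  # uniform values in [0, 1)

rng_again = np.random.default_rng(seed=554)
E_again = rng_again.random((3, 5))

print(E.shape)                       # (3, 5)
print(np.array_equal(E, E_again))    # True: same seed, same values
```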

---

# Reshaping an Array

- Reshape an array with the `.reshape()` method

.left45[

In [None]:
F = np.random.random((10,1))
F
F.shape

]
.right45[

In [None]:
G = F.reshape(1, -1) # -1 tells NumPy to infer that dimension, giving shape (1, 10)
G
G.shape

]
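
A short sketch of how `-1` behaves in `.reshape()`, using `np.arange()` in place of random data so the shapes are easy to follow. The names `row`, `flat`, and `wide` are illustrative.

```python
import numpy as np

F = np.arange(10).reshape(10, 1)   # column holding 0..9

row = F.reshape(1, -1)     # 1 row; NumPy infers 10 columns (still 2D)
flat = F.reshape(-1)       # a single inferred axis gives a 1D array
wide = F.reshape(2, -1)    # 2 rows; NumPy infers 5 columns

print(row.shape, flat.shape, wide.shape)   # (1, 10) (10,) (2, 5)
```

Only one dimension may be `-1`, since NumPy works it out from the total number of elements.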

---

# Reshaping an Array

- Reshape an array with the `.reshape()` method

In [None]:
G = F.reshape(2, 5)
G

- Careful! This is a view of the original array, so changing `G` also changes `F`

In [None]:
G.base is None
G.base

---

# Copying an Array

- To get an independent array instead, use the `.copy()` method

In [None]:
H = F.reshape(2, 5).copy()
H.base is None
H.base
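
Besides inspecting `.base`, `np.shares_memory()` gives a direct check of whether two arrays use the same data. The sketch below uses `np.arange()` so the values are predictable.

```python
import numpy as np

F = np.arange(10.0).reshape(10, 1)

G = F.reshape(2, 5)           # view of the same data
H = F.reshape(2, 5).copy()    # independent copy

print(np.shares_memory(F, G))   # True
print(np.shares_memory(F, H))   # False

G[0, 0] = -1.0    # writes through to F
H[0, 1] = -2.0    # does not touch F
print(F[0, 0], F[1, 0])   # -1.0 1.0
```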

---

# Indexing an Array

- Access elements in the same way as lists, with `[]`

In [None]:
import numpy as np

In [None]:
b = np.array([1, 2, 3])
print(b[0], b[1], b[2])
b[0] = 5
b

---

# Indexing an Array

- For multidimensional arrays, give one index per dimension, separated by commas

In [None]:
E = np.random.random((3, 2, 2))
E
print(E[0, 0, 0], E[0, 1, 0], E[1, 0, 1])
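
With random values it is hard to tell which element came back, so this sketch fills a 3D array with `np.arange()` instead. It also shows negative indices and what happens when fewer indices than dimensions are supplied.

```python
import numpy as np

E = np.arange(12).reshape(3, 2, 2)   # values 0..11 in three 2x2 blocks

print(E[1, 0, 1])      # 5: block 1, row 0, column 1
print(E[-1, -1, -1])   # 11: negative indices count from the end
print(E[1].shape)      # (2, 2): one index returns a whole 2x2 block
```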

---

# Slicing an Array

- Recall `[start:end]` for slicing

    + Returns everything from start up to and **excluding** end
    + Leaving start blank implies a 0
    + Leaving end blank returns everything from start through the end of the array
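
The three rules above, sketched on a small 1D array (the name `v` is illustrative):

```python
import numpy as np

v = np.arange(10, 16)   # [10 11 12 13 14 15]

print(v[1:4])   # [11 12 13]  start included, end excluded
print(v[:3])    # [10 11 12]  blank start means 0
print(v[3:])    # [13 14 15]  blank end runs through the last element
```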


---

# Slicing an Array

- Recall `[start:end]` for slicing

In [None]:
import numpy as np

In [None]:
A = np.array([
  [1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]])
A
B = A[:2, 1:3]
B

---

# Careful with Modifying

In [None]:
A = np.array([
  [1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]])

B = A[:2, 1:3]

Changing an element of `B` changes `A`!

In [None]:
B[0, 0] = 919
A
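
One way to avoid this, sketched below, is to call `.copy()` on the slice before modifying it. The names `B_copy`, `B_view`, and `value_after_copy_edit` are illustrative.

```python
import numpy as np

A = np.array([
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9, 10, 11, 12]])

B_copy = A[:2, 1:3].copy()   # independent copy of the slice
B_copy[0, 0] = 919
value_after_copy_edit = int(A[0, 1])
print(value_after_copy_edit)   # 2: A is unchanged

B_view = A[:2, 1:3]          # plain slice: a view into A
B_view[0, 0] = 919
print(A[0, 1])               # 919: the write went through to A
```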

---

# Returning All of One Index

- Use a `:` with nothing else

.left45[

In [None]:
A = np.array([
  [1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]])
A1 = A[1, :]
A1
A1.shape

]
.right45[

In [None]:
A2 = A[1:3, :]
A2
A2.shape

]
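
Note the difference in shapes: a single integer index drops that dimension, while a slice keeps it. A minimal sketch (the names `row_1d`, `row_2d`, and `col` are illustrative):

```python
import numpy as np

A = np.array([
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9, 10, 11, 12]])

row_1d = A[1, :]      # integer index: result is 1D
row_2d = A[1:2, :]    # slice of length 1: result stays 2D
col = A[:, 2]         # every row, column 2

print(row_1d.shape)   # (4,)
print(row_2d.shape)   # (1, 4)
print(col)            # [ 3  7 11]
```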

---

# Operations on Arrays

- We saw that multiplying by a constant was performed elementwise  
- All basic functions act elementwise

In [None]:
import numpy as np

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.add(x, y)
x + 10
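
The arithmetic operators are shorthand for the corresponding NumPy functions, so `x + y` and `np.add(x, y)` agree. A small sketch:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print(x + y)    # same as np.add(x, y)
# [[ 6  8]
#  [10 12]]
print(x * y)    # same as np.multiply(x, y): element-wise, not a matrix product
# [[ 5 12]
#  [21 32]]
print(x + 10)   # the constant is applied to every element
# [[11 12]
#  [13 14]]
```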

---

# Operations on Arrays

- We saw that multiplying by a constant was performed elementwise  
- All basic functions act elementwise

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.multiply(x, y, where = (x >= 3), out = x) # multiplies only where x >= 3 and writes the result into x; x * y would multiply every element
x / y
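
The `where` and `out` arguments are easier to follow when the output goes into a separate array. In this sketch, `result` (an illustrative name) starts as a copy of `x`; positions where the condition is false keep their starting value.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

result = x.copy()                               # starting values for the output
np.multiply(x, y, where=(x >= 3), out=result)   # only the second row satisfies x >= 3

print(result)
# [[ 1  2]
#  [21 32]]
print(x)        # x itself is untouched here
# [[1 2]
#  [3 4]]
```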

---

# Operations on Arrays

- The `np.sqrt()` function finds the square root of each element of a matrix

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.sqrt(x)

---

# Operations on Arrays

- `np.linalg.inv()` returns the inverse of a square, invertible matrix

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.linalg.inv(x)
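
A quick sanity check for an inverse is that multiplying it by the original matrix gives the identity, up to floating-point rounding. The sketch below also shows that a singular matrix raises `np.linalg.LinAlgError`. The matrix `singular` is an illustrative example.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
x_inv = np.linalg.inv(x)

# Compare with a tolerance rather than ==, since the result is floating point
print(np.allclose(x @ x_inv, np.eye(2)))   # True

singular = np.array([[1, 2], [2, 4]])      # second row is twice the first
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False
print(inverted)                            # False: no inverse exists
```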

---

# Operations on Arrays

- The `np.matmul()` function (or the `@` operator) performs matrix multiplication

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.matmul(x, y)
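
Matrix multiplication and element-wise multiplication give different results, which this sketch puts side by side:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print(np.matmul(x, y))   # rows of x times columns of y
# [[19 22]
#  [43 50]]
print(x @ y)             # the @ operator gives the same result
print(x * y)             # element-wise product, for comparison
# [[ 5 12]
#  [21 32]]
```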

---

# Operations on Arrays

- NumPy has some useful functions for performing basic computations on arrays

In [None]:
x = np.array([
  [1,2,10],
  [3,4,11]])
np.sum(x)

- Column-wise (`axis=0`) and row-wise (`axis=1`) sums

In [None]:
x.shape
np.sum(x, axis=0)
np.sum(x, axis=1)
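
The `axis` argument names the dimension that is summed over and removed. For the 2 x 3 array above, a sketch with the results written out:

```python
import numpy as np

x = np.array([
  [1, 2, 10],
  [3, 4, 11]])

total = np.sum(x)               # every element
col_sums = np.sum(x, axis=0)    # collapse the rows: one sum per column
row_sums = np.sum(x, axis=1)    # collapse the columns: one sum per row

print(total)      # 31
print(col_sums)   # [ 4  6 21]
print(row_sums)   # [13 18]
```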

---

# Operations on Arrays

- Combine appropriately sized arrays side by side with `np.hstack()` or on top of each other with `np.vstack()`

In [None]:
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.hstack((x, y))
np.vstack((x, y))
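
"Appropriately sized" means matching row counts for `np.hstack()` and matching column counts for `np.vstack()`. The sketch below shows the resulting shapes (the names `side_by_side` and `stacked` are illustrative).

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

side_by_side = np.hstack((x, y))   # columns of y appended to the right of x
stacked = np.vstack((x, y))        # rows of y appended below x

print(side_by_side.shape)   # (2, 4)
print(side_by_side)
# [[1 2 5 6]
#  [3 4 7 8]]
print(stacked.shape)        # (4, 2)
```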

- [Lots of operations!](https://numpy.org/doc/stable/reference/routines.array-manipulation.html)

---

# To JupyterLab!  

- Create some arrays

- Loop through the values

    + Check out `np.nditer()`
    
- Apply some `ufuncs`
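
As a starting point for the exercise, here is a minimal sketch that loops over every element with `np.nditer()` and then gets comparable results from ufuncs without a loop. The array values are illustrative.

```python
import numpy as np

x = np.array([[1, 4], [9, 16]])

# np.nditer visits every element regardless of the number of dimensions
values = []
for element in np.nditer(x):
    values.append(int(element))   # each element arrives as a 0D array
print(values)                     # [1, 4, 9, 16]

# ufuncs apply a function to every element without an explicit loop
roots = np.sqrt(x)
print(roots)
# [[1. 2.]
#  [3. 4.]]
floor_at_5 = np.maximum(x, 5)     # element-wise maximum against a constant
print(floor_at_5)
# [[ 5  5]
#  [ 9 16]]
```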

---

# Recap

- `NumPy` is a widely used library that provides arrays

    + Lots of functions to create arrays
    
    + Very fast computation and many useful functions for operating on arrays!