# Let's learn some NumPy!

#### <font color="blue">Okay, but I know some Python, what's this other stuff you want me to learn?</font>

NumPy is a multidimensional array manipulation toolkit containing numerous high-performance functions to make your life as easy as possible!

Aka: It does fun mathy-math stuff so my brain doesn't have to always go BRRRR, but instead just go brrrrr.

## Let's install it!

To install NumPy, there is just one simple step:

1. Run `python -m pip install numpy` or `python3 -m pip install numpy` (depending on which command you use to open up Python).
    - If you are using Colab, you can just run `!python3 -m pip install numpy` in a code block, like so:

In [None]:
!python3 -m pip install numpy

## What is NumPy?

NumPy is a Python library that adds support for massive multidimensional arrays, along with vectorized mathematical operations and tools for manipulating the aforementioned arrays.

#### So I keep hearing about these things called arrays. What are they?

Great question!

Remember lists are a built-in part of Python.

All of these following statements will return a `list` object.

The `list` function provides an easy way to cast objects to a list, though to be honest, it's not that commonly used.
In this case, we are casting a `tuple` to a `list`.

In [None]:
a = list((1, 2, 3))

print(a)

What we have here is an empty `list`. These can be made by simply writing a pair of square brackets (`[]`), like below.

In [None]:
a = []

print(a)

To pre-populate a list, we can just place all the elements between the square brackets and delimit them with commas.

In [None]:
a = [1, 2, 3, 4]

print(a)

`str` objects are also nice because they cast super well to a `list`. However, `str` objects are indexable themselves, so the only time you'd really need a `list` here is for mutability (the ability to change elements of the list).

In [None]:
a = list("hello!")

print(a)

Oooh! Here's `list` comprehension, probably one of my favorite parts of Python!

By saying `[i**2 for i in range(20)]`, we get a `list` holding the value of `i**2` for every `i` from `0` to `19`.

In [None]:
a = [i**2 for i in range(20)]

print(a)
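
If the comprehension syntax looks like magic, here's a rough sketch of what it's doing under the hood. The names `squares_loop` and `squares_comp` are just made up for this example.

```python
# The long way: start empty and append inside a loop
squares_loop = []
for i in range(5):
    squares_loop.append(i**2)

# The short way: a list comprehension that builds the same list
squares_comp = [i**2 for i in range(5)]

print(squares_loop)                  # [0, 1, 4, 9, 16]
print(squares_loop == squares_comp)  # True
```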

Here's the last one!

Our handy-dandy `list.append` method.

If you have a `list` and want to plop something onto the end, you can take the `list` and just use its `append` method.

In [None]:
a = [1, 2, 3, 4]

# Our friend 5 has accidentally got themselves kicked off the list, so we want to add them back to complete the friend group
a.append(5)

print(a)

# In the words of Dr. Wheeler: "Viola!"

### Smh Arvin, you're getting distracted again. What are arrays???

Oh shoot, I totally forgot, yes, arrays.

So, as you can see with the `list` examples above, `list` objects are designed to store a varying number of items of varying types, but often in one dimension.

On the other hand, arrays typically hold a single datatype (usually called a dtype) and are built for data with a particular multidimensional shape.
It might sound horribly constrained, but in actuality, it's highly useful and modular, so let's dive in!

Uhh, Arvin, you never explained how to use NumPy though. I mean, I downloaded it, is there something I need to do to access it?

Good question and I'll explain it right now!

## 2.1 Importing!

There are two ways to import things. First, with the `import` keyword directly:

<img alt="<font style='color: blue;'>import</font> <font style='color: green;'>*module*</font> <font style='color: blue;'>as</font> <font style='color: purple;'>*name*</font>" src="assets/import_structure.png" width="500px"/>

* The green `module` is where you put the name of the module you want to import. Remember, it has to be a module. You can't directly import a method like this. You'd have to use `from`.

* The blue `import` and `as` are keywords. `import` is required, but `as` is only used if you want to give it a "name."

* The purple `name` is optional. You don't have to rename it.

The other way to import things is with the `from` keyword.

<img alt="Structure of the from keyword" src="assets/from_structure.png" width="500px"/>

* The blue `from` and `import` keywords are both required. The `as` keyword can be incorporated, but it's pretty rare.

* The green `module` is where you put the name of the module you are importing from.

* The purple section is what exactly you're importing. For example, `*` would import everything into the current namespace, `thing` would import just the `thing`, and `(thing1, thing2)` would import both `thing1` and `thing2`.

Notes: You can use `as` with the `from` keyword like so: `from <module> import <thing1> as <abbreviation>`.
Also, the difference between `import` and `from` is as follows:

If you use `import numpy.linalg`, then to access `numpy.linalg.norm()`, you will need to type the whole thing out.

On the other hand...

If you use `from numpy.linalg import norm`, then to access `numpy.linalg.norm()`, you can just say `norm`.

However, `from` statements can be dangerous. If you're not careful, you might unintentionally overwrite an existing function or variable that has the same name.
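
Here's a small sketch of both import styles side by side, plus the overwriting danger in action. The homemade `sqrt` function is purely hypothetical, invented so there's something to overwrite.

```python
import numpy.linalg
from numpy.linalg import norm

# With a plain import, you spell out the full path
long_way = numpy.linalg.norm([3.0, 4.0])

# With from ... import, the bare name is enough
short_way = norm([3.0, 4.0])

print(long_way, short_way)  # 5.0 5.0

# The danger: a from-import silently replaces an existing name
def sqrt(x):
    return "my own sqrt"

from numpy import sqrt  # our homemade sqrt is gone now, no warning given

print(sqrt(16.0))  # 4.0
```

Python never complains about the second `sqrt` replacing the first, which is exactly why this kind of mistake is easy to miss.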

#### Let's try importing NumPy now!

In [None]:
import numpy as np # owo, see the import keyword? That's how we access NumPy. If you see an error, now is the time to let me know :)

- Wait? That's all it took?

    - Yeah! Python is an extremely modular and extensible language, and importing is literally easier than breathing.


- What's the `np` represent?

    - Well `np` is a commonly used abbreviation for NumPy throughout the Python community. You can import it however you'd like, but `np` is recognizable to pretty much everyone.

## 2.2 Initializing Arrays

Thankfully, arrays are super simple to make and NumPy provides NUMEROUS methods to initialize them!

So, let's get started!

NumPy provides the `np.array()` function, which, honestly, seems to resemble the `list` function. Let's try it out!

In [None]:
a = np.array([[1, 2], [3, 4]])

print(a)

So, as you can see, we sent a two-dimensional list as an input to the `np.array` function, which then converted the `list` to an `np.ndarray` object.

$\begin{bmatrix}
1 & 2\\
3 & 4
\end{bmatrix} \rightarrow$ `np.ndarray`

What's an `ndarray`? Well, it's short for n-dimensional array and it's pretty much just a nod to the multidimensional capabilities of NumPy.
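
If you want to see the conversion for yourself, here's a quick sketch (the name `nested` is just for illustration) that checks the type before and after, and peeks at the shape NumPy picked up from the nesting.

```python
import numpy as np

nested = [[1, 2], [3, 4]]   # a plain list of lists
a = np.array(nested)        # the same data as an ndarray

print(type(nested).__name__)  # list
print(type(a).__name__)       # ndarray
print(a.shape)                # (2, 2)
```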



Interestingly, the array we made prints like

```
[[1 2]
 [3 4]]
```

instead of

```
[1, 2, 3, 4]
```

Anyone have any idea why that is?


Okay, well, let's see how else we can initialize an array.

NumPy provides these functions for creating arrays:

* `np.array`
* `np.zeros`
* `np.ones`
* `np.empty`
* `np.arange` and `np.linspace`
* `np.zeros_like`, `np.ones_like`, `np.empty_like`
* `np.random.*`

So, let's test out a few of these! I'll start out with `np.zeros` but, feel free to play around!

In [None]:
np.zeros((4, 4, 3)) # The shape of the array is the input to the function.
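
Here's a quick tour of a few of the others, as a sketch rather than a complete reference. The variable names are made up, and the `np.random.rand` values will be different every time you run it.

```python
import numpy as np

ones = np.ones((2, 3))            # 2x3 array filled with 1.0
counting = np.arange(0, 10, 2)    # start, stop (exclusive), step
spaced = np.linspace(0, 1, 5)     # 5 evenly spaced points, both ends included
template = np.zeros_like(ones)    # zeros with the same shape and dtype as `ones`

print(ones.shape)      # (2, 3)
print(counting)        # [0 2 4 6 8]
print(spaced)          # [0.   0.25 0.5  0.75 1.  ]
print(template.shape)  # (2, 3)

# np.random.* gives random values, so the numbers change on every run
noise = np.random.rand(2, 2)      # uniform samples in [0, 1)
print(noise.shape)     # (2, 2)
```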

## 2.3 Indexing & Slicing

Okay, in my opinion, this is probably the most confusing section of this entire thing, so I'm gonna just rip off the bandaid, but with soapy water so it hurts less.

Put simply, indexing and slicing are ways of extracting subsets of an array. It is, fundamentally, a very simple idea, but thinking about it in multiple dimensions can take some mental fortitude.

### 2.3.1 Indexing

If you're familiar with Python or went to our last lecture, you have some familiarity with indexing in lists. For a quick recap:

* Indexing is done using bracket notation.
* Indexing starts at 0.
* Indexing multiple dimensions can be done using multiple bracket pairs.

With indexing in NumPy, approximately the same rules still apply!

One notable difference is that NumPy prefers slightly different notation. It is encouraged to write the indices for multiple dimensions as comma-separated numbers inside a single bracket pair (`x[0, 0]`) rather than as individual bracket pairs (`x[0][0]`).

Let's try them out.

In [None]:
x = np.array([
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7],
    [4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9]
])

print(x)

# What do you think x[0] will be?

print(x[0])

# How about x[0, 0]?
print(x[0, 0])
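
To compare the two notations directly, here's a small sketch on a made-up 3x3 array. Both styles land on the same element, and negative indices work just like they do with lists.

```python
import numpy as np

x = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print(x[1][2])    # 6, list-style: grab row 1, then element 2 of that row
print(x[1, 2])    # 6, NumPy-style: row 1, column 2 in one go
print(x[-1, -1])  # 9, negative indices count from the end
```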

### 2.3.2 Slicing

Slicing is actually really similar to indexing, except instead of trying to extract a single element from the array, you are getting a block of the array.

For example, say we're trying to get a smaller block out of a matrix.

$A = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6\\
3 & 4 & 5 & 6 & 7 & 8\\
5 & 6 & 7 & 8 & 9 & 10\\
7 & 8 & 9 & 10 & 11 & 12\\
\end{bmatrix}$

Let's say we want to get the middle 2x2 block from this 4x6 matrix. Just from "eyeballing" it, it seems like
$\begin{bmatrix}
5 & 6\\
7 & 8
\end{bmatrix}$ is what we're looking for.

Now, we can do what is known as **slicing**.

So, over the zeroth axis (rows), we find that we are grabbing the piece from index 1 up to index 3. It's important to note that slicing is (inclusive, exclusive): the element at the end index is not included, so this grabs rows 1 and 2.

Over the first axis (columns), we find that we are cutting from index 2 up to index 4, which grabs columns 2 and 3.

To prove this, we will slice!

Let's jump right into it!

In [None]:
A = np.array([
    [1, 2, 3, 4, 5, 6],
    [3, 4, 5, 6, 7, 8],
    [5, 6, 7, 8, 9, 10],
    [7, 8, 9, 10, 11, 12]
])

print(A) # Print our array

print(A[1:3, 2:4]) # Print our slice

Voila! The slice is complete!

Of course, slices have some more fun features.

If you have an array `A` with `n` dimensions, you can slice along up to `n` axes at once, one slice per axis.

All slices fit a general format:
    
<img src="assets/slice_structure.png" width="500px" />

In the last slice, we left out the step size, because it is assumed to be 1.
We can also leave out the start and/or end values too, indicating we want to use the entirety of that axis or even just from a certain value to either end of the axis.
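
Here's a one-dimensional sketch of those shortcuts, assuming the usual `start:stop:step` ordering. The array `v` is made up just for this example.

```python
import numpy as np

v = np.arange(10)  # [0 1 2 3 4 5 6 7 8 9]

print(v[2:8:2])  # [2 4 6]    start 2, stop 8 (exclusive), step 2
print(v[:3])     # [0 1 2]    start left out: begin at the front
print(v[7:])     # [7 8 9]    end left out: run to the back
print(v[::3])    # [0 3 6 9]  only a step: every third element
```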

In [None]:
A = np.array([
    [
        [1,2,3],
        [4,5,6],
        [7,8,9]
    ],
    [
        [10,11,12],
        [13,14,15],
        [16,17,18]
    ],
    [
        [19,20,21],
        [22,23,24],
        [25,26,27]
    ]
])

print("All axes:\n", A)

# Slice the first axis only
print("First axis only:\n", A[1:3])

# Slice second axis only
print("Second axis only:\n", A[:, 1:3])

# Slice third axis only
print("Third axis only:\n", A[:, :, 1:3])

So, as you can tell, this can be a very difficult topic to grasp. It takes some time to build visualization skills to help deconstruct and analyze these arrays. But of course, practice makes perfect! Perhaps another time, we can talk about conditional indexing with NumPy!

## 2.4 Attributes!

Now that this part is over, we can get to the chill stuff: `attributes`.

`np.ndarray` is what is known as a class. It's like a factory for making objects. Put in the necessary information and out comes the object! Each of these objects has intrinsic properties that describe it. These properties are called attributes.

One of the most important attributes of an `ndarray` is `ndarray.shape`. In Python, attributes of an object are typically accessed using the `.` symbol. Just as you might expect, `ndarray.shape` tells you the shape of the array.

In [None]:
x = np.random.randn(100, 20, 5)

print(x.shape)

Woah! Who would've guessed? That a 100x20x5 array would have the shape 100x20x5? Preposterous!

You would think that `ndarray.size` would be the same thing as shape, but it's not. It actually tells you the total number of elements in the array.

Another useful attribute is `ndarray.ndim`. It simply tells you how many dimensions the array has.

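`ndarray.T`: The `.T` attribute is the transposed version of the original array. What is transposition? It reverses the order of the axes, so for a 2-D array, the rows become columns and the columns become rows.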

`ndarray.dtype`: The `.dtype` attribute returns the datatype being stored within the array, e.g. `int`, `float32`, `float64`, `complex128`.

`ndarrays` have numerous attributes and for a complete list, please visit https://numpy.org/doc/stable/reference/arrays.ndarray.html.
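
To see all of these side by side, here's a short sketch on an array of zeros with the same 100x20x5 shape as before (zeros instead of random numbers, so the output is predictable).

```python
import numpy as np

x = np.zeros((100, 20, 5))

print(x.shape)    # (100, 20, 5)
print(x.size)     # 10000, i.e. 100 * 20 * 5
print(x.ndim)     # 3
print(x.dtype)    # float64
print(x.T.shape)  # (5, 20, 100), the axes come out in reverse order
```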

## 2.5 Reshaping

Okay, so this one is a little weird to explain, but NumPy, being the amazing library it is, supports reshaping!

I'm sure you are familiar with the word "reshaping," but there are particular constraints with reshaping in NumPy. The biggest thing to keep in mind is that Reshaping != Rescaling. When reshaping an array, the total number of elements (`size`) should remain constant.

Reshaping arrays in NumPy is rather easy. In fact, it's so commonly used that it's actually a method on arrays!

`a_reshaped = a.reshape(<new_shape>)`

Let's try out an easy example!

In [None]:
A = np.array([
    [1, 3, 5, 7],
    [2, 6, 8, 4],
    [1, 3, 9, 27],
    [1, 1, 2, 3]
])

print(A) # Original Array

In [None]:
print(A.reshape((2, 8)))

In [None]:
print(A.reshape((8, 2)))
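
And here's a sketch of what happens when the sizes don't match up. The 16-element array below is a stand-in built with `np.arange`, not the `A` from above.

```python
import numpy as np

A = np.arange(16).reshape((4, 4))  # 16 elements

print(A.reshape((2, 8)).size)  # 16, same number of elements as before

try:
    A.reshape((3, 5))  # 15 slots for 16 elements
except ValueError as err:
    print("Nope:", err)
```

NumPy refuses with a `ValueError` rather than dropping or inventing elements, which is the "Reshaping != Rescaling" rule in action.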

Since I didn't cover actually performing a transposition, I will do that here as well!

In [None]:
print(A.T)

NumPy also provides a few helper functions for reshaping:

* `ndarray.flatten()`: Returns a flattened (unraveled into 1-D) copy of an N-dimensional array
* `ndarray.ravel()`: A very close relative of `ndarray.flatten()` but doesn't necessarily make a copy of the array beforehand
* `np.expand_dims()`: Returns a view of the array with a new axis of length 1 inserted at the position of your choice.

Let's play around with these for a bit!

In [None]:
print(A.flatten())
print(A.ravel())
print(np.expand_dims(A, axis=1))
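
The copy-versus-view difference is easy to miss from printed output alone, so here's a small sketch that checks it with `np.shares_memory`. The names `flat_copy` and `flat_view` are made up, and `ravel` only returns a view here because this little array is stored contiguously.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

flat_copy = A.flatten()  # always a fresh copy
flat_view = A.ravel()    # a view in this case

print(np.shares_memory(A, flat_copy))  # False
print(np.shares_memory(A, flat_view))  # True

flat_view[0] = 99  # writing through the view changes A too
print(A[0, 0])     # 99
print(flat_copy[0])  # 1, the copy is unaffected

print(np.expand_dims(A, axis=1).shape)  # (2, 1, 2)
```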

## 2.6 Mathematical Functions

Saving the best for last, we have math! Like I stated at the beginning, NumPy is filled with mathematical functions for doing fun mathy-math stuff! :)

First of all, let's talk array arithmetic!

If you have two arrays of the same shape (or an array and a scalar), you can perform addition, subtraction, division, and multiplication!

`C = A + B`
`C = A - B`
`C = A * B`
`C = A / B`

This arithmetic is all elementwise, which is why the shape requirement exists.
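
Here's a quick sketch with two made-up 2x2 arrays. The `.tolist()` calls are only there to keep the printed output on one line.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

print((A + B).tolist())  # [[11, 22], [33, 44]]
print((B - A).tolist())  # [[9, 18], [27, 36]]
print((A * B).tolist())  # [[10, 40], [90, 160]], elementwise, not matrix multiplication
print((B / A).tolist())  # [[10.0, 10.0], [10.0, 10.0]]
print((A * 2).tolist())  # [[2, 4], [6, 8]], the scalar is applied to every element
```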

On the other hand, if you want to do fun stuff like dot products and cross products and linear algebra, NumPy provides mathematical functions for those!

Let's say I want to take the dot product of two vectors, `a` and `b`.

You could do: `np.dot(a, b)`
But you could also do: `a.dot(b)`

Cross products work similarly with `np.cross(a, b)`, though there is no `a.cross(b)` method on arrays.

However, if you want to, say, take the norm (magnitude of vector) of vector a, you would have to use NumPy's submodule, `linalg`: `np.linalg.norm(a)`
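
To make that concrete, here's a sketch with some simple made-up vectors where the answers are easy to check by hand.

```python
import numpy as np

a = np.array([1, 0, 0])
b = np.array([0, 1, 0])

print(np.dot(a, b))    # 0, the vectors are perpendicular
print(a.dot(b))        # 0, same thing as a method
print(np.cross(a, b))  # [0 0 1]

c = np.array([3.0, 4.0])
print(np.linalg.norm(c))  # 5.0
```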

A lot of mathematical functions are provided as methods of `ndarray`, including, but not limited to, `ndarray.sum`, `ndarray.max`, `ndarray.mean`, and `ndarray.std`.
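
Here's a tiny sketch of those four on a made-up array called `scores`. Note that they're called with parentheses.

```python
import numpy as np

scores = np.array([1.0, 2.0, 3.0, 4.0])

print(scores.sum())   # 10.0
print(scores.max())   # 4.0
print(scores.mean())  # 2.5
print(scores.std())   # roughly 1.118 (this is the population standard deviation)
```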

NumPy is so expansive that I could not possibly cover even the entirety of its basics, but if you want to keep going, I highly recommend following the beginner's guide (https://numpy.org/doc/stable/user/absolute_beginners.html) and checking out the full documentation (https://numpy.org/doc/stable/).