# NumPy
## Overview
### What You'll Learn
In this section, you'll learn:
1. What NumPy and its arrays are
1. How to generate them
1. How to access them
1. How to do some basic operations on them

### Prerequisites
Before starting this section, you should have an understanding of:
1. [Basic Python](https://github.com/HackBinghamton/PythonWorkshop)

### Introduction
**Numerical Python (NumPy)** is an open-source Python module that provides fast mathematical computation on arrays and matrices. Both are an essential part of machine learning, which we'll be diving into next week!

NumPy is considered one of the essential Python libraries for data analysis and scientific computation, including model development for machine learning.

If you want to learn more about the functions this workshop goes over, check out the [NumPy documentation](https://numpy.org/devdocs/).


## Setup

***Make sure to run the below code block to set the section up!***

In [None]:
!pip install numpy

## Importing
We can import NumPy by typing:

In [None]:
import numpy as np

## NumPy Arrays (ndarray)

NumPy’s main object is a homogeneous multidimensional array, also called an **ndarray**. It is a table of elements that all have the same type (usually numbers).

In NumPy, dimensions are called ***axes***. The number of axes is called the ***rank***. To create a NumPy array, you can use the following functions:
* `np.array`
* `np.ones`
* `np.full`
* `np.arange`
* `np.linspace`
* `np.random.rand`
* `np.empty`

Try running each of the below code blocks to see what each function does!

In [None]:
import numpy as np

a = np.array([1, 2, 3])
print(a, type(a))

In [None]:
import numpy as np

# This creates an array with 3 rows and 4 columns
# *dtype*: you can set dtype to the type of variable you want in your array (int, bool, etc.)
np.ones( (3,4), dtype=np.int16 )  

In [None]:
import numpy as np

np.full( (3,4), 0.11 )  

In [None]:
import numpy as np

# Creates the array [10, 15, 20, 25]
print(np.arange( 10, 30, 5 ))
 
# Floating point example
print(np.arange( 0, 2, 0.3 ))             


In [None]:
import numpy as np

np.linspace(0, 5/3, 6)

In [None]:
import numpy as np

np.random.rand(2,3)

In [None]:
import numpy as np

# The contents are uninitialized, so the values shown are arbitrary
np.empty((2,3))

### Exercise

Create and print out a 3x3 ndarray that contains True for all values.

*Hint*: Don't forget to change the **dtype**!

In [None]:
# Your code here!

## Array Attributes

These are attributes of NumPy's array object that we can use:

* **ndim**: the number of dimensions (axes) of the array
* **shape**: a tuple of integers giving the size of the array along each axis
* **size**: the total number of elements in the NumPy array
* **dtype**: the type of the elements in the array, e.g., int64 or float64
* **itemsize**: the size in bytes of each item

Two related tools are often used alongside them, although they are not attributes:

* **np.arange**: a function that returns an ndarray of evenly spaced values
* **reshape**: a method that returns the array's data in a new shape
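
As a quick illustration, the sketch below builds a small array and prints each attribute. The array contents and the explicit `np.int64` dtype are chosen only for this example.

```python
import numpy as np

# A small 3x4 array of 64-bit integers to inspect
a = np.arange(12, dtype=np.int64).reshape(3, 4)

print(a.ndim)      # number of axes: 2
print(a.shape)     # size along each axis: (3, 4)
print(a.size)      # total number of elements: 12
print(a.dtype)     # element type: int64
print(a.itemsize)  # bytes per element: 8
```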

## Python Lists vs. NumPy Arrays

Some ways in which NumPy arrays are different from normal Python lists are:
1. You can assign a single value to a slice of the ndarray
2. When you modify a slice, you modify the original ndarray 
3. You can use boolean indexing to access an ndarray

### Assigning a single value to a slice
If you assign a single value to an ndarray slice, it is copied across the whole slice. This makes it easier to assign values, because a regular Python list would need a loop to assign all of them.

In [None]:
import numpy as np

a = np.array([1, 2, 5, 7, 8])
a[1:3] = -1
a

### Modifying slices will modify the original
ndarray slices are actually views on the same array the slice was taken from. If you modify the slice, you modify the original ndarray as well.


In [None]:
import numpy as np

a = np.array([1, 2, 5, 7, 8])
a_slice = a[1:5]
a_slice[1] = 1000
a
# Original array was modified

If we need a copy of the NumPy array, we need to use the `copy` method: `another_slice = a[1:5].copy()`. If we modify `another_slice`, `a` remains the same.



In [None]:
import numpy as np

a = np.array([1, 2, 5, 7, 8])
another_slice = a[1:5].copy()
another_slice[1] = 1000
a
# Original array was not modified

### Accessing multidimensional arrays

Multidimensional NumPy arrays are accessed differently from nested Python lists. The generic format for indexing a two-dimensional NumPy array is:

```
array[row_start_index:row_end_index, column_start_index:column_end_index]
```
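
To make the format above concrete, here is a small sketch on a 3x4 array. The array and the name `block` are only for illustration. As with Python lists, the end index of each range is excluded.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# a is:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Rows 0 and 1 (the end index 2 is excluded), columns 1 and 2
block = a[0:2, 1:3]
print(block)        # [[1 2], [5 6]]

# A single element: row 2, column 3
print(a[2, 3])      # 11

# A whole column: every row, column 0
print(a[:, 0])      # [0 4 8]
```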

NumPy arrays can also be accessed using boolean indexing:

In [None]:
import numpy as np

a = np.arange(12).reshape(3, 4)
print(a)

In [None]:
import numpy as np

a = np.arange(12).reshape(3, 4)

rows_on = np.array([True, False, True])
a[rows_on , : ]      # Rows 0 and 2, all columns
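
Boolean arrays do not have to be written out by hand. A common pattern, sketched below, is to build the mask from a comparison on the array itself. The names `mask` and `b` are only for this example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Comparing an array with a value gives a boolean array of the same shape
mask = a > 7
print(mask)

# Indexing with the mask returns the selected elements as a 1D array
print(a[mask])      # [ 8  9 10 11]

# A mask can also be used to assign to the selected elements
b = a.copy()        # work on a copy so that a is left unchanged
b[b % 2 == 0] = 0   # set every even number to 0
print(b)
```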

## Broadcasting

In general, when NumPy expects arrays of the same shape but finds that this is not the case, it applies the **broadcasting** rules.

[Here](https://cloudxlab.com/blog/wp-content/uploads/2017/12/Screen-Shot-2017-12-13-at-5.57.21-PM.png) are some examples of these broadcasting rules applied.

There are 2 rules of broadcasting to remember:

1. If the arrays do not have the same rank, a 1 is added to the beginning of the shape of the lower-rank array until the ranks match.
    * Example: When adding arrays A and B of shapes (3, 3) and (3,) [rank 2 and rank 1], a 1 is added to the beginning of B's shape to make it (1, 3) [rank 2]. Two shapes are compatible when, in each dimension, the sizes are equal or one of them is 1.

2. When one of the two sizes being compared is 1, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other.
    * Example: When adding a 2D array A of shape (3, 3) to a 2D array B of shape (1, 3), NumPy stretches B by replicating its single row 3 times, so that B behaves like an array of shape (3, 3), and then performs the operation.



### Example

```
arr1 = np.ones((2, 3))
arr2 = np.arange(3)
```

Let's find the shape of the sum of these two arrays. The shapes of the arrays are

```
arr1.shape = (2, 3)
arr2.shape = (3,)
```

We look at rule 1 and see that the array **arr2** has fewer dimensions, so we pad it on the left with ones:

```
arr1.shape -> (2, 3)
arr2.shape -> (1, 3)
```

Next is rule 2, and we now see that the first dimension disagrees, so we stretch this dimension to match:

```
arr1.shape -> (2, 3)
arr2.shape -> (2, 3)
```

The shapes match, and we see that the final shape will be (2, 3):

In [None]:
import numpy as np

arr1 = np.ones((2, 3))
arr2 = np.arange(3)

arr1 + arr2
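
It can also help to see a case where the rules do not apply. In the sketch below, the array `bad` (a name made up for this example) has shape (4,), which pads to (1, 4). Its last dimension is 4 while that of `arr1` is 3, and neither is 1, so NumPy raises a `ValueError`.

```python
import numpy as np

arr1 = np.ones((2, 3))
bad = np.arange(4)   # shape (4,), which pads to (1, 4)

# Compatible case: np.broadcast reports the shape the result would have
print(np.broadcast(arr1, np.arange(3)).shape)   # (2, 3)

# Incompatible case: the last dimensions are 3 and 4, and neither is 1
try:
    arr1 + bad
    failed = False
except ValueError as err:
    failed = True
    print("Cannot broadcast:", err)
```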

## Math with NumPy

NumPy provides tools for doing math with NumPy arrays.

It offers ***basic mathematical and statistical functions*** such as mean, min, max, sum, prod, std, and var, along with summation across different axes and matrix transposition. NumPy arrays themselves support ***basic operations*** such as addition, subtraction, multiplication, matrix dot product, division, modulo, exponentiation, and conditional operations.
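
A few of these in action, as a minimal sketch on a small example array chosen for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# Statistical functions over the whole array
print(a.sum())          # 21
print(a.mean())         # 3.5

# The axis argument collapses one axis at a time
print(a.sum(axis=0))    # column sums: [5 7 9]
print(a.sum(axis=1))    # row sums: [ 6 15]

# Transpose swaps rows and columns
print(a.T.shape)        # (3, 2)

# Arithmetic operators work element by element
print(a * 2)            # [[ 2  4  6], [ 8 10 12]]
print(a % 2 == 0)       # conditional operation, gives booleans

# Matrix product of a (2x3) with its transpose (3x2)
print(a @ a.T)          # [[14 32], [32 77]]
```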

NumPy is even able to solve systems of linear equations. If we want to solve for `x` and `y` in:

```
2x + 6y = 6
5x + 3y = -9
```

We can use ndarrays and built-in NumPy linear algebra functionality, like so:

In [None]:
import numpy as np

coeffs  = np.array([[2, 6], [5, 3]])
depvars = np.array([6, -9])
solution = np.linalg.solve(coeffs, depvars)
solution
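
One way to check a result like this, sketched below, is to multiply the coefficient matrix by the solution and compare it with the right-hand side. `np.allclose` is used instead of `==` because the solver works in floating point, so the values may differ by tiny rounding errors.

```python
import numpy as np

coeffs  = np.array([[2, 6], [5, 3]])
depvars = np.array([6, -9])
solution = np.linalg.solve(coeffs, depvars)

x, y = solution
print(x, y)   # approximately -3.0 and 2.0

# Multiplying the coefficient matrix by the solution should give back
# the right-hand side, up to floating point rounding
print(np.allclose(coeffs @ solution, depvars))
```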

### Exercise

Solve this system of equations using NumPy:

```
3x + 5y - 7z = 130
x + 33y + 10z = 5
-13x - 2y + 3z = 17
```

In [None]:
# Your code here!

If your answer is correct, you should have `x = -6.13952078`, `y = 5.55929633`, and `z = -17.23172581`.

## Note: Using NumPy with Pandas

If you ever want to take a NumPy array you've generated and use it to modify, append to, or do an operation with a Pandas DataFrame, you can do that! We won't cover it in detail here, since many functions are involved (e.g. `flatten`, and the arguments for the constructors of NumPy arrays and Pandas DataFrames), but we wanted to make a note of it in case you were wondering (and maybe it'll help you on the project)!