# NumPy

## What is NumPy?

NumPy (Numerical Python) is a Python package designed for scientific computing. It provides numerous important tools, most notably the powerful NumPy `array` object. Further documentation on NumPy is available here: https://docs.scipy.org/doc/

NumPy gives Python many of the same tools available in MATLAB. If you are a MATLAB user, the following table may be helpful for learning NumPy:
https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html

## The NumPy `array`

The NumPy `array` is a Python data structure representing an $n$-dimensional ($n = 1, 2, 3, \ldots$) grid of values, which can be accessed via their indices. Unlike a `list`, which can contain a variety of object types (item 1 could be a `float`, item 2 a `bool`, etc.), a NumPy `array` can only contain items of a single type. This allows for faster mathematical operations while also using less memory.
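As a small illustration of the single-type rule, the sketch below (variable names are illustrative, not taken from the text above) shows how NumPy converts mixed inputs to one common type, which is reported by the `dtype` attribute.

```python
import numpy as np

# A list may mix types; converting it forces a single common type.
mixed = [1, 2.5, True]
converted = np.array(mixed)

print(converted)        # every entry is now a float
print(converted.dtype)  # float64

# Every element occupies the same number of bytes.
print(converted.itemsize, converted.nbytes)
```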

There are many ways to initialize a NumPy array, including functions such as `zeros()`, `ones()`, `empty()`, `arange()`, or by converting a list to an array.

The `zeros()` function takes two arguments and returns an array filled with zeros: the first argument specifies the shape of the `array`, while the second specifies the type of values stored.

In [1]:
from numpy import zeros
zeros([4], float)

array([0., 0., 0., 0.])

If the first argument is a list of two integers, the first value specifies the number of rows and the second specifies the number of columns.

In [2]:
zeros([3,4], int)

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

Alternatively, the `ones()` function initializes each entry of the `array` to one.

In [3]:
from numpy import ones
ones([6],complex)

array([1.+0.j, 1.+0.j, 1.+0.j, 1.+0.j, 1.+0.j, 1.+0.j])

The `empty()` function returns an `array` without initializing the entries (they contain whatever values happened to be in memory).

In [4]:
from numpy import empty
empty(5,float)

array([ 1.  ,  2.75,  6.  , 10.75, 17.  ])
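Because the initial contents are arbitrary, `empty()` is typically followed by code that assigns every entry before the `array` is read. A minimal sketch of that pattern (the name `squares` is illustrative):

```python
import numpy as np

# Allocate space for 5 floats; the contents are arbitrary at this point.
squares = np.empty(5, float)

# Overwrite every entry before using the array.
for i in range(squares.size):
    squares[i] = i**2

print(squares)  # [ 0.  1.  4.  9. 16.]
```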

The `array()` function can be used to create an `array` from a `list`.

In [5]:
from numpy import array
r = [1.0, 1.5, -2.2]
array(r,float)

array([ 1. ,  1.5, -2.2])

Passing a 2-dimensional list to `array()` will yield a 2-dimensional array.

In [6]:
array([[1,2,3],[4,5,6]],int)

array([[1, 2, 3],
       [4, 5, 6]])

Elements of an `array` are accessed in the same way as elements of a `list`.

In [7]:
a = array([1, 2, 3], float)
a[1]

2.0

### Arithmetic

Arithmetic operations on an `array` are applied element-by-element.

In [8]:
a = array([1,2,3,4],int)

2*a

array([2, 4, 6, 8])

Contrast this to the behavior for lists.

In [9]:
l = [1,2,3,4]

2*l

[1, 2, 3, 4, 1, 2, 3, 4]

Multiplication of two `array`s is also performed element by element.

In [10]:
a = array([1,2,3,4],int)
b = array([2,4,6,8],int)

a*b

array([ 2,  8, 18, 32])

To instead perform the dot product of two arrays, use the `dot()` function.

In [11]:
from numpy import dot

dot(a,b)

60
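For 2-dimensional arrays, `dot()` performs matrix multiplication. The sketch below uses two illustrative matrices, `A` and `B`, to contrast this with the element-by-element product. It also uses the `@` operator, which Python 3.5 and later provide as shorthand for matrix multiplication.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], int)
B = np.array([[0, 1], [1, 0]], int)

elementwise = A * B      # multiplies matching entries
matrix = np.dot(A, B)    # rows of A times columns of B
operator = A @ B         # same result as np.dot for 2D arrays

print(elementwise)       # [[0 2], [3 0]]
print(matrix)            # [[2 1], [4 3]]
```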

Many of the built-in functions we learned previously can be applied to `array`s.

In [12]:
sum(a), len(a), min(a), max(a)

(10, 4, 1, 4)

### Applying math to arrays

We saw previously that a given function could be applied element-by-element to entries in a list using the `map()` function.

In [13]:
from math import sqrt

l = [1, 4, 9]

list(map(sqrt,l))   #cast to list for printing

[1.0, 2.0, 3.0]

Element-by-element application is the default behavior for mathematical functions defined within the NumPy package.

In [14]:
from numpy import sqrt

a = array(l,float)
sqrt(a)

array([1., 2., 3.])

Note that we imported a different `sqrt` function this time, from the NumPy package.  To avoid ambiguity, it is common practice to import the entire NumPy package, rather than individual functions.

In [15]:
import numpy as np
np.sqrt(a)

array([1., 2., 3.])
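The distinction between the two `sqrt` functions matters in practice. As a brief sketch, `math.sqrt` expects a single number and raises a `TypeError` when given a multi-element `array`, whereas `np.sqrt` operates element-by-element (the flag `math_failed` is only there to make the outcome explicit):

```python
import math
import numpy as np

a = np.array([1, 4, 9], float)

math_failed = False
try:
    math.sqrt(a)              # math.sqrt accepts one number, not an array
except TypeError as err:
    math_failed = True
    print("math.sqrt failed:", err)

roots = np.sqrt(a)            # NumPy version works on every element
print(roots)
```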

As described in the NumPy [documentation](https://docs.scipy.org/doc/numpy/reference/routines.math.html), many functions are available.

In [16]:
l = [0, np.pi/4, np.pi/2, 3*np.pi/4, np.pi]

print(np.degrees(l))
print(np.cos(l))
print(np.sin(l))
print(np.tan(l))

[  0.  45.  90. 135. 180.]
[ 1.00000000e+00  7.07106781e-01  6.12323400e-17 -7.07106781e-01
 -1.00000000e+00]
[0.00000000e+00 7.07106781e-01 1.00000000e+00 7.07106781e-01
 1.22464680e-16]
[ 0.00000000e+00  1.00000000e+00  1.63312394e+16 -1.00000000e+00
 -1.22464680e-16]


You can see that we don't even need to pass an `array` to these functions.  If we pass a `list`, it is automatically converted to an `array`, on which the function operates.  A NumPy `array` is always returned.

In [17]:
type(np.tan(l))

numpy.ndarray

Most NumPy functions can operate on an `array` with any dimensionality.

In [18]:
a = array([[1,2,3],[4,5,6]], int)

np.sum(a)

21
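Many reductions such as `np.sum()` also accept an optional `axis` argument that restricts the operation to one dimension. A short sketch using the same 2x3 `array`:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], int)

col_sums = np.sum(a, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(a, axis=1)  # collapse the columns: one sum per row

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```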

NumPy `array`s have a number of useful attributes that tell us about the array.  The `size` attribute contains the total number of elements in the `array`.  The `ndim` attribute contains the number of dimensions.  The `shape` attribute contains the number of elements along each dimension.

In [19]:
print(a, "\n")

print("a.size =",a.size)
print("a.ndim =",a.ndim)
print("a.shape =",a.shape)

[[1 2 3]
 [4 5 6]] 

a.size = 6
a.ndim = 2
a.shape = (2, 3)


The NumPy `arange()` function is similar to the built-in `range()` function.  It accepts up to three arguments (`start`, `stop`, and `step`) and returns an `array` of values in the half-open interval [`start`, `stop`), with spacing between values given by `step`.

More info: https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html#numpy.arange

In [20]:
a = np.arange(0, 10, 2)

a

array([0, 2, 4, 6, 8])

When using a non-integer step, it is often better to use the NumPy `linspace()` function.  It takes 3 arguments (`start`, `stop`, and `num`) and returns an `array` of `num` evenly spaced values, calculated over the interval [`start`, `stop`].

More info: https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html

In [21]:
a = np.linspace(0, 1, 11)

a

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
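The reason `linspace()` is often preferred for non-integer steps is that floating-point rounding can change how many values `arange()` returns. The sketch below (the values of `start`, `stop`, and `step` are chosen only for illustration) prints both results for comparison. It uses the `endpoint=False` option of `linspace()` to reproduce the half-open interval of `arange()`.

```python
import numpy as np

start, stop, step = 1.0, 1.3, 0.1

# With a floating-point step, the number of values arange() returns
# depends on rounding, so it may differ from what one expects.
stepped = np.arange(start, stop, step)
print(len(stepped), stepped)

# linspace() takes the number of points directly; endpoint=False
# excludes `stop`, mimicking the half-open interval of arange().
spaced = np.linspace(start, stop, 3, endpoint=False)
print(len(spaced), spaced)
```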

### Indexing and Slicing

Individual elements of a 1D `array` can be accessed just like those of a `list`.

In [22]:
a[2], a[-1]

(0.2, 1.0)

However, note the difference in the 2D case.

In [23]:
l = [[1,2,3],
     [4,5,6],
     [7,8,9]]
a = np.array(l, int)

print(l[1][1])   #2 square brackets
print(a[1,1])    #1 square bracket

5
5


NumPy `array`s also support slicing.

In [24]:
a = np.linspace(0, 1, 11)

a[1::2]

array([0.1, 0.3, 0.5, 0.7, 0.9])

We can slice a 2-dimensional array to obtain a particular column.

In [25]:
a = np.array([[ 0,  1,  2,  3,  4,  5],
              [10, 11, 12, 13, 14, 15],
              [20, 21, 22, 23, 24, 25],
              [30, 31, 32, 33, 34, 35],
              [40, 41, 42, 43, 44, 45]])

a[:,0]   #return values from all rows, column 0

array([ 0, 10, 20, 30, 40])

Or a particular row.

In [26]:
a[1,:]   #return values from row 1, all columns

array([10, 11, 12, 13, 14, 15])

Or any rectangular region.

In [27]:
a[2:4,1:4] #return values from rows 2 and 3, columns 1 through 3

array([[21, 22, 23],
       [31, 32, 33]])
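One point worth knowing when slicing: a basic slice of a NumPy `array` is a view onto the original data rather than a copy, so modifying the slice also modifies the original. A minimal sketch (the names `window` and `independent` are illustrative):

```python
import numpy as np

a = np.arange(10)

window = a[2:5]              # basic slicing returns a view onto a's data
window[0] = -1               # ...so this also changes a[2]
print(a[2])                  # -1

independent = a[2:5].copy()  # copy() creates a separate array
independent[0] = 100
print(a[2])                  # still -1
```

Calling `copy()` is the usual way to obtain a slice that can be changed without affecting the original.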

## Reshaping

NumPy `array`s can be "reshaped" using the `reshape()` method, provided the total number of elements stays the same.

In [28]:
b = a.reshape(15,2)

print("b:")
print(b,"\n")
print("b.shape =", b.shape)
print("b.ndim =", b.ndim)

b:
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [10 11]
 [12 13]
 [14 15]
 [20 21]
 [22 23]
 [24 25]
 [30 31]
 [32 33]
 [34 35]
 [40 41]
 [42 43]
 [44 45]] 

b.shape = (15, 2)
b.ndim = 2
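As a further sketch, one dimension passed to `reshape()` may be given as `-1`, in which case NumPy infers it from the total number of elements. A requested shape whose element count does not match raises a `ValueError` (the flag `reshape_failed` is only there to make the outcome explicit):

```python
import numpy as np

a = np.arange(30)

b = a.reshape(15, -1)    # -1 asks NumPy to infer this dimension
print(b.shape)           # (15, 2)

reshape_failed = False
try:
    a.reshape(4, 8)      # 4*8 = 32 entries, but a has only 30
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```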


## Iterating over an array

A `for` loop over a 2-dimensional `array` iterates over the rows.

In [29]:
a = np.arange(20)
a = a.reshape(4, 5)

print("a:")
print(a,"\n")

for y in a: #iterate over rows
    print("y =",y)

a:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]] 

y = [0 1 2 3 4]
y = [5 6 7 8 9]
y = [10 11 12 13 14]
y = [15 16 17 18 19]


To iterate over all entries in a 2-dimensional `array`, nest two `for` loops.

In [30]:
for y in a: #iterate over rows
    for x in y: #iterate over values within the row
        print(x)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19


Equivalently, using the `nditer()` function: https://numpy.org/doc/stable/reference/generated/numpy.nditer.html

In [31]:
for x in np.nditer(a):
    print(x)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
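Two related tools may also be useful here, shown in the sketch below on a small illustrative `array`: the `flat` attribute visits every entry in row-by-row order, and `np.ndenumerate()` additionally yields the index of each entry.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# flat visits every entry, row by row
values = [int(x) for x in a.flat]
print(values)

# ndenumerate yields (index, value) pairs
pairs = list(np.ndenumerate(a))
for index, value in pairs:
    print(index, value)
```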


## Indexing with a boolean mask

It is possible to select specific `array` entries using a boolean "mask."  

The mask should have the same number of entries as the `array`.  We can construct such a mask using a boolean expression.

In [32]:
a = np.array([5, 3, 2, 6, 8])
mask = a>=5 #the inequality on the right-hand side of the assignment statement is evaluated 
            #element-by-element, creating another array consisting of booleans

mask

array([ True, False, False,  True,  True])

Indexing the `array` with this mask returns a new `array` containing elements for which the corresponding mask entries are `True`.

In [33]:
a[mask]

array([5, 6, 8])
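Masks can also be combined. The sketch below uses the element-by-element operators `&` (and) and `~` (not) on the same `array`. The parentheses around each comparison are needed because `&` binds more tightly than the comparison operators.

```python
import numpy as np

a = np.array([5, 3, 2, 6, 8])

in_range = (a >= 3) & (a < 8)   # True where both conditions hold
print(a[in_range])              # [5 3 6]

outside = ~in_range             # ~ inverts a boolean mask
print(a[outside])               # [2 8]
```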

## Importing data with NumPy

Data can be imported into a NumPy `array` using the `loadtxt()` function.  Detailed info: https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html

In [34]:
path = '../Data/millikan.txt'
data = np.loadtxt(path, float)

print(data)

[[5.48740e+14 5.30900e-01]
 [6.93100e+14 1.08420e+00]
 [7.43070e+14 1.27340e+00]
 [8.21930e+14 1.65980e+00]
 [9.60740e+14 2.19856e+00]
 [1.18400e+15 3.10891e+00]]


Web-based data can also be imported.

In [35]:
url = 'https://raw.githubusercontent.com/jstupak/ComputationalPhysics/master/Data/millikan.txt'
data = np.loadtxt(url, float)

print(data)

[[5.48740e+14 5.30900e-01]
 [6.93100e+14 1.08420e+00]
 [7.43070e+14 1.27340e+00]
 [8.21930e+14 1.65980e+00]
 [9.60740e+14 2.19856e+00]
 [1.18400e+15 3.10891e+00]]
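When each column is needed separately, the `unpack=True` option of `loadtxt()` transposes the result so that the columns can be assigned to individual variables. The sketch below avoids depending on a file by reading from an in-memory `io.StringIO` object holding the first two rows printed above. The names `col0` and `col1` are placeholders, since the meaning of the columns is not stated here.

```python
import io
import numpy as np

# Stand-in for a file: the first two rows of the data printed above.
text = io.StringIO("5.4874e14 0.5309\n6.931e14 1.0842\n")

# unpack=True transposes the result so that each column can be
# assigned to its own variable.
col0, col1 = np.loadtxt(text, float, unpack=True)

print(col0)
print(col1)
```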


## Exercises

Consider the following `array`:

In [36]:
a = np.array([[ 4, -1,  4,  0,  0,  0],
              [ 5, -3,  3,  0,  0,  0],
              [ 3,  2, -3,  0,  0,  0],
              [ 0,  0,  0,  2,  3, -3],
              [ 0,  0,  0, -5,  4,  3],
              [ 0,  0,  0,  1,  3,  4]])

1. Create an `array` that contains the square of each element of `a`.

<details>
    <summary style="display:list-item">Click for solution</summary>

```python
np.square(a)
```
    
</details>

2. Create two 3x3 `array`s that contain the upper-left and bottom-right corners of `a`.  Use matrix multiplication to find the product of these `array`s.

<details>
    <summary style="display:list-item">Click for solution</summary>

```python
b = a[:3,:3]
c = a[-3:,-3:]

b, c, np.matmul(b,c)
```
    
</details>

3. Create an `array` that contains **integers** representing the absolute value of each element of `a`.

<details>
    <summary style="display:list-item">Click for solution</summary>

```python
np.array(np.fabs(a), int)
```
    
</details>

4. Replace every zero in `a` with a nine.

<details>
    <summary style="display:list-item">Click for hint</summary>
Create a boolean mask of the same dimensions as <code>a</code>, which holds <code>True</code> for each zero.  Then apply this mask by indexing.
</details>

<details>
    <summary style="display:list-item">Click for solution</summary>

```python
mask = a==0
a[mask] = 9

a
```
    
</details>