
<font size=6>Foundations of Biomedical Computing</font>

<font size=5>Worksheet #2a - NumPy Basics</font>

This assignment covers the basics of NumPy. NumPy is a fast array and linear algebra library that makes numerical data analysis much easier than working with built-in Python lists.

___

<font size=5>IMPORTANT</font>

The NumPy library may not be installed on your system. Run the following cells to make sure that NumPy is installed and imported for use in this notebook. Note that installation may take a minute or two.

In [None]:
# uncomment the command below and run if necessary
# pip install numpy 

In [3]:
import numpy as np

___

<font size=5>Section 1: Arrays</font>

___

Here are several ways to create NumPy arrays, both with and without initial data. The list below covers some, but not all, of the array creation functions in NumPy.
<br>
To generate arrays *without* initial data: 
- `numpy.empty(shape, dtype=..)` – allocated but not initialized (contents are arbitrary), by default dtype is float
- `numpy.zeros(shape, dtype=..)` – initialized with zeros, by default dtype is float
- `numpy.ones(shape, dtype=..)` – initialized with ones, by default dtype is float
- `numpy.arange(start, stop, step, dtype=..)` – initialized with sequential numbers from start up to, but not including, stop; by default dtype is inferred from the arguments (int for integer arguments)
- `numpy.linspace(v1, v2, nsam)` – initialized with nsam evenly spaced samples from v1 to v2, both endpoints included
- `numpy.eye(row, col, k=..)` – returns a 2-D array with ones on the k-th diagonal and zeros elsewhere, by default k=0
[//]: # (Hello)


To generate arrays *with* initial data:
- `numpy.array(data)`
[//]: # (Hello)
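
As a quick illustration of the creation functions listed above, the following sketch builds one small array with each of several of them. The shapes, values, and variable names are arbitrary choices for demonstration and are not part of the worksheet.

```python
import numpy as np

zeros = np.zeros((2, 3))                # 2x3 array of 0.0 (float64 by default)
ones = np.ones(4, dtype=int)            # 1D array of four integer ones
seq = np.arange(2, 10, 2)               # start=2, stop=10 (excluded), step=2
samples = np.linspace(0, 1, 5)          # 5 evenly spaced values, both ends included
shifted = np.eye(3, 3, k=1)             # ones on the first diagonal above the main one
from_list = np.array([[1, 2], [3, 4]])  # built from existing nested-list data

print(seq)       # [2 4 6 8]
print(samples)   # [0.   0.25 0.5  0.75 1.  ]
```

Note that `arange` excludes its stop value while `linspace` includes it, which is a common source of off-by-one surprises.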

As a refresher, every NumPy array also has the following attributes:
- Dimensions -> `array.ndim`
- Shape -> `array.shape`
- Size -> `array.size`
- Type -> `array.dtype`
- Item Size -> `array.itemsize`
- Data -> `array.data`
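
To make these attributes concrete, here is a small sketch that inspects a 3x4 float array. The array itself is an arbitrary example. `nbytes` is not in the list above and is shown only because it equals `size * itemsize`.

```python
import numpy as np

arr = np.zeros((3, 4), dtype=np.float64)

print(arr.ndim)      # 2        -> number of axes
print(arr.shape)     # (3, 4)   -> length along each axis
print(arr.size)      # 12       -> total number of elements
print(arr.dtype)     # float64  -> element type
print(arr.itemsize)  # 8        -> bytes per element
print(arr.nbytes)    # 96       -> size * itemsize
```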

1\. Using the Array creation methods from above, create a 1D numpy array of numbers from 0 to 9, *without* manually specifying the values.


In [7]:
array0to9 = np.arange(0,10, dtype = int)
print(array0to9)

[0 1 2 3 4 5 6 7 8 9]


2\. Create a 2D numpy array of zeros, with dimensions of 3x3.

In [13]:
array3x3 = np.zeros((3,3))
print(array3x3)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


3\. Create a 2D numpy array with a diagonal of ones, with dimensions of 5x5.

In [14]:
array5x5 = np.eye(5,5,k=0)
print(array5x5)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


4\. Create a 1D numpy array with the following values: 7, 3, 23, 456, 192, 254.

In [16]:
array1d = np.array([7, 3, 23, 456, 192, 254])
print(array1d)

[  7   3  23 456 192 254]


5\. Create a 1D numpy array of 50 evenly spaced values between 1 and 10.

In [22]:
array1d50 = np.linspace(1, 10, 50)  # 50 samples from 1 to 10, both endpoints included
print(array1d50[0], array1d50[-1], array1d50.size)

1.0 10.0 50


NumPy arrays also make it easy to apply an operation to every element that meets a condition. This technique, called boolean indexing, plays a role similar to a list comprehension with a filter for traditional Python lists. An example which replaces all odd numbers with -1 is as follows:
<br>
`arr[arr % 2 == 1] = -1`
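
The one-liner above can be unpacked into its two steps: the comparison produces a boolean array (a mask), and the mask then selects which elements to read or overwrite. The sketch below uses a 10-element array as an arbitrary example.

```python
import numpy as np

arr = np.arange(10)
mask = arr % 2 == 1          # boolean array, True where the element is odd
print(mask[:4])              # [False  True False  True]

odds = arr[mask]             # reading through a mask returns a new array (a copy)
arr[mask] = -1               # assigning through a mask modifies arr in place
print(odds)                  # [1 3 5 7 9]
print(arr)                   # [ 0 -1  2 -1  4 -1  6 -1  8 -1]
```

Because `odds` is a copy, it keeps the original odd values even after `arr` has been overwritten.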

6\. Using the technique from above, write an array operation that extracts all numbers divisible by 5 from an array of 100 numbers.

In [24]:
arr100 = np.arange(0,100)
print(arr100[arr100 % 5 == 0])

[ 0  5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95]


___

<font size=5>Section 2: Indexing and Slicing</font>

Indexing and slicing underlie most array operations. Fortunately, indexing and slicing NumPy arrays is quite similar to indexing and slicing standard Python lists.
<br>
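
Two differences from list slicing are worth keeping in mind. NumPy indexes all axes inside a single pair of brackets, and a basic slice of an array is a view onto the same data rather than a copy. The following sketch illustrates both points. The variable names and values are arbitrary.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
nested = grid.tolist()

# NumPy indexes both axes in one bracket; nested lists need two steps
print(grid[1, 2])      # 6
print(nested[1][2])    # 6

# A basic slice is a view: writing to it changes the original array
block = grid[:2, :2]
block[0, 0] = 99
print(grid[0, 0])      # 99

# .copy() gives independent data that can be changed safely
safe = grid[:2, :2].copy()
safe[0, 0] = -1
print(grid[0, 0])      # 99 (unchanged by the write to safe)
```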


7\. Create a 2D array with 4 rows and 5 columns and do the following to it:
- Create a subarray of the first two rows and first 3 columns
- Create a subarray of all rows, but every other column
- Create a subarray of every other row, but all columns
- Reverse all of the rows and all of the columns in the array
[//]: # (Hello)
Each of these can be done with one array slicing call.

In [52]:
arr45 = np.arange(20).reshape(4,5)
print(arr45)

sub1 = arr45[:2,:3]
print(sub1)

sub2 = arr45[:, ::2]
print(sub2)

sub3 = arr45[::2, :]
print(sub3)

arr45 = arr45[::-1, ::-1]
print(arr45)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[[0 1 2]
 [5 6 7]]
[[ 0  2  4]
 [ 5  7  9]
 [10 12 14]
 [15 17 19]]
[[ 0  1  2  3  4]
 [10 11 12 13 14]]
[[19 18 17 16 15]
 [14 13 12 11 10]
 [ 9  8  7  6  5]
 [ 4  3  2  1  0]]


___

<font size=5>Section 3: Resizing</font>

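Changing the shape of an array is a common step when working with array data. NumPy includes a function for this, `np.resize(a, new_shape)`, which returns a new array with the requested shape and repeats the data of `a` if the new shape needs more elements than `a` contains.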
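
`np.resize` is often confused with the array method `reshape`. The sketch below, using an arbitrary 6-element array, shows that the two agree when the element counts match and differ when they do not.

```python
import numpy as np

a = np.arange(6)

# Same number of elements: both give the same 2x3 result
same_count = np.resize(a, (2, 3))
reshaped = a.reshape(2, 3)
print(np.array_equal(same_count, reshaped))   # True

# More elements requested than available: np.resize repeats the data
grown = np.resize(a, (2, 4))
print(grown)   # rows are [0 1 2 3] and [4 5 0 1]

# reshape refuses a shape whose total size does not match
try:
    a.reshape(2, 4)
except ValueError as err:
    print("reshape failed:", err)
```

When the goal is only to rearrange existing elements, `reshape` is the safer choice, since a size mismatch raises an error rather than silently repeating data.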

8\. Using the resize method, convert the following arrays:
- Create an 8-element 1D array and convert it to a 2x4 2D array
- Create a 12-element 1D array and convert it to a 4x3 2D array
- Create a 60-element 1D array and convert it to a 3x4x5 3D array

In [54]:
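# The element counts match in each case, so np.resize repeats nothing
# and gives the same result that reshape would.
arr8 = np.arange(8)
arr8 = np.resize(arr8, (2, 4))
print(arr8)

arr12 = np.arange(12)
arr12 = np.resize(arr12, (4, 3))
print(arr12)

arr60 = np.arange(60)
arr60 = np.resize(arr60, (3, 4, 5))
print(arr60)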


[[0 1 2 3]
 [4 5 6 7]]
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]


___

<font size=5>Warmup for the Love Data Week Games</font>

Love Data Week is coming up soon - It's Feb 10-14!

There will be seminars and data-inspired activities in the real world, but we'll imagine a fictitious charity games day to practice NumPy skills.

<font size=5>The Love Data Week Games are coming up soon!</font>

Biomedical Data Scientists from all over the world are competing on the same day to build community and raise money for ventures that improve access to health care worldwide.  Two teams from St. Louis, the Bread-Sliced Bagels and the Toasted Raviolis, decide to enter.

9\. The teams begin training for the Running Race.

The task is to create an array of 1000 random integers between 0 and 9 (inclusive), and divide all of the integers by 2.

Team Bread-Sliced Bagels plans to use a regular Python function to do this.  Write code for Team Bread-Sliced Bagels that creates a normal Python list of 1000 random integers, then defines a function using *def* that uses a for loop to step through the list and divide each element by 2.  The function should return the new list.

In [79]:
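import random

# 1000 random integers from 0 to 9 (random.randint includes both endpoints)
bagels = [random.randint(0, 9) for _ in range(1000)]

def divide(arr: list[int]) -> list[float]:
    # Build a new list so the input stays unchanged across repeated %timeit runs
    halved = []
    for value in arr:
        halved.append(value / 2)
    return halved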


Team Toasted Ravioli plans to use NumPy to do this.  Write code for Team Toasted Ravioli that creates the array using NumPy, then uses a NumPy ufunc to divide all of the integers by 2.
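
For readers unfamiliar with ufuncs, the following sketch shows `np.divide` next to the `/` operator, which dispatches to the same ufunc for arrays. The three-element array is an arbitrary example. `np.floor_divide` is included only for contrast.

```python
import numpy as np

values = np.array([0, 3, 8])

halved = np.divide(values, 2)        # explicit ufunc call
same = values / 2                    # operator form, same element-wise division

print(halved)                        # [0.  1.5 4. ]
print(halved.dtype)                  # float64 (true division returns floats)
print(np.floor_divide(values, 2))    # [0 1 4]
```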

In [80]:
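# 1000 random integers from 0 to 9 (the upper bound of np.random.randint is exclusive)
ravioli = np.random.randint(0, 10, size=1000)

def npdivide(arr: np.ndarray) -> np.ndarray:
    # np.divide is a ufunc: it divides every element in one vectorized call
    return np.divide(arr, 2)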



Use %timeit to determine how fast each team completes the race.  Which team is faster?  Is this what you expect?
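
`%timeit` is an IPython magic and only works inside a notebook or IPython shell. Outside of one, a similar comparison can be sketched with the standard library `timeit` module, as below. The function names `halve_list` and `halve_array` and the run count are assumptions made for this illustration, and the measured times will vary by machine.

```python
import random
import timeit

import numpy as np

values = [random.randint(0, 9) for _ in range(1000)]
array = np.array(values)

def halve_list(data):
    # Pure-Python version: one interpreted division per element
    return [x / 2 for x in data]

def halve_array(data):
    # NumPy version: one vectorized division over the whole array
    return data / 2

runs = 1000
list_time = timeit.timeit(lambda: halve_list(values), number=runs)
array_time = timeit.timeit(lambda: halve_array(array), number=runs)

print(f"list:  {list_time / runs * 1e6:.2f} µs per call")
print(f"array: {array_time / runs * 1e6:.2f} µs per call")
```

Neither function modifies its input, so every timed call does the same amount of work.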

In [81]:
%timeit divide(bagels)
%timeit npdivide(ravioli)

53.8 μs ± 2.29 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
1.19 μs ± 13.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [82]:
# Team Toasted Ravioli is faster: in this run npdivide took about 1.19 μs per call versus about 53.8 μs for divide, roughly 45 times faster.
# This is expected: NumPy performs the element-wise loop in compiled code, while the Python for loop is interpreted one iteration at a time.