
----

# NumPy Arrays



---

### Table of Contents


1 - [NumPy Arrays](#section1)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.1 - [Creating NumPy Arrays](#subsection1)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.2 - [Basic Operations with NumPy Arrays](#subsection2)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.3 - [Indexing & Slicing NumPy Arrays](#subsection3)<br>



---
## NumPy Arrays <a id='section1'></a>

NumPy (short for "Numerical Python") is a Python library that allows us to work with arrays and easily process large amounts of numerical data. Because it is a separate library, we must import it before we can use it!

In order to save us some typing time we can give our libraries a shorter alias, like <b>np</b> for NumPy.

In [None]:
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
from jupyter_utils import *

### 1.1 Creating NumPy Arrays  <a id='subsection1'></a>


A NumPy array is just a table of data of the same type. From [NumPy.org](https://numpy.org/doc/stable/user/absolute_beginners.html): NumPy arrays can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.


In order to use NumPy you can either convert data you already have into a NumPy array or create a blank array from scratch. NumPy arrays and Python lists are similar, yet they react differently to various operations.

In [None]:
#Create an array
array1 = np.array([1,2,3,4])

#Create a list
list1 = [1,2,3,4]

array1, list1
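
To illustrate what "data of the same type" means, here is a small sketch (the names `mixed` and `mixed_list` are made up for this example). When we mix integers and a float, NumPy converts every element to a float, while a Python list keeps each element's own type.

```python
import numpy as np

# Mixing integers and a float: NumPy stores every element as a float
mixed = np.array([1, 2, 3.5])
print(mixed)
print(mixed.dtype)  # float64

# A Python list keeps each element's own type
mixed_list = [1, 2, 3.5]
print([type(v).__name__ for v in mixed_list])  # ['int', 'int', 'float']
```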

Try doing the following operations and see if you notice any differences!

array1 + array1

list1 + list1

array1 * array1

list1 * list1

In [None]:
# EXERCISE




print(...) # add two arrays
print(...) # add two lists

In [None]:
# EXERCISE

# multiply two arrays
# multiply two lists

print(...)
print(...)

In [None]:
# EXAMPLE

# Create a new list

list_of_numbers = [0, 1, 2, 3, 4, 5]
list_of_numbers

In [None]:
# Example

# Verify that it is indeed a regular Python list

type(list_of_numbers)

In [None]:
# EXAMPLE

#Create a new Numpy array from our list and display it

array_from_list = np.array(list_of_numbers)
array_from_list

In [None]:
# EXERCISE

# Verify that it's an array and not a list

...

Usually, we get arrays from our data. If we don't yet have any data or just want a placeholder for our data, we can create a NumPy array filled with ones by calling np.ones() and specifying the size of the array in a tuple, such as (3, 4) for 3 rows and 4 columns.

*A tuple is another data type in Python. It's similar to a list, but it uses parentheses instead of square brackets, and unlike a list, a tuple cannot be changed after it is created.*
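
As a quick sketch of how tuples behave (the variable name `shape` is just an illustration), we can read the values in a tuple, but trying to change one raises an error. The `shape` attribute of a NumPy array is itself a tuple.

```python
import numpy as np

shape = (3, 4)             # a tuple: 3 rows, 4 columns
print(shape[0], shape[1])  # reading values works just like with a list

try:
    shape[0] = 5           # changing a value is not allowed
except TypeError as err:
    print("Tuples cannot be changed:", err)

# The shape of an array is reported as a tuple
print(np.ones(shape).shape)  # (3, 4)
```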

In [None]:
# EXAMPLE

#Create an array of size 3x4 and fill it with ones

ones_array = np.ones((3, 4))
ones_array

We can also create arrays filled with zeros.

In [None]:
# EXERCISE

# Create a 3x4 array of zeros
# It will have a similar syntax, but instead we need to call 
# a function "zeros", not "ones" from our NumPy library

zeros_array = ...
zeros_array

In [None]:
import numpy as np

In [None]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x)

Similarly we can create NumPy arrays filled with any other value. To accomplish this we use np.full and specify both the size of the array and the value we would like to fill that array with.

In [None]:
# EXAMPLE

array_of_twos = np.full((3, 4), 2)
array_of_twos

In [None]:
# EXERCISE

# Create a 2x5 array of halves: "0.5" or "1/2"

array_of_halves = ...
array_of_halves

In [None]:
# EXAMPLE

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
# add array x to array y and store it in a result array
result = np.add(x,y)
result

In [None]:
# EXERCISE

# Modify the above code to subtract array x from array y

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

result = ...
result

In the cell below, create a 4x3 array filled with Pi.

**Hint:** Similar to the math library, you can use the NumPy library to get the value of pi.

In [None]:
# EXERCISE

array_of_pies = ...
array_of_pies

If we want NumPy to fill the array with random numbers between zero and one, we can use np.random.rand().
Although counterintuitive, for this function we do not need to put the size of the array in a tuple. Instead we just give NumPy the dimensions of the array directly.  

In [None]:
# EXAMPLE

np.random.rand(2,3)
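
The difference in how the size is passed can be seen side by side in the sketch below (the names `from_tuple` and `from_args` are made up for this example). Both calls produce an array with 2 rows and 3 columns, and the random values fall between zero and one.

```python
import numpy as np

from_tuple = np.ones((2, 3))      # size passed as one tuple
from_args = np.random.rand(2, 3)  # dimensions passed as separate arguments

print(from_tuple.shape, from_args.shape)          # (2, 3) (2, 3)
print(from_args.min() >= 0, from_args.max() < 1)  # True True
```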

### 1.2 Basic Operations with NumPy Arrays  <a id='subsection2'></a>


NumPy is much more than just an array creator. It lets us perform an operation on a whole array at once, and this is typically much faster than looping over a Python list to do the same work.
For example, let's say that we have an array of one million random probabilities of it raining on a particular day.

In [None]:
# EXAMPLE

random_million = np.random.rand(1000000, 1)
random_million

Let's check that the array is actually one million numbers long. 

**Hint:** Just like with most other data structures (lists) and some data types (strings), you can use the traditional Python len( ) function to check the *length* of your object.

In [None]:
# EXERCISE


Yep, that checks out.<br> <br>Now let's say that we want these probabilities to be percentages (out of 100) rather than proportions (from zero to one). We can just multiply the whole array by 100!

In [None]:
# EXAMPLE

percentages = random_million * 100
percentages

Notice that NumPy accomplishes that multiplication in a fraction of a second when we use it with arrays. That's a million multiplications! In the cell below you can see how much longer the code is for doing the same operations on a list. 

**Note:** We cut our list down to only 10,000 values, because running a for-loop over 1 million values can take a long time. 

In [None]:
# EXAMPLE

random_mln_lst = list(random_million[:10000])
percentages_lst = [] 

for i in random_mln_lst:
    percentages_lst.append(i * 100)
    
print(percentages_lst)    
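
If you are curious about the actual speed difference, here is a rough sketch that times both approaches with Python's `time.perf_counter`. The variable names are made up for this example, and the exact timings will differ from computer to computer, so treat the numbers as a rough guide only.

```python
import time
import numpy as np

values = np.random.rand(10000)

# Time the multiplication on the whole array at once
start = time.perf_counter()
array_result = values * 100
array_seconds = time.perf_counter() - start

# Time the same multiplication with a for-loop over a list
start = time.perf_counter()
list_result = []
for v in list(values):
    list_result.append(v * 100)
list_seconds = time.perf_counter() - start

print("array:", array_seconds, "seconds")
print("list: ", list_seconds, "seconds")

# Both approaches give the same numbers
print(np.allclose(array_result, list_result))  # True
```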

Let's see if we can get NumPy to at least break a sweat doing multiplications.

In [None]:
# EXAMPLE

# 100 million multiplications
np.random.rand(100000000) * 100

NumPy is really fast! Not to mention that it first needs to come up with the random numbers in the array, and only then can it do the multiplications we are asking it to do. That's pretty useful if you want to analyze a huge amount of data!

Let's see what other operations we can do with NumPy arrays.

Can it add the same number to all the elements of our array? How about subtracting it? Even dividing by it?

In [None]:
# EXAMPLE

plus_fifty = random_million + 50
plus_fifty

How about if we want to divide each value by 2?

In [None]:
# EXERCISE

divided_by_two = ...
divided_by_two

We can even do those arithmetic operations between two arrays if they are of the same size!

In [None]:
# EXAMPLE

sum_of_arrays = divided_by_two + plus_fifty
sum_of_arrays
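
Here is a small sketch of what "the same size" means in practice, using made-up arrays `a`, `b`, and `c`. Arrays with the same number of elements are combined element by element, while adding an array of 3 values to an array of 4 values raises an error.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

print(a + b)  # [11 22 33]
print(a * b)  # [10 40 90]

# An array with a different number of elements cannot be added to a
c = np.array([1, 2, 3, 4])
mismatch_failed = False
try:
    a + c
except ValueError as err:
    mismatch_failed = True
    print("Sizes do not match:", err)
```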

In [None]:
# EXAMPLE

x = np.array([2, 3, 4])
np.exp(x)

In [None]:
# EXERCISE

#Modify the above code to calculate the square root of x.
x = np.array([2, 3, 4])
...

### 1.3 Indexing & Slicing NumPy Arrays  <a id='subsection3'></a>


Just like we did with Python lists, if we ever need to retrieve a value at a particular index in a NumPy array, we can put the index in square brackets, as in array[num].

In [None]:
# EXAMPLE

print(array_from_list)
print("Value at index 0 is:", array_from_list[0])

Try to retrieve a value at index **3** of our array.

In [None]:
# EXERCISE

print("Value at index 3 is:", ...) 

We can also get a "slice" of numbers just like we would from a list. Remember that a slice [start:stop] includes the value at start but stops just before stop, so the last index returned is stop - 1!

In [None]:
# EXAMPLE

array_from_list[2:5]
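
The stop index rule can be checked on a small made-up array (`numbers` is just an illustration). The slice `[1:4]` returns the values at indices 1, 2, and 3, so it contains stop minus start values.

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50, 60])

middle = numbers[1:4]  # indices 1, 2 and 3 (index 4 is not included)
print(middle)          # [20 30 40]
print(len(middle))     # 3
```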

Now how would you return all the values starting with index 2 (skipping values at indices 0 and 1)?

In [None]:
# EXERCISE

...

You can think of arrays as tables. If your array has more than one column per row, we just use a comma between the index of the first dimension (row) and the index of the second dimension (column). The indexing and slicing works exactly the same as before, but we can do it separately for rows and columns.

In [None]:
# EXAMPLE

# Note: despite its name, this array has 5 rows and 3 columns
three_by_five = np.random.rand(5, 3)
three_by_five

In [None]:
# EXAMPLE

three_by_five[0, 0]
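
Because random numbers are hard to tell apart, here is the same idea on a small made-up array called `grid`, where it is easy to see which value each pair of indices picks out.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.shape)  # (2, 3): 2 rows and 3 columns
print(grid[0, 1])  # 2: row 0, column 1
print(grid[1, 0])  # 4: row 1, column 0
```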

Now let's try to retrieve the value from the bottom right. 

**Hint:** Remember, first we input rows, then columns. Also, don't forget that Python starts counting from *zero*.

In [None]:
# EXERCISE

...

Just like with one-dimensional arrays and lists, we can slice arrays that have multiple columns. The same rules apply: first we input the desired rows, then the desired columns. In the cell below, we are asking for our array to output values on row 0, columns 1 through 3 (our array only has columns 0 through 2, so we get columns 1 and 2; slicing past the end does not raise an error). 

In [None]:
# EXAMPLE

three_by_five[0, 1:4]
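
The sketch below shows the difference between slicing and indexing past the end, using a made-up 5-row, 3-column array called `table`. A slice simply stops at the last column, but asking for a single column that does not exist raises an error.

```python
import numpy as np

# 5 rows and 3 columns holding the numbers 0 through 14
table = np.arange(15).reshape(5, 3)

# Slicing past the last column just returns the columns that exist
print(table[0, 1:4])  # [1 2]

# Asking for one specific column that does not exist is an error
index_failed = False
try:
    table[0, 3]
except IndexError as err:
    index_failed = True
    print("No column 3:", err)
```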

In [None]:
# EXAMPLE - output rows 1 through 4, column 0

three_by_five[1:, :1]

Now how would we output only the last values of all rows?

In [None]:
# EXERCISE
...


Let's have some fun with arrays.

We have already loaded some data for you about the sunrise and sunset times in Washington, DC.

The following arrays give the average time of sunrise and the average time of sunset for each month. The first row of each array gives the hour, and the second row gives the minutes. 

In [None]:
print(dc_sunrise)
print(dc_sunset)

So you can see that in January in DC, the average time of sunrise was 7:23. Let's say that we want to figure out the average number of hours of sunlight in DC for each month. How would we do that? 

In [None]:
# Example

# we have both hours and minutes, let's write both rows in terms of hours.

dc_sunrise[1,:] = dc_sunrise[1,:] / 60
dc_sunset[1,:] = dc_sunset[1,:] / 60

print(dc_sunrise)
print(dc_sunset)

Wait, what just happened here? Why did all the minutes go to zero?
It's because we've been using arrays of integers: when the results of the division were stored back into the integer arrays, the fractional parts were dropped, so every value was rounded down to zero. Fortunately, we can just reload the data and try again, this time loading the values as floats. We can check the data type of an array using its dtype attribute.

In [None]:
print(dc_sunrise.dtype)
print(dc_sunset.dtype)
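
Here is a minimal sketch of the same effect on a small made-up array of minutes (the names `minutes` and `as_floats` are not part of the loaded data). The division itself produces fractions, but storing the results back into an integer array drops everything after the decimal point. An array of floats keeps the fractions.

```python
import numpy as np

minutes = np.array([23, 45, 10])
print(minutes.dtype)  # an integer type
print(minutes / 60)   # the division itself gives fractions

# Storing the fractions back into the integer array drops the decimals
minutes[:] = minutes / 60
print(minutes)        # [0 0 0]

# An array of floats keeps the fractions
as_floats = np.array([23, 45, 10], dtype=float)
as_floats[:] = as_floats / 60
print(as_floats)
```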

In [None]:
dc_sunrise = np.float32(dc_sunrise_orig)
dc_sunset = np.float32(dc_sunset_orig)

print(dc_sunrise.dtype)
print(dc_sunset.dtype)

Now that we have arrays of floats we can do the division without a problem. 

In [None]:
dc_sunrise[1,:] = dc_sunrise[1,:] / 60
dc_sunset[1,:] = dc_sunset[1,:] / 60

print(dc_sunrise)
print(dc_sunset)

In [None]:
# Example

# let's make the sunset hours in military time. 

dc_sunset[0,:]+=12

print(dc_sunset)

In [None]:
# Example
# Now we can calculate the total number of hours 
# after midnight it is for the sunrise and the sunset using 
# the sum function. 


dc_sunrise_hrs = np.sum(dc_sunrise,0)
dc_sunset_hrs = np.sum(dc_sunset,0)

print(dc_sunrise_hrs)
print(dc_sunset_hrs)


The `np.sum` function accepts several inputs. Here we passed the array we wanted to sum and the axis to sum along. We passed 0 to add the rows together, which gives one total per column (one per month).
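
To see what the second input does, here is a sketch on a small made-up array called `times`, with hours in the first row and fractions of an hour in the second row. Passing 0 adds the rows together, while passing 1 adds up the values within each row.

```python
import numpy as np

times = np.array([[7, 6, 5],
                  [0.5, 0.25, 0.75]])

# 0 adds the rows together: one total per column
per_column = np.sum(times, 0)
print(per_column)  # 7.5, 6.25 and 5.75

# 1 adds up each row: one total per row
per_row = np.sum(times, 1)
print(per_row)     # 18.0 and 1.5
```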

In [None]:
# Example
# Find the number of hours of daylight

dc_daylight = dc_sunset_hrs - dc_sunrise_hrs 
dc_daylight

In [None]:
# Example 
#Let's find the mean number of hours of sunlight for DC

dc_mean_daylight = np.mean(dc_daylight)
print(dc_mean_daylight, 'hours')

dc_range = np.max(dc_daylight) - np.min(dc_daylight)
print('range: ', dc_range)

We also have sunrise and sunset data for 5 other cities: Berlin, Germany; Canberra, Australia; Beijing, China; Helsinki, Finland; and Manila, Philippines. The other arrays are named `berlin_sunrise, berlin_sunset, canberra_sunrise, canberra_sunset, beijing_sunrise, beijing_sunset, helsinki_sunrise, helsinki_sunset, manila_sunrise, manila_sunset`.

Have each person at your table pick a different city and use the cells below to figure out the average number of hours of daylight in that city, so you can compare. 

Which city had the most hours of daylight? 
Type your answer here:


Let's say that you want to model the path of a bullet shot into the air, like we did in the last notebook. This is way easier to do using NumPy arrays. 

In [None]:
#Exercise

ts = np.linspace(0,40,41) #(This function creates an array of numbers 
                         #that is 41 numbers long from 0 to 40 (inclusive))
x_0 = 0
v_0 = 200
a = -9.8

xs = ...

print(xs)
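
If `np.linspace` is new to you, here is a smaller sketch with made-up numbers. The three inputs are the first value, the last value, and how many evenly spaced values we want in total (both ends are included).

```python
import numpy as np

# 5 evenly spaced numbers from 0 to 1, including both ends
points = np.linspace(0, 1, 5)

print(points)       # 0, 0.25, 0.5, 0.75 and 1
print(len(points))  # 5
```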

In [None]:
# you haven't learned plotting yet so we did it for you
plt.plot(ts, xs)
plt.xlabel('time (s)')
plt.ylabel('distance (m)')

You and your friends have decided to go on a hike. We have already imported an array called `contour`, which contains the information for a contour map giving the elevations of the area, and a list called `hiking_path`, which contains the pixels representing the path you and your friends will take. See the plot below.

In [None]:
# You haven't learned how to plot yet, so we did it for you. 

plot_path = np.zeros(contour.shape)
for ii in range(len(hiking_path)):
    plot_path[hiking_path[ii]] = 1

fig = plt.figure(frameon=False)
# Draw the path first, then overlay the semi-transparent elevation map
im1 = plt.imshow(plot_path, cmap=plt.cm.gray, interpolation='nearest')
im2 = plt.imshow(contour, cmap=plt.cm.viridis, alpha=.9, vmin=1400, interpolation='bilinear')
plt.colorbar()
plt.axis('off')  # hide the axes before the figure is displayed
plt.show()

We can use `contour` and `hiking_path` to find the elevation for each step of the path. For example, if we wanted to find the elevation (in feet) of the first part, we would use the code in the next cell. 

In [None]:
# EXAMPLE
contour[hiking_path[0]]
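
The cell above works because a 2D array can be indexed with a (row, column) tuple. The sketch below assumes that each entry of `hiking_path` is such a tuple. The array `elevation_grid` and the tuple `step` are made up for this example and are not the real hiking data.

```python
import numpy as np

# A made-up 2x3 grid of elevations (not the real contour data)
elevation_grid = np.array([[100, 110, 120],
                           [130, 140, 150]])

step = (1, 2)  # a hypothetical pixel on a path: row 1, column 2

# Indexing with the tuple is the same as writing the row and column directly
print(elevation_grid[step])  # 150
print(elevation_grid[1, 2])  # 150
```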

In [None]:
# EXERCISE
# Use what you know about numpy arrays and loops to figure out 
# the maximum change in altitude on the hike ( The maximum minus the minimum elevations.)

elevations = np.zeros((len(hiking_path),1))

for ii in ... :
    elevations[ii]= ...
    
max_change = ...

print(max_change)

---
Notebook developed by: Kseniya Usovich, Baishakhi Bose, Alisa Bettale, Laurel Hales