# Introduction to numpy

The name numpy is short for "numerical python".
In this script I will motivate numpy as well
as provide an introduction to its functionality.
Numpy is a huge library with tons of capabilities,
so there will be a lot we don't cover.
The numpy documentation is excellent though:
https://numpy.org/doc/stable/

There is also a really good quickstart guide
that covers many of the things we will see here too:
https://numpy.org/doc/stable/user/quickstart.html

To start with numpy you need to import it. The
line `import numpy as np` is the standard way
to do so. It just makes `numpy` available under the
shorter name `np` to save you some typing.
I'm also importing `time` so we can time things and `math` so that we can compare,
and finally I redefine the function from the previous script.

In [1]:
import time
import math
import numpy as np

def air_pressure_at_height(h):
    p0 = 101325      # reference pressure in pascals
    M = 0.02896968   # molar mass of air kg/mol
    g = 9.81         # gravity m/s2
    R0 = 8.314462618 # gas constant J/(mol·K) 
    T = 273          # temp in kelvin

    ratio = -(g * h * M) / (R0 * T)
    # NOTE: here I changed math.exp -> np.exp, 
    #       you will see why in a minute
    p_h = p0 * np.exp(ratio)
    return p_h

In [2]:
# Heights to calculate pressure at
start = 0
end = 20000
step = 1

In [3]:
t0 = time.time() 
h_list = range(start, end, step)
p_list = [] 

for height in h_list: 
    p_h = air_pressure_at_height(height)
    p_list.append(p_h) 

t1 = time.time()
base_python_time = t1-t0
print("With plain python this took:", base_python_time, " seconds")

With plain python this took: 0.04263567924499512  seconds


In [4]:
t0 = time.time()
h_array = np.arange(start, end, step)
p_array = air_pressure_at_height(h_array)
t1 = time.time()
numpy_time = t1-t0
print("With numpy this took:", numpy_time, " seconds")
print("Numpy version is ", base_python_time/numpy_time, " times faster")

With numpy this took: 0.0008068084716796875  seconds
Numpy version is  52.844858156028366  times faster


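Okay, so how did that work?
Numpy is an "array-based" library, meaning it defines an "array" type
along with functions that operate on whole arrays at once. That is why
`np.exp` could process all 20000 heights in a single call instead of a Python loop.
Here you can see that `h_array` is an `ndarray`, which stands for
N-dimensional array. In our case N=1. We can also look at the shape.
NOTE: The length of the shape is always equal to the number of dimensions.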

In [5]:
print(type(h_array))
print(h_array.ndim)
print(h_array.shape) 
print(len(h_array.shape) == h_array.ndim)

<class 'numpy.ndarray'>
1
(20000,)
True
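
To make the switch from `math.exp` to `np.exp` more concrete, here is a minimal sketch. The `heights` array and the divisor `8000.0` are made up for illustration and are not part of the pressure function above. It shows that `np.exp` is applied to every element of an array, while `math.exp` only accepts a single number.

```python
import math
import numpy as np

# Hypothetical sample heights in meters, for illustration only
heights = np.array([0.0, 1000.0, 2000.0])

# np.exp works element by element and returns an array of the same shape
# (8000.0 is an arbitrary scale chosen for this sketch)
scaled = np.exp(-heights / 8000.0)
print(scaled.shape)

# math.exp expects a single number, so a multi-element array raises TypeError
try:
    math.exp(-heights / 8000.0)
    math_exp_failed = False
except TypeError as err:
    math_exp_failed = True
    print("math.exp failed:", err)
```

This is the reason the function had to use `np.exp` before we could hand it `h_array` directly.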


# What else can you do with numpy? Basically anything with numbers!

In [6]:
array_shape = (5, 5)

# Create an array of all ones with a specific shape
ones_matrix = np.ones(array_shape)
print(ones_matrix)
print(ones_matrix.shape)
print()

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
(5, 5)



# Numpy also makes math easier to do: you can multiply any ndarray object by a number directly

In [7]:
print(0.1 * ones_matrix)
print()

[[0.1 0.1 0.1 0.1 0.1]
 [0.1 0.1 0.1 0.1 0.1]
 [0.1 0.1 0.1 0.1 0.1]
 [0.1 0.1 0.1 0.1 0.1]
 [0.1 0.1 0.1 0.1 0.1]]
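
The same idea extends to arithmetic between two arrays of the same shape. The small arrays `a` and `b` below are made up for this sketch; each operator is applied element by element.

```python
import numpy as np

# Two small example arrays with the same shape (illustrative values)
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

added = a + b        # element-wise sum
multiplied = a * b   # element-wise product (this is not a dot product)
squared = a ** 2     # each element squared
shifted = 2 * a - 1  # scalars are applied to every element

print(added)
print(multiplied)
print(squared)
print(shifted)
```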



# You can also operate on arrays with other numpy functions

First we'll cover some handy functions:

In [8]:
print(np.sum(ones_matrix))
print(np.sum(ones_matrix, axis=1))

# Create an array from a list
sample_1d = np.array([0,1,2])
sample_2d = np.array(
    [[0,1,2],
     [3,4,5]]
)
print(sample_1d)
print()
print(sample_2d)
print()  # prints a blank line between the results
print(np.sum(sample_2d, axis=1))

25.0
[5. 5. 5. 5. 5.]
[0 1 2]

[[0 1 2]
 [3 4 5]]

[ 3 12]
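
The `axis` argument can be confusing at first, so here is a small sketch using the same values as `sample_2d`. The axis you pass is the one that gets collapsed: `axis=0` sums down the rows (one result per column) and `axis=1` sums across the columns (one result per row).

```python
import numpy as np

sample_2d = np.array([[0, 1, 2],
                      [3, 4, 5]])

col_sums = np.sum(sample_2d, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(sample_2d, axis=1)  # collapse the columns: one sum per row
total = np.sum(sample_2d)             # no axis: sum of every element

print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
print(total)     # 15
```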


In [9]:
# Create an array of all zeros
zeros = np.zeros(array_shape)
print(zeros)
print()

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]



In [10]:
# Arange is just like the regular `range` function
# but produces an array rather than a `range` object
sequence_1 = np.arange(start, end, step)
print(sequence_1)
print()

[    0     1     2 ... 19997 19998 19999]



Linspace is the counterpart to `arange`. Use it if you know
how many numbers you need rather than the step size.
Here, we get 5 evenly spaced values from 0 to pi.
NOTE: Unlike `arange`, linspace includes the end point.

In [11]:
sequence_2 = np.linspace(0, np.pi, 5)
print(sequence_2)
print()

[0.         0.78539816 1.57079633 2.35619449 3.14159265]
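
To see the end point difference side by side, here is a short sketch over the interval 0 to 1 (the interval and variable names are just for illustration).

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)  # step size given, stop value excluded
by_count = np.linspace(0, 1, 5)  # number of points given, stop value included

print(by_step)   # [0.   0.25 0.5  0.75]
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
```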



# You can create random numbers too

In [12]:
random_matrix = np.random.random(array_shape)
print(random_matrix)

[[0.32872669 0.74190436 0.89967422 0.70792172 0.85745503]
 [0.16636774 0.73049498 0.01429831 0.69264688 0.85787575]
 [0.10928469 0.89127583 0.71306694 0.47980016 0.63634163]
 [0.49846996 0.1808572  0.99243232 0.23666436 0.12287594]
 [0.88821612 0.33983191 0.07941638 0.48600274 0.66432614]]
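
Your numbers will differ from the ones printed above every time you run the cell. If you want repeatable results, one option (not used in the cell above) is numpy's `default_rng` with a fixed seed. The seed value 42 here is arbitrary.

```python
import numpy as np

# Two generators created with the same seed (42 is an arbitrary choice)
rng = np.random.default_rng(42)
first = rng.random((2, 3))

rng_again = np.random.default_rng(42)
second = rng_again.random((2, 3))

# Same seed, same sequence of random numbers
print(np.array_equal(first, second))
```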


# Numpy is really good for linear algebra

In [13]:
vector_shape = array_shape[0]
random_vector = np.random.random(vector_shape)
random_matrix_2 = np.random.random(array_shape)

# For example the dot product   
random_dot = np.dot(random_vector, random_matrix_2)

# Or matrix multiplication
random_matmul = np.matmul(random_matrix, random_matrix_2)
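
Since the random values make the results hard to check by eye, here is a small sketch with hand-picked numbers (the `vector` and `matrix` values are made up). It also shows the `@` operator, which is shorthand for matrix multiplication.

```python
import numpy as np

vector = np.array([1.0, 2.0])
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])

# Vector times matrix: [1*1 + 2*3, 1*2 + 2*4]
vec_times_mat = np.dot(vector, matrix)
print(vec_times_mat)  # [ 7. 10.]

# Matrix times matrix gives another (2, 2) matrix
mat_times_mat = np.matmul(matrix, matrix)
print(mat_times_mat.shape)

# The @ operator gives the same result as np.matmul
same_result = np.array_equal(matrix @ matrix, mat_times_mat)
print(same_result)
```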

# And there are a ton of other operations. You can usually find what you need by searching online for "numpy (whatever function you want)"

In [14]:
np.exp(random_matrix)
np.sin(random_matrix)
np.log(random_vector)
np.max(random_matmul)

2.655511462634199

# Finally, let's talk about slicing and indexing
# Like lists you can index arrays directly

In [15]:
sequence_1[0]
sequence_1[-1]

# You can also "slice" which is basically indexing
# multiple values. The syntax for slicing is like so
start = 0
stop = 5
step = 2
print(sequence_1[start:stop:step])

# Additionally you can create a `slice` object which
# can be convenient because you can use it multiple times
my_slice = slice(start, stop, step)
print(sequence_1[my_slice])
print(sequence_2[my_slice])

[0 2 4]
[0 2 4]
[0.         1.57079633 3.14159265]
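
You don't have to spell out all three parts of a slice. As a quick sketch (the array `seq` is just an example), any part you leave out falls back to a default, and negative numbers count from the end.

```python
import numpy as np

seq = np.arange(10)

first_three = seq[:3]     # start defaults to the beginning
every_other = seq[::2]    # start and stop default to the whole array
last_two = seq[-2:]       # negative indices count from the end
reversed_seq = seq[::-1]  # a negative step walks backwards

print(first_three)
print(every_other)
print(last_two)
print(reversed_seq)
```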


# For multidimensional arrays indexing works a bit differently

In [16]:
# Reshape a 1-dimensional array into a 2-dimensional (3, 3) array
increasing_matrix = np.arange(0, 9).reshape((3, 3))
print(increasing_matrix)
# Indexing a 2-dimensional array with a single index returns a whole row
print()
# Get the first row
print(increasing_matrix[0])
print()
# Get the value in row 1, column 2 (indices start at 0)
print(increasing_matrix[1,2])
print()
# Get the first column, the `:` means "everything"
# It is equivalent to `slice(None, None, None)`
print(increasing_matrix[:, 0])

[[0 1 2]
 [3 4 5]
 [6 7 8]]

[0 1 2]

5

[0 3 6]
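
You can also mix single indices and slices, one per dimension. Here is a small sketch on the same 3x3 layout (the variable names are just for illustration).

```python
import numpy as np

m = np.arange(0, 9).reshape((3, 3))

center = m[1, 1]         # row 1, column 1: the middle value of the matrix
last_row = m[-1]         # negative indices work per dimension too
top_right = m[0:2, 1:3]  # rows 0 and 1, columns 1 and 2

print(center)
print(last_row)
print(top_right)
```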


# Arrays can have many dimensions (the limit is 64 as of NumPy 2.0; older versions allow up to 32)

In [17]:
length = 3
dim = 5
array_5d = np.arange(0, length**dim).reshape(dim * (length,))
print(array_5d.shape, array_5d.ndim)

(3, 3, 3, 3, 3) 5


Numpy also features some rudimentary ways of 
reading data from files. This is how you'll complete
your forecasting assignment. I've downloaded the daily
streamflow in cubic feet per second for the last thirty 
days (ending Sept 10) and placed it in the `data` directory.
But because I posted it on GitHub we can open it directly
over the internet.

In [18]:
filename = ('https://raw.githubusercontent.com/HAS-Tools-Fall2022'
            '/Course-Materials22/main/data/verde_river_daily_flow_cfs.csv')
flows = np.loadtxt(
    filename,           # The location of the text file
    delimiter=',',      # character which splits data into groups
    usecols=1           # Just take column 1 (the second column), which is the flows
)
print(flows)

[136.  281.  234.  181.  224.  215.  209.  236.  645.  579.  417.  372.
 295.  308.  395.  357.  276.  253.  202.  165.  140.  119.  102.   86.7
  84.9  73.8  67.   63.   57.1  55.1]
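
If you want to try `np.loadtxt` without an internet connection, here is a minimal sketch that reads from an in-memory string instead of a URL. The file layout (a date column followed by a flow column) and the dates themselves are assumptions made for illustration; only the three flow values are taken from the end of the output above.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: assumed layout is date,flow
csv_text = "2022-09-08,63.0\n2022-09-09,57.1\n2022-09-10,55.1\n"

# np.loadtxt accepts a file-like object as well as a filename
flows = np.loadtxt(io.StringIO(csv_text), delimiter=',', usecols=1)

# Once the data is an array, the usual numpy tools apply
mean_flow = flows.mean()
max_flow = flows.max()
last_two = flows[-2:]

print(flows.shape)
print("mean flow:", mean_flow)
print("max flow:", max_flow)
print("last two days:", last_two)
```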
