<h1>ATM S 541: LECTURE 2</h1>

<h2>Data Analysis Using Numpy</h2>

01/29/2020

<h3>2.0 Logistics</h3>

Note that the assignment at the end of the previous document will be due <b>by the start of class on Friday</b>.

Our first weather discussion will be during the first 10-15 minutes of class on Friday!

<h3>2.1 Modules and Functions</h3>

A <b>module</b> in Python is simply a Python source file (i.e., a file that ends in .py).

Modules then contain <b>functions</b>, which can be run separately from the rest of the module if desired.

In [None]:
# Let's try importing the cosine function from the "math" module:

from math import cos

cos(3.14159)

In [None]:
# If we wanted to actually get pi in here, we might as well just import the whole "math" module:

import math

math.cos(math.pi)

In [None]:
# If you don't want to type "math" every time, you can import a module as whatever name you'd like:

import math as m

m.cos(m.pi)

<h3>2.2 Numpy</h3>

Numpy is the core scientific library in Python and enables the use of arrays.

<h4>2.2.1 Creating Arrays</h4>

An array in numpy is just a grid of values (i.e., a matrix).

The dimensionality of the array is called its <b>rank</b>.

The number of entries in each dimension of the array is called its <b>shape</b>.

In [None]:
# Any and all numpy operations require this line (and we often shorten "numpy" for convenience to "np").
# Note that importing numpy once will keep it around for the entire Jupyter Notebook.

import numpy as np

# First, create an array with rank 1.

a = np.array([1,2,3])
print(a)

# Next, we can see what "type" looks like for an array variable.

print(type(a))

In [None]:
# The shape of the array is easy to output.

print(a.shape)

In [None]:
# Likewise, we can print individual members of the array, remembering that indexing starts at 0 in Python!

print(a[1])

In [None]:
# We can also very simply replace a particular value within the array.

a[0] = 3
print(a)

In [None]:
# To create an array with rank 2, we use the following code:

b = np.array([[1,2,3],[4,5,6]])
print(b.shape)
print(b[0,0],b[0,1],b[1,0])
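
As a quick check on the vocabulary above, here is a minimal sketch using the array attributes <b>ndim</b>, <b>shape</b>, and <b>size</b>. Note that numpy's own <b>size</b> attribute is the total number of elements, while the per-dimension counts live in <b>shape</b>.

```python
import numpy as np

# A rank 2 array with 2 rows and 3 columns.
b = np.array([[1, 2, 3], [4, 5, 6]])

print(b.ndim)   # number of dimensions (the rank): 2
print(b.shape)  # entries along each dimension: (2, 3)
print(b.size)   # total number of elements: 6
```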

<h4>2.2.2 Special Arrays</h4>

In matrix algebra, there are certain kinds of matrices that come up frequently; similarly, numpy gives you the option to rapidly build particular kinds of arrays.

In [None]:
# An array of all zeros or all ones can be useful if you're
# initializing an array of variables (for example, before passing them through a while loop):

a = np.zeros((2,2))
print(a)

b = np.ones((1,2))
print(b)

In [None]:
# You can also easily create an identity matrix:

c = np.eye(2)
print(c)
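
To make the "initialize, then loop" idea concrete, here is a small sketch. The names <b>n</b> and <b>squares</b> are made up for this illustration; the point is only that the array exists at its full length before the loop starts filling it in.

```python
import numpy as np

# Preallocate a length-5 array of zeros, then fill it inside a while loop.
n = 5
squares = np.zeros(n)

i = 0
while i < n:
    squares[i] = i ** 2  # overwrite the placeholder zero
    i += 1

print(squares)
```

Because np.zeros defaults to floats, the stored values come back as floats even though we assigned integers.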

<h4>2.2.3 Slicing Arrays</h4>

<b>Slicing</b> is a way of simply grabbing a subset of an array; this will come in handy for a variety of applications!

Indexing works a little differently when you're defining a range within an array rather than just a single number (where counting starts at 0).

The range under consideration is written as <b>"A:B"</b>, where <b>A</b> is the first index to be included, and <b>B</b> is the index <i>after</i> the final index to be included.

So, if you're looking at a list that is written as ["Entry 0","Entry 1","Entry 2","Entry 3"], then 0:2 refers to ["Entry 0","Entry 1"].

As a shortcut, you can leave either <b>A</b> or <b>B</b> blank, and the code will simply grab everything from the beginning or to the end, respectively.

So again looking at our example of ["Entry 0","Entry 1","Entry 2","Entry 3"], :2 refers to ["Entry 0","Entry 1"] and 2: refers to ["Entry 2","Entry 3"].
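
The examples above can be tried directly on a plain Python list (the variable name <b>entries</b> is just a placeholder for this sketch):

```python
entries = ["Entry 0", "Entry 1", "Entry 2", "Entry 3"]

print(entries[0:2])  # ['Entry 0', 'Entry 1']
print(entries[:2])   # ['Entry 0', 'Entry 1']
print(entries[2:])   # ['Entry 2', 'Entry 3']
```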

In [None]:
# As an example, let's build a rank 2 array with shape (3,4).

a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

# Use slicing to only pull out the subarray that consists of the first two rows and the middle two columns.

b = a[:2,1:3]
print(b)

In [None]:
# Important note: a subarray slice is still part of the original array and can be modified as such!

print(a[0,1])
b[0,0] = 999
print(a[0,1])
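
If you want a slice that you can safely modify, one option (not covered in the cells above, so treat it as a side note) is to call <b>.copy()</b> on the slice. The sketch below also uses np.shares_memory to check whether two arrays overlap in memory.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# A plain slice is a view: it shares memory with a.
view = a[:2, 1:3]
print(np.shares_memory(a, view))  # True

# Calling .copy() on the slice gives an independent array.
c = a[:2, 1:3].copy()
c[0, 0] = 999
print(a[0, 1])  # still 2, because c no longer points at a
```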

<h4>2.2.4 Integer Indexing of Arrays</h4>

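As mentioned in the previous slide, creating subarrays via slicing can be dangerous, because modifying the subarrays will modify the original arrays. To avoid this, you can use integer indexing, which builds a new array from the values you pick out rather than a window into the original.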

In [None]:
# Let's create another array to work with:

a = np.array([[1,2], [3, 4], [5, 6]])

print(a)

In [None]:
# We can create a subset of the array through indexing instead:

b = a[[0, 1, 2], [0, 1, 0]]

print(b)

In [None]:
# Now, when we change a value in b, the change is not reflected in a:

b[0] = 999
print(a)
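
It may help to see how the two index lists pair up. In this sketch (the names <b>rows</b>, <b>cols</b>, and <b>by_hand</b> are only for illustration), the first list gives the row of each pick and the second list gives its column.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

rows = [0, 1, 2]
cols = [0, 1, 0]

# Integer indexing pairs the lists up element by element...
b = a[rows, cols]

# ...so it matches picking each (row, column) entry by hand.
by_hand = np.array([a[0, 0], a[1, 1], a[2, 0]])

print(b)        # [1 4 5]
print(by_hand)  # [1 4 5]
```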

<h4>2.2.5 Boolean Indexing of Arrays</h4>

Finally, you can index (and hence modify) certain parts of an array based on a conditional statement!

In [None]:
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(a)

In [None]:
# Let's create an index that will return "True" if a value is divisible by 2 and "False" otherwise!

bool_idx = (a%2 == 0)
print(bool_idx)

In [None]:
# If you only want to print the values that meet our Boolean statement (that is, the even numbers), you can do it in two ways:

print(a[bool_idx])

# or, more simply:

print(a[a%2 == 0])
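
The same Boolean index also works on the left-hand side of an assignment, which is how you modify only the entries that meet the condition. A minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Set every even entry to 0, leaving the odd entries alone.
a[a % 2 == 0] = 0
print(a)
```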

<h4>2.2.6 Data Types in Arrays</h4>

Sometimes you'll want to specify the data type you're going to be using in arrays. The two most relevant to what we'll be doing in this class are <b>int64</b> and <b>float64</b>.

In [None]:
# Force a float datatype in an array:

x = np.array([[1, 2],[3,4]], dtype=np.float64)
print(x)
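
You can check which data type an array is using through its <b>dtype</b> attribute, and convert to another type with <b>astype</b>. The variable names here are just for illustration.

```python
import numpy as np

# Ask for 64-bit integers explicitly, then inspect the result.
ints = np.array([[1, 2], [3, 4]], dtype=np.int64)
print(ints.dtype)    # int64

# astype returns a new array with the requested type.
floats = ints.astype(np.float64)
print(floats.dtype)  # float64
print(floats)
```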

<h4>2.2.7 Simple Math in Arrays</h4>

You can perform mathematical operations across entire arrays using numpy!

This is extremely powerful, since it avoids having to get too bogged down in loops.

Note that the default behavior here is <b>element-wise</b>, not matrix multiplication (so A*B is not the matrix multiplication of the two, but rather the first element of A times the first element of B, the second element of A times the second element of B, and so on).
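
To see what "element-wise" means and what numpy is saving you from, here is a sketch that writes the product out with explicit loops and then does the same thing in one line. The names <b>looped</b> and <b>vectorized</b> are made up for this comparison.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Element-wise product written out with explicit loops.
looped = np.zeros(A.shape, dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        looped[i, j] = A[i, j] * B[i, j]

# The same result in one line.
vectorized = A * B

print(looped)
print(vectorized)
```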

In [None]:
# Sum two arrays!

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

# There are a couple of different ways to do this that will give the same result:

print(x + y)
print(np.add(x,y))

In [None]:
# Take the difference of two arrays!

print(x - y)
print(np.subtract(x,y))

In [None]:
# Multiply (element-wise) two arrays!

print(x * y)
print(np.multiply(x,y))

In [None]:
# Divide (element-wise) two arrays!

print(x / y)
print(np.divide(x,y))

In [None]:
# Take the square root of an array!

print(np.sqrt(x))

<h4>2.2.8 Linear Algebra in Arrays</h4>

Remember dot products? Numpy's dot function enables us to compute the inner products of vectors, multiply vectors with matrices, and multiply matrices together.

In [None]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# You can calculate the vector inner products with either of these methods:

print(v.dot(w))
print(np.dot(v, w))

# You can calculate the product of a matrix and a vector with either of these methods:

print(x.dot(v))
print(np.dot(x, v))

# You can calculate the product of two matrices with either of these methods:

print(x.dot(y))
print(np.dot(x, y))
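
As a side note, Python also has an <b>@</b> operator for matrix products. For the 1-D and 2-D arrays used here it gives the same answers as dot; this sketch simply repeats two of the products above with it.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

# Matrix times vector.
print(x @ v)  # [29 67]

# Matrix times matrix; the rows are [19 22] and [43 50].
print(x @ y)
```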

<h4>2.2.9 Useful Numpy Tools</h4>

Numpy provides <a href="https://docs.scipy.org/doc/numpy/reference/routines.math.html">an absurd number</a> of helpful routines you can apply to arrays.

In [None]:
x = np.array([[1,2],[3,4]])

# To compute the sum of all elements in the array:

print(np.sum(x))

# To compute the sum of all elements in each column:

print(np.sum(x, axis=0))

# To compute the sum of all elements in each row:

print(np.sum(x, axis=1))
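
The <b>axis</b> argument works the same way for many other routines. As a small sketch, here are np.mean and np.max, two routines chosen only as examples:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Mean of all elements in the array:
print(np.mean(x))          # 2.5

# Mean of each column:
print(np.mean(x, axis=0))  # [2. 3.]

# Maximum of each row:
print(np.max(x, axis=1))   # [2 4]
```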