# Introduction

The lab section of Math 218 will give you experience using computational tools to work with linear algebra,
strengthening concepts from class as well as providing applications that go well beyond class material.  

Our main tool for computation will be the Python programming language with the NumPy (Numerical Python) library. We will use the Jupyter environment. In fact, you are already running a Jupyter notebook right now. Python is built into Jupyter. 

In this lab, we will give an introduction to array processing in NumPy. You will learn about creating, manipulating, and slicing arrays (and what that last term means). This lab is foundational for all future work, and will provide a place of reference for much of what we do later. Later in the semester, if you forget how to do some basic array operations, refer back to this lab!

To start, make a copy of this notebook (File menu -> Make a Copy...)

**You will need to do this every time, as these master notebooks may be overwritten!**

Then, type the following command in the code box below to import the NumPy library.

```python
import numpy as np
```

Press ctrl-enter to run your commands.

**Throughout this course, this is going to be the first command you type into Jupyter, as all our work will always use the NumPy library.**

In [1]:
import numpy as np

## NumPy Arrays

An *array* is simply a line or rectangle of numbers (NumPy also supports higher dimensional arrays, but we will rarely use such objects). Lines of numbers are *one-dimensional* arrays, and rectangles of numbers are *two-dimensional* arrays. In class, we will refer to these as *vectors* and *matrices*, but let's stay simple for now.

### Creating Arrays by Hand

We will use the following two 1-D arrays and two 2-D arrays to start with: 

$$v=\begin{bmatrix}5\\ 3 \\ -2\end{bmatrix}
\mbox{, }
w=\begin{bmatrix}1\\ 5 \\ -1\end{bmatrix}
\mbox{, }
A=\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
\mbox{, and }
 B=\begin{bmatrix}
3 & 1 & 1 \\
2 & 2 & 4 \\
5 & 7 & 1
\end{bmatrix}$$

Use the following code to enter a 1-D array:

```python
v = np.array([5,3,-2])
```

2-D arrays are entered row-by-row:
```python
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
```

**Question 1** Initialize these and the remaining arrays in the code box below. Check your work by using the `print()` command to output variables (e.g. `print(A)`).

In [10]:
v = np.array([[5],[3],[-2]])
w = np.array([[1],[5],[-1]])
A = np.array([[1,2,3,],[4,5,6],[7,8,9]])
B = np.array([[3,1,1],[2,2,4],[5,7,1]])

print(v)
print(w)
print(A)
print(B)

[[ 5]
 [ 3]
 [-2]]
[[ 1]
 [ 5]
 [-1]]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[3 1 1]
 [2 2 4]
 [5 7 1]]


### Adding and Subtracting Arrays, and Scalar Multiples

Two arrays can be added (or subtracted) if they are the same size. An array can also be multiplied by a scalar (a number).

**Question 2** Print the following arrays: *v+w*, *w-v*, and *A+B*. Also print *2v* and *-3(A-B)*. Check your answers by hand.

In [16]:
print(v+w)
print(w-v)
print(A+B)
print(2*v)
print(-3*(A-B))

[[ 6]
 [ 8]
 [-3]]
[[-4]
 [ 2]
 [ 1]]
[[ 4  3  4]
 [ 6  7 10]
 [12 15 10]]
[[10]
 [ 6]
 [-4]]
[[  6  -3  -6]
 [ -6  -9  -6]
 [ -6  -3 -24]]


### Accessing Arrays

NumPy indexes arrays from zero. That is, we access the first element of a one-dimensional array using code like `v[0]`, the second element with `v[1]`, etc. We can access the *last* element of a 1-D array using `v[-1]`, the second to last as `v[-2]` and so on. Elements of two-dimensional arrays are accessed using code like `A[1,2]`. 
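
As a small illustration of zero-based and negative indexing, here is a sketch using two throwaway arrays. The names `u` and `M` are chosen only for this example and are not used elsewhere in the lab.

```python
import numpy as np

u = np.array([10, 20, 30, 40])   # a 1-D array with four entries
M = np.array([[1, 2, 3],
              [4, 5, 6]])        # a 2-D array with 2 rows and 3 columns

first = u[0]          # first element
last = u[-1]          # last element
second_last = u[-2]   # second-to-last element
entry = M[1, 2]       # second row, third column

print(first, last, second_last, entry)
```

Note that `M[1, 2]` counts both the row and the column from zero, so it refers to the second row and third column.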

**Question 3** Print the third element of *w* and the element in the second row, first column of *B-A*. Also print the second element of *cv*, where *c* is the element in the third row, first column of *A*. Check your answers by hand.

In [25]:
print(w[2]) #doing print(w[-1]) will choose the last element
X = B-A
print(A)
print(X[1][0])
c = A[2][0]
print(c*v)

[-1]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
-2
[[ 35]
 [ 21]
 [-14]]


### Accessing Rows and Columns of 2-D Arrays

If *A* is a 2-D array, then `A[0]` is the first *row* of *A*. Note that `A[0]` is a 1-D array! The first *column* of *A* is accessed using the command `A[:,0]`. This is our first example of a *slice* of an array. Slicing is done using the colon (`:`) operator. We will see more complex slices soon.

**Question 4** Print the sum of the first row of *A* and twice the second column of *B*. Suppose you didn't know the size of an array. How would you access its second-to-last column? Check that this gives the same answer, and check by hand.

In [29]:
print(A)
print(A[:,0]) #getting the first column
print(A[:,[0]]) #getting the first column as a column vector
print(A[1,:]) #getting the second row

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[1 4 7]
[[1]
 [4]
 [7]]
[4 5 6]


#### The `:` Operator in General

Recall that we access the element in the second row, first column of an array using `A[1,0]`. The reason `A[:,0]` is the *entire* first column is that on its own, `:` means 'all of'. So `A[:,0]` means *in the first column, give me all the rows*.

**Question 5** What will `A[0,:]` give you? Test your answer. Why is this syntax unnecessary?

If we want the first two rows of *A*, we can use the command `A[:2]`. Note that this outputs the rows 0 and 1. That is, the syntax *does not* include row 2!

**Question 6** Print the first two rows of *A* and the first two *columns* of *B*. Can you take the sum of these two arrays? Explain.

If we want an array composed of all but the first row of *A*, we can use the syntax `A[1:]`. Note that, in contrast to the last bit of slicing we did, this command *does* include row 1. It only excludes row 0.

If we have a $4\times 4$ array *C*, and we want the array composed of the second and third rows of *C*, we can use the syntax `C[1:3]`. Note that NumPy *includes* the lower bound and *excludes* the upper bound.
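
The three kinds of row slices described above can be compared side by side. This is a minimal sketch, assuming a $4\times 4$ array *C* filled with the integers 0 through 15 (the contents are an arbitrary choice for illustration):

```python
import numpy as np

C = np.arange(16).reshape(4, 4)   # assumed example array holding 0..15

first_two = C[:2]       # rows 0 and 1 (row 2 is excluded)
all_but_first = C[1:]   # rows 1, 2 and 3 (row 1 is included)
middle = C[1:3]         # rows 1 and 2

print(first_two.shape, all_but_first.shape, middle.shape)
```

In each case the number of rows returned is the upper bound minus the lower bound, which is one reason NumPy excludes the upper bound.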

**Question 7** Create a $5\times 5$ array $A$ composed of the first 25 integers using the command `np.arange(25).reshape((5,5))` (more on this later). Write a command that prints out the central $3\times 3$ array.

In [47]:
A = np.arange(1,26).reshape((5,5)) # arange and slices include the start and exclude the end
print(A)
B = A
print(B[-2:,-2:])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]
[[19 20]
 [24 25]]


**Question 8** Suppose you don't know the size of an array. Write a command that prints the $2\times 3$ section in its bottom left-hand corner. Test your command out on the array from the previous question. (Hint: Look back at negative indices!)

### Logical Indexing

In addition to accessing known slices of arrays, NumPy also allows extraction of elements using logical conditions. For example, you may want to extract all numbers greater than zero from an array *A*. To do this, use the syntax `A[A>0]`.

**Question 9** Try this out on the array *A* from above: write a line of code that returns all elements greater than 7. Is the output a 1-D or 2-D array? Why *must* this simple (relatively simple, that is...see the next question) logical indexing always give a 1-D array?

In [63]:
A = np.arange(1,26).reshape((5,5))
print(A)
print(A[A>7])
print(A>7)
w = np.arange(5)
print(w)
print(w [[True, True, False, True, False]])


[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]
[ 8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25]
[[False False False False False]
 [False False  True  True  True]
 [ True  True  True  True  True]
 [ True  True  True  True  True]
 [ True  True  True  True  True]]
[0 1 2 3 4]
[0 1 3]


### Stepping Up the Logic

Logical indexing is often the most powerful data extraction tool in the NumPy toolbox. While there are many ways to achieve the data extraction results we want, logical indexing is almost always the fastest, most efficient, and most elegant. This section gives examples. 

**<span style="font-variant:small-caps;">If everything above seemed elementary, this is the place to start paying deep attention!</span>**

Now, suppose we want to return an array consisting of all columns whose sum is an even number. We need a few things:

* A command that finds column sums (that is, a list of the sums of each column);
* A command that checks if a number is even;
* A way to extract only the right columns.

#### Column and Row Aggregation

To sum up the columns of an array *A*, use the command `A.sum(axis=0)`. Likewise, to sum rows, use `A.sum(axis=1)`. Since NumPy arrays are entered row-by-row, axis zero enumerates the *rows*. That is, it is the *vertical* axis (this may be a little confusing! Make sure you re-read and understand it!). Position zero along axis zero is the first row, and so on. Likewise, position zero along axis one is the first column.

Aggregation commands, like `sum`, `max`, `min` and so on, are often applied along a given axis (a complete list can be found [here](https://jakevdp.github.io/PythonDataScienceHandbook/02.04-computation-on-arrays-aggregates.html), around halfway down the page). The axis numbers in these commands (zero and one for 2-D arrays) refer to the *axis* that NumPy *collapses*. That is, `A.sum(axis=0)` collapses all the rows by summing them together, giving a 1-D array of the sums of each column. It adds *vertically*, along the zero axis. Likewise `A.sum(axis=1)` adds *horizontally*, along the one axis.
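
The difference between the two axes is easiest to see on an array that is not square. Here is a minimal sketch on a $2\times 3$ array (the name `M` and its entries are chosen only for illustration):

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns

col_sums = M.sum(axis=0)    # collapses the 2 rows: one sum per column -> [5 7 9]
row_sums = M.sum(axis=1)    # collapses the 3 columns: one sum per row -> [ 6 15]
col_max = M.max(axis=0)     # largest entry in each column -> [4 5 6]

print(col_sums, row_sums, col_max)
```

The result of `axis=0` has three entries (one per column), while the result of `axis=1` has two (one per row).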

**Question 10** We saw above that in order to extract columns 1 and 2, we can use syntax like `A[:,1:3]` (or equivalently, `A[:,[1,2]]`). If we use the same syntax, but give an array of logical (boolean) conditions instead of `1:3` or `[1,2]`, we can extract columns according to a condition: 
* Try `A[:,[True, True, False, True, False]]`. What happens?

We check if a number is even by checking if its remainder when divided by two is zero: *a* is even if `a % 2 == 0` returns `True`. 

* Write a command that returns a logical 1-D array of five elements, with 'True' if the corresponding column sum in *A* is even, and 'False' if it is odd. Don't create this array by hand, but rather, use the ideas above.

* Lastly, using all this, write a single line of code that returns only the columns in the array *A* whose sum is even. Test your code on the $5\times 5$ array above. You want the actual columns, not just the column numbers.

In [65]:
A = np.arange(25).reshape((5,5))
print (A)
print(A.sum(axis=0)) # axis=0 collapses the rows, giving one sum per column
print(A.min(axis=1)) # axis=1 collapses the columns, giving one minimum per row

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
[50 55 60 65 70]
[ 0  5 10 15 20]


### Other Ways to Create Arrays

#### The `reshape` command

As mentioned above, we can take a 1-D array and reshape it as we want. Sometimes it is easier just to enter a long string of numbers, then shape an array as needed. This is what the *reshape* command does. Suppose we want the following $3\times 4$ array:

$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 7 & 4 & 3 & 10 \\ -2 & 7 & 9 & -1\end{bmatrix}$$

We can use the following Python code:
```python
C = np.array([1,2,3,4,7,4,3,10,-2,7,9,-1]).reshape(3,4)
```

If *A* is an array with $mn$ entries in it, the command `A.reshape(m,n)` returns a 2-D array with $m$ rows and $n$ columns. Note that the array *A* may or may not be 1-D. We can also take a 2-D array and reshape it to be 1-D with a command like `A.reshape(m*n)` (the length must equal the total number of entries), or take a 2-D array and reshape it to be a different 2-D array.
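
The sketch below chains a few reshapes and prints only the resulting shapes. The array names are hypothetical, and passing `-1` is a NumPy convenience (not discussed above) that asks NumPy to work out the length itself.

```python
import numpy as np

D = np.arange(12)     # 1-D array with 12 entries
E = D.reshape(3, 4)   # 1-D -> 2-D with 3 rows and 4 columns
F = E.reshape(2, 6)   # 2-D -> a different 2-D shape
G = F.reshape(-1)     # 2-D -> 1-D; -1 lets NumPy infer the length (12)

print(E.shape, F.shape, G.shape)
```

A reshape only succeeds when the new shape holds exactly the same number of entries as the original array.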

**Question 11** Create a $3\times 4$ array with any entries you want in it. Reshape it to be 1-D. What order do the numbers in the reshaped array appear in?

In [74]:
C = np.array([1,2,3,4,7,4,3,10,-2,7,9,-1]).reshape(3,4)
print(C)
C=C.reshape(12)
print(C)

[[ 1  2  3  4]
 [ 7  4  3 10]
 [-2  7  9 -1]]
[ 1  2  3  4  7  4  3 10 -2  7  9 -1]


#### Creating Zero and One Arrays

The commands `np.zeros((m,n))` and `np.ones((m,n))` create $m\times n$ arrays filled with zeros and ones respectively. Note the double parentheses! We are passing the *tuple* `(m,n)` to the command as a *shape parameter*. If we replace `(m,n)` with a single number `n`, we get a 1-D array of length *n*.
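
For instance, here is a short sketch with arbitrarily chosen shapes and variable names:

```python
import numpy as np

Z = np.zeros((2, 3))   # 2x3 array of zeros (note the tuple)
O = np.ones((3, 2))    # 3x2 array of ones
z1 = np.zeros(4)       # a single number gives a 1-D array of length 4

# np.zeros and np.ones produce floating point arrays unless a dtype is given
print(Z.shape, O.shape, z1.shape)
```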

#### Creating Diagonal Arrays

If *A* is a 1-D array with *n* entries, the command `np.diag(A)` returns an $n\times n$ array whose main diagonal (from top-left to bottom-right corners) is *A* and whose other entries are zero. Note that if the same command is called on a 2-D array, it instead returns that array's main diagonal (starting from the top-left corner) as a 1-D array.
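
Both uses of `np.diag` can be seen in a short round trip. This is only a sketch, and the names `d`, `D`, and `back` are hypothetical:

```python
import numpy as np

d = np.array([1, 2, 3])
D = np.diag(d)       # 1-D input: builds a 3x3 array with d on the main diagonal
back = np.diag(D)    # 2-D input: extracts the main diagonal as a 1-D array

print(D)
print(back)
```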

### Modifying Arrays

Up to now, we have largely extracted data *from* arrays. In this section, we will see how to take existing arrays and modify them.

#### Modifying Individual Elements of Arrays

The easiest (and often, least efficient) way to modify arrays is element-by-element. This is straightforward: `A[1,2] = 4` changes the number in the second row, third column of the array *A* to be *4*. If you want to add a number to an array element, subtract from it, multiply it, or divide it, use in-place syntax:

```python
A[1,2] += 2 # Add two
A[1,2] -= 2 # Subtract two
A[1,2] *= 2 # Multiply by two
A[1,2] /= 2 # Divide by two (Caution! See section on floats below)
```

#### Modifying Whole Rows or Columns
There are two important ways to modify rows and columns. Since rows and columns are 1-D arrays, we can add a 1-D array of the same size. For example, if *A* is a $5\times 4$ array, the following work:
```python
v = np.array([1,3,4,0])
w = np.array([0,2,5,2,4])
A[2] += v
A[:,1] += w
```

Note that we can also do element-by-element multiplication and division:
```python
A[2] *= np.array([1,2,3,4]) # This multiplies the first element of the row by one, the second by two, etc.
```
NumPy also supports modifying entire rows or columns at once, for example by adding the same number to all elements in a row:

```python
A[1] += 2 # This adds two to every element in the second row of A
```

This is an example of *broadcasting*, which we will examine in a more general case below.

#### Modifying More General Slices
Similarly to the above, if we have an $m\times n$ slice of an array, we can either add/subtract/multiply/divide it to/by a single number, or we can add/subtract/multiply/divide its elements to/by an identically shaped array. Also see below for modifying such slices by broadcasting.
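
As a minimal sketch of both cases, assume a $4\times 4$ integer array of ones (the shape, the slice, and the numbers are arbitrary choices for illustration):

```python
import numpy as np

A = np.ones((4, 4), dtype=int)   # assumed starting array: all ones, integer type

A[1:3, 1:3] += 5                 # add a single number to every entry of the 2x2 slice
A[1:3, 1:3] *= np.array([[1, 2],
                         [3, 4]])   # multiply entry-by-entry by a 2x2 array

print(A)   # the central 2x2 block is now [[6, 12], [18, 24]]
```

Only the entries inside the slice change; the rest of *A* keeps its original values.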


**Question 12** Starting from a $6\times 4$ array of ones, carry out the following operations in order:
* Insert the number seven in the third row, fourth column;
* Add two to every number in the second column;
* Subtract three from every number in the fourth row;
* Consider the $3\times 3$ slice whose top-left corner is *A[1,1]* and whose bottom-right corner is *A[3,3]*. Multiply these entries by multiples of two, starting from zero and ending at 16. (Hint: Use the `arange` and `reshape` commands.)
* Print out your array, and check your result by hand.

In [89]:
w = np.array([[1],[5],[-1]])
print(w)
A = np.diag((1,5,-1)) + np.diag((3,3,3))
print(A)
C = np.array([1,2,3,4,7,4,3,10,-2,7,9,-1]).reshape(3,4)
print(C)
print(np.diag(C,1))

[[ 1]
 [ 5]
 [-1]]
[[4 0 0]
 [0 8 0]
 [0 0 2]]
[[ 1  2  3  4]
 [ 7  4  3 10]
 [-2  7  9 -1]]
[ 2  3 -1]

#### Transposes
The *transpose* of an array is the same array, but with rows and columns swapped. We will see throughout 218 that transposes play a very important role in linear algebra, starting with the next lab. For now, though, we can view transposition as a physical modification of an array.

If *A* is an $m\times n$ NumPy array, then its transpose is `A.T`.
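
For example, transposing a $2\times 3$ array gives a $3\times 2$ array. A small sketch (the array `M` is made up for illustration):

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns
Mt = M.T                    # 3 rows, 2 columns

print(M.shape, Mt.shape)
print(Mt)   # the first row of M is now the first column of Mt
```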

#### Broadcasting: Adding a Row to all Rows (or a Column to all Columns)
If we want to add one to all entries of a 2-D array, we can just use the syntax `A += 1`. Suppose, however, we want to add one to the first column, two to the second, three to the third, and so on. We could write a command for each column, but NumPy lets us do better!

**Question 13** 
* Create a 1-D array *v* that contains the numbers one through ten (use `np.arange()` and add to the result, or take a slice of it). Then create a $5\times 10$ array *A* all of whose entries are two. Then add the two arrays and examine your output. Explain what happened.
* Now try the same with columns. Take that same array *A*. Create a 1-D array *w* with the numbers one through five. Try adding *A* to *w*. 
* The last operation should have failed. Use transposes (twice) to fix it.

**Question 14** Start from a $6\times 6$ array containing sequential integers from zero to 35. Consider the sub-array whose top left corner is the entry in the second row, first column, and whose bottom right corner is the entry in the fifth row, third column. Multiply each element in this sub-array by ten. Then, by using slices and broadcasting, add the sequence *1, 2, 3* to each row of this slice. Print your answer and check it by hand.

## References and Copies

Unless it has to, NumPy avoids creating new copies of arrays, passing them *by reference* instead. This can be very useful at times, but has its pitfalls. We work to understand this here.

**Question 15** Declare the array $$A=\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}$$<br><br>
Then, create what appears to be a new array $B$, which is identical to $A$: `B = A`. Change the entry in the second row, second column of $B$ to 10, then print $A$ and $B$.

What you saw in the last question also applies to arrays that are *views* into a given array. For example:

**Question 16** Create the $5\times 5$ array *A* containing sequential integers from zero to 24. Declare *B* to be the $3\times 3$ array in its center. Multiply *B* by ten, then output *A*.

To create an independent copy of *A*, use the syntax `B = A.copy()`. Slices work similarly: `B = A[1:4,1:4].copy()` copies just that slice. If you don't need to keep a copy of the original array, taking advantage of NumPy's passing by reference leads to highly efficient code!
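
One way to check whether two arrays share the same underlying data is NumPy's `np.shares_memory` function. The sketch below uses hypothetical names (`view`, `independent`) and a small example array:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)

view = A[:2]                 # a plain slice: a view into A's data
independent = A[:2].copy()   # an explicit copy with its own data

print(np.shares_memory(A, view))          # True
print(np.shares_memory(A, independent))   # False

independent[0, 0] = 100      # modifies only the copy
print(A[0, 0])               # still 0
```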

## Integers and Floats: The Biggest Headache in 218L

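NumPy chooses the type of an array's entries from the data used to create it, so arrays built from integers (for example with `np.arange(25)` or `np.array([1,2,3])`) contain integers. Sometimes, that is what we want, but it often leads to problematic results. The following question shows a couple of examples: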

**Question 17** 
* Declare the same $5\times 5$ array *A* as in the previous question. Try to multiply its first row by 1.5 in two ways:
  * First, try `A[0] *= 1.5`;
  * When you see that fails, try `A[0] = 1.5 * A[0]`, then print *A*. What happened?
* With the same array, try dividing it by two using both methods. Comment on your results.

It is almost always better to use the in-place method (`A[0] *= 1.5`, etc.) so that the code raises an explicit error. The second method requires more typing and silently gives unexpected results, as we saw above.

If you know ahead of time that you will need floating point numbers, you can declare an array to be floating point: 
```python
A = np.arange(25,dtype='float').reshape(5,5)
```

You can also convert an existing array to be floating point:
```python
A = A.astype(float)
```
Lastly, if you are explicitly entering an array, adding a decimal point to any entry will make it a float:
```python
A = np.array([1,2,3])  # This is an integer array
B = np.array([1.,2,3]) # This is a float array
```