# Lecture 9: July 12th, 2024

__Reminders and Announcements:__

* Complete your MATLAB outcomes if you haven't already.
* Jessica will cover some new material in discussion today: "Timing a Computation" and "Counting in NumPy".

## Alternating List
__Big Goal:__ Write a function that creates a length $n$ list of alternating $3$'s and $7$'s $[3,7,3,7 \dots]$.

## Constant List

__Mini Goal:__ Write a function that takes as input a natural number $n$ and as output returns a length $n$ list $[3,3, \dots, 3]$.

Below is the function that we wrote last lecture. We started today by talking a little bit about how the variable `n` defined in `constant(n)` is what's called a _local variable_. Outside of the function, `n` is not defined. We saw some examples of this by printing `n` at various places throughout the cell.

In [17]:
def constant(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    #print(n)
    return mylist
#print(n)

In [3]:
constant(6)

[3, 3, 3, 3, 3, 3]

Notice that `n` is a locally defined variable. If we try to access it outside the function, we get a `NameError`.

In [4]:
n

NameError: name 'n' is not defined

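Defining a global `n` separately does not affect `constant(4)`: inside the function, `n` is the argument we passed in. (The `4` printed below appears because `print(n)` inside the function was uncommented for this run.)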

In [5]:
n = 100

In [14]:
constant(4)

4


[3, 3, 3, 3]
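
The scoping behavior above can be condensed into one self-contained cell. This is only a minimal sketch: `constant` is copied from the lecture, while the `try`/`except` wrapper and the names `result`, `message`, and `short` are illustrative additions so that the cell keeps running after the error.

```python
def constant(n):
    # n exists only while constant is running
    mylist = []
    for _ in range(n):
        mylist.append(3)
    return mylist

result = constant(6)

# Outside the function, n has not been defined, so looking it up fails.
try:
    n
    message = "n is defined"
except NameError as err:
    message = str(err)
print(message)

# A global n is a separate variable from the parameter n.
n = 100
short = constant(4)
print(short, n)
```

The last line shows that the call used its own `n` (the argument 4) and left the global `n` untouched.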

## Alternating List – Part 2 

In [28]:
def alt1(n):
    mylist = constant(n)
    #with this method, we don't check 
    #whether an index is even or odd
    for i in range(1,n,2):
        mylist[i] = 7
    return mylist

In [18]:
alt1(5)

[3, 7, 3, 7, 3]

It's typically not a good idea to copy-paste blocks of code. Notice that we call `constant(n)` instead of pasting the code for it. 
* If I need to make a change to a part of the code, it's best to only have to change it in one place. 
* This is the main reason why we encourage you not to copy-paste. Think about it: if there's a mistake somewhere and you copy-paste it, you've now introduced even more mistakes.

In [23]:
alt1(4)

[3, 7, 3, 7]

In [24]:
alt1(5)

[3, 7, 3, 7, 3]

__Question from the chat:__ How do we know when to use `()` versus `[]`? <br>
__Answer:__ this is one of those things that you'll just have to get used to with time, but there are a few general rules.
* In Python, `[]` are typically used for indexing, and making lists.
* `()` are typically used with functions, or to make tuples.
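
The two rules of thumb above can be seen side by side in a short sketch. All variable names here are made up for illustration.

```python
values = [3, 7, 3, 7]   # [] builds a list
first = values[0]       # [] indexes into it
tail = values[1:3]      # [] also slices

count = len(values)     # () calls a function
pair = (3, 7)           # () builds a tuple
second = pair[1]        # indexing a tuple still uses []

values.append(3)        # () calls a method; values is now [3, 7, 3, 7, 3]
```

Note that indexing always uses `[]`, even when the object being indexed was created with `()`.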

## Alternating List – Part 3 

### Checking if the index is even or odd 

In [45]:
#this example shows how to use elif statements
#the most elegant way to do this is just with if-else
def alt2(n):
    mylist = []
    for i in range(n):
        #if i is even, append 3
        #else, append 7
        if i%2 == 0:
            mylist.append(3)
        elif i%2 == 1:
            mylist.append(7)
        elif i%2 == 17: #never will happen
            print("Something is wrong!") #print, not MATLAB's disp
    return mylist

We can check whether an index is even or odd by looking at its remainder when we divide by 2.

In [29]:
6%2 #remainder of 6 divided by 2

0

In [30]:
7%2

1

In [31]:
7%4 #remainder of 7 divided by 4

3

In [46]:
alt2(5)

[3, 7, 3, 7, 3]

In [47]:
alt2(10)

[3, 7, 3, 7, 3, 7, 3, 7, 3, 7]
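
The comment at the top of `alt2` says that plain if-else is the more elegant choice. Here is a sketch of what that version might look like; the name `alt2_ifelse` is hypothetical and was not used in lecture.

```python
def alt2_ifelse(n):
    mylist = []
    for i in range(n):
        if i % 2 == 0:
            mylist.append(3)   # even index
        else:
            mylist.append(7)   # odd index
    return mylist

print(alt2_ifelse(5))  # [3, 7, 3, 7, 3]
```

Since a remainder mod 2 can only be 0 or 1, the `else` branch covers exactly the odd indices and no third case is needed.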

We might think that the `alt1` function is shorter (takes fewer lines of code) than `alt2`, but don't forget that we call `constant` in `alt1` which contributes about three extra lines of code.

## Alternating List – Part 4
### Using NumPy 

* NumPy is one of the most famous Python libraries. It stands for Numerical Python, and should remind you of MATLAB.
* It does need to be installed, though. So let us know if you're having any trouble with it.

In [48]:
import numpy as np

* `import numpy`, as you would expect, tells Python to import NumPy.
* `as np` means instead of me calling it NumPy, I'm going to call it `np`. 
* Note: `np` is a standard naming convention. Avoid using any other name for it.


The first function we see should be very familiar.

In [49]:
np.zeros((3,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [51]:
A = np.zeros((3,5))
type(A)

numpy.ndarray

Notice that the next call behaves a little differently from MATLAB, where `zeros(7)` would give a $7 \times 7$ matrix. In NumPy, `np.zeros(7)` gives a one-dimensional array of length 7.

In [52]:
np.zeros(7)

array([0., 0., 0., 0., 0., 0., 0.])

In [54]:
np.zeros(7).shape

(7,)
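
To make the difference in shapes concrete, here is a small sketch comparing an integer argument with a tuple argument. The names `vec` and `mat` are just for illustration.

```python
import numpy as np

vec = np.zeros(7)        # one integer: a 1-D array of length 7
mat = np.zeros((7, 7))   # a tuple: a 7-by-7 array, like MATLAB's zeros(7)

print(vec.shape)           # (7,)
print(mat.shape)           # (7, 7)
print(vec.ndim, mat.ndim)  # 1 2
```

The trailing comma in `(7,)` is how Python writes a tuple with a single entry.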

In [56]:
#Getting index 0 element of array
type(np.zeros(7)[0])

numpy.float64

Maybe we want our array to have integers instead. Let's call help to see how to change the data type.

In [57]:
help(np.zeros)

Help on built-in function zeros in module numpy:

zeros(...)
    zeros(shape, dtype=float, order='C', *, like=None)
    
    Return a new array of given shape and type, filled with zeros.
    
    Parameters
    ----------
    shape : int or tuple of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`.  Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: 'C'
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    like : array_like, optional
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this arg

In [58]:
np.zeros(7, dtype=np.int64)

array([0, 0, 0, 0, 0, 0, 0])

In [59]:
#technically, I don't need to write the keyword dtype=; dtype is the second positional argument
np.zeros(7,np.int64)

array([0, 0, 0, 0, 0, 0, 0])

In [60]:
def alt3(n):
    A = np.zeros(n,dtype=np.int64)+3
    return A

In [61]:
alt3(4)

array([3, 3, 3, 3])

In [62]:
alt3(5)

array([3, 3, 3, 3, 3])

Here's something a little bit special with NumPy. We're used to it from MATLAB, but it won't work in "base" Python. 

In [63]:
alt3(5) + 3

array([6, 6, 6, 6, 6])

Notice the following causes an error:

In [64]:
mylist = [0,0,0]
mylist+3

TypeError: can only concatenate list (not "int") to list
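
For comparison, here is a sketch of what `+` does mean for lists, and one way to add 3 to every entry without NumPy. It uses a list comprehension, which appears again later in these notes; the names `longer` and `shifted` are illustrative.

```python
mylist = [0, 0, 0]

# + between two lists concatenates them.
longer = mylist + [3]               # [0, 0, 0, 3]

# To add 3 to every entry, build a new list element by element.
shifted = [x + 3 for x in mylist]   # [3, 3, 3]

print(longer)
print(shifted)
```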

In [65]:
def alt3(n):
    A = np.zeros(n,dtype=np.int64)+3
    A[1::2] = 7
    return A

In [66]:
alt3(5)

array([3, 7, 3, 7, 3])

In [67]:
alt3(10)

array([3, 7, 3, 7, 3, 7, 3, 7, 3, 7])

We might be surprised that assigning a single value to a slice, as we just did with `A[1::2] = 7`, doesn't work with lists.

In [68]:
mylist = [3,1,4,1,5,9]

We can still slice lists:

In [70]:
mylist[::2]

[3, 4, 5]

but Python doesn't know what we mean by the following:

In [71]:
mylist[::2] = -17

TypeError: must assign iterable to extended slice

Because the slice has length 3, I need to pass a length 3 list of what values I'd like to assign.

In [72]:
mylist[::2] = [-17,-17,-17]

In [73]:
mylist

[-17, 1, -17, 1, -17, 9]

In [74]:
mylist[::2] = [-18,-18]

ValueError: attempt to assign sequence of size 2 to extended slice of size 3
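
One way to avoid counting the slice length by hand is to ask Python for it. This is only a sketch; the variable `count` is an illustrative name.

```python
mylist = [3, 1, 4, 1, 5, 9]

# Count how many positions the slice covers, then build a list of that length.
count = len(mylist[::2])
mylist[::2] = [-17] * count

print(mylist)  # [-17, 1, -17, 1, -17, 9]
```

Because the replacement list is built from the slice's own length, the sizes always match and the `ValueError` above cannot occur.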

## Alternating List – Part 5
### Timing Strategies

The point of this section is to show you that NumPy is _very_ fast.

In [95]:
def alt3(n):
    A = np.zeros(n,dtype=np.int64)+3
    A[1::2] = 7
    return A

Let's try to write a function that mimics `alt3`, but doesn't use NumPy.

In [75]:
def alt4(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    for i in range(1,n,2):
        mylist[i] = 7
    return mylist

In [76]:
alt4(10)

[3, 7, 3, 7, 3, 7, 3, 7, 3, 7]

In [77]:
alt4(11)

[3, 7, 3, 7, 3, 7, 3, 7, 3, 7, 3]

We've already seen examples with Jupyter magics and timing:

In [78]:
%%timeit
alt4(10)

2.76 µs ± 907 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


What if I only care about a single elapsed time, rather than the mean and standard deviation over many runs?

In [79]:
import time

`time.time()` returns the number of seconds that have passed since the beginning of "computer time" (Unix time): January 1st, 1970.

In [80]:
time.time()

1720803551.305769

Notice that if I call it again, some time has passed.

In [81]:
time.time() 

1720803594.866464

Using this concept, we can time how long it takes to run `alt4(10**3)`.

In [82]:
start = time.time()
alt4(10**3)
end = time.time()
t = end - start
print(t)

0.0010256767272949219
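
Since the start/end/subtract pattern is about to be repeated several times, it can be wrapped in a helper. The sketch below is not from lecture: `time_call` is a hypothetical name, and it swaps in `time.perf_counter`, a standard-library clock intended for measuring short durations, in place of `time.time`.

```python
import time

def alt4(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    for i in range(1, n, 2):
        mylist[i] = 7
    return mylist

def time_call(func, *args):
    """Return (result, seconds elapsed) for one call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

result, t = time_call(alt4, 10**3)
print(t)
```

The printed time will differ from run to run and from machine to machine.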


I want to find the first $n$ in the sequence $200, 400, 800, \dots$ (doubling each time) such that `alt4(n)` takes more than 5 seconds to run. To do this, we'll introduce while loops in Python.

In [85]:
n = 100
t = 0
while t < 5:
    n = n*2
    start = time.time()
    alt4(n)
    end = time.time()
    t = end - start
print(n)

52428800
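
The same while loop can be tried with a much smaller threshold, so that it finishes almost instantly. In this sketch the function `first_slow_n` and its parameters are hypothetical names, and the threshold of 0.001 seconds is an arbitrary choice.

```python
import time

def alt4(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    for i in range(1, n, 2):
        mylist[i] = 7
    return mylist

def first_slow_n(func, threshold, n=100):
    """Double n until a single call func(n) takes at least threshold seconds."""
    t = 0
    while t < threshold:
        n = n * 2
        start = time.time()
        func(n)
        t = time.time() - start
    return n

n = first_slow_n(alt4, 0.001)
print(n)
```

The value printed depends on the machine, but it is always 100 times a power of 2.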


__Question:__ Think about why we call `n = n*2` at the start of the loop, and not at the end.

Notice that if I reduce `n` by half, it takes less than 5 seconds to run.

In [91]:
start = time.time()
alt4(int(n/2))
end = time.time()
t = end-start

In [92]:
t

4.063894033432007

Let's now compare this to the time it takes for `alt3(n)`.

In [93]:
n

52428800

In [96]:
start = time.time()
alt3(n)
end = time.time()
t = end-start
print(t)

0.5635237693786621


You should stop and appreciate just how fast NumPy is!

Here, we use the NumPy version, but convert the output to a list.

In [97]:
def alt3_list(n):
    A = np.zeros(n,dtype=np.int64)+3
    A[1::2] = 7
    return list(A)

In [98]:
start = time.time()
alt3_list(n)
end = time.time()
t = end-start
print(t)

6.5183188915252686


Notice that converting to a list has slowed us down significantly.
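
NumPy arrays also have a `tolist()` method, which is another way to do the conversion. The sketch below only shows that it gives the same values; it is often reported to be faster than `list(A)`, but that was not timed in lecture, so treat it as something to check yourself. The name `alt3_tolist` is hypothetical.

```python
import numpy as np

def alt3_tolist(n):
    A = np.zeros(n, dtype=np.int64) + 3
    A[1::2] = 7
    return A.tolist()   # a list of plain Python ints

out = alt3_tolist(5)
print(out)            # [3, 7, 3, 7, 3]
print(type(out[0]))   # <class 'int'>
```

By contrast, `list(A)` keeps each entry as a NumPy scalar such as `numpy.int64`.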

Next, we instead use a for-loop to change the odd indices. We'll see that this also slows us down quite a bit.

In [99]:
def alt3_for(n):
    A = np.zeros(n,dtype=np.int64)+3
    for i in range(1,n,2):
        A[i] = 7
    return A

In [100]:
start = time.time()
alt3_for(n)
end = time.time()
t = end-start
print(t)

5.239586114883423


__Takeaways:__ for-loops and lists are very slow, while NumPy is fast :)

***

_This is the start of Python Unit 2._
It is all about NumPy!

## Regular Arrays

__Goal:__ Make NumPy arrays that follow certain patterns.

In [101]:
import numpy as np

* $[0, \dots, 0]$ (length 13)

In [104]:
#we've already seen this!
np.zeros(13,dtype=np.int64)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [105]:
#this is called a list comprehension; it's not specific to NumPy
#we'll get more practice with it in unit 3
[0 for _ in range(13)]

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Here are a couple of neat examples with plain lists, though they're not essential.

In [106]:
[0] + [0]

[0, 0]

In [107]:
[0]*13

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

* A $3 \times 5$ matrix of all 7s.

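I'll start by making a small, but common, mistake: passing the shape as two separate arguments instead of a single tuple, so that NumPy reads `5` as the `dtype`.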

In [108]:
np.zeros(3,5)+7

TypeError: Cannot interpret '5' as a data type

In [109]:
help(np.zeros)

Help on built-in function zeros in module numpy:

zeros(...)
    zeros(shape, dtype=float, order='C', *, like=None)
    

In [112]:
np.zeros((3,5),dtype=np.int64)+7

array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

In [113]:
np.ones((3,5),dtype=np.int64)*7

array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

Keep these examples in mind for broadcasting. We start with a $3 \times 5$ array of all ones/zeros, and know how to multiply/add 7 to each of the components.
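
As an aside, NumPy also provides `np.full`, which fills an array with a given value directly. This was not used in lecture; the sketch below just checks that it agrees with the zeros-plus-7 approach. The names `sevens` and `same` are illustrative.

```python
import numpy as np

sevens = np.full((3, 5), 7, dtype=np.int64)

# Compare with the approach from lecture.
same = np.array_equal(sevens, np.zeros((3, 5), dtype=np.int64) + 7)
print(same)  # True
```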

* $[0, \dots, 200]$ (length 5, evenly spaced)

This should make you think of `linspace` from MATLAB.

In [114]:
help(np.linspace)

Help on function linspace in module numpy:

linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
    Return evenly spaced numbers over a specified interval.
    
    Returns `num` evenly spaced samples, calculated over the
    interval [`start`, `stop`].
    
    The endpoint of the interval can optionally be excluded.
    
    .. versionchanged:: 1.16.0
        Non-scalar `start` and `stop` are now supported.
    
    .. versionchanged:: 1.20.0
        Values are rounded towards ``-inf`` instead of ``0`` when an
        integer ``dtype`` is specified. The old behavior can
        still be obtained with ``np.linspace(start, stop, num).astype(int)``
    
    Parameters
    ----------
    start : array_like
        The starting value of the sequence.
    stop : array_like
        The end value of the sequence, unless `endpoint` is set to False.
        In that case, the sequence consists of all but the last of ``num + 1``
        evenly spaced samples, so that 

Notice that this is one of the rare cases in Python where the right endpoint is included; `range` and slicing both exclude it.

In [115]:
np.linspace(0,200,5)

array([  0.,  50., 100., 150., 200.])
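
To see the endpoint behavior side by side, here is a short sketch using the `endpoint` parameter from the help text above, along with `np.arange`, which excludes its stop value. The variable names are illustrative.

```python
import numpy as np

with_end = np.linspace(0, 200, 5)                     # 0, 50, 100, 150, 200
without_end = np.linspace(0, 200, 5, endpoint=False)  # 0, 40, 80, 120, 160
half_open = np.arange(0, 200, 50)                     # 0, 50, 100, 150

print(with_end)
print(without_end)
print(half_open)
```

With `linspace` we choose how many points we want; with `arange` we choose the step size.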

* The $5 \times 5$ matrix
$$
\begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 \\
2 & 2 & 2 & 2 & 2 \\
3 & 3 & 3 & 3 & 3 \\
4 & 4 & 4 & 4 & 4
\end{pmatrix}
$$

In [118]:
arr = np.zeros((5,5),dtype=np.int64)
arr

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

In [119]:
#0th row of arr
arr[0]

array([0, 0, 0, 0, 0])

In [120]:
#index 1 row of arr
arr[1]

array([0, 0, 0, 0, 0])

In [121]:
for i in range(5):
    arr[i] = i

In [122]:
arr

array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])
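
The same matrix can also be built without a for-loop. The sketch below uses `np.repeat`, which was not covered in lecture, so treat it as one possible alternative rather than the intended solution. The names `rows` and `looped` are illustrative.

```python
import numpy as np

# Repeat each of 0, 1, 2, 3, 4 five times, then cut into rows of length 5.
rows = np.repeat(np.arange(5), 5).reshape((5, 5))

# The loop version from above, for comparison.
looped = np.zeros((5, 5), dtype=np.int64)
for i in range(5):
    looped[i] = i

print(np.array_equal(rows, looped))  # True
```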

* The $2 \times 5$ matrix 
$$
\begin{pmatrix}
2 & 5 & 8 & 11 & 14 \\
17 & 20 & 23 & 26 & 29
\end{pmatrix}
$$

In [123]:
arr = np.zeros((2,5),dtype=np.int64)
arr

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Let's start by making a mistake:

In [124]:
for i in range(2):
    for j in range(5):
        arr[i,j] = 2 + 3*j

In [125]:
arr

array([[ 2,  5,  8, 11, 14],
       [ 2,  5,  8, 11, 14]])

In [126]:
for i in range(2):
    for j in range(5):
        arr[i,j] = 2 + 3*j + 15*i

In [127]:
arr

array([[ 2,  5,  8, 11, 14],
       [17, 20, 23, 26, 29]])

Let's make our code a little bit more readable, just so that we understand what's going on.

In [128]:
cols = 5
step = 3
rows = 2
arr = np.zeros((rows,cols),dtype=np.int64)
for i in range(rows):
    for j in range(cols):
        arr[i,j] = 2 + step*j + cols*i*step

In [129]:
arr

array([[ 2,  5,  8, 11, 14],
       [17, 20, 23, 26, 29]])

Here's how we could make this array without for-loops.

In [130]:
np.arange(2,30,3)

array([ 2,  5,  8, 11, 14, 17, 20, 23, 26, 29])

In [131]:
np.arange(2,30,3).reshape((2,5))

array([[ 2,  5,  8, 11, 14],
       [17, 20, 23, 26, 29]])
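
The loop formula and the `reshape` version agree because the entry in row `i` and column `j` is entry number `k = i*cols + j` of the flattened sequence. The sketch below checks this; the names `start`, `stop`, `k`, `looped`, and `reshaped` are illustrative additions.

```python
import numpy as np

start, step, rows, cols = 2, 3, 2, 5

looped = np.zeros((rows, cols), dtype=np.int64)
for i in range(rows):
    for j in range(cols):
        k = i * cols + j               # position in the flattened sequence
        looped[i, j] = start + step * k

stop = start + step * rows * cols      # 32; arange excludes the stop value
reshaped = np.arange(start, stop, step).reshape((rows, cols))

print(np.array_equal(looped, reshaped))  # True
```

Writing `stop` in terms of `rows` and `cols` means the same code works for other sizes.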

This is where we ended today's lecture. We'll get to the material below starting Monday :)

## Random Numbers

__Note:__ I’m first going to show you the old way of generating random numbers in NumPy. If you’re reading older code, this is likely the way you’ll see random numbers generated. All new code should be written using the new method that I show you.

* Make a length 10 NumPy array of random integers between 0 (inclusive) and 39 (exclusive).

* Choose five of those numbers (with replacement) and put them into a NumPy array.

* Create a $3 \times 5$ NumPy array of random real numbers between -1 and 4.

* Create a length 10 NumPy array of random numbers that follow a normal distribution with mean 2 and standard deviation 0.1.
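
The lecture's own code for these four exercises is not shown here. As a hedged preview, the sketch below assumes that the "new method" refers to NumPy's `Generator` interface, created with `np.random.default_rng`. The seed and all variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen arbitrarily

# Length 10 array of integers from 0 (inclusive) to 39 (exclusive).
ints = rng.integers(0, 39, size=10)

# Five of those numbers, chosen with replacement.
picks = rng.choice(ints, size=5, replace=True)

# 3-by-5 array of real numbers between -1 and 4.
reals = rng.uniform(-1, 4, size=(3, 5))

# Length 10 array, normal distribution with mean 2 and standard deviation 0.1.
normals = rng.normal(loc=2, scale=0.1, size=10)

print(ints)
print(picks)
```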

## Changing Rows and Columns 

__Goal:__ Modify rows and columns of NumPy arrays.

In [1]:
import numpy as np

In [2]:
arr = np.zeros((4,4),dtype=int)
for i in range(4):
    arr[i] = i
arr

array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

## Broadcasting