Let's talk more about numpy!
----------------------------

**Going from lists to arrays and figuring out if that worked well...**

In [None]:
import numpy as np

# how do I make a numpy array from a python list?
pylist = [[10, 1, 2021], [2, 9, 2022]]
nparray = np.array(pylist)

# how do I print a numpy array?
print(nparray)

# and back to a list?
backtolist = nparray.tolist()
print(backtolist)

In [None]:
# how do I figure out the type of a numpy array? 
print(nparray.dtype)

# hmm, will this work and why or why not?
print(backtolist.dtype)

In [None]:
# how do I change the type of a numpy array?
nparrayFloat = nparray.astype(float)
print(nparrayFloat.dtype)

nparrayStr = np.array(nparrayFloat, dtype=str)
print(nparrayStr.dtype)

nparrayFloat = nparrayStr.astype(float)
print(nparrayFloat.dtype)

In [None]:
# (review!) how do I see the number of dimensions, number of elements, and shape of a numpy array?
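
One possible answer to the review question, as a small sketch. The array `demo` is made up for illustration; `ndim`, `size`, and `shape` are standard array attributes.

```python
import numpy as np

# illustrative array, not one of the arrays from the cells above
demo = np.array([[10, 1, 2021], [2, 9, 2022]])

print(demo.ndim)   # number of dimensions
print(demo.size)   # total number of elements
print(demo.shape)  # length along each dimension, as a tuple
```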

**Making numpy arrays...**

In [None]:
# make an array of zeros
nparrayZero = np.zeros([3, 10])
print(nparrayZero)

# that's floats ... what if we want ints?

# what if we want ones instead of zeros?

# what if we want sevens?
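
A sketch of one way to answer these three questions. The variable names are illustrative; `np.zeros`, `np.ones`, and `np.full` are the numpy constructors assumed here, and there may be other reasonable answers (for example `astype`).

```python
import numpy as np

intZeros = np.zeros([3, 10], dtype=int)   # zeros, stored as ints
floatOnes = np.ones([3, 10])              # ones (floats by default)
sevens = np.full([3, 10], 7)              # every entry set to 7

print(intZeros.dtype, floatOnes.dtype, sevens.dtype)
```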

In [None]:
# make an array of random values
nparrayRandomFloat = np.random.random([3, 10])
print(nparrayRandomFloat)
print(nparrayRandomFloat.dtype)

# what if we want random ints? let's see...
nparrayRandomFloat = np.random.random([3, 10], dtype=int)

# hmm, if not that then what?
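
`np.random.random` only accepts a size, so the `dtype=int` attempt raises an error. A sketch of one alternative, using `np.random.randint` (the name `randomInts` and the 0 to 10 range are just for illustration):

```python
import numpy as np

# ask a different function for ints; note that `high` is exclusive
randomInts = np.random.randint(low=0, high=10, size=(3, 10))
print(randomInts)
print(randomInts.dtype)
```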


In [None]:
# what if we want evenly spaced floats in an interval?
print(np.linspace(0, 10, 10))

# what if we want to shape that into a 2 by 5 array?

# what if we want random ints in an interval?
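
A hedged sketch covering the questions in this cell. `np.linspace` gives evenly spaced values rather than random ones; for random values in an interval, `np.random.uniform` and `np.random.randint` are assumed here. All variable names are illustrative.

```python
import numpy as np

evenlySpaced = np.linspace(0, 10, 10)    # 10 evenly spaced values, both endpoints included
twoByFive = evenlySpaced.reshape(2, 5)   # same values arranged as 2 rows by 5 columns

randomFloats = np.random.uniform(low=0, high=10, size=(2, 5))  # random floats in [0, 10)
randomInts = np.random.randint(low=0, high=10, size=(2, 5))    # random ints in [0, 10)

print(twoByFive)
print(randomFloats)
print(randomInts)
```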

**Getting access to elements and "slices" of numpy arrays...**

In [None]:
# (review!) how do I access an element in an array?
print(nparrayRandomFloat[0][0])

# is there a prettier way?
print(nparrayRandomFloat[0, 0])

# this is only marginally prettier for a 2-d array but imagine a 10-d array!

In [None]:
print(nparrayRandomFloat)

# how do I access the whole second column?
print("second column")
print(nparrayRandomFloat[:, 1])

# what about the whole second row?
print("second row")
print(nparrayRandomFloat[1, :])

# what about the last two rows?

# what about the first row and last two columns?
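
One way these two slices might look, shown on a made-up array `grid` whose values make it easy to see which entries were picked. Negative indices count from the end.

```python
import numpy as np

grid = np.arange(30).reshape(3, 10)   # illustrative 3 x 10 array holding 0..29

lastTwoRows = grid[-2:, :]            # rows 1 and 2, every column
firstRowLastTwoCols = grid[0, -2:]    # row 0, columns 8 and 9

print(lastTwoRows)
print(firstRowLastTwoCols)
```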

In [None]:
# how do I access the 1st and 3rd columns? (indices 0 and 2)
print("first and third columns")
print(nparrayRandomFloat[np.ix_(np.arange(nparrayRandomFloat.shape[0]), [0, 2])])

# whaaaat was that?
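
A sketch of what is going on: `np.ix_` takes a list of row indices and a list of column indices and selects every combination of them. When all rows are wanted, a plain slice for the rows should give the same result. The array `grid` is illustrative.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)    # illustrative 3 x 4 array

rows = np.arange(grid.shape[0])       # every row index: 0, 1, 2
viaIx = grid[np.ix_(rows, [0, 2])]    # all rows crossed with columns 0 and 2
viaSlice = grid[:, [0, 2]]            # same selection: slice the rows, list the columns

print(viaIx)
print(np.array_equal(viaIx, viaSlice))
```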

**Modifying (slices of) arrays...**

In [None]:
# how do I *change* the element at 1, 1 of the array?

In [None]:
# and now for some magic! how do I assign the second row to 1s?
nparrayRandomFloat[1] = 1
print(nparrayRandomFloat)
print(nparrayRandomFloat.dtype)

# how do I assign the second row to increasing ints?
nparrayRandomFloat[1] = np.arange(nparrayRandomFloat.shape[1])
print(nparrayRandomFloat)
print(nparrayRandomFloat.dtype)

# how do I assign the second row to 3* itself?
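
A minimal sketch of changing a single element and then scaling a row in place, on a made-up array `grid` (the values 5 and 3 are arbitrary).

```python
import numpy as np

grid = np.ones([3, 4])     # illustrative float array of ones

grid[1, 1] = 5             # change the single element at row 1, column 1
grid[1] *= 3               # same effect as grid[1] = 3 * grid[1]

print(grid)
```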


**Copying numpy arrays...**

In [None]:
# let's try the obvious thing
nparrayRandomFloat2 = nparrayRandomFloat
print("nparrayRandomFloat")
print(nparrayRandomFloat)
print("nparrayRandomFloat2")
print(nparrayRandomFloat2)

In [None]:
nparrayRandomFloat2[0,0] = 0
print("nparrayRandomFloat")
print(nparrayRandomFloat)
print("nparrayRandomFloat2")
print(nparrayRandomFloat2)

# whaaat just happened??
# how do we stop that happening?? hint, what are we doing? we are *copying*
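
What happened is that `=` only gives the same array a second name. A sketch of the difference between that and an actual copy made with `.copy()`; the names `original`, `alias`, and `independent` are illustrative.

```python
import numpy as np

original = np.zeros([2, 3])     # illustrative array
alias = original                # a second name for the same data
independent = original.copy()   # a separate array with its own data

alias[0, 0] = 1                 # also visible through `original`
independent[0, 1] = 2           # not visible through `original`

print(original)
print(np.shares_memory(original, alias))
print(np.shares_memory(original, independent))
```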

**Doing other things to a whole row or column...**

In [None]:
# (review!) how do we assign value(s) to a row or column?
nparrayRandomFloat[:1] = np.zeros(nparrayRandomFloat.shape[1])
print(nparrayRandomFloat)

In [None]:
# let's sum across each column
np.sum(nparrayRandomFloat, axis=0)

# how would we sum across each row?
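
A small sketch comparing the two axes on a made-up array `grid`. The axis you name is the one that gets collapsed.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # illustrative array: [[0, 1, 2], [3, 4, 5]]

columnSums = np.sum(grid, axis=0)   # collapse the rows: one sum per column
rowSums = np.sum(grid, axis=1)      # collapse the columns: one sum per row

print(columnSums)
print(rowSums)
```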



In [None]:
# what if we had a tensor?
nptensorFloat = np.ones([3, 4, 5])
print(nptensorFloat)

np.sum(nptensorFloat, axis=2)

In [None]:
# what if we don't specify an axis?

In [None]:
# what other functions can we apply across axes?
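
A sketch touching on the last two questions: leaving out `axis` reduces over every element, and several other functions accept `axis` the same way. `np.mean`, `np.max`, and `np.argmax` are shown as examples only, not a complete list; `tensor` is an illustrative array.

```python
import numpy as np

tensor = np.arange(24).reshape(2, 3, 4)   # illustrative 3-d array holding 0..23

total = np.sum(tensor)                    # no axis: sum over every element
means = np.mean(tensor, axis=0)           # average over the first axis, shape (3, 4)
maxima = np.max(tensor, axis=2)           # largest value along the last axis, shape (2, 3)
positions = np.argmax(tensor, axis=2)     # index of that largest value along the last axis

print(total)
print(means.shape, maxima.shape, positions.shape)
```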

In [None]:
# let's take it up a notch

nparrayRandomInt = np.random.randint(low=0, high=10, size=(3,4))
print(nparrayRandomInt)

print(nparrayRandomInt - np.min(nparrayRandomInt, axis=0))

# whaaat just happened? let's look at the shapes
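
A sketch of looking at the shapes, using a fixed made-up array `grid` so the numbers are reproducible. The per-column minimums have shape `(4,)`, and numpy broadcasting lines that up against each of the 3 rows.

```python
import numpy as np

grid = np.array([[3, 8, 1, 5],
                 [4, 2, 9, 7],
                 [6, 0, 5, 5]])     # illustrative 3 x 4 array

columnMins = np.min(grid, axis=0)   # one minimum per column
print(grid.shape)
print(columnMins.shape)

# the (4,) array is treated like a (1, 4) row and reused for all 3 rows
shifted = grid - columnMins
print(shifted)
```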


In [None]:
# why is this cool?
import timeit

# build the row once, so that only the summing gets timed
longRow = np.arange(1, 1000001)

def sumLoop():
    '''Use for loop to sum a row vector'''
    theSum = 0
    for i in range(len(longRow)):
        theSum += longRow[i]
    return theSum

def sumVectorized():
    '''Vectorized version of summing a row vector'''
    return np.sum(longRow)

# pass the functions themselves and run each a few times (the default is a million runs)
print(timeit.timeit(sumLoop, number=5))
print(timeit.timeit(sumVectorized, number=5))

In [None]:
# what if we try to do the subtract-min thing across axis 1?
print(nparrayRandomInt - np.min(nparrayRandomInt, axis=1))


In [None]:
# how can we fix that? make the arrays shape-compatible!
print(nparrayRandomInt - np.min(nparrayRandomInt, axis=1)[:, np.newaxis])

In [None]:
# is there another way to achieve this?
print(nparrayRandomInt - np.min(nparrayRandomInt, axis=1, keepdims=True))
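
To see why both fixes work, it may help to print the shapes involved. A sketch on an illustrative array `grid`: the per-row minimums come back with shape `(3,)`, which cannot line up with `(3, 4)`, while `(3, 1)` can.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)               # illustrative 3 x 4 array

flatMins = np.min(grid, axis=1)                  # shape (3,)
columnMins = flatMins[:, np.newaxis]             # shape (3, 1) after adding an axis
keptMins = np.min(grid, axis=1, keepdims=True)   # shape (3, 1) directly

print(flatMins.shape, columnMins.shape, keptMins.shape)
print(grid - keptMins)
```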


Five Jupyter tips
------------------

1. To run a cell, you can hit Ctrl+Enter 
2. Quite often, a Jupyter "mistake" happens if you forget that the notebook has all the memory of every cell that was already run,
3. and only those cells, 
4. and only the last time they were run
5. To go into "command mode", hit Esc; then you can easily navigate from cell to cell

Markdown
--------

These cells that look like fancy text are in Markdown. Markdown cheat sheet: https://www.markdownguide.org/cheat-sheet