## Some fundamental elements of programming I
### Loops

The core of data science is computer programming. To really explore data, we need to be able to write code to 1) wrangle data into a suitable shape for analysis and 2) do the actual analysis and visualization.

If data science didn't involve programming – if it only involved clicking buttons in a statistics program like SPSS – it wouldn't be called data science. In fact, it wouldn't even be a "thing".

What we are going to do in this tutorial is to start looking at some of the core elements of computer programs – the things that, in fact, really make computer programs useful. These elements are

* loops
* conditional tests and Boolean logic
* control flow
* functions

We will explore these things using fairly simple examples (that will also give us practice with indexing). Later, we will see how useful these core elements are when they are combined!

Today, we will look at...

### Loops

A lot of things in life are repetitive. We need to do an entire process over when only one little thing has changed. For example, most of us follow the same exact routine every morning (shower/brush teeth/shave/make up/whatever) even though the only thing that has changed is one little number on a calendar. The same is true for computational tasks; a teacher might need to go through the exact same steps to compute a grade for each student, or a data scientist might need to go through the exact same steps to create a plot for several different but identically structured data sets.

Such repetitive tasks are very boring for humans (and bored humans tend to make mistakes!). While computers can't brush our teeth yet (still waiting for those tartar-eating nanobots!), they can help with repeating calculations over and over using ***loops***.

There are two kinds of loops. There are

* `for` loops, which run a calculation *for* a pre-determined number of times
* `while` loops, which run a calculation *while* some criterion is met

Let's look at these in turn.

##### for loops

The `for` loop will be your workhorse for a lot of tasks. 

Let's look at a very simple `for` loop and then dissect it.

In [None]:
myNewList = [1, 2, 3, 4, 5]
for i in myNewList :
    print(i)

The first line, `myNewList = [1, 2, 3, 4, 5]`, creates a Python list of numbers. The list in Python is a kind of ***iterable***, which is a Python object that will automatically spit out its values one-at-a-time if it's put in a `for` loop.

The next line, `for i in myNewList:`, sets up the for loop. It says that:

* each value in myNewList (the iterable) will be assigned to the variable `i` in turn
* every *indented* line under this line is executed with each value of `i` in turn

The third line is self-explanatory; we are just printing the values of `i` to confirm that `i` is, in fact, getting assigned each value of `myNewList` in turn.

(The use of `i` here is by convention only. You can use anything, like `Phredrick` even, as the name of your looping variable. But, just like having numpy nicknamed np, the use of `i` will generally make your code more readable to others, including future you!)
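Lists are not the only iterables. As a small sketch (the names `myWord`, `myTuple`, and `total` are made up for illustration), a string hands over its characters one at a time and a tuple hands over its items, in exactly the same way a list does:

```python
# A string is an iterable: the loop receives one character at a time
myWord = 'loop'
for letter in myWord:
    print(letter)

# A tuple is an iterable too: the loop receives one item at a time
myTuple = (10, 20, 30)
total = 0
for value in myTuple:
    total = total + value   # add each item to a running total
print(total)
```

Notice that the looping variable is called `letter` and `value` here rather than `i`; when the values aren't simple counters, a descriptive name can be easier to read.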

**Note:** The indentation in the `for` loop is key. Python was designed from the ground up to be a very human-readable programming language. Appropriate indentation helps make code pretty and readable. As such, Python, unlike most other programming languages, enforces its use in certain circumstances, like inside a `for` loop. The indentation tells Python "Yep, this line is inside the `for` loop." and the end of indentation tells Python (and you) "Okay, now we're back outside the `for` loop."

Let's experiment with this by computing the square root of some numbers. This `for` loop should run as expected.

In [None]:
aList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in aList :    
    root = i**0.5  # the double splat, "**", is "raise to the power of"
    print('The square root of ', i, ' is ', root)
print('Now the loop is over.')

Now let's see what this one does:

In [None]:
aList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in aList :    
root = i**0.5
print('The square root of ', i, ' is ', root)
print('Now the loop is over.')

Whoopsie!!!

Even if we try to make our intent clear with blank lines, the indentation rules:

In [None]:
aList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in aList :    
    root = i**0.5
    print('The square root of ', i, ' is ', root)

    print('Now the loop is over.')

And because indentation is important, we can't indent willy-nilly just because we feel like it:

In [None]:
aList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for i in aList :    
    root = i**0.5
print('The square root of ', i, ' is ', root)

    print('Now the loop is over.')

Write yourself a `for` loop to compute the square of the first few even numbers. We'll get you started with a list:

In [None]:
myEvens = [2, 4, 6, 8, 10]


Notice that when you hit return after typing the `for ... :` line, Python indented the next line automatically for you. How nice! But sometimes you'll want to go back and edit a `for` loop, or add lines to one, etc. So...

**Important!** When you have to indent code manually, use ***4 spaces*** to indent! Do not use a tab, do not use 3 spaces, do not use 5 spaces, ***use 4 spaces***. This is one thing that Python can be really mean about.

When you become a master coder, you can experiment with this. But don't come running back to me crying when the Python gods smite you and leave you all alone out in the cold having to pick up the pieces of the tattered shambles of your former life.

###### Ranges
Python has a handy dandy thing called a `range()` that works really well with `for` loops. A `range()` spits out a sequence of numbers perfectly suited to feeding a hungry `for` loop. By default, the range starts at zero and increments by one. Like this:

In [None]:
for i in range(5) :
    print(i)

But you can change this by providing a start and a stop, or even a start, stop and step.

In [None]:
for i in range(2, 11) :
    print(i)

In [None]:
for i in range(2, 9, 2) :
    print(i)
print('Who do we appreciate?')
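A detail that is easy to miss in the examples above: the stop value itself is never produced. One way to check this, sketched here with the built-in `list()` function (the variable names are made up for illustration), is to turn each range into a list so that all of its values are visible at once:

```python
# Wrapping a range in list() shows every value it would hand to a for loop
defaultRange = list(range(5))         # start defaults to 0, step defaults to 1
startStop = list(range(2, 11))        # start at 2, stop before 11
startStopStep = list(range(2, 9, 2))  # start at 2, stop before 9, step by 2

print(defaultRange)
print(startStop)
print(startStopStep)
```

So `range(5)` ends at 4, `range(2, 11)` ends at 10, and `range(2, 9, 2)` ends at 8.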

###### for loops and numpy arrays

One great thing about `for` loops is that we can use them to go through the rows or columns of an array (or both!) in turn, repeating some operation on each one. Let's say we need to put the numbers of the binary sequence in the columns of a 10x5 array for some future simulation.

We could do that this way:

In [None]:
import numpy as np
nRows, nCols = 10, 5   # Python lets us do this!
myArraySize = (nRows, nCols)  # we'll make a 10x5 array. Rows always come first!
anArray = np.zeros(myArraySize)

anArray[:,0] = 2
anArray[:,1] = 4
anArray[:,2] = 8
anArray[:,3] = 16
anArray[:,4] = 32

anArray

That works, no doubt. But 

1. there's a lot of "hand coding", which is prone to mistakes
2. it would be a pain to scale up to huge arrays (as we already know)
3. it's ugly 

Now let's do this a cleaner and much more scalable way using a `for` loop.

In [None]:
import numpy as np
nRows, nCols = 10, 5   # Python lets us do this!
myArraySize = (nRows, nCols)  # we'll make a 10x5 array. Rows always come first!
ourNumbers = [2, 4, 8, 16, 32] # numbers that we'll set each column to
anArray = np.zeros(myArraySize)

for i in range(nCols) :
    anArray[:,i] = ourNumbers[i]
    
anArray

So we've swapped this:

`anArray[:,0] = 2
anArray[:,1] = 4
anArray[:,2] = 8
anArray[:,3] = 16
anArray[:,4] = 32`

(Yuk.)

for this:

`for i in range(nCols) :
    anArray[:,i] = ourNumbers[i]`
    
(Nice.)

which is already a huge improvement. But imagine if we were working with a 1,000- or 10,000-column array! Doing it the first way – well – you can imagine. But doing it the second way, all we would have to do is change `nCols` and be a bit clever and compute `ourNumbers` automatically.

Wait, what? How would we compute the binary sequence – the powers of 2 – automatically?

With a `for` loop of course! Let's do that!

In [None]:
ourNumbers = list() # Make an empty Python list
for i in range(nCols) :
    ourNumbers.append(2**(i+1))
ourNumbers

Now we can write our code in a way that is completely scalable using a single `for` loop:

In [None]:
import numpy as np
nRows, nCols = 10, 5   # Python lets us do this!
myArraySize = (nRows, nCols)  # we'll make a 10x5 array. Rows always come first!
anArray = np.zeros(myArraySize)

for i in range(nCols) :
    anArray[:,i] = 2**(i+1)
    
anArray

Notice that, now, the ***only*** thing we need to change to compute and add more or fewer powers of 2 to our array is a single value – nCols in this case – everything else is done automatically!
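To see that scaling in action, here is the same loop again as a quick sketch, with the column count changed from 5 to 8 (the value 8 is an arbitrary choice for illustration):

```python
import numpy as np

nRows, nCols = 10, 8   # the only change: 8 columns instead of 5
anArray = np.zeros((nRows, nCols))

for i in range(nCols):
    anArray[:, i] = 2**(i + 1)   # column i gets 2 raised to the power i+1

print(anArray[0, :])   # every row is identical, so one row shows the pattern
```

The loop itself is untouched; only the number on the first line moved.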

---

###### Coding challenges!

Write code (using a `for` loop of course) to compute the cube of the odd numbers from 1 to 9. (Remember that `range()` can take a step argument.)

Write scalable code to compute the first "n" numbers of the [Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number). The Fibonacci sequence starts with the numbers 0 and 1, and each number after that is the sum of the previous two numbers. (Galileo, da Vinci, and Franco aren't the only famous Italian scientists/mathematicians!). 

##### while loops

Sometimes we wish to repeat a calculation (or something) until some criterion is reached. For example, perhaps we need 100 samples from the positive half of the standard normal distribution.
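As a minimal sketch of that sampling example (the names `nNeeded`, `samples`, and `draw` are made up here, and the `if` test it uses is covered later in this tutorial), a `while` loop can keep drawing numbers until enough positive ones have been collected:

```python
import numpy as np

nNeeded = 100   # how many positive samples we want
samples = []    # start with an empty Python list

# Keep looping *while* we still have fewer samples than we need
while len(samples) < nNeeded:
    draw = np.random.randn()   # one draw from the standard normal distribution
    if draw > 0:               # keep only draws from the positive half
        samples.append(draw)

print(len(samples))
```

We can't know ahead of time how many draws this will take, since roughly half of them get thrown away. That is what makes a `while` loop a more natural fit here than a `for` loop.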

#### Boo! Logical Tests and Boolean Operators

Believe it or not, everything that happens on your phone or computer comes down to lots (and I mean **LOTS**) of little decisions based on one or two inputs that can be either "True" or "False", and an output that can also be "True" or "False". Seriously, everything on any digital device – from TikTok videos to your Python code – comes down to a whole bunch of truths and falsehoods (ones and zeros) that are themselves the result of decisions based on other truths and falsehoods. The "decision makers" are actual physical (but teeny teeny tiny) devices that are combinations of things called *transistors* that can act as:

* ***comparison*** operators like == (equals) and > (greater than) that take two values and yield "True" or "False"
* ***Boolean*** operators that use ***Boolean logic*** to combine one or two "True" or "False" inputs into a single "True" or "False"

Let's play with this. It might seem a bit silly and obvious now, but the power of logical tests will reveal itself soon.

Let's set a variable `x` to 11. (Why only go to ten when you can go to 11?)

In [None]:
x = 11
x

Now let's do some logical tests on our variable `x`. Let's see if it's less than 42.

In [None]:
x < 42

Now you test if x is greater than 42.

We can also test for equality.

In [None]:
x == 42

In [None]:
x == 11

And we can test for inequality.

In [None]:
x != 42

The exclamation point here means "not", so the expression `x != 42` can be read as "Is x not equal to 42?"

And the answer is "That's `True`! The variable `x` is not equal to 42!"

Now you test `x` to see if it's not equal to 11.
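The tests so far each compare two values. Boolean operators go one step further and combine the results of such tests. Here is a short sketch using Python's `and`, `or`, and `not` keywords with the same `x` (the names `bothTrue`, `eitherTrue`, and `flipped` are made up for illustration):

```python
x = 11

bothTrue = (x > 5) and (x < 42)     # True only if both tests are True
eitherTrue = (x < 5) or (x == 11)   # True if at least one test is True
flipped = not (x == 42)             # not turns False into True (and True into False)

print(bothTrue, eitherTrue, flipped)
```

Try changing `x` and predicting each result before running the cell.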

#### Control Flow

In [None]:
x = 3
if x > 5 :
    print('big!')
else :
    print('small!')

In [None]:
temp = 120
if temp >= 110 :
    print('Too hot!')
elif temp <= 85 :
    print('Too cold!')
else :
    print('Just right!')

In [None]:
import numpy as np
myRnds = np.random.randn(1, 5)
myRnds

In [None]:
myRnds > 0

In [None]:
myRnds[myRnds > 0]

#### Functions

In [None]:
def coinflip() :
    import numpy as np
    flip = np.random.randint(2) # return a random 0 or 1
    if flip == 0 :
        result = 'Tails!'
    else :
        result = 'Heads!'
    return result

coinflip()
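Functions get really handy once they are combined with the other elements. As a rough sketch, assuming a slightly condensed version of `coinflip()` and two made-up names (`nFlips` and `nHeads`), a `for` loop can call the function over and over while an `if` test keeps count of the heads:

```python
import numpy as np

def coinflip():
    flip = np.random.randint(2)   # a random 0 or 1
    if flip == 0:
        return 'Tails!'
    return 'Heads!'

nFlips = 10   # hypothetical number of flips for this illustration
nHeads = 0    # running count of heads

for i in range(nFlips):
    if coinflip() == 'Heads!':
        nHeads = nHeads + 1

print(nHeads, 'heads out of', nFlips, 'flips')
```

The count will change from run to run, since each flip is random.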

### Optional: "List Comprehension" in Python

Python itself has a cute way to make lists called *list comprehension*.

#### range objects

Python has a special kind of object called a "range". It produces, on demand, a range of integers without having to make a list of integers "by hand".

Let's make a range object, look at its type, and look at its contents.

In [None]:
myRange = range(10)

In [None]:
type(myRange)

So, note that it is *not* a list of integers, it is a "*range*". We can see this by looking at its contents:

In [None]:
print(myRange)

So, again, `myRange` is not a list of 0, 1, 2, ..., 9. Rather, it is an object that will produce these integers on demand in a for loop. 

In [None]:
# square roots of the integers in myRange
roots = [] # make empty Python list
for i in myRange:
    this_root = i**0.5  # raising to the power 0.5 takes the square root
    roots.append(this_root)
print(roots)
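This pattern, starting with an empty list and calling `append` inside a `for` loop, is exactly what a list comprehension condenses into one line. Here is a minimal side-by-side sketch (the names `squaresLoop` and `squaresComp` are made up, and squares of the first five integers are used just to keep the output short):

```python
# The familiar pattern: start with an empty list and append inside a for loop
squaresLoop = []
for i in range(5):
    squaresLoop.append(i * i)

# The same result as a list comprehension:
# [expression for variable in iterable]
squaresComp = [i * i for i in range(5)]

print(squaresLoop)
print(squaresComp)
```

Both lines print the same list. The comprehension reads almost like a sentence: "give me `i * i` for each `i` in `range(5)`".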