# QBio REU Intermediate Python 
## Week 1: Review of Python Basics 

Prepared by John Russell (johnrussell@g.harvard.edu)

June 2020

This tutorial will guide you through the basics of Python. For now, scroll through the cells and run them (`shift+enter`) to see what they each do. Feel free to modify or add to them by double-clicking in a cell.

[Jake VanderPlas](https://jakevdp.github.io/) runs a great blog exploring scientific computing in Python. He has also written a free book, [A Whirlwind Tour of Python](https://jakevdp.github.io/WhirlwindTourOfPython), that goes into more detail on the topics introduced in this section.

### Installing packages and getting familiar with Anaconda

If you're on Mac/Linux, open Terminal; if you're on Windows, open Anaconda Prompt.

Type the following commands into the prompt, pressing enter after each line.

`conda create -n reu python=3`

`conda activate reu`

`conda install -c conda-forge jupyter pandas matplotlib`

`conda install -c conda-forge scipy=1.5`

### Hello World

In [31]:
print("Hello World")

Hello World


This is very simple, but print statements are key for debugging.

### Variables

Much like in algebra, Python allows us to assign values to particular symbols and refer to them later. We can also build up new variables by carrying out operations on existing variables. Unlike algebra, however, the values we associate with a variable do not need to be numbers.

In [33]:
3*13

39

In [34]:
x = 7
y = 11
z = x+y
a = "Look! A String!"

In [35]:
print(x)
print(y)
print(z)
print(a)

7
11
18
Look! A String!


In [37]:
print(f"the value of x+z is {x+z}")

the value of x+z is 25
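
The `f` in front of the string above marks it as an f-string: anything inside curly braces is evaluated and inserted into the text. As a small sketch (the variable `ratio` and the `.3f` format are illustrative choices, not part of the original notebook), a format specifier after a colon controls how a number is displayed:

```python
x = 7
y = 11
ratio = x / y  # illustrative variable, not defined elsewhere in this tutorial

# Without a format specifier, the full floating-point value is printed
print(f"x/y is {ratio}")

# ':.3f' rounds the displayed value to three decimal places
print(f"x/y is approximately {ratio:.3f}")
```

This is handy when printing intermediate values while debugging, since long decimals quickly clutter the output.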


### Arithmetic and Comparison


$2 \left(x-3\right)+\frac{7y}{2}$

In [38]:
2*(x-3) + 7*y/2

46.5

$x^3 - 5x^2 -10x +1$

In [39]:
x**3 - 5*x**2 - 10*x + 1

29

#### Equals

In [40]:
x == y

False

#### Not Equals

In [41]:
x != y

True

#### Multiple Comparisons

In [42]:
x>3 and y<=10

False

In [43]:
x>3 or y<10

True

In [44]:
10 < y < 20

True
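
The chained comparison `10 < y < 20` is shorthand for two comparisons joined by `and`. A minimal sketch of that equivalence, using the illustrative names `chained` and `spelled_out`:

```python
y = 11

chained = 10 < y < 20                # Python allows comparisons to be chained
spelled_out = (10 < y) and (y < 20)  # the same test written as two comparisons

print(chained, spelled_out)

# 'not' flips a True/False value
print(not (y == 7))
```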

#### Modular Division

In [45]:
y%x

4
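
The `%` operator returns the remainder of a division. It pairs naturally with floor division (`//`), which returns the whole-number part. A short sketch, where `quotient` and `remainder` are illustrative names:

```python
x = 7
y = 11

quotient = y // x   # floor division: how many whole times x fits into y
remainder = y % x   # modular division: what is left over

print(quotient, remainder)

# The two pieces always reassemble the original number
print(quotient * x + remainder == y)

# A common use of % is testing whether a number is even
print(y % 2 == 0)
```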

### Lists

Sometimes you don't want to create a variable for every number or string you have, especially if all the values are related in some way. To make this more convenient, we can store values in a list. Lists are initialized by square brackets surrounding comma-separated values.

In [46]:
my_list = [12, 6, 21]

Lists are mutable, which in this case means we can `append` new items to them.

In [47]:
my_list.append(8)
print(my_list)

[12, 6, 21, 8]


In [48]:
len(my_list)

4

Lists are ordered, so we can *index* them. Python is zero-indexed, so to get the first element we would call `my_list[0]`. Negative indices count from the end, so `my_list[-1]` is the last element.

We can also change single elements in a list, e.g. with `my_list[2] = 16`.

In [49]:
print(my_list[0])
print(my_list[2])
print(my_list[-1])

12
21
8


In [50]:
print(my_list)

[12, 6, 21, 8]


In [51]:
my_list[2] = 16
print(my_list)

[12, 6, 16, 8]


In [52]:
my_list[1] = my_list[1]+2

In [53]:
print(my_list)

[12, 8, 16, 8]
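
One consequence of mutability that often surprises people is that assigning a list to a second variable does not copy it. The sketch below assumes nothing beyond built-in list behavior; the names `original`, `alias`, and `snapshot` are illustrative.

```python
original = [12, 6, 21]
alias = original            # a second name for the SAME list
snapshot = original.copy()  # an independent copy of the list

original.append(8)

print(alias)     # the appended 8 shows up here too
print(snapshot)  # the copy is unchanged
```

If you want to keep an untouched version of a list before modifying it, make a copy first.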


We can select portions of a list using slicing

In [54]:
print(my_list[1:3]) # from index 1 up to, but not including, index 3
print(my_list[2:]) # from index 2 onwards
print(my_list[:-1]) # from the beginning up to, but not including, the last element

[8, 16]
[16, 8]
[12, 8, 16]
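
Slices also accept an optional third number, the step. A brief sketch using the same list values as above:

```python
my_list = [12, 8, 16, 8]

every_other = my_list[::2]     # start to end, taking every second element
reversed_copy = my_list[::-1]  # a step of -1 walks the list backwards
last_two = my_list[-2:]        # negative indices also work in slices

print(every_other)
print(reversed_copy)
print(last_two)
```

Slicing always returns a new list, so `my_list` itself is left unchanged.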


Lists can also contain strings or anything else. They can even contain multiple different types of things. In many ways this is really convenient, and is part of Python's signature flexibility. It also makes lists slow for large numerical calculations.

In [55]:
names = ["Timmy", "Tommy", "Tammy"]
misc_list = [14, "James", 1.635, [3,4,5]]

Python makes it very easy to check whether items are in lists (and other data structures)

In [56]:
"John" in names

False

In [57]:
14 in misc_list

True

Most importantly, we can create lists of lists, or lists of lists of lists, and so on.

In [58]:
ptriples = [[3,4,5],[5,12,13],[7,24,25],[8,15,17]]

In [59]:
print(ptriples)

[[3, 4, 5], [5, 12, 13], [7, 24, 25], [8, 15, 17]]


In [61]:
ptriples[1][-1]

13

In [62]:
len(ptriples)

4

In [63]:
len(ptriples[0])

3

This list of lists is naively how you would represent a matrix in Python. We'll talk later about a better way.
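
To see why a list of lists is only a rough stand-in for a matrix, compare how rows and columns are accessed. This is a sketch using a `for` loop (introduced in the next section); `second_row` and `last_column` are illustrative names.

```python
ptriples = [[3, 4, 5], [5, 12, 13], [7, 24, 25], [8, 15, 17]]

# A row is simply one of the inner lists
second_row = ptriples[1]

# A column has to be collected one row at a time
last_column = []
for row in ptriples:
    last_column.append(row[2])

print(second_row)
print(last_column)
```

Rows come for free, but columns take extra work.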

### Loops

Loops are a way to do things many times over automatically. The most commonly used is the `for` loop.

In [64]:
for i in range(5):
    print(i**2)

0
1
4
9
16


The loop above is executed once for every `i` in the sequence 0, 1, ..., 4, which is 5 times in total (in general, `range(N)` gives N iterations). As long as we give the `for` loop an iterable, such as a list or an array, we don't need to iterate over consecutive integers.

In [65]:
nums = [3,17,42,111,138]
print("n","n^2")
for n in nums: #Lists are "iterable" we can loop over them without indexing
    print(n, n**2)

n n^2
3 9
17 289
42 1764
111 12321
138 19044


In [66]:
for i in range(len(nums)):
    print(nums[i], nums[i]**2)

3 9
17 289
42 1764
111 12321
138 19044


There are other sorts of loops, but `for` is by far the most common and likely all you need for this course. Another very useful function is `enumerate`.

In [67]:
names = ['Jim',"Bob", "Sue", "Ann"]
for i, name in enumerate(names):
    print(i,name)

0 Jim
1 Bob
2 Sue
3 Ann


So it counts your progress through the list and pulls the item from the list without your needing to index explicitly. 
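
Two related conveniences are sketched below: `enumerate` accepts a `start` argument if you would rather count from 1, and the built-in `zip` walks through two lists in step. The `ages` list holds made-up values purely for illustration.

```python
names = ["Jim", "Bob", "Sue", "Ann"]
ages = [31, 25, 42, 19]  # hypothetical values, for illustration only

# Count from 1 instead of 0
for position, name in enumerate(names, start=1):
    print(position, name)

# Pair up the items of two lists without indexing either one
for name, age in zip(names, ages):
    print(name, age)
```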

In [68]:
ptriples

[[3, 4, 5], [5, 12, 13], [7, 24, 25], [8, 15, 17]]

In [69]:
#Looping over a 2D list
for i in range(4):
    squares = []
    for j in range(3):
        squares.append(ptriples[i][j]**2)
    print(squares[0]+squares[1],squares[2])

25 25
169 169
625 625
289 289


In [71]:
i = 0 
while i**2 < 100:
    print(i)
    i += 1

0
1
2
3
4
5
6
7
8
9
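
The cell above uses a `while` loop, which keeps running for as long as its condition is true. It is useful when you do not know in advance how many iterations you need. As a sketch (the halving example and the names `value` and `steps` are illustrative), here is a loop that counts how many times 100 can be halved before it drops below 1:

```python
value = 100
steps = 0

while value >= 1:      # checked before every pass through the loop
    value = value / 2  # something inside the loop must eventually make the condition false
    steps += 1

print(steps, value)
```

If nothing inside the loop changes the condition, the loop never ends, so always check that it can terminate.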


### Conditionals



So far, we have just shown that Python is a calculator. In fact, it is capable of much more. A key to performing more complicated calculations is conditional statements.

In [75]:
v = 10
if v>10:
    print("v is greater than ten")
elif v==10:
    print("v equals ten")
else:
    print("v is smaller than ten")

v equals ten


To see the other behavior of the `if` statement, change the value of `v` in the cell above. Below is an example of a slightly more complicated use of `if` statements.

In [77]:
w = 3.5
R = 3
if w>= -R and w <= R:
    print((R**2 - w**2)**0.5)
else:
    print("|w| is too large. Function not defined.")

|w| is too large. Function not defined.


A final example:

In [78]:
for i, trip in enumerate(ptriples):
    if i%2==0:
        print(trip)

[3, 4, 5]
[7, 24, 25]


In [79]:
for i, trip in enumerate(ptriples):
    if 5 in trip:
        print(trip)

[3, 4, 5]
[5, 12, 13]
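
Combining a loop, a conditional, and `append` gives a common pattern: filtering a list. A minimal sketch, reusing the `nums` values from the Loops section and the illustrative name `evens`:

```python
nums = [3, 17, 42, 111, 138]

evens = []
for n in nums:
    if n % 2 == 0:       # keep only numbers with no remainder when divided by 2
        evens.append(n)

print(evens)
```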


### Dictionaries


Dictionaries are a different way of storing information. Instead of indexing them numerically, each element contains a *key* and an associated *value*. We then look up the key to get back its value. As an example, let's say we want to store some constants from physics. We could make the following dictionary:

In [80]:
physics_constants = {'c':3e8, 'g':9.8, 'e':1.6e-19, 'eps0':8.85e-12}

Then to retrieve the value of the charge of a proton we would call

In [81]:
physics_constants['e']

1.6e-19

We can overwrite an existing value by assigning to its key, for example to store a more precise speed of light:

In [83]:
physics_constants['c'] = 2.998e8

We can add some more constants from quantum mechanics in the same way:

In [84]:
physics_constants['h'] = 6.626e-34
physics_constants['alpha'] = 1./137
print(physics_constants)

{'c': 299800000.0, 'g': 9.8, 'e': 1.6e-19, 'eps0': 8.85e-12, 'h': 6.626e-34, 'alpha': 0.0072992700729927005}


If you need to look up values based on keys, a dictionary is the appropriate structure. It is faster to look things up in a dictionary than to search for them in an array or list. Also keep in mind that keys must be immutable objects, most commonly strings or integers, but the values can be anything: numbers, strings, lists, arrays, even other dictionaries.

In [86]:
for k in physics_constants:
    print(k, physics_constants[k])

c 299800000.0
g 9.8
e 1.6e-19
eps0 8.85e-12
h 6.626e-34
alpha 0.0072992700729927005


In [87]:
len(physics_constants)

6
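
A few more dictionary tools are worth knowing. The sketch below uses a shortened version of the dictionary above and a hypothetical key `'k_B'` that has deliberately not been stored, to show how to handle missing keys.

```python
physics_constants = {'c': 2.998e8, 'g': 9.8, 'h': 6.626e-34}

# Square brackets raise a KeyError for a missing key, so check with `in` first ...
if 'k_B' in physics_constants:
    print(physics_constants['k_B'])
else:
    print("k_B is not stored yet")

# ... or use .get(), which returns None (or a default you choose) instead of failing
boltzmann = physics_constants.get('k_B')
print(boltzmann)

# .items() yields each key together with its value
for name, value in physics_constants.items():
    print(name, value)
```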

### Functions

In math we know that a function is roughly an object that takes a number or numbers in and spits out a different number or numbers. One such function is:

$$f(x) = \dfrac{x^3 - 2(x-3)^2}{1+x^2}$$ 

Python functions can act just like these mathematical functions. I'll define this one below.

In [88]:
def f(x): #def name_of_function(argument):
    #Indent once - write as many lines as you want
    num = x**3 - 2*(x-3)**2
    denom = 1+x**2
    return num/denom #return function_output

In [97]:
def g(x,y,z):
    return 3*x+2*y-z

In [99]:
g(0,2,5)

-1

In [89]:
f(0)

-18.0

In [90]:
f(5.23)

4.694753164579285

In [93]:
f(x)

6.22

Python functions, in many ways, are more general than mathematical functions. For one, they can take any Python object or combination of objects as input and return any Python object(s). Let's make such a function now. The function below takes a list of strings and returns the longest one.

In [94]:
def longest_word(word_list):
    longest = '' # an empty string is a useful starting value
    for word in word_list:
        if len(word)>len(longest):
            longest = word
    return longest

In [95]:
my_words = ["screen", 'phone', 'keyboard']

In [96]:
longest_word(my_words)

'keyboard'
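
Two further features of Python functions are sketched below: a function can return more than one value, and an argument can be given a default. The functions `min_and_max` and `power` are hypothetical examples written for this illustration.

```python
def min_and_max(values):
    """Return the smallest and largest entries of a non-empty list."""
    smallest = values[0]
    largest = values[0]
    for v in values:
        if v < smallest:
            smallest = v
        if v > largest:
            largest = v
    return smallest, largest  # two values come back together as a tuple


def power(base, exponent=2):
    """Raise base to exponent; exponent defaults to 2 if not given."""
    return base**exponent


low, high = min_and_max([3, 17, 42, 111, 138])
print(low, high)

print(power(5))              # uses the default exponent
print(power(5, exponent=3))  # overrides the default
```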

To reiterate, the key parts of a function definition are:
- `def` statement
- function name
- arguments
- indentation
- return

Finally, I'd emphasize that if you find yourself copying code more than three times, you should turn that code into a function.
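
As an illustration of replacing copied code with a function, compare the two versions below. The helper name `is_pythagorean` is hypothetical; the triples are the ones used earlier in this tutorial.

```python
# Repetitive version: the same check typed out by hand for each triple
print(3**2 + 4**2 == 5**2)
print(5**2 + 12**2 == 13**2)
print(7**2 + 24**2 == 25**2)


# Function version: the logic lives in one place and is easy to reuse or fix
def is_pythagorean(triple):  # hypothetical helper name
    a, b, c = triple  # unpack the three side lengths
    return a**2 + b**2 == c**2


ptriples = [[3, 4, 5], [5, 12, 13], [7, 24, 25], [8, 15, 17]]
for trip in ptriples:
    print(trip, is_pythagorean(trip))
```

If the formula ever needs to change, the function version only has to be edited once.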

### Fixing Errors
No matter how skilled you are at programming you will inevitably make mistakes in your code. Here we discuss a few common mistakes and how to identify and fix them.

In [103]:
L = [3,4,16,23]
L[-1]

23

In [101]:
L

[3, 4, 16, 23]

In [105]:
for x in L:
    print(x**3-2*(x-3))

27
62
4070
12127


In [109]:
[3] + ptriples[2]

[3, 7, 24, 25]

In [112]:
ptriples[1][2]

13
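
Three mistakes that come up often with the kinds of lists used above are asking for an index that does not exist, using a name before defining it, and combining incompatible types. The sketch below triggers each one on purpose. The `try`/`except` syntax is not covered in this tutorial and is used here only so that all three messages can be printed in one cell; the names `caught` and `undefined_variable` are illustrative.

```python
L = [3, 4, 16, 23]
caught = []  # records the type of each error we trigger

# IndexError: the list has 4 elements, so the valid indices are 0 to 3
try:
    L[4]
except IndexError as err:
    caught.append(type(err).__name__)
    print("IndexError:", err)

# NameError: using a variable that was never defined (or a cell that was never run)
try:
    print(undefined_variable)
except NameError as err:
    caught.append(type(err).__name__)
    print("NameError:", err)

# TypeError: a list can be joined to another list, but not to an integer
try:
    L + 3
except TypeError as err:
    caught.append(type(err).__name__)
    print("TypeError:", err)

print([3] + L)  # the fix for the last case: wrap the number in a list
```

When an error appears in a notebook, the last line of the message names the error type and usually says what went wrong, so read it first.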

### Jupyter Tips and Tricks

Jupyter notebooks have become the go-to environment for writing Python code for data scientists (and subsequently for courses). They allow you to nicely present your code alongside descriptions and discussions. The cells that contain text are written in Markdown. If you double-click on any of the Markdown cells in this notebook, you can see some of the basic commands. A straightforward guide to using Markdown can be found [here](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet). If you want to learn some fancier tricks and get introduced to some handy keyboard shortcuts, check out [this article](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/).

A few important shortcuts: 
- `esc + m` turns a code cell into a Markdown cell and `esc + y` turns a Markdown cell into a code cell. 
- To insert a new cell above the current one, press `esc + a`. To insert below, `esc + b`. 
- To delete a cell press `esc + d + d`. 
- If you are running a cell and want it to stop, press `esc + i + i`. 

A great feature of Markdown cells is that they can typeset $\LaTeX$, the standard typesetting system for math and science. LaTeX has its own syntax that you can look up online, but the basic idea is this: putting an expression between two `$`s tells Jupyter to render it as LaTeX.

` $ \int_{-\infty}^\infty e^{-x^2} dx = \sqrt{\pi}$`

Will appear as

$ \int_{-\infty}^\infty e^{-x^2} dx = \sqrt{\pi}$

Using `$$` renders in "display mode", i.e. centered and slightly larger.

$$ \int_{-\infty}^\infty e^{-x^2} dx = \sqrt{\pi}$$


There is lots to be done in Jupyter but for now, just consider using the different headers to label your work and including a cell after a problem to discuss the results if that is required.

Depending on the exact nature of your research, using notebooks in this way can be very helpful for you and for people who want to look at your work. 

## For next week

- Give the exercises a shot.
- If you want to prepare, take a look at the [Intro to numpy](https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html) from the Python Data Science Handbook. We will go through pretty much everything covered there next week.
- You could also take a look at the first two sections of the [Intro to Matplotlib](https://jakevdp.github.io/PythonDataScienceHandbook/04.00-introduction-to-matplotlib.html), which we will begin to cover next week but which may be helpful for your research in the meantime.