# Discussion 0: Intro to Python

This is an extra discussion notebook for people who would like an introduction to, or a refresher on, the basics of Python programming. If you feel comfortable with basic programming principles and Python syntax, feel free to jump right to Discussion 1.

## 1) Jupyter notebooks
This file is called a Jupyter notebook. A notebook is a place to write programs and view their results. In this class, we are going to be running Jupyter Notebooks in Google Colab, which is a platform that allows for editing of Jupyter Notebooks without any offline setup (kind of like Google Drive but for Jupyter Notebooks).

### Text cells
In a notebook, each rectangle containing text or code is called a *cell*.

Text cells (like this one) can be edited by double-clicking on them. They're written in a simple format called [Markdown](http://daringfireball.net/projects/markdown/syntax) to add formatting and section headings.  You don't need to learn Markdown, but you might want to.

After you edit a text cell, simply click outside the cell to confirm any changes. (Try not to delete any instructions already in the notebook.)

**Exercise**: This paragraph is in its own text cell.  Try editing it so that this sentence is the last sentence in the paragraph.  This sentence, for example, should be deleted.  So should this one.

### Code cells
Other cells contain code in the Python language. Running a code cell will execute all of the code it contains.

To run the code in a code cell, first click on that cell to activate it.  It'll be highlighted with a little green or blue rectangle.  Next, either press the "run cell" button on the left side of the cell, or hold down the `control` key and press `return` or `enter`.

Try running this cell below:

In [None]:
print("Hello, World!")

The fundamental building block of Python code is an expression. Cells can contain multiple lines with multiple expressions. When you run a cell, the lines of code are executed in the order in which they appear. Every `print` expression prints a line. Run the next cell and notice the order of the output.

In [None]:
print("First this line is printed,")
print("and then this one.")

## Writing Jupyter notebooks
You can use Jupyter notebooks for your own projects or documents.  When you make your own notebook, you'll need to create your own cells for text and code.

To add a cell, move your cursor below any current cell, and click (+ Code) for a code cell or (+ Text) for a text cell.

**Exercise**: Add a code cell below this one.  Write code in it that prints out:
   
    A whole new cell!

Run your cell to verify that it works.

## Errors
Python is a language, and like natural human languages, it has rules.  It differs from natural language in two important ways:
1. The rules are *simple*.  You can learn most of them in a few weeks and gain reasonable proficiency with the language in a semester.
2. The rules are *rigid*.  If you're proficient in a natural language, you can understand a non-proficient speaker, glossing over small mistakes.  A computer running Python code is not smart enough to do that.

Whenever you write code, you'll make mistakes.  When you run a code cell that has errors, Python will sometimes produce error messages to tell you what you did wrong.

Errors are okay; even experienced programmers make many errors.  When you make an error, you just have to find the source of the problem, fix it, and move on.

We have made an error in the next cell.  Run it and see what happens.

In [None]:
print("This line is missing something."

You should see something like this (minus our annotations):

<img src="error.jpeg"/>

The last line of the error output attempts to tell you what went wrong.  The *syntax* of a language is its structure, and this `SyntaxError` tells you that you have created an illegal structure.  "`EOF`" means "end of file," so the message is saying Python expected you to write something more (in this case, a right parenthesis) before finishing the cell.

There's a lot of terminology in programming languages, but you don't need to know it all in order to program effectively. If you see a cryptic message like this, you can often get by without deciphering it.  (Of course, if you're frustrated, ask a neighbor or a TA for help.)

Try to fix the code above so that you can run the cell and see the intended message instead of an error.

# 2) Numbers

Quantitative information arises everywhere in data science. In addition to representing commands to print out lines, expressions can represent numbers and methods of combining numbers. The expression `3.2500` evaluates to the number 3.25. (Run the cell and see.)

In [None]:
3.2500

Notice that we didn't have to `print`. When you run a notebook cell, if the last line has a value, then Jupyter helpfully prints out that value for you. However, it won't print out prior lines automatically.

In [None]:
print(2)
3
4

Above, you should see that 4 is the value of the last expression, 2 is printed, but 3 is lost forever because it was neither printed nor last.

You don't want to print everything all the time anyway.  But if you feel sorry for 3, change the cell above to print it.

## Arithmetic
The line in the next cell subtracts.  Its value is what you'd expect.  Run it.

In [None]:
3.25 - 1.5

Many basic arithmetic operations are built into Python. The common operator that differs from typical math notation is `**`, which raises one number to the power of the other. So, `2**3` stands for $2^3$ and evaluates to 8.
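
As a quick illustration of the built-in operators, here is a minimal sketch. The variable names are our own and are not used elsewhere in this notebook. Note that `/` always produces a number with a decimal point, even when the division is exact.

```python
# Each line uses one built-in arithmetic operator.
total = 3 + 4             # addition
difference = 3.25 - 1.5   # subtraction
product = 6 * 5           # multiplication
quotient = 8 / 4          # division gives 2.0, not 2
power = 2 ** 3            # exponentiation: 2 to the power of 3

print(total, difference, product, quotient, power)
```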

The order of operations is what you learned in elementary school, and Python also has parentheses.  For example, compare the outputs of the cells below. Use parentheses for a happy new year!

In [None]:
1+6*5-6*3**2*2**3/4*7

In [None]:
1+(6*5-(6*3))**2*((2**3)/4*7)

In standard math notation, the first expression is

$$1 + 6 \times 5 - 6 \times 3^2 \times \frac{2^3}{4} \times 7,$$

while the second expression is

$$1 + (6 \times 5 - (6 \times 3))^2 \times (\frac{(2^3)}{4} \times 7).$$

**Exercise**: Write a Python expression in this next cell that's equal to $5 \times (3 \frac{10}{11}) - 50 \frac{1}{3} + 2^{.5 \times 22} - \frac{7}{33}$.  That's five times three and ten elevenths, minus fifty and a third, plus two to the power of half 22, minus 7 33rds.  By "$3 \frac{10}{11}$" we mean $3+\frac{10}{11}$, not $3 \times \frac{10}{11}$.

Replace the ellipses (`...`) with your expression.  Try to use parentheses only when necessary.

*Hint:* The correct output should start with a familiar number.

In [None]:
...

# 3) Variables
An effective strategy for writing code is to define names for data as we compute it.

In Python, we do this with *variables*. To assign a value to a variable, we put the variable name on the left side of an `=` sign and an expression to be evaluated on the right.

In [None]:
ten = 3 * 2 + 4

When you run that cell, Python first evaluates the first line.  It computes the value of the expression `3 * 2 + 4`, which is the number 10.  Then it gives that value the name `ten`.  At that point, the code in the cell is done running.

After you run that cell, the value 10 is bound to the name `ten`:

In [None]:
ten

The statement `ten = 3 * 2 + 4` is not asserting that `ten` is already equal to `3 * 2 + 4`, as we might expect by analogy with math notation.  Rather, that line of code changes what `ten` means; it now refers to the value 10, whereas before it meant nothing at all.

If the designers of Python had been ruthlessly pedantic, they might have made us write

    define the name ten to hereafter have the value of 3 * 2 + 4 

instead.  You will probably appreciate the brevity of "`=`"!  But keep in mind that this is the real meaning.
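
To see that `=` rebinds a name rather than stating an equation, consider this minimal sketch. The name `count` is our own, chosen only for illustration.

```python
count = 3 * 2 + 4   # count now refers to the value 10
print(count)

# The right side is evaluated first, using the old value of count.
# Then the name count is rebound to the result.
count = count + 1
print(count)
```

As a math equation, `count = count + 1` would make no sense. As an assignment, it simply gives `count` a new value.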

**Exercise**: Try writing code that uses a name (like `eleven`) that hasn't been assigned to anything.  You'll see an error!

A common pattern in Jupyter notebooks is to assign a value to a variable and then immediately evaluate the name in the last line in the cell so that the value is displayed as output. 

In [None]:
close_to_pi = 355/113
close_to_pi

Another common pattern is that a series of lines in a single cell will build up a complex computation in stages, naming the intermediate results.

In [None]:
bimonthly_salary = 840
monthly_salary = 2 * bimonthly_salary
number_of_months_in_a_year = 12
yearly_salary = number_of_months_in_a_year * monthly_salary
yearly_salary

Variable names in Python can contain letters (upper- and lower-case letters are both okay and count as different letters), underscores, and numbers.  The first character can't be a number (otherwise a name might look like a number).  And names can't contain spaces, since spaces are used to separate pieces of code from each other.

Other than those rules, what you name something doesn't matter *to Python*.  For example, this cell does the same thing as the above cell, except everything has a different name:

In [None]:
a = 840
b = 2 * a
c = 12
d = c * b
d

In [None]:
# Change the next line so that it computes the number of
# seconds in a decade and assigns that number the name
# seconds_in_a_decade.
seconds_in_a_decade = ...

# We've put this line in this cell so that it will print
# the value you've given to seconds_in_a_decade when you
# run it.  You don't need to change this.
seconds_in_a_decade

### Comments
Notice the lines in the above cell that begin with a #.

Each of those lines is a *comment*.  It doesn't make anything happen in Python; Python ignores anything on a line after a #.  Instead, it's there to communicate something about the code to you, the human reader.  Comments are extremely useful.
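
A comment can also follow code on the same line. The short sketch below shows both placements; the name `days_per_week` is our own example.

```python
# This whole line is a comment, so Python skips it.
days_per_week = 7  # Python runs the assignment and ignores this note.

print(days_per_week)
```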

### Data types

Variables can hold values of different types, called *data types*. Four basic data types in Python are:

- int: an integer.
- float: a number with a decimal point (a floating-point approximation of a real number).
- string: a piece of text.
- boolean: a True/False value.

Below are examples of ints, floats, strings, and booleans:

In [None]:
# an integer
x = 5

# a float
x = 0.25

# a string
x = "Hello there!"

# a boolean
x = True
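
If you are unsure what type a value has, you can ask Python with the built-in `type` function (it appears again in the Functions section). A minimal sketch, with variable names of our own choosing:

```python
int_type = type(5)
float_type = type(0.25)
str_type = type("Hello there!")
bool_type = type(True)

# Python's own names for these types are int, float, str, and bool.
print(int_type)
print(float_type)
print(str_type)
print(bool_type)
```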

# 4) Functions

The most common way to combine or manipulate values in Python is by calling *functions*. Python comes with many built-in functions that perform common operations. Some examples are below:

In [2]:
# Absolute value
print(abs(-3))

# Maximum and minimum value
print(max(1, 3, -2, 5, -10))
print(min(1, 3, -2, 5, -10))

# Get the type of a variable
print(type(5))


3
5
-10
<class 'int'>


It's also often very helpful to write your own functions. Python allows you to do this using the `def` keyword:

In [3]:
def add(x, y):
    return x + y

print(add(3, 4))

7


Defining a function starts with the `def` keyword followed by the function name (here `add`), which is used to call the function later. After the name, in parentheses, come the function's *parameters*, which tell Python what inputs the function takes; the values passed in when the function is called are its *arguments*. The `def` line ends with a colon, and every line of the function body below it is indented. Finally, the *return statement* specifies the value the function gives back; here it is the last line of the function.
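
The parts of a function definition can be labeled in a small sketch. The function `rectangle_area` and its parameter names are hypothetical, chosen only to illustrate the pieces described above.

```python
# "def" starts the definition; rectangle_area is the function name;
# width and height are the parameters; the colon ends the first line.
def rectangle_area(width, height):
    # Indented lines form the body of the function.
    area = width * height
    # The return statement hands the result back to the caller.
    return area

# Calling the function with the arguments 3 and 4.
result = rectangle_area(3, 4)
print(result)
```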

**Exercise**: Write a function in a new code cell below that takes two numbers, $x$ and $y$, and returns $\frac{x^2 + y^2}{2}$. Print the value that your function returns for the following inputs: $(x, y) = (1, 3), (x, y) = (-2, 4)$. 

Functions can also have default arguments. Consider the code below:

In [4]:
def add(x, y=5):
    return x + y

print(add(3, 4))
print(add(3))

7
8


In this version of `add`, `y` has a default value of 5, while `x` has no default. Calling `add` without a value for `x` raises an error, but we may omit `y`, in which case it takes its default value of 5.
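
Arguments can also be passed by name, which makes it clear which parameter each value is meant for. Here is a minimal sketch that reuses the `add` function from above; the result variable names are our own.

```python
def add(x, y=5):
    return x + y

by_position = add(3, 10)    # x is 3, y is 10
by_name = add(3, y=10)      # same call, naming y explicitly
with_default = add(x=1)     # y is omitted, so it defaults to 5

print(by_position, by_name, with_default)

# add() with no arguments would raise a TypeError,
# because x has no default value.
```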

# 5) If/Else statements

A very common concept in programming is that of *conditional statements*, which allow the code to execute different sections based on the value of a boolean. Below is an example:

In [5]:
def add_if_greater(x, cutoff=5):
    if x > cutoff:
        return x + 1
    else:
        return x

print(add_if_greater(6))
print(add_if_greater(4))

7
4


We see that the above code only adds 1 to `x` if `x > cutoff`, allowing for conditional execution.

### Else If

The above code has only two possible branches. If there are more than two options, we can add additional "else if" statements, which are written as `elif` in Python:

In [13]:
def add_if_between(x, cutoff1=5, cutoff2=10):
    if x < cutoff1:
        return x - 1
    elif x > cutoff2:
        return x - 2
    else:
        return x + 1
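
To see which branch runs for a given input, we can call `add_if_between` with a value below `cutoff1`, a value above `cutoff2`, and values in between. The sketch below repeats the definition so that it runs on its own; the result variable names are ours.

```python
def add_if_between(x, cutoff1=5, cutoff2=10):
    if x < cutoff1:
        return x - 1
    elif x > cutoff2:
        return x - 2
    else:
        return x + 1

below = add_if_between(3)      # 3 < 5, so the "if" branch runs
above = add_if_between(12)     # 12 > 10, so the "elif" branch runs
between = add_if_between(7)    # neither test is true, so "else" runs
on_cutoff = add_if_between(5)  # 5 < 5 is False, so "else" runs here too

print(below, above, between, on_cutoff)
```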

### Booleans from variables

You've now seen that in if/else statements, we very often generate booleans from variables. Python has built-in comparison operators for the most common operations that yield booleans:

In [12]:
x = 5

# less than
print(x < 5)

# less than or equal
print(x <= 5)

# equal
print(x == 5)

# not equal
print(x != 5)

False
True
True
False


### Boolean logic

Python allows for the following ways to combine booleans:

In [11]:
x = 5

# Logical AND
print(x <= 6 and x >= 4)

# Logical OR
print(x >= 4 or x <= 3)

# Logical negation
print(not (x >= 4))

True
True
False
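
Combined booleans are most often used as the condition of an `if` statement. The following is a small sketch; the function `describe` and its cutoffs are hypothetical.

```python
def describe(x, low=4, high=6):
    # Both comparisons must be True for the "and" to be True.
    if x >= low and x <= high:
        return "inside"
    # Otherwise x is outside the range on one side or the other.
    else:
        return "outside"

inside_result = describe(5)
outside_result = describe(9)
print(inside_result, outside_result)
```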


# 6) Lists and Dictionaries

In Section 3 we talked about some very common and simple data types in Python. Python also has some more advanced data types that are very useful. Lists and dictionaries are two built-in data types in Python that are very common. In short, a list defines an ordered collection of values, and a dictionary defines a mapping from keys to values.

### Lists

Below is an example of a typical list in Python.

In [2]:
l = [1, 4, 6, 8, 10]

The values in a list can be of any data type! Python has several built-in functions and methods for working with lists:

In [3]:
# Length of list
print(len(l))

# Get element from list by index value
print(l[0])

# add element to list
l.append(14)
print(l)

5
1
[1, 4, 6, 8, 10, 14]


### Slice Notation

One very convenient way to access elements in a list in Python is using *slicing*. A slice is defined using the syntax `start:end:skip`, where

- `start`: the desired starting index in the list (inclusive)
- `end`: the desired ending index in the list (exclusive)
- `skip`: (optional) the desired step through the sequence, e.g. a skip of 2 means taking every other index.

When `start` is omitted, the slice begins at the first element; when `end` is omitted, it runs through the last element. For example, the slice `2:` means index 2 and everything after it. Indices can also be negative, in which case they count from the end of the list. For example, `-3:` means the last 3 indices. Some examples are below.

In [5]:
l = [1, 4, 6, 8, 10]

# Get everything but the first element
print(l[1:])

# Get everything except the last element
print(l[:-1])

# Get only the odd-index elements
print(l[1::2])

[4, 6, 8, 10]
[1, 4, 6, 8]
[4, 8]
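
A few more slices on the same list, covering the `2:` and `-3:` cases mentioned in the text. This is a minimal sketch and the result variable names are our own. Remember that indices start at 0, so index 2 is the third element.

```python
l = [1, 4, 6, 8, 10]

from_index_two = l[2:]   # index 2 through the end
last_three = l[-3:]      # the last three elements
middle = l[1:4]          # indices 1, 2, 3 (index 4 is excluded)
every_other = l[::2]     # indices 0, 2, 4

print(from_index_two)
print(last_three)
print(middle)
print(every_other)
```

For this five-element list, `l[2:]` and `l[-3:]` happen to select the same elements.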


Lastly, it is worth noting that lists (and other data types in Python) are stored using references: a variable holds a reference to a list rather than its own copy of the list. Consider the following example:

In [2]:
x = [1, 2, 3]
y = x
y[1] = 0
print(x)
print(y)

[1, 0, 3]
[1, 0, 3]


Notice that when we run `y = x`, `y` does not store a new list with the same values as `x`, but actually just stores a reference to the same underlying list. Thus, when we make a change through `y`, the change is also visible through `x`. If you ever want to avoid such behavior, you can use `.copy()` like this:

In [3]:
x = [1, 2, 3]
y = x.copy()
y[1] = 0
print(x)
print(y)

[1, 2, 3]
[1, 0, 3]
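
One way to check whether two names refer to the same list is the `is` operator, which this notebook has not used so far. A minimal sketch, with variable names of our own:

```python
x = [1, 2, 3]
y = x           # y refers to the same list as x
z = x.copy()    # z refers to a new list with the same values

same_object = y is x      # True: one list, two names
copy_is_same = z is x     # False: the copy is a separate list
copy_is_equal = z == x    # True: the values are still equal

print(same_object, copy_is_same, copy_is_equal)
```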


### Dictionaries

Dictionaries are a very common and convenient data type in Python. A dictionary defines a mapping of keys to values. Dictionaries are defined using `{}`. Below is an example of some of the functionality of dictionaries.

In [6]:
# Defining an empty dictionary
d = {}
print(d)

# Defining a dictionary with elements
d = {
    "A": 1,
    "B": 2,
    "C": 3,
}

# Accessing dict values
print(d["A"])

# Adding elements
d["D"] = 4
print(d)

{}
1
{'A': 1, 'B': 2, 'C': 3, 'D': 4}


The benefit of using a dictionary is that the values in the dictionary can be found very quickly if we know the corresponding key.
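
Looking up a key that is not in the dictionary with `d[key]` raises a `KeyError`. The sketch below shows two standard ways to handle that, the `in` operator and the `get` method; the variable names and the fallback value 0 are our own choices.

```python
d = {"A": 1, "B": 2, "C": 3}

has_a = "A" in d           # True: "A" is a key
has_z = "Z" in d           # False: "Z" is not a key

value_a = d.get("A")       # same as d["A"] when the key exists
value_z = d.get("Z", 0)    # returns the fallback 0 instead of an error

print(has_a, has_z, value_a, value_z)
```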

# 7) For Loops

Often we want to iterate through the elements of a list, dictionary, or other data structure. The most common way to do this in Python is with a for loop. The basic syntax of a for loop is `for element in data`, where `data` is some iterable data structure and `element` stores the value of an individual data point in `data` as the code iterates through `data`.
Some examples of for loops are below.

In [9]:
# Iterating through a list
l = [1, 4, 6, 8, 10]
for x in l:
    print(x)

# Iterating through a dictionary
d = {
    "A": 1,
    "B": 2,
    "C": 3,
}
for k, v in d.items():
    print(k, v)

# For loop using range
for i in range(5):
    print(i**2)

1
4
6
8
10
A 1
B 2
C 3
0
1
4
9
16
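
For loops are often used to build up a result step by step. The sketch below adds up the elements of a list and collects squares into a new list; the names `total` and `squares` are our own.

```python
l = [1, 4, 6, 8, 10]

# Add up the elements one at a time.
total = 0
for x in l:
    total = total + x
print(total)

# Build a new list of squares using range and append.
squares = []
for i in range(5):
    squares.append(i**2)
print(squares)
```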
