# Introduction to Python and Jupyter

This follow-along notebook introduces basic Python syntax and constructs to familiarize you with the language. Along the way, it also shows how to work with Jupyter notebooks.

## Jupyter basics: commands at the top of the notebook

Jupyter notebooks follow a modular cell format. You can create and delete cells, and also set the purpose of each cell. For example, this cell and the previous cell are in **Markdown** format, which is a language that allows people to format their text. To change a cell between Python and Markdown, click on the drop-down menu right next to the fast forward button above.

In **Markdown** cells, you'll notice that there is no ```In [ ]:``` on the left of the cell. This is because they do not contain code to run, but rather have more lengthy blocks of text like this description. You'll know if a block is Python if there is ```In [ ]:``` on the left.

In [1]:
# This is a Python block now. Try clicking the run button on the top of this page
# or type Shift + Enter to run this cell.
2 + 2

4

Notice how when you ran the previous cell, ```In [ ]:``` became ```In [1]:```. This is a handy way to keep track of the order in which you run your cells. While ideally we run everything in sequential order, we still have the freedom to modify the cells and run them in whatever order we please. The numbering of the cells can help you debug issues related to variables.

Another observation is that after running a cell of Python code, there will be a corresponding ```Out[]:``` cell. Jupyter is great in that it also modularizes the output of all of your code - ```Out[1]:``` will only show output from the ```In [1]:``` cell!

In [2]:
# Try running the below code and look at the next cell.
while True:
    pass

KeyboardInterrupt: 

Notice how on the left, ```In[*]``` doesn't become ```In[2]```. This is because the cell contains a simple infinite loop, so it will never finish. Its purpose is for you to try out the square button next to **Run** on the top of the notebook, which cancels any cells that are still processing.

Next to the square, we have the restart and the fast forward buttons. The restart button erases all variables from the kernel, and the fast forward button restarts the kernel and then reruns each cell. These are great for making sure that all of your steps are in the right order and that your notebook works from start to finish.

Finally, we have the cut, copy, and paste buttons to the left of the **Run** button. The arrow buttons also move the cells up and down. These act directly on the cells and give you the freedom to manipulate cells pretty easily. To insert a cell, click the plus button next to the save button on the very left. To delete a cell, select the cell of interest by clicking the `In [ ]: ` section (it should be highlighted in blue) and typing `dd`. You can click the **Help** button and go down to **Keyboard Shortcuts** to see what shortcuts are available. You can also edit and create new shortcuts to your own liking.

# Python Tutorial

Now that we have a sense of how Jupyter works, let's proceed with a basic introduction to Python. Python is an interpreted language, which means that, unlike with compiled languages (e.g. C++, Java), we do not need to compile code before running it. All we need to do is open a Python interpreter (e.g. the blocks of code with `In []:`), type code, and immediately see the result. Thus, we can test and learn Python syntax relatively quickly.

## Print statement

The first thing to go over is the `print` function. Python's is very straightforward: all you type is `print()`, with the value that you want to print in the parentheses. `print` works on many base data types, and if you put an expression in the parentheses, it evaluates the expression and prints the result.

In [None]:
print("Hello world!")
print("40.0")
print(1 + 2)

In programming, all values have a type. Most programming languages share common types, e.g. int, float, string. In Python, the list is another very common type of data. The `type()` function will tell you the type of the value in the parentheses.

In [None]:
print(type(8))
print(type(42.1))
print(type("hi"))
print(type([]))

## Variables
A variable is a container that stores some value. We use variables to contextualize all of the values that we store. Python is dynamically typed, which means that we don't need to specify the type of a variable when we create it. This is great because we can easily move between data types if our program calls for it. The syntax for defining a variable is: `variable = value`.

In [None]:
x = "string"
print(x)

x = 5
print(x)

# defining a list - lists can have multiple types of data in them!
y = [1, "hi", 81.9, x]
print(y)

In [None]:
# Exercise: Define a variable z that is equal to x times 2 + 1, and then print the value of z.


Indexing for lists is much like R, except we start at 0. We use the following syntax to access the element at index i of the list: `list_name[i]`. Python also has slicing, where you can access multiple elements of a list with `list_name[i:j]`. Slicing starts at index i and excludes the element at index j. Finally, a really cool trick to read from the back of the list is to use negative indexing. `list_name[-i]` will access the i'th element counting from the back of the list, so `list_name[-1]` is the last element.
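Before trying the exercises, here is a small sketch of the three access patterns described above. It uses a made-up list called `letters` (not defined anywhere else in this notebook) so that the exercises on `y` are left for you.

```python
# `letters` is a hypothetical example list used only in this sketch
letters = ["a", "b", "c", "d", "e"]

first = letters[0]       # indexing starts at 0, so this is the first element
middle = letters[1:3]    # elements at indices 1 and 2; index 3 is excluded
last = letters[-1]       # negative indexing counts from the back of the list

print(first)
print(middle)
print(last)
```

Notice that a slice gives back a new list, while a single index gives back one element.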

In [None]:
#Exercise: access the 2nd element of y


In [None]:
#Exercise: access the first three elements of y using slicing


In [None]:
#Exercise: access the last element of y using negative indexing


## Control Flow: If, else, elif

Control flow refers to the methods that determine which sections of our program run. We have three main constructs: `if`, `else`, and `elif`. `if` and `elif` each look at a condition and run their block when it evaluates to True; `else` takes no condition and runs when none of the earlier conditions were True. This is useful to organize code into chunks that are conditional/dependent on the values of particular variables. These statements are pretty universal in programming, and the syntax in Python is quite straightforward: `if condition:`, `elif condition:`, and `else:`. Remember that you can chain conditions using the `and`/`or` keywords.

We'll use the `input` function to allow you to enter numbers and see control flow in action. After you run the cell, the same cell will prompt you to give some input.

In [9]:
# var is cast to an int because by default, input is read as a string
var = int(input("Enter a number between 0 and 10: "))
if var >= 7:
    print("Big number!")
elif var >= 3 and var < 7:
    print("Medium number!")
else:  # only reached when var < 3
    print("Small number!")

Enter a number between 0 and 10: 8
Big number!
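To see why the cast to `int` matters, here is a short sketch that does not need any typing. The variable `typed` is a hypothetical stand-in for what `input()` would hand back if you entered 8.

```python
# `typed` is a hypothetical stand-in for the string that input() returns
typed = "8"
print(type(typed))

# with strings, + glues the text together
doubled_text = typed + typed
print(doubled_text)

# after converting to an int, + does arithmetic
number = int(typed)
doubled_number = number + number
print(doubled_number)
```

Comparisons like `var >= 7` need a number on both sides, which is why the cell above converts the input first.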


In [None]:
# Exercise: experiment with the 'or' keyword by rewriting
# the above example

## Loops

Code that is within a loop is executed repeatedly, once for each iteration of the loop. In Python, there are two types of loops: the `while` and `for` loops. `for` loops are much more common, and in Python they also have more functionality, so we will only go over `for` loops.


### For Loops
The structure of a for loop is similar to for loops in R and C++, with a simpler format.

In [None]:
# the range function produces a sequence of numbers from 0 to the first argument minus 1.
# use range(x, y) to start from x and end at y - 1.
for j in range(3):
    print(j)

for j in range(10,12):
    print(j)

In [None]:
# For loops are often used to access elements of a list
for j in range(3):
    print(y[j])

In [None]:
# Exercise: Print every element of y using a for loop. Note that len(list_name) returns the length of the list. 
# What is the relationship between the final index of a list and the length of a list?
print(len(y))

# Exercise: range() accepts a third argument, the step size. Print every element of y in
# reverse order by adding the third argument (think: how to iterate backwards?)


Alternatively, you can loop through the actual elements of a list using `for element in list_name:`

In [None]:
#Exercise: Print every element of y using this format.


# In this syntax, "element" can be any valid variable name!

The above method is the more "Pythonic" way of doing things - we typically recommend this method of going through lists instead of the first method (although both are fine). Finally, you can use `enumerate` to combine both strategies above. `for index, element in enumerate(list_name):` loops through the list in the parentheses and gives you both the index and the element on each iteration. The index starts at 0 and increases by one until it reaches the final index of the list.

In [None]:
for i, elem in enumerate(y):
    print("element at", i, "is", elem)  # print() accepts several values separated by commas and puts a space between them

## Import Statements
Similar to R, we need to install and load in any packages that we want to use in our Python code. When you set up your conda environment, we had you install the packages listed in `requirements.txt`. One main difference between package installation in R and Python is that Python packages are installed from the Terminal rather than from inside the language. Once a package is installed, we load it in our code with `import <package_name>`. If you ever want to import a package but it is not found, then go to your Terminal and type `conda install <package_name>`.
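The import statement itself comes in a few common forms. Below is a minimal sketch using `math` and `statistics`, two modules that ship with Python, so nothing needs to be installed first. They are chosen only for illustration and are not assumed to be in `requirements.txt`.

```python
# import a whole module, then reach into it with a dot
import math
root = math.sqrt(16)
print(root)

# import a module under a shorter nickname
import statistics as stats
average = stats.mean([1, 2, 3])
print(average)

# import a single function from a module
from math import floor
rounded_down = floor(2.7)
print(rounded_down)
```

The same three forms work for packages installed with conda.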

## Processing Files

Processing files is pretty easy in Python. When looking for files, Python searches in the directory from which you run Python code. Changing directories requires the use of a separate module (for example, the standard `os` module). To open a file, use the syntax `with open(filename, mode) as name:`, where `mode` is `'r'` for reading or `'w'` for writing; we specify one or the other to indicate our purpose for opening a file. Sometimes, we just need to read in data, and other times we want to write to the file. Note that `'w'` mode creates the file if it doesn't exist yet, and erases the existing contents if it does. Being explicit about the purpose is good practice.

After we open the file, it is available as a Python file object. We need to call other functions on that object to read or write the actual content of the file. Here, we will introduce `.readlines()`, though there are others you can find by Googling. Below are two examples of the above syntax:

In [54]:
# We added a test.txt file for you to run this code

# Read in the two-line test file. Notice that the newline (\n)
# character at the end of each line is not removed. Google
# a method to remove \n from the strings that readlines() returns.
# Make sure that when you read files, you're in "r" mode! "w" mode
# erases the existing contents of the file.

with open("test.txt", "r") as fin:
    i = fin.readlines()
# observe that .readlines() reads each line into a separate string
# and collects the strings into a list. The order in i matches the
# order in the file: i[0] is the first line, i[1] is the second.
print(i)
print(i[0])

# Write to a new file and observe that it works!
# In a terminal in the same directory as this notebook,
# type `cat output.txt` and see what's in it
with open("output.txt", "w") as fout:
    fout.write("With open function is so cool!\n")


['Hello world!\n', 'Goodbye world!\n']
Hello world!
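As one illustration of the other reading methods mentioned above, here is a sketch using `.read()` and a loop over the file object. It creates its own small file so that it runs anywhere; the file name `demo.txt` and the use of the standard `tempfile` and `os` modules are assumptions made just for this example.

```python
import os
import tempfile

# demo.txt is a hypothetical file, created in a temporary folder for this sketch
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

with open(path, "w") as fout:
    fout.write("first line\n")
    fout.write("second line\n")

# .read() returns the entire file as one string
with open(path, "r") as fin:
    whole = fin.read()
print(whole)

# looping over the file object gives one line at a time
count = 0
with open(path, "r") as fin:
    for line in fin:
        count = count + 1
print(count)
```

Use `.read()` when you want the whole text at once, and the loop when you want to handle the file line by line.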



## Python Lists

Lists are powerful data structures in Python. As you saw before, they can contain a mixture of different data types (though usually don't in practice), and Python also offers a few operations and functions to parse information from a list. Lists are known as a **mutable** data type - you can edit them as you please. An example of an immutable data type is a tuple. Tuples are a bit niche though and lists see more use by far. Here are a few examples of some of the operations that Python supports.
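To make the mutable/immutable distinction concrete, here is a small sketch. The names `scores` and `point` are made up for this example, and the `try`/`except` block is used only to catch the error so the cell finishes running.

```python
# `scores` and `point` are hypothetical names used only in this sketch
scores = [3, 10, 14]
scores[0] = 99           # lists are mutable, so replacing an element works
print(scores)

point = (3, 10, 14)      # parentheses create a tuple
try:
    point[0] = 99        # tuples are immutable, so this raises a TypeError
    changed = True
except TypeError:
    changed = False
    print("tuples cannot be changed in place")
```

The tuple still holds its original values after the failed assignment.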

In [19]:
# add lists together
[3, 5] + [10, 11]

[3, 5, 10, 11]

In [20]:
# multiply a list
[40.0] * 10

[40.0, 40.0, 40.0, 40.0, 40.0, 40.0, 40.0, 40.0, 40.0, 40.0]

In [34]:
# the in operation checks if the list contains some element
# use not in to negate
l = [3, 10, 14]
print(1 in l)
print(3 in l)
print(10 not in l)

False
True
False


In [33]:
# min() and max() work like len(): they take a list and
# return its smallest and largest values. use them
# to get the min/max from l

14

In [35]:
# count number of occurrences of some value in a list
print(l.count(0))
print(l.count(3))

0
1


In [36]:
# use .append() to add an element to the back of the list
l.append(100)
l

[3, 10, 14, 100]