# Boston University Bioinformatics Python Tutorial

Presented here is a tutorial on the *basics* of the Python programming language. If you want to write a program for a particular computational task, there is a way to do it in Python; it's a matter of *how*, not *if*. Google and [Stack Overflow](https://stackoverflow.com/), as usual, are your friends for programming questions of all levels, as are the [Python Documentation](https://docs.python.org/3/) and the documentation for whatever library you are trying to use.

Before getting to the Python exercises, a quick note on using the Jupyter IDE. Jupyter allows you to interact with IPython (`.ipynb`) notebooks. This is a particularly powerful programming environment for data science because it allows you to isolate chunks of code into cells that run independently, which can be convenient when you'd like to load your data, then use it, then change some parameters, then visualize some things, etc. While a normal Python script runs through an entire program linearly, Jupyter allows you to run each part of a "program" separately and include `Markdown` documentation (like this cell here). Here are some nifty Jupyter tips and keyboard shortcuts for working more efficiently:

* The value of the last line of a cell will be displayed as output (even without using `print`)
* Run a cell with Cmd + Enter (Ctrl + Enter on Windows and Linux). Use Shift + Enter to run a cell and advance to the next one
* Use tab-completion to complete variable names (technically an IPython keyboard shortcut)
* Use tab-completion after a `.` to see what methods are available to a Python object, for example: `df.<Tab>` will pull up a dropdown menu of available methods, and Shift + Tab inside a function call shows its documentation (also technically IPython shortcuts)
* When a cell is selected press A to create a new cell above or B to create a new cell below
* When a cell is selected press C to copy that cell or X to cut that cell and then V to paste it
* Use Z to undo the cutting of a cell

You can find more Jupyter tips here: https://michaelsilverstein.github.io/workflow.html.

# Python Data Types and Basic Operations
Run each of the cells below to execute the code.

## Numerical

### Types

In [1]:
"""Integers"""
1

1

In [4]:
"""Floats"""
1.0

1.0
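
If you're ever unsure which numerical type you're holding, the built-in `type()` function will tell you, and `int()` and `float()` convert between the two. Here's a minimal sketch (the variable names are just for illustration and aren't used elsewhere in this tutorial):

```python
# Check the type of a value with the built-in type() function
whole = 1
decimal = 1.0
print(type(whole))    # <class 'int'>
print(type(decimal))  # <class 'float'>

# Convert between the two types
as_float = float(whole)   # 1 becomes 1.0
as_int = int(3.9)         # int() drops the decimal part, so this is 3
print(as_float, as_int)
```

Note that `int()` truncates rather than rounds, which is easy to forget.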

### Operations

In [13]:
"""Operations"""
# This is a comment, nothing here will be executed (strings in triple quotes that aren't assigned to anything act like comments too! They show up in red in Jupyter, which is chill)
# Here are numerical operations in Python
# We'll use the `print()` function to display each value

# Addition (obviously you can subtract too...). Notice that numerical operations can be mixed with integers and floats
print('Addition')
print(1 + 5.5)
print()

# Multiplication
print('Multiplication')
print(5 * 10)
print()

# Division (here we are printing 10/3 using an "f-string". More on that later)
print(f'10/3 calculated with one "/": 10/3 = {10/3}')
print(f'However, "//" rounds down:    10//3 = {10//3}')
print()

# Exponent
print('Exponent')
print(3 ** 2)
print()

# Modulo (calculates the remainder)
print('Modulo')
print(10 % 3)
print()

Addition
6.5

Multiplication
50

10/3 calculated with one "/": 10/3 = 3.3333333333333335
However, "//" rounds down:    10//3 = 3

Exponent
9

Modulo
1



## Variables

In [15]:
"""Using variables"""
# Any Python object can be saved to a variable. That variable can then have the same operations performed on it as if it were the original object
x = 4
y = 20.

x * y

80.0

## Strings

### Types

In [16]:
'Strings can be created with single quotes'

'Strings can be created with single quotes'

In [17]:
"Or with double quotes"

'Or with double quotes'

In [19]:
multi = """Multiline
strings can be created with triple quotes"""
print(multi)

Multiline
strings can be created with triple quotes


In [25]:
'Double "quotes" can be used in single quote strings and vice versa'

'Double "quotes" can be used in single quote strings and vice versa'

### Operations

In [22]:
"""Concatenation"""
print('You can ' + 'add strings together to' + ' concatenate them')
print('Or ' + 'repeat ' * 3 + 'them by "multiplying"')

You can add strings together to concatenate them
Or repeat repeat repeat them by "multiplying"


In [32]:
"""Counting a character"""
x = 'asdflihjasfiha'
x.count('g')

0

In [31]:
"""Stripping"""
# Stripping removes characters from the left (.lstrip()), right (.rstrip()), or both ends (.strip()) of a string; by default, whitespace is removed
x = 'sdf '
x.strip()

'sdf'

In [33]:
"""Splitting"""
s = 'If you want to split a string into a list on a certain character, you can use the .split() method'
s.split(',')

['If you want to split a string into a list on a certain character',
 ' you can use the .split() method']

In [40]:
"""Joining"""
'**'.join(['You can join a list', 'with a string', 'using the .join()', 'method'])

'You can join a list**with a string**using the .join()**method'

*Note*: A method (called with a `.` after an object, like `x.strip()`) is an operation performed on a particular Python object. Each type of object has different methods associated with it.
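
To make the idea of methods a bit more concrete, here's a small sketch using a made-up DNA sequence (`seq` and the other variable names are just for illustration). Each string method returns a new value and leaves the original string alone:

```python
# A DNA sequence stored as a string (made up for this example)
seq = 'atgcgtta'

# String methods: each returns a new value and leaves seq unchanged
upper_seq = seq.upper()        # 'ATGCGTTA'
t_count = seq.count('t')       # number of 't' characters
rna = seq.replace('t', 'u')    # swap every 't' for 'u'

print(upper_seq, t_count, rna)
print(seq)                     # still 'atgcgtta'

# Other types have their own methods; a list has no .upper() method
print(hasattr([1, 2, 3], 'upper'))   # False
```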

### String substitution
These allow you to place values right into a string, which is particularly convenient when you'd like to include the value of a variable in a string.

In [26]:
"""The % method"""
name = 'Priya'
adj = 'funny'
"Let's say hello to %s, they are super %s!" % (name, adj)

"Let's say hello to Priya, they are super funny!"

In [54]:
"""F-strings"""
adj = 'cool'
sf = f"""F-strings are super {adj}, all you have to do is add an "f" to the beginning of a string, and then you can interpolate on the fly by using {{}}.
You can put any Python code in {{}}, like 3 + 5 = {3 + 5}
"""
print(sf)

F-strings are super cool, all you have to do is add an "f" to the beginning of a string, and then you can interpolate on the fly by using {}.
You can put any Python code in {}, like 3 + 5 = 8
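
F-strings can also control *how* a value is displayed by adding a format specification after a `:` inside the braces. Here's a minimal sketch (the gene name and number are example values, not real data):

```python
gene = 'BRCA1'          # example values, not from a real dataset
gc_fraction = 0.41837

# :.2f rounds to two decimal places; :.1% formats a fraction as a percent
summary = f'{gene} has a GC fraction of {gc_fraction:.2f} ({gc_fraction:.1%})'
print(summary)   # BRCA1 has a GC fraction of 0.42 (41.8%)
```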



## Boolean
Boolean objects take the values `True` and `False`. The integers `1` and `0` behave like `True` and `False`, respectively.

In [55]:
"""Booleans encode the truth of a statement 🤯"""
4 > 2

True

In [58]:
"""You can use all of your favorite boolean comparison operators"""
# Greater/less than or equal to
print(4 <= 5)

# Equal to
print(100 == 0)

# Not equal to
print(100 != 0)

True
False
True


In [106]:
"""Evaluate the boolean value of an object"""
# use the `bool()` function to assess the boolean value of an object
bool(1), bool(0), bool('test'), bool()

(True, False, True, False)
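
As a rough sketch of how this plays out in practice: "empty" or zero values evaluate to `False`, pretty much everything else evaluates to `True`, and booleans can be combined with `and`, `or`, and `not`. The variable names below are made up for illustration:

```python
# Empty strings and zero are "falsy"; non-empty strings and other numbers are "truthy"
print(bool(''), bool(0.0))     # False False
print(bool(' '), bool(-1))     # True True

# Combine booleans with and, or, and not
has_reads = True
passed_qc = False
print(has_reads and passed_qc)   # False: both must be True
print(has_reads or passed_qc)    # True: at least one is True
print(not passed_qc)             # True: not flips the value
```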

## None

In [57]:
x = None
x
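
`None` is Python's way of representing "no value", which is why the cell above doesn't display anything. Here's a small sketch of how it is usually checked (the variable name `best_hit` is just for illustration):

```python
best_hit = None   # placeholder meaning "no value yet"

# The usual way to test for None is with `is`, not `==`
print(best_hit is None)   # True

# None counts as False when converted to a boolean
print(bool(best_hit))     # False

# print() shows None explicitly, unlike a bare last line in a cell
print(best_hit)           # None
```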

# Data structures
These allow you to store, retrieve, and, in some cases, edit collections of Python objects.

## Lists
Lists are created with square brackets: `[]`. Lists are "mutable", meaning that the elements of a list can be changed.

In [70]:
"""Manually creating a list"""
x = 3
l1 = ['lists can contain a mix of objects', x, 5.]
l1

['lists can contain a mix of objects', 3, 5.0]

### Indexing
You can access the elements of a list by indexing it with square brackets and an integer representing the position (starting with 0).

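In [72]:
"""Index elements"""
# Note: this cell was run after the "Appending to a list" cell in the Operations section below, so l1 already has a fourth element
# The first element of the list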
print(l1[0])

# The second element of the list
print(l1[1])

# The last element of the list
print(l1[-1])

# The second to last element of the list
l1[-2]

lists can contain a mix of objects
3
new stuff for the first list


5.0

In [74]:
"""Updating the elements of a list"""
# You can update an element of a list by assigning a new value to its index
l1[0] = 'the new first element'
l1

['the new first element', 3, 5.0, 'new stuff for the first list']

In [None]:
"""Indexing portions of a list"""
# Get the second to fourth element
print(l1[1: 4])
print()

# Index from the beginning up to the second element
print(l1[:2])
print()

# Index from the third element to the end
print(l1[2:])
print()

# Index every other element
print(l1[::2])
print()

*Note*: When indexing a chunk of a list, notice that Python indexes **up to, but not including,** the position of the second value. For example, `l1[1: 4]` returns the elements at positions 1, 2, and 3, but not 4.
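
Since the cell above has no saved output, here's a sketch of the same kinds of slices on a fresh example list (`bases` is made up for illustration), with what each one returns:

```python
bases = ['A', 'C', 'G', 'T', 'N']   # example list for illustration

print(bases[1:4])    # ['C', 'G', 'T']  (positions 1, 2, and 3)
print(bases[:2])     # ['A', 'C']
print(bases[2:])     # ['G', 'T', 'N']
print(bases[::2])    # ['A', 'G', 'N']  (every other element)
print(bases[::-1])   # ['N', 'T', 'G', 'C', 'A']  (a reversed copy)
```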

### Operations

In [71]:
"""Appending to a list"""
# Append using the .append() method
l1.append('new stuff for the first list')
l1

['lists can contain a mix of objects', 3, 5.0, 'new stuff for the first list']

In [97]:
"""Sort a list"""
# Use the `sorted()` function
sorted(['b', 'c', 'a'])

['a', 'b', 'c']
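
One thing worth knowing: `sorted()` gives you back a *new* list, while the `.sort()` method changes the list in place. A minimal sketch with example values:

```python
lengths = [300, 19, 1]   # example values

# sorted() returns a new list and leaves the original untouched
ascending = sorted(lengths)
descending = sorted(lengths, reverse=True)
print(ascending, descending, lengths)   # [1, 19, 300] [300, 19, 1] [300, 19, 1]

# The .sort() method sorts the list in place and returns None
lengths.sort()
print(lengths)                          # [1, 19, 300]
```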

In [102]:
"""Query a list"""
# Use the `in` operator to ask if an element is in the list. This returns a bool
3 in l1

True

In [79]:
"""Adding numeric elements of a list"""
# Use the `sum()` function to add numeric elements of a list
sum([1, 2, 10.5])

13.5

A list of other built-in Python functions like `sum()` can be found [here](https://docs.python.org/3/library/functions.html).

In [80]:
"""Find the length of a list"""
len(l1)

4

In [82]:
"""TASK: Compute the average of the following list"""
a = [1, 19, 300]

## Tuples
Tuples are created with parentheses: `()`. You can index tuples in the same way that you can index a list, however they are "immutable" - you cannot update tuples after they have been created.

In [90]:
"""Create a tuple"""
t = (1, 2, 3)
t

(1, 2, 3)
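
Here's a quick sketch of indexing a tuple, plus "unpacking" it into separate variables, which comes up a lot later on. The genomic position below is made up for illustration:

```python
# A tuple holding a made-up genomic position
position = ('chr1', 1000, 2000)

# Index and slice a tuple just like a list
print(position[0])     # chr1
print(position[1:])    # (1000, 2000)

# "Unpack" the tuple into separate variables in one line
chrom, start, end = position
print(chrom, end - start)   # chr1 1000
```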

In [91]:
"""Try to edit it"""
t[2] = 'test'

TypeError: 'tuple' object does not support item assignment

**Woah**: wtf is that red thing?! This is an error. The last line of this red block indicates the specific cause of the error, and the part above it shows you the line the error occurred on. If the cause of your error comes from another script (like if you call a function (which we haven't spoken about yet) and there's an error *within* that function), then Python will trace the error back as far as it can go.

## Dictionaries
Dictionaries are created with curly braces: `{}` and are used to store "key, value" pairs. As the name implies, dictionaries allow you to establish a relationship between a certain key and a corresponding value. As of Python 3.7, dictionaries preserve the order in which items were inserted (older versions did not guarantee this). Unlike lists and tuples, though, dictionaries are indexed by key instead of by position.

Each item of the dictionary is defined as `key`: `value` and items are separated by `,`, like: `{key1: value1, key2: value2, ...}`.

In [93]:
"""Creating a dictionary"""
d = {'key': 'value', 1: 'dog', 'l': ['storing a list too?', 1]}
d

{'key': 'value', 1: 'dog', 'l': ['storing a list too?', 1]}

### Indexing

In [94]:
"""Indexing a dictionary with a key"""
# Get the value stored under the key 1
d[1]

'dog'

In [95]:
"""Reset the value of l"""
d['l'] = 'no longer a list'
d

{'key': 'value', 1: 'dog', 'l': 'no longer a list'}

In [96]:
"""Add a new item on the fly"""
d['new'] = 10 + 5
d

{'key': 'value', 1: 'dog', 'l': 'no longer a list', 'new': 15}
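
Indexing a key that doesn't exist raises a `KeyError`, so it's handy to know how to check for a key first. Here's a minimal sketch using the `in` operator and the `.get()` method on a tiny example dictionary (`codons` is made up for illustration):

```python
# Example dictionary mapping codons to amino acids (a tiny subset)
codons = {'ATG': 'Met', 'TGG': 'Trp'}

# `in` checks the keys, not the values
print('ATG' in codons)    # True
print('Met' in codons)    # False

# .get() returns None (or a default you choose) instead of raising a KeyError
print(codons.get('TAA'))              # None
print(codons.get('TAA', 'unknown'))   # unknown
```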

### Operations

In [98]:
"""Get the keys of a dictionary"""
d.keys()

dict_keys(['key', 1, 'l', 'new'])

In [99]:
"""Get the values"""
d.values()

dict_values(['value', 'dog', 'no longer a list', 15])

In [101]:
"""Get the items"""
# This can be particularly useful
d.items()

dict_items([('key', 'value'), (1, 'dog'), ('l', 'no longer a list'), ('new', 15)])

# Conditionals
Create logic with `if`, `elif`, and `else`. Each conditional statement ends with `:` and is followed by an indented block of code that runs when the condition is `True`.

In [107]:
"""Conditional example"""
x, y = 1, 2

if x > y:
    print('x wins')
elif y > x:
    print('y wins')
else:
    print('x = y')

y wins
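
Conditions can also be combined with `and` and `or`. Here's a small sketch; the variable names and thresholds are made up for illustration:

```python
read_length = 150   # example values
quality = 28

# Combine conditions with `and` / `or`
if read_length >= 100 and quality >= 30:
    label = 'keep'
elif read_length >= 100 or quality >= 30:
    label = 'review'
else:
    label = 'discard'

print(label)   # review
```

Only the first branch whose condition is `True` runs, so the order of the `if` and `elif` statements matters.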


# Loops
Alright! Now we're getting programmatic. We've got data types, data structures, conditionals, and now we can really start to leverage the iterative advantages of computation. `for` and `while` loops allow you to perform iterative operations at scales that would be anywhere from annoying to basically impossible for you to do by hand.

## `for` loops
`for` loops allow you to operate on each element of an iterable object, like a list.

In [109]:
"""Loop through each element of a list and print it"""
x = [3, 4, 10]

for element in x:
    print(element)

3
4
10


In [110]:
"""Loop through the items of a dictionary"""
d = {'Bob': 3, 'Sally': 10}

# Here we will "unpack" each key and value of the items in d
for k, v in d.items():
    print(k, v)

Bob 3
Sally 10


In [114]:
"""Use the `range` function to generate integers within a range"""
# The syntax for range() is:
# 1) range(stop): If only one argument is provided, generate a range from 0 to stop (exclusive!)
# 2) range(start, stop): If two arguments are provided, go from start to stop (exclusive!)
# 3) range(start, stop, step): Go from start to stop with size step

for x in range(10, 15):
    print(x, x ** 2)

10 100
11 121
12 144
13 169
14 196
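
The three-argument form of `range()` isn't shown above, so here's a minimal sketch of it. Wrapping a range in `list()` is just a convenient way to see all of its values at once:

```python
# range(start, stop, step): count from 0 up to (not including) 10 in steps of 5
print(list(range(0, 10, 5)))    # [0, 5]

# A negative step counts down
print(list(range(5, 0, -1)))    # [5, 4, 3, 2, 1]

# Use a step inside a for loop
evens = []
for i in range(0, 10, 2):
    evens.append(i)
print(evens)                    # [0, 2, 4, 6, 8]
```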


In [115]:
"""Add elements to a list on the fly"""
# Create an empty list
y = []

for r in range(5):
    # Add 5 to r at each step
    n = r + 5
    # Append this value to y
    y.append(n)
y

[5, 6, 7, 8, 9]

### List comprehensions
List comprehensions are a super convenient, but initially very bizarre looking, way to generate a list from a loop. List comprehension syntax can get fairly complicated depending on how nuanced of a loop you want to represent, but the basic syntax is:

Regular `for` loop:
```python
new = []
for el in array:
    # Perform an operation on el
    o = operation(el)
    # Add to new
    new.append(o)
```

This same operation can be performed using the list comprehension syntax:
```python
new = [operation(el) for el in array]
```
So *sleek*!
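
The template above uses placeholder names, so here's a runnable sketch of the same idea with a concrete operation (squaring each number), plus the optional `if` filter that list comprehensions support:

```python
numbers = [3, 4, 10]

# Regular for loop
squares_loop = []
for n in numbers:
    squares_loop.append(n ** 2)

# Equivalent list comprehension
squares = [n ** 2 for n in numbers]
print(squares)                   # [9, 16, 100]
print(squares == squares_loop)   # True

# An optional `if` at the end keeps only some elements
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)              # [16, 100]
```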

Dictionaries can also be generated from dictionary comprehensions
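
A dictionary comprehension looks almost the same, but uses curly braces and a `key: value` pair. A minimal sketch (the gene names are just example strings):

```python
genes = ['tp53', 'brca1', 'myc']   # example gene names

# {key: value for element in iterable}
name_lengths = {g.upper(): len(g) for g in genes}
print(name_lengths)   # {'TP53': 4, 'BRCA1': 5, 'MYC': 3}
```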

In [None]:
"""TASK: Recreate y from above with a list comprehension"""
y = [<<PUT LIST COMPREHENSION HERE>>]

## `while` loops

In [111]:
"""Double x until it's greater than 100"""
x = 2

while x < 100:
    # This is the same as saying x = x * 2
    x *= 2
    # You could also say x += 2 as a way of saying x = x + 2
    
print(x)

128
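
A `while` loop keeps running its indented block for as long as its condition is `True`, so something inside the loop has to eventually make the condition `False`. Here's a sketch of the same doubling loop that also counts how many times it ran (`doublings` is a name made up for this example):

```python
x = 2
doublings = 0

# Keep looping as long as the condition is True
while x < 100:
    x *= 2
    doublings += 1   # count each pass through the loop

print(x, doublings)   # 128 6
```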


In [None]:
"""TASK: Compute the mean of all of the odd numbers between 1 and 250"""

# Functions
Functions allow you to perform the same operation on different sets of arguments. They are created with the following syntax:

```python
"""Create a Python function"""
def functionName(arg1, arg2):
    """
    We can write some stuff here to describe `functionName`. This is called the "docstring"
    """
    # We can do stuff with arg1 and arg2 or anything we'd like
    r = arg1 + arg2
    
    # We can then return Python object(s)
    return r


"""Use that python function"""
y = functionName(1, 2) # y -> 3
```

In fact, you have already been using built-in Python functions throughout this tutorial. Any Python command that ends with parentheses, like `print()` or `len()`, is calling a function (or a method).

In [116]:
"""Create a function to add two numbers together"""
def addNums(a, b):
    return a + b

# Use addNums()
addNums(1, 2)

3
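
To tie a few pieces together, here's a sketch of a slightly bigger function with a docstring and an optional argument that has a default value. `gc_content` is a hypothetical helper written for this example, not a built-in or library function:

```python
def gc_content(seq, as_percent=False):
    """Return the fraction (or percent) of G and C bases in seq."""
    seq = seq.upper()                      # treat 'g' and 'G' the same
    gc = seq.count('G') + seq.count('C')
    fraction = gc / len(seq)               # assumes seq is not empty
    if as_percent:
        return fraction * 100
    return fraction

print(gc_content('ATGC'))                    # 0.5
print(gc_content('ggcc', as_percent=True))   # 100.0
```

Because `as_percent` has a default value, you only need to pass it when you want something other than the default.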

# Common Python libraries
* NumPy
* pandas
* seaborn