# Intermediate Python for Biologists
## Comprehensions

### List Comprehensions

Let's say we want to raise 2 to the power of every whole number from 0 to 10. First, let's write it in a way that is familiar from the introductory course, using a for loop over the numbers.
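A minimal sketch of the loop version; the names `powers` and `n` are illustrative choices, not taken from the original lesson.

```python
# Build the list one element at a time with a for loop
powers = []
for n in range(11):  # whole numbers 0 through 10
    powers.append(2 ** n)
print(powers)
```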

We also learned another way to rewrite this (using map) in the functional programming lesson. 
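One way the `map` version might look, as a sketch; `powers` and `n` are illustrative names.

```python
# map is lazy in Python 3, so wrap it in list() to see the values
powers = list(map(lambda n: 2 ** n, range(11)))
print(powers)
```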

And finally, let's write this using a list comprehension. 
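A sketch of the comprehension form, again with the illustrative names `powers` and `n`. The expression to compute comes first, followed by the `for` clause.

```python
# expression first, then the loop, all inside square brackets
powers = [2 ** n for n in range(11)]
print(powers)
```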

### List Comprehensions with Conditionals

Let's say we have a list of animals. We want to create a new list of animals that contains only the animals that have a vowel as their first letter.

In [17]:
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

Here's how we would do it with conditionals and for loops.
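A sketch of the loop-and-conditional approach. The name `vowel_animals` is an illustrative choice, and the check assumes lowercase names, as in the list above.

```python
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

vowel_animals = []
for animal in animals:
    if animal[0] in 'aeiou':  # first letter is a lowercase vowel
        vowel_animals.append(animal)
print(vowel_animals)
```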

And here's how we would do this with built-in higher-order functions.
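One possible version uses `filter` with a lambda. This is a sketch; `vowel_animals` is an illustrative name and the check assumes lowercase names.

```python
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

# filter keeps only the elements for which the lambda returns True
vowel_animals = list(filter(lambda animal: animal[0] in 'aeiou', animals))
print(vowel_animals)
```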

And finally let's do it with a list comprehension.
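A sketch of the comprehension form (illustrative names, lowercase input assumed). The condition goes at the end, after the `for` clause.

```python
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

# keep an animal only if its first letter is a vowel
vowel_animals = [animal for animal in animals if animal[0] in 'aeiou']
print(vowel_animals)
```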

What if we wanted to get the length of each element of the `animals` list that begins with a vowel?
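A sketch of one way to do this: keep the same condition and change only the expression at the front to `len(animal)`. The name `vowel_lengths` is illustrative.

```python
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

# the expression at the front transforms each element that passes the filter
vowel_lengths = [len(animal) for animal in animals if animal[0] in 'aeiou']
print(vowel_lengths)
```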

### Converting map Functions to List Comprehensions
We'll practice converting the patterns that we learned for built-in higher-order functions into list comprehensions so we can continue to build an understanding of both.

In [20]:
kilometer = [49.2, 30.1, 27.4, 37.8]
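The values listed just below are consistent with a `map` call along the following lines. This is a reconstruction, assuming the conversion factor of 3280.8399 feet per kilometer that this section mentions later; the name `feet` is illustrative.

```python
kilometer = [49.2, 30.1, 27.4, 37.8]

# convert each distance from kilometers to feet (assumed factor: 3280.8399)
feet = list(map(lambda x: x * 3280.8399, kilometer))
print(feet)
```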

[161417.32308, 98753.28099, 89895.01325999999, 124015.74822]


Here are some steps that help me think through this:
- start with the square brackets
- add the body of the lambda function (or expression you want to calculate) to the beginning of the square brackets
- then, add the `for` keyword and put the element variable name after it (`x`)
- then put the `in` keyword and the name of the iterable (here, the list)
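Applied to the kilometer list, the steps above might produce something like this sketch. The name `feet` is illustrative, and the factor of 3280.8399 feet per kilometer is the one quoted in this section.

```python
kilometer = [49.2, 30.1, 27.4, 37.8]

# [ expression   for element in iterable ]
feet = [x * 3280.8399 for x in kilometer]
print(feet)
```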

In [1]:
# there are 3280.8399 feet in a kilometer


### Converting filter Functions to List Comprehensions
Let's do this again with a filter function to practice using list comprehensions with conditionals.

We would like to create a new list that only contains the even numbers from a list of numbers. 

In [22]:
numbers = [2, 3, 45, 23, 593, 23, 54]

Here's how we would do it with a filter function. 
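A sketch of the `filter` version; the name `evens` is an illustrative choice.

```python
numbers = [2, 3, 45, 23, 593, 23, 54]

# a number is even when the remainder after dividing by 2 is zero
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)
```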

We can use the same steps as before with the addition of one more step at the end to handle the conditional. 
- start with the square brackets
- add the body of the lambda function (or expression you want to calculate) to the beginning of the square brackets (in this case we don't have one)
- then, add the `for` keyword and put the element variable name after it (`x`)
- then put the `in` keyword and the name of the iterable (here, the list)
- add the conditional statement to the end
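Following the steps above, the comprehension could look like this sketch (`evens` is an illustrative name). Because nothing is calculated, the expression at the front is just `x` itself.

```python
numbers = [2, 3, 45, 23, 593, 23, 54]

# [ element   for element in iterable   if condition ]
evens = [x for x in numbers if x % 2 == 0]
print(evens)
```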

### Comprehensions with Multiple Conditions

Let's take a look at the following program so that we can convert it to a list comprehension. It takes the numbers from 0 up to but not including 1000 and creates a list of those that are divisible by *both* 3 *and* 5.

In [29]:
div = []

for x in range(1000):
    if x%3 == 0 :
        if x%5 == 0:
            div.append(x)
print(div)

[0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240, 255, 270, 285, 300, 315, 330, 345, 360, 375, 390, 405, 420, 435, 450, 465, 480, 495, 510, 525, 540, 555, 570, 585, 600, 615, 630, 645, 660, 675, 690, 705, 720, 735, 750, 765, 780, 795, 810, 825, 840, 855, 870, 885, 900, 915, 930, 945, 960, 975, 990]


To create the list comprehension, we add both conditionals to the end of the expression inside the square brackets. Use this pattern only when both conditions must be satisfied (not one or the other).
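A sketch of the chained-`if` form, reusing the name `div` from the loop above. Two `if` clauses in a row behave like the nested `if` statements in the loop.

```python
# both conditions must hold for x to be kept
div = [x for x in range(1000) if x % 3 == 0 if x % 5 == 0]
print(div[:5])
```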

Here's another way to write this. To me, this way is a bit clearer.
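For instance, a single `if` with `and` states the requirement explicitly. This is a sketch that reuses the name `div`.

```python
# one condition that combines both tests with `and`
div = [x for x in range(1000) if x % 3 == 0 and x % 5 == 0]
print(div[:5])
```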

What if we want to make a list of the numbers that satisfy *either* condition (divisible by 3 or by 5)?
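One way to sketch this is a single `if` that joins the two tests with `or`; chained `if` clauses cannot express this. The name `div_either` is illustrative.

```python
# keep x when at least one of the two tests passes
div_either = [x for x in range(1000) if x % 3 == 0 or x % 5 == 0]
print(div_either[:6])
```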

### if-else Comprehensions
We can also use the if-else conditional pattern in a list comprehension.

Let's say we have a list of insect species. We'd like to create a new list in which every element that is `None` is replaced by an empty string.

In [47]:
insects = ['Solenopsis invicta','Euphydryas editha', None, 'Drosophila melanogaster', None, 'Neotibicen canicularis']

Here's how we would do it with for loops and conditionals.
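A sketch of the loop version; the name `cleaned` is an illustrative choice.

```python
insects = ['Solenopsis invicta', 'Euphydryas editha', None,
           'Drosophila melanogaster', None, 'Neotibicen canicularis']

cleaned = []
for insect in insects:
    if insect is None:
        cleaned.append('')  # replace missing values with an empty string
    else:
        cleaned.append(insect)
print(cleaned)
```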

And here's how we would do it with a list comprehension. 
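A sketch of the comprehension form (`cleaned` is an illustrative name). Note that the if-else goes *before* the `for` clause, because it chooses a value for every element rather than filtering elements out.

```python
insects = ['Solenopsis invicta', 'Euphydryas editha', None,
           'Drosophila melanogaster', None, 'Neotibicen canicularis']

# value_if_true if condition else value_if_false, then the for clause
cleaned = ['' if insect is None else insect for insect in insects]
print(cleaned)
```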

### Nested List Comprehensions
We can also nest list comprehensions within other list comprehensions. This is useful when working with lists of lists.

In [3]:
list_of_lists = [[1,2,3],[4,5,6],[7,8]]

With the tools we learned in the introduction course, we would flatten this list in this way.

In [4]:
flattened = []
for sublist in list_of_lists:  # avoid naming a variable `list`, which shadows the built-in
    for item in sublist:
        flattened.append(item)
print(flattened)

[1, 2, 3, 4, 5, 6, 7, 8]


We can also flatten the list of lists using a list comprehension.
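A sketch of the flattening comprehension; `sublist` and `item` are illustrative names. The `for` clauses appear in the same order as in the nested loops: outer loop first, inner loop second.

```python
list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8]]

# read left to right: for each sublist, for each item in that sublist
flattened = [item for sublist in list_of_lists for item in sublist]
print(flattened)
```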

We can also create a matrix. Here is how we would have done this before.

In [65]:
matrix = []
for row in range(3):        # three rows
    nested = []
    for col in range(4):    # four columns in each row
        nested.append(0)
    matrix.append(nested)
print(matrix)

[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]


And here is how we can do it with a list comprehension.
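A sketch using a comprehension nested inside another comprehension; `row` and `col` are illustrative names. The inner comprehension builds one row, and the outer one repeats it to build three independent rows.

```python
# inner comprehension: one row of four zeros; outer: three such rows
matrix = [[0 for col in range(4)] for row in range(3)]
print(matrix)
```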

We can also transpose a matrix.

Here is how we would do it with for loops.

In [69]:
# matrix contents inferred from the output shown below
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = []

for i in range(3):
    transposed_row = []
    for row in matrix:
        transposed_row.append(row[i])
    transposed.append(transposed_row)

print(transposed)

[[1, 4, 7], [2, 5, 8], [3, 6, 9]]


And now with list comprehensions.
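A sketch of the transpose as a nested comprehension. It assumes a 3x3 `matrix` holding the values 1 through 9, which is inferred from the output shown above rather than stated in the lesson.

```python
# assumed input, inferred from the printed result of the loop version
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# for each column index i, collect element i from every row
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)
```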

### Dictionary Comprehensions
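We can also use comprehensions to build dictionaries, looping over a dictionary or any other iterable. A dictionary comprehension looks very similar to a list comprehension, except that it has curly brackets around it and a `key: value` pair as its expression.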
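As a hypothetical illustration (this example is not from the original lesson), the sketch below maps each animal name to its length. The name `animal_lengths` is an illustrative choice.

```python
animals = ['shark', 'penguin', 'ocelot', 'armadillo', 'cow', 'bat']

# { key: value   for element in iterable }
animal_lengths = {animal: len(animal) for animal in animals}
print(animal_lengths)
```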

### Generators
Unlike the output of map and filter, the output of a list comprehension is not lazy: the whole list is built in memory right away. Python does have a lazy equivalent of list comprehensions, called generator expressions.

The syntax is nearly the same as a list comprehension, but you use parentheses instead of square brackets.

In [80]:
# list comprehension
sum([i * i for i in range(1000)])

332833500

In [81]:
# generator
sum(i * i for i in range(1000))

332833500

Note that generator expressions are exhaustible, which means you can only iterate over them once (like file objects). Beware!

In [83]:
gen = (x * 2 for x in range(5))

In [84]:
for i in gen:
    print(i) # this line runs five times
# now the generator is exhausted

0
2
4
6
8


In [85]:
# won't print anything because the generator is exhausted
for i in gen:
    print(i)

## Independent Work

### DNA Length
In the lesson on functional programming, we wrote a program that gave a list of the length of DNA sequences. We did it with a for loop and with a higher-order built-in function. Now let's do the same thing with a list comprehension.

Feel free to use the example list below or create your own. 

**Bonus**: Create a list of lengths of the DNA sequences only for the sequences that begin with 'T'.

In [32]:
dna_list = ['ATG', 'TAGC', 'ACGTATGC', 'ACGGCTAG', 'GATCGCGC', 'TCGCGCAAAAAA']

### Last Codon
Write a list comprehension that returns the last three bases of each element in the list of DNA sequences. 

Feel free to use the example list below or create your own. 

In [86]:
dna_list = ['ATG', 'TAGC', 'ACGTATGC', 'ACGGCTAG', 'GATCGCGC', 'TCGCGCAAAAAA']

### Writing a FASTA file
Here is a problem from the introductory course. Let's write the program this time using a comprehension. 

FASTA is a file format that is used to store DNA and protein sequence data. The header row has a greater-than symbol (>) followed by the accession name. There may be multiple sequences in one file.

Write a Python program that will write a FASTA file containing the following sequences. Make sure all are in uppercase letters.

SEQ1: atcggccatctagccgg

SEQ2: ACTGTACATGTGCGCTAG

SEQ3: ccatctagcTGTAC

In [93]:
sequences = {"SEQ1": "atcggccatctagccgg", "SEQ2": "ACTGTACATGTGCGCTAG", "SEQ3": "ccatctagcTGTAC"}

In [95]:
# how we did it with for loops
# the with statement closes the file automatically, even if an error occurs
with open("seq.fasta", "w") as seq_file:
    for key, value in sequences.items():
        fasta = ">sequence_" + key[-1] + "\n" + value.upper() + "\n"
        seq_file.write(fasta)

In [99]:
# now do it with a list comprehension


### Bonus 
Below is a dictionary that consists of samples and their weights in grams. Create a list of the names of the samples with weights below 5000 grams. Make all of the sample names in title case (with first letter capitalized).

In [100]:
samples = {"xyz123": 1500, "zyt345": 2000, "yug392": 2500, "tty443": 1600, "vrx455": 2400, "xyr334": 13600, "bbt333": 7, "trr210": 110}

### Bonus: BLAST Processor
Rewrite the BLAST processor you wrote in the functional programming lesson to use list comprehensions. 