## Transforming Each Element of a Collection with a List Comprehension

Often, you want to do something to the data inside your collection: double it, increment it, compute some metric, and so on.  
At the end, you still have the same *number* of elements, but the values themselves have changed.  
We will look at many ways to accomplish this in Python, but first we're going to use a compact form of the **for-loop** called a **comprehension**.

A comprehension produces a new collection containing one new value *"for each"* value *in* the original collection; the original collection is left unchanged. They look like this:

```python
>>> data = [1, 2, 3]
>>> squared = [x ** 2 for x in data]
>>> squared
[1, 4, 9]
```

```python
>>> import math
>>> data = [1, 4, 25]
>>> roots = [math.sqrt(x) for x in data]
>>> roots
[1.0, 2.0, 5.0]
```
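To see what a comprehension is doing, it can help to compare it with the explicit for-loop it replaces. The sketch below reuses the squaring example; the names `squared_loop` and `squared_comp` are made up for illustration.

```python
data = [1, 2, 3]

# Explicit for-loop: start with an empty list and append one result per element.
squared_loop = []
for x in data:
    squared_loop.append(x ** 2)

# The comprehension builds the same list in a single expression.
squared_comp = [x ** 2 for x in data]

print(squared_loop)                  # [1, 4, 9]
print(squared_loop == squared_comp)  # True
print(data)                          # [1, 2, 3] (the original list is unchanged)
```

Both versions visit every element once; the comprehension simply moves the expression that would go inside `append()` to the front.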

**Exercises**

Make a list that adds 1 to each value in `data`:


In [None]:
data = [-2, -1, 0, 1, 2, 3]

Get the absolute value of each element in `data`, using the built-in `abs()` function:

In [None]:
data = [-2, -1, 0, 1, 2, 3]

Round all these numbers to the nearest integer (use the `round()` function):

In [None]:
data = [-2, -1, 0, 1, 2, 3]

Get the first letter of each name in the list:

In [None]:
names = ["John", "Harry", "Moe", "Luke"]

**Exercises with DNA**
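
A few built-in string operations come in handy for the exercises below. This is only a quick reference sketch on a made-up example string (`word` and the other variable names are hypothetical, not part of the exercises); deciding which operation fits which exercise, and wrapping it in a comprehension, is still up to you.

```python
word = "Banana"  # hypothetical example string

length = len(word)         # number of characters
first_three = word[:3]     # slice: the first three characters
n_a = word.count("a")      # how many times "a" occurs (case-sensitive)
upper = word.upper()       # all characters uppercased
backwards = word[::-1]     # slice with step -1: the reversed string

print(length)       # 6
print(first_three)  # Ban
print(n_a)          # 3
print(upper)        # BANANA
print(backwards)    # ananaB
```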

Get the length of each of these sequences:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

Get the first codon (first three nucleotides) from each sequence:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

Count the number of adenines (A) in each sequence:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

Format all these sequences the same way (for example, all uppercase):

In [None]:
seqs = ["GTAATCG", "gtaccaaa", "GGtAGtACCaC"]

Reverse all the sequences:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

## Filtering Collections in a List Comprehension

What if you only want to include *some* values in a collection?  With the **if** statement and a **logical expression**, you can do it in a comprehension!  For example:

```python
>>> data = [1, 2, 3, 4]
>>> [x for x in data if x > 2]
[3, 4]
```

This can be combined with various transformations as well!

```python
>>> data = ["John", "Harry", "Moe", "Luke"]
>>> [x[0] for x in data if len(x) < 5]
['J', 'M', 'L']
```
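Written out as an explicit loop, a comprehension that filters and transforms has two parts: the trailing `if` decides which elements are kept, and the expression at the front transforms the ones that pass. A minimal sketch, with `result_comp` and `result_loop` as illustrative names:

```python
data = [1, 2, 3, 4]

# Comprehension: keep values greater than 2, then multiply each kept value by 10.
result_comp = [x * 10 for x in data if x > 2]

# Equivalent explicit loop.
result_loop = []
for x in data:
    if x > 2:                       # filter: skip values that fail the test
        result_loop.append(x * 10)  # transform: applied only to kept values

print(result_comp)  # [30, 40]
print(result_loop)  # [30, 40]
```

Because of the filter, the result can be shorter than the original list.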

**Exercises**

Get all positive values in the following list:

In [None]:
data = [-6, 3, -1, 10, -5, 0]

Make a list of all names that start with the letter "L":

In [None]:
names = ["John", "Harry", "Moe", "Luke"]

Only keep sequences with more than one thymine (T):

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

Only keep sequences shorter than 9 nucleotides:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

Lowercase the sequences (i.e., convert them to small letters), keeping only the ones that end with cytosine (C):

In [None]:
seqs = ["GTAATCG", "GTACCAAAC", "GGTAGTACCAC"]

Remove the missing data (the `None` values) from the list:

In [None]:
seqs = ["GTAATCG", None, None, "GTACCAAA", None, "GGTAGTACCAC", None]

## Conditional Transformations

Sometimes you want to transform different elements differently, depending on some condition in your analysis.  For example, what if you want to lowercase the names in the list that have 4 letters, and uppercase the other names?

```python
>>> data = ["John", "Harry", "Moe", "Luke"]
>>> [x.lower() if len(x) == 4 else x.upper() for x in data]
['john', 'HARRY', 'MOE', 'luke']
```

...and only keep the names that end in the letter "e"?

```python
>>> data = ["John", "Harry", "Moe", "Luke"]
>>> [x.lower() if len(x) == 4 else x.upper() for x in data if x[-1] == 'e']
['MOE', 'luke']
```
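The two `if`s in that last comprehension do different jobs. The `if ... else` before `for` is a conditional expression that picks a transformation for each element and always needs an `else`. The `if` after the `for` is a filter that drops elements. The explicit loop below is a sketch of the same logic; `result_loop`, `result_comp`, and `no_filter` are illustrative names.

```python
data = ["John", "Harry", "Moe", "Luke"]

result_loop = []
for x in data:
    if x[-1] == "e":        # filter (the `if` after `for`): drop other names
        if len(x) == 4:     # conditional expression: choose the transformation
            result_loop.append(x.lower())
        else:
            result_loop.append(x.upper())

result_comp = [x.lower() if len(x) == 4 else x.upper() for x in data if x[-1] == "e"]

# Without the trailing filter, every element is kept, so the length is unchanged.
no_filter = [x.lower() if len(x) == 4 else x.upper() for x in data]

print(result_loop)                  # ['MOE', 'luke']
print(result_loop == result_comp)   # True
print(len(no_filter) == len(data))  # True
```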

**Exercises**

If the second nucleotide is T, lowercase the letters. Otherwise, leave the sequence unchanged:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

If the sequence is shorter than 8 nucleotides, make it twice as long.

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

If the sequence ends in "A", count the number of As in the sequence.  Otherwise, count the number of Cs:

In [None]:
seqs = ["GTAATCG", "GTACCAAA", "GGTAGTACCAC"]

If the sequence length is divisible by 3, count the number of codons.  Otherwise, show `None` for that sequence:

In [None]:
seqs = ["GTAATCG", "GTACCAAAC", "GGTAGTACCAC", "GCATTA"]
