# Basic Data Structures

Introduction:

In a previous lecture, we learned about a few 'types' that Python variables can have, such as __int__ (1, 42, -100), __float__ (3.14, 1.618, 6.022e23) or __str__ ("hello", "goodbye"). We also learned how to work with them using __functions__, and how to control the tasks we perform through loops and Boolean operators.

Now we are going to learn how to work with ordered data. Just like books are stored on shelves in libraries, we have to store our data in an orderly fashion such that it can be accessed and manipulated systematically. Instead of shelves, we have many structures to choose from. This morning we will introduce the __list__ and the __tuple__ - two ways to store an ordered collection of items.

## Topics:
- __Lists__
- Modifying a list
- Lists, loops, and List Comprehensions
- __Tuples__
- Lists of lists


# Lists

A __list__ provides a way of storing an ordered series of values in a structure referenced by a single variable. 


We can define a list as a sequence of elements separated by commas and enclosed in square brackets __[ ]__. For example, we can create a list of strings as follows:

In [5]:
# How to create a list
li1 = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

Here, we have defined a variable, __li1__, as a list.

In [6]:
# How to access the data in a list (indexing starts at 0, so index 2 is the third item)
print li1[2]

Wednesday


In [3]:
# See the contents of the whole list
print li1

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


In [4]:
# How do I check if li1 is a list?
print type(li1)

<type 'list'>


### Types within types

Lists are interesting because they are containers that hold other types inside of them.

Here our lists are holding strings (type str).  However, they can hold any type, and not all entries in a list need to be the same type.


In [7]:
print type(li1[0])

<type 'str'>


In [8]:
mixed_list = ["Beethoven", 27, 2.718]

print type(mixed_list[0])  # Print the type of the first item
print type(mixed_list[1])  # ..................the second item
print type(mixed_list[2])  # etc..

<type 'str'>
<type 'int'>
<type 'float'>


You can also define lists using other variables.  When doing this, it's as if you took whatever was stored in the variable and typed it in instead.

In [9]:
# For example, these both create identical lists

sample_list1 = ['Bach', 'Beethoven', 'Mozart']

print sample_list1

['Bach', 'Beethoven', 'Mozart']


In [10]:
# same as

a = 'Bach'
b = 'Beethoven'
c = 'Mozart'

sample_list2 = [a, b, c]

print sample_list2

['Bach', 'Beethoven', 'Mozart']


Lists have lots of really useful features. 

One is that they are __ordered__, which means the order of items in a list __does not change__ (this is not true for dictionaries, as we will see later). This means you can access individual items in a list or entire sections by indexing or slicing. 

You can also manipulate your list using built-in methods (more on what this means later this week). For example, we can add to the list by using the __append()__ method.

# Modifying a list

### Adding to a List

In [11]:
#add single item to list - using 'append'
li1 = [4, 8, 15, 16, 23]
print 'Before'
print li1

li1.append(42)

print 'After'
print li1

Before
[4, 8, 15, 16, 23]
After
[4, 8, 15, 16, 23, 42]


In [12]:
#combine two lists - using 'extend'
li1 = ['Monday', 'Tuesday', 'Wednesday', 'Thursday']
li2 = ['Friday', 'Saturday', 'Sunday']

li1.extend(li2)

print "After using extend"
print "li1: ", li1

After using extend
li1:  ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


Note that when we used __append()__ and __extend()__, we did not assign the modified list to a new variable. This is because these methods modify the list object that already exists; they do not create a new one. A common mistake when learning to work with lists is to assign the output of __append()__ or __extend()__ to a new variable. Doing so will assign 'None' to the new variable.

In [14]:
li3 = ['Bach', 'Beethoven', 'Mozart']
print 'li3: ', li3
li4 = li3.append('Vivaldi')

print 'li4: ', li4
print 'li3: ', li3

li4 = li3
print li4

li3:  ['Bach', 'Beethoven', 'Mozart']
li4:  None
li3:  ['Bach', 'Beethoven', 'Mozart', 'Vivaldi']
['Bach', 'Beethoven', 'Mozart', 'Vivaldi']


### Note: extend vs append

The difference between extend and append can be confusing at first.

Make sure to note the difference:

`a.append(5)` increases the length of list `a` by one, putting the int 5 in the final slot.

`a.extend(b)` takes all of the elements in list `b` and adds them to the end of list `a`.  Here `b` is usually another list, although any iterable (such as a tuple) works.
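
A small sketch of the difference, using made-up lists `a`, `b` and `c` that are not part of the lecture cells. It is written with single-argument `print(...)` so that it should run under Python 2 (used in these notes) as well as Python 3. Note what happens when a list is handed to `append`: it goes in as one nested item.

```python
# append adds its argument as ONE item; extend adds each element separately
a = [1, 2, 3]
a.append([4, 5])      # the list [4, 5] becomes a single (nested) item
print(a)              # [1, 2, 3, [4, 5]]
print(len(a))         # 4

b = [1, 2, 3]
b.extend([4, 5])      # 4 and 5 are added one at a time
print(b)              # [1, 2, 3, 4, 5]

c = [1, 2, 3]
c.extend('hi')        # a string is iterable, so its characters are added
print(c)              # [1, 2, 3, 'h', 'i']
```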

In [15]:
#combine two lists by concatenation
li1 = ['one', 'two']
li2 = ['three', 'four']

li4 = li1 + li2 # Note: This creates a NEW list, so we assign it to a variable

print "li1: ", li1
print "li2: ", li2
print "li4: ", li4

li1:  ['one', 'two']
li2:  ['three', 'four']
li4:  ['one', 'two', 'three', 'four']


### Empty lists

You can create an empty list by not assigning any objects to it. Empty lists are useful to create new data structures to which you can add new elements using __append__ or __extend__.

In [16]:
# Creating empty lists using brackets
empty1 = []
print empty1
# Appending objects to an empty list
empty1.append('Beethoven') 
print empty1

[]
['Beethoven']


In [17]:
# Creating empty lists using list()
empty2 = list()
print empty2
# Extending an empty list
empty2.extend(['Bach', 'Beethoven', 'Mozart'])
print empty2

[]
['Bach', 'Beethoven', 'Mozart']


### Slicing a list

'Slicing' refers to grabbing a specific piece of a list.

Here are some examples:

In [22]:
#print out slices of list
my_list = ['Africa', 'Americas', 'Asia', 'Australia', 'Europe']

my_slice = my_list[0:3]
print my_slice

['Africa', 'Americas', 'Asia']


In [19]:
my_slice = my_list[2:4]
print my_slice

['Asia', 'Australia']


Notice how slicing 2:4 returns element 2 and element 3, but doesn't include element 4.

When you slice a list, you get back another list!
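
The "start included, stop excluded" rule can be checked with a short sketch. The variable names `continents` and `piece` are illustrative only, and the code uses single-argument `print(...)` so that it should run under both Python 2 and Python 3.

```python
continents = ['Africa', 'Americas', 'Asia', 'Australia', 'Europe']

piece = continents[1:4]          # elements 1, 2 and 3; element 4 is excluded
print(piece)                     # ['Americas', 'Asia', 'Australia']
print(len(piece))                # 3, i.e. 4 - 1

# Adjacent slices fit together with no overlap and no gap
print(continents[:2] + continents[2:] == continents)   # True

# The slice is a new list: changing it leaves the original alone
piece[0] = 'Antarctica'
print(continents[1])             # Americas
```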

In [23]:
my_slice = my_list[0:2]

print type(my_slice)

<type 'list'>


### Some slicing tricks

In [24]:
# If you leave out the first number, '0' is assumed
print my_list[0:2]
print my_list[:2] 

['Africa', 'Americas']
['Africa', 'Americas']


In [25]:
# If you leave out the last number, the size of the list is used
print my_list[2:]
print my_list[2:5]

['Asia', 'Australia', 'Europe']
['Asia', 'Australia', 'Europe']


In [26]:
# If you leave out both, you get a copy
print my_list
print my_list[:]

['Africa', 'Americas', 'Asia', 'Australia', 'Europe']
['Africa', 'Americas', 'Asia', 'Australia', 'Europe']


In [27]:
# Negative indices can be used, these count from the end of the list
print my_list[-1] # Get the last element

Europe


In [28]:
# Get the last 2 elements as a slice
print my_list[-2:]

['Australia', 'Europe']


Lastly, a third number (the step) can be used with slices to take every 2nd, every 3rd (etc.) item.

In [29]:
number_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print number_list[0:6:2]
print number_list[3:9:4]

[0, 2, 4]
[3, 7]


In [30]:
# Shortcuts can be used with these as well
print number_list[1::2] # take a slice with every other element.  End index is omitted so end of the list is assumed
print number_list[::2] # take a slice with every other element.  Start is also omitted, so 0 is assumed

[1, 3, 5, 7, 9]
[0, 2, 4, 6, 8]


In [34]:
# You can even use a negative step to count backwards
print number_list[::-2]

[9, 7, 5, 3, 1]


### Other useful list methods
You can find more documentation for python lists [here](https://docs.python.org/2/tutorial/datastructures.html)

The above commands illustrate several of the most common ways to grow lists:

1) The list method __append()__, which adds a single item to the end of a list

2) The list method __extend()__, which adds a whole list to the end of the list you ask to extend itself

3) The list concatenation operator __+__, which stitches two lists together to make a new list, without changing either original list.

The __insert()__ method is another way to add to a list. This method takes two arguments (in order): an index to insert at, and the object to insert. You can also insert by slicing, something like this: __li[2:2] = [var_to_insert]__. The first one is somewhat clearer, so it might be preferred unless you have very particular reasons for doing the other one.
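
As a sketch of the two approaches side by side (the list names `days_a` and `days_b` are made up for this illustration; single-argument `print(...)` keeps it runnable under Python 2 and 3):

```python
days_a = ['Monday', 'Tuesday', 'Thursday']
days_b = ['Monday', 'Tuesday', 'Thursday']

days_a.insert(2, 'Wednesday')      # insert(index, item)
days_b[2:2] = ['Wednesday']        # assign a one-item list to an empty slice

print(days_a)                      # ['Monday', 'Tuesday', 'Wednesday', 'Thursday']
print(days_a == days_b)            # True

# Slice assignment can also insert several items at once
days_b[4:4] = ['Friday', 'Saturday']
print(days_b)
```

The last line shows the main reason one might reach for the slice form: it can add more than one item in a single step.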

In [37]:
#using insert
li1 = ['Tuesday', 'Wednesday', 'Thursday', 'Friday']

print 'Before Insert'
print li1
print

li1.insert(0,'Monday')

print 'After Insert at index 0'
print li1
print

Before Insert
['Tuesday', 'Wednesday', 'Thursday', 'Friday']

After Insert at index 0
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']



### Removing items from a list

In [44]:
#remove any item from the list with 'del'
li= ['One','Ring','to','rule','them','all']
print li

del li[3:5]

print 'After using del:'
print li

['One', 'Ring', 'to', 'rule', 'them', 'all']
After using del:
['One', 'Ring', 'to', 'all']


In [42]:
#remove an item from the list with pop (pop(0) removes the first item)
li = ['One', 'to', 'them']
print 'Before pop'
print li
word=li.pop(0)
print 'After pop'
print li
print "Thing we popped:", word

Before pop
['One', 'to', 'them']
After pop
['to', 'them']
Thing we popped: One


Here, we are removing things from lists in two ways:

1) The __del__ statement removes a particular item (or slice) from the list.

2) The list method __pop()__ removes the last item from the list and returns it. Given an index, as in __pop(0)__, it removes and returns the item at that position instead.
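
A compact sketch contrasting the two, on a fresh list named `words` (a name chosen for this example only). It uses single-argument `print(...)` so it should behave the same under Python 2 and 3.

```python
words = ['One', 'Ring', 'to', 'rule', 'them', 'all']

del words[1]              # delete one item by index; nothing is returned
print(words)              # ['One', 'to', 'rule', 'them', 'all']

last = words.pop()        # no argument: remove and return the LAST item
print(last)               # all

first = words.pop(0)      # with an index: remove and return that item
print(first)              # One

print(words)              # ['to', 'rule', 'them']
```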

### Changing lists in place

In addition to adding things to lists and taking them away again, we can also __change lists in place__.

In [45]:
#create list of zeros
noLi = 4*[0]

# This is equivalent to:
#  noLi = [0] + [0] + [0] + [0]

#  Once you understand how the addition operator works, the multiplication operator
#  works in an analogous way.
#  3*4 is the same as adding 3 to itself 4 times, so
#  [3] * 4 is equivalent to adding the list [3] to itself 4 times.

print noLi

[0, 0, 0, 0]


In [46]:
#modify items in list
mice_brain = 10
rat_brain = 20
human_brain = 500

noLi[1] = rat_brain
noLi[2] = human_brain
noLi[3] = mice_brain

print noLi

[0, 20, 500, 10]


In [47]:
#sort list
print 'sorted list!'
noLi.sort()
print noLi

sorted list!
[0, 10, 20, 500]


In [48]:
#reverse order
print 'reverse the list'
noLi.reverse()
print noLi

reverse the list
[500, 20, 10, 0]


In [49]:
# sorting into a new list
print 'sorted() vs .sort()'
another_noLi = sorted(noLi)
print noLi
print another_noLi

sorted() vs .sort()
[500, 20, 10, 0]
[0, 10, 20, 500]


In [50]:
#sort string list
li= ['One','Ring','to','rule','them','all']
li.sort()

print li
# Why is it not sorting in alphabetical order?

['One', 'Ring', 'all', 'rule', 'them', 'to']
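
One way to answer the question in the cell above, offered as a sketch: strings are compared character by character using their character codes, and every uppercase ASCII letter has a smaller code than every lowercase one, so capitalized words sort first. Passing `key=str.lower` is one option for a case-insensitive sort (it assumes the items are plain `str` objects). The names `by_code` and `ignoring_case` are illustrative.

```python
words = ['One', 'Ring', 'to', 'rule', 'them', 'all']

print(ord('R') < ord('a'))    # True: 'R' has a smaller character code than 'a'

by_code = sorted(words)
print(by_code)                # ['One', 'Ring', 'all', 'rule', 'them', 'to']

# key=str.lower compares lower-cased versions but keeps the original strings
ignoring_case = sorted(words, key=str.lower)
print(ignoring_case)          # ['all', 'One', 'Ring', 'rule', 'them', 'to']
```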


Now, we've modified the list in a couple of important ways:

1) Overwritten items in the list by assigning to an index.

2) Sorted the list using the method __sort()__ and the function __sorted()__. __sort()__ works on the list in place, while __sorted()__ returns a new list.

3) Reversed the order of the list using the method __reverse()__.

These are, by the way, all demonstrations of the mutability of lists: unlike when you change a string or a number, you don't have to perform an operation and store the result because these operations change the list in place.
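
The contrast between an immutable string and a mutable list can be sketched as follows. The variables `name`, `values` and `result` are made up for this example, and single-argument `print(...)` is used so the code should run under Python 2 and 3.

```python
name = 'bach'
name.upper()                 # returns a NEW string; 'name' itself is unchanged
print(name)                  # bach
name = name.upper()          # to keep the result we must store it
print(name)                  # BACH

values = [3, 1, 2]
result = values.sort()       # sorts 'values' in place and returns None
print(values)                # [1, 2, 3]
print(result)                # None
```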

### Characterizing lists

In [51]:
#figure out how long a list is with 'len'
li= ['One','Ring','to','rule','them','all']
print len(li)

6


In [52]:
# max and min
noLi = [0, 10, 20, 500]
print len(noLi)
print 'Max =', max(noLi)
print 'Min =', min(noLi)

4
Max = 500
Min = 0


In [53]:
#find where something is stored
li= ['One','Ring','to','rule','them','all']
idx = li.index('rule')
print idx
print li[idx]

3
rule


Here, we have started to characterize our lists.

1) The built-in functions __len()__, __max()__ and __min()__ tell us how many items are in the list and the maximum and minimum values in the list.

2) The list method __index()__ tells us where an item is in the list.

3) We can iterate over each item in the list and print it using the syntax __for x in mylist__:

# Lists, loops, and List Comprehensions

We've already seen in a previous lecture that it is easy to loop through a list.

In [54]:
months = ["January", "February", "March",
         "April", "May", "June", "July",
         "August", "September", "October",
         "November", "December"]

# This loop prints the months
for x in months:
    print x

January
February
March
April
May
June
July
August
September
October
November
December


```python
# Is the same as if you did:
month = months[0]
print month
month = months[1]
print month
month = months[2]
print month
# Etc...
```

**range** is a special built-in function.
All it does is create a **list** of numbers (in Python 2, which these notes use).

In [55]:
a = range(10)
print type(a)
print a

<type 'list'>
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [56]:
a = range(3, 8)
print a

[3, 4, 5, 6, 7]


In [57]:
a = range(4, 20, 3)
print a

[4, 7, 10, 13, 16, 19]
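
The three call forms used above are `range(stop)`, `range(start, stop)` and `range(start, stop, step)`, and the stop value is never included. A brief sketch follows. Wrapping each call in `list()` is an addition not used in these notes: it changes nothing in Python 2 and is needed in Python 3, where `range` returns a lazy sequence rather than a list. The variable names are illustrative.

```python
first_five = list(range(5))
print(first_five)               # [0, 1, 2, 3, 4]

middle = list(range(2, 5))
print(middle)                   # [2, 3, 4]

countdown = list(range(10, 0, -3))   # a negative step counts down
print(countdown)                # [10, 7, 4, 1]

print(len(range(4, 20, 3)))     # 6
```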


What if we want access to the number (position in the list) for each month?

In [58]:
N = len(months)             # Get the length of our list
numbers = range(N)          # Create a list of numbers from 0 to N-1

for i in numbers:           # Loop through the list
    print i+1, months[i]    # Print out each month and its number

1 January
2 February
3 March
4 April
5 May
6 June
7 July
8 August
9 September
10 October
11 November
12 December


In [59]:
# Or, more succinctly:

for i in range(len(months)):
    print i+1, months[i]

1 January
2 February
3 March
4 April
5 May
6 June
7 July
8 August
9 September
10 October
11 November
12 December


Another useful shorthand is the **enumerate** function.
Often in a loop, you might find that you need the number of the element, as well as the element itself.

To avoid having to do:
```python
for i in range(len(my_list)):
    my_element = my_list[i]
    # Some code involving 'i' and 'my_element'
```

The **enumerate** function lets you loop over both at once

In [60]:
for i, month in enumerate(months):
    print i+1, month

1 January
2 February
3 March
4 April
5 May
6 June
7 July
8 August
9 September
10 October
11 November
12 December
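
To see what `enumerate` actually produces, here is a minimal sketch on a shortened `months` list. It also uses the optional second argument of `enumerate`, which sets the first index and so avoids writing `i+1`. The list `numbered` is a made-up name, and single-argument `print(...)` keeps the code runnable under Python 2 and 3.

```python
months = ['January', 'February', 'March']

# enumerate yields (index, item) pairs; wrap it in list() to see them
pairs = list(enumerate(months))
print(pairs)      # [(0, 'January'), (1, 'February'), (2, 'March')]

# An optional second argument sets the first index
numbered = []
for number, month in enumerate(months, 1):
    numbered.append(str(number) + ' ' + month)
print(numbered)   # ['1 January', '2 February', '3 March']
```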


### Transforming/Filtering lists using list comprehensions
What if you wanted to change the list so that all the letters in the names of the months are in CAPS?

In [61]:
# One way to do this
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]

caps_months = []
for month in months:
    caps_months.append(month.upper())  # Remember, if you have a string, then the '.upper()' method will return an all-caps version
    
print caps_months

['JANUARY', 'FEBRUARY', 'MARCH', 'APRIL', 'MAY', 'JUNE', 'JULY', 'AUGUST', 'SEPTEMBER', 'OCTOBER', 'NOVEMBER', 'DECEMBER']


In [62]:
# Another way - use a list comprehension
caps_months = [month.upper() for month in months]

print caps_months

['JANUARY', 'FEBRUARY', 'MARCH', 'APRIL', 'MAY', 'JUNE', 'JULY', 'AUGUST', 'SEPTEMBER', 'OCTOBER', 'NOVEMBER', 'DECEMBER']


__List comprehensions__ are a type of Python shorthand.

Use them if you are comfortable with them and feel like they make the code easier to read.

They are *slightly* faster than for loops, but the difference doesn't matter in most cases.
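
The general shape of a list comprehension is `[expression for item in iterable if condition]`, where the `if` part is optional. A small sketch with made-up names (`numbers`, `squares`, `even_squares`, `loop_result`), shown next to the loop it replaces:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Transform every item
squares = [n * n for n in numbers]

# Transform only the items that pass the condition
even_squares = [n * n for n in numbers if n % 2 == 0]

# The equivalent loop for even_squares
loop_result = []
for n in numbers:
    if n % 2 == 0:
        loop_result.append(n * n)

print(squares)                      # [1, 4, 9, 16, 25, 36]
print(even_squares)                 # [4, 16, 36]
print(even_squares == loop_result)  # True
```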

In [63]:
# Another example - create a list of upper-case month names that start with the letter 'J'

j_months = []
for month in months:
    if month[0] == 'J':
        j_months.append(month.upper())
        
print j_months


['JANUARY', 'JUNE', 'JULY']


In [64]:
# With a list comprehension

j_months = [month.upper() for month in months if month[0] == 'J']
print j_months

['JANUARY', 'JUNE', 'JULY']


In [65]:
# Or, if you didn't want to uppercase the month names

j_months = [month for month in months if month[0] == 'J']
print j_months

['January', 'June', 'July']


In [66]:
# You can use list comprehensions with enumerate too

j_months = []
for i, month in enumerate(months):
    if month[0] == 'J':
        j_months.append(i+1)
    
print j_months

# Or, as a list comprehension

j_months = [i+1 for i, month in enumerate(months) if month[0] == 'J']

print j_months

[1, 6, 7]
[1, 6, 7]


# Tuples

A tuple is essentially a list that you cannot change. You can index and slice tuples and add them together to make new tuples, but you cannot use __sort()__ or __reverse()__ on them, or delete or remove items from them. If you ever have a tuple that you want to change, you have to turn it into a list.
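
A short sketch of what does and does not work on a tuple. The values in `other` are invented for illustration, and the names `combined` and `ordered` are not from the lecture. Single-argument `print(...)` is used so the code should run under Python 2 and 3.

```python
snp = ('chrII', '378445')
other = ('chrV', '1024')       # made-up values for illustration

print(snp[1])                  # 378445
print(snp[:1])                 # ('chrII',)  a one-item tuple shows a trailing comma

combined = snp + other         # concatenation builds a NEW tuple
print(len(combined))           # 4

print(hasattr(snp, 'sort'))    # False: tuples have no sort() method

ordered = sorted(combined)     # sorted() works on a tuple, but returns a list
print(type(ordered) == list)   # True
```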

In [71]:
SNP = ('chrII', '378445')
print type(SNP)
print
print SNP
print SNP[0]

<type 'tuple'>

('chrII', '378445')
chrII


In the same fashion as in __lists__, you can iterate through __tuples__.

In [68]:
for x in SNP:
    print x

chrII
378445


However, attempting to modify a __tuple__ will raise an error.

In [69]:
#Let's see if python will let us change the SNP tuple:
SNP[0] = 'chrV'
print SNP

TypeError: 'tuple' object does not support item assignment

In [72]:
#What if we first coerce the tuple to a list?
SNP = list(SNP)
print type(SNP)
SNP[0] = 'chrV'
print SNP

<type 'list'>
['chrV', '378445']


Now that the tuple has been converted into a list, we can change the first element without an error. Notice that the function __list()__ can be used to turn another sequence, such as a tuple, into a list.

### So...what's the point of a Tuple then?

Tuples, like other immutable objects, are lighter than lists. Using tuples allows programmers to optimize their code.

In [1]:
import timeit
# Create 10 million lists
print 'list time: ', timeit.timeit("['Bach', 'Beethoven', 'Mozart']", number=10000000)

# Create 10 million tuples
print 'tuple time: ', timeit.timeit("('Bach', 'Beethoven', 'Mozart')", number=10000000)



list time:  1.04983997345
tuple time:  0.131025075912


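In this run, creating a tuple was about eight times faster than creating a list! (Part of that gap likely comes from the tuple here containing only constants, which Python can build once ahead of time.)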

Most of the time, you'll just use a list.  However, you need to be aware of tuples because they are commonly used by functions to return multiple values.

In [74]:
def maxmin(numeric_list):
    min_val = min(numeric_list)
    max_val = max(numeric_list)
    return max_val, min_val

out = maxmin([34, 4, 54, 103, 6])

# What exactly is out?
print type(out)

print out[0]
print out[1]

<type 'tuple'>
103
4


In [75]:
# However, even then you don't need to work with the tuple directly usually

largest, smallest = maxmin([34, 4, 54, 103, 6])
print largest
print smallest

103
4


# Summary So Far...

__Lists are:__

1) ordered collections of arbitrary variables.

2) accessible by indexing and slicing.

3) able to be grown or shrunk in place.

4) mutable (can be changed in place).

5) defined with my_list = [X, Y]

List methods include: 

__append(x)__: Add 'x' to the end of the list  
__extend(Z)__: Add the contents of list 'Z' to the end of the list  
__insert(i, x)__: Insert item 'x' at position 'i' (use 0 for the start of the list)  
__pop()__: Remove an item from the end of the list (or from a given index) and return it  
__reverse()__: Reverse the list (in-place)  
__index(x)__: Find the location (index) of x in the list  
__sort()__: Sort the list (in-place)  

Built in functions include: 

__sorted(Z)__:  Get a sorted copy of 'Z' (doesn't modify Z itself)  
__len__:  Get the number of items in the list (i.e., its length)  
__max__:  Get the maximum of all list items  
__min__:  Get the minimum of all list items  
__type__: Get the type of a python variable  

Questions?

# List of Lists

Let's first start by making a couple of lists.

In [76]:
# Things related to research
time_wasters = ['facebook', '9gag', 'snapface', 'reddit'] # instead of working, this is what we do
lab_space = ['wet lab', 'cold room', 'shared space'] # potentially where we waste time

print 'time_wasters:', time_wasters
print 'lab_space:', lab_space

time_wasters: ['facebook', '9gag', 'snapface', 'reddit']
lab_space: ['wet lab', 'cold room', 'shared space']


Incidentally, making a list of lists is fairly simple -- we can just create a new list variable and fill it with lists that we've already defined. Another way would be to manually input everything ourselves.

In [77]:
research = [time_wasters, lab_space]

# time_wasters and lab_space are both lists already.  So what's in research now?

print research

[['facebook', '9gag', 'snapface', 'reddit'], ['wet lab', 'cold room', 'shared space']]


In [78]:
print 'Number of items in `research`', len(research)
print type(research[0])
print type(research[1])

Number of items in `research` 2
<type 'list'>
<type 'list'>


In [80]:
# Or you could define a list of lists all in one statement

research = [
    ['facebook', '9gag', 'snapface', 'reddit'],
    ['wet lab', 'cold room', 'shared space']
]
# each list within the main list is contained in its own square brackets
print research

[['facebook', '9gag', 'snapface', 'reddit'], ['wet lab', 'cold room', 'shared space']]


### Retrieving Elements in List of Lists
Getting elements in a list of lists is similar to getting elements in a list. The difference is that we add another index. Let's first see what happens when we try to retrieve the first and second elements of "research".

In [81]:
# Let's get the first list in research
List_a = research[0]
List_b = research[1]

print 'List_a has ', List_a
print 'List_b has ', List_b

List_a has  ['facebook', '9gag', 'snapface', 'reddit']
List_b has  ['wet lab', 'cold room', 'shared space']


Let's now try to retrieve 'snapface' and 'cold room' from each list. The natural way, now that we have 2 different lists, is simply to index them, but it can be a pain if you have a lot of lists nested within a list. We can use two sets of indices instead.

In [82]:
# Long way
# Print the 3rd item OF the first list
List_a = research[0]
print(List_a[2])

# Print the 2nd item OF the second list
List_b = research[1]
print(List_b[1])


snapface
cold room


In [83]:
# Shorter way without creating new variables for each nested list
a = (research[0])[2]
b = research[1][1]

print a
print b

snapface
cold room


This works for as many levels of nesting as you have; just keep adding indices until you get what you want.

In [84]:
# E.g.

big_List = [
    [
        [1,2,3],
        [6,5,4],
        [7,8,2]
    ],
    [
        [11,12,13],
        [15,15,15],
        [83,94,19]
    ]
]

print big_List[1][2][0]

83


### Operations on Nested Lists
Like regular lists, all other list operations still work. Let's add a list of hangout places to *research*.

In [85]:
# Make a list of hangouts
time_wasters = ['facebook', '9gag', 'snapface', 'reddit']
lab_space = ['wet lab', 'cold room', 'shared space']
research = [time_wasters, lab_space]

hangout = ['Jupiter','Gardens','SF']
research.append(hangout) # research should now have a sublist of hangouts

print research
print "Number of items in research:", len(research)

[['facebook', '9gag', 'snapface', 'reddit'], ['wet lab', 'cold room', 'shared space'], ['Jupiter', 'Gardens', 'SF']]
Number of items in research: 3


What happens if you modify the *hangout* sublist under *research*?

In [86]:
# Add Starbucks to hangout, which is also the sublist research[2]
hangout.append('Starbucks')

print research[2]
print hangout

['Jupiter', 'Gardens', 'SF', 'Starbucks']
['Jupiter', 'Gardens', 'SF', 'Starbucks']


Notice how the change shows up in both places? That's because the 2 lists are the exact same one, just referenced from 2 different locations. If this is a problem, simply copy the elements of the *hangout* list and add the copy to *research*.

In [87]:
# Using a copy of 'hangout' instead - Method 1
# Reinitialize the original research list
time_wasters = ['facebook', '9gag', 'snapface', 'reddit']
lab_space = ['wet lab', 'cold room', 'shared space']
research = [time_wasters, lab_space]

research.append([]) # add a new empty list to be filled
research[2].append(hangout[0]) # start adding stuff from hangout list
research[2].append(hangout[1])
research[2].append(hangout[2])
research[2].append('Peets') # add a new thing - this ISN'T added to hangout since we created a new list
print hangout
print research[2] # Now they're different!

['Jupiter', 'Gardens', 'SF', 'Starbucks']
['Jupiter', 'Gardens', 'SF', 'Peets']


In [88]:
time_wasters = ['facebook', '9gag', 'snapface', 'reddit']
lab_space = ['wet lab', 'cold room', 'shared space']
research = [time_wasters, lab_space]

# Method 2
research.append(hangout[:])  # Using [:] makes a copy
research[2].append('Peets')
print hangout
print research[2]

['Jupiter', 'Gardens', 'SF', 'Starbucks']
['Jupiter', 'Gardens', 'SF', 'Starbucks', 'Peets']


In this case, we modified *research* without altering *hangout*.
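
Whether two names refer to the same list or to separate copies can be checked with the `is` operator. The sketch below uses illustrative names (`alias`, `copy_a`, `copy_b`) and single-argument `print(...)` so that it should run under Python 2 and 3.

```python
hangout = ['Jupiter', 'Gardens', 'SF']

alias = hangout            # a second NAME for the same list
copy_a = hangout[:]        # a new list with the same items
copy_b = list(hangout)     # another way to make a copy

print(alias is hangout)    # True
print(copy_a is hangout)   # False
print(copy_a == hangout)   # True (same contents)

hangout.append('Starbucks')
print(alias)               # ['Jupiter', 'Gardens', 'SF', 'Starbucks']
print(copy_a)              # ['Jupiter', 'Gardens', 'SF']
print(copy_b)              # ['Jupiter', 'Gardens', 'SF']
```

Both copies are shallow: if the list being copied itself contains lists, the inner lists are still shared between the original and the copy.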

*PythonTutor.com example*

## Nested Loops
These are very useful for generating nested data structures or pulling out data from nested data structures. As implied, these are simply loops within loops. First, let's make 2 lists that we want to work with.

In [89]:
# Create two lists of letters and numbers
letters = ['a','b','c','d']
numbers = [1,2,3,4]

Then we make a function that will create each pairwise combination of letters and numbers

In [90]:
def combo(list_a, list_b):
    for i in list_a:      # 'i' holds the item in list_a
        for j in list_b:  # 'j' holds the item in list_b
            print i, j
            
combo(letters, numbers)

a 1
a 2
a 3
a 4
b 1
b 2
b 3
b 4
c 1
c 2
c 3
c 4
d 1
d 2
d 3
d 4


In this case, changing the order of the lists simply changes the order in which it prints. It's up to you how you want your data to look.

In [91]:
combo(numbers, letters)

1 a
1 b
1 c
1 d
2 a
2 b
2 c
2 d
3 a
3 b
3 c
3 d
4 a
4 b
4 c
4 d
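
Printing is fine for a demonstration, but nested loops are often used to build a new data structure. As a sketch, the hypothetical function `combo_pairs` below (not part of the lecture code) collects each pairing into a list of tuples instead of printing it. Single-argument `print(...)` keeps it runnable under Python 2 and 3.

```python
def combo_pairs(list_a, list_b):
    """Return every (a, b) pairing as a list of tuples instead of printing."""
    pairs = []
    for i in list_a:          # outer loop: runs once per item in list_a
        for j in list_b:      # inner loop: runs fully for each outer item
            pairs.append((i, j))
    return pairs

pairs = combo_pairs(['a', 'b'], [1, 2, 3])
print(pairs)
# [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)]
print(len(pairs))    # 6, i.e. 2 * 3

# The same result as a list comprehension with two 'for' clauses
print(pairs == [(i, j) for i in ['a', 'b'] for j in [1, 2, 3]])   # True
```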


## Use With Nested Lists
How does this work with nested data structures? Let's use the original *research* list and find all the lab spaces and ways we can procrastinate.

In [92]:
# reinitialize research
time_wasters = ['facebook', '9gag', 'snapface', 'reddit']
lab_space = ['wet lab', 'cold room', 'shared space']
research = [time_wasters, lab_space]

for i in research[0]:
    for j in research[1]:
        print 'procrastinate with {} in the {}'.format(i, j)
            

procrastinate with facebook in the wet lab
procrastinate with facebook in the cold room
procrastinate with facebook in the shared space
procrastinate with 9gag in the wet lab
procrastinate with 9gag in the cold room
procrastinate with 9gag in the shared space
procrastinate with snapface in the wet lab
procrastinate with snapface in the cold room
procrastinate with snapface in the shared space
procrastinate with reddit in the wet lab
procrastinate with reddit in the cold room
procrastinate with reddit in the shared space


# Exercises

### 1) Get comfortable with new data structures (adapted from Learning Python)
Type the commands below. Add comments to your script describing what is happening in each line.

```python
L = [1,2,3] + [4,5,6]
print L
print L[:]
print L[:0]
print L[-2]
print L[-2:]

L.reverse()
print L

L.sort()
print L

idx = L.index(4)
print idx
```

### 2) Fun with numbers

Here we're going to get some practice with lists by writing a few functions to do some basic math operations on lists of numbers.  Later in the week, we'll learn about a python *module* called numpy that has all of this and more built-in.

**a. Write a function that multiplies two lists of numbers together**

It should take, as input arguments, two lists.
It should return a new list, where each element comes from multiplying the two corresponding elements in the input lists together.

Example Result:

```python
a = [1, 2, 3, 4, 5]
b = [5, 6, 7, 8, 9]
out = your_function(a, b)
print out #prints [5, 12, 21, 32, 45]
```

**b. Write a function that takes in a list of numbers and returns the median**

*Don't forget about the 'sort' method (or the 'sorted' function) - it will be useful here*

Example Result:

```python
a = [3, 53, 57, 23, 41]
out = your_function(a)
print out #prints 41
```

**c. Write a function that filters an input list of numbers**

It'll have 3 arguments:

1. the list of numbers to filter
2. the lower bound
3. the upper bound

The function should return a new list, containing only elements from the original list that are greater than (or equal to) the lower bound and less than (or equal to) the upper bound.

Example Result:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
lower = 2
upper = 6
out = my_filter(a, lower, upper)
print out # prints [2, 3, 4, 5, 6]
```

### 3) A list of lists
You want to store the results of three different time series experiments, each with four data points. You should do this by creating a list of lists.

Data:

    run1: 2,3,5,5
    run2: 2,2,4,5
    run3: 3,3,4,6

a) Create an empty list and create a list for each of the 3 runs. Use the append function 3 times to generate the list of lists. Finally, print out the list of lists

b) Write a function that will loop through these lists and give us the min, mean, and max. The output should be tab delimited and look like:

    <run number>   <min>   <mean>   <max>

### 4) List transpose

Write a function that will compute the *transpose* of a list of lists

Reminder:  A transpose means that `new_list[x][y]` = `old_list[y][x]`.  In other words, rows become columns and columns become rows.

Example Result:

```python
input_list = [
    [1, 2, 3, 4],
    [5, 6, 7, 8]
]

output_list = transpose(input_list)

# Now output list is:
#  [
#    [1, 5],
#    [2, 6],
#    [3, 7],
#    [4, 8]
#  ]
```

### 5) Protein Sequences and Motifs (Challenge)

The code in the cell below loads three lists from a file; they contain data from 1000 random genes in the human proteome.

    gene_name_list:  List of gene names
    header_list:     Fasta sequence header (can ignore this)
    seq_list:        Amino acid sequence for the gene's protein
    
Write a function that takes, as input, a short amino acid sequence and as output, returns a list of tuples, with each tuple containing the index of the gene where the sequence was found, the name of the gene which contains it, and the position in the amino acid string where it was found.

For Example:

```python

# Find sequence 'LLL' in strings in seq_list
result = find_sequence(gene_name_list, seq_list, 'LLL')

# result
# [
#    ...
#    (67, 'CCNF', 456),  # found in gene 67, with symbol CCNF, at sequence position 456
#    (69, 'GPRASP1', 1158), # found in gene 69, with symbol GPRASP1, at sequence position 1158
#    ...
# ]
```

For an extra challenge, let your match sequence have wild-card characters.  For example, 'L..LL' can match one Leucine, followed by any two amino acids, followed by two Leucines.

In [None]:
# For the challenge problem: load the data from file
import cPickle as pickle
with open('protein_data_sub.pkl', 'rb') as fin:
    gene_name_list, header_list, seq_list = pickle.load(fin)