## Lab 6
### Nested Lists, Nested Loops, and Pairwise Sequence Comparison

Today we will learn about nesting: lists within lists and loops within loops.
We’ve already encountered `if` statements within `for` loops, but today we’ll write multiple nested loops to demonstrate pairwise sequence comparison, e.g. as we would do it in global sequence alignment.

You will follow along in a Jupyter notebook, and run the code as we go through it together.
Feel free to experiment as we progress to get a better feel for how things work.

#### REMEMBER, THE INTERNET IS YOUR FRIEND: [Python.org Tutorials](http://docs.python.org/tutorial/)
---

#### Part A: A review of lists

A list is a sequential object with multiple individual elements that can be strings, numbers, or even other lists. We can easily *nest* one list within another list, and therefore create a matrix or a table. First, let’s refresh our memories about lists and the elements they can store. Check out the code chunk below. Once you've read it, run the chunk, and see whether the output matches your understanding.

In [2]:
#First things first:
#to run the code press ctrl-enter
#the code below is written for Python 3.x: print() is a function, so it needs parentheses

#lists are good for storing more than one piece of data
number_list = [1, 2, 3, 4, 5]
print(number_list)

#lists are good for storing different data types
misc_list = [1, 2.4, 'Dmitriy', 5, 'David']
print(misc_list)

#lists can also store other lists - this is called nesting
#this is a list of the two previous lists!
list_of_lists = [[1, 2, 3, 4, 5], [1, 2.4, 'Dmitriy', 5,'David']]
print(list_of_lists)

[1, 2, 3, 4, 5]
[1, 2.4, 'Dmitriy', 5, 'David']
[[1, 2, 3, 4, 5], [1, 2.4, 'Dmitriy', 5, 'David']]


Check out `list_of_lists` -- it's exactly what the variable name says! 

As a reminder, Python _indexes from 0_. So the first element, `list_of_lists[0]`, has the same contents as `number_list`, and `list_of_lists[1]` has the same contents as `misc_list`.
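The indexing described above can be sketched as follows. This is a minimal example, assuming we build `list_of_lists` from the two existing variables rather than retyping their contents; the names `first_inner` and `third_of_second` are made up for illustration.

```python
# Rebuild the lists from Part A
number_list = [1, 2, 3, 4, 5]
misc_list = [1, 2.4, 'Dmitriy', 5, 'David']
list_of_lists = [number_list, misc_list]

# One index selects an inner list
first_inner = list_of_lists[0]
print(first_inner)             # [1, 2, 3, 4, 5]

# A second index selects an element inside that inner list
third_of_second = list_of_lists[1][2]
print(third_of_second)         # Dmitriy
```

Reading `list_of_lists[1][2]` left to right helps: first pick inner list 1, then pick item 2 of that list.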

---

#### Part B: Nested Lists
1 | 2 | 3
--- | --- | ---
4 | 5 | 6
7 | 8 | 9
10 | 11 | 12
13 | 14 | 15

We’ve learned that we can nest one list inside another list. This means that we can use
nested lists to create, for example, a matrix or a table of numbers (see above).
We’ve also learned how to access the individual elements of nested lists, by first accessing
the list itself using its index, and then accessing the element using another level of indexing.
Let’s try to represent the above matrix of numbers in Python, as a list of lists!

In [1]:
#Matrix of numbers: 5 rows by 3 columns
#rows = nested lists (5 nested lists)
#columns = elements within a list (3 elements per row)
matrix = [[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15]]

print ('full matrix:', matrix)
#total number of rows (length of full matrix)
print ('number of rows:', len(matrix))
#total number of columns (length of the first row)
print ('number of columns:', len(matrix[0]))

full matrix: [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
number of rows: 5
number of columns: 3


What if we want to access individual elements in the matrix?
First, identify the row, then identify the column.
*E.g.*, the number 11 is in row 4, column 2 of the table. Because Python is *zero-indexed*, the code to access this number is `matrix[3][1]`.
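Here is a short sketch of that two-step lookup, first the row and then the column. The variable names `row` and `element` are only for illustration.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]

row = matrix[3]        # fourth row of the table
element = row[1]       # second item of that row
print(row, element)    # [10, 11, 12] 11

# The same lookup written as a single expression
print(matrix[3][1])    # 11
```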

---

#### Part C: Nested Loops
Now that we’ve learned how to make nested lists and access individual elements of nested
lists, let’s try sequentially accessing elements in nested lists, _without knowing their indexes
ahead of time_. We can go through every column of every row by having a loop within a loop.

In [6]:
#let's get the (row, column) index for every element in our matrix
#the first loop goes through each row
#the second loop goes through each column WITHIN THAT ROW
#i = row index, j = column index, matrix[i][j] = item at position (i, j)
print ('row column element')

#for row i in the matrix
for i in range(len(matrix)):
    #for column j in row i
    for j in range(len(matrix[i])):
        print (i, j, matrix[i][j], sep = "\t")

#what does the code chunk 'sep = "\t"' do?
               
# Hint: to add line numbers, type ctrl-m, then l # new version: shift + L

row column element
0	0	1
0	1	2
0	2	3
1	0	4
1	1	5
1	2	6
2	0	7
2	1	8
2	2	9
3	0	10
3	1	11
3	2	12
4	0	13
4	1	14
4	2	15


This script is only four lines of code, but it does a very complex operation; this is the beauty
of programming. Make sure you understand how it works!

Hint: imagine covering up the table and looking only at the 1st row (row 0), then looking only
at the 1st item (item 0): the item in (0,0) is 1. Similarly, the item in (4,0) is 13: 5th row, 1st
item. Remember, Python starts its counting at 0, so all indexes will seem off by one!
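To trace the order in which a nested loop visits a table, it can help to try a smaller one. The sketch below uses a hypothetical 2x2 matrix called `small` (not part of the lab) and records each `(row, column, element)` visit in a list named `visited`.

```python
small = [[1, 2], [3, 4]]   # hypothetical 2x2 matrix, smaller than the lab's
visited = []               # (row, column, element) in the order visited

for i in range(len(small)):            # outer loop: rows
    for j in range(len(small[i])):     # inner loop: columns of row i
        visited.append((i, j, small[i][j]))

print(visited)
# [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
```

Notice that `j` runs through all of its values before `i` moves on to the next row.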

---

#### Part D: Comparing sequences using nested loops
We now know how to make nested lists and nested loops, which gives us all the
components we need for pairwise sequence comparison. Remember, in Python, strings are
sequential objects, meaning that they contain multiple elements organized in a specific order
– _just like lists_.

In [14]:
#Let's compare the sequence GATTACA to GATAGA
x = 'GATTACA'
y = 'GATAGA'

#Previously, our nested loops accessed the same variable (matrix)
#We can set up our loops so that the outer loop accesses one variable,
#and the inner loop accesses another.

#i = position in sequence x; x[i] is the letter at the ith position of x
#j = position in sequence y; y[j] is the letter at the jth position of y
print ('i x[i]  j y[j] match?')

#For every letter of x...
for i in range(len(x)):
    #For every letter of y...
    for j in range(len(y)):
        #Compare each letter
        #If there's a match...
        if x[i] == y[j]:
            print(i, x[i], j, y[j], 'YES', sep ="   ")
        #If there's a mismatch...
        else:
            print(i, x[i], j, y[j], 'NO', sep ="   ")

#what does the code chunk 'sep = "   "' do? # separator between the arguments in print()

i x[i]  j y[j] match?
0   G   0   G   YES
0   G   1   A   NO
0   G   2   T   NO
0   G   3   A   NO
0   G   4   G   YES
0   G   5   A   NO
1   A   0   G   NO
1   A   1   A   YES
1   A   2   T   NO
1   A   3   A   YES
1   A   4   G   NO
1   A   5   A   YES
2   T   0   G   NO
2   T   1   A   NO
2   T   2   T   YES
2   T   3   A   NO
2   T   4   G   NO
2   T   5   A   NO
3   T   0   G   NO
3   T   1   A   NO
3   T   2   T   YES
3   T   3   A   NO
3   T   4   G   NO
3   T   5   A   NO
4   A   0   G   NO
4   A   1   A   YES
4   A   2   T   NO
4   A   3   A   YES
4   A   4   G   NO
4   A   5   A   YES
5   C   0   G   NO
5   C   1   A   NO
5   C   2   T   NO
5   C   3   A   NO
5   C   4   G   NO
5   C   5   A   NO
6   A   0   G   NO
6   A   1   A   YES
6   A   2   T   NO
6   A   3   A   YES
6   A   4   G   NO
6   A   5   A   YES


Again, this is only a handful of lines of code, but it does a lot of work! The first `for` loop goes
through every letter of `x`. Then the second `for` loop goes through every letter of `y`. So for
every _letter_ of `x`, the computer checks all of `y = GATAGA`, _every time_! This is why you see
the letters of `GATAGA` repeated down the `y[j]` column of the output. This is very tedious and not the way that humans would do it,
but it works for computers.

Hint: imagine looking at `x = GATTACA`, and looking only at the first letter: `x[0] = G`. Then look at
`y = GATAGA`, again only at the first letter: `y[0] = G`. Then look at the second letter (`y[1] = A`), the third letter
(`y[2] = T`), etc. Once you’ve finished with `y`, look at the second letter of `x`: `x[1] = A`! Then
look through `y[0]`, `y[1]`, `y[2]` again, and so on.
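Nested loops and nested lists fit together naturally here. As a rough sketch (not a full alignment method), the same comparison can be stored in a table instead of printed: one row per letter of `x`, one column per letter of `y`, with 1 for a match and 0 for a mismatch. The names `match_matrix` and `total_matches` are made up for this example.

```python
x = 'GATTACA'
y = 'GATAGA'

match_matrix = []          # one row per letter of x
total_matches = 0          # counter for all matching pairs

for i in range(len(x)):
    row = []               # one entry per letter of y
    for j in range(len(y)):
        if x[i] == y[j]:
            row.append(1)
            total_matches = total_matches + 1
        else:
            row.append(0)
    match_matrix.append(row)

for row in match_matrix:
    print(row)
print('total matches:', total_matches)   # total matches: 13
```

The first row printed is `[1, 0, 0, 0, 1, 0]`, which corresponds to the two YES lines for `x[0] = G` in the output above.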

---

#### Part E: appending to lists and incrementing counters
So far, we have initialized lists with existing elements, but we can also initialize an empty
list, and add to the list later. This is similar to the idea of incrementing a counter. In Lab 03,
we learned advanced list manipulation, and learned how to add items to a list using the
`append()` function.

Tip: `str()` will change the object (such as an integer) into a string; `+` will concatenate
two strings together into one string. This is a new way to print strings and integers on a line
together.
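A minimal sketch of the tip above, using made-up values for `base` and `position`:

```python
position = 3
base = 'G'

# str() converts the integer so that + can join it to the other strings
label = base + '(' + str(position) + ')'
print(label)           # G(3)

# Without str(), base + '(' + position would raise a TypeError,
# because + cannot join a string and an integer
```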

In [9]:
#Let's count the purines (A, G) in the following sequence
#We'll use range(len()) to identify the index of the purine
#We'll use a counter to count the total number of purines
#and we'll use an empty list to save the purines in a new variable
sequence = 'CAGTCAGTAAACCTGG'
num_purines = 0    #counter
purines = []       #new EMPTY list for the purines

print ('Sequence:', sequence)
print ('purine (index):')

#Loop through each letter of the sequence:
for i in range(len(sequence)):
    #Use a variable to store the current nucleotide:
    nucleotide = sequence[i]
    #Check if the nucleotide is a purine
    if nucleotide in ['A', 'G']:
        num_purines = num_purines + 1  #We found a new purine!
        purines.append(nucleotide)     #Save that specific base
        #Print out the 1-based position in the sequence (index i + 1)
        print(nucleotide + '(' + str(i + 1) + ')')
        
print ('Total number of purines:', num_purines)
print ('List of purines found, in order:', purines)

Sequence: CAGTCAGTAAACCTGG
purine (index):
A(2)
G(3)
A(6)
G(7)
A(9)
A(10)
A(11)
G(15)
G(16)
Total number of purines: 9
List of purines found, in order: ['A', 'G', 'A', 'G', 'A', 'A', 'A', 'G', 'G']


---

# Lab Task
For today’s lab task, we’ll use nested loops and lists to create a new number
matrix. We’ll use the `range()` function in a unique way, by making use of its step argument,
which allows it to increment on a number other than 1. First, let’s read through Python’s
documentation on `range()`:

In [10]:
#Remember, you can always learn how a function works by passing
#its name to the help() function
#help(range)

#...or by playing with examples:
m = [1,2,3]
print(m)

print ("Function 'range(stop)': ")
for i in range(5):
    print (i, end =" ")
print()

print ("Function 'range(start, stop)':")
for i in range(2,5):
    print (i, end =" ")
print()

print ("Function 'range(start, stop, step)': ")
for i in range(2,10,3):
    print (i, end =" ")
print()

print ("Function 'range(start, stop, step)': ")
for i in range(2,3,10):
    print (i, end =" ")
print()

print ("Function 'range(start, stop, step)': ")
for i in range(2,20,10):
    print (i, end =" ")
print("\n\n")

help(range)

[1, 2, 3]
Function 'range(stop)': 
0 1 2 3 4 
Function 'range(start, stop)':
2 3 4 
Function 'range(start, stop, step)': 
2 5 8 
Function 'range(start, stop, step)': 
2 
Function 'range(start, stop, step)': 
2 12 


Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      True if self else False
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=val

Your goal for today is to use `range()` and the list `append()` method to make the following number matrix:

1 | 2 | 3 | 4 | 5
--- | --- | --- | --- | ---
6 | 7 | 8 | 9 | 10
11 | 12 | 13 | 14 | 15

**OBJECTIVES:**

In the Python cell below:
1. Create an empty list where you will subsequently store your matrix.
2. Use `range()` to generate a list of numbers that begin each row of the matrix. Save the result in a variable called `begin_numbers`; you will use it later to generate the rows of your table. Remember that `range()` stops *before* `stop`, so make sure `stop` is larger than the last number you want. (*Hint:* your list should be `[1, 6, 11]`. What values of `start`, `stop`, and `step` do you need to enter in order to get such a list?)
3. Use a `for` loop to go through `begin_numbers`. As you go through the loop, use `range()` again to generate that row of the table. For example, the first element, `begin_numbers[0]`, can be used to generate the first row of the table (`[1, 2, 3, 4, 5]`). Once you've generated the row, save it in the matrix using the `append()` function. (This needs to be done in the `for` loop!)
4. Finally, print your completed matrix.
5. When your code is working properly, save a copy of this .ipynb file in the format "Lab6" + your name, and submit it on Blackboard for credit.
6. *Don't forget to write comments!*


In [23]:
#write your python code here
# to add line numbers, type ctrl-m, then l

#create an empty list to store matrix
my_matrix =[]

#create a list to store begin numbers
begin_numbers =[]

#generate a list for begin numbers
for i in range(1,12,5):
    begin_numbers.append(i)
#check the first result
#print(begin_numbers) 

#construct each row for my matrix 
#initiate an empty row list
row_list=[]

#every row list begins with a begin number j
for j in begin_numbers:
    #the row runs from j up to (but not including) j+5, with step 1
    for n in range(j,j+5):
        row_list.append(n)
    # add row list to final matrix
    my_matrix.append(row_list)
    # start the next row with a new empty list
    row_list=[]

#print each row of the matrix on its own line
for row in my_matrix:
    print(row)

[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
[11, 12, 13, 14, 15]
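For comparison, here is a more compact sketch of the same task. It relies on the fact that `list()` turns a `range` object directly into a list, so neither the helper loop for `begin_numbers` nor the inner loop is needed. This is one possible variation, not the required solution; the loop variable `start` is a name chosen for this example.

```python
# list(range(...)) builds the list of row-starting numbers in one step
begin_numbers = list(range(1, 12, 5))   # [1, 6, 11]

my_matrix = []
for start in begin_numbers:
    # each row holds five consecutive numbers beginning at start
    my_matrix.append(list(range(start, start + 5)))

for row in my_matrix:
    print(row)
```

Both versions produce the same three rows; the longer version above makes each step of the nested loop explicit, which is the point of today's lab.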
