<div align ="right">Thomas Jefferson University <b>COMP 101</b>: Intro to Coding</div>

# Loops and control continued


In our last notebook we just skimmed the surface of what we can do with loops. We will continue with some additional loop techniques here.

## Evaluation within a loop

Loops are just a structure for handling data; usually it is what we do inside the loop that matters. It's a simple matter to get data from some source, such as a list or other data structure, and *do something* with it inside a loop. One of the most common things we might want to do with a loop is perform the same calculation on a number of different values.
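
As a minimal sketch of that idea, the loop below applies one calculation to every value in a list. The list `miles` and its values are made up for illustration and are not part of the exercises.

```python
# Illustrative data (not from the notebook): distances in miles
miles = [1, 5, 10, 26.2]

for distance in miles:
    km = distance * 1.609344          # the same calculation is applied to each item
    print(distance, 'miles is', round(km, 2), 'km')
```

The body of the loop is written once, but it runs once for each of the four values.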

### Exercise 1
To warm up and review loop syntax, write a `for` loop. For each item in the list of numbers given below, output a statement in the form "The square of 2 is 4.", replacing the values with those from the list.

In [None]:
side_list = [1,2,5,10,13,25,1000057]

#
## Your code here
#

We can also use if-then statements in loops. An if-else statement embedded within the loop causes the program to behave differently depending on the data it encounters.
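
Here is a small sketch of an if-else statement inside a loop. The list `numbers` and the counter `evens` are hypothetical names chosen for this illustration.

```python
# Illustrative data (not from the notebook)
numbers = [3, 8, 15, 22]
evens = 0                        # counts how many even values we have seen

for n in numbers:
    if n % 2 == 0:               # a remainder of 0 means n is divisible by 2
        print(n, 'is even')
        evens += 1
    else:
        print(n, 'is odd')

print('even values found:', evens)
```

Each pass through the loop takes exactly one of the two branches, depending on the current value of `n`.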

### Exercise 2
Repeat the program above, but this time for any value of a 'side' less than or equal to 10 output the statement 'You don't need a calculator for that!'

In [None]:
side_list = [1,2,5,10,13,25,1000057]

#
## Your code here
#

## Iteration or appending values within a loop

As we saw in the introduction to the `while` loop, we can increment a value inside a loop to create a counter that keeps track of how many times the loop has run. Incrementing within a loop is also useful for counting particular events. Look at the `for` loop below, which iterates over a string. If for whatever reason we didn't want to use the `.count()` method, we could use a `for` loop like this to count the frequency of the different nucleotides.

In [None]:
dna = 'AGTGGCTATTACTACATGCCGAAGTTCCTTAAATTTAACTTACCAGGCTTAACCGGATGATGATTATTATTACCTTAATTTTA'

a = t = c = g = 0

for i in dna:
    if i == 'A':
        a +=1
    elif i == 'T':
        t +=1
    elif i == 'C':
        c +=1
    elif i == 'G':
        g +=1                               
        
print (a, 'A:', t, 'T:',c, 'C:',g, 'G')
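
For comparison, the sketch below counts one nucleotide both ways: with a loop counter and with the built-in string method `.count()`. The sequence `short_dna` is made up and kept short so the result can be checked by hand.

```python
# A short made-up sequence, small enough to verify by eye
short_dna = 'AGTGGCTATTAC'

loop_count = 0
for base in short_dna:
    if base == 'T':
        loop_count += 1                    # increment the counter on each match

method_count = short_dna.count('T')        # built-in string method does the same job

print(loop_count, method_count)            # both values should agree
```

The loop version is longer, but it can count several different characters in a single pass, as the example above does.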
   

So far we have been using our loops to print out values as output. But we might also want to store the results in their own list. A common technique for doing this is the `.append()` method, which hopefully you recall adds an item to the end of a list. (Strings have no `.append()` method; to add characters to a string, use `+` or `+=`.) This is how it would look in pseudocode:
```
a_list = [ a list of values ]
answers = []                                # before starting the loop, we make sure that the variable
                                            ## answers points to an empty list
for i in a_list:
    new_value = some transformation of i    # some calculation is performed on each item in the list
    answers.append(new_value)               # the result of that calculation is added to answers
```   
Now we end up with a list of the answers we calculated, and we can go on to do more calculations with those values. 
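
A concrete version of the pseudocode might look like the sketch below. The names `inches` and `centimeters` and the values are assumptions made for this illustration.

```python
inches = [1, 2, 12, 36]          # illustrative values
centimeters = []                 # before the loop, start with an empty list

for length in inches:
    cm = length * 2.54           # some calculation is performed on each item
    centimeters.append(cm)       # the result is added to the end of the new list

print(centimeters)
print(len(centimeters), 'values stored')
```

After the loop, `centimeters` has one entry for every entry in `inches`, in the same order.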

### Exercise 3
Using the `range` function that we learned about in the previous notebook and the technique described above write a program that produces two lists. One called `side` that contains the numbers 1-50 and a second list called `area` that contains the squares of those values. 

In [None]:
#
## Your code here
#

## The `continue` and `break` commands within loops

Sometimes we want a loop to skip over particular elements. We can always set up `if-elif-else` statements to handle different sorts of conditions. But sometimes we may want an easy way to stop the loop's code block from executing under certain conditions, or we may want to terminate a loop altogether if certain conditions are met.

In the example below we have some data with `NA` values, which mean that the value is missing for some of the data. Let's say we are performing a simple calculation on the data and all we care about at the end of the program is having the set of values for which data *were* available.

In [None]:
## Continue command example: convert degrees Fahrenheit to degrees Celsius

F_data = [87, 62, 'NA', 77, 84, 82, 'NA',91]
C_data = []

for temp in F_data:
    if temp == 'NA':
        continue
    else:
        temp_C = (temp -32)* 5 / 9
        C_data.append(temp_C)
 
print(C_data)


### Question 
Describe in your own words the path that the `for` loop takes through the data because of the `continue` command.

Double click on this text to add your answer here:

- 
- 
- 
-

`break` serves a different purpose. It causes a loop to stop altogether. Take a look at the example below.

In [None]:
## Break command example
bad_dna = 'AGTGGCTAXTACTACATGCCGAAGTTCCTTAAATTTAACTTACCAGGCTTAACCGGATGATGATTATTATTAZCTTAATTTTA'

a = t = c = g = 0                                                             # initial value of all
                                                                              ## counters set to 0
for i in bad_dna:
    if i == 'A':
        a +=1
    elif i == 'T':
        t +=1
    elif i == 'C':
        c +=1
    elif i == 'G':
        g +=1 
    else:
        bad = bad_dna.index(i)
        print('There is a non-A,T,G, or C value ', i ,' at position', bad+1)   # index value + 1 because
                                                                               ## in bioinformatics we
                                                                               ## count residues from 1
                
        print('numbers below represent counts through position ' , bad)        # bad (0-based index) equals
                                                                               ## the number of bases counted
        print()
        break
        
print (a, 'A:', t, 'T:', c, 'C:', g, 'G')

You can see that what we are doing here is stopping short because we don't have good data to work with. Our program is set up to handle a limited set of inputs, so when we realize that the input doesn't match our expectation, we stop the loop. This would be especially important if what we were doing was not just counting but something computationally expensive, or if the dataset we were attempting to analyze was very large. Why waste computer time and energy running a loop once a problem has been identified?

Note the use of the `.index()` method above, which returns the position of the first occurrence of a value.
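
One thing to keep in mind is that `.index()` reports the first match only. That is fine in the example above, because the loop stops at the first unexpected character, but it can mislead when a value is repeated. The sketch below uses a made-up sequence and the built-in `enumerate`, which has not been covered in this notebook and is shown here only as one possible alternative.

```python
seq = 'ACXGTX'                               # made-up sequence with two unexpected characters

print(seq.index('X'))                        # position of the FIRST 'X' only, counting from 0

bad_positions = []
for position, base in enumerate(seq):        # enumerate gives each item with its position
    if base not in 'ATCG':
        bad_positions.append(position + 1)   # +1 to count residues from 1

print(bad_positions)
```

Here `.index('X')` can only ever find the first `X`, while the loop with `enumerate` records both.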

Another good use of `break` is when a sufficiently good solution has already been calculated by the program. 
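
A minimal sketch of that second use is shown below. The list `guesses`, the `target`, and the tolerance of 0.01 are all invented for illustration; "good enough" here just means close to the target.

```python
# Made-up candidate values; 'good enough' means within 0.01 of the target
target = 2.0
guesses = [1.0, 1.5, 1.9, 1.995, 1.9999, 2.0]
checked = 0

for guess in guesses:
    checked += 1
    if abs(target - guess) < 0.01:           # close enough, so stop looking
        print('Accepting', guess, 'after checking', checked, 'values')
        break
```

The last two values in `guesses` are never examined, because the loop ends as soon as an acceptable value is found.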

### Question 
Describe in your own words the path that the `for` loop takes through the data because of the `break` command in the example above. How was the `.index()` method used to make the output more informative to the user of the program? Can you think of some other ways the `.index()` method might be used in a more complex dataset?

Double click on this text to add your answer here:

- 
- 
- 
-

# Nested Loops

We can use nested loops to extend looping operations into multiple dimensions. (You: Multiple dimensions!? Like the multiverse? Me: No. More boring.) So what could we mean by that?

Let's consider some data that is organized as a list of lists. For our example we will use a list that represents a gradebook. Each item in the list is itself a list that consists of a student ID as a string and a number of integer test scores. So this list has more than one dimension: one dimension corresponds to the individual students, and another corresponds to the data associated with each student. You can picture this the way data is typically stored in a table or a spreadsheet, with rows as one dimension of the data and columns as the other.
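
To make the two dimensions concrete, here is a small sketch with two rows in the same shape as the gradebook described above. The name `small_book` and its values are illustrative only.

```python
# Two rows in the same shape as the gradebook (values are illustrative)
small_book = [['stu001', 90, 85],
              ['stu002', 70, 95]]

first_student = small_book[0]        # first dimension: pick a student (a row)
first_score = small_book[0][1]       # second dimension: pick an item within that row

print(first_student)
print(first_score)
print(len(small_book), 'students,', len(first_student) - 1, 'scores each')
```

One index selects a whole row; a second index selects a single item inside that row.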

Let's say we are one of those teachers who drops the lowest test score. We want to write a program that outputs the scores in a list and highlights the lowest score for each student.


In [None]:
gradebook = [['axc007',87,93,86,72],           # we can break lines at commas; this all could have
             ['bxj387',94,66,85,91],           ## been written on one line as well, just harder to
             ['tsa547', 97,94,93,98],          ## look at.
             ['xrd982', 74,81,83,88],
             ['jph921',81,62,77,84]]

for student in gradebook:                       # student is just taking the place of `i` here
                                                ## on the grounds that student within gradebook
                                                ## is easier to understand
    
    for ii in student:                          # ii now represents each item in the 'student' list
                                                ## this is the nested loop. I used ii instead of i 
                                                ## to remind myself that this is a second loop.
        
        if  student.index(ii) == 0:             # assume the first item in each list is ID
            print('student ID: ', ii) 
            
        elif ii == min(student[1:5]):           # if it's the lowest score value, indicate that
            print('score = ', ii, ' <-minimum score')
            
        else:                                   # otherwise just print the score
            print('score = ', ii)
        

Note how using some natural language made this program easier to understand. We didn't have to designate the items within the list `gradebook` as `student`; we could have just left it as `i`. But I think it's a lot easier to understand as written. I didn't use `grade` in the second loop because each list contains more than just grades. Instead I used `ii` to remind myself that it represents the items being examined in the **nested** loop, which is just what it sounds like: a loop contained entirely within another loop. Complex programs can have multiple layers of loops nested within one another.
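
A stripped-down sketch can make the nesting easier to see. The lists `rows` and `cols` below are made up; the point is only the order in which the two loops run.

```python
rows = [1, 2, 3]
cols = [10, 20]
products = []

for r in rows:                   # outer loop: runs once for each item in rows
    for c in cols:               # inner loop: runs completely for EACH value of r
        products.append(r * c)
        print(r, c, r * c)

print(products)
```

The inner loop finishes all of its passes before the outer loop moves on to its next item, so the body runs 3 x 2 = 6 times in total.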

### Questions
In your own words describe the path that the nested `for` loops must be taking through the data.

*review* Where is a *slice* operation used in this program? Explain why it is coded the way it is.

Double click on this text to add your answer here:

- 
- 
- 
-

## Summary Exercises

### Exercise 4 
Use an if-then statement embedded within a loop to generate a new string called `my_comp_dna` that contains the complementary strand of DNA to `my_dna`. A-T and G-C are complementary pairs, so every time an 'A' appears in `my_dna` a 'T' should appear in `my_comp_dna`, and every time a 'T' appears in `my_dna` an 'A' should appear in `my_comp_dna`. The same applies to 'G' and 'C'. Print both strings side by side so that the answer can easily be checked.


In [None]:
my_dna = 'AGTGGCTATTACTACATGCCGAAGTTCCTTAAATTTAACTTACCAGGCTTAACCGGATGATGATTATTATTACCTTAATTTTA'


#
# Your code here
#

### Exercise 5
Given the same gradebook data as before write a program that produces the following output. For each student in the data set give the student ID and give the lowest test score. Give the remaining scores that were used to calculate the average, and give the average both with and without the lowest test score removed. You may need to go back and consult the worksheet on list methods to remind yourself of operations that you can perform on a list that may be helpful here.  

Bonus challenge if you finish early: would your program run if there were a variable number of scores per student? The commented-out gradebook data contains variable numbers of scores per student. See if you can make it work.


In [None]:
gradebook = [['axc007',87,93,86,72],['bxj387',94,66,85,91],['tsa547', 97,94,93,98],['xrd982', 74,81,83,88],['jph921',81,62,77,84]]

#uncomment the variable below to attempt the challenge problem
#gradebook = [['axc007',87,93,86,72,89],['bxj387',94,66,85,91],['tsa547', 97,94,93,98,94],['xrd982', 74,81,83,88],['jph921',81,62,84]]



#
## Your code here
#


