## Summary:

1. Lists and manipulations:
    * What are they and why do we use them?
    * How are they stored in memory?
    * Methods: Adding items
    * Methods: Removing items
  
# This module: loops and a few more pieces of lists


In [None]:
#let's remember how to create a file:
f = open("data.csv","w+")
# I have created a few files within the jupyter notebooks so that you didn't need to upload additional files.
# We can dissect out the list below and discuss the elements:
my_list=['Drosophila melanogaster,atatatatatcgcgtatatatacgactatatgcattaattatagcatatcgatatatatatcgatattatatcgcattatacgcgcgtaattatatcgcgtaattacga,kdy647,264\n', 'Drosophila melanogaster,actgtgacgtgtactgtacgactatcgatacgtagtactgatcgctactgtaatgcatccatgctgacgtatctaagt,jdg766,185\n', 'Drosophila simulans,atcgatcatgtcgatcgatgatgcatccgactatcgtcgatcgtgatcgatcgatcgatcatcgatcgatgtcgatcatgtcgatatcgt,kdy533,485\n', 'Drosophila yakuba,cgcgcgctcgcgcatacggcctaatgcgcgcgctagcgatgc,hdt739,85\n', 'Drosophila ananassae,ttacgatcgatcgatcgatcgatcgtcgatcgtcgatgctacatcgatcatcatcggattagtcacatcgatcgatcatcgactgatcgtcgatcgtagatgctgacatcgatagca,hdu045,356\n', 'Drosophila ananassae,gcatcgatcgatcgcggcgcatcgatcgcgatcatcgatcatacgcgtcatatctatacgtcactgccgcgcgtatctacgcgatgactagctagact,teg436,222\n']

for item in my_list:
    f.write(str(item))
f.close()

# Loops
* Repetition with variation
* Allows us to process **lists** or **other iterable objects**  (files, for instance, can be iterated) one element at a time

## For loops
Use a loop when you want to apply the same procedure/manipulation to a bunch of items - for instance, not coincidentally, to the elements of a list!
We can iterate over each element in a list (we can also apply criteria/conditions to choose which elements are processed). When we have a specific number of items to iterate through, we use **for**; when we don't know in advance how many iterations we need, we use **while** (note: while comes with some baggage...)
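
As a minimal sketch of the difference between the two kinds of loop (the names `codons`, `sequence` and `position` are made up for this illustration): a for loop visits a known collection of items, while a while loop keeps going until its condition stops being true, so you have to update that condition yourself.

```python
# for: we know exactly which items we will visit
codons = ["ATG", "GCA", "TAA"]
visited = []
for codon in codons:
    visited.append(codon)
print(visited)

# while: we do not know in advance how many steps are needed;
# here we walk along a sequence until we reach the first "T"
sequence = "GGCATGC"
position = 0
while position < len(sequence) and sequence[position] != "T":
    position += 1   # the 'baggage': forget this line and the loop never ends
print(position)
```

The first half of the while condition is a safety net: if the sequence contains no "T" at all, the loop still stops at the end of the string.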

### Syntax

__for number in my_list:__

Body of the loop - do something for each element in my_list
* The body of the loop must have the same indentation and you can use spaces or tabs but not a combination of both - indentation errors can result!
* **STICK TO FOUR SPACES of indentation**

We can exit a particular loop with the **break** keyword (note that if you use break on an inner for loop, it will continue looping through any outer for loops; it only 'breaks' out of the for loop in its **scope**)

We will probably encounter the **pass** keyword, too. It does nothing and is used as a placeholder for future code: because Python uses whitespace to mark blocks, an indented block cannot be left empty.
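
Here is a small sketch of both keywords, assuming two nested loops with made-up variable names (`outer`, `inner`, `pairs_seen`). It is only meant to show that break leaves the inner loop while the outer loop keeps going, and that pass lets an unfinished loop body run without error.

```python
# break: stop the INNER loop early; the outer loop carries on
pairs_seen = []
for outer in range(3):
    for inner in range(5):
        if inner == 2:
            break              # leaves only the inner loop
        pairs_seen.append((outer, inner))
print(pairs_seen)

# pass: a do-nothing placeholder so the loop body is not empty
for outer in range(3):
    pass                       # to be filled in later
```

Each value of `outer` is paired only with `inner` values 0 and 1, because the inner loop is cut short as soon as `inner` reaches 2.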

### Scope

In the example a couple of cells below, `i` is the loop variable: it is only meaningful inside the loop, where it is set to each element in turn. You can't use it outside of the loop and expect it to still iterate. If you use it after the loop, it just gives you the last element that was assigned (because that is what is stored in memory).

The loop variable automatically steps through the provided list: it is set to the next element each time through the loop. This is markedly different from counter-style loops in some other languages, say Java, where you keep track of each iteration yourself by initializing a counter (usually to 0) and incrementing it with an operator like i++.

### Indent the action in the loop

indentation is functionally the same as curly brackets in other languages - incorrect indentation will lead to indentation errors

![Loops](infiniteloop.png)

In [None]:
#stolen from Codecademy, which had an easily readable section on for loops, if you are confused!
# This is a common approach to filling a list!
# empty list:
hobbies = []
print(hobbies)
# in this example, the loop variable could be named almost anything. In this case, it is 'i'
# range() starts counting at 0 unless explicitly given a lower bound.
# range behaves in different ways depending on how many arguments are provided to it.
# We saw a similar response with slicing, which could behave differently depending on
# how many arguments were given: list_name[:upper bound] versus list_name[lower bound:upper bound]
# versus list_name[lower:upper:increment]
# This behaviour is called "overloading" and we will see this in more detail below.
# For now, notice that range(3) can also be written as: range(0,3) or range(0,3,1).
for i in range(3):
    hobby=input("What's your hobby?: ")
    print("~~~~~~~")
    hobbies.append(hobby)
    print(hobbies)
    print("We are in the "+str(i)+" iteration")
    #if this was java, we would need to add 1 to the counter for each time through a loop
    # But it isn't java so we don't have to! Hurray!
print(hobbies)
#print(hobby)
print(i)

[]


What's your hobby?:  reading


~~~~~~~
['reading']
We are in the 0 iteration


What's your hobby?:  dancing


~~~~~~~
['reading', 'dancing']
We are in the 1 iteration


What's your hobby?:  traveling


~~~~~~~
['reading', 'dancing', 'traveling']
We are in the 2 iteration
['reading', 'dancing', 'traveling']
2


## Looping with ranges:
* range() is a built-in function that generates a sequence of numbers for us to loop over (in Python 3 the numbers are produced on demand rather than stored as a list)
* behaviour of range() is dependent on how many arguments we give it (overloading):
    1. one number: range(n) --> 0 to n-1
    2. two numbers: range(lower number, higher number) --> lower number to higher number-1
    3. three numbers: range(lower number, higher number,increment size)
* inclusive on lower end, exclusive on upper end
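
A quick way to check what a given range() call will produce is to wrap it in list(), which forces every number to be generated. The sketch below uses made-up variable names and adds one case beyond the three listed above: a negative increment, which counts downwards.

```python
# list() forces range() to show every number it would produce
one_arg = list(range(4))              # 0 up to (not including) 4
two_args = list(range(3, 8))          # 3 up to (not including) 8
three_args = list(range(10, 0, -3))   # count DOWN from 10 in steps of 3
print(one_arg)
print(two_args)
print(three_args)
```

The upper bound is excluded in every case, including when counting down.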

In [None]:
# REVIEW THIS!
#loops with ranges examples:
#ranges
for number in range(6):
    print(number)

print("-----")
for number in range(3, 8):
    print("Hey I am in the range(3,8) loop")
    print(number)

print("*****")
for number in range(2, 14, 4):
    print(number)

0
1
2
3
4
5
-----
Hey I am in the range(3,8) loop
3
Hey I am in the range(3,8) loop
4
Hey I am in the range(3,8) loop
5
Hey I am in the range(3,8) loop
6
Hey I am in the range(3,8) loop
7
*****
2
6
10


### You will iterate over a string or a list A LOT
* if you write a loop statement with a string where a list would be, the loop will process each character in the string as an element (one character at a time)
* a Python for loop has no explicit index counter (unlike counter-style loops in C++, Java etc.), but we sometimes need one:
    * The built-in function enumerate() supplies an index alongside each element of the list as you go through it, so you can keep track of where each item is located
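
enumerate() starts counting at 0 by default, but it also accepts an optional `start` argument. The sketch below (with a made-up `species` list) numbers the items from 1, which is often friendlier when printing for people rather than indexing into a list.

```python
species = ["melanogaster", "simulans", "yakuba"]
numbered = []
# start=1 makes the count begin at 1 instead of 0
for position, name in enumerate(species, start=1):
    numbered.append(str(position) + ": " + name)
print(numbered)
```

Keep in mind that with `start=1` the number no longer matches the list index, so `species[position]` would point at the wrong element.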

In [None]:
#Iteration over a String
thing = "spam!"

#iterate over each character in this string
for c in thing:
    print(c)
print("~~~~~~~")
word = "eggs!"
for a in word:
    print(a)

# You could also iterate over a list and use the built in function enumerate to keep track of the index
choices = ["Spam pizza", "Spam & pasta", "Spam & salad", "Spam nachos"]

print("Your choices are:")

for index_1, item in enumerate(choices):
    print(index_1, item)

s
p
a
m
!
~~~~~~~
e
g
g
s
!
Your choices are:
0 Spam pizza
1 Spam & pasta
2 Spam & salad
3 Spam nachos


### Sometimes, we want to iterate over MULTIPLE lists simultaneously
* Built-in function **zip** creates pairs (or larger groups) of elements when passed two (or more) lists, and stops at the end of the shortest list
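
As a rough sketch of the "or more" case, here zip is given three lists at once. The list names are invented for this example, and `lengths` is deliberately one item short to show where zip stops.

```python
species = ["melanogaster", "simulans", "yakuba"]
ids = ["kdy647", "kdy533", "hdt739"]
lengths = [264, 485]          # deliberately one item short

records = []
for name, seq_id, length in zip(species, ids, lengths):
    records.append((name, seq_id, length))
print(records)
```

Only two records are built: zip stops silently at the shortest list, so "yakuba" is never processed. It is worth checking that your lists have the same length before zipping them.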

In [None]:
print("And now for something completely different - multiple lists: ")
# iterating over multiple lists simultaneously
list_a = [3, 9, 17, 15, 40]
list_b = [2, 4, 17, 15, 30, 40, 50, 60, 70, 80, 90]

for a, b in zip(list_a, list_b):
    # we will learn about Boolean logic soon. != means not equal to
    if a != b:
        if a >b:
            print(a)
        else:
            print(b)
    else:
        print("a and b are equal")

And now for something completely different - multiple lists: 
3
9
a and b are equal
a and b are equal
40


### We can split a string to make a list!
* .split() <--works on strings to produce a list which we can then iterate over!
    * takes a single argument, a delimiter, which is the point at which the original string is split
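
One place this comes in handy is pulling apart a comma-separated line like the ones written to data.csv earlier. The sketch below is only an illustration (the variable names `fields` and `clean_names` are made up): it removes the hidden newline before splitting, and uses .strip() to tidy up the spaces that are left behind when the delimiter is followed by a space.

```python
# one line in the same layout as data.csv: species,sequence,id,number
line = "Drosophila yakuba,cgcgcgctcgcgcatacggcctaatgcgcgcgctagcgatgc,hdt739,85\n"
fields = line.rstrip("\n").split(",")   # drop the newline, then split on commas
print(fields)

# splitting on "," leaves the space after each comma attached to the next item
names = "melanogaster, simulans, yakuba, ananassae"
clean_names = []
for name in names.split(","):
    clean_names.append(name.strip())    # strip() removes leading/trailing whitespace
print(clean_names)
```

Note that every field is still a string after splitting: the last field is "85", not the number 85.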

In [None]:
names="melanogaster, simulans, yakuba, ananassae"
# note that the specified argument is what is being used as the split criteria and is not included in the resulting list
species=names.split(",")
print("This string is now a list: "+str(species))
# you could substitute 'an' instead of ',' and see what is printed out
#species2=names.split("an")
#print("This string is now a list: "+str(species2))
#---------------------------------
# NOTE: split gives unanticipated behaviour if the delimiter is at the very beginning or the very end of the string
mySeq="GGGATGACATTTTATCCCATCGGA"
testlist=mySeq.split("AT")
print(testlist)
mySeqBeginningorEnd="ATGGGATGACATTTTATCCCATCGGAT"
testlist2=mySeqBeginningorEnd.split("AT")
print(testlist2)
# See? there are now two empty elements, corresponding to the AT at the beginning of the string and at the end

This string is now a list: ['melanogaster', ' simulans', ' yakuba', ' ananassae']
['GGG', 'GAC', 'TTT', 'CCC', 'CGGA']
['', 'GGG', 'GAC', 'TTT', 'CCC', 'CGG', '']


### We can iterate over a file object
* file object can be turned into a list for the purpose of looping (similar to how a string is turned into a list)
* When a file object is turned into a list, each line becomes an element that can be iterated over

__Warning: When reading data from a file use either read method (Module2B), which stores entire contents in a variable, or use loop method, which deals with each line separately. If you mix them you will get unexpected behaviour!__
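
The sketch below illustrates the warning. It assumes a small throwaway file created just for the demonstration (the file name `demo_lines.txt` and all variable names are made up), and it uses a `with` block, which closes the file automatically when the block ends.

```python
import os
import tempfile

# make a small throwaway file with three lines
path = os.path.join(tempfile.mkdtemp(), "demo_lines.txt")
with open(path, "w") as handle:
    handle.write("line one\nline two\nline three\n")

with open(path) as handle:
    everything = handle.read()      # read() moves to the END of the file
    leftover = []
    for line in handle:             # nothing is left to loop over
        leftover.append(line)

with open(path) as handle:          # reopen: we start from the top again
    looped = []
    for line in handle:
        looped.append(line)

print(len(leftover), len(looped))
```

After read() has consumed the whole file, the loop in the same block finds nothing to iterate over, so `leftover` stays empty. Reopening the file and using only the loop gives all three lines.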

In [None]:
#first store a list of lines in the file which, coincidentally, is what readlines() does!
file = open("data.csv")
#this will be a pointer to the file object, named 'file'
print(file)
print("*"*20)
all_lines = file.readlines()
#all_lines will print out all the content of the file object
print(all_lines)
# useful to note that there are hidden characters present, \n
print("*"*20)
# print the lengths. Let's use the handy enumerate function to list the index of each of the elements
for index,line in enumerate(all_lines):
    print(index," and what it contains: ",line)
    print("___"*10)
    print("The length is " + str(len(line)))

# print the first characters
for line in all_lines:
    print("The first character is " + line[0])

<_io.TextIOWrapper name='data.csv' mode='r' encoding='UTF-8'>
********************
['Drosophila melanogaster,atatatatatcgcgtatatatacgactatatgcattaattatagcatatcgatatatatatcgatattatatcgcattatacgcgcgtaattatatcgcgtaattacga,kdy647,264\n', 'Drosophila melanogaster,actgtgacgtgtactgtacgactatcgatacgtagtactgatcgctactgtaatgcatccatgctgacgtatctaagt,jdg766,185\n', 'Drosophila simulans,atcgatcatgtcgatcgatgatgcatccgactatcgtcgatcgtgatcgatcgatcgatcatcgatcgatgtcgatcatgtcgatatcgt,kdy533,485\n', 'Drosophila yakuba,cgcgcgctcgcgcatacggcctaatgcgcgcgctagcgatgc,hdt739,85\n', 'Drosophila ananassae,ttacgatcgatcgatcgatcgatcgtcgatcgtcgatgctacatcgatcatcatcggattagtcacatcgatcgatcatcgactgatcgtcgatcgtagatgctgacatcgatagca,hdu045,356\n', 'Drosophila ananassae,gcatcgatcgatcgcggcgcatcgatcgcgatcatcgatcatacgcgtcatatctatacgtcactgccgcgcgtatctacgcgatgactagctagact,teg436,222\n']
********************
0  and what it contains:  Drosophila melanogaster,atatatatatcgcgtatatatacgactatatgcattaattatagcatatcgatatatatatcgatattatatcgcattatac

### Sorting lists
* .sort() modifies the original list in place (strings will be alphabetized) rather than returning a new list
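
If you want to keep the original order as well, the built-in function sorted() is the counterpart that does return a new list. A minimal sketch of the contrast, with made-up variable names:

```python
start_list = [5, 3, 1, 2, 4]

sorted_copy = sorted(start_list)   # new list; start_list is untouched
print(sorted_copy, start_list)

returned = start_list.sort()       # sorts in place and returns None
print(returned, start_list)
```

A common slip is writing `my_list = my_list.sort()`, which replaces the list with None.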

In [None]:
#here is a boring list
start_list = [5, 3, 1, 2, 4]
print(id(start_list))  # identity of the list object before sorting
# This is a common strategy: instantiate an empty list outside of a loop (due to 'scope')
# and then fill it up/manipulate it in a loop
square_list = []
for number in start_list:
    square_list.append(number**2)
print(square_list)
print("~~~~~~~~The list created in the loop, has been sorted: ~~~~~~~~~~~")
square_list.sort()
print(square_list)
print("~~~~~~~~~~The starting list is: ~~~~~~~~")
print(start_list)
print("~~~~~~~~~~~Here it is sorted~~~~~~~~")
start_list.sort()
print(start_list)
print(id(start_list))  # same id as before sorting: sort() changed the list in place
print("~~~~~~~You can see above that the sort method has changed the original list~~~~~~~~~~~~")

4406070912
[25, 9, 1, 4, 16]
~~~~~~~~The list created in the loop, has been sorted: ~~~~~~~~~~~
[1, 4, 9, 16, 25]
~~~~~~~~~~The starting list is: ~~~~~~~~
[5, 3, 1, 2, 4]
~~~~~~~~~~~Here it is sorted~~~~~~~~
[1, 2, 3, 4, 5]
4406070912
~~~~~~~You can see above that the sort method has changed the original list~~~~~~~~~~~~


## List Comprehensions!
* A shorthand for building a list with a loop
* Generate lists according to rules, using the for/in and if keywords
* __Reduce loops to one-line commands__
* Syntax is even more important than usual: because they are 'shorthand', they can be challenging to read
* Basic format:
        
            L=[expression for variable in sequence]

* the expression in a list comprehension is evaluated once for every element of the given sequence, with the variable set to that element

In [None]:
#1. list comprehension that creates a list of even numbers up to and including 50
evens_to_50=[i for i in range(51) if i%2==0]
print(evens_to_50)

#2. We can revisit slicing using a list built by a list comprehension:
l = [i ** 2 for i in range(1, 11)]
#below is an example of slicing a list - the second print only shows a subset

print(l)
print(l[2:9:2])
# *****************************************
# I realized that I had not emphasized this useful point before so.....
# As an additional point, you can slice in reverse by using -increment like so:
print("List in reverse now:")
print(l[8:1:-2])
# *****************************************
#3. prints out a list of the doubled values of 1-5 (so 2-10) that are divisible by 3. So it should result in [6].

doubles_by_3 = [x*2 for x in range(1,6) if (x*2) % 3 == 0]
print(doubles_by_3)

#4. Another example: This should only print out the power of 2 for even numbers so the result will be: 4,16,36,64,100

even_squares = [x**2 for x in range(1,11) if x%2 ==0]

print(even_squares)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[9, 25, 49, 81]
List in reverse now:
[81, 49, 25, 9]
[6]
[4, 16, 36, 64, 100]


In [None]:
#Expanded code that explains what the above 4 list comprehensions do
#1. The larger script that is equivalent to the first list comprehension is:
#initializing an empty list that you are going to fill using the for loop
# evens_to_50=[i for i in range(51) if i%2==0]
# print(evens_to_50)
evens_to_50=[]
# I decided to make another list that contains the odd numbers
#odds_to_50=[]
for i in range(51):
    if i%2==0:
        evens_to_50.append(i)
    #else:
     #   odds_to_50.append(i)
print(evens_to_50)
#print(odds_to_50)

#2.
l=[]
for i in range(1,11):
    l.append(i**2)
print(l[2:9:2])

#3
doubles_by_3 = []
for x in range(1,6):
    if (x*2)%3 == 0:
        doubles_by_3.append(x*2)
#print("what do I expect: 6. What do I get:  ")
print(doubles_by_3)

#4
even_squares = []
for x in range(1,11):
    if x%2 ==0:
        even_squares.append(x**2)

print(even_squares)

# Group Questions:  
_________________________________
1. (5 min) Pseudocode is sufficient! Let's see if we can combine our knowledge of lists and loops to create a string of random A,G,T,C values that is 25 nucleotides long. How about *n* nucleotides long? <-- we can use the same type of set empty list, fill it from within a loop strategy, but use empty string instead!

2. (10 min) Using a for loop turn the following sequence (it can be hard-coded into your cell) into a list of codons:  **5’- ATCGATCGATCGATCGACTGACTAATCATAGCTATGCATGCTACTCGATCGATCGATCGATCGATCGATCGATCGATCGATCATGCTAACATCGATCGATATCGATGCATCGACTAGTACTAT-3'**. You should end up with a list that prints ["ATC","GAT", etc].

3. (15 min) The following contains 5 DNA sequences, one per line. You will need to copy and paste these sequences into one plain text file in the same directory as your Jupyter notebooks so that you can open the file in your Jupyter notebook.
   

ATTCGATTATAAGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC
ATTCGATTATAAGCACTGATCGATCGATCGATCGATCGATGCTATCGTCGT
ATTCGATTATAAGCATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC
ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA
ATTCGATTATAAGCACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA

Each sequence starts with the same 14 base pair fragment (ATTCGATTATAAGC), which is a sequencing adapter that needs to be removed. Write a program to do two things:

1.	Trim this adapter and, using a 'for loop', write all five of the cleaned sequences to one new file, one cleaned sequence per line.
2.	Print the length of the trimmed sequences to the screen.


#1
* I would use an empty string and fill it from within a loop
* For *n* nucleotides, replace 25 with a variable n

import random

DNA=""

NT=["A","C","T","G"]

for nt in range(25):

    DNA=DNA+random.choice(NT)

print(DNA)

In [None]:
#2
#hard coded sequence
my_seq="AAATTTGGG"
#initialize an empty list
codon_list=[]
#I am using a lot of gratuitous print statements because
# I was trying to track down what I was doing wrong. You should use print statements everywhere
# while you are troubleshooting!
print(len(my_seq))
print("Here is the loop")
#Did something a tiny bit tricky here! :)
# I set the range to go up by threes - as a codon does!
for nuc in range(0,len(my_seq),3):
    #I have put a print statement after every line to check that the code is actually
    # doing what you *think* - or intended - it is doing.
    print(nuc)
    print("********")
    print(codon_list)
    #where are we in the given sequence?
    print(my_seq[nuc])
    # fill up the empty list by appending the nuc through nuc+3 items here
    codon_list.append(my_seq[nuc:nuc+3])
    print(codon_list)
print(codon_list)

9
Here is the loop
0
********
[]
A
['AAA']
3
********
['AAA']
T
['AAA', 'TTT']
6
********
['AAA', 'TTT']
G
['AAA', 'TTT', 'GGG']
['AAA', 'TTT', 'GGG']


# Question 3 - assignment question.
Pseudocode "hints"
1. All five sequences have the same adapter that you need to remove, and it is 14 nucleotides long.
2. You can read the file using readlines which will place each line as an element of a list -- which is iterable!
3. You can then send all five sequences through a for loop and slice off the first 14 nucleotides
4. Print them to the screen or send them to a new file etc.!

In [None]:
#3 This is one answer. My first attempt did not give the "correct" length, which is 42 for the first line; only the last
# item had the correct length of 44.
# Why wasn't it working the way that I expected it to?
# Every line except the last ends with a trailing space and a hidden newline (\n), so len() counted two extra characters.
# Strings cannot be changed in place: rstrip() returns a NEW string with the trailing whitespace removed,
# so it has to be applied (and its result kept) before taking the len().
# This is an excellent lesson in why you should be suspicious and double check that your code produces the correct
# answer even when it runs!
InputFile=open("untrimmedFile.txt")
OutputFile=open("trimmedFile1.txt","w+")

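# the readlines file object method takes the contents of a file and puts them into a list where each element of the list
# corresponds to one line of the text file. You can then iterate over it!
for dna in InputFile.readlines():
    # slice off the 14-nucleotide adapter, then strip the trailing whitespace (space and \n)
    trimmed_dna=dna[14:].rstrip()
    print(len(trimmed_dna),trimmed_dna)
    # rstrip() removed the newline, so add one back to keep one sequence per line in the file
    OutputFile.write(trimmed_dna+"\n")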

InputFile.close()
OutputFile.close()

42 TCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC
37 ACTGATCGATCGATCGATCGATCGATGCTATCGTCGT
48 ATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC
33 ACTATCGATGATCTAGCTACGATCGTAGCTGTA
44 ACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA


In [None]:
# Here is a slightly different way. Without rstrip() the same problem appears - we get a length with two extra characters
# (a trailing space and \n) except for the last line.
# NOTICE HOW I AM COMMENTING THE CODE? Commenting your code will help you!
# We are opening the txt file where we stored the strings (one on each of five lines)
InputFile=open("untrimmedFile.txt")
# we are creating a new file where the five sequences will go after they have each had their first 14 characters removed.
OutputFile=open("TrimmedFile.txt","w+")
# readlines method reads in every line from the file and places each one as an element of a list so this is equivalent to iterating over a list
lines=InputFile.readlines()
print(lines)

#I used the enumerate function - remember that from class? - in this loop (it isn't strictly necessary) so we could keep track of which
# sequence we were processing. This is especially useful for troubleshooting.
for index,dna in enumerate(lines):
    trimmed_dna=dna[14:].rstrip()
    trimmed_length=len(trimmed_dna)
    print(index+1,trimmed_length)
    # add the newline back so each sequence lands on its own line of the file
    OutputFile.write(trimmed_dna+"\n")

InputFile.close()
OutputFile.close()

['ATTCGATTATAAGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC \n', 'ATTCGATTATAAGCACTGATCGATCGATCGATCGATCGATGCTATCGTCGT \n', 'ATTCGATTATAAGCATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC \n', 'ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA \n', 'ATTCGATTATAAGCACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA']
1 42
2 37
3 48
4 33
5 44


In [None]:
# One thing to note: this text file should have the cursor at the end
# of the file located on a new line.
file = open("untrimmedFile.txt")

# open the output file. If you opened the output file with an a+, then you will need to delete the output file
# every time you run it or you will end up with a very large file with the five sequences repeated each time you run the loop
output = open("trimmed.txt", "w+")

# readlines method reads in every line from the file and places each one as an element of a list so this is equivalent to iterating over a list
# enumerate supplies the index; without it, 'index' would be left over from an earlier cell and never change
for index,dna in enumerate(file.readlines()):

    # get the substring from the 15th character to the end
    trimmed_dna = dna[14:]

    # get the length of the trimmed sequence. You could do this the following simple way
    # (only correct if the line ends with exactly one extra character, the \n):
    #trimmed_length = len(trimmed_dna) - 1
    # you could also do this with the .rstrip() method, which is better programming!
    trimmed_length=len(trimmed_dna.rstrip())

    # write out the trimmed sequence each time you go through loop
    # By the way: what happens if you hash this line out? You should get a blank file when you open "trimmed.txt"
    output.write(trimmed_dna)

    # print out the length and the sequence to the screen so that you know which line you are working on.
    print("Item "+str(index+1)+" to be processed is sequence "+trimmed_dna+" has length " + str(trimmed_length))

file.close()
output.close()

Item 1 to be processed is sequence TCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC 
 has length 42
Item 2 to be processed is sequence ACTGATCGATCGATCGATCGATCGATGCTATCGTCGT 
 has length 37
Item 3 to be processed is sequence ATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC 
 has length 48
Item 4 to be processed is sequence ACTATCGATGATCTAGCTACGATCGTAGCTGTA 
 has length 33
Item 5 to be processed is sequence ACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA has length 44


In [None]:
# you can iterate over a file object, but not in the way that some individuals were trying to do.
file=open("untrimmedFile.txt")
print(file)
for item in file:
    print(item)

<_io.TextIOWrapper name='untrimmedFile.txt' mode='r' encoding='UTF-8'>
ATTCGATTATAAGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC 

ATTCGATTATAAGCACTGATCGATCGATCGATCGATCGATGCTATCGTCGT 

ATTCGATTATAAGCATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC 

ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA 

ATTCGATTATAAGCACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA


In [None]:
outputFile=open("trimmed1.txt","w+")
# this is a great idea that I took and modified from a peer:
# if you are struggling with reading from the file, you can always create a test case
# in your jupyter cell of a list and then see if your loop is working. This helps narrow down where the issue
# in your code is happening in a systematic way.
dna_list=["ATTCGATTATAAGCTCGATCGATCGATCGATCGATCGATCGATCGATCGATCGATC\n","ATTCGATTATAAGCACTGATCGATCGATCGATCGATCGATGCTATCGTCGT","ATTCGATTATAAGCATCGATCACGATCTATCGTACGTATGCATATCGATATCGATCGTAGTC","ATTCGATTATAAGCACTATCGATGATCTAGCTACGATCGTAGCTGTA","ATTCGATTATAAGCACTAGCTAGTCTCGATGCATGATCAGCTTAGCTGATGATGCTCA"]

# the loop variable gets its own name (seq) so that it does not overwrite the list it is looping over
for index,seq in enumerate(dna_list):
    # note I used replace here instead of rstrip()
    trimmed=(seq[14:]).replace("\n","")
    trimmed_length=len(trimmed)
    print(trimmed_length)
    # add a newline so each sequence lands on its own line of the file
    outputFile.write(trimmed+"\n")

outputFile.close()

42
37
48
33
44
