Name: Soji Ademiluyi  
Email: aademilu@uncc.edu


## Part 1 - Sequence Class

Write a Sequence class. In the `__init__` method, you should initialize one attribute, a string that represents a DNA sequence.
This class should also have the following magic methods we discussed in class yesterday:

- `__repr__` and `__str__`
- `__eq__` and `__lt__` (then use the decorator I demonstrated)

It is up to you to decide how these should be implemented. For instance, which criteria make the most sense for saying two sequences are equal to one another? Which criteria should make one sequence "less than" another?
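
One way to make these choices concrete is sketched below. It assumes the decorator mentioned above is `functools.total_ordering` (the text does not name it), and it uses a hypothetical class `DemoSequence` rather than prescribing how the real `Sequence` class should look. Equality compares the sequence strings; "less than" compares length first and breaks ties alphabetically, so the two comparisons stay consistent with each other.

```python
from functools import total_ordering


@total_ordering  # assumption: this is the decorator the assignment refers to
class DemoSequence:
    """Hypothetical stand-in for a Sequence class, for illustration only."""

    def __init__(self, sequence: str) -> None:
        self.sequence = sequence

    def _key(self):
        # shorter sequences sort first; equal lengths fall back to the string
        return (len(self.sequence), self.sequence)

    def __eq__(self, other):
        if not isinstance(other, DemoSequence):
            return NotImplemented
        return self.sequence == other.sequence

    def __lt__(self, other):
        if not isinstance(other, DemoSequence):
            return NotImplemented
        return self._key() < other._key()


short, long_ = DemoSequence("acg"), DemoSequence("acgt")
print(short < long_)   # True: __lt__ is defined directly
print(long_ >= short)  # True: __ge__ is filled in by total_ordering
```

With only `__eq__` and `__lt__` written by hand, the decorator supplies `__le__`, `__gt__` and `__ge__`, which is also what lets `sorted()` and the other comparison operators agree.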

Here I created the `str`, `repr`, `len`, `add`, `eq` and `lt` dunder methods; my method of choice is a complement generator.  
The `compliment` method creates a new Sequence object that is the complement of the sequence.  
`str` returns a readable summary of the object, while `repr` returns the full contents of the object.  
`len` gives the length of the sequence string.  
`eq` checks whether two objects hold the same sequence string.  
`lt` checks whether one sequence is shorter than the other.  
`add` concatenates the two sequence strings.

In [145]:
from typing import List
# Sequence class goes here
class Sequence:

    def __init__(self, sequence: str) -> None:
        # string that represents the DNA sequence
        self.sequence = sequence

    def __str__(self):
        return 'A character sequence of nucleotides beginning with "{}"...'.format(self.sequence[0:10])

    def __repr__(self):
        return 'Sequence Object: {}'.format(self.sequence)

    def __len__(self):
        return len(self.sequence)

    def __eq__(self, other):
        #Find if the sequence string is equivalent.
        return self.sequence == other.sequence

    def __lt__(self, other):
        # compare lengths so that sequences sort from shortest to longest
        return len(self) < len(other)

    def __add__(self, other):
        # return the concatenated string without modifying either operand
        return self.sequence + other.sequence

    def compliment(self):
        # map each base to its complement; only lowercase a, c, g, t are handled
        convert = {'a':'t', 'g':'c', 't':'a', 'c':'g'}
        inter = ''.join(convert[i] for i in self.sequence)
        complimentarySequence = Sequence(inter)
        return complimentarySequence
        



The following test class was run outside of the IPython notebook. I am posting it here with some of the functionality removed.

In [152]:
#Use this cell for testing your Sequence class. Show us what tests you ran to confirm your methods worked correctly
#Test Method

import unittest
#import Sequence
class Test(unittest.TestCase):

    def setUp(self):
        # build fresh sequences for every test so no test depends on another
        self.testSeq1 = Sequence("agctagctagcagtcagtagcagatgatgatccacacacgccg")
        self.testSeq2 = Sequence("agcattgatccacacacgccg")

    def test_add(self):
        output = self.testSeq1 + self.testSeq2
        self.assertEqual(output, "agctagctagcagtcagtagcagatgatgatccacacacgccgagcattgatccacacacgccg")

    def test_eq(self):
        self.assertFalse(self.testSeq1 == self.testSeq2)
        self.assertTrue(self.testSeq1 == self.testSeq1)

    def test_lt(self):
        # testSeq1 is the longer sequence, so it is not "less than" testSeq2
        self.assertFalse(self.testSeq1 < self.testSeq2)
        self.assertTrue(self.testSeq2 < self.testSeq1)

    def test_compliment(self):
        output = self.testSeq1.compliment()
        self.assertEqual(output, Sequence("tcgatcgatcgtcagtcatcgtctactactaggtgtgtgcggc"))
        output = self.testSeq2.compliment()
        self.assertEqual(output, Sequence("tcgtaactaggtgtgtgcggc"))

#unittest.main(argv=[''], exit=False)  # notebook-friendly way to run the tests
#if __name__ == '__main__':
   #unittest.main() 

## Part 2 - SequenceRecord Class

Write a class called SequenceRecord. This class should have two attributes:

- A label/title (something that describes the source of the sequence, like the contents of a header line in a FASTA file)
- and a Sequence object 

Your initializer should attempt to confirm that the second attribute is, in fact, a Sequence object. Consider the following code and how it could be applied here

```
>>> s = "hello"
>>> type(s) == str
True 
```

You should also, at minimum, add `__str__` and `__repr__` methods.

The `SequenceRecord` class takes in a record string (the header line) and a Sequence object.  

The `eq` method is altered to compare the record type (the first word of the header, for example "Lungfish") and use this for equivalence.  
The `init` method checks that the incoming `sequenceO` object is a Sequence object.
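
The type check described above might look like the following sketch. `DemoSequence` and `DemoRecord` are hypothetical stand-ins, not the classes in this notebook, and raising `TypeError` is just one possible way to signal the problem (an `__init__` method cannot return a value such as `False`).

```python
class DemoSequence:
    """Hypothetical minimal sequence class, for illustration only."""

    def __init__(self, sequence: str) -> None:
        self.sequence = sequence


class DemoRecord:
    """Hypothetical record that pairs a label with a DemoSequence."""

    def __init__(self, label: str, seq) -> None:
        # isinstance also accepts subclasses, unlike type(seq) == DemoSequence
        if not isinstance(seq, DemoSequence):
            raise TypeError("seq must be a DemoSequence, got " + type(seq).__name__)
        self.label = label
        self.seq = seq


ok = DemoRecord("Lungfish Protopterus dolloi", DemoSequence("atgg"))

try:
    DemoRecord("bad record", "atgg")  # a plain str is rejected
except TypeError as err:
    print(err)  # seq must be a DemoSequence, got str
```

Raising an exception stops a half-built object from being used later, whereas printing a message would let the caller continue with a record that has no sequence attached.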



In [120]:
# SequenceRecord class goes here
class SequenceRecord:
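    def __init__(self, record: str, sequenceO) -> None:
        # reject anything that is not a Sequence; __init__ cannot return a value
        if not isinstance(sequenceO, Sequence):
            raise TypeError("Not a Sequence object.")
        self.sequenceO = sequenceO

        # split the header into its first word and the rest of the description
        recordList = record.split(' ', 1)

        self.record = recordList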

    def __str__(self):
        return 'A sequence record of a "{}"'.format(self.record[0])

    def __repr__(self):
        # join the header parts so a one-word header does not raise IndexError
        return 'SequenceRecord Object: {} | {}'.format(' '.join(self.record), self.sequenceO.sequence)

    def __eq__(self, other):
        return self.record[0] == other.record[0]

    def __lt__(self, other):
        # compare sequence lengths for sorting
        return len(self.sequenceO) < len(other.sequenceO)

## Part 3 - Parsing using your new classes

Build yourself a test FASTA file with approx 3 simple records. Read in this file, and use its contents to create a SequenceRecord for each record in the file. 

- Please note this process is identical to what we did previously with FASTA parsing, only before we used a dictionary where the key stored the header info, and the value stored the sequence info. Now, our SequenceRecord object holds BOTH pieces.

Be sure to confirm your SequenceRecord objects hold the correct information.

For extra credit, write your parser as a generator.

In [130]:
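def fastaGenerator(fasta):
    # yield one (header, sequence) tuple per FASTA record
    with open(fasta) as f:
        header = None
        sequence = ""
        for line in f:
            line = line.strip()
            if line.startswith(">"):
                # a new header closes the previous record, if there was one
                if header is not None:
                    yield (header, sequence)
                sequence = ""
                header = line.lstrip(">")
            else:
                sequence += line
    # emit the final record; nothing is yielded if the file had no header
    if header is not None:
        yield (header, sequence)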

records = fastaGenerator("testDNA.fasta")
sequenceList = []
for header, sequence in records:
    newSequence = Sequence(sequence)
    newSequenceRecord = SequenceRecord(header, newSequence)
    sequenceList.append(newSequenceRecord)
print(sequenceList[:2])

[SequenceRecord Object: Lungfish Protopterus dolloi | atggcaacaaatatccgaaaaactcacccgctccttaaaatcgtaaacaactccctaattgacctgccaaccccatcaaacatttcagcatgatgaaacttcggctcacttcttggattctgccttattactcaaattctcacaggattattcttagctatacactacactgctgacacctcaacagccttctcatctatcgcacacatcgcccgcgacgtaaactatggctggctcctgcgcaacattcacgcaaacggagcatccatattttttatttgcatctacatccacattggtcgtggaatttattacggatccttcctatatacagagacctgaaatatcggagtagttctttttcttttaactataataactgcattcgtaggctacgttctcccgtgaggtcaaatatccttctggggtgccacagtcatcactaatctcctctcagccgtcccatacctaggagataccctagttcaatggatttggggcggattttctgtagacaacgccaccctcacccgattcttcgcttttcacttccttctccccttcatcatctctgcaataaccgccgcacactttttattcctccacgaaacaggctcaaataacccaacaggattaaactctaacctagacaaaatctcgttccacccgtattttactataaaagaccttttagggttcctaatacttgcttcttttctctgcctattagccctattttctcctaatcttctaggggacccagaaaattttaccccggctaatccacttgtcaccccaacccacatcaagccagagtgatacttcctctttgcatatgcaattctgcgctccatcccaaataaacttggaggcgtactagcacttatagcgtcgatccttattctttttatcattccgtttcttcaccgagcaaaacaacgcacta
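
One way to confirm that parsed records hold the correct information, without depending on a file on disk, is to feed a generator a small in-memory FASTA text and compare the result with known values. The sketch below is an assumption-laden illustration: `parse_fasta` is a hypothetical helper that takes an open handle rather than a filename, and the two records are made-up test data, not the contents of `testDNA.fasta`.

```python
import io


def parse_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA handle (hypothetical helper)."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # a new header closes the previous record, if there was one
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    # emit the last record; an empty input yields nothing
    if header is not None:
        yield header, "".join(chunks)


# made-up test data, not taken from testDNA.fasta
fake_fasta = io.StringIO(
    ">seq1 first test record\nacgt\nacgt\n>seq2 second test record\nttga\n"
)

records = list(parse_fasta(fake_fasta))
print(records)
# [('seq1 first test record', 'acgtacgt'), ('seq2 second test record', 'ttga')]
```

Because the expected headers and sequences are known in advance, the same comparison can be turned into an `assertEqual` call in a unit test.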

# Part 4 - OOP Lab Part 2

## Add the following to your Sequence class:

 

- the `__len__()` magic function -> returns the length of the object  
- the `__add__()` magic function -> returns the result of two Sequence objects being added to one another (what should this be?)
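
One possible answer to "what should this be?" is sketched below: adding two sequences returns a new object of the same class and leaves both operands unchanged. `DemoSequence` is a hypothetical stand-in for illustration; returning a plain concatenated string instead is also a defensible choice.

```python
class DemoSequence:
    """Hypothetical stand-in for a Sequence class, for illustration only."""

    def __init__(self, sequence: str) -> None:
        self.sequence = sequence

    def __len__(self) -> int:
        return len(self.sequence)

    def __add__(self, other):
        if not isinstance(other, DemoSequence):
            return NotImplemented
        # build a new object instead of modifying self
        return DemoSequence(self.sequence + other.sequence)


left, right = DemoSequence("acg"), DemoSequence("tt")
joined = left + right
print(joined.sequence, len(joined))  # acgtt 5
print(left.sequence)                 # acg
```

Returning the same type means the result still supports `len()`, comparison, and any other methods the class defines.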
 

Add one method of your choice to Sequence that is something you think a Sequence can/should do. Think about things we've done with sequences, and add one of these as a method to your Sequence Class.