Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Completed gene finder #1

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

vivienyuwenchen
Copy link

@vivienyuwenchen vivienyuwenchen commented Sep 14, 2017

I finally understood the purpose of if name == "main": and moved the gene_finder(dna) portion under that branch.

Copy link

@SeunginLyu SeunginLyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks really good in general! I like your clean syntax practices and the code is really well documented with appropriate docstings. I added a few comments that are mostly stylistic guides.


"""

import random
from amino_acids import aa, codons, aa_table # you may find these useful
from load import load_seq
dna = load_seq("./data/X73525.fa")
import doctest
from pickle import dump, load

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow pickle toolbox! great job


"""

import random
from amino_acids import aa, codons, aa_table # you may find these useful
from load import load_seq
dna = load_seq("./data/X73525.fa")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you usually want to separate any function calls from these import statements. It would make more sense to put it under if name == "main":

elif nucleotide == 'T':
return 'A'

#doctest.run_docstring_examples(get_complement, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove comments in your final upload

I added two more doctests because each complementary nucleotide is in
its own if/else if branch. If one branch doesn't work and there is no
doctest for that branch, the mistake will not be caught. In fact, I
found out that I spelled nucleotide wrong in the G branch.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great practice that you are adding your own tests


return comp_str

#doctest.run_docstring_examples(get_reverse_complement, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another comment to remove

for nuc in range(0, len(reversed_dna)):
comp_nuc = get_complement(reversed_dna[nuc])
comp_list.append(comp_nuc)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you wanna get fancy with just one line of code :
return ''.join([get_complement(reversed_dna[nuc]) for nuc in dna[::-1]])

I added two more doctests: one in which the frame stop codon is TAA and
another in which there is no frame stop codon. This way, all the
branches are tested for error.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


return dna

#doctest.run_docstring_examples(rest_of_ORF, globals(), verbose=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another comment to remove

frame1 = find_all_ORFs_oneframe(dna)
frame2 = find_all_ORFs_oneframe(dna[1:])
frame3 = find_all_ORFs_oneframe(dna[2:])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also use a for loop here
[find_all_ORFs_oneframe(dna[i:]) for i in range 3]

humongest = longest_ORF(dna_shuffles[i+1])

return len(humongest)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be more memory efficient to just save the max_length of longest_ORF than to save the actual longest_ORF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants