Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nRCA support in SeqUtils CodonUsage #1692

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

ivanerill
Copy link

This pull request addresses issue #1688

I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.

I have read the CONTRIBUTING.rst file and understand that AppVeyor and
TravisCI will be used to confirm the Biopython unit tests and flake8 style
checks pass with these changes.

I have added my name to the alphabetical contributors listings in the files
NEWS.rst and CONTRIB.rst as part of this pull request, am listed
already, or do not wish to be listed. (This acknowledgement is optional.)

added normRelativeCodonAdaptationIndex class to support nRCA index computations [O'Neill, Or and Erill (PLoS One. 2013 Oct 7;8(10):e76177)]
defines a complete class equivalent to the existing CodonAdaptationIndex class that supports CAI index computations
https://www.ncbi.nlm.nih.gov/pubmed/24116094
https://www.ncbi.nlm.nih.gov/pubmed/20453079
updated test script to test the newly introduced normRelativeCodonAdaptationIndex class
added FASTA files for reference set (refset_Sharp.fas) and test data (sample_genes.fas)
@@ -53,6 +55,8 @@
'GLU': ['GAG', 'GAA'],
'TYR': ['TAT', 'TAC']}

# DNA bases that can occupy each codon position
CodonBases = {'A' : 0, 'C' : 0, 'G' : 0, 'T' : 0}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining this as a module level variable does not seem ideal. There is a risk of the end user altering it etc.

I think it would be better to define these dictionaries where you do CodonBases.copy() within those methods.



class normRelativeCodonAdaptationIndex(object):
"""A normalized Relative Codon Adaptation index implementation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style wise we try to follow PEP257 for our docstrings, which means a single line summary then a blank line. The rest of the docstring indentation should match the opening quote (i.e. four spaces indented here).

This would fix the flake8 style warnings like:

D208 Docstring is over-indented
D205 1 blank line required between summary line and description

See the CONTRIBUTING.rst file for more about how to run flake8 locally.

self._codon_count(fasta_file)

# compute total number of codons
total_codons = sum(self.codon_count.values()) + 0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the plus 0.0 is to turn this from an integer into a float, in order to avoid integer division under Python 2.

It might be cleaner to import the Python 3 style division?

from __future__ import division


# if no index is set or generated, the default ErillLabEcoliIndex will
# be used.
if self.index == {}:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stye wise, I think if not self.index: considered more Pythonic.

# from CaiIndices import SharpIndex # you can save your dictionary in this file.
# X.SetCaiIndex(SharpIndex)

infilename = 'sample_genes.fas'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be relative to the Tests/ directory:

infilename = 'CodonUsage/sample_genes.fas'

# X.SetCaiIndex(SharpIndex)

infilename = 'sample_genes.fas'
outfilename = 'nRCA_test_output.csv'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would ideally be a temp file, e.g. via import tempfile, rather than generating output in the Tests/ folder.

@peterjc peterjc changed the title nRCA support in nRCA support in SeqUtils CodonUsage Jul 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants