New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nRCA support in SeqUtils CodonUsage #1692
base: master
Are you sure you want to change the base?
Conversation
added normRelativeCodonAdaptationIndex class to support nRCA index computations [O'Neill, Or and Erill (PLoS One. 2013 Oct 7;8(10):e76177)] defines a complete class equivalent to the existing CodonAdaptationIndex class that supports CAI index computations https://www.ncbi.nlm.nih.gov/pubmed/24116094 https://www.ncbi.nlm.nih.gov/pubmed/20453079
updated test script to test the newly introduced normRelativeCodonAdaptationIndex class
added FASTA files for reference set (refset_Sharp.fas) and test data (sample_genes.fas)
@@ -53,6 +55,8 @@ | |||
'GLU': ['GAG', 'GAA'], | |||
'TYR': ['TAT', 'TAC']} | |||
|
|||
# DNA bases that can occupy each codon position | |||
CodonBases = {'A' : 0, 'C' : 0, 'G' : 0, 'T' : 0} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defining this as a module level variable does not seem ideal. There is a risk of the end user altering it etc.
I think it would be better to define these dictionaries where you do CodonBases.copy()
within those methods.
|
||
|
||
class normRelativeCodonAdaptationIndex(object): | ||
"""A normalized Relative Codon Adaptation index implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style wise we try to follow PEP257 for our docstrings, which means a single line summary then a blank line. The rest of the docstring indentation should match the opening quote (i.e. four spaces indented here).
This would fix the flake8
style warnings like:
D208 Docstring is over-indented
D205 1 blank line required between summary line and description
See the CONTRIBUTING.rst
file for more about how to run flake8
locally.
self._codon_count(fasta_file) | ||
|
||
# compute total number of codons | ||
total_codons = sum(self.codon_count.values()) + 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the plus 0.0
is to turn this from an integer into a float, in order to avoid integer division under Python 2.
It might be cleaner to import the Python 3 style division?
from __future__ import division
|
||
# if no index is set or generated, the default ErillLabEcoliIndex will | ||
# be used. | ||
if self.index == {}: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stye wise, I think if not self.index:
considered more Pythonic.
# from CaiIndices import SharpIndex # you can save your dictionary in this file. | ||
# X.SetCaiIndex(SharpIndex) | ||
|
||
infilename = 'sample_genes.fas' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be relative to the Tests/
directory:
infilename = 'CodonUsage/sample_genes.fas'
# X.SetCaiIndex(SharpIndex) | ||
|
||
infilename = 'sample_genes.fas' | ||
outfilename = 'nRCA_test_output.csv' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would ideally be a temp file, e.g. via import tempfile
, rather than generating output in the Tests/
folder.
This pull request addresses issue #1688
I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.
I have read the
CONTRIBUTING.rst
file and understand that AppVeyor andTravisCI will be used to confirm the Biopython unit tests and
flake8
stylechecks pass with these changes.
I have added my name to the alphabetical contributors listings in the files
NEWS.rst
andCONTRIB.rst
as part of this pull request, am listedalready, or do not wish to be listed. (This acknowledgement is optional.)