nRCA support in SeqUtils CodonUsage #1692

ivanerill · 2018-06-19T22:52:41Z

This pull request addresses issue #1688

I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.

I have read the CONTRIBUTING.rst file and understand that AppVeyor and
TravisCI will be used to confirm the Biopython unit tests and flake8 style
checks pass with these changes.

I have added my name to the alphabetical contributors listings in the files
NEWS.rst and CONTRIB.rst as part of this pull request, am listed
already, or do not wish to be listed. (This acknowledgement is optional.)

Added a normRelativeCodonAdaptationIndex class to support nRCA index computations [O'Neill, Or and Erill (PLoS One. 2013 Oct 7;8(10):e76177)]. It is a complete class equivalent to the existing CodonAdaptationIndex class, which supports CAI index computations. https://www.ncbi.nlm.nih.gov/pubmed/24116094 https://www.ncbi.nlm.nih.gov/pubmed/20453079

Updated the test script to test the newly introduced normRelativeCodonAdaptationIndex class.

Added FASTA files for the reference set (refset_Sharp.fas) and test data (sample_genes.fas).

peterjc · 2018-06-20T12:53:13Z

Bio/SeqUtils/CodonUsage.py

@@ -53,6 +55,8 @@
    'GLU': ['GAG', 'GAA'],
    'TYR': ['TAT', 'TAC']}

+# DNA bases that can occupy each codon position
+CodonBases = {'A' : 0, 'C' : 0, 'G' : 0, 'T' : 0}


Defining this as a module level variable does not seem ideal. There is a risk of the end user altering it etc.

I think it would be better to define these dictionaries where you do CodonBases.copy() within those methods.
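
As a rough illustration of this suggestion (the class and method names below are hypothetical, not taken from the pull request), the dictionary can be built inside the method so that every call starts from fresh zero counts and there is no shared module-level state for a user to alter:

```python
class BaseCounter(object):
    """Hypothetical class used only to illustrate the suggestion."""

    def _count_bases(self, sequence):
        """Return a dict with the number of A, C, G and T in a DNA string."""
        # Built locally on each call, instead of copying a module-level dict.
        counts = {"A": 0, "C": 0, "G": 0, "T": 0}
        for base in sequence.upper():
            if base in counts:
                counts[base] += 1
        return counts


counter = BaseCounter()
first = counter._count_bases("ATGGCC")
second = counter._count_bases("ttt")  # unaffected by the previous call
```

Because nothing outside the method holds a reference to the template dictionary, one call cannot leak counts into the next.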

peterjc · 2018-06-20T12:55:25Z

Bio/SeqUtils/CodonUsage.py

+
+
+class normRelativeCodonAdaptationIndex(object): 
+    """A normalized Relative Codon Adaptation index implementation. 


Style-wise, we try to follow PEP 257 for our docstrings, which means a single-line summary followed by a blank line. The rest of the docstring should be indented to match the opening quote (i.e. four spaces here).

This would fix the flake8 style warnings like:

D208 Docstring is over-indented
D205 1 blank line required between summary line and description

See the CONTRIBUTING.rst file for more about how to run flake8 locally.
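
For reference, a docstring laid out this way might look like the sketch below. The class name and summary line are reused from the diff; the description text is a placeholder, not the author's wording.

```python
class normRelativeCodonAdaptationIndex(object):
    """A normalized Relative Codon Adaptation index implementation.

    Placeholder description. It starts after one blank line and is
    indented by four spaces, the same as the opening quotes.
    """


doc_lines = normRelativeCodonAdaptationIndex.__doc__.splitlines()
```

The first line is the one-line summary, the second is blank, and the description follows at the same indentation as the opening quotes.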

peterjc · 2018-06-20T12:57:30Z

Bio/SeqUtils/CodonUsage.py

+        self._codon_count(fasta_file) 
+
+        # compute total number of codons
+        total_codons = sum(self.codon_count.values()) + 0.0


I assume the plus 0.0 is to turn this from an integer into a float, in order to avoid integer division under Python 2.

It might be cleaner to import the Python 3 style division?

from __future__ import division
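
A small sketch of what that would look like, using made-up codon counts rather than anything from the pull request. With the import in place the total can stay an integer and the division still yields floats on both Python 2 and Python 3:

```python
from __future__ import division  # must be the first statement; no effect on Python 3

# Hypothetical codon counts, for illustration only.
codon_count = {"GAA": 3, "GAG": 1}

# No "+ 0.0" needed: "/" is true division once the import is in place.
total_codons = sum(codon_count.values())
frequencies = {}
for codon, count in codon_count.items():
    frequencies[codon] = count / total_codons
```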

peterjc · 2018-06-20T12:58:35Z

Bio/SeqUtils/CodonUsage.py

+
+        # if no index is set or generated, the default ErillLabEcoliIndex will 
+        # be used. 
+        if self.index == {}: 


Style-wise, I think if not self.index: is considered more Pythonic.
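
A minimal sketch of the difference, with a hypothetical stand-in class (only the attribute name index comes from the diff). Note that the truthiness test also treats None as "no index", which the == {} comparison does not:

```python
class IndexHolder(object):
    """Hypothetical stand-in for a class that stores an index dictionary."""

    def __init__(self, index=None):
        self.index = {} if index is None else index

    def needs_default(self):
        """Return True when no index has been set or generated."""
        # An empty dict is falsy, so this covers the "== {}" case.
        if not self.index:
            return True
        return False


empty = IndexHolder()
filled = IndexHolder({"GAA": 1.0})  # made-up value, for illustration only
```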

peterjc · 2018-06-20T13:01:32Z

Tests/test_CodonUsage.py

+# from CaiIndices import SharpIndex # you can save your dictionary in this file.
+# X.SetCaiIndex(SharpIndex)
+
+infilename = 'sample_genes.fas'


This should be relative to the Tests/ directory:

infilename = 'CodonUsage/sample_genes.fas'

peterjc · 2018-06-20T13:02:16Z

Tests/test_CodonUsage.py

+# X.SetCaiIndex(SharpIndex)
+
+infilename = 'sample_genes.fas'
+outfilename = 'nRCA_test_output.csv'


This would ideally be a temp file, e.g. via import tempfile, rather than generating output in the Tests/ folder.
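
One possible way to do this with the standard library is sketched below. The file contents are placeholders, and the real test would write whatever the nRCA code produces:

```python
import os
import tempfile

# delete=False so the file can be reopened by name after it is closed.
handle = tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False)
outfilename = handle.name
try:
    handle.write("gene,value\n")  # placeholder content
    handle.close()
    with open(outfilename) as result:
        first_line = result.readline().strip()
finally:
    if not handle.closed:
        handle.close()
    # Clean up so nothing is left behind after the test.
    os.remove(outfilename)
```

The file is created in the system temporary directory rather than in Tests/, and it is removed even if the body of the try block fails.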

ivanerill added 8 commits June 19, 2018 23:55

sa

271b507

updated to test nRCA class

3e4461a

updated test script to test the newly introduced normRelativeCodonAdaptationIndex class

added files for nRCA test

92c751a

added FASTA files for reference set (refset_Sharp.fas) and test data (sample_genes.fas)

added test file names

bedcc58

added myself to CONTRIB.rst

e441bce

Update NEWS.rst

9ee66f8

Update NEWS.rst

1077d4e

peterjc reviewed Jun 20, 2018


peterjc changed the title ~~nRCA support in~~ nRCA support in SeqUtils CodonUsage Jul 11, 2018

peterjc added the Enhancement label Jul 11, 2018
