New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Olgabot/use x if codon not found #529

Merged
merged 7 commits into from Sep 17, 2018

Conversation

Projects
None yet
3 participants
@olgabot
Copy link
Contributor

olgabot commented Aug 9, 2018

This adjusts the _dna_to_aa code to fallback on the "X" if the codon was not found. Hopefully addresses #502 because std::map::operator[] inserts a new element and returns it, in this case the empty string, and then the protein kmers built on DNA seuqences with "N"s are an incorrect length because they are too short.

Can't get sourmash to install via conda on my mac laptop right now so I haven't been able to run the tests.

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Aug 9, 2018

Codecov Report

Merging #529 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #529      +/-   ##
==========================================
+ Coverage   90.76%   90.76%   +<.01%     
==========================================
  Files          33       33              
  Lines        5011     5013       +2     
  Branches       36       37       +1     
==========================================
+ Hits         4548     4550       +2     
  Misses        463      463
Impacted Files Coverage Δ
sourmash/kmer_min_hash.hh 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12aa76f...78d4693. Read the comment docs.

@olgabot

This comment has been minimized.

Copy link
Contributor Author

olgabot commented Aug 31, 2018

Any updates on this merge?

@@ -174,7 +174,14 @@ public:
unsigned int dna_size = (dna.size() / 3) * 3; // floor it
for (unsigned int j = 0; j < dna_size; j += 3) {
std::string codon = dna.substr(j, 3);
aa += (_codon_table)[codon];
std::map<std::string,std::string>::iterator translated = _codon_table.find(codon);

This comment has been minimized.

@luizirber

luizirber Sep 7, 2018

Member

quick comment: you can replace std::map<std::string,std::string>::iterator with auto and avoid having to figure out the proper type (yay C++11)

This comment has been minimized.

@olgabot

olgabot Sep 7, 2018

Author Contributor

Yay! Just made the change :)

@luizirber

This comment has been minimized.

Copy link
Member

luizirber commented Sep 7, 2018

Hi! Why did you choose X to be the replacement? Is that the equivalent to N in amino acid code? I usually try to follow this table for conversions, but I never saw X before...
(and if it is not clear, this is me confessing my ignorance, so any pointers to other refs would be really helpful =] )

@olgabot

This comment has been minimized.

Copy link
Contributor Author

olgabot commented Sep 7, 2018

Yep, X is the equivalent of N! Here's an official-looking (and Web 0.1-looking) reference: https://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/MolBioReview/iupac_aa_abbreviations.html

@luizirber

This comment has been minimized.

Copy link
Member

luizirber commented Sep 10, 2018

I think this sounds reasonable, any comments @ctb ?

@luizirber luizirber merged commit 2534989 into dib-lab:master Sep 17, 2018

4 checks passed

WIP ready for review
Details
codecov/patch 100% of diff hit (target 90.78%)
Details
codecov/project 90.78% (+<.01%) compared to c202750
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment