An eachkmer iterator fix #8

Merged
merged 5 commits into from
Jun 13, 2017

TransGirlCodes
Member

@TransGirlCodes TransGirlCodes commented Jun 12, 2017

@bicycle1885 This edits a few lines of the eachkmer iterator so that, as well as checking for ambiguous nucleotides, it also checks for gaps. I hope this is OK. If it is, using each_kmer_impl in a specific iterator I can write will let me do a lot of my work without needing a whole new Codon type, which closes the issue and is a huge time-saving win!

@bicycle1885
Member

Good catch! I hadn't noticed this problem.

@codecov-io

codecov-io commented Jun 12, 2017

Codecov Report

Merging #8 into master will decrease coverage by 0.39%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master       #8     +/-   ##
=========================================
- Coverage   83.25%   82.86%   -0.4%     
=========================================
  Files          41       41             
  Lines        2885     2906     +21     
=========================================
+ Hits         2402     2408      +6     
- Misses        483      498     +15
Impacted Files        Coverage Δ
src/eachkmer.jl       96.15% <100%> (ø) ⬆️
src/fastq/reader.jl   34.88% <0%> (-2.04%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 78b39b1...ca7a6ed.

src/eachkmer.jl Outdated

  # faster path: return the next overlapping kmer if possible
  if it.step < k && pos + k - 1 ≤ endof(it.seq)
      offset = k - it.step
      if it.step == 1
          nt = inbounds_getindex(it.seq, pos+offset)
          kmer = kmer << 2 | trailing_zeros(nt)
-         has_ambiguous |= isambiguous(nt)
+         ambig_or_gap |= (isambiguous(nt) || isgap(nt))
Member

Why not iscertain(nt)? A quick benchmark on my machine suggests no performance degradation from replacing isambiguous with iscertain.
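
For readers following the suggestion, here is a minimal sketch of why a single "certain" check can stand in for the combined ambiguity and gap check. It assumes a 4-bit one-hot nucleotide encoding (A, C, G, T each set one bit; a gap sets none; ambiguity codes set several), which is consistent with the `trailing_zeros(nt)` call in the diff. The `sketch_*` functions are hypothetical stand-ins operating on raw `UInt8` values, not the library's own definitions.

```julia
# Hypothetical stand-ins, assuming a 4-bit one-hot nucleotide encoding
# (A = 0b0001, C = 0b0010, G = 0b0100, T = 0b1000, gap = 0b0000).
sketch_isgap(nt::UInt8) = nt == 0x00
sketch_isambiguous(nt::UInt8) = count_ones(nt) > 1
sketch_iscertain(nt::UInt8) = count_ones(nt) == 1

# Over all 16 possible 4-bit codes, "ambiguous or gap" is the exact
# complement of "certain", so one check can replace the pair.
equivalent = all(0x00:0x0f) do nt
    (sketch_isambiguous(nt) || sketch_isgap(nt)) == !sketch_iscertain(nt)
end
```

Under that assumption, `ambig_or_gap |= !iscertain(nt)` would flag the same nucleotides as the two-call version.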

Member

Also, adding test cases to make sure gaps are skipped would be nice.

Member Author

Good catch! I should have seen that iscertain was applicable here; I used it instead of isambiguous and isgap in the other site-counting code I've written. I'll make the iscertain change now and add a test case in the morning. It's too late here!

@TransGirlCodes
Member Author

OK, here we go: a test explicitly checking a sequence with gaps. If this passes and looks alright, I'd like to merge this fix when the checks are done.
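
The test added in this PR is not shown in the thread. As a rough illustration of the behaviour such a test would check, the sketch below uses a hypothetical `sketch_eachkmer` helper on a plain string. It is a naive reference version that rescans every window, not the incremental fast path from the diff, and the input sequence is made up for the example.

```julia
# Hypothetical helper: collect (position, k-mer) pairs from a plain string,
# skipping any window that contains a character other than A, C, G or T
# (so windows touching a gap '-' or an ambiguity code such as 'N' are dropped).
function sketch_eachkmer(seq::AbstractString, k::Integer)
    result = Tuple{Int,String}[]
    for pos in 1:(length(seq) - k + 1)
        window = seq[pos:pos + k - 1]
        if all(c -> c in "ACGT", window)
            push!(result, (pos, window))
        end
    end
    return result
end

# A sequence with one gap and one ambiguous base.
kmers = sketch_eachkmer("AC-GTNA", 2)
```

Only the windows at positions 1 ("AC") and 4 ("GT") avoid both the gap and the `N`, so those are the only pairs a gap-aware iterator should yield here.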

Ben J. Ward added 2 commits June 13, 2017 11:24
Eliminates multiple negation operations.
@bicycle1885
Member

👍 The CI failure on Linux should not be a problem.

@TransGirlCodes TransGirlCodes merged commit 992a810 into master Jun 13, 2017
@TransGirlCodes TransGirlCodes deleted the kmer_iterator_fix branch June 13, 2017 12:00
@bicycle1885 bicycle1885 mentioned this pull request Jun 14, 2017

3 participants