Ignore empty lines #73

Merged
merged 2 commits into jamescasbon:master from martijnvermaat/ignore-empty-lines on Sep 25, 2012

Collaborator

martijnvermaat commented Sep 23, 2012

This would probably fix #71. Strictly, I think the spec doesn't allow for empty lines, but hey.

This implementation is just a suggestion; I'm not sure about its performance implications. It would probably also make sense to look for other line.rstrip() calls so that stripping happens only once. I might add another commit for that.
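To illustrate the idea of stripping each line only once and skipping empty ones, here is a minimal sketch. It is not the code from this pull request: `read_records` is a hypothetical helper, and splitting on tabs is only an assumption about what the downstream parsing looks like.

```python
import io


def read_records(stream):
    """Yield the tab-separated fields of every non-empty line.

    Hypothetical helper for illustration; not part of PyVCF's API.
    """
    for line in stream:
        # Strip once and reuse the result, so later parsing steps
        # do not need their own rstrip() call.
        line = line.strip()
        if not line:
            # Empty or whitespace-only line: ignore it.
            continue
        yield line.split('\t')


# Usage: blank and whitespace-only lines are skipped.
example = io.StringIO("chr1\t100\tA\tG\n\n   \nchr1\t200\tC\tT\n")
records = list(read_records(example))
```

Because the stripped line is passed on, any later code can rely on it having no trailing newline or whitespace.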

It doesn't interfere with tabix fetching, but empty lines within a fetched region are not handled.

Collaborator

martijnvermaat commented Sep 23, 2012

It seems tabix does not support lines starting with whitespace (and therefore empty lines), and trailing whitespace is already stripped from lines returned by the tabix iterator. So we can ignore whitespace in combination with tabix fetching.

Collaborator

martijnvermaat commented Sep 23, 2012

You could eliminate one of the two calls to line.strip(), for example by stacking two generators, but I assume Python does this for us. From the (limited) profiling I've done, this indeed seems to be the case.
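For reference, the two variants discussed here could look roughly like the sketch below. The function names `double_strip` and `stacked` are made up for illustration and the exact form used in the commit may differ. Both yield the same lines; timing them with `timeit` on representative input would show whether the second `strip()` call is measurable.

```python
def double_strip(lines):
    """One generator expression; strip() appears twice per line."""
    return (line.strip() for line in lines if line.strip())


def stacked(lines):
    """Two stacked generators; strip() appears once per line."""
    stripped = (line.strip() for line in lines)
    return (line for line in stripped if line)


# Usage: both variants drop empty and whitespace-only lines.
sample = ["##header\n", "\n", "  \n", "chr1\t100\n"]
result_double = list(double_strip(sample))
result_stacked = list(stacked(sample))
```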

@jamescasbon jamescasbon pushed a commit that referenced this pull request Sep 25, 2012

James Casbon Merge pull request #73 from martijnvermaat/ignore-empty-lines
Ignore empty lines
5da6932

@jamescasbon jamescasbon merged commit 5da6932 into jamescasbon:master Sep 25, 2012

@gotgenes gotgenes pushed a commit to gotgenes/PyVCF that referenced this pull request May 13, 2014

James Casbon Merge pull request #73 from martijnvermaat/ignore-empty-lines
Ignore empty lines
f540e21