Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove BytesSequenceRecord #66

Closed
rhpvorderman opened this issue Mar 25, 2022 · 2 comments · Fixed by #67
Closed

Remove BytesSequenceRecord #66

rhpvorderman opened this issue Mar 25, 2022 · 2 comments · Fixed by #67

Comments

@rhpvorderman
Copy link
Collaborator

When I put BytesSequenceRecord in there were two major reasons:

  • PyBytes_FromStringAndSize was much faster than PyUnicode_DecodeASCII
  • Retrieving pointers from strings was deemed difficult, as strings do not have the buffer protocol.

I think both issues are now gone.

  • ASCII checking the entire buffer and then using PyUnicode_New(..., 127) is only slightly slower than PyBytes_FromStringAndSize. (!0%)
  • Retrieving pointers from strings can be done very fast with PyUnicode_DATA. Since SequenceRecord ensures that strings can never be anything else than ASCII this is as fast as PyBytes_AS_STRING.

On top of that, strings are more useful than bytes. Names should be strings. Sequences of nucleotides work more intuitive as strings. And qualities, are phred scores. These are an ASCII representation of the proper score and thus work best as strings.

I was working on #65 when I realised that BytesSequenceRecord is now just a maintenance burden at this point.

@marcelm
Copy link
Owner

marcelm commented Mar 25, 2022

I’m glad this can be dropped. It’s good you noticed this before 1.0! This will also simplify the documentation a bit.

@rhpvorderman
Copy link
Collaborator Author

Glad you agree. In retrospect I should never have put it there, but there we go. "Voortschrijdend inzicht" as it is called in Dutch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants