Unicode encoding in BGZF #2512
Comments
This turns out to be complicated. Python 2 text mode didn't do universal new lines mode by default, but Python 3 does. This means on examples like
Sadly, attempting to implement universal new lines mode in BGZF text mode would be a major piece of work due to the special file offsets (the entire point of BGZF). However, we do need to document this limitation.
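To illustrate why newline translation and offsets do not mix, here is a minimal sketch using only the standard library `io` module, not the BGZF code itself. It assumes a small in-memory byte string with Windows-style line endings (the sample data and variable names are made up for illustration). With universal new lines enabled, each `\r\n` collapses to a single `\n`, so positions in the returned text no longer line up with positions in the underlying bytes, which is what an offset into the decompressed data has to refer to.

```python
import io

# Hypothetical sample data with Windows-style line endings.
raw = b"line1\r\nline2\r\n"

# newline=None enables universal new lines: "\r\n" is translated to "\n" on read.
translated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="latin1", newline=None
).read()

# newline="" disables translation: line endings are returned unchanged.
untranslated = io.TextIOWrapper(
    io.BytesIO(raw), encoding="latin1", newline=""
).read()

# After translation the text is shorter than the bytes it came from,
# so a character position can no longer be used directly as a byte offset.
print(len(raw), len(translated), len(untranslated))
```

Without translation, and with a single-byte encoding such as latin1, character positions and byte positions stay in step, which is presumably why a hard coded approach is the simpler option here.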
Closed by #2517 - sticking with hard coded
I should have closed this back in Jan 2020. It may not be impossible, but it would certainly be non-trivial to implement.
The work and discussion in #2468 leaves the BGZF code in a state where it uses a mix of latin1 (used in the Bio._py3k conversion functions) and the default encoding. Properly, when used in text mode, the BGZF code should follow the gzip library and take a user-specified encoding.
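For reference, this is roughly how the standard library's `gzip.open` handles a user-specified encoding in text mode. It is a sketch of the gzip behaviour only, not a proposed BGZF API; the helper names `roundtrip` and `read_with` are hypothetical and exist just for this example.

```python
import gzip
import os
import tempfile


def roundtrip(text, encoding):
    """Write text to a temporary gzip file and read it back with the same encoding."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        # Text mode ("wt"/"rt") wraps the binary stream and applies the encoding.
        with gzip.open(path, "wt", encoding=encoding) as handle:
            handle.write(text)
        with gzip.open(path, "rt", encoding=encoding) as handle:
            return handle.read()
    finally:
        os.remove(path)


def read_with(text, write_encoding, read_encoding):
    """Write with one encoding and read with another, to show what a mismatch does."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wt", encoding=write_encoding) as handle:
            handle.write(text)
        with gzip.open(path, "rt", encoding=read_encoding) as handle:
            return handle.read()
    finally:
        os.remove(path)


print(roundtrip("café", "utf-8"))
print(read_with("café", "utf-8", "latin1"))
```

The second call shows the kind of silent corruption that mixing latin1 with another encoding can cause: the two UTF-8 bytes of the accented character are each decoded as a separate latin1 character.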
Self-assigning issue.