Feature request: bytes.decode #6462

Open
ivirshup opened this issue Nov 10, 2020 · 2 comments

Comments

@ivirshup
Contributor

Feature request

I would like to be able to call b"abc".decode("utf-8") in nopython mode.

My use case is that h5py returns text data as bytes. I would like to be able to quickly decode text data stored in structured arrays without running the loop in Python. I'm sure there are more common use cases for this as well.

I would hope that CPython's UTF-8 decoding (https://github.com/python/cpython/blob/master/Objects/stringlib/codecs.h) could be reused here, but I don't know enough about how Unicode strings work to comment further.
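To make the use case concrete, here is a minimal sketch of the situation outside Numba. The array contents and the field names `name` and `value` are made up for illustration; real data would come from h5py. It shows the per-element Python loop next to NumPy's `np.char.decode`, which applies the same decoding element-wise and may serve as a stopgap while `bytes.decode` is unavailable in nopython mode.

```python
import numpy as np

# Hypothetical structured array standing in for data read via h5py:
# a fixed-width bytes field holding UTF-8 text plus a numeric field.
records = np.array(
    [(b"abc", 1), (b"d\xc3\xa9f", 2)],
    dtype=[("name", "S8"), ("value", "i4")],
)

# The loop the request would like to move out of Python.
decoded_loop = [raw.decode("utf-8") for raw in records["name"]]

# Element-wise decoding through NumPy (not Numba).
decoded = np.char.decode(records["name"], "utf-8")
```

Neither variant runs inside a Numba-compiled function, so this does not replace the requested feature.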

@esc
Member

esc commented Nov 10, 2020

@ivirshup thank you for submitting this, I have labelled it as a feature request.

@stuartarchibald
Contributor

This is technically feasible but quite tricky. It requires implementing unicode_decode_utf8 (https://github.com/python/cpython/blob/a0c603cb9d4dbb9909979313a88bcd1f5fde4f62/Objects/unicodeobject.c#L5103), which is the caller of the noted codec routines; those would also need translating.
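As a rough idea of the logic such a translation has to cover, here is a simplified strict UTF-8 decoder in plain Python. It is an illustrative sketch only: the name `decode_utf8` is hypothetical, it is not CPython's implementation (which adds an ASCII fast path, error handlers, and compact string storage), and it has not been tried under Numba.

```python
def decode_utf8(data: bytes) -> str:
    """Strictly decode UTF-8 bytes using only integer operations."""
    chars = []
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        # Classify the lead byte: payload bits, number of continuation
        # bytes, and the smallest code point allowed for that length.
        if b0 < 0x80:
            cp, extra, min_cp = b0, 0, 0
        elif 0xC2 <= b0 <= 0xDF:
            cp, extra, min_cp = b0 & 0x1F, 1, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            cp, extra, min_cp = b0 & 0x0F, 2, 0x800
        elif 0xF0 <= b0 <= 0xF4:
            cp, extra, min_cp = b0 & 0x07, 3, 0x10000
        else:
            raise ValueError(f"invalid start byte at position {i}")
        if i + extra >= n:
            raise ValueError(f"truncated sequence at position {i}")
        # Fold in six payload bits from each continuation byte (10xxxxxx).
        for j in range(1, extra + 1):
            b = data[i + j]
            if (b & 0xC0) != 0x80:
                raise ValueError(f"invalid continuation byte at position {i + j}")
            cp = (cp << 6) | (b & 0x3F)
        # Reject overlong encodings, surrogates, and out-of-range values.
        if cp < min_cp or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
            raise ValueError(f"invalid code point at position {i}")
        chars.append(chr(cp))
        i += extra + 1
    return "".join(chars)
```

For valid input, `decode_utf8(b)` should agree with `b.decode("utf-8")`. A nopython version would additionally have to pick the output string's storage width and build it without Python objects, which is where much of the difficulty lies.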
