Unicode API naming convention #213

Closed
antocuni opened this issue Jun 10, 2021 · 4 comments

@antocuni
Collaborator

Historically, Python 2 had two data types for strings:

  • str to represent sequences of chars, where each char is in the range 0-255
  • unicode to represent unicode strings

At the C level, the API functions related to str were called PyString_* and functions related to unicode were called PyUnicode_*.

Python 3 has the same two data types under different names:

  • bytes is equivalent to the old str. The C functions are called PyBytes_*.
  • str is equivalent to the old unicode. For historical reasons, the C functions are still called PyUnicode_*.

For HPy we have basically 3 options:

  1. Follow the CPython convention and call them HPyBytes_* and HPyUnicode_*.
  2. Fix the mess and call them HPyBytes_* and HPyStr_* (or possibly HPyString_*, but I think HPyStr is much better).
  3. A middle-ground solution, which could also be useful for other parts of the API: declare HPyStr_* as the official API, but also provide a separate header (maybe hpy/compat.h or hpy/cpycompat.h?) which maps HPyUnicode_* to the equivalent HPyStr_*, possibly emitting a warning.

Pros of each name:

  • HPyUnicode_*: it is easier for people to port their code from Python/C to HPy, and to compare against the CPython docs.
  • HPyStr_*: it is cleaner and more consistent with the rest of the API.

Personally, I think the middle-ground solution is the best option.
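
To make option 3 concrete, here is a minimal sketch of what such a compatibility mapping could look like. Everything in it is hypothetical and for illustration only: `DemoStr` stands in for a real HPy handle, `HPyStr_DemoByteLength` is an invented function rather than an existing or proposed HPy call, and `DEMO_COMPAT_WARN` is a made-up opt-in switch. The deprecation attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string handle; NOT the real HPy type. */
typedef struct { const char *utf8; } DemoStr;

/* The "official" spelling under option 3 (invented function for illustration). */
static inline size_t HPyStr_DemoByteLength(DemoStr s)
{
    assert(s.utf8 != NULL);
    return strlen(s.utf8);   /* length in bytes of the UTF-8 data */
}

/* ---- what a hypothetical hpy/compat.h might contain ---- */
#if defined(DEMO_COMPAT_WARN) && (defined(__GNUC__) || defined(__clang__))
/* Opt-in variant: the old name still works, but each use emits a compiler warning. */
__attribute__((deprecated("use HPyStr_DemoByteLength instead")))
static inline size_t HPyUnicode_DemoByteLength(DemoStr s)
{
    return HPyStr_DemoByteLength(s);
}
#else
/* Default variant: a silent alias from the CPython-style name to the new one. */
#define HPyUnicode_DemoByteLength HPyStr_DemoByteLength
#endif
```

With the silent alias, ported code that still uses the `HPyUnicode_*` spelling compiles unchanged. Defining `DEMO_COMPAT_WARN` switches to the wrapper, so the compiler flags each remaining use of the old name.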

@hodgestar
Contributor

My 2c: I much prefer HPyUnicode_* and consider HPyStr_* at best confusing.

@uranusjr

+1 to the middle-ground approach, and IMO there is no need for a warning. A note in the documentation saying Str is preferred would be enough.

@timfel added this to the Version 0.9 milestone Nov 1, 2022

@mattip
Contributor

mattip commented Dec 1, 2022

xref #354, which leans toward HPyUnicode_* for easier porting of C-API functions

@fangerer
Contributor

I'll close this issue because, as Matti mentioned, we decided to keep the C-level naming style (see dev call minutes).

@fangerer closed this as not planned Jan 20, 2023

6 participants