Question about relation to HPy #1

Closed
timfel opened this issue Aug 19, 2022 · 7 comments


@timfel

timfel commented Aug 19, 2022

I am happy to see another run at the fence to propose a new C API for Python!

I am just wondering what may be the relation of this new API to HPy? It seems most of the ideas mentioned in DesignRules.md and DesignPrinciples.md are the same or very nearly so as in HPy. There are some notable omissions, however, such as no mention of subinterpreters or discussion around storing PyRef in C structures or globals and how that interacts with moving GCs, for example (these are among the issues that we fought with in HPy).

For HPy, we have ported a significant part of NumPy and some parts of Cython to observe the performance impact especially for these packages that have been very performance conscious (and sometimes relied on internal API and implementation-defined behaviour to do so). I think these lessons will be valuable for any project attempting to replace the C API and provide a way forward for projects such as these.

@markshannon
Owner

It seems most of the ideas mentioned in DesignRules.md and DesignPrinciples.md are the same or very nearly so as in HPy.

I'd like to think of this API as learning from HPy, rather than competing with it.

Where are the design principles of HPy listed? I couldn't find an explicit list.

From my reading of HPy, I think HPy differs in a few important ways from the API proposed here:

  • This API treats error handling as at least as important as reference handling; HPy inherits the fragile approach of the current API.
  • HPy is a bit dogmatic in its approach to handle ownership. It uses the pejorative term "stealing" for transfer of ownership from caller to callee when making a call. We already transfer ownership from callee to caller on return.
  • HPy inherits some of the legacy features of C-API that impair portability and maintenance. E.g. using C long.
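
To make the first and third points concrete: in the current C API, `PyLong_AsLong` returns -1 both for the value -1 and on failure, so callers must also check `PyErr_Occurred()`, and the result width depends on the platform's `long`. Below is a minimal sketch of the alternative shape, not taken from this repo or from HPy. All names in it (`api_status`, `api_int_from_text`) are hypothetical, and parsing text stands in for converting a Python object.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical status codes: the return value carries only success or failure. */
typedef enum { API_OK = 0, API_ERR_VALUE = 1, API_ERR_OVERFLOW = 2 } api_status;

/* Parse a decimal integer. The result goes through a fixed-width
 * out-parameter, so no value of the result type doubles as an error marker. */
api_status api_int_from_text(const char *text, int64_t *out)
{
    char *end = NULL;
    errno = 0;
    long long value = strtoll(text, &end, 10);
    if (end == text || *end != '\0') {
        return API_ERR_VALUE;      /* empty input, not a number, or trailing junk */
    }
    if (errno == ERANGE || value > INT64_MAX || value < INT64_MIN) {
        return API_ERR_OVERFLOW;   /* does not fit in 64 bits */
    }
    *out = (int64_t)value;         /* written only on success */
    return API_OK;
}
```

With this shape a caller cannot read a result without first looking at the status, and -1 is just an ordinary value.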

There are some notable omissions...

Of course there are omissions, this repo is only a few days old 🙂
Please open issues for specific omissions.

no mention of subinterpreters...

What is it specifically about subinterpreters that is of concern?

https://github.com/markshannon/New-C-API-for-Python/blob/main/DesignPrinciples.md#minimum-of-implicit-state
mentions the conflict between explicitly passing around the interpreter and making it implicit.

It might make sense to wait for a decision on https://peps.python.org/pep-0684/ before thinking about this too much.

discussion around storing PyRef in C structures

Want to open an issue specifically for this?
The very short answer is "declarative object layout". Extensions must declare how objects are laid out in a way that allows the VM to traverse them. Opaque blobs of data may not contain PyRefs.

Object layout in 3.12 is already more regular than in older versions, so a declarative approach can be efficient (at least for CPython 3.12+).
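
As a rough illustration of what "declarative object layout" could mean, here is a minimal sketch. It is an assumption about the general technique, not this repo's design: `ref_t` is a stand-in for `PyRef`, and `layout_t`, `pet_t` and `layout_traverse` are hypothetical names. The extension declares the offsets of its reference fields, and the VM walks them without calling back into extension code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an opaque object reference; hypothetical, not the real PyRef. */
typedef struct { void *ptr; } ref_t;

/* Hypothetical extension object: two references plus plain data. */
typedef struct {
    ref_t name;
    double weight;
    ref_t owner;
} pet_t;

/* The declaration: object size and the offset of every reference field. */
typedef struct {
    size_t size;
    size_t ref_count;
    const size_t *ref_offsets;
} layout_t;

static const size_t pet_ref_offsets[] = {
    offsetof(pet_t, name),
    offsetof(pet_t, owner),
};
static const layout_t pet_layout = { sizeof(pet_t), 2, pet_ref_offsets };

typedef void (*visit_fn)(ref_t *ref, void *arg);

/* What a VM could do with the declaration: visit each declared reference. */
void layout_traverse(const layout_t *layout, void *object, visit_fn visit, void *arg)
{
    for (size_t i = 0; i < layout->ref_count; i++) {
        ref_t *ref = (ref_t *)((char *)object + layout->ref_offsets[i]);
        visit(ref, arg);
    }
}

/* Example visitor: count references that are currently set. */
static void count_live(ref_t *ref, void *arg)
{
    if (ref->ptr != NULL) {
        ++*(size_t *)arg;
    }
}

size_t demo_count_refs(void)
{
    int target = 0;
    pet_t pet = { { &target }, 1.5, { NULL } };
    size_t live = 0;
    layout_traverse(&pet_layout, &pet, count_live, &live);
    return live;
}
```

Because the visitor receives a pointer to each reference field, a moving collector could use the same walk to rewrite references in place, which is why opaque blobs that hide references are a problem.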

or globals

C globals are not supported as they are unsafe.

The docs for HPyGlobal say that they are an alternative to module state. Why have an alternative?
Ease of use? Efficiency?

@timfel
Author

timfel commented Aug 22, 2022

Where are the design principles of HPy listed? I couldn't find an explicit list.

Indeed we haven't created an exhaustive list, mostly because we took this approach that we just start porting important packages and then we keep discovering issues while porting things like NumPy (which exposes its own C API for consumption by other packages). There is https://github.com/hpyproject/hpy/wiki/c-api-next-level-manifesto and in that wiki a lot of resources we collected while porting things over time.

What is it specifically about subinterpreters that is of concern?

https://github.com/markshannon/New-C-API-for-Python/blob/main/DesignPrinciples.md#minimum-of-implicit-state
mentions the conflict between explicitly passing around the interpreter and making it implicit.

Ah, I had overlooked that. This is indeed our thinking in HPy, too, to avoid global state as much as possible. The additional HPyContext argument was inspired by APIs such as JNI or SDL2, which have something similar. SDL2 in particular uses this (implicit) global function table to be forward binary compatible. However, for Python there is an issue, since multiple extensions may be compiled against different versions, years apart, and still need to run in the same process. Without an explicit context that means any time any API needs to change we need a new global method, whereas with a context, this can be handled by the runtime passing the right context to each module.

In particular we in HPy take the same view as you do here:

All code written to the API will continue to work on future versions of Python without recompilation. Recompilation using newer versions may be more efficient, but code compiled to older versions of the API will continue to work.
[...]
Once added to the API, a feature will be removed only if there is a very strong reason to do so.

So we do want to be binary compatible forever, but still under important circumstances need to be able to change the API. With global functions, this would imply a new function; with a context argument, just a different context for the newer binaries.
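
A minimal sketch of that idea, under stated assumptions: `api_context` here is a hypothetical two-field table, not the real `HPyContext`, and the changed behaviour (integer division rounding) is invented purely for illustration. The extension function is compiled once and never changes; the runtime chooses which table to pass based on the version the module was built against.

```c
#include <assert.h>

/* Hypothetical context: a table of function pointers handed to each module. */
typedef struct api_context {
    int abi_version;
    int (*divide)(const struct api_context *ctx, int a, int b);
} api_context;

/* Version 1 behaviour: C-style division, rounding toward zero. */
static int divide_v1(const api_context *ctx, int a, int b)
{
    (void)ctx;
    return a / b;
}

/* Version 2 changes the behaviour: floor division, as in Python. */
static int divide_v2(const api_context *ctx, int a, int b)
{
    (void)ctx;
    int q = a / b;
    if ((a % b != 0) && ((a < 0) != (b < 0))) {
        q -= 1;
    }
    return q;
}

static const api_context context_v1 = { 1, divide_v1 };
static const api_context context_v2 = { 2, divide_v2 };

/* Runtime side: pick the context matching the version a module was built for. */
const api_context *runtime_context_for(int module_abi_version)
{
    return module_abi_version >= 2 ? &context_v2 : &context_v1;
}

/* Extension side: identical code for old and new binaries. */
int extension_half(const api_context *ctx, int n)
{
    return ctx->divide(ctx, n, 2);
}
```

An old binary keeps seeing the version 1 behaviour, while a newer one gets version 2, and no second global symbol such as a `divide2` was needed.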

C globals are not supported as they are unsafe.

We have found dozens of usages of plain C structures stored globally or reachable from other global variables in NumPy, for example. It seemed to us quite a burden to force extensions to find completely different solutions to this. The thoughts around HPyGlobal aren't all fleshed out, but https://github.com/hpyproject/hpy/wiki/dev-call-20220303#hpyglobal has some notes, as this was driven by the NumPy port.
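
For readers less familiar with the trade-off, here is a minimal sketch of why a plain C global and per-module state behave differently once several interpreters share a process. The two interpreters are only simulated by two state structs, and every name (`module_state`, `count_call_state`, and so on) is hypothetical; this is neither HPyGlobal nor the module-state API of either project.

```c
#include <assert.h>

/* Plain C global: one copy for the whole process, whatever interpreter calls in. */
static int global_calls = 0;

int count_call_global(void)
{
    return ++global_calls;
}

/* Hypothetical per-module state: the runtime keeps one instance per interpreter. */
typedef struct {
    int calls;
} module_state;

int count_call_state(module_state *state)
{
    return ++state->calls;
}

/* Two simulated interpreters using the global: B observes A's calls. */
int demo_shared(void)
{
    global_calls = 0;
    count_call_global();          /* interpreter A */
    count_call_global();          /* interpreter A */
    return count_call_global();   /* interpreter B sees 3, not 1 */
}

/* The same calls with per-interpreter state: the counts stay separate. */
int demo_isolated(void)
{
    module_state interp_a = { 0 };
    module_state interp_b = { 0 };
    count_call_state(&interp_a);
    count_call_state(&interp_a);
    count_call_state(&interp_b);
    return interp_a.calls * 10 + interp_b.calls;   /* encodes A=2, B=1 as 21 */
}
```

The porting cost mentioned above is visible even here: every function that used the global now needs the state passed in, or some way to look it up.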


I think it would be useful if we (the current HPy devs, you, @encukou) could have a direct conversation about this, maybe you would have time for a call? The people working on HPy would love to see a better future C API and would be glad to help with the experience we gained from HPy.

@encukou

encukou commented Aug 25, 2022

a direct conversation about this

I'm up to it, but about this proposal you should talk to Mark.

@markshannon
Owner

a direct conversation about this

I'd rather have a public, written conversation, so that the reasoning behind any decisions is recorded.

@mattip

mattip commented Sep 2, 2022

The problem with a written conversation like a discourse or a github project is that the asynchronous format leads to difficulties in keeping the conversation on-topic and resolution-oriented. Perhaps we could find a middle ground: some kind of synchronous discussion(s) (IRC? Zoom?) with a transcript and a commitment to upload the raw recording for permanent storage (youtube or the "notes" section of the HPy documentation).

@markshannon
Owner

Discussion is happening tomorrow 1pm UTC. https://www.twitch.tv/pypyproject

@markshannon
Owner

Closing as there doesn't seem to be anything else to do here.

Feel free to open a new issue if there is something else to be done.
