Clarify what is a "sub-interpreter" and what is an "interpreter". #69

markshannon · 2020-10-19T19:59:21Z

PEP 554 is entitled "Multiple Interpreters in the Stdlib", yet the term "subinterpreters" is used throughout this repo.

There is the additional confusion of the C struct names.
It seems to me that the C struct PyInterpreterState corresponds to the sub-interpreter and that the C struct _PyRuntimeState corresponds to the interpreter.

Confusion about which is which makes the goals of this project unclear, and I fear it may have resulted in some unnecessary work, as data structures are moved to PyInterpreterState that could more easily, and with less impact, have been moved to (or left in) _PyRuntimeState.


encukou · 2020-10-20T08:13:54Z

IMO, "subinterpreter" is not a good term; generally we should aim to make all interpreters equal (though that can be a long-term goal).

markshannon · 2020-10-20T10:57:50Z

Sub-interpreters already exist, whether we like the term or not.
They share the same heap, although they cannot see each other's sub-heap, just common objects like builtin types and numbers.

Why not leave them working as they do now, and enable multiple interpreters?
That way seems easier to implement in practice, and causes less breakage (at least, no more breakage).

encukou · 2020-10-20T11:28:58Z

As far as I can see, "sub-interpreter" and "interpreter" are basically interchangeable terms at this point. See e.g. the first two sentences in the Py_NewInterpreter docs.
They share some objects like builtin types and numbers, which are immutable and currently OK to share – until you want a per-interpreter GIL, which is one of the goals in this repo.
And unfortunately they also sometimes share some objects which they shouldn't, like anything that references a Python function's globals, so I'd rather fix them, not leave them working as they do now.

_PyRuntimeState holds the stuff that's common to all (sub-)interpreters, such as, well, the list of (sub-)interpreters. Everything else should be per-(sub)interpreter.
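To make the split concrete, here is a minimal sketch of that ownership model. It is a toy model in Python, not CPython's actual C definitions: the class names `RuntimeState` and `InterpreterState`, the `modules` field, and the `new_interpreter` method are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InterpreterState:
    """Toy stand-in for PyInterpreterState: state private to one interpreter."""
    interp_id: int
    modules: dict = field(default_factory=dict)  # per-interpreter, never shared


@dataclass
class RuntimeState:
    """Toy stand-in for _PyRuntimeState: state common to all interpreters."""
    interpreters: list = field(default_factory=list)  # the list of interpreters
    next_id: int = 0

    def new_interpreter(self) -> InterpreterState:
        # Register a fresh interpreter with the (single) runtime.
        interp = InterpreterState(interp_id=self.next_id)
        self.next_id += 1
        self.interpreters.append(interp)
        return interp


runtime = RuntimeState()
first = runtime.new_interpreter()
second = runtime.new_interpreter()

# State stored on one interpreter is not visible from the other.
first.modules["spam"] = "only visible in the first interpreter"
```

In this sketch the runtime owns only the registry of interpreters; anything an interpreter stores in its own `modules` dict stays invisible to its siblings.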

markshannon · 2020-10-20T13:10:29Z

The problem with that approach is that it involves a lot of moving stuff from _PyRuntimeState to PyInterpreterState.
Wouldn't allowing several _PyRuntimeState be less work as it already has a GIL?
It would also allow subinterpreters to work as they currently do.

Until multiple interpreters can run in parallel, moving global state into _PyRuntimeState has no adverse impact on performance. Moving that state into PyInterpreterState slows things down.

encukou · 2020-10-20T13:38:58Z

> Wouldn't allowing several _PyRuntimeState be less work as it already has a GIL?

I doubt it – you'd need to make a per-_PyRuntimeState GIL, whereas in the current approach you'd need to make a per-PyInterpreterState GIL. The main issues, like making sure threads don't mangle a shared object's refcounts, are basically the same.

What exactly do you mean by allowing subinterpreters to work as they currently do?

markshannon · 2020-10-20T15:12:33Z

All sub-interpreters share the same heap (even though they can see different parts of it) and share the GIL.

encukou · 2020-10-21T07:42:20Z

So, to clarify, under your proposal with multiple _PyRuntimeState, we would plan to make one GIL per _PyRuntimeState?
Would sub-interpreters from different _PyRuntimeStates not share the heap?

markshannon · 2020-10-21T11:12:23Z

Doesn't sharing a heap between interpreters require synchronization for the cycle GC?

markshannon · 2020-10-21T11:13:54Z

My main point is that without clearer naming, it is impossible to discuss these alternatives without a lot of confusion.

encukou · 2020-10-21T12:40:48Z

OK. Here's my take.
You can have multiple interpreters in a single process. They should be isolated from each other; we're working on improving that isolation.
The term subinterpreter essentially means the same thing as interpreter. There are subtle differences:

- If you start one interpreter from another, you'd call the child a "subinterpreter". (But you can also start interpreters from pure C code, and subinterpreters should be able to outlive their parents, though I don't think the high-level API is built for that.)
- Saying "subinterpreters" makes it clear that you're working on better support for multiple interpreters, as opposed to improving other aspects of Python. Not a very good label, IMO, but it's what's used.

As for an earlier question, I don't think that moving stuff from _PyRuntimeState to PyInterpreterState is more work than allowing several _PyRuntimeState. But then, I'm not the one actually doing that work.

ericsnowcurrently · 2020-10-21T15:48:16Z

The key detail is that there is a "main" interpreter:

- created during runtime initialization
- used during runtime initialization
- used during runtime finalization
- the initial interpreter exposed to users
- has the "main" thread

We have been calling all other interpreters in the runtime "subinterpreters".

FWIW, in the context of PEP 554, we start at the main interpreter. Each new interpreter then effectively ends up as a node in an implicit tree, relative to the "parent" interpreter under which the new one was created. However, that isn't fundamental at the C level.
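The naming described above can be sketched roughly as follows. This is a toy model in Python, not the PEP 554 API or the C implementation: `Runtime`, `Interpreter`, `create`, and `subinterpreters` are hypothetical names used only for illustration, and the `parent` link is pure bookkeeping, since the tree is implicit rather than fundamental.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interpreter:
    interp_id: int
    parent: Optional["Interpreter"] = None  # None only for the main interpreter


@dataclass
class Runtime:
    interpreters: list = field(default_factory=list)

    def __post_init__(self):
        # The main interpreter is created during runtime initialization.
        self.main = self.create(parent=None)

    def create(self, parent):
        # Record which interpreter the new one was created under.
        interp = Interpreter(interp_id=len(self.interpreters), parent=parent)
        self.interpreters.append(interp)
        return interp

    def subinterpreters(self):
        # "Subinterpreter" here means every interpreter other than main.
        return [i for i in self.interpreters if i is not self.main]


runtime = Runtime()
child = runtime.create(parent=runtime.main)
grandchild = runtime.create(parent=child)
```

Under this model `child` and `grandchild` are both "subinterpreters" simply because neither is the main interpreter; how deep they sit in the tree makes no difference to the label.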

ericsnowcurrently · 2020-10-22T14:55:15Z

FYI, the C-API docs have a paragraph explaining the distinction (thanks to @nanjekyejoannah).

@markshannon, do you think it would help to have more detail there? (IMHO, there isn't much more to say than what we say there.)
