test: embed multiprocessing.Process can be created with spawn context #5238
Nice test! Do you have any leads on what needs fixing? (I have no idea tbh. I never use embedding in my own work.)
Thanks, @rwgk! I think I understand what is happening, but I'm not sure exactly why yet. In either case, this doesn't seem like something The long and short of it is You can see this in action using the following

// main.c
#include <Python.h>

int main(int argc, char **argv) {
    Py_Initialize();
    int result = PyRun_SimpleString("import sys; print(sys.executable)");
    Py_Finalize();
    return result;
}

compile with:

gcc -Wall \
    -I$(python3 -c 'import sysconfig; print(sysconfig.get_config_var("INCLUDEPY"))') \
    -L$(python3 -c 'import sysconfig; print(sysconfig.get_config_var("LIBDIR"))') \
    -lpython$(python3 -c 'import sysconfig; print(sysconfig.get_config_var("LDVERSION"))') \
    -o main main.c

On
As a sanity check, I wrote up a small
code
Compiled the same as above.

#include <Python.h>
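#include <pythonrun.h>
#include <stdio.h>

int main(int argc, char **argv) {
    // Open the script first so a missing file is reported instead of
    // handing a NULL pointer to PyRun_SimpleFile.
    FILE *fp = fopen("./t.py", "r");
    if (fp == NULL) {
        perror("./t.py");
        return 1;
    }
    Py_Initialize();
    int result = PyRun_SimpleFile(fp, "./t.py");
    Py_Finalize();
    fclose(fp);
    return result;
}

# t.py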
import multiprocessing as mp
import sys


def foo():
    print(f"child: {sys.executable}")
    print(f"child: {__name__=}")


def main():
    print(f"parent: {sys.executable}")
    print(f"parent: {__name__=}")
    proc = mp.Process(target=foo)
    proc.start()
    proc.join()


if __name__ == "__main__":
    main()

This issue seems to stem from platform-specific resolution of certain fields in
code

#include <Python.h>
#include <cpython/initconfig.h>
#include <stdio.h>

void print_config(const PyConfig *config) {
    // input
    printf("home: %ls\n", config->home);
    printf("platlibdir: %ls\n", config->platlibdir);
    printf("pathconfig_warnings: %d\n", config->pathconfig_warnings);
    printf("program_name: %ls\n", config->program_name);
    printf("pythonpath_env: %ls\n", config->pythonpath_env);
    // output
    printf("base_exec_prefix: %ls\n", config->base_exec_prefix);
    printf("base_executable: %ls\n", config->base_executable);
    printf("base_prefix: %ls\n", config->base_prefix);
    printf("exec_prefix: %ls\n", config->exec_prefix);
    printf("executable: %ls\n", config->executable);
    printf("module_search_paths_set: %d\n", config->module_search_paths_set);
    if (config->module_search_paths_set == 1) {
        printf("module_search_paths: ");
        // length is a Py_ssize_t, so use the same type for the index.
        for (Py_ssize_t i = 0; i < config->module_search_paths.length; i++) {
            printf("%ls", config->module_search_paths.items[i]);
            if (i < config->module_search_paths.length - 1) {
                printf(":");
            }
        }
        printf("\n");
    } else {
        printf("module_search_paths:\n");
    }
    printf("prefix: %ls\n", config->prefix);
}

int main(int argc, char **argv) {
    PyStatus status;
    PyConfig config;
    PyConfig_InitPythonConfig(&config);
    status = Py_InitializeFromConfig(&config);
    if (PyStatus_Exception(status)) {
        goto exception;
    }
    PyConfig_Read(&config);
    print_config(&config);
    PyConfig_Clear(&config);
    Py_Finalize();
    return 0;
exception:
    PyConfig_Clear(&config);
    if (PyStatus_IsExit(status)) {
        return status.exitcode;
    }
    /* Display the error message and exit the process with
       non-zero exit code */
    Py_ExitStatusException(status);
}

I need to do a little more digging to see if I can determine why there are discrepancies in
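The output fields that print_config reports can also be compared across platforms from Python, through the sys module. The following is only a sketch: path_config_snapshot is a made-up helper name, and sys._base_executable is a private attribute that may be missing, so it is read with getattr.

```python
import sys


def path_config_snapshot():
    """Collect the path-related values the interpreter resolved at startup."""
    return {
        "executable": sys.executable,
        # Private attribute; not guaranteed to exist on every Python version.
        "base_executable": getattr(sys, "_base_executable", None),
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "exec_prefix": sys.exec_prefix,
        "base_exec_prefix": sys.base_exec_prefix,
        # sys.path starts from the configured search paths but may have been
        # extended later (for example by the site module).
        "path": list(sys.path),
    }


if __name__ == "__main__":
    for name, value in path_config_snapshot().items():
        print(f"{name}: {value}")
```

Running the same script through a regular python binary and through an embedding program should make any difference in the resolved executable visible side by side.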
I'm totally not an expert, so I just did a quick Google search: https://stackoverflow.com/questions/15636266/embedded-python-multiprocessing-not-working Have you seen that already? (The Google search has more hits that look interesting, but I didn't actually look around.)
I am also no expert, haha, cue just exploring what the heck is going on at the I'd not seen that link! Thanks for sharing. I'm sure that will be useful for someone who stumbles upon this in the future. For my use case, ideally the code that is run by the embedded Python interpreter does not need to be aware of the environment it is running on, nor that it is running in an embedded interpreter. So, I am really hoping there is a straightforward way to change this behavior in the code that uses
It seems the easiest way to get around this is to specify the path to the Python executable when creating the interpreter:

#include <pybind11/embed.h>

namespace py = pybind11;

int main(int argc, char *argv[]) {
    char const *const args[1] = {"/path/to/python"};
    py::scoped_interpreter guard = py::scoped_interpreter(true, 1, args, false);
    // add what `add_program_dir_to_path=true` effectively does when `argv` is not provided.
    // if `add_program_dir_to_path=true` and `argv` is provided, parent dir of `argv[0]`
    // will be prepended to `sys.path` which is likely not desirable.
    py::exec("import sys; sys.path.insert(0, '')");
}
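If the host program cannot be changed, a comparable effect may be achievable from the Python side with multiprocessing.set_executable, which tells the spawn start method which binary to launch for child processes. This is a sketch of an alternative, not what the PR does: find_python and use_real_python_for_spawn are hypothetical helper names, and the shutil.which lookup assumes that a python3 compatible with the embedded interpreter is on PATH.

```python
import multiprocessing as mp
import os
import shutil
import sys


def find_python():
    """Return a path to a real Python binary, or None if none is found."""
    # In a regular interpreter sys.executable already points at Python; in an
    # embedded one it may be empty or point at the host program instead.
    name = os.path.basename(sys.executable or "").lower()
    if name.startswith("python"):
        return sys.executable
    # Assumption: a compatible interpreter can be found on PATH.
    return shutil.which("python3") or shutil.which("python")


def use_real_python_for_spawn():
    """Point the spawn start method at a real interpreter, if one is found."""
    exe = find_python()
    if exe is not None:
        mp.set_executable(exe)
    return exe
```

The trade-off is that the Python code then has to know it might be embedded, which is what the argv-based approach avoids.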
I did more digging to find the source of the discrepancy in This seems like it stems from a kernel bug in Looking into the implementation, on Knowing what I do now, I don't think this is a bug in I am curious if the behavioral difference is still necessary, but I figure it's just a wart that we have to live with now. |
I'm a little surprised that you closed this PR, the test seems very useful!
How about changing your new test accordingly? That would make the solution more discoverable, and we'd know for sure that it still works; we'd learn immediately if future macOS environments include changes that break the approach. If you don't want to tackle Windows at this point, we could just ifdef out the test.
Yeah, sorry, I think I rushed to the wrong conclusion! I'm happy to adjust the test accordingly. I'm AFK at the moment, but will sort this out this afternoon.
Description
Add an embedded interpreter test to verify that a multiprocessing process can be created, communicated with, and joined using spawn. The test is failing on macOS and Windows; however, it shouldn't. It appears that on Windows and macOS the child process's name is not updated and a new instance of the parent process is created as a child. The child processes (at least on macOS) are created with arguments similar to:
build/tests/test_embed/test_embed -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=5, pipe_handle=7) --multiprocessing-fork
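The argument list quoted in this description has the shape that the spawn start method builds from its configured executable. A rough way to see this in a regular interpreter is multiprocessing.spawn.get_command_line, which is an undocumented internal helper and may change between Python versions. The wrapper name spawn_command is made up, and the tracker_fd and pipe_handle values are copied from the description purely for illustration.

```python
from multiprocessing import spawn


def spawn_command(**kwds):
    """Return the argv list spawn would use to start a child process."""
    # The first element comes from spawn.get_executable(), which defaults to
    # sys.executable. In an embedded interpreter that can be the host binary,
    # which is how the test program ends up re-launching itself.
    return spawn.get_command_line(**kwds)


if __name__ == "__main__":
    print(spawn_command(tracker_fd=5, pipe_handle=7))
```

In a regular interpreter the first element is the Python binary, whereas in the failing test it is build/tests/test_embed/test_embed.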
Related: