Describe the bug
Hi there,
I am having a super bizarre bug. When using the basic four-line Stable Diffusion Pipeline example on Mac from an embedded Python interpreter inside a C++ program (code below), whenever I get to the inference call (pipe()), it seems to duplicate the C++ process multiple times. Each of these processes then starts running the Python code again, and each one will in turn open more (at a rate of one every ~5-15 seconds), until I eventually Force Quit them all from Activity Monitor. Somehow, each process that opens continues to be able to write to the same Output window in Xcode (the same thing happens if I run the built binary from the CLI).
The Python code runs fine, even with the same Python interpreter I use in C++, when I run it outside of C++. (Although it does open a second python3.10 process in the background, that one never goes above ~25MB of memory usage and doesn't appear to be doing any CPU work, so I think it's unrelated.)
Hopefully someone can shed some light on this issue!
Thanks,
Jonah
P.S. - to set up the minimum reproducible example, you'll need to link to python3.x.dylib in the Xcode project settings, put it in /usr/local/lib, add /path/to/python/dir/include/python3.x to Xcode's Header Search Paths, and add /path/to/python/dir/lib to Xcode's Library Search Paths. I used the latest version of pybind11 (since it's a header-only library it's very easy to set up - just download it from GitHub and add the containing dir to the Header Search Paths in Xcode).
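The directories mentioned above can be looked up from the Python interpreter that will be embedded. This is a minimal sketch, assuming a standard CPython install; the function name embed_paths is only illustrative, and LIBDIR may be unset on some builds:

```python
import sys
import sysconfig


def embed_paths():
    """Return directories that are likely needed to embed this interpreter."""
    return {
        # Candidate value for Py_SetPythonHome (the base install prefix)
        "home": sys.base_prefix,
        # Directory containing Python.h (for Header Search Paths)
        "include": sysconfig.get_path("include"),
        # Directory containing the Python shared library (for Library
        # Search Paths); this config variable may be None on some builds
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


if __name__ == "__main__":
    for name, path in embed_paths().items():
        print(f"{name}: {path}")
```

Running this with the same interpreter that the C++ program embeds should print the paths to copy into the Xcode settings.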
Reproduction
#include <pybind11/embed.h>
namespace py = pybind11;
int main() {
Py_SetPythonHome(L"path/to/python/dir"); // This is necessary to start the interpreter
py::scoped_interpreter guard{}; // start the interpreter
py::exec(R"(
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
# pipe.to("mps") # Happens with or without this
image = pipe("a tall horse", num_inference_steps=4, height=256, width=256)
# image.images[0].show() # Optional - to see the result
)");
}
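To see which process is re-running the script and what it was launched from, a few standard-library lines could be placed at the top of the Python string above. This is only a diagnostic sketch: describe_process is a hypothetical helper name, and it makes no assumption about why the extra processes appear.

```python
import os
import sys


def describe_process():
    """Collect identifying details about the current process."""
    return {
        # ID of the process executing this code
        "pid": os.getpid(),
        # ID of the process that launched it
        "ppid": os.getppid(),
        # In an embedded interpreter this may not point at a python binary
        "executable": sys.executable,
    }


# flush=True so the line is not lost if the process is force-quit
print(describe_process(), flush=True)
```

If each new HelloPybind process prints its own pid together with a ppid matching an earlier line, that would show which process launched it.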
Logs
(Note: by the end of this run, there were 8 HelloPybind processes open [visible in Activity Monitor], most of which were using 3-6GB of memory and lots of "% CPU".)
2023-05-30 18:59:36.158366-0300 HelloPybind[26938:358650] Metal API Validation Enabled
0% 0/4 [00:00<?, ?it/s]2023-05-30 18:59:51.316046-0300 HelloPybind[26953:359197] Metal API Validation Enabled
25% 1/4 [00:09<00:28, 9.66s/it]
50% 2/4 [00:22<00:23, 11.76s/it]
0% 0/4 [00:00<?, ?it/s]
75% 3/4 [00:35<00:12, 12.32s/it]
25% 1/4 [00:17<00:52, 17.65s/it]2023-05-30 19:00:29.480803-0300 HelloPybind[27005:359794] Metal API Validation Enabled
100% 4/4 [00:48<00:00, 12.57s/it]
100% 4/4 [00:48<00:00, 12.21s/it]
System Info
I tried it on two different Python envs, both with the same results:
Env 1
- diffusers version: 0.16.1
- Platform: macOS-13.2.1-arm64-arm-64bit
- Python version: 3.9.16
- PyTorch version (GPU?): 2.0.0 (False)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.28.1
- Accelerate version: 0.18.0 (I tried it with accelerate uninstalled to no avail)
- xFormers version: not installed
Env 2
- diffusers version: 0.16.1
- Platform: macOS-13.2.1-arm64-arm-64bit
- Python version: 3.10.11
- PyTorch version (GPU?): 2.0.1 (False)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.29.2
- Accelerate version: 0.19.0
- xFormers version: not installed