Need to restart in between generations of higher resolution images #2031

Open
DiffusionAllusion opened this issue Dec 9, 2023 · 1 comment

DiffusionAllusion commented Dec 9, 2023

I've noticed that when I generate 640x640 images (the highest resolution I can use without running out of memory), I have to restart Shark after each generated image; otherwise it crashes. I suspect some part of the system isn't releasing the memory used by the previous generation. It would be great if this could be looked into, as it's quite annoying. Also, let me know if there is anything I can do to help debug this.

I'm on AMD (Windows)

Logs (when trying to generate a second image):

File "shark\shark_inference.py", line 224, in load_module
  params = load_flatbuffer(
           ^^^^^^^^^^^^^^^^
File "shark\iree_utils\compile_utils.py", line 475, in load_flatbuffer
  vmfb, config, temp_file_to_unlink = load_vmfb_using_mmap(
                                      ^^^^^^^^^^^^^^^^^^^^^
File "shark\iree_utils\compile_utils.py", line 411, in load_vmfb_using_mmap
  ctx.add_vm_module(mmaped_vmfb)
File "iree\runtime\system_api.py", line 271, in add_vm_module
File "iree\runtime\system_api.py", line 268, in add_vm_modules
SystemExit: Error registering modules: C:\actions-runner\w\SRT\SRT\c\runtime\src\iree\hal\drivers\vulkan\native_allocator.cc:315: RESOURCE_EXHAUSTED; VK_ERROR_OUT_OF_DEVICE_MEMORY; vkAllocateMemory; failed to allocate buffer of length 1719042240; while invoking native function hal.allocator.allocate; while calling import;
[ 1]   native hal.allocator.allocate:0 -
[ 0] bytecode module@1:33446 -

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "starlette\routing.py", line 686, in lifespan
File "uvicorn\lifespan\on.py", line 137, in receive
File "asyncio\queues.py", line 158, in get
asyncio.exceptions.CancelledError
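
For scale, the size of the single buffer that failed to allocate can be converted to GiB. This is a minimal sketch that only does arithmetic on the number in the `vkAllocateMemory` error above; the function name is made up for illustration, and the result says nothing about total VRAM use or which component is holding on to memory.

```python
# Size of the buffer that failed to allocate, copied from the
# "failed to allocate buffer of length ..." part of the log above.
FAILED_ALLOCATION_BYTES = 1719042240


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


failed_gib = bytes_to_gib(FAILED_ALLOCATION_BYTES)
print(f"failed allocation: {failed_gib:.2f} GiB")  # prints "failed allocation: 1.60 GiB"
```

So the second generation fails while requesting roughly 1.6 GiB in one allocation, which would fit the suspicion that memory from the first generation is still resident on the device.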

Torva01 commented Mar 1, 2024

Same here. I was getting it before, one at a time, but now, out of nowhere, I have to restart after every generation. Did you find a solution?
