
Check failed: error == cudaSuccess (2 vs. 0) out of memory #28

Closed
chiwing4 opened this issue Dec 16, 2018 · 22 comments
Assignees
Labels
help wanted Extra attention is needed type:Bug Something isn't working

Comments


chiwing4 commented Dec 16, 2018

K4YT3X Edit: Temporary Solution

The issue is caused by waifu2x-caffe not having sufficient memory. A temporary solution to this problem is to reduce the number of threads used.

Original Issue

Hi, the program fails with the following exception when I use --gpu to enlarge.
I believe my machine has enough memory to run it.
Is there any way to fix it?
Many thanks. This is a useful program.

```
[+] INFO: Reading video information
[+] INFO: Framerate: 59.94005994005994
[+] INFO: Starting to upscale extracted images
2018-12-16 22:27:58.707744 [+] INFO: [upscaler] Thread 3 started
2018-12-16 22:27:58.707744 [+] INFO: [upscaler] Thread 4 started
2018-12-16 22:27:58.708742 [+] INFO: [upscaler] Thread 0 started
2018-12-16 22:27:58.708742 [+] INFO: [upscaler] Thread 1 started
2018-12-16 22:27:58.709739 [+] INFO: [upscaler] Thread 2 started
Could not create log file: File exists
COULD NOT CREATE LOGFILE '20181216-222802.16176'!
F1216 22:28:02.183454 12168 syncedmem.cpp:78] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Could not create log file: File exists
COULD NOT CREATE LOGFILE '20181216-222802.15444'!
F1216 22:28:02.189438 10304 syncedmem.cpp:78] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Could not create log file: File exists
COULD NOT CREATE LOGFILE '20181216-222802.17140'!
F1216 22:28:02.238307  1736 syncedmem.cpp:78] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Could not create log file: File exists
COULD NOT CREATE LOGFILE '20181216-222802.22424'!
F1216 22:28:02.290169 10188 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Could not create log file: File exists
COULD NOT CREATE LOGFILE '20181216-222803.18460'!
F1216 22:28:03.546811  1888 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
2018-12-16 22:28:03.889894 [+] INFO: [upscaler] Thread 2 exiting
2018-12-16 22:28:04.882241 [+] INFO: [upscaler] Thread 1 exiting
2018-12-16 22:28:04.925128 [+] INFO: [upscaler] Thread 0 exiting
2018-12-16 22:28:04.962030 [+] INFO: [upscaler] Thread 4 exiting
2018-12-16 22:28:05.021869 [+] INFO: [upscaler] Thread 3 exiting
[+] INFO: Upscaling completed
```

chiwing4 commented Dec 16, 2018

For your reference, just in case.
my command:

```
py video2x.py -v testvid.mp4 -o new.mp4 --width 3840 --height 2160 --gpu
```

Using Python 3.7.1.
Specs:

  • 32GB RAM
  • RTX 2080ti
  • running in a disk with 450+GB space

Thanks

Owner

k4yt3x commented Dec 17, 2018

Thank you for the issue.

This looks weird. I saw the issue this morning but haven't located the problem yet. Just know that I've seen it and am working on it.

There's a big update coming soon. This issue might have to be fixed after that update, which is already in progress.

@k4yt3x k4yt3x self-assigned this Dec 17, 2018
@k4yt3x k4yt3x added the type:Bug Something isn't working label Dec 17, 2018
chiwing4 (Author) commented

Thanks for your quick reply!
I look forward to the new update.


asimonf commented Dec 17, 2018

Just hit this right now. Hopefully you'll be able to figure it out.

WSADKeysGaming commented

This also happens to me, even after reducing the number of threads from 5 to 4, because I have insufficient memory available.

Owner

k4yt3x commented Dec 17, 2018

Could you please monitor the system to make sure that there's sufficient memory available for both the system RAM and the GPU RAM?

Btw I stuffed your output in a code block to make it easier to read.

@k4yt3x k4yt3x added the help wanted Extra attention is needed label Dec 17, 2018
Owner

k4yt3x commented Dec 17, 2018

I was just reading some other similar errors, but I'm still not sure how to fix this problem on my side.
I still have quite a few questions:

  • Does CUDNN work?
  • Does single thread work?

Some other caffe-library-related issues as a reference:


asimonf commented Dec 17, 2018

I followed your suggestions and it turns out that my system is running out of memory. I had to limit the thread count to 2 in my particular test to get it to run. My laptop has a GTX 965M with only 4 GB of VRAM. System memory was more than enough, with over 11 GB of RAM free.

Owner

k4yt3x commented Dec 17, 2018

@asimonf so that is indeed the problem. My program is able to monitor system memory, suggest how many threads should be used, and warn the user when insufficient memory is available. However, it has no capability to monitor GPU memory, which is why the failure shows up in @chiwing4's output as a CUDA "out of memory" error.

I remember looking into that already, but I didn't find an elegant solution for monitoring CUDA GPU memory. I will keep looking into it.

Therefore, for now, if this issue arises, reduce the number of threads.


asimonf commented Dec 17, 2018

My current testing with your file shows around 1.4 GB of memory per thread. I don't know if memory usage grows with output size, but if it does, it might be even more (I'm testing output at 1440x1080). I'll test a bit more. A good suggestion could be to limit threads assuming a maximum usage of 2 GB per thread?
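As a rough sanity check, these numbers line up with a simple per-thread VRAM budget. A sketch (the 2 GB figure is the conservative cap suggested above, not a measured constant):

```python
# asimonf's GTX 965M has 4 GB of VRAM; with a conservative 2 GB-per-thread
# budget, the 2-thread ceiling he observed falls out of simple division.
vram_mb = 4096         # total VRAM on the GTX 965M
per_thread_mb = 2048   # suggested worst-case usage per upscaler thread
max_threads = vram_mb // per_thread_mb
print(max_threads)  # 2
```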

Owner

k4yt3x commented Dec 17, 2018

Maybe it's possible. I'll have to look into whether waifu2x-caffe has any options for that.

Author

chiwing4 commented Dec 18, 2018

OK, I can confirm this is a GPU out-of-memory problem.

GPU memory usage

  • when idle : 1.9GB
  • with 1 thread: 6GB
  • with 2 threads: 9.6GB
  • with 3 threads: out of memory

My testing shows ~4 GB of VRAM per thread.

Output to 1080p and 4K has the same memory usage.
I thought the 2080 Ti could handle this well. lol
https://imgur.com/a/ALTNPpR

chiwing4 (Author) commented

I just enlarged testvid.mp4 (240 frames, 320 x 240) to 1080p and 4K.
While VRAM usage is the same, processing time differs as expected.
With 2 threads:

  • out 1080p : 465 seconds
  • out 4K : 1778 seconds

4K needs nearly 4× the computation time of 1080p. (Interesting: 4K has 4 times as many pixels as 1080p.)
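The timing tracks the pixel count almost exactly; a quick check with the numbers from this comment:

```python
# 4K has exactly 4x the pixels of 1080p, which matches the ~3.8x
# difference in upscale time measured above (465 s vs 1778 s).
pixel_ratio = (3840 * 2160) / (1920 * 1080)   # exactly 4.0
time_ratio = 1778 / 465                       # ~3.82
print(pixel_ratio, round(time_ratio, 2))
```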


asimonf commented Dec 18, 2018

Interesting. I thought it would be much faster. I tried an OpenCL version on my home computer (it has a Vega 56 card) and it took about that long for the 1080p one. I might not be remembering accurately, so I'll try testing again tonight.

Owner

k4yt3x commented Dec 18, 2018

@asimonf I thought waifu2x only supports CUDA? Maybe I'm wrong on that. I don't have an AMD card to test things out.


asimonf commented Dec 18, 2018

Yes, but there's a C++ rewrite that is compatible with both CUDA and OpenCL. I haven't tested enough to see if the quality is similar; I just tested speed yesterday. It might not be the same (but it does use the same NN models, so it can't be that different). Here's the link: https://github.com/DeadSix27/waifu2x-converter-cpp.

I doubt it's faster than the original waifu2x on nvidia cards, but I don't know.

Owner

k4yt3x commented Dec 18, 2018

That's cool. It's fascinating how many derivations there are from the original waifu2x.

I feel like this thread is beginning to be like a forum.


asimonf commented Dec 18, 2018

So I just tested it on the Vega 56 and it took 432 seconds to process all 240 frames with upscaling and denoising to 1440x1080. I can't really compare it 100% to what waifu2x-caffe does with video2x, but apparently it isn't that slow compared to the 2080 Ti if (and this is a big if) they are generating similar-quality images.

dealvidit commented

So with 2 threads, this works for me too. But I have two 1080 Tis, so can I spread the load across both GPUs?

Owner

k4yt3x commented Dec 25, 2018

@dealvidit spreading the load is unlikely to be something this program can control. It's more up to either waifu2x or caffe2.

Owner

k4yt3x commented Feb 26, 2019

I'm making some progress. I'm considering using the GPUtil library to monitor GPU memory usage to solve this issue.
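A minimal sketch of that idea. The helper name and the per-thread/reserve figures below are assumptions based on chiwing4's ~4 GB/thread measurement, not video2x's actual code; GPUtil is a third-party package whose GPU objects expose free memory in MB via `memoryFree`:

```python
def usable_threads(free_mb, per_thread_mb=4096, reserve_mb=1024):
    """Estimate how many upscaler threads fit in free GPU memory.

    Assumes each thread needs roughly per_thread_mb of VRAM (chiwing4
    measured ~4 GB/thread) and keeps reserve_mb of headroom for the driver.
    """
    return max(0, int((free_mb - reserve_mb) // per_thread_mb))

try:
    import GPUtil  # third-party: pip install gputil
    gpus = GPUtil.getGPUs()
    if gpus:
        print(usable_threads(gpus[0].memoryFree))
except ImportError:
    pass  # GPUtil not installed; fall back to a user-specified thread count
```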

Owner

k4yt3x commented Feb 26, 2019

I have just pushed update 2.4.2. If you have an Nvidia GPU and CUDA drivers installed, the new version will read the output of nvidia-smi.exe to determine usable GPU memory prior to upscaling.
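For reference, nvidia-smi can emit machine-readable output that is easy to parse. A sketch of the general approach (this is not the actual 2.4.2 code, and the helper names are made up):

```python
import subprocess

def parse_free_mib(smi_output):
    """Parse 'memory.free' values (one line per GPU, in MiB) from
    nvidia-smi output produced with --format=csv,noheader,nounits."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]

def query_free_mib():
    # nvidia-smi supports structured queries, avoiding scraping the table view.
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_free_mib(out)
```

For example, `parse_free_mib("9216\n11264\n")` yields one free-memory reading per GPU.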

It doesn't have support for AMD GPUs, but AMD GPUs must use waifu2x-converter-cpp, so I'm not sure whether the same problem occurs with that driver. If something similar happens there, please open a new issue and I'll see if I can fix it.

@k4yt3x k4yt3x closed this as completed Feb 26, 2019
5 participants