
Limitation in processing number of video frames according to GPU memory? #36

Closed
instant-high opened this issue Jul 4, 2021 · 13 comments

Comments

@instant-high

Since I got it to work on my GeForce GTX 1050 / 2 GB, at least for videos no longer than ~16 frames before the GPU runs out of memory, I wonder whether there is also a limitation when using an 8 GB GPU?

I had the same problem using Wav2Lip, but it could be solved by setting the chunk size to 1.

Would it (theoretically) be possible to process videos in SimSwap in smaller parts or chunks by releasing GPU memory every 15 frames ?

@ExponentialML

ExponentialML commented Jul 4, 2021

There are multiple things you could do.

  1. Lower the size of your input videos.

  2. Split the video into separate chunk files, then loop over them or process them one by one (painful).

  3. Modify videoswap.py using the below as a starting point.

    frame_count = int(video.get(cv2.CAP_PROP_FRAME_COUNT))

  4. Use subprocess with ffmpeg to split the video into chunks, then run a for loop over each chunk with the Python script, and merge the results afterwards in a video editor or with ffmpeg (a rough splitting sketch follows the pseudo code below). For example:

~pseudo code~

import subprocess

for video_file in video_file_directory:
    # pass the rest of your usual test_video_swapsingle.py arguments here
    subprocess.run(["python", "test_video_swapsingle.py", video_file])
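
For the splitting step itself, a rough sketch using ffmpeg's segment muxer via subprocess (untested; input.mp4, the chunk names and the 1-second segment length are just placeholders):

import subprocess

# split input.mp4 into short chunks without re-encoding;
# with -c copy the cuts fall on keyframes, so chunk lengths are approximate
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-c", "copy", "-map", "0",
    "-f", "segment", "-segment_time", "1",
    "chunk_%03d.mp4",
])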

I would go with number 3, with pseudo code being something like this (this isn't tested, it's just to give you an idea):

current_frame = 0
max_frame = 14

for frame_index in tqdm(range(frame_count)):
    ret, frame = video.read()
    if ret:
        current_frame += 1
        if current_frame == max_frame:
            # do something to empty video memory here
            current_frame = 0

        detect_results = detect_model.get(frame, crop_size)

        if detect_results is not None:
            .....

Like I said, I haven't tested it, and it could be a bit of work to implement from scratch since I haven't looked into how the models are loaded into memory yet. The above should be enough to work out your own solution, though, without messing with torch.

@instant-high
Author

instant-high commented Jul 4, 2021

Ok.
Added at line #50 in util/videoswap.py:
torch.cuda.empty_cache()
This lets me process 99 frames before running out of memory...
I'll also try to free memory in the second for loop.

@ExponentialML

Ok.
Added at line #50 in util/videoswap.py:
torch.cuda.empty_cache()
This lets me process 99 frames before running out of memory...
I'll also try to free memory in the second for loop.

Great to hear. I would create a little wrapper function where you can tune your own parameters (max frame count) and plug it in at line 50, so that it executes torch.cuda.empty_cache() every nth frame.
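
A minimal sketch of such a wrapper (untested; the name maybe_empty_cache and the every_n parameter are just placeholders), meant to be called once per processed frame:

import torch

def maybe_empty_cache(frame_index, every_n=14):
    # clear the CUDA cache every nth processed frame
    if frame_index > 0 and frame_index % every_n == 0:
        torch.cuda.empty_cache()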

@instant-high
Author

Yes.
But why does it run out of memory after 99 frames even if I call "empty_cache" after each frame?
I cannot find anything that fills the cache; I've searched all the other scripts in SimSwap.
Btw.:
I don't know much about Python... just a beginner, after 30 years of coding in (Visual) Basic and a little C++.

@instant-high
Author

instant-high commented Jul 4, 2021

So I need a little bit of help.
I've inserted the following code:

for frame_index in tqdm(range(frame_count)):
    torch.cuda.empty_cache() 
    ret, frame = video.read()
    if frame_index == 98:
        print (frame_index)
        input("Press Enter to continue...")
        break

Then it begins to write video_file(1) containing the first 98 frames.

Is there a way to jump back into video_swap but continue with frame 99 for the next 98 frames?
Call video_swap again, or something like goto video_swap?
After the break it would just have to write video_file(2),
and so on...
I don't know if this would work, or how to do it...

EDIT:
Got it to work as written above, but when calling video_swap again (break after 10 frames) it runs out of memory immediately.
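
One possible way to resume from a given frame (not what this thread ends up using) is to seek the OpenCV capture before the loop; a minimal sketch, assuming video is the cv2.VideoCapture opened in videoswap.py:

import cv2

start_frame = 99
frame_count = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
video.set(cv2.CAP_PROP_POS_FRAMES, start_frame)  # jump the capture to frame 99
for frame_index in range(start_frame, frame_count):
    ret, frame = video.read()
    if not ret:
        break
    # ... process the frame as before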

@ExponentialML

The way torch.cuda.empty_cache() works is that it only frees the memory it's able to. Remember what I said about not being aware of how the models are loaded into memory in this project? This is what I was referring to. They may be instantiated in different parts of the script, so it may take a bit more work, but you can try what's below.

Also, you're running that torch call every frame, which isn't necessary and can lead to some issues. And you don't need to use user input to go to the next iteration; it probably runs out of memory because it's still executing in the background while waiting for your input. Try this instead (untested, as I'm away from my machine):

# Add these two lines above the for loop.
current_frame = 0
max_frame = 14

for frame_index in tqdm(range(frame_count)):
    ret, frame = video.read()
    if ret:
        # If ret returns True, increment the current_frame counter by 1.
        current_frame += 1
        # If the current frame count equals the max frame count, do something.
        if current_frame == max_frame:
            # Let's empty the cache.
            torch.cuda.empty_cache()
            # Reset the counter back to 0.
            current_frame = 0

@instant-high
Author

The user input is just for testing purposes after 98 frames.
Resetting the frame index to 0 would process the same part of the input video and overwrite the temporary image sequence...
I think I've found a way to process the whole input via a batch file and some additional parameters in test_video_swapsingle.py (start frame, end frame), without the need to split it into shorter parts.
But I have a daytime job now.....

@ExponentialML

ExponentialML commented Jul 5, 2021

Resetting the frame index to 0 would process the same part of the input video and overwrite the temporary image sequence...

Read my proposed code again, please. It's not about setting the frame index to 0; it's about creating a separate counter variable that is incremented inside the loop and, once it hits a certain limit (max_frame), is reset.

You said that your GPU runs out of memory every 15th frame or so. In theory, clearing your GPU cache via torch's methods (which may not work) or doing something else to relieve GPU resources every 15 frames would save you from all of the extra steps you've mentioned.

@instant-high
Author

instant-high commented Jul 5, 2021

Looks like I accidentally got it to work on 2 GB of GPU VRAM...

Not the way I initially planned... but it works.
No problem processing a test video of 23 sec / 1396 frames.

As soon as I have cleaned up the code I will post the changes I've made.
(test_video_swapsingle.py / videoswap.py / test_options.py)

EDIT:
I will write a simple GUI (VB6 :-)

@instant-high
Author

instant-high commented Jul 6, 2021

Here are the changes I made to run SimSwap on 2GB VRAM:

./options/test_options.py

self.parser.add_argument("--first_frame", dest="first_frame", type=int, default=0, help="Set frame to start from.")


./util/videoswap.py

...
from util.add_watermark import watermark_image
#frame_index = 0
first_frame = 0
...
def video_swap(first_frame, video_path, id_vetor, swap_model, detect_model, save_path, temp_results_dir='./temp_results', crop_size=224, no_simswaplogo=False):
...
for frame_index in tqdm(range(first_frame, frame_count)):
    torch.cuda.empty_cache()
    ret, frame = video.read()
    if frame_index == 1:
        break
...
video.release()
if frame_index > 1:
    image_filename_list = []
    path = os.path.join(temp_results_dir, '*.jpg')
    image_filenames = sorted(glob.glob(path))
    clips = ImageSequenceClip(image_filenames, fps=fps)


test_video_swapsingle.py

...
first_frame = 0
video_swap(first_frame, opt.video_path, latend_id, model, app,
           opt.output_path, temp_results_dir=opt.temp_path, no_simswaplogo=opt.no_simswaplogo)

first_frame = 2
video_swap(first_frame, opt.video_path, latend_id, model, app,
           opt.output_path, temp_results_dir=opt.temp_path, no_simswaplogo=opt.no_simswaplogo)


test_video_swapsingle.py calls video_swap in ./util/videoswap.py and processes the first 2 frames before the break;
then it calls video_swap again, starting at frame 2, and runs until the end of the input file. Tested so far on 2600 frames, but there seems to be no limit.
torch.cuda.empty_cache() clears the VRAM before processing every single frame...


Not perfect, but working for me...
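
A sketch of how those two hard-coded calls could be generalized into a chunked loop; this assumes a hypothetical extension of video_swap that also takes a last_frame argument (the "start frame, end frame" idea mentioned above), so it is untested and only meant as an illustration:

chunk_size = 98  # frames per video_swap call, tune this to your VRAM
frame_count = 1396  # total frames of the input video (hypothetical value)

for first_frame in range(0, frame_count, chunk_size):
    last_frame = min(first_frame + chunk_size, frame_count)
    # hypothetical signature: video_swap(first_frame, last_frame, ...)
    video_swap(first_frame, last_frame, opt.video_path, latend_id, model, app,
               opt.output_path, temp_results_dir=opt.temp_path,
               no_simswaplogo=opt.no_simswaplogo)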

@instant-high
Author

Just found a simpler solution for the "cuda out of memory" problem while running SimSwap on a 2GB GPU:

I only insert a with torch.no_grad(): statement in ../util/videoswap.py between lines 48 and 49
(and add 4 more spaces of indentation to every following line from 49 to 84)

and it works perfectly.
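
For illustration, the change amounts to wrapping the per-frame loop in an inference-only context so PyTorch keeps no gradients; a minimal sketch (the real loop body in videoswap.py is abbreviated here):

with torch.no_grad():
    for frame_index in tqdm(range(frame_count)):
        ret, frame = video.read()
        if ret:
            detect_results = detect_model.get(frame, crop_size)
            # ... rest of the existing swap code, indented one level deeper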

@NNNNAI
Collaborator

NNNNAI commented Jul 18, 2021

Just found a simpler solution for the "cuda out of memory" problem while running SimSwap on a 2GB GPU:

I only insert a with torch.no_grad(): statement in ../util/videoswap.py between lines 48 and 49
(and add 4 more spaces of indentation to every following line from 49 to 84)

and it works perfectly.

OMG, I forgot to add this. I will get it done in the next update.

@instant-high
Author

:-)
Came across it while making some more mods to first order motion model and co-part segmentation.
