[demo] RuntimeError: std::bad_alloc #165

Closed
Nntsyeo opened this issue Jun 23, 2023 · 6 comments

Nntsyeo commented Jun 23, 2023

The example code is pulled from here.
Error:

(otter) user@env:~/Otter$ python test-model.py                             
                                                                                    
Using pad_token, but it is not set yet.                                            
Loading checkpoint shards: 100%|██████████████████████████████████████████████████| 4/4 [00:35<00:00,  8.92s/it]
Enter prompts (comma-separated): what are they doing?                              

Prompt: what are they doing?                                                       
Traceback (most recent call last):                                                 
  File "/home/user/Otter/test-model.py", line 141, in <module>                                                                                                        
    response = get_response(frames_list, prompt, model, image_processor)                                                                                              
  File "/home/user/Otter/test-model.py", line 98, in get_response                                                                                                     
    generated_text = model.generate(                                               
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context                                        
    return func(*args, **kwargs)                                                   
  File "/home/user/Otter/otter/modeling_otter.py", line 873, in generate                                                                                              
    self._encode_vision_x(vision_x=vision_x)                                       
  File "/home/user/Otter/otter/modeling_otter.py", line 831, in _encode_vision_x                                                                                      
    vision_x = self.vision_encoder(vision_x)[0][:, 1:, :]                                                                                                             
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl                                             
    return forward_call(*args, **kwargs)                                           
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 940, in forward                                  
    return self.vision_model(                                                      
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl                                             
    return forward_call(*args, **kwargs)                                           
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 865, in forward                                  
    hidden_states = self.embeddings(pixel_values)                                  
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl                                             
    return forward_call(*args, **kwargs)                                           
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 195, in forward                                  
    patch_embeds = self.patch_embedding(pixel_values)  # shape = [*, width, grid, grid]                                                                               
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl                                             
    return forward_call(*args, **kwargs)                                           
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 463, in forward                                                   
    return self._conv_forward(input, self.weight, self.bias)                                                                                                          
  File "/home/user/miniconda3/envs/otter/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward                                             
    return F.conv2d(input, weight, bias, self.stride,                              
RuntimeError: std::bad_alloc

I'm not sure if this is correct, but I've used it as stated in the example (main):

    model = OtterForConditionalGeneration.from_pretrained(
        "luodian/otter-9b-dc-hf",
    )
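For readers hitting the same allocation error, here is a minimal sketch of a more memory-conservative load. It assumes OtterForConditionalGeneration accepts the standard Hugging Face from_pretrained keyword arguments (torch_dtype, device_map, which needs accelerate installed) and that the import path matches the file layout shown in the traceback; whether this avoids the error depends on the memory actually available.

    import torch
    from otter.modeling_otter import OtterForConditionalGeneration  # path assumed from the traceback

    # Load the weights in bfloat16 and let accelerate place layers across the
    # available GPU(s)/CPU. This roughly halves the footprint of a default
    # float32 load; it is an illustrative sketch, not the official demo code.
    model = OtterForConditionalGeneration.from_pretrained(
        "luodian/otter-9b-dc-hf",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )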

My packages are:

(otter) user@env:~/Otter$ pip list | grep -e torch -e xformers
open-clip-torch          2.20.0
torch                    2.0.1
torchaudio               2.0.2
torchvision              0.15.2

Originally posted by @Nntsyeo in #147 (comment)

ZhangYuanhan-AI self-assigned this Jun 23, 2023
ZhangYuanhan-AI (Collaborator) commented:

#147


Nntsyeo commented Jun 24, 2023

I may have found the cause of my problem. The GPU I'm using (1x RTX 4090, 24GB) isn't enough to run the model with video input. It works fine when I use an image instead.

nvcc version: 12.1.105
cuda version: 12.1
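In case it helps others confirm whether memory is the bottleneck, a small diagnostic sketch (plain PyTorch, not part of the Otter demo) that prints free GPU memory before generation:

    import torch

    # Report free vs. total memory on each visible GPU; run this right before
    # model.generate to see how much headroom is left for the video frames.
    for i in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(i)
        print(f"cuda:{i}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")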

ZhangYuanhan-AI (Collaborator) commented:

Great.

Just to be specific: do you mean a single RTX 4090 (24GB) cannot handle 128 frames?

king159 added the area:demo (code of demo) label Jun 25, 2023

Nntsyeo commented Jun 26, 2023

I have tested a few more times with different video lengths. Videos of 10-20 seconds work fine, but my first few attempts with a 1-minute video failed with the error above.

Is this 128-frame limit a parameter of the model? Changing its value in the config.json in my cache folder doesn't seem to have any effect.

Nntsyeo closed this as completed Jun 26, 2023

Luodian commented Jun 26, 2023

We can run videos of more than 3 minutes on our machines (dual 3090 and A100). The max_num_frames in config.json means the cross-attentions are built across 128 frames during training. You can uniformly extract your videos into 16, 32, ..., 128 frames as you wish.

Please refer to this to see how we organize the input for the video demo, and consider giving it a ❤️~
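As an illustration of the uniform extraction mentioned above, here is a minimal sketch using OpenCV. The helper name and the choice of cv2 are assumptions for this example; the actual demo linked above may organize frames differently.

    import cv2
    import numpy as np

    def sample_frames(video_path, num_frames=16):
        """Uniformly sample num_frames RGB frames from a video (illustrative sketch)."""
        cap = cv2.VideoCapture(video_path)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        indices = np.linspace(0, max(total - 1, 0), num_frames, dtype=int)
        frames = []
        for idx in indices:
            cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
            ok, frame = cap.read()
            if ok:
                frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        cap.release()
        return frames  # e.g. feed these to the image_processor used in test-model.py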

ZhangYuanhan-AI (Collaborator) commented:

> I have tested a few more times with different video lengths. Videos of 10-20 seconds work fine, but my first few attempts with a 1-minute video failed with the error above.
>
> Is this 128-frame limit a parameter of the model? Changing its value in the config.json in my cache folder doesn't seem to have any effect.

Generally, 128 is the upper bound on the number of video frames.

This issue was closed.