MiniGPT-4 for video

Currently, this is a simple extension of MiniGPT-4 without extra training. We try to explore its ability for video understanding with simple prompt design.

🔥 Updates

2023/04/19: Simple extension release. We simply encode 4 frames and use a time-sensitive prompt as follows:
```
    "First, <Img><ImageHere></Img>. Then, <Img><ImageHere></Img>. "
    "After that, <Img><ImageHere></Img>. Finally, <Img><ImageHere></Img>. "
```
However, without video-text instruction finetuning, it is difficult for the model to answer questions about time.
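
The prompt above can be produced programmatically. The following is a minimal sketch, not the code used in `demo_video.py`: the helper names `sample_frame_indices` and `build_time_prompt` are hypothetical, and picking the middle frame of each equal-length segment is an assumption, since this README only states that 4 frames are encoded.

```python
from typing import List

# Image placeholder copied from the prompt shown above.
IMG_SLOT = "<Img><ImageHere></Img>"


def sample_frame_indices(num_frames: int, num_segments: int = 4) -> List[int]:
    """Pick the middle frame of each of `num_segments` equal segments.

    Assumption: uniform segment sampling; the README does not say how
    the 4 frames are chosen.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    seg = num_frames / num_segments
    return [min(num_frames - 1, int(seg * i + seg / 2)) for i in range(num_segments)]


def build_time_prompt(num_segments: int = 4) -> str:
    """Build a time-ordered prompt with one image slot per sampled frame."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if num_segments == 1:
        return f"First, {IMG_SLOT}. "
    parts = [f"First, {IMG_SLOT}. "]
    # Middle frames: "Then" for the second frame, "After that" for the rest.
    for i in range(1, num_segments - 1):
        lead = "Then" if i == 1 else "After that"
        parts.append(f"{lead}, {IMG_SLOT}. ")
    parts.append(f"Finally, {IMG_SLOT}. ")
    return "".join(parts)


if __name__ == "__main__":
    print(sample_frame_indices(100))
    print(build_time_prompt(4))
```

With the default of 4 segments, `build_time_prompt` returns the same string as the two literals above joined together.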

💬 Example

🏃 Usage

Please follow the instructions in MiniGPT-4 to set up the project.

Prepare the environment:

    conda env create -f environment.yml
    conda activate minigpt4

Download BLIP2 model:
- ViT: wget https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/eva_vit_g.pth
- QFormer: wget https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained_flant5xxl.pth
- Change the vit_model_path and q_former_model_path in minigpt4.yaml.

Download Vicuna model:

LLaMA: Download it from the original repo or Hugging Face.
If you download LLaMA from the original repo, convert the weights to the Hugging Face format with the following command:

    # convert_llama_weights_to_hf is copied from transformers
    # --model_size must match the base model used below (13B for Vicuna-13b)
    python src/transformers/models/llama/convert_llama_weights_to_hf.py \
        --input_dir /path/to/downloaded/llama/weights \
        --model_size 13B --output_dir /output/path

Download Vicuna-13b-delta-v0 and process it:

    # fastchat v0.1.10
    python3 -m fastchat.model.apply_delta \
        --base /path/to/llama-13b \
        --target /output/path/to/vicuna-13b \
        --delta lmsys/vicuna-13b-delta-v0

Change the llama_model in minigpt4.yaml.

Download MiniGPT-4 model:
- Linear layer can be downloaded here.
- Change the ckpt in minigpt4_eval.yaml.
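
Before running the demo, it can help to confirm that every path set in the YAML files exists. The following is a minimal sketch under stated assumptions: `find_missing` is a hypothetical helper, the paths are placeholders to replace with your own values, and the `ckpt` file name is made up for illustration. The dictionary keys reuse the config names mentioned above.

```python
import os
from typing import Dict, List


def find_missing(paths: Dict[str, str]) -> List[str]:
    """Return the config keys whose path does not exist on disk."""
    return [key for key, path in paths.items() if not os.path.exists(path)]


if __name__ == "__main__":
    # Placeholder paths: replace with the values set in minigpt4.yaml
    # and minigpt4_eval.yaml. The ckpt file name is hypothetical.
    configured = {
        "vit_model_path": "/path/to/eva_vit_g.pth",
        "q_former_model_path": "/path/to/blip2_pretrained_flant5xxl.pth",
        "llama_model": "/output/path/to/vicuna-13b",
        "ckpt": "/path/to/minigpt4_linear_layer.pth",
    }
    for key in find_missing(configured):
        print(f"missing: {key} -> {configured[key]}")
```

The script only checks that the paths exist; it does not validate the contents of the checkpoints.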

Running demo:

    python demo_video.py --cfg-path eval_configs/minigpt4_eval.yaml

Acknowledgement

This project is mainly based on MiniGPT-4, which is supported by LAVIS, Vicuna, and BLIP2. Thanks to these amazing projects!
