VIDGEN-1M

VIDGEN-1M: A LARGE-SCALE DATASET FOR TEXT-TO-VIDEO GENERATION

Introduction

We present VidGen-1M, a training dataset for text-to-video models. It is produced through a coarse-to-fine curation strategy that yields high-quality videos and detailed captions with excellent temporal consistency. We also trained a video generation model on this data and have open-sourced it.

News

(🔥 New) August 16, 2024: The VidGen-1M dataset has been released and can be downloaded here. Because the dataset is large, uploading is still in progress, so a download may currently contain less than the full 1M; check back later to get the complete data. Thank you for your attention.

Install

Clone this repository.
Install the required packages:

conda create -n vidgen python=3.10
conda activate vidgen

pip install torch==2.2.2 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tqdm einops omegaconf bigmodelvis deepspeed tensorboard timm==0.9.16 ninja opencv-python opencv-python-headless ftfy bs4 beartype colossalai accelerate ultralytics webdataset

pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
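After running the commands above, it can help to confirm that the environment is usable before downloading any data. The following is a minimal sketch, not part of the repository: the function names `find_missing` and `check_environment` are hypothetical, and the module list is only a subset of the packages installed above (note that `opencv-python` is imported as `cv2`).

```python
import importlib.util
import sys

# Import names for a subset of the packages installed above (an assumption:
# extend or trim this list to match what you actually need).
REQUIRED_MODULES = [
    "torch", "torchvision", "xformers", "deepspeed",
    "einops", "omegaconf", "timm", "cv2", "webdataset",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def check_environment(modules=REQUIRED_MODULES):
    """Summarize the Python version, missing modules, and CUDA availability."""
    report = {
        "python": sys.version_info[:2],
        "missing": find_missing(modules),
        "cuda_available": None,  # stays None when torch is not installed
    }
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the check also runs without torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    print(check_environment())
```

An empty `missing` list and `cuda_available` set to `True` suggest the install succeeded; the check only locates modules and does not verify version compatibility between them.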

VidGen-1M Datasets

To support community research on video generation, we have publicly released the high-quality VidGen-1M video data.

Model Weights

Please download the model weights from Hugging Face.

Sampling

You can run inference on a single GPU or on multiple GPUs. The sampling script accepts various arguments.

bash scripts/sample_t2v.sh
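One common way to choose which GPUs a run may use is the standard `CUDA_VISIBLE_DEVICES` environment variable. The sketch below wraps the command above under that assumption; the helper names `build_env` and `run_sampling` are hypothetical, and the script's own arguments are not covered here, so how it distributes work across the visible GPUs depends on the script itself.

```python
import os
import subprocess

# Path taken from the command above, relative to the repository root.
SCRIPT = "scripts/sample_t2v.sh"


def build_env(gpu_ids, base_env=None):
    """Copy the environment and restrict visible GPUs via CUDA_VISIBLE_DEVICES."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_sampling(gpu_ids, repo_root="."):
    """Run the sampling script with only the given GPU indices visible.

    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    return subprocess.run(
        ["bash", SCRIPT],
        cwd=repo_root,
        env=build_env(gpu_ids),
        check=True,
    )
```

Calling `run_sampling([0])` from the repository root would expose only the first GPU, while `run_sampling([0, 1])` would expose two. This only limits device visibility; it does not by itself launch multiple processes.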

Citation

@article{tan2024vdgen-1m,
  title={VIDGEN-1M: A LARGE-SCALE DATASET FOR TEXT-TO-VIDEO GENERATION},
  author={Tan, Zhiyu and Yang, Xiaomeng and Qin, Luozheng and Li, Hao},
  journal={arXiv preprint arXiv:2408.02629},
  year={2024},
  institution={Fudan University and Shanghai Academy of AI for Science},
}
