Commit

update readme
zhengzangw committed Mar 19, 2024
1 parent ad1ab2b commit 5fd3d6c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -46,7 +46,7 @@ Videos are downsampled to `.gif` for display. Click for original videos. Prompts
* 📍 Open-Sora-v1 released. Model weights are available [here](#model-weights). With only 400K video clips and 200 H800 days (compared with 152M samples in Stable Video Diffusion), we are able to generate 2s 512×512 videos.
* ✅ Three-stage training from an image diffusion model to a video diffusion model. We provide the weights for each stage.
* ✅ Support training acceleration including accelerated transformer, faster T5 and VAE, and sequence parallelism. Open-Sora improves training speed by **55%** when training on 64x512x512 videos. Details are in [acceleration.md](docs/acceleration.md).
- * ✅ We provide video cutting and captioning tools for data preprocessing. Instructions can be found [here](tools/data/README.md) and our data collection plan can be found at [datasets.md](docs/datasets.md).
+ * ✅ We provide a data preprocessing pipeline, including [downloading](/tools/datasets/README.md), [video cutting](/tools/scenedetect/README.md), and [captioning](/tools/caption/README.md) tools. Our data collection plan can be found at [datasets.md](docs/datasets.md).
* ✅ We find that the VQ-VAE from [VideoGPT](https://wilson1yan.github.io/videogpt/index.html) has low quality and thus adopt a better VAE from [Stability-AI](https://huggingface.co/stabilityai/sd-vae-ft-mse-original). We also find that patching in the time dimension degrades quality. See our **[report](docs/report_v1.md)** for more discussion.
* ✅ We investigate different architectures including DiT, Latte, and our proposed STDiT. Our **STDiT** achieves a better trade-off between quality and speed. See our **[report](docs/report_v1.md)** for more discussions.
* ✅ Support CLIP and T5 text conditioning.