Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions [arXiv Paper]

"Make him look like Vincent Van Gogh"

"He should be in "Zelda: Breath of the Wild""

"He should look 100 years old"

Usage example for image editing:

Installation

git clone https://github.com/timothybrooks/instruct-pix2pix
cd instruct-pix2pix
conda env create -f environment.yaml

Usage

conda activate ip2p
bash scripts/download_checkpoints.sh
python edit_cli.py --steps 100 --resolution 512 --seed 1371 --cfg-text 4.5 --cfg-image 1.2 --input imgs/example.jpg --output imgs/output.jpg --edit "turn him into a cyborg"
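In our pipeline, the input image is a single keyframe taken from the avatar's training set. A minimal sketch, assuming an INSTA scene such as data/obama (the frame path below is hypothetical; the flags are the ones used above):

conda activate ip2p
python edit_cli.py --steps 100 --resolution 512 --seed 1371 --cfg-text 4.5 --cfg-image 1.2 --input ../INSTA/data/obama/images/00000.png --output edited_keyframe.jpg --edit "Make him look like Vincent Van Gogh"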

Example-based Style transfer:

For example-based image synthesis, we recommend Ebsynth.exe for Windows: select keyframes -> select video -> run all. For our task, the keyframe is the edited portrait image, and the video frames are the training or rendering images.
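If a scriptable alternative to the GUI is preferred, the ebsynth repository also ships a command-line tool. A minimal sketch of propagating one edited keyframe to every frame (the directories, file names, and binary location below are assumptions):

# style = edited keyframe; guide pair = original keyframe -> each training/rendering frame
for f in frames/*.png; do
  ./bin/ebsynth -style edited_keyframe.jpg -guide original_keyframe.png "$f" -output edited/$(basename "$f")
done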

Usage example for video-to-avatar with INSTA:

Installation

git clone --recursive https://github.com/Zielon/INSTA.git
cd INSTA
cmake . -B build
cmake --build build --config RelWithDebInfo -j

After building the project you can either start training an avatar from scratch or load a snapshot. For training, we recommend a graphics card at least as capable as an RTX 3090 (24 GB) and 32 GB of RAM. Training on different hardware will probably require adjusting options in the config:

	"parent": "main.json",
	"max_steps": 30000,
	"max_cached_bvh": 4000,
	"max_images_gpu": 1700,
	"use_dataset_cache": true,
	"render_novel_trajectory": false,
	"render_from_snapshot": true

Usage

cd INSTA
## Training
./build/rta --config insta.json --scene data/obama --height 512 --width 512 --no-gui
## Loading from a checkpoint
./build/rta --config insta.json --scene data/obama --height 512 --width 512 --no-gui --snapshot data/obama/experiments/insta/debug/snapshot.msgpack

For training, set "render_from_snapshot": false. For rendering from a checkpoint, set "render_from_snapshot": true. For rendering novel views, set "render_novel_trajectory": true.

Instructions for our pipeline:

  1. Select one keyframe and execute image editing with InstructPix2Pix.

  2. Update the dataset images with Ebsynth, using the original/rendered images and the edited keyframe.

  3. Train the avatar and render portrait images, then return to step 2.

  Tips: A keyframe with an open mouth works better. Three iterations of steps 2 and 3 are sufficient. A rough sketch of one loop iteration is given below.
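The following is a rough command-line sketch of the loop above, assuming the tools from the previous sections are installed. The scene path, file names, and ebsynth binary location are placeholders, not part of any released script:

SCENE=../INSTA/data/obama
# Step 1: edit one (open-mouth) keyframe with InstructPix2Pix
python edit_cli.py --steps 100 --resolution 512 --seed 1371 --cfg-text 4.5 --cfg-image 1.2 --input $SCENE/images/keyframe.png --output edited_keyframe.png --edit "turn him into a cyborg"
for i in 1 2 3; do
  # Step 2: propagate the edited keyframe to every training/rendered image with ebsynth
  for f in $SCENE/images/*.png; do
    ./bin/ebsynth -style edited_keyframe.png -guide $SCENE/images/keyframe.png "$f" -output $SCENE/images_edited/$(basename "$f")
  done
  # Step 3: swap the edited images into the scene's training images, retrain the avatar,
  # and render; the renderings become the targets for the next pass of step 2
  ./build/rta --config insta.json --scene $SCENE --height 512 --width 512 --no-gui
done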

Dataset

We are releasing part of our training dataset and checkpoints. We use the avatars from INSTA.

Some checkpoints: Edited Avatars.

For dataset generation from original videos, we direct the user to INSTA and the Metrical Photometric Tracker.

BibTeX

@misc{li2023instructvideo2avatar,
      title={Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions}, 
      author={Shaoxu Li},
      year={2023},
      eprint={2306.02903},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgment

InstructPix2Pix: Learning to Follow Image Editing Instructions. (https://github.com/timothybrooks/instruct-pix2pix)

ebsynth: Fast Example-based Image Synthesis and Style Transfer. (https://github.com/jamriska/ebsynth)

INSTA - Instant Volumetric Head Avatars. (https://github.com/Zielon/INSTA/tree/master)
