yudianzheng/SketchVideo


Sketch Video Synthesis

Yudian Zheng · Xiaodong Cun · Menghan Xia · Chi-Man Pun

🗺 Showcases


💡 Abstract

Understanding semantic intricacies and high-level concepts is essential in image sketch generation, and this challenge becomes even more formidable when applied to the domain of videos. To address this, we propose a novel optimization-based framework for sketching videos represented by the frame-wise Bézier Curves. In detail, we first propose a cross-frame stroke initialization approach to warm up the location and the width of each curve. Then, we optimize the locations of these curves by utilizing a semantic loss based on CLIP features and a newly designed consistency loss using the self-decomposed 2D atlas network. Built upon these design elements, the resulting sketch video showcases impressive visual abstraction and temporal coherence. Furthermore, by transforming a video into SVG lines through the sketching process, our method unlocks applications in sketch-based video editing and video doodling, enabled through video composition, as exemplified in the teaser.
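As a concrete illustration of the frame-wise Bézier representation described above, the sketch below evaluates a cubic Bézier stroke at a parameter t. This is a minimal, self-contained sketch; the function name and control-point layout are illustrative, not the repository's actual API (the real pipeline renders and differentiates strokes via diffvg):

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    Each stroke in a sketched frame is a curve of this form; the
    optimization moves control points (and stroke widths) so the
    rendered sketch matches the video semantically and temporally.
    """
    s = 1.0 - t
    x = s**3 * p0[0] + 3 * s**2 * t * p1[0] + 3 * s * t**2 * p2[0] + t**3 * p3[0]
    y = s**3 * p0[1] + 3 * s**2 * t * p1[1] + 3 * s * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)
```

Sampling this function at many values of t traces out one stroke; a sketch frame is a set of such strokes, and the optimization adjusts p0..p3 per curve under the CLIP semantic loss and the atlas-based consistency loss.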


🚩 Getting Started

If you only want to optimize the provided example, run steps (1.2) and (5).

(1) Build the environment:

Full training requires the Layered Neural Atlases (NLA) and diffvg projects.

# install everything and train from scratch
sh scripts/install.sh
# (1.1) install NLA
sh scripts/install_atlas.sh
# (1.2) install diffvg and CLIP (to optimize the example models only)
sh scripts/install_clipavideo.sh
(2) Download the DAVIS dataset, or use your own data (fewer than 70 frames; extract the masks as well), and put it in the data folder:
wget https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-Unsupervised-trainval-Full-Resolution.zip
 
unzip DAVIS-2017-Unsupervised-trainval-Full-Resolution.zip

Or use the example data (car-turn) and extract the masks.
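Since the pipeline expects clips of fewer than 70 frames, a quick sanity check like the following can verify a frame folder before processing. This is a hypothetical helper for illustration, not part of the repository's scripts:

```python
import os

def check_frame_count(folder, limit=70):
    """Count image frames in a folder and report whether they fit the limit.

    Returns (frame_count, within_limit). Only common image extensions
    are counted, so mask files in other formats elsewhere are ignored.
    """
    frames = [f for f in os.listdir(folder)
              if f.lower().endswith((".jpg", ".jpeg", ".png"))]
    return len(frames), len(frames) < limit
```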

(3) Process/crop the data:
sh scripts/process_dataset.sh
(4) Build the atlas:
sh scripts/operate_atlas.sh <video_name>

The trained models should be located at 'data/dataset/<video_name>/results/<epoch_num>' and 'data/dataset/<video_name>/results/checkpoint'.
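The result layout above can be expressed as a small path helper. This is purely illustrative (the function name is invented); the directory names simply follow the paths stated above:

```python
import os

def atlas_result_paths(video_name, epoch_num):
    """Return the per-epoch results dir and checkpoint dir for a trained atlas."""
    base = os.path.join("data", "dataset", video_name, "results")
    return os.path.join(base, str(epoch_num)), os.path.join(base, "checkpoint")
```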

(5) Run our method with the trained models (e.g., mallard-water, scooter-gray, and soapbox):
sh scripts/operate_clipavideo.sh <video_name>

See arguments.txt for the full list of arguments.

Citation

@misc{zheng2023sketch,
      title={Sketch Video Synthesis}, 
      author={Yudian Zheng and Xiaodong Cun and Menghan Xia and Chi-Man Pun},
      year={2023},
      eprint={2311.15306},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

The code borrows heavily from CLIPasso and CLIPascene; thanks for their wonderful work!