Cusuh Ham*,
Gemma Canet Tarrés*,
Tu Bui,
James Hays,
Zhe Lin,
John Collomosse
* equal contribution
arXiv | BibTeX | Project Page
Create and activate a Conda or Miniconda environment named cogs with all the necessary dependencies using the following commands:
conda env create -f environment.yaml
conda activate cogs
Download the Pseudosketches dataset. Train/val input file lists are provided under data/pseudosketches, where each line contains a tuple of corresponding inputs:
synset_id/sketch.JPEG,synset_id/style.JPEG,synset_id/image.JPEG
The sketch and image are a corresponding pair from the Pseudosketches dataset, while the style image is selected using a pre-trained ALADIN model to find the image most stylistically similar to the original image.
Train a VQGAN on Pseudosketches with:
python main.py --base configs/pseudosketches_vqgan.yaml -t True --gpus 0,
Then, adjust the checkpoint path under model.params.sketch_encoder_config.params.ckpt_path in configs/pseudosketches_cogs_transformer.yaml. Alternatively, you can download our pre-trained Pseudosketches VQGAN checkpoint and place it under checkpoints/, which corresponds to the default checkpoint path in the config file.
For training a VQGAN on ImageNet, refer to the taming-transformers repository. Alternatively, use their pre-trained ImageNet VQGAN checkpoint and place it under checkpoints/ (the default checkpoint path in the config expects the file to be renamed to imagenet_vqgan_last.ckpt). If you trained your own, adjust the checkpoint path under model.params.image_encoder_config.params.ckpt_path in configs/pseudosketches_cogs_transformer.yaml. We use the same ImageNet VQGAN to encode both the style and ground truth images.
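For reference, these two checkpoint paths nest inside configs/pseudosketches_cogs_transformer.yaml roughly as in the sketch below. The nesting is transcribed directly from the dotted key paths above; the Pseudosketches checkpoint filename is a hypothetical placeholder, and the other keys of the real config are omitted.

# Illustrative excerpt only -- verify against the shipped config file.
model:
  params:
    sketch_encoder_config:
      params:
        ckpt_path: checkpoints/pseudosketches_vqgan.ckpt  # your trained or downloaded Pseudosketches VQGAN (placeholder name)
    image_encoder_config:
      params:
        ckpt_path: checkpoints/imagenet_vqgan_last.ckpt   # ImageNet VQGAN from taming-transformers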
First, download the pre-trained ALADIN checkpoints (aladin_vgg.pt and resnet_model.pt) and place them under cogs/modules/style/. These checkpoints are required to compute the style loss during training.
To train the transformer stage of CoGS, run:
python main.py --base configs/pseudosketches_cogs_transformer.yaml -t True --gpus 0,
We also provide a pre-trained CoGS transformer checkpoint.
Adjust the checkpoint paths under model.params.cogs_transformer_config.params.ckpt_path, model.params.cogs_transformer_config.image_encoder_config.params.ckpt_path, and model.params.cogs_transformer_config.sketch_encoder_config.params.ckpt_path in configs/pseudosketches_cogs_vae.yaml.
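The three paths nest inside configs/pseudosketches_cogs_vae.yaml roughly as sketched below; the nesting simply transcribes the dotted key paths above and the checkpoint filenames are hypothetical placeholders, so verify against the actual config file.

# Illustrative excerpt only -- nesting transcribed from the dotted paths above.
model:
  params:
    cogs_transformer_config:
      params:
        ckpt_path: checkpoints/cogs_transformer.ckpt        # trained or downloaded CoGS transformer (placeholder name)
      image_encoder_config:
        params:
          ckpt_path: checkpoints/imagenet_vqgan_last.ckpt   # ImageNet VQGAN
      sketch_encoder_config:
        params:
          ckpt_path: checkpoints/pseudosketches_vqgan.ckpt  # Pseudosketches VQGAN (placeholder name)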
Then, to train the VAE stage of CoGS, run:
python main.py --base configs/pseudosketches_cogs_vae.yaml -t True --gpus 0,
In our experiments, each VAE is trained on a single class. We provide pre-trained VAE checkpoints for multiple classes.
After training a transformer or downloading a pre-trained model, you can sample images by running:
python scripts/sample_cogs_transformer.py --resume <path/to/ckpt/and/config> --out_dir <path/to/output/directory>
To include the optional VAE step at the end of the pipeline, sample images by running:
python main.py --base configs/pseudosketches_cogs_vae.yaml -t False --gpus 0,
Adjust the checkpoint path under model.params.params.ckpt_path to point to the pre-trained VAE, and set the name of the output folder under model.params.params.output_name in configs/pseudosketches_cogs_vae_test.yaml.
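In configs/pseudosketches_cogs_vae_test.yaml, those two keys nest roughly as sketched below; the checkpoint filename and output folder name are hypothetical placeholders.

# Illustrative excerpt only -- verify against the shipped config file.
model:
  params:
    params:
      ckpt_path: checkpoints/cogs_vae.ckpt  # pre-trained VAE checkpoint (placeholder name)
      output_name: cogs_vae_samples         # output folder name (placeholder)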
Our code is developed from the taming-transformers repository.
@inproceedings{ham2022cogs,
title={CoGS: Controllable Generation and Search from Sketch and Style},
author={Ham, Cusuh and Tarres, Gemma Canet and Bui, Tu and Hays, James and Lin, Zhe and Collomosse, John},
booktitle={European Conference on Computer Vision},
year={2022}
}