This is a cover image generation system based on neural networks. Here are some images we generated:
This is a sample of the input layout scene for the first image; the title text is "Wind".
You can run the model with the following command:
python scripts/gui/simple-server.py --checkpoint models/YOUR_MODEL_CHECKPOINT
We have uploaded a pretrained weight file and processed appearance features. You can download them and try the model; all of the files should be placed in the same folder. https://drive.google.com/drive/u/1/folders/1m4y_AS0eA3duAi1GmIrK81Wvjr3sHmcg
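As a minimal sketch of the setup, the downloaded files could be arranged as below before launching the GUI server. The folder name `models/` and the file names in the comments are assumptions; use the actual names from the Drive folder.

```shell
# Create a folder to hold the downloaded files (name is an assumption).
mkdir -p models
# After downloading from the Google Drive link above, keep everything together, e.g.:
#   models/
#     YOUR_MODEL_CHECKPOINT   # pretrained weights
#     appearance features file(s) processed by the authors
# Then start the GUI server against the checkpoint:
#   python scripts/gui/simple-server.py --checkpoint models/YOUR_MODEL_CHECKPOINT
```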
(Before training, you should download the COCO images and annotation files, and put them into datasets/coco/images/ and datasets/coco/images/annotations/.)
- Train the network
python train.py
- Encode the appearance of the objects
python scripts/encode_features.py --checkpoint models/TRAINED_MODEL_CHECKPOINT
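The training steps above can be sketched as a single shell sequence. The directory paths follow this README; `models/TRAINED_MODEL_CHECKPOINT` is a placeholder for the checkpoint produced by training.

```shell
# Prepare the dataset layout expected by the README.
mkdir -p datasets/coco/images/annotations
# 1. Download the COCO images into datasets/coco/images/ and the
#    annotation files into datasets/coco/images/annotations/.
# 2. Train the network:
#      python train.py
# 3. Encode the appearance of the objects from the trained checkpoint:
#      python scripts/encode_features.py --checkpoint models/TRAINED_MODEL_CHECKPOINT
```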
We built on the work of scene_generation (https://arxiv.org/abs/1909.05379, ICCV 2019). We modified their model mainly by adding an extra cover image discriminator, and we used SRNet to generate text images with better quality. The pretrained SRNet is from https://github.com/Niwhskal/SRNet.
We used the COCO-Stuff 2017 dataset (https://cocodataset.org) for training. In order to generate the solid regions (background regions with simple colors) of a book cover, we processed the COCO images with real cover images. Both the cover images used in this processing and those used by the cover image discriminator are from https://github.com/uchidalab/book-dataset.
Our system is aimed only at generating cover images, but the model can be applied to many other image generation tasks. (If you do not need to generate text or solid regions, the original model is a better choice.) Here we show some Pokémon (Pikachu) images generated by the system.
The project is described in detail in our paper: https://arxiv.org/abs/2105.11088