
Anime Transfer Variational Autoencoder

Image Transferring by way of a Variational Autoencoder

Demo

The demo can be found on HuggingFace here

Model Download

Here is the link to the TensorFlow model: https://drive.google.com/file/d/1RXZmK8eS1m2d3ZEje0e2me5uYccirFuf/view?usp=sharing

To run an image-transfer demo, download the model and place it in your local copy of the repository. Then install the requirements:

Please note that the demo only supports Python 3.9 (and possibly earlier versions).

pip install -r requirements.txt

Then, run the streamlit demo:

streamlit run demo.py

Introduction

Image filters have existed for quite a while. Applications like Snapchat use them to superimpose items on users, change the coloring or lighting of an image, or even transfer the style of a person's face. These applications generally rely on Generative Adversarial Networks (GANs). The GAN approach typically trains a model to gradually modify an input image until it resembles images from an anime. While this method accomplishes the basic task, it does not truly generate an anime version of a realistic image.

Our proposed method uses the Transfer Variational Autoencoder (trVAE) introduced in a transfer-learning research paper (Lotfollahi et al.). This model transfers images in a generative fashion and can easily be supplied labels as transfer instructions. Furthermore, we hope to show how novel methods can be applied to unique problems in general.

To allow others to actually make use of our model, we plan to create a web application for image transferring. It will provide image and label selection functionality and should return results within a couple of seconds. On this site, we also plan to document the model architecture along with our data, model parameters, and, of course, the original paper and repository from which we obtained the model.

Materials and Methods

Autoencoders simply take data, encode it, and then decode it. While this seems like a pointless exercise at first, the main objective of an autoencoder is to train the encoder to effectively reduce the dimensionality of the data. This reduced representation ideally retains all of the information in the original data while being cheaper to train on and easier to manipulate.
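As a rough sketch (not the exact model in this repository; the image shape and layer sizes are assumptions for illustration), a minimal convolutional autoencoder in TensorFlow/Keras could look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal convolutional autoencoder sketch; sizes are illustrative only.
def build_autoencoder(image_shape=(64, 64, 3), latent_dim=128):
    # Encoder: compress the image into a low-dimensional latent vector.
    inputs = tf.keras.Input(shape=image_shape)
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inputs)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Flatten()(x)
    latent = layers.Dense(latent_dim, name="latent")(x)

    # Decoder: reconstruct the image from the latent vector.
    h, w = image_shape[0] // 4, image_shape[1] // 4
    x = layers.Dense(h * w * 64, activation="relu")(latent)
    x = layers.Reshape((h, w, 64))(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs, name="autoencoder")
    model.compile(optimizer="adam", loss="mse")  # trained to reconstruct its own input
    return model
```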

The Transfer Variational Autoencoder takes advantage of this feature of encoded data with the following method:

  1. Data is encoded
  2. Encoded data is manipulated
  3. Data is decoded
[Figure: trVAE architecture diagram]

As the diagram depicts, the encoded data is manipulated through a labeling system. During training, the encoded data is concatenated with its initial (source) label before decoding; at transfer time, it is instead combined with a second (target) label. Because the encoder learns that a label will always be supplied later, it learns to strip the class information from its representation. Once the data has been stripped of its class, it is primed to accept any second label, effectively transferring data from one class to any other.

[Figure: summary of how the trVAE trains]
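To make the label-concatenation idea concrete, below is a minimal sketch of a conditional encoder/decoder pair in TensorFlow/Keras. This is not the code from this repository: the layer sizes, the two-class label, and the omission of the variational sampling and regularization terms used by the real trVAE are all simplifications for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 2  # e.g. "realistic" vs. "anime" (assumed for illustration)

# Encoder: sees the image together with its source label.
image_in = tf.keras.Input(shape=(64, 64, 3))
src_label = tf.keras.Input(shape=(NUM_CLASSES,))      # one-hot source label
x = layers.Flatten()(image_in)
x = layers.Concatenate()([x, src_label])              # condition the encoder on the source class
x = layers.Dense(512, activation="relu")(x)
z = layers.Dense(128, name="latent")(x)               # latent code, ideally stripped of class info

# Decoder: receives the latent code plus a (possibly different) target label.
tgt_label = tf.keras.Input(shape=(NUM_CLASSES,))      # one-hot target label
d = layers.Concatenate()([z, tgt_label])              # condition the decoder on the target class
d = layers.Dense(512, activation="relu")(d)
d = layers.Dense(64 * 64 * 3, activation="sigmoid")(d)
image_out = layers.Reshape((64, 64, 3))(d)

trvae_sketch = tf.keras.Model([image_in, src_label, tgt_label], image_out)
# Training reconstructs with src_label == tgt_label; transferring swaps in a new target label.
```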

Results

| Original | Transferred |
| --- | --- |
| Portrait image of Billie Eilish | Transferred anime version of Billie Eilish |
| Portrait image of Tom Cruise | Transferred anime version of Tom Cruise |
| Portrait image of Hinata Hyuga from Naruto | Transferred realistic version of Hinata |
| Portrait image of Shikamaru Nara from Naruto | Transferred realistic version of Shikamaru |

Areas of Improvement

  • Enable whole-body transfers, not just faces
  • Diversify training datasets for more robust transfers
  • Host the model somewhere for website deployment
  • Add more classes of image transferring

Acknowledgments
