Fast-fomm-mobile

The code was written by Arman Tsaturyan, Nikita Mokrov, Ilya Selnitskiy and Ilya Zakharkin.

About

The purpose of the project was to compress the First-Order Motion Model for the conditional image generation task, enabling its real-time inference on mobile devices. Inspired by three novel works: First-Order Motion Model (FOMM), GAN Compression, and StyleGAN2 Distillation, we came up with an approach that we call 2pix2pix. The main idea is to gather a distilled dataset based on FOMM predictions, then train a pix2pix generator that is fed two images, "source" and "driving", and predicts the "target" image. The loss is computed against the original FOMM predictions, so the aim of 2pix2pix is to match FOMM's output as closely as possible given exactly the same input. We also benchmark all the models used: FOMM, the original pix2pix, and the compressed pix2pix, on CPU, GPU, and mobile processors.
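
Below is a minimal sketch of that distillation loop, assuming a hypothetical `Pix2PixStudent` generator and a precomputed FOMM prediction as the training target (the names and the single L1 loss are illustrative, not the repo's exact setup):

```python
import torch
import torch.nn as nn

# Hypothetical student: a pix2pix-style generator that takes the
# "source" and "driving" frames concatenated along the channel axis.
class Pix2PixStudent(nn.Module):
    def __init__(self):
        super().__init__()
        # A real generator would be a U-Net / ResNet encoder-decoder;
        # a single conv keeps this sketch self-contained.
        self.net = nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, source, driving):
        return self.net(torch.cat([source, driving], dim=1))

student = Pix2PixStudent()
optimizer = torch.optim.Adam(student.parameters(), lr=2e-4)
criterion = nn.L1Loss()  # the full pipeline may also use perceptual/GAN losses

def distillation_step(source, driving, fomm_prediction):
    """One training step: push the student toward FOMM's output."""
    optimizer.zero_grad()
    fake = student(source, driving)
    # The target is FOMM's prediction rather than ground truth,
    # so the student learns to imitate the frozen teacher.
    loss = criterion(fake, fomm_prediction)
    loss.backward()
    optimizer.step()
    return loss.item()
```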

Video

Please see the video of the project presentation for more details.

Prerequisites

  • Linux or macOS
  • Python 3
  • CPU or NVIDIA GPU + CUDA

Description

First Order Motion Model

This is a fork of the original FOMM repository. We added the script generation_syntetic_dataset_v3_recognition.py for synthetic dataset creation. The dataset consists of triplets: (source image, driving image, FOMM-predicted image), where the FOMM-predicted image is the result of FOMM transferring the pose of the driving image onto the source image. An example of such a triplet is presented below. Here is a link to the created dataset.
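
A rough sketch of how such triplets can be gathered, with `fomm_predict` standing in for a call into the pretrained FOMM pipeline (generator plus keypoint detector); the names and file layout are illustrative, not the script's actual interface:

```python
import os
import random
import imageio
import numpy as np

def build_triplet_dataset(video_paths, fomm_predict, out_dir, pairs_per_video=10):
    """Sample (source, driving) frame pairs from each video and store
    them together with FOMM's prediction as one training triplet.

    `fomm_predict(source, driving)` is a placeholder for the pretrained
    FOMM pipeline; it should return an image of the same shape/dtype.
    """
    os.makedirs(out_dir, exist_ok=True)
    idx = 0
    for path in video_paths:
        frames = list(imageio.get_reader(path))  # frames as HWC uint8 arrays
        for _ in range(pairs_per_video):
            source, driving = random.sample(frames, 2)
            predicted = fomm_predict(source, driving)
            # Concatenate the triplet side by side for easy inspection.
            triplet = np.concatenate([source, driving, predicted], axis=1)
            imageio.imwrite(os.path.join(out_dir, f"{idx:06d}.png"), triplet)
            idx += 1
```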

GAN Compression

This module is a fork of the original GAN Compression repository. Here we added several important improvements:

  1. A triplet dataloader for the pix2pix model;
  2. A Dense Motion block inside the pix2pix generator;
  3. A CoordConv block (a minimal sketch follows this list).
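
A minimal sketch of the CoordConv idea from Liu et al. (2018): the convolution input is augmented with two channels of normalized pixel coordinates. This is a generic implementation, not the repo's exact block:

```python
import torch
import torch.nn as nn

class CoordConv(nn.Module):
    """Conv2d with two extra input channels holding normalized (x, y)
    pixel coordinates, giving the convolution access to position."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels + 2, out_channels, **kwargs)

    def forward(self, x):
        b, _, h, w = x.shape
        # Coordinate grids in [-1, 1], broadcast over the batch.
        ys = torch.linspace(-1, 1, h, device=x.device)
        xs = torch.linspace(-1, 1, w, device=x.device)
        grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
        coords = torch.stack([grid_x, grid_y]).expand(b, -1, -1, -1)
        return self.conv(torch.cat([x, coords], dim=1))
```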

An example of the Dense Motion block's predictions for random images during training:

See the demo notebook to launch our model.

ONNX to Core ML Converter

This submodule is a fork of the ONNX to Core ML Converter. The module is aimed at converting PyTorch models to Apple's Core ML format, which is used for on-device model inference on Apple hardware. There is no straightforward way to convert PyTorch models directly to Core ML, so an intermediate conversion to the ONNX format is used.
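
A minimal sketch of that two-step conversion, assuming the onnx-coreml package's `convert` API and a toy model (the exact export arguments depend on the real generator):

```python
import torch
from onnx_coreml import convert  # the onnx-coreml package this fork is based on

# Stand-in for a traced PyTorch generator with a fixed input size.
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1).eval()
dummy_input = torch.randn(1, 3, 256, 256)

# Step 1: PyTorch -> ONNX via tracing.
torch.onnx.export(model, dummy_input, "generator.onnx",
                  input_names=["image"], output_names=["prediction"])

# Step 2: ONNX -> Core ML, saved in the .mlmodel format.
mlmodel = convert("generator.onnx")
mlmodel.save("generator.mlmodel")
```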

Have a look at the GIFs in the pics folder.
