# Demo for paper "First Order Motion Model for Image Animation"

**Clone repository**

In [2]:
# !git clone https://github.com/AliaksandrSiarohin/first-order-model

**Sample data: https://drive.google.com/drive/folders/1kZ1gCnpfU0BnpdU47pLM_TQ6RypDDqgw?usp=sharing**

**Load driving video and source image**

In [9]:
import imageio
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from skimage.transform import resize
from IPython.display import HTML
import warnings
warnings.filterwarnings("ignore")

source_image = imageio.imread('./data/02.png')
driving_video = imageio.mimread('./data/04.mp4')


# Resize image and video to 256x256 and keep only the RGB channels

source_image = resize(source_image, (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]

def display(source, driving, generated=None):
    fig = plt.figure(figsize=(8 + 4 * (generated is not None), 6))

    ims = []
    for i in range(len(driving)):
        cols = [source]
        cols.append(driving[i])
        if generated is not None:
            cols.append(generated[i])
        im = plt.imshow(np.concatenate(cols, axis=1), animated=True)
        plt.axis('off')
        ims.append([im])

    ani = animation.ArtistAnimation(fig, ims, interval=50, repeat_delay=1000)
    plt.close()
    return ani
    

# HTML(display(source_image, driving_video).to_html5_video())

# Dependencies needed for reading and writing video:
# pip install imageio-ffmpeg
# pip install scikit-image
# conda install -c conda-forge ffmpeg

**Create a model and load checkpoints**

In [8]:
from model.demo import load_checkpoints
generator, kp_detector = load_checkpoints(config_path='./model/config/vox-256.yaml', 
                            checkpoint_path='vox-cpk.pth.tar')



**Perform image animation**

In [10]:
from model.demo import make_animation
from skimage import img_as_ubyte

predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=True)

# Save the resulting video to the ./output folder
imageio.mimsave('./output/generated_RelativeKeypointDisplacement.mp4', [img_as_ubyte(frame) for frame in predictions])

# HTML(display(source_image, driving_video, predictions).to_html5_video())

100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [00:11<00:00, 17.68it/s]


**In the cell above we use relative keypoint displacement to animate the objects. We can use absolute coordinates instead, but then all the object proportions are inherited from the driving video. For example, Putin's haircut will be extended to match Trump's haircut.**

In [11]:
predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=False, adapt_movement_scale=True)
imageio.mimsave('./output/generated_AbsoluteKeypointDisplacement.mp4', [img_as_ubyte(frame) for frame in predictions])
# HTML(display(source_image, driving_video, predictions).to_html5_video())

100%|████████████████████████████████████████████████████████████████████████████████| 211/211 [00:10<00:00, 19.44it/s]
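**The difference between the two modes used above can be illustrated with a small NumPy sketch. This is a simplified illustration of the idea, not the repository's code: the function `transfer_keypoints`, its `movement_scale` parameter, and the toy keypoints are all hypothetical.**

```python
import numpy as np

def transfer_keypoints(kp_source, kp_driving, kp_driving_initial,
                       relative=True, movement_scale=1.0):
    """Illustrative only: pick the keypoints used to render one frame.

    All keypoint arguments are (num_keypoints, 2) arrays of x, y coordinates.
    """
    kp_source = np.asarray(kp_source, dtype=float)
    kp_driving = np.asarray(kp_driving, dtype=float)
    kp_driving_initial = np.asarray(kp_driving_initial, dtype=float)

    if not relative:
        # Absolute mode: use the driving keypoints as they are, so the
        # proportions of the driving object carry over to the result.
        return kp_driving

    # Relative mode: apply only the motion since the first driving frame
    # to the source keypoints, so the source proportions are preserved.
    displacement = (kp_driving - kp_driving_initial) * movement_scale
    return kp_source + displacement


# Toy example (hypothetical data): two keypoints marking an object's width.
kp_source = np.array([[0.0, 0.0], [1.0, 0.0]])           # narrow source object
kp_driving_initial = np.array([[0.0, 0.0], [2.0, 0.0]])  # wider driving object
kp_driving = kp_driving_initial + np.array([0.5, 0.0])   # driving object moved right

relative_kp = transfer_keypoints(kp_source, kp_driving, kp_driving_initial, relative=True)
absolute_kp = transfer_keypoints(kp_source, kp_driving, kp_driving_initial, relative=False)

print(relative_kp[1] - relative_kp[0])  # width of the source object is kept
print(absolute_kp[1] - absolute_kp[0])  # width of the driving object is used
```

**In relative mode only the motion is transferred, which is why the first driving frame should show roughly the same pose as the source image.**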


## Running on your data

**First we need to crop a face from both the source image and the driving video. A simple graphics editor such as Paint is enough for cropping an image. Cropping a video is more complicated; you can use ffmpeg for this.**

In [12]:
# !ffmpeg -i /content/gdrive/My\ Drive/first-order-motion-model/07.mkv -ss 00:08:57.50 -t 00:00:08 -filter:v "crop=600:600:760:50" -async 1 hinton.mp4
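**The same kind of command can also be assembled from Python, which helps when the crop box is computed rather than typed by hand. The sketch below only builds and prints the argument list; the helper `build_crop_command` and the file names are placeholders. The `crop` filter takes its values in the order `width:height:x:y`, where `x` and `y` are the top-left corner of the crop box.**

```python
import shlex

def build_crop_command(src, dst, start, duration, width, height, x, y):
    """Return an ffmpeg argument list that cuts a clip and crops each frame."""
    return [
        "ffmpeg",
        "-i", src,                                     # input video
        "-ss", start,                                  # start time of the clip
        "-t", duration,                                # length of the clip
        "-filter:v", f"crop={width}:{height}:{x}:{y}", # crop box: w:h:x:y
        "-async", "1",                                 # same audio option as the cell above
        dst,                                           # output video
    ]

# Placeholder file names; the numbers match the command in the cell above.
cmd = build_crop_command("input.mkv", "cropped.mp4",
                         "00:08:57.50", "00:00:08", 600, 600, 760, 50)
print(shlex.join(cmd))

# To actually run it (requires ffmpeg on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```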

**Another possibility is to use a screen recording tool or, if you need to crop many images at once, a face detector (https://github.com/1adrianb/face-alignment). See https://github.com/AliaksandrSiarohin/video-preprocessing for the preprocessing of VoxCeleb.**
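**When a detector is used, its bounding box usually has to be turned into a square crop with some margin around the face. The helper below is a hypothetical sketch of that step only: it assumes the detector returns a `(left, top, right, bottom)` box in pixels, and the `margin` default is an arbitrary choice, not a value taken from the preprocessing repository.**

```python
def square_crop_box(bbox, frame_width, frame_height, margin=0.3):
    """Expand a (left, top, right, bottom) box to a square inside the frame.

    Returns (width, height, x, y), the order used by ffmpeg's crop filter.
    """
    left, top, right, bottom = bbox
    # Use the longer side of the box and enlarge it by the margin.
    side = max(right - left, bottom - top) * (1.0 + margin)
    # The square cannot be larger than the frame itself.
    side = int(min(side, frame_width, frame_height))
    center_x = (left + right) / 2.0
    center_y = (top + bottom) / 2.0
    # Centre the square on the face, then clamp it to the frame borders.
    x = int(min(max(center_x - side / 2.0, 0), frame_width - side))
    y = int(min(max(center_y - side / 2.0, 0), frame_height - side))
    return side, side, x, y

# Hypothetical detection in a 1920x1080 frame.
print(square_crop_box((800, 100, 1200, 600), 1920, 1080))
```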

In [18]:
# source_image = imageio.imread('./data/02.png')
# driving_video = imageio.mimread('04.mp4', memtest=False)
source_image = imageio.imread('./data/02.png')
driving_video = imageio.mimread('./data/DZ1BJU.gif')
# Driving video can accept both .gif and .mp4 input


# Resize image and video to 256x256 and keep only the RGB channels

source_image = resize(source_image, (256, 256))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]

predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=True,
                             adapt_movement_scale=True)

# HTML(display(source_image, driving_video, predictions).to_html5_video())

# Save the resulting video
imageio.mimsave('./output/generated_RelativeKeypointDisplacement.mp4', [img_as_ubyte(frame) for frame in predictions])

100%|██████████████████████████████████████████████████████████████████████████████████| 43/43 [00:02<00:00, 18.84it/s]


In [19]:
# faces
# source_image = imageio.imread('./data/02.png')
# driving_video = imageio.mimread('./data/04.mp4')

# In principle, any image/video pair should work as long as both have already been cropped
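**Before trying a new image/video pair it can help to check that the inputs look like what the cells above produce: 256x256 RGB float arrays with values in [0, 1]. The function `check_inputs` below is a hypothetical convenience helper, not part of the repository.**

```python
import numpy as np

def check_inputs(source_image, driving_video, size=(256, 256)):
    """Raise ValueError if the inputs do not look like resized RGB floats."""
    expected_shape = tuple(size) + (3,)
    frames = [np.asarray(source_image)] + [np.asarray(f) for f in driving_video]
    if len(frames) < 2:
        raise ValueError("driving_video contains no frames")
    for index, frame in enumerate(frames):
        name = "source_image" if index == 0 else f"driving frame {index - 1}"
        if frame.shape != expected_shape:
            raise ValueError(f"{name} has shape {frame.shape}, expected {expected_shape}")
        if not np.issubdtype(frame.dtype, np.floating):
            raise ValueError(f"{name} has dtype {frame.dtype}, expected float")
        if frame.min() < 0.0 or frame.max() > 1.0:
            raise ValueError(f"{name} has values outside [0, 1]")

# Synthetic data standing in for a resized image and a two-frame video.
check_inputs(np.zeros((256, 256, 3)), [np.ones((256, 256, 3))] * 2)
print("inputs look fine")
```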

In [22]:
# !python firstordermotion.py