# Few-shot vid2vid - Inference
**Author**: [Christos Antoniou](https://github.com/cantonioupao)


This notebook provides an inference example for the [fs_vid2vid](https://github.com/NVlabs/imaginaire/blob/master/projects/fs_vid2vid/README.md) model, which is
part of the [NVIDIA Imaginaire library](https://github.com/NVlabs/imaginaire).

## Set CUDA and PyTorch version

In [None]:
### Check CUDA version
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0


In [None]:
### Check the PyTorch version and its CUDA compatibility
import torch
print(torch.__version__)  # preinstalled with the most recent PyTorch version

1.10.0+cu111
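
The two checks above can also be compared programmatically. The sketch below is a minimal, stdlib-only illustration: the helper names `nvcc_release` and `torch_cuda_tag` are hypothetical, and it assumes the `+cuXYZ` suffix seen in the version string above (`1.10.0+cu111`), where the last digit is taken to be the minor version.

```python
import re


def nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. '11.1') found in `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def torch_cuda_tag(torch_version):
    """Return the CUDA version encoded in a string such as '1.10.0+cu111', or None."""
    match = re.search(r"\+cu(\d+)$", torch_version)
    if not match:
        return None  # CPU-only build, or a build without a +cuXYZ suffix
    digits = match.group(1)
    # Assumption: the last digit is the minor version ('111' -> '11.1', '102' -> '10.2').
    return f"{digits[:-1]}.{digits[-1]}"


# Strings copied from the cell outputs above.
nvcc_text = "Cuda compilation tools, release 11.1, V11.1.105"
versions_match = nvcc_release(nvcc_text) == torch_cuda_tag("1.10.0+cu111")
print(versions_match)
```

In the notebook itself, the inputs would be `torch.__version__` and the captured output of `nvcc --version`; `torch.version.cuda` reports the CUDA version of the installed build directly.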


### Install a compatible PyTorch version

In [None]:
### Install a compatible torch version
#!pip install torch==1.9.1  # --> installs 1.9.1+cu102

### Switch to an existing CUDA installation

In [None]:
### Check the available CUDA versions and set the one to use
#%cd /usr/local/
#!pwd
#!ls
#!rm -rf cuda
#!ln -s /usr/local/cuda-11.1 /usr/local/cuda

### Or install your own

In [None]:
### Or install your own (from Ubuntu, deb local)
'''
!wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-10-2-local-10.2.89-440.33.01_1.0-1_amd64.deb
!apt-key add /var/cuda-repo-10-2-local-10.2.89-440.33.01/7fa2af80.pub
!apt-get update
!apt-get install cuda-10.2
'''


In [None]:
'''
%cd /usr/local/
!pwd
!ls
!rm -rf cuda
!ln -s /usr/local/cuda-10.2 /usr/local/cuda
'''


### Check symbolic link and version

In [None]:
### Check symbolic link (relative path: resolves only when the working directory is /usr/local)
!stat cuda

stat: cannot stat 'cuda': No such file or directory
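
The error above most likely comes from the relative path: `stat cuda` only resolves when the working directory is `/usr/local`. A minimal Python sketch of the same check with an explicit path follows. `cuda_link_target` is a hypothetical helper name, and the demonstration runs in a temporary directory rather than against the real `/usr/local/cuda`.

```python
import os
import tempfile


def cuda_link_target(link_path="/usr/local/cuda"):
    """Return the target of the symlink at link_path, or None if it is not a symlink."""
    if os.path.islink(link_path):
        return os.readlink(link_path)
    return None


# Self-contained demonstration in a temporary directory (no root access needed).
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "cuda-11.1")
    link = os.path.join(root, "cuda")
    os.mkdir(target)
    os.symlink(target, link)
    demo_target = cuda_link_target(link)
    missing = cuda_link_target(os.path.join(root, "does-not-exist"))

print(os.path.basename(demo_target), missing)
```

Calling `cuda_link_target()` with no argument inspects `/usr/local/cuda`, independent of the current working directory.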


In [None]:
### Check that CUDA version is changed
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0


## Quick install of requirements & library 

In [None]:
%cd /content/
!git clone https://github.com/NVlabs/imaginaire

/content
Cloning into 'imaginaire'...
remote: Enumerating objects: 938, done.
remote: Counting objects: 100% (364/364), done.
remote: Compressing objects: 100% (300/300), done.
remote: Total 938 (delta 106), reused 207 (delta 47), pack-reused 574
Receiving objects: 100% (938/938), 66.08 MiB | 16.08 MiB/s, done.
Resolving deltas: 100% (261/261), done.


In [None]:
%cd /content/imaginaire
!bash scripts/install.sh

/content/imaginaire
Get:1 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic InRelease [15.9 kB]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/ InRelease [3,626 B]
Hit:3 http://ppa.launchpad.net/cran/libgit2/ubuntu bionic InRelease
Get:4 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic InRelease [15.9 kB]
Hit:5 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease
Hit:6 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:7 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Ign:8 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
Get:9 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main Sources [1,950 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Ign:11 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Get:12 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic/main amd64 Packages [99

## Fix bug with FlowNet pretrained file

### Mount Google Drive to access the pretrained FlowNet pth.tar file

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Download the pretrained model and upload it to your Google Drive
If the following cell does not work, download the file manually [here](https://docs.google.com/uc?export=download&id=1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da&confirm=t).
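
As an alternative to a manual download, here is a minimal stdlib-only sketch of the download step. The helper `download` and its `min_bytes` guard are hypothetical additions, and whether Google Drive serves the file directly to a script is an assumption that may not hold, so the manual route remains the fallback.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url, dest, min_bytes=1):
    """Stream url to dest and return the number of bytes written.

    Raises ValueError when the result is smaller than min_bytes, a cheap
    guard against saving an error page instead of the checkpoint.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    size = dest.stat().st_size
    if size < min_bytes:
        raise ValueError(f"{dest} is only {size} bytes; the download probably failed")
    return size
```

It would be called with the Drive URL above and a destination such as `/content/imaginaire/checkpoints/flownet2.pth.tar`, with `min_bytes` set to a size well below that of the real checkpoint.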

In [None]:
#!pip install pycurl (Does not work)
'''
import pycurl
file_name = '/content/imaginaire/checkpoints/flownet2.pth.tar'
file_url = 'https://docs.google.com/uc?export=download&id=1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da&confirm=t'
with open(file_name, 'wb') as f:
    cl = pycurl.Curl()
    cl.setopt(cl.URL, file_url)
    cl.setopt(cl.WRITEDATA, f)
    cl.perform()
    cl.close()
'''

### Change flow_net.py to include the new path to FlowNet



Specifically, change *line 28* in **imaginaire/imaginaire/third_party/flow_net/flow_net.py** to:
```
flownet2_path = [GOOGLE_DRIVE_LOCAL_FLOWNET_PATH]
```
Make sure the FlowNet pth.tar file already exists in your Google Drive, then hardcode its path in flow_net.py.


Also, for the tests to pass, upload the checkpoint .pt [file]() to your Google Drive and provide a local path to it. Specifically, change *line 35* in **imaginaire/imaginaire/trainers/gancraft.py** to:

```
f = [GOOGLE_DRIVE_LOCAL_CHECKPOINT_PATH]
```




In [None]:
import torch
savepath = '/content/drive/MyDrive/FlowNet2_checkpoint.pth.tar'
checkpoint = torch.load(savepath, map_location=torch.device('cpu'))
checkpoint_path = '/content/drive/MyDrive/demoworld-epoch_00115_iteration_000215625_checkpoint-net_G_only.pt'
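
Before hardcoding these paths, it can help to confirm that both files are actually present on the mounted Drive. A minimal sketch, assuming the two paths used in the cell above; `missing_files` is a hypothetical helper name.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Paths taken from the cell above; adjust them to your own Drive layout.
required = [
    "/content/drive/MyDrive/FlowNet2_checkpoint.pth.tar",
    "/content/drive/MyDrive/demoworld-epoch_00115_iteration_000215625_checkpoint-net_G_only.pt",
]
for path in missing_files(required):
    print("Missing checkpoint:", path)
```

An empty result means both checkpoints were found and the hardcoded paths in the next cells should resolve.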

#### Or just overwrite files directly

In [None]:
%%writefile /content/imaginaire/imaginaire/third_party/flow_net/flow_net.py
# Copyright (C) 2021 NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
#
# This work is made available under the Nvidia Source Code License-NC.
# To view a copy of this license, check out LICENSE.md
import torch
import torch.nn as nn
import torch.nn.functional as F
import types
from imaginaire.third_party.flow_net.flownet2 import models as \
    flownet2_models
from imaginaire.third_party.flow_net.flownet2.utils import tools \
    as flownet2_tools
from imaginaire.model_utils.fs_vid2vid import resample
from imaginaire.utils.io import get_checkpoint


class FlowNet(nn.Module):
    def __init__(self, pretrained=True, fp16=False):
        super().__init__()
        flownet2_args = types.SimpleNamespace()
        setattr(flownet2_args, 'fp16', fp16)
        setattr(flownet2_args, 'rgb_max', 1.0)
        if fp16:
            print('FlowNet2 is running in fp16 mode.')
        self.flowNet = flownet2_tools.module_to_dict(flownet2_models)[
            'FlowNet2'](flownet2_args).to('cuda')
        if pretrained:
            flownet2_path = '/content/drive/MyDrive/FlowNet2_checkpoint.pth.tar'
            #flownet2_path = get_checkpoint(savepath,
            #                              '1hF8vS6YeHkx3j2pfCeQqqZGwA_PJq_Da')
            checkpoint = torch.load(flownet2_path,
                                    map_location=torch.device('cpu'))
            self.flowNet.load_state_dict(checkpoint['state_dict'])
        self.flowNet.eval()

    def forward(self, input_A, input_B):
        size = input_A.size()
        assert(len(size) == 4 or len(size) == 5 or len(size) == 6)
        if len(size) >= 5:
            if len(size) == 5:
                b, n, c, h, w = size
            else:
                b, t, n, c, h, w = size
            input_A = input_A.contiguous().view(-1, c, h, w)
            input_B = input_B.contiguous().view(-1, c, h, w)
            flow, conf = self.compute_flow_and_conf(input_A, input_B)
            if len(size) == 5:
                return flow.view(b, n, 2, h, w), conf.view(b, n, 1, h, w)
            else:
                return flow.view(b, t, n, 2, h, w), conf.view(b, t, n, 1, h, w)
        else:
            return self.compute_flow_and_conf(input_A, input_B)

    def compute_flow_and_conf(self, im1, im2):
        assert(im1.size()[1] == 3)
        assert(im1.size() == im2.size())
        old_h, old_w = im1.size()[2], im1.size()[3]
        new_h, new_w = old_h // 64 * 64, old_w // 64 * 64
        if old_h != new_h:
            im1 = F.interpolate(im1, size=(new_h, new_w), mode='bilinear',
                                align_corners=False)
            im2 = F.interpolate(im2, size=(new_h, new_w), mode='bilinear',
                                align_corners=False)
        data1 = torch.cat([im1.unsqueeze(2), im2.unsqueeze(2)], dim=2)
        with torch.no_grad():
            flow1 = self.flowNet(data1)
        # img_diff = torch.sum(abs(im1 - resample(im2, flow1)),
        #                      dim=1, keepdim=True)
        # conf = torch.clamp(1 - img_diff, 0, 1)

        conf = (self.norm(im1 - resample(im2, flow1)) < 0.02).float()

        # data2 = torch.cat([im2.unsqueeze(2), im1.unsqueeze(2)], dim=2)
        # with torch.no_grad():
        #     flow2 = self.flowNet(data2)
        # warped_flow2 = resample(flow2, flow1)
        # flow_sum = self.norm(flow1 + warped_flow2)
        # disocc = flow_sum > (0.05 * (self.norm(flow1) +
        # self.norm(warped_flow2)) + 0.5)
        # conf = 1 - disocc.float()

        if old_h != new_h:
            flow1 = F.interpolate(flow1, size=(old_h, old_w), mode='bilinear',
                                  align_corners=False) * old_h / new_h
            conf = F.interpolate(conf, size=(old_h, old_w), mode='bilinear',
                                 align_corners=False)
        return flow1, conf

    def norm(self, t):
        return torch.sum(t * t, dim=1, keepdim=True)



Overwriting /content/imaginaire/imaginaire/third_party/flow_net/flow_net.py
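
`compute_flow_and_conf` in the file above rounds the frame size down to a multiple of 64 before running FlowNet2 (`old_h // 64 * 64`). The following sketch isolates that size computation in plain Python; the helper names and the example frame sizes are hypothetical and chosen only for illustration.

```python
def floor_to_multiple(value, base=64):
    """Round value down to the nearest multiple of base."""
    return value // base * base


def flownet_input_size(height, width, base=64):
    """Return (new_h, new_w) as computed at the top of compute_flow_and_conf."""
    return floor_to_multiple(height, base), floor_to_multiple(width, base)


# Hypothetical frame sizes, for illustration only.
print(flownet_input_size(720, 1280))  # (704, 1280)
print(flownet_input_size(512, 512))   # (512, 512)
```

Note that the file above resizes only when the height changes (`old_h != new_h`) and rescales the flow by `old_h / new_h`; the sketch covers the size computation only.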


In [None]:
%%writefile /content/imaginaire/imaginaire/trainers/gancraft.py
#
# This work is made available under the Nvidia Source Code License-NC.
# To view a copy of this license, check out LICENSE.md
import collections
import os

import torch
import torch.nn as nn

from imaginaire.config import Config
from imaginaire.generators.spade import Generator as SPADEGenerator
from imaginaire.losses import (FeatureMatchingLoss, GaussianKLLoss, PerceptualLoss)
from imaginaire.model_utils.gancraft.loss import GANLoss
from imaginaire.trainers.base import BaseTrainer
from imaginaire.utils.distributed import master_only_print as print
from imaginaire.utils.io import get_checkpoint
from imaginaire.utils.misc import split_labels, to_device
from imaginaire.utils.trainer import ModelAverage, WrappedModel
from imaginaire.utils.visualization import tensor2label


class GauGANLoader(object):
    r"""Manages the SPADE/GauGAN model used to generate pseudo-GTs for training GANcraft.

    Args:
        gaugan_cfg (Config): SPADE configuration.
    """

    def __init__(self, gaugan_cfg):
        print('[GauGANLoader] Loading GauGAN model.')
        cfg = Config(gaugan_cfg.config)
        #default_checkpoint_path = os.path.basename(gaugan_cfg.config).split('.yaml')[0] + '-' + \
        #    cfg.pretrained_weight + '.pt'
        checkpoint = '/content/drive/MyDrive/demoworld-epoch_00115_iteration_000215625_checkpoint-net_G_only.pt'
        #checkpoint = get_checkpoint(default_checkpoint_path, cfg.pretrained_weight)
        ckpt = torch.load(checkpoint)

        net_G = WrappedModel(ModelAverage(SPADEGenerator(cfg.gen, cfg.data).to('cuda')))
        net_G.load_state_dict(ckpt['net_G'])
        self.net_GG = net_G.module.averaged_model
        self.net_GG.eval()
        self.net_GG.half()
        print('[GauGANLoader] GauGAN loading complete.')

    def eval(self, label, z=None, style_img=None):
        r"""Produce output given segmentation and other conditioning inputs.
        random style will be used if neither z nor style_img is provided.

        Args:
            label (N x C x H x W tensor): One-hot segmentation mask of shape.
            z: Style vector.
            style_img: Style image.
        """
        inputs = {'label': label[:, :-1].detach().half()}
        random_style = True

        if z is not None:
            random_style = False
            inputs['z'] = z.detach().half()
        elif style_img is not None:
            random_style = False
            inputs['images'] = style_img.detach().half()

        net_GG_output = self.net_GG(inputs, random_style=random_style)

        return net_GG_output['fake_images']


class Trainer(BaseTrainer):
    r"""Initialize GANcraft trainer.

    Args:
        cfg (Config): Global configuration.
        net_G (obj): Generator network.
        net_D (obj): Discriminator network.
        opt_G (obj): Optimizer for the generator network.
        opt_D (obj): Optimizer for the discriminator network.
        sch_G (obj): Scheduler for the generator optimizer.
        sch_D (obj): Scheduler for the discriminator optimizer.
        train_data_loader (obj): Train data loader.
        val_data_loader (obj): Validation data loader.
    """

    def __init__(self,
                 cfg,
                 net_G,
                 net_D,
                 opt_G,
                 opt_D,
                 sch_G,
                 sch_D,
                 train_data_loader,
                 val_data_loader):
        super(Trainer, self).__init__(cfg, net_G, net_D, opt_G,
                                      opt_D, sch_G, sch_D,
                                      train_data_loader, val_data_loader)

        # Load the pseudo-GT network only if in training mode, else not needed.
        if not self.is_inference:
            self.gaugan_model = GauGANLoader(cfg.trainer.gaugan_loader)

    def _init_loss(self, cfg):
        r"""Initialize loss terms.

        Args:
            cfg (obj): Global configuration.
        """
        if hasattr(cfg.trainer.loss_weight, 'gan'):
            self.criteria['GAN'] = GANLoss()
            self.weights['GAN'] = cfg.trainer.loss_weight.gan
        if hasattr(cfg.trainer.loss_weight, 'pseudo_gan'):
            self.criteria['PGAN'] = GANLoss()
            self.weights['PGAN'] = cfg.trainer.loss_weight.pseudo_gan
        if hasattr(cfg.trainer.loss_weight, 'l2'):
            self.criteria['L2'] = nn.MSELoss()
            self.weights['L2'] = cfg.trainer.loss_weight.l2
        if hasattr(cfg.trainer.loss_weight, 'l1'):
            self.criteria['L1'] = nn.L1Loss()
            self.weights['L1'] = cfg.trainer.loss_weight.l1
        if hasattr(cfg.trainer, 'perceptual_loss'):
            self.criteria['Perceptual'] = \
                PerceptualLoss(
                    network=cfg.trainer.perceptual_loss.mode,
                    layers=cfg.trainer.perceptual_loss.layers,
                    weights=cfg.trainer.perceptual_loss.weights)
            self.weights['Perceptual'] = cfg.trainer.loss_weight.perceptual
        # Setup the feature matching loss.
        if hasattr(cfg.trainer.loss_weight, 'feature_matching'):
            self.criteria['FeatureMatching'] = FeatureMatchingLoss()
            self.weights['FeatureMatching'] = \
                cfg.trainer.loss_weight.feature_matching
        # Setup the Gaussian KL divergence loss.
        if hasattr(cfg.trainer.loss_weight, 'kl'):
            self.criteria['GaussianKL'] = GaussianKLLoss()
            self.weights['GaussianKL'] = cfg.trainer.loss_weight.kl

    def _start_of_epoch(self, current_epoch):
        torch.cuda.empty_cache()  # Prevent the first iteration from running OOM.

    def _start_of_iteration(self, data, current_iteration):
        r"""Model specific custom start of iteration process. We will do two
        things. First, put all the data to GPU. Second, we will resize the
        input so that it becomes multiple of the factor for bug-free
        convolutional operations. This factor is given by the yaml file.
        E.g., base = getattr(self.net_G, 'base', 32)

        Args:
            data (dict): The current batch.
            current_iteration (int): The iteration number of the current batch.
        """
        data = to_device(data, 'cuda')

        # Sample camera poses and pseudo-GTs.
        with torch.no_grad():
            samples = self.net_G.module.sample_camera(data, self.gaugan_model.eval)

        return {**data, **samples}

    def gen_forward(self, data):
        r"""Compute the loss for SPADE generator.

        Args:
            data (dict): Training data at the current iteration.
        """
        net_G_output = self.net_G(data, random_style=False)

        self._time_before_loss()

        if 'GAN' in self.criteria or 'PGAN' in self.criteria:
            incl_pseudo_real = False
            if 'FeatureMatching' in self.criteria:
                incl_pseudo_real = True
            net_D_output = self.net_D(data, net_G_output, incl_real=False, incl_pseudo_real=incl_pseudo_real)
            output_fake = net_D_output['fake_outputs']  # Choose from real_outputs and fake_outputs.

            gan_loss = self.criteria['GAN'](output_fake, True, dis_update=False)
            if 'GAN' in self.criteria:
                self.gen_losses['GAN'] = gan_loss
            if 'PGAN' in self.criteria:
                self.gen_losses['PGAN'] = gan_loss

        if 'FeatureMatching' in self.criteria:
            self.gen_losses['FeatureMatching'] = self.criteria['FeatureMatching'](
                net_D_output['fake_features'], net_D_output['pseudo_real_features'])

        if 'GaussianKL' in self.criteria:
            self.gen_losses['GaussianKL'] = self.criteria['GaussianKL'](net_G_output['mu'], net_G_output['logvar'])

        # Perceptual loss is always between fake image and pseudo real image.
        if 'Perceptual' in self.criteria:
            self.gen_losses['Perceptual'] = self.criteria['Perceptual'](
                net_G_output['fake_images'], data['pseudo_real_img'])

        # Reconstruction loss between fake and pseudo real.
        if 'L2' in self.criteria:
            self.gen_losses['L2'] = self.criteria['L2'](net_G_output['fake_images'], data['pseudo_real_img'])
        if 'L1' in self.criteria:
            self.gen_losses['L1'] = self.criteria['L1'](net_G_output['fake_images'], data['pseudo_real_img'])

        total_loss = 0
        for key in self.criteria:
            total_loss = total_loss + self.gen_losses[key] * self.weights[key]

        self.gen_losses['total'] = total_loss
        return total_loss

    def dis_forward(self, data):
        r"""Compute the loss for GANcraft discriminator.

        Args:
            data (dict): Training data at the current iteration.
        """
        if 'GAN' not in self.criteria and 'PGAN' not in self.criteria:
            return

        with torch.no_grad():
            net_G_output = self.net_G(data, random_style=False)
            net_G_output['fake_images'] = net_G_output['fake_images'].detach()

        incl_real = False
        incl_pseudo_real = False
        if 'GAN' in self.criteria:
            incl_real = True
        if 'PGAN' in self.criteria:
            incl_pseudo_real = True
        net_D_output = self.net_D(data, net_G_output, incl_real=incl_real, incl_pseudo_real=incl_pseudo_real)

        self._time_before_loss()
        total_loss = 0
        if 'GAN' in self.criteria:
            output_fake = net_D_output['fake_outputs']
            output_real = net_D_output['real_outputs']

            fake_loss = self.criteria['GAN'](output_fake, False, dis_update=True)
            true_loss = self.criteria['GAN'](output_real, True, dis_update=True)
            self.dis_losses['GAN/fake'] = fake_loss
            self.dis_losses['GAN/true'] = true_loss
            self.dis_losses['GAN'] = fake_loss + true_loss
            total_loss = total_loss + self.dis_losses['GAN'] * self.weights['GAN']
        if 'PGAN' in self.criteria:
            output_fake = net_D_output['fake_outputs']
            output_pseudo_real = net_D_output['pseudo_real_outputs']

            fake_loss = self.criteria['PGAN'](output_fake, False, dis_update=True)
            true_loss = self.criteria['PGAN'](output_pseudo_real, True, dis_update=True)
            self.dis_losses['PGAN/fake'] = fake_loss
            self.dis_losses['PGAN/true'] = true_loss
            self.dis_losses['PGAN'] = fake_loss + true_loss
            total_loss = total_loss + self.dis_losses['PGAN'] * self.weights['PGAN']

        self.dis_losses['total'] = total_loss
        return total_loss

    def _get_visualizations(self, data):
        r"""Compute visualization image.

        Args:
            data (dict): The current batch.
        """
        with torch.no_grad():
            label_lengths = self.train_data_loader.dataset.get_label_lengths()
            labels = split_labels(data['label'], label_lengths)

            # Get visualization of the real image and segmentation mask.
            segmap = tensor2label(labels['seg_maps'], label_lengths['seg_maps'], output_normalized_tensor=True)
            segmap = torch.cat([x.unsqueeze(0) for x in segmap], 0)

            # Get output from GANcraft model
            net_G_output_randstyle = self.net_G(data, random_style=True)
            net_G_output = self.net_G(data, random_style=False)

            vis_images = [data['images'], segmap, net_G_output_randstyle['fake_images'], net_G_output['fake_images']]

            if 'fake_masks' in data:
                # Get pseudo-GT.
                labels = split_labels(data['fake_masks'], label_lengths)
                segmap = tensor2label(labels['seg_maps'], label_lengths['seg_maps'], output_normalized_tensor=True)
                segmap = torch.cat([x.unsqueeze(0) for x in segmap], 0)
                vis_images.append(segmap)

            if 'pseudo_real_img' in data:
                vis_images.append(data['pseudo_real_img'])

            if self.cfg.trainer.model_average_config.enabled:
                net_G_model_average_output = self.net_G.module.averaged_model(data, random_style=True)
                vis_images.append(net_G_model_average_output['fake_images'])
        return vis_images

    def load_checkpoint(self, cfg, checkpoint_path, resume=None, load_sch=True):
        r"""Load network weights, optimizer parameters, scheduler parameters
        from a checkpoint.

        Args:
            cfg (obj): Global configuration.
            checkpoint_path (str): Path to the checkpoint.
            resume (bool or None): If not ``None``, will determine whether or
            not to load optimizers in addition to network weights.
        """
        ret = super().load_checkpoint(cfg, checkpoint_path, resume, load_sch)

        if getattr(cfg.trainer, 'reset_opt_g_on_resume', False):
            self.opt_G.state = collections.defaultdict(dict)
            print('[GANcraft::load_checkpoint] Resetting opt_G.state')
        if getattr(cfg.trainer, 'reset_opt_d_on_resume', False):
            self.opt_D.state = collections.defaultdict(dict)
            print('[GANcraft::load_checkpoint] Resetting opt_D.state')

        return ret

    def test(self, data_loader, output_dir, inference_args):
        r"""Compute results images for a batch of input data and save the
        results in the specified folder.

        Args:
            data_loader (torch.utils.data.DataLoader): PyTorch dataloader.
            output_dir (str): Target location for saving the output image.
        """
        if self.cfg.trainer.model_average_config.enabled:
            net_G = self.net_G.module.averaged_model
        else:
            net_G = self.net_G.module
        net_G.eval()

        torch.cuda.empty_cache()
        with torch.no_grad():
            net_G.inference(output_dir, **vars(inference_args))



Overwriting /content/imaginaire/imaginaire/trainers/gancraft.py
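
For orientation, `gen_forward` in the file above combines its loss terms as a weighted sum over the configured criteria. A minimal sketch with plain floats follows; the loss values and weights are made up for illustration and do not come from any Imaginaire config.

```python
def weighted_total(losses, weights):
    """Sum losses[key] * weights[key] over every key in weights."""
    total = 0.0
    for key in weights:
        total = total + losses[key] * weights[key]
    return total


# Made-up numbers, for illustration only.
losses = {"GAN": 0.5, "L2": 0.25, "Perceptual": 2.0}
weights = {"GAN": 1.0, "L2": 10.0, "Perceptual": 0.5}
print(weighted_total(losses, weights))  # 4.0
```

In the trainer the values are tensors rather than floats, so the same expression also carries gradients.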


## Inference 

In [None]:
### Check testing works
#!bash scripts/test_training.sh

Download some testing data

In [None]:
### Download some testing data
%cd /content/imaginaire
!python3 ./scripts/download_test_data.py --model_name fs_vid2vid

/content/imaginaire
projects/fs_vid2vid/test_data
Downloading test data to projects/fs_vid2vid/test_data.tar.gz
Extracting test data to projects/fs_vid2vid/test_data


Or create your own custom dataset using the notebook [here](https://colab.research.google.com/drive/10CoBuUn6lK3b1FFvRNaSEqynGt-oXMsn). The customized dataset is stored under **/content/drive/MyDrive/faceForensics**

In [None]:
### Custom dataset can be found in /content/drive/MyDrive/faceForensics
!rm -r /content/imaginaire/projects/fs_vid2vid/test_data/faceForensics #remove existing testing data
!cp -R /content/drive/MyDrive/faceForensics /content/imaginaire/projects/fs_vid2vid/test_data/ #copy to appropriate folder
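
The same replace-and-copy step can be sketched in Python. Unlike the shell cell, this version checks that the source exists before deleting anything, so a missing Drive folder does not wipe the existing test data. `replace_directory` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def replace_directory(src, dst):
    """Delete dst if it exists, then copy the whole tree at src to dst."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source dataset not found: {src}")
    if dst.exists():
        shutil.rmtree(dst)  # remove the existing test data
    shutil.copytree(src, dst)  # copy the custom dataset into place
    return dst
```

For this notebook it would be called with `/content/drive/MyDrive/faceForensics` as `src` and `/content/imaginaire/projects/fs_vid2vid/test_data/faceForensics` as `dst`.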

For inference, you need the checkpoint of the pretrained model. It can be downloaded manually [here](https://docs.google.com/uc?export=download&id=1F_22ctFmo553nRHy1d_BX7aorc9zk9cF&confirm=t).
Then upload the checkpoint to your Google Drive and place it at **/content/drive/MyDrive/epoch_00200_iteration_000005800_checkpoint.pt**


In [57]:
%%writefile /content/imaginaire/imaginaire/trainers/fs_vid2vid.py
#@markdown #### Select the fps of the output video (fps = 60 by default)
# Copyright (C) 2021 NVIDIA CORPORATION & AFFILIATES.  All rights reserved.
#
# This work is made available under the Nvidia Source Code License-NC.
# To view a copy of this license, check out LICENSE.md
fps = 60 # select frames per second for output video
import os
import imageio
import numpy as np
import torch
from tqdm import tqdm


from imaginaire.model_utils.fs_vid2vid import (concat_frames, get_fg_mask,
                                               pre_process_densepose,
                                               random_roll)
from imaginaire.model_utils.pix2pixHD import get_optimizer_with_params
from imaginaire.trainers.vid2vid import Trainer as vid2vidTrainer
from imaginaire.utils.distributed import is_master
from imaginaire.utils.distributed import master_only_print as print
from imaginaire.utils.misc import to_cuda
from imaginaire.utils.visualization import tensor2flow, tensor2im


class Trainer(vid2vidTrainer):
    r"""Initialize vid2vid trainer.

    Args:
        cfg (obj): Global configuration.
        net_G (obj): Generator network.
        net_D (obj): Discriminator network.
        opt_G (obj): Optimizer for the generator network.
        opt_D (obj): Optimizer for the discriminator network.
        sch_G (obj): Scheduler for the generator optimizer.
        sch_D (obj): Scheduler for the discriminator optimizer.
        train_data_loader (obj): Train data loader.
        val_data_loader (obj): Validation data loader.
    """

    def __init__(self, cfg, net_G, net_D, opt_G, opt_D, sch_G, sch_D,
                 train_data_loader, val_data_loader):
        super(Trainer, self).__init__(cfg, net_G, net_D, opt_G,
                                      opt_D, sch_G, sch_D,
                                      train_data_loader, val_data_loader)

    def _start_of_iteration(self, data, current_iteration):
        r"""Things to do before an iteration.

        Args:
            data (dict): Data used for the current iteration.
            current_iteration (int): Current number of iteration.
        """
        data = self.pre_process(data)
        return to_cuda(data)

    def pre_process(self, data):
        r"""Do any data pre-processing here.

        Args:
            data (dict): Data used for the current iteration.
        """
        data_cfg = self.cfg.data
        if hasattr(data_cfg, 'for_pose_dataset') and \
                ('pose_maps-densepose' in data_cfg.input_labels):
            pose_cfg = data_cfg.for_pose_dataset
            data['label'] = pre_process_densepose(pose_cfg, data['label'],
                                                  self.is_inference)
            data['few_shot_label'] = pre_process_densepose(
                pose_cfg, data['few_shot_label'], self.is_inference)
        return data

    def get_test_output_images(self, data):
        r"""Get the visualization output of test function.

        Args:
            data (dict): Training data at the current iteration.
        """
        vis_images = [
            tensor2im(data['few_shot_images'][:, 0]),
            self.visualize_label(data['label'][:, -1]),
            tensor2im(data['images'][:, -1]),
            tensor2im(self.net_G_output['fake_images']),
        ]
        return vis_images

    def get_data_t(self, data, net_G_output, data_prev, t):
        r"""Get data at current time frame given the sequence of data.

        Args:
            data (dict): Training data for current iteration.
            net_G_output (dict): Output of the generator (for previous frame).
            data_prev (dict): Data for previous frame.
            t (int): Current time.
        """
        label = data['label'][:, t] if 'label' in data else None
        image = data['images'][:, t]

        if data_prev is not None:
            nG = self.cfg.data.num_frames_G
            prev_labels = concat_frames(data_prev['prev_labels'],
                                        data_prev['label'], nG - 1)
            prev_images = concat_frames(
                data_prev['prev_images'],
                net_G_output['fake_images'].detach(), nG - 1)
        else:
            prev_labels = prev_images = None

        data_t = dict()
        data_t['label'] = label
        data_t['image'] = image
        data_t['ref_labels'] = data['few_shot_label'] if 'few_shot_label' \
                                                         in data else None
        data_t['ref_images'] = data['few_shot_images']
        data_t['prev_labels'] = prev_labels
        data_t['prev_images'] = prev_images
        data_t['real_prev_image'] = data['images'][:, t - 1] if t > 0 else None

        # if 'landmarks_xy' in data:
        #     data_t['landmarks_xy'] = data['landmarks_xy'][:, t]
        #     data_t['ref_landmarks_xy'] = data['few_shot_landmarks_xy']
        return data_t

    def post_process(self, data, net_G_output):
        r"""Do any postprocessing of the data / output here.

        Args:
            data (dict): Training data at the current iteration.
            net_G_output (dict): Output of the generator.
        """
        if self.has_fg:
            fg_mask = get_fg_mask(data['label'], self.has_fg)
            if net_G_output['fake_raw_images'] is not None:
                net_G_output['fake_raw_images'] = \
                    net_G_output['fake_raw_images'] * fg_mask

        return data, net_G_output

    def test(self, test_data_loader, root_output_dir, inference_args, fps=fps):
        r"""Run inference on the specified sequence.

        Args:
            test_data_loader (object): Test data loader.
            root_output_dir (str): Location to dump outputs.
            inference_args (optional): Optional args.
        """
        self.reset()
        test_data_loader.dataset.set_sequence_length(0)
        print("Inference args", inference_args)
        
        # Set the inference sequences.
        
        test_data_loader.dataset.set_inference_sequence_idx(
            inference_args.driving_seq_index,
            inference_args.few_shot_seq_index,
            inference_args.few_shot_frame_index)
        

        video = []
        for idx, data in enumerate(tqdm(test_data_loader)):
            key = data['key']['images'][0][0]
            filename = key.split('/')[-1]

            # Create output dir for this sequence.
            if idx == 0:
                seq_name = '%03d' % inference_args.driving_seq_index
                output_dir = os.path.join(root_output_dir, seq_name)
                os.makedirs(output_dir, exist_ok=True)
                video_path = output_dir

            # Get output and save images.
            data['img_name'] = filename
            data = self.start_of_iteration(data, current_iteration=-1)
            output = self.test_single(data, output_dir, inference_args)
            video.append(output)

        # Save output as mp4.
        imageio.mimsave(video_path + '.mp4', video, fps=fps)

    def save_image(self, path, data):
        r"""Save the output images to path.
        Note that when generate_raw_output is False,
        first_net_G_output['fake_raw_images'] is None and will not be displayed.
        In model average mode, we will plot the flow visualization twice.

        Args:
            path (str): Save path.
            data (dict): Training data for current iteration.
        """
        self.net_G.eval()
        if self.cfg.trainer.model_average_config.enabled:
            self.net_G.module.averaged_model.eval()

        self.net_G_output = None
        with torch.no_grad():
            first_net_G_output, last_net_G_output, _ = self.gen_frames(data)
            if self.cfg.trainer.model_average_config.enabled:
                first_net_G_output_avg, last_net_G_output_avg, _ = \
                    self.gen_frames(data, use_model_average=True)

        def get_images(data, net_G_output, return_first_frame=True,
                       for_model_average=False):
            r"""Get the output images to save.

            Args:
                data (dict): Training data for current iteration.
                net_G_output (dict): Generator output.
                return_first_frame (bool): Return output for first frame in the
                sequence.
                for_model_average (bool): For model average output.
            Return:
                vis_images (list of numpy arrays): Visualization images.
            """
            frame_idx = 0 if return_first_frame else -1
            warped_idx = 0 if return_first_frame else 1
            vis_images = []
            if not for_model_average:
                vis_images += [
                    tensor2im(data['few_shot_images'][:, frame_idx]),
                    self.visualize_label(data['label'][:, frame_idx]),
                    tensor2im(data['images'][:, frame_idx])
                ]
            vis_images += [
                tensor2im(net_G_output['fake_images']),
                tensor2im(net_G_output['fake_raw_images'])]
            if not for_model_average:
                vis_images += [
                    tensor2im(net_G_output['warped_images'][warped_idx]),
                    tensor2flow(net_G_output['fake_flow_maps'][warped_idx]),
                    tensor2im(net_G_output['fake_occlusion_masks'][warped_idx],
                              normalize=False)
                ]
            return vis_images

        if is_master():
            vis_images_first = get_images(data, first_net_G_output)
            if self.cfg.trainer.model_average_config.enabled:
                vis_images_first += get_images(data, first_net_G_output_avg,
                                               for_model_average=True)
            if self.sequence_length > 1:
                vis_images_last = get_images(data, last_net_G_output,
                                             return_first_frame=False)
                if self.cfg.trainer.model_average_config.enabled:
                    vis_images_last += get_images(data, last_net_G_output_avg,
                                                  return_first_frame=False,
                                                  for_model_average=True)

                # If generating a video, the first row of each batch will be
                # the first generated frame and the flow/mask for warping the
                # reference image, and the second row will be the last
                # generated frame and the flow/mask for warping the previous
                # frame. If using model average, the frames generated by model
                # average will be at the rightmost columns.
                vis_images = [[np.vstack((im_first, im_last))
                               for im_first, im_last in
                               zip(imgs_first, imgs_last)]
                              for imgs_first, imgs_last in zip(vis_images_first,
                                                               vis_images_last)
                              if imgs_first is not None]
            else:
                vis_images = vis_images_first

            image_grid = np.hstack([np.vstack(im) for im in vis_images
                                    if im is not None])

            print('Save output images to {}'.format(path))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            imageio.imwrite(path, image_grid)

    def finetune(self, data, inference_args):
        r"""Finetune the model for a few iterations on the inference data."""
        # Get the list of params to finetune.
        self.net_G, self.net_D, self.opt_G, self.opt_D = \
            get_optimizer_with_params(self.cfg, self.net_G, self.net_D,
                                      param_names_start_with=[
                                          'weight_generator.fc', 'conv_img',
                                          'up'])
        data_finetune = {k: v for k, v in data.items()}
        ref_labels = data_finetune['few_shot_label']
        ref_images = data_finetune['few_shot_images']

        # Number of iterations to finetune.
        iterations = getattr(inference_args, 'finetune_iter', 100)
        for it in range(1, iterations + 1):
            # Randomly set one of the reference images as target.
            idx = np.random.randint(ref_labels.size(1))
            tgt_label, tgt_image = ref_labels[:, idx], ref_images[:, idx]
            # Randomly shift and flip the target image.
            tgt_label, tgt_image = random_roll([tgt_label, tgt_image])
            data_finetune['label'] = tgt_label.unsqueeze(1)
            data_finetune['images'] = tgt_image.unsqueeze(1)

            self.gen_update(data_finetune)
            self.dis_update(data_finetune)
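            # Print progress roughly ten times. The max() guards against a
            # zero modulus when fewer than 10 iterations are requested.
            if (it % max(iterations // 10, 1)) == 0:
                print(it)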

        self.has_finetuned = True


Overwriting /content/imaginaire/imaginaire/trainers/fs_vid2vid.py
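
The `test` method above names each sequence folder from `inference_args.driving_seq_index` using `'%03d'` and writes the video next to that folder by appending `.mp4`. Below is a minimal sketch of that naming scheme. The helper name `sequence_output_paths` is hypothetical and not part of Imaginaire; it only mirrors the string formatting used in `test`.

```python
import os


def sequence_output_paths(root_output_dir, driving_seq_index):
    """Hypothetical helper mirroring the path logic in `test`."""
    # Zero-padded, three-digit sequence name, e.g. 1 -> '001'.
    seq_name = '%03d' % driving_seq_index
    # Per-frame images are written into this folder.
    output_dir = os.path.join(root_output_dir, seq_name)
    # The mp4 is saved beside the folder, not inside it.
    video_path = output_dir + '.mp4'
    return output_dir, video_path


frames_dir, video_file = sequence_output_paths(
    'projects/fs_vid2vid/output/face_forensics', 1)
print(frames_dir)
print(video_file)
```

A driving sequence index of 1 gives a file named `001.mp4`, while an index of 0 gives `000.mp4`, so the paths in the later cells need to match the index used at inference time.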


In [58]:
### Run inference
%cd /content/imaginaire/
!python3 inference.py  --single_gpu --num_workers 0 \
--config configs/projects/fs_vid2vid/face_forensics/ampO1.yaml \
--output_dir projects/fs_vid2vid/output/face_forensics \
--checkpoint /content/drive/MyDrive/epoch_00200_iteration_000005800_checkpoint.pt # change this path accordingly

/content/imaginaire
Using random seed 0
cudnn benchmark: True
cudnn deterministic: False
Creating metadata
['images', 'landmarks-dlib68']
Data file extensions: {'images': 'jpg', 'landmarks-dlib68': 'json'}
Searching in dir: images
Found 1 sequences
Found 1 files
['images', 'landmarks-dlib68']
Data file extensions: {'images': 'jpg', 'landmarks-dlib68': 'json'}
Searching in dir: images
Found 1 sequences
Found 243 files
Folder at projects/fs_vid2vid/test_data/faceForensics/reference/images opened.
Folder at projects/fs_vid2vid/test_data/faceForensics/reference/landmarks-dlib68 opened.
Folder at projects/fs_vid2vid/test_data/faceForensics/driving/images opened.
Folder at projects/fs_vid2vid/test_data/faceForensics/driving/landmarks-dlib68 opened.
Num datasets: 2
Num sequences: 2
Max sequence length: 243
Epoch length: 1
Using random seed 0
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True for input.
	Num. of channels in the input image: 3
Concatenate images:
    ext: 

In [64]:
### Merge audio to video
import moviepy.editor as mpe
videopath = './projects/fs_vid2vid/output/face_forensics/001.mp4'
audiopath = '/content/drive/MyDrive/voca/welcome.wav'
output = './projects/fs_vid2vid/output/face_forensics/output.mp4'
driver_path = '/content/drive/MyDrive/voca/video.mp4' # driver video path
my_clip = mpe.VideoFileClip(videopath)
clip_with_audio = mpe.VideoFileClip(driver_path) #video to get audio from
audio_background = mpe.AudioFileClip(audiopath)
# Prefer the audio track of the driver video; fall back to the standalone
# audio file only when the driver video has no audio track.
if clip_with_audio.audio is None:
  final_clip = my_clip.set_audio(audio_background)
else:
  final_clip = my_clip.set_audio(clip_with_audio.audio)
final_clip.write_videofile(output)

[MoviePy] >>>> Building video ./projects/fs_vid2vid/output/face_forensics/output.mp4
[MoviePy] Writing audio in outputTEMP_MPY_wvf_snd.mp3


100%|██████████| 114/114 [00:00<00:00, 1316.41it/s]

[MoviePy] Done.
[MoviePy] Writing video ./projects/fs_vid2vid/output/face_forensics/output.mp4



100%|█████████▉| 243/244 [00:01<00:00, 123.70it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: ./projects/fs_vid2vid/output/face_forensics/output.mp4 
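
The intent of the merge cell is to reuse the driver video's audio track and fall back to the standalone audio file when the driver has none. That selection can be sketched without MoviePy. `choose_audio` is a hypothetical helper, and the stand-in objects assume only an `.audio` attribute that is `None` when a clip has no sound.

```python
from types import SimpleNamespace


def choose_audio(driver_clip, fallback_audio):
    """Return the driver's audio track, or the fallback if it has none."""
    if driver_clip.audio is not None:
        return driver_clip.audio
    return fallback_audio


# Stand-ins for real clips (assumption: only the `.audio` attribute matters).
silent_driver = SimpleNamespace(audio=None)
talking_driver = SimpleNamespace(audio='driver-track')

print(choose_audio(silent_driver, 'welcome.wav'))
print(choose_audio(talking_driver, 'welcome.wav'))
```

Checking the `.audio` attribute matters here: a clip object returned by the loader is never `None` itself, so a test on the clip alone cannot detect a missing track.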



In [65]:
#@markdown ###Play the generated sample video
from IPython.display import HTML
from base64 import b64encode
with open('/content/imaginaire/projects/fs_vid2vid/output/face_forensics/000.mp4', 'rb') as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)

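The playback cell embeds the whole video in the page as a base64 data URL. Here is a small sketch of that encoding step, wrapped in a hypothetical `video_data_url` function and run on a throwaway file rather than a real mp4:

```python
import os
import tempfile
from base64 import b64encode


def video_data_url(path, mime='video/mp4'):
    """Read a file and return it as a base64 data URL."""
    with open(path, 'rb') as f:
        payload = b64encode(f.read()).decode()
    return 'data:%s;base64,%s' % (mime, payload)


# Demo on placeholder bytes; a real run would point at the generated mp4.
with tempfile.NamedTemporaryFile(suffix='.mp4', delete=False) as tmp:
    tmp.write(b'not a real video')
url = video_data_url(tmp.name)
os.remove(tmp.name)
print(url[:22])
```

Because base64 inflates the payload by about a third and the result is stored in the notebook output, this approach suits short clips better than long videos.
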
In [62]:
### Save video to Google Drive
from datetime import date
import os
import shutil
today = date.today().strftime('%m%d')  # month and day, e.g. '0421'
video_path = '/content/imaginaire/projects/fs_vid2vid/output/face_forensics/output.mp4'
newpath = '/content/imaginaire/projects/fs_vid2vid/output/face_forensics/' + today + '.mp4'
os.rename(video_path, newpath)
# Copy with shutil so a failed copy raises instead of passing silently.
shutil.copy(newpath, '/content/drive/MyDrive/')
print("Stored in", newpath)

Stored in /content/imaginaire/projects/fs_vid2vid/output/face_forensics/0421.mp4
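
The save cell stamps the file with the month and day only, so a run on the same date in a later year would overwrite the earlier copy. Below is a sketch of a variant that includes the year and copies with `shutil`. `archive_video` and its arguments are hypothetical names, the date is an arbitrary example, and the demo runs in a temporary directory rather than on Drive.

```python
import os
import shutil
import tempfile
from datetime import date


def archive_video(video_path, dest_dir, day=None):
    """Copy video_path into dest_dir under a YYYYMMDD.mp4 name."""
    day = day or date.today()
    stamped = os.path.join(dest_dir, day.strftime('%Y%m%d') + '.mp4')
    os.makedirs(dest_dir, exist_ok=True)
    shutil.copy(video_path, stamped)
    return stamped


with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, 'output.mp4')
    with open(src, 'wb') as f:
        f.write(b'placeholder')
    saved = archive_video(src, os.path.join(work, 'drive'), date(2022, 4, 21))
    print(os.path.basename(saved))
```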


# Retraining for improved Inference

In [None]:
### Preprocess data for retraining
%cd /content/imaginaire/
!python scripts/build_lmdb.py --config configs/projects/fs_vid2vid/face_forensics/ampO1.yaml --data_root /content/drive/MyDrive/face_forensics --output_root datasets/face_forensics/lmdb/train --paired --remove_missing --overwrite
!python scripts/build_lmdb.py --config configs/projects/fs_vid2vid/face_forensics/ampO1.yaml --data_root /content/drive/MyDrive/face_forensics --output_root datasets/face_forensics/lmdb/val --paired --remove_missing --overwrite

[Errno 2] No such file or directory: '/content/imaginaire/'
/content
python3: can't open file 'scripts/build_lmdb.py': [Errno 2] No such file or directory
python3: can't open file 'scripts/build_lmdb.py': [Errno 2] No such file or directory
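
Both commands failed because `/content/imaginaire/` did not exist in this session: the working directory stayed at `/content`, so `scripts/build_lmdb.py` could not be found. Below is a sketch of a pre-flight check that reports missing paths before preprocessing is launched. `missing_paths` is a hypothetical helper, and the list simply repeats the paths used in the cell.

```python
import os


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


required = [
    '/content/imaginaire/scripts/build_lmdb.py',
    '/content/imaginaire/configs/projects/fs_vid2vid/face_forensics/ampO1.yaml',
    '/content/drive/MyDrive/face_forensics',
]
for path in missing_paths(required):
    print('Missing:', path)
```

An empty result means the repository is cloned and Drive is mounted, which the commands above take for granted.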


In [None]:
!cp -r /content/drive/MyDrive/face_forensics/images/ /content/imaginaire/

cp: cannot stat '/content/drive/MyDrive/face_forensics/images/': No such file or directory


In [None]:
#torch.cuda.set_device(0)
!python -m torch.distributed.launch --nproc_per_node=1 train.py \
--config configs/projects/fs_vid2vid/face_forensics/ampO1.yaml \
--single_gpu --checkpoint /content/drive/MyDrive/epoch_00200_iteration_000005800_checkpoint.pt 

and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

/usr/bin/python3: can't open file 'train.py': [Errno 2] No such file or directory
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 188) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/usr/local/lib/python3.7/d