<a href="https://colab.research.google.com/github/fzantalis/colab_collection/blob/master/ttmai_impersonator_plus_plus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Impersonator++
## With this Jupyter Notebook we can copy the movements of a person in a video and transfer them onto a photo of another person.

## This notebook is a variant of the notebook provided by the project's authors at the links below

Links:

[![GitHub stars](https://img.shields.io/github/stars/iPERDance/iPERCore?style=social)](https://github.com/iPERDance/iPERCore)

- Paper: https://arxiv.org/pdf/2011.09055.pdf
- Repo: https://github.com/iPERDance/iPERCore
- Project Page: https://www.impersonator.org/work/impersonator-plus-plus.html
- Dataset: https://svip-lab.github.io/dataset/iPER_dataset.html
- Forum: https://discuss.impersonator.org/




## Note
Before starting, confirm that a GPU is selected in the notebook settings. Go to Edit > Notebook settings > Hardware Accelerator and choose "GPU".
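
As a quick sanity check for the setting above, the following sketch reports whether an NVIDIA driver is visible to the runtime. It assumes the `nvidia-smi` tool is on the PATH whenever a GPU runtime is attached (usually the case on Colab); the helper name `gpu_runtime_available` is made up for this example.

```python
import shutil
import subprocess


def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi` is on the PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU runtime detected:", gpu_runtime_available())
```

If this prints `False`, the hardware accelerator setting is most likely still set to "None".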

In [None]:
#%%
 
import IPython
IPython.display.HTML(
   '<h2>Demo</h2><iframe width="75%" height="512" src="https://www.impersonator.org/project_img/impersonator_plus_plus/demo_video/demo_1_512x512.mp4" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>'
)

# Prerequisites for anyone who wants to run this on their own hardware. If you are just trying it here on Colab, you can ignore the information below

## System Requirements
 - Linux (tested on Ubuntu 16.04 and 18.04) or Windows (tested on Windows 10)
 - CUDA 10.1, 10.2, or 11.0
 - gcc 7.5+ (needs to support C++14)
 - ffmpeg (ffprobe) 4.3.1+

## Python Requirements
  - Python 3.6+
  - PyTorch tested on 1.7.0
  - Torchvision tested on 0.8.1
  - mmcv-full tested on 1.2.0
  - numpy>=1.19.3
  - scipy>=1.5.2
  - scikit-image>=0.17.2
  - opencv-python>=4.4.0.40
  - tensorboardX>=2.1
  - tqdm>=4.48.2
  - visdom>=0.1.8.9
  - easydict>=1.9
  - toml>=0.10.2
  - git+https://github.com/open-mmlab/mmdetection.git@8179440ec5f75fe95484854af61ce6f6279f3bbc
  - git+https://github.com/open-mmlab/mmediting@d4086aaf8a36ae830f1714aad585900d24ad1156
  - git+https://github.com/iPERDance/neural_renderer.git@e5f54f71a8941acf372514eb92e289872f272653
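
On a local machine, the minimum versions above could be checked with a small script like the one below. This is only a sketch: it covers a subset of the list, uses `importlib.metadata` (which needs Python 3.8 or newer, stricter than the 3.6+ requirement above), and compares versions with a simplified numeric parser rather than a full version-specifier implementation. The helper names are invented for this example.

```python
from importlib import metadata  # standard library from Python 3.8

# Minimum versions copied from the list above (a subset, for illustration).
MINIMUMS = {
    "numpy": "1.19.3",
    "scipy": "1.5.2",
    "tqdm": "4.48.2",
    "toml": "0.10.2",
}


def version_tuple(version: str) -> tuple:
    """Turn '1.19.3' into (1, 19, 3), keeping only the leading digits of each piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first piece that has no leading digits
        parts.append(int(digits))
    return tuple(parts)


def check_minimums(minimums: dict) -> dict:
    """Map each package name to 'ok', 'too old (found X)' or 'not installed'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = version_tuple(found) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old (found {found})"
    return report


for name, status in check_minimums(MINIMUMS).items():
    print(f"{name}: {status}")
```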

# 1. Installation


## 1.1 Install ffmpeg (ffprobe) and set the CUDA_HOME variable

In [None]:
# Install ffmpeg (ffprobe)
!apt-get install -y ffmpeg

In [None]:
# set CUDA_HOME, here we use CUDA 10.1
import os
os.environ["CUDA_HOME"] = "/usr/local/cuda-10.1"
 
!echo $CUDA_HOME
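
The CUDA path set above is only useful if that directory exists on the machine. A minimal sketch of a fallback check follows; the candidate list, including `/usr/local/cuda`, is an assumption and should be adjusted to the actual installation.

```python
import os


def first_existing_dir(candidates):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


# Candidate locations are assumptions; adjust them to the machine at hand.
candidates = ["/usr/local/cuda-10.1", "/usr/local/cuda"]
cuda_home = first_existing_dir(candidates)
if cuda_home is None:
    print("No CUDA directory found among:", candidates)
else:
    os.environ["CUDA_HOME"] = cuda_home
    print("CUDA_HOME =", cuda_home)
```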

## 1.2 Clone the iPERCore GitHub repo

In [None]:
!git clone https://github.com/iPERDance/iPERCore.git

## 1.3 Set up iPERCore

In [None]:
cd /content/iPERCore/

In [None]:
!python setup.py develop

## 1.4 Download the pre-trained model and the sample input videos we want.





In [None]:
# Download all checkpoints
!wget -O assets/checkpoints.zip "https://nas.koulouras.gr/index.php/s/qSB4WrG9pLDJAq4/download"
!unzip -o assets/checkpoints.zip -d assets/
 
!rm assets/checkpoints.zip

In [None]:
# download samples
!wget -O assets/samples.zip  "https://nas.koulouras.gr/index.php/s/iRrjMySpm2Y8dcM/download"
!unzip -o assets/samples.zip -d  assets
!rm assets/samples.zip
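
After both archives are unpacked, it can help to confirm that the paths used by the later cells are actually present. The sketch below checks only the locations this notebook itself refers to (`configs/deploy.toml`, `samples/sources`, `samples/references`); it assumes the current directory is the iPERCore checkout, and the helper name is made up for this example.

```python
import os

# Paths that later cells of this notebook rely on, relative to the assets directory.
EXPECTED = [
    os.path.join("configs", "deploy.toml"),
    os.path.join("samples", "sources"),
    os.path.join("samples", "references"),
]


def missing_paths(base_dir, relative_paths):
    """Return the relative paths that do not exist under base_dir."""
    return [rel for rel in relative_paths
            if not os.path.exists(os.path.join(base_dir, rel))]


missing = missing_paths("assets", EXPECTED)
print("Missing:", missing if missing else "nothing")
```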

# Getting ready to run the application

In [None]:
cd /content/iPERCore/

In [None]:
import os
import os.path as osp
import platform
import argparse
import time
import sys
import subprocess
from IPython.display import HTML
from base64 import b64encode

## Description of all settings, for curious users. Skip this if you just want to run the example
 - gpu_ids (str): the GPU ids, default is "0";
 - image_size (int): the image size, default is 512;
 - num_source (int): the number of source images for Attention, default is 2. Larger values need more GPU memory;
 - assets_dir (str): the assets directory. This is very important: it holds the configurations and all pre-trained checkpoints;
 - output_dir (str): the output directory;

 - src_path (str): the source input information.
       It supports multiple source paths, with "|" as the separator between them.
       The format is "src_path_1|src_path_2|src_path_3".
       
       Each src_path is "path?=path1,name?=name1,bg_path?=bg_path1".
       
       It must contain 'path'. If 'name' and 'bg_path' are empty, they are ignored.

       The 'path' can be an image path, a directory containing source images, or a video path.

       The 'name' renames this source input; if it is empty, the filename of the path is used.

       The 'bg_path' is the actual background path, if provided; otherwise it is ignored.
       
       Here are several examples of formatted source paths:

        1. "path?=path1,name?=name1,bg_path?=bg_path1|path?=path2,name?=name2,bg_path?=bg_path2",
        this input will be parsed as [{path: path1, name: name1, bg_path:bg_path1},
        {path: path2, name: name2, bg_path: bg_path2}];

        2. "path?=path1,name?=name1|path?=path2,name?=name2", this input will be parsed as
        [{path: path1, name:name1}, {path: path2, name: name2}];

        3. "path?=path1", this input will be parsed as [{path: path1}].

        4. "path1", this will be parsed as [{path: path1}].
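
To make the format concrete, here is a small parser that reproduces the four examples above. It is an illustrative sketch, not iPERCore's own parsing code: the function name `parse_paths` is made up, and it assumes that paths contain neither "|" nor ",".

```python
def parse_paths(text):
    """Parse 'path?=a,name?=b|path?=c' into [{'path': 'a', 'name': 'b'}, {'path': 'c'}]."""
    entries = []
    for chunk in text.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue
        entry = {}
        if "?=" not in chunk:
            # Bare form, as in example 4: the whole chunk is the path.
            entry["path"] = chunk
        else:
            for field in chunk.split(","):
                key, sep, value = field.partition("?=")
                key, value = key.strip(), value.strip()
                if sep and value:  # empty values are ignored
                    entry[key] = value
        if "path" not in entry:
            raise ValueError(f"entry without 'path': {chunk!r}")
        entries.append(entry)
    return entries


print(parse_paths("path?=path1,name?=name1|path?=path2,name?=name2"))
print(parse_paths("path1"))
```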

 - ref_path (str): the reference input information.
       
       It supports multiple reference paths, with "|" as the separator between them.
       The format is "ref_path_1|ref_path_2|ref_path_3".

       Each ref_path is "path?=path1,name?=name1,audio?=audio_path1,fps?=30,pose_fc?=300,cam_fc?=150".

       It must contain 'path'; the other fields may be empty, in which case they are ignored.

       The 'path' can be an image path, a directory containing images of the same person, or a video path.

       The 'name' renames this reference input; if it is empty, the filename of the path is used.

       The 'audio' is the audio path; if it is empty, it is ignored. If the 'path' is a video,
        you can omit this, and the audio is first extracted from that video (if it has an audio channel).

       The 'fps' is the frame rate of the final outputs; if it is empty, the default of 25 is used.

       The 'pose_fc' is the smoothing factor of the temporal poses. The smaller this value, the smoother the poses. If it is empty, the default of 300 is used. In most cases the default is enough; if the poses in the output are unstable, decrease this value, and if they are over-smoothed, increase it.

       The 'cam_fc' is the smoothing factor of the temporal cameras (locations in the image space). The smaller this value, the smoother the locations across the sequence. If it is empty, the default of 150 is used. In most cases the default is enough.

       Here are several examples of formatted reference paths:

        1. "path?=path1,name?=name1,audio?=audio_path1,fps?=30,pose_fc?=300,cam_fc?=150|
            path?=path2,name?=name2,audio?=audio_path2,fps?=25,pose_fc?=450,cam_fc?=200",
            this input will be parsed as
            [{path: path1, name: name1, audio: audio_path1, fps: 30, pose_fc: 300, cam_fc: 150},
             {path: path2, name: name2, audio: audio_path2, fps: 25, pose_fc: 450, cam_fc: 200}]

        2. "path?=path1,name?=name1, pose_fc?=450|path?=path2,name?=name2", this input will be parsed as
        [{path: path1, name: name1, fps: 25, pose_fc: 450, cam_fc: 150},
         {path: path2, name: name2, fps: 25, pose_fc: 300, cam_fc: 150}].

        3. "path?=path1|path?=path2", this input will be parsed as
        [{path: path1, fps:25, pose_fc: 300, cam_fc: 150}, {path: path2, fps: 25, pose_fc: 300, cam_fc: 150}].

        4. "path1|path2", this input will be parsed as
        [{path: path1, fps:25, pose_fc: 300, cam_fc: 150}, {path: path2, fps: 25, pose_fc: 300, cam_fc: 150}].

        5. "path1", this will be parsed as [{path: path1, fps: 25, pose_fc: 300, cam_fc: 150}].
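
Going the other way, a reference string in this format can be assembled from plain Python values instead of manual string concatenation. The helpers below are hypothetical (they are not part of iPERCore); fields left as `None` are simply omitted, so the defaults described above would apply to them.

```python
def format_ref_entry(path, name=None, audio=None, fps=None, pose_fc=None, cam_fc=None):
    """Build one 'path?=...,name?=...' entry, skipping fields that are None."""
    fields = [("path", path), ("name", name), ("audio", audio),
              ("fps", fps), ("pose_fc", pose_fc), ("cam_fc", cam_fc)]
    return ",".join(f"{key}?={value}" for key, value in fields if value is not None)


def format_ref_path(entries):
    """Join several entries (dicts of keyword arguments) with the '|' separator."""
    return "|".join(format_ref_entry(**entry) for entry in entries)


ref = format_ref_path([
    {"path": "path1", "name": "name1", "pose_fc": 450},
    {"path": "path2", "name": "name2"},
])
print(ref)  # path?=path1,name?=name1,pose_fc?=450|path?=path2,name?=name2
```

When the resulting string is passed on a shell command line, it still needs to be quoted, because "|" and "?" are special characters for the shell.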

## Running the scripts!


In [None]:
# the gpu ids
gpu_ids = "0"
 
# the image size
image_size = 512
 
# the assets directory. This is very important: it must hold the configs and the checkpoints downloaded during installation.
assets_dir = "/content/iPERCore/assets"
 
# the output directory.
output_dir = "./results"
 
# the model id of this case. This is a random model name.
# model_id = "model_" + str(time.time())
 
# # This is a specific model name, and it will be used if you do not change it.
# model_id = "axing_1"
 
# symlink from the actual assets directory to this current directory
work_asserts_dir = os.path.join("./assets")
if not os.path.exists(work_asserts_dir):
    os.symlink(osp.abspath(assets_dir), osp.abspath(work_asserts_dir),
               target_is_directory=(platform.system() == "Windows"))
 
cfg_path = osp.join(work_asserts_dir, "configs", "deploy.toml")

### Running with our own data


#### Source image:
 - In the next step we can upload an image of the person we want to make dance.
 - The image must show the whole body, including the legs, for correct results.
 - Ideally the person is in an "A" pose: looking straight at the camera, with the legs and arms slightly apart and pointing down.
 - For even better results, in the next step we can use the options to also upload an image of the person's back view and one more image of the background without the person. (These options are optional.)

#### Reference video:
 - For the video the movements are copied from, there is already a wide selection of ready-made examples to choose from in a dropdown menu. We can also upload a video of our own. That video should show only one person, in full, and the camera shot must not change: the whole clip should be filmed from a single position.

In [None]:
#@title Model settings and execution
#@markdown Choose a reference video
reference_video = "akun_1.mp4" #@param  ['kuailechongbai_boy.mp4', 'chengfengpolang_1.mp4', 'Av37667655_2.mp4', 'akun_1.mp4', 'BV1rD4y1Q72j_2.mp4', 'akGexYZug2Q_2.mp4.mp4', 'aini.mp4', 'mabaoguo_short.mp4', 'mabaoguo.mp4', 'bantangzhuyi_1.mp4', 'akun_2.mp4']
#@markdown Do you also want to add a photo of the person's back view?
back_photo = "NO" #@param ["YES", "NO"]
#@markdown Do you also want to add a photo of the empty background?
bg_photo = "NO" #@param ["YES", "NO"]
#@markdown Adjust the pose smoothing factor (pose_fc). Smaller values give smoother poses.
fc = 400 #@param {type:"slider", min:100, max:500, step:1}
 
 
import os
 
ref_name = reference_video.split(".")[0]
src_p = "/content/iPERCore/assets/samples/sources/"
 
print("Upload the photo of the person's front view:")
from google.colab import files 
uploaded = files.upload() 
for si in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=si, length=len(uploaded[si])))
  
model_id = si.split(".")[0]
!rm -rf $src_p/$model_id
!mkdir -p $src_p/$model_id
 
if back_photo == "YES":
  print("Upload the photo of the person's back view:")
  uploaded = files.upload() 
  for bi in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=bi, length=len(uploaded[bi])))
    
if bg_photo == "YES":
  print("Upload the photo of the empty background:")
  uploaded = files.upload() 
  for bgi in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=bgi, length=len(uploaded[bgi])))
    
 
# Build src_path for the selected uploads. The escaped outer quotes keep the
# shell from interpreting the argument. 'name' is always model_id so that the
# results path read by the display cell is the same in every branch.
src_dir = src_p + model_id
if back_photo == "NO" and bg_photo == "NO":
  num_source = 1
  !mv $si $src_dir/
  src_path = "\"path?=" + src_dir + "/" + si + ",name?=" + model_id + "\""
elif back_photo == "NO" and bg_photo == "YES":
  num_source = 2
  !mv $si $src_dir/
  !mv $bgi $src_dir/
  src_path = "\"path?=" + src_dir + "/" + si + ",name?=" + model_id + "," \
             "bg_path?=" + src_dir + "/" + bgi + "\""
elif back_photo == "YES" and bg_photo == "NO":
  num_source = 2
  # {model_id}2 rather than ${model_id}2, so that IPython (not the shell) expands the name
  !mkdir -p $src_dir/{model_id}2
  !mv $si $src_dir/{model_id}2/
  !mv $bi $src_dir/{model_id}2/
  src_path = "\"path?=" + src_dir + "/" + model_id + "2," \
             "name?=" + model_id + "\""
else:
  num_source = 2
  !mkdir -p $src_dir/{model_id}2
  !mv $si $src_dir/{model_id}2/
  !mv $bi $src_dir/{model_id}2/
  !mv $bgi $src_dir/
  src_path = "\"path?=" + src_dir + "/" + model_id + "2," \
             "name?=" + model_id + "," \
             "bg_path?=" + src_dir + "/" + bgi + "\""
 
ref_path = "\"path?=/content/iPERCore/assets/samples/references/" + reference_video + "," \
              "name?=" + ref_name + "," \
              "pose_fc?=" + str(fc) + "\""
 
print(ref_path)
 
!python -m iPERCore.services.run_imitator  \
  --gpu_ids     $gpu_ids       \
  --num_source  $num_source    \
  --image_size  $image_size    \
  --output_dir  $output_dir    \
  --model_id    $model_id      \
  --cfg_path    $cfg_path      \
  --src_path    $src_path      \
  --ref_path    $ref_path

# Show the video on your screen

In [None]:
# read the synthesized video; the context manager closes the file afterwards
with open("./results/primitives/" + model_id + "/synthesis/imitations/" + model_id + "-" + ref_name + ".mp4", "rb") as f:
    mp4 = f.read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f"""
<video width="100%" height="100%" controls>
      <source src="{data_url}" type="video/mp4">
</video>""")
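
If the display cell fails with a "file not found" error, the imitation step most likely did not produce a video. The sketch below rebuilds the same results path as the cell above and checks for the file first. The function name is made up for this example, and `my_photo` is a placeholder for the `model_id` derived from the uploaded file name.

```python
import os


def result_video_path(output_dir, model_id, ref_name):
    """Rebuild the results path used by the display cell above."""
    return os.path.join(output_dir, "primitives", model_id, "synthesis",
                        "imitations", f"{model_id}-{ref_name}.mp4")


# Example values; in the notebook these come from the earlier cells.
video_path = result_video_path("./results", "my_photo", "akun_1")
if os.path.isfile(video_path):
    print("Found", video_path, "-", os.path.getsize(video_path), "bytes")
else:
    print("No video at", video_path, "- check the imitator output for errors")
```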