Add Model-level DeOldify Ability #3

Merged · 31 commits · May 15, 2020

Commits:
2804d04
update: use tables to enhance expressiveness
CyFeng16 May 13, 2020
545ef3b
update: fix table format
CyFeng16 May 13, 2020
e1bd593
update: fix table info
CyFeng16 May 13, 2020
b513e52
update: address Usage
CyFeng16 May 13, 2020
dcaa4e7
add: zh-cn doc
CyFeng16 May 14, 2020
4f0fd12
update: eng-cn readme link
CyFeng16 May 14, 2020
c61c874
update: eng-cn readme link
CyFeng16 May 14, 2020
e172dcc
update: remove external links
CyFeng16 May 14, 2020
5251407
upload and black ori deoldify
CyFeng16 May 14, 2020
93b7434
update: add deoldify location
CyFeng16 May 14, 2020
638d562
update: upload and black ori deoldify
CyFeng16 May 14, 2020
ff94af0
update: try deoldify for images
CyFeng16 May 14, 2020
817ddca
update: try deoldify art for images
CyFeng16 May 14, 2020
780a735
del: deinit DeOldify
CyFeng16 May 14, 2020
4a1b046
add: re-init DeOldify
CyFeng16 May 14, 2020
a32f970
update: black
CyFeng16 May 14, 2020
aeed7e1
update: set GPU0 as default
CyFeng16 May 14, 2020
a568bb1
update: todo
CyFeng16 May 14, 2020
cf4b5d6
update: prepare and run stable version
CyFeng16 May 14, 2020
0574a01
update: descriptions
CyFeng16 May 14, 2020
2ac8d45
cleanup
CyFeng16 May 14, 2020
e4d647d
del: deoldify saas
CyFeng16 May 14, 2020
4c9d1db
cleanup
CyFeng16 May 14, 2020
4d1a755
update: add cfg output
CyFeng16 May 15, 2020
a3fddf7
update: fix cfg opt
CyFeng16 May 15, 2020
ec53c57
update: fix arg
CyFeng16 May 15, 2020
409d7e7
update: desc. and args.
CyFeng16 May 15, 2020
d2575d0
update: add one blank
CyFeng16 May 15, 2020
a2360da
update: args desc
CyFeng16 May 15, 2020
5a1c6e0
update: try internal link
CyFeng16 May 15, 2020
113f13d
update: try internal link #2
CyFeng16 May 15, 2020
9 changes: 0 additions & 9 deletions .gitmodules

This file was deleted.

122 changes: 75 additions & 47 deletions README.md
<p align="center">
  <img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/CyFeng16/MVIMP" />
  <img alt="GitHub issues" src="https://img.shields.io/github/issues/CyFeng16/MVIMP" />
  <img alt="GitHub License" src="https://img.shields.io/github/license/cyfeng16/MVIMP" />
  <img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg" />
</p>

English | [简体中文](docs/README_zh-Hans.md)

# MVIMP

**M**ixed **V**ideo and **I**mage **M**anipulation **P**rogram

Training a high-performance AI model is only one side of the story; making it easy for others to use is the other. So this repository tries to offer out-of-the-box AI abilities for manipulating multimedia. I hope you have fun!


| Model                                                                             | Input  | Output | Parallel               |
|:---------------------------------------------------------------------------------:|:------:|:------:|:----------------------:|
| [AnimeGAN](https://github.com/CyFeng16/MVIMP/tree/cyfeng-0315-DeOldify#animegan)  | Images | Images | True                   |
| [DAIN](https://github.com/CyFeng16/MVIMP/tree/cyfeng-0315-DeOldify#dain)          | Video  | Video  | False                  |
| [Photo3D](https://github.com/CyFeng16/MVIMP/tree/cyfeng-0315-DeOldify#photo3d)    | Images | Videos | True (not recommended) |
| [DeOldify](https://github.com/CyFeng16/MVIMP/tree/cyfeng-0315-DeOldify#deoldify)  | Images | Images | True                   |

## AnimeGAN

Original repository: [TachibanaYoshino/AnimeGAN](https://github.com/TachibanaYoshino/AnimeGAN)

This is the open-source code of the paper "AnimeGAN: a novel lightweight GAN for photo animation", which uses the GAN framework to transform real-world photos into anime images.

| Dependency   | Version                              |
|:------------:|:------------------------------------:|
| TensorFlow   | 1.15.2                               |
| CUDA Toolkit | 10.0 (tested locally) / 10.1 (colab) |
| Python       | 3.6.8 (3.6+)                         |
| opencv       | -                                    |
| tqdm         | -                                    |
| numpy        | -                                    |
| glob         | -                                    |
| argparse     | -                                    |

**Usage**:

1. `Local`
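
A minimal sketch of the local workflow (the steps mirror the zh-CN document below):

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f animegan
# Step 2: Put the image(s) you want to process into ./Data/Input/
# Step 3: Run the following command for inference
python3 inference_animegan.py
```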

2. `Colab`

Or you can try the following shared colab in playground mode:

https://colab.research.google.com/drive/1bpwUFcr5i38_P3a0r3Qm9Dvkl-MS_Y1y?usp=sharing

## Photo3D

Original repository: [vt-vl-lab/3d-photo-inpainting](https://github.com/vt-vl-lab/3d-photo-inpainting)

Photo3D converts a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view.

| Dependency   | Version                     |
|:------------:|:---------------------------:|
| PyTorch      | 1.5.0                       |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python       | 3.6.8 (3.6+)                |

Other Python dependencies are listed in `requirements.txt` and will be installed automatically when `preparation.py` runs.

**Usage**:

1. `Local`
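
A minimal sketch of the local workflow (the steps mirror the zh-CN document below):

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f photo3d
# Step 2: Put the image(s) you want to process into ./Data/Input/
# Step 3: Run the following command for inference
python3 inference_photo3d.py -f 40 -n 240 -l 960
```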

2. `Colab`

Or you can try the following shared colab in playground mode:

https://colab.research.google.com/drive/1VAFCN8Wh4DAY_HDcwI-miNIBomx_MZc5?usp=sharing

P.S. Massive memory is occupied during operation (it grows with `-l`). A `higher memory` runtime helps if you are a Colab Pro user.

3. Description of Parameters

| Parameter           | Abbr. | Default | Description                                                    |
|---------------------|-------|---------|----------------------------------------------------------------|
| `--fps`             | `-f`  | 40      | The FPS of the output video.                                   |
| `--frames`          | `-n`  | 240     | The number of frames of the output video.                      |
| `--longer_side_len` | `-l`  | 960     | The longer side of the output video (either height or width).  |
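
For example, a hypothetical run that overrides all three defaults (the values here are illustrative, not recommendations):

```shell
# 30 FPS, 300 frames, longer side capped at 720 px
python3 inference_photo3d.py -f 30 -n 300 -l 720
```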

## DAIN

Original repository: [baowenbo/DAIN](https://github.com/baowenbo/DAIN)

The Depth-Aware video frame INterpolation (DAIN) model explicitly detects occlusion by exploring the depth cue. It develops a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects over farther ones.

| Dependency   | Version                                                |
|:------------:|:------------------------------------------------------:|
| FFmpeg       | -                                                      |
| PyTorch      | 1.4.0                                                  |
| CUDA Toolkit | 10.0 (tested locally/colab)                            |
| Python       | 3.6.8 (3.6+)                                           |
| GCC          | 7.5 (compiles PyTorch 1.4.0 extension files (.c/.cu))  |

P.S. Make sure your virtual env has `torch-1.4.0+cu100` and `torchvision-0.5.0+cu100`.
You can use the following [commands](https://github.com/baowenbo/DAIN/issues/44#issuecomment-624025613):
```shell
# Install PyTorch 1.4.0 (CUDA 10.0)
pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
pip install scipy==1.1.0
# Point the system symlink at CUDA 10.0 (CUDA itself must be installed beforehand)
sudo ln -snf /usr/local/cuda-10.0 /usr/local/cuda
# After that we can perform a complete compilation.
```

**Usage**:

1. `Local`
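
A minimal sketch of the local workflow (the steps mirror the zh-CN document below; note that `-hr` is a store_true flag and takes no value):

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f dain
# Step 2: Put the single video file to interpolate into ./Data/Input/
# Step 3: Run the following command for inference (add -hr for 1080p+ input)
python3 inference_dain.py -input your_input.mp4 -ts 0.5
```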

2. `Colab`

Or you can try the following shared colab in playground mode:

https://colab.research.google.com/drive/1pIPHQAu7z4Z3LXztCUXiDyBaIlOqy4Me?usp=sharing

3. Description of Parameters

| Parameter           | Abbr.    | Default    | Description                                                                                                                                  |
|---------------------|----------|------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| `--input_video`     | `-input` | /          | The input video name.                                                                                                                          |
| `--time_step`       | `-ts`    | 0.5        | The frame-rate multiplier:<br>0.5 corresponds to 2X;<br>0.25 corresponds to 4X;<br>0.125 corresponds to 8X.                                    |
| `--high_resolution` | `-hr`    | store_true | Defaults to False (action: store_true).<br>Turn it on when handling FHD videos;<br>a frame-splitting process will reduce GPU memory usage.    |
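
For example, a hypothetical 4X interpolation of an FHD clip (the file name is a placeholder):

```shell
python3 inference_dain.py -input your_input.mp4 -ts 0.25 -hr
```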

## DeOldify

Original repository: [jantic/DeOldify](https://github.com/jantic/DeOldify)

DeOldify is a deep-learning-based project for colorizing and restoring old images and video!

~~We currently try the easiest way to colorize images using DeOldify: the SaaS service provided by DeepAI (**For Now**). You must sign up for DeepAI.~~

We have now integrated the inference capabilities of the DeOldify models (both Artistic and Stable; no video yet) into the MVIMP repository, keeping the input and output interfaces consistent.

| Dependency   | Version                     |
|:------------:|:---------------------------:|
| PyTorch      | 1.5.0                       |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python       | 3.6.8 (3.6+)                |

Other Python dependencies are listed in `colab_requirements.txt` and will be installed automatically when `preparation.py` runs.

**Usage**:

1. `Local`

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f deoldify
# Step 2: Inference
python3 inference_deoldify.py -st
```

2. `Colab`

Or you can try the following shared colab in playground mode:

https://colab.research.google.com/drive/156StQ1WdErl-_213pCQV-ysX2FT_vtjm?usp=sharing

3. Description of Parameters

| Parameter         | Abbr.     | Default    | Description                                                                                                                                                        |
|-------------------|-----------|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `--artistic`      | `-art`    | store_true | The artistic model achieves the highest-quality results in image coloration,<br>in terms of interesting details and vibrance.                                       |
| `--stable`        | `-st`     | store_true | The stable model achieves the best results with landscapes and portraits.                                                                                           |
| `--render_factor` | `-factor` | 35         | Between 7 and 40; try several values for the best result.                                                                                                           |
| `--watermarked`   | `-mark`   | store_true | I respect the spirit of the original author adding a watermark to distinguish AI works,<br>but leaving it off may be more convenient in a production environment.   |
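
For example, a hypothetical run colorizing with the artistic model at a higher render factor (the value 38 is illustrative, within the documented 7–40 range):

```shell
python3 inference_deoldify.py -art -factor 38
```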

# TODO
- [x] Chinese Document
- [x] DeOldify for colorizing and restoring old images and videos
- [x] tqdm instead of print loop
- [x] Original DeOldify local as well as Colab
- [ ] Dockerized deployment.
- [ ] https://roxanneluo.github.io/Consistent-Video-Depth-Estimation/
- [ ] MMSR for image and video super-resolution
158 changes: 158 additions & 0 deletions docs/README_zh-Hans.md
<p align="center">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/CyFeng16/MVIMP" />
<img alt="GitHub issues" src="https://img.shields.io/github/issues/CyFeng16/MVIMP" />
<img alt="GitHub License" src="https://img.shields.io/github/license/cyfeng16/MVIMP" />
<img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg" />
</p>

[English](/README.md) | 简体中文

# MVIMP

`MVIMP` (**M**ixed **V**ideo and **I**mage **M**anipulation **P**rogram) takes its name from `GIMP` (**G**NU **I**mage **M**anipulation **P**rogram), and we likewise hope more people will give it a try.

MVIMP currently integrates the following three third-party functions. The code directories and the role of each file are as follows:
- `third_party`: holds the third-party repos. We originally planned to use submodules, but the coding styles of the repositories differ too much to unify, so we keep each LICENSE and do minimal secondary development instead.
- `mvimp_utils`: standalone modules for file and video handling that support inference.
- `preparation.py`: bundles all of the preparation work.
- `inference_animegan.py`: unified input/output interface that drives AnimeGAN inference.
- `inference_dain.py`: unified input/output interface that drives DAIN inference.
- `inference_photo3d.py`: unified input/output interface that drives 3d-photo-inpainting inference.

The inputs and outputs of the third-party functions are defined as follows:

| Model    | Input    | Output   | Parallel                         |
|:--------:|:--------:|:--------:|:--------------------------------:|
| AnimeGAN | Image(s) | Image(s) | Parallelizable                   |
| DAIN     | Video    | Video    | Not parallelizable               |
| Photo3D  | Image(s) | Video    | Parallelizable (not recommended) |

## AnimeGAN

The original AnimeGAN repository lives at [TachibanaYoshino/AnimeGAN](https://github.com/TachibanaYoshino/AnimeGAN). As the open-source code of the paper "AnimeGAN: a novel lightweight GAN for photo animation", it uses the GAN framework to transform real-world photos into anime images.

### Requirements

| Dependency   | Version                              |
|:------------:|:------------------------------------:|
| TensorFlow   | 1.15.2                               |
| CUDA Toolkit | 10.0 (tested locally) / 10.1 (colab) |
| Python       | 3.6.8 (3.6+)                         |
| opencv       | -                                    |
| tqdm         | -                                    |
| numpy        | -                                    |
| glob         | -                                    |
| argparse     | -                                    |

### Usage

1. `Local`

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f animegan
# Step 2: Put the image(s) you want to process into ./Data/Input/
# Step 3: Run the following command for inference
python3 inference_animegan.py
```

2. `Colab`

You can also run it on Colab in playground mode:

https://colab.research.google.com/drive/1bpwUFcr5i38_P3a0r3Qm9Dvkl-MS_Y1y?usp=sharing

## Photo3D

The original Photo3D repository lives at [vt-vl-lab/3d-photo-inpainting](https://github.com/vt-vl-lab/3d-photo-inpainting); it takes a single RGB-D input image and converts it into a 3D photo (video).

### Requirements

| Dependency   | Version                     |
|:------------:|:---------------------------:|
| PyTorch      | 1.5.0                       |
| CUDA Toolkit | 10.1 (tested locally/colab) |
| Python       | 3.6.8 (3.6+)                |

Other Python dependencies are listed in `requirements.txt` and will be installed automatically when `preparation.py` runs.

### Usage

1. `Local`

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f photo3d
# Step 2: Put the image(s) you want to process into ./Data/Input/
# Step 3: Run the following command for inference
python3 inference_photo3d.py -f 40 -n 240 -l 960
```

2. `Colab`

You can also run it on Colab in playground mode:

https://colab.research.google.com/drive/1VAFCN8Wh4DAY_HDcwI-miNIBomx_MZc5?usp=sharing

Note that the runtime memory Photo3D needs grows significantly with the `longer_side_len` parameter (the longer side of the output video). If you are a Colab Pro user, enable the `high-RAM` runtime and preferably process one image per run.

### Parameter Description

- `--fps` or `-f`: set the FPS of the output video.
- `--frames` or `-n`: set the number of frames of the output video.
- `--longer_side_len` or `-l`: set the length of the longer side of the output video.

## DAIN

The original DAIN repository lives at [baowenbo/DAIN](https://github.com/baowenbo/DAIN). DAIN synthesizes intermediate flows through a depth-aware flow projection layer to interpolate video frames.

The current version of DAIN interpolates 1080p video smoothly.

### Requirements

| Dependency   | Version                                                |
|:------------:|:------------------------------------------------------:|
| PyTorch      | 1.4.0                                                  |
| CUDA Toolkit | 10.0 (tested locally/colab)                            |
| Python       | 3.6.8 (3.6+)                                           |
| GCC          | 7.5 (compiles PyTorch 1.4.0 extension files (.c/.cu))  |

Note that the current version of DAIN does not support PyTorch 1.5.0, so both the local and cloud environments need torch-1.4.0+cu100 and torchvision-0.5.0+cu100 installed manually. See this [issue](https://github.com/baowenbo/DAIN/issues/44#issuecomment-624025613).

```shell
# Install PyTorch 1.4.0 (CUDA 10.0)
pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
pip install scipy==1.1.0
# Point the system symlink at CUDA 10.0 (CUDA itself must be installed beforehand)
sudo ln -snf /usr/local/cuda-10.0 /usr/local/cuda
```

### Usage

1. `Local`

```shell
# Step 1: Prepare
git clone https://github.com/CyFeng16/MVIMP.git
cd MVIMP
python3 preparation.py -f dain
# Step 2: Put the single video file to interpolate into ./Data/Input/
# Step 3: Run the following command for inference
# (-hr is a store_true flag: add it for 1080p+ input, omit it otherwise)
python3 inference_dain.py -input your_input.mp4 -ts 0.5
```

2. `Colab`

You can also run it on Colab in playground mode:

https://colab.research.google.com/drive/1pIPHQAu7z4Z3LXztCUXiDyBaIlOqy4Me?usp=sharing

### Parameter Description

- `--input_video` or `-input`: set the input video name.
- `--time_step` or `-ts`: set the interpolation multiplier; 0.5 corresponds to 2X, 0.25 to 4X, and 0.125 to 8X.
- `--high_resolution` or `-hr`: defaults to False. A V100 does not have enough memory to run DAIN on 1080p(+) video; enabling this splits each frame into four blocks that are processed separately to reduce GPU memory usage.