mindspore-lab/mindediting

Introduction

MindEditing is an open-source toolkit based on MindSpore that contains state-of-the-art image and video models from the open-source community and Huawei Technologies Co., Ltd., such as IPT, FSRCNN, and BasicVSR. These models are mainly used for low-level vision tasks, such as super-resolution, denoising, deraining, and inpainting. MindEditing supports multiple platforms, including CPU, GPU, and Ascend; of course, you'll get an even better experience on Ascend.

Some demos:

  • Video super-resolution demo (Video_SR_Demo-1.-.Trim.mp4)
  • Video frame interpolation demo (Video_frame_Interpolation_Demo.mp4)

Main features
  • Easy to use

    We provide a unified entry point: simply specify a supported model name and configure the parameters in its yaml file to start your task.

  • Support multiple tasks

    MindEditing supports a variety of popular and contemporary tasks such as deblurring, denoising, super-resolution, and inpainting.

  • SOTA

    MindEditing provides state-of-the-art algorithms in deblurring, denoising, super-resolution, and inpainting tasks.

Multi-Task

With so many tasks, is there a single model that can handle several of them? Yes: the pre-trained image processing transformer (IPT). IPT is trained on corrupted images using multiple task-specific heads and tails attached to a shared body, and contrastive learning is introduced so that the model adapts well to different image processing tasks. The pre-trained model can therefore be efficiently employed on a desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks. A minimal sketch of the multi-head/multi-tail idea is shown below.
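The following is an illustrative sketch of the multi-head/multi-tail design, assuming MindSpore's nn API; it is not the actual IPT implementation, and the class name, layer sizes, and the convolutional stack standing in for the shared transformer body are all assumptions.

import mindspore.nn as nn

class MultiTaskModel(nn.Cell):
    # Illustrative multi-head/multi-tail model (hypothetical, not IPT itself).
    def __init__(self, num_tasks=4, channels=64):
        super().__init__()
        # One lightweight head and tail per task; the body is shared by all tasks.
        self.heads = nn.CellList([nn.Conv2d(3, channels, 3) for _ in range(num_tasks)])
        # Convolutional stand-in for the shared transformer body of the real IPT.
        self.body = nn.SequentialCell([nn.Conv2d(channels, channels, 3) for _ in range(4)])
        self.tails = nn.CellList([nn.Conv2d(channels, 3, 3) for _ in range(num_tasks)])

    def construct(self, x, task_id):
        feat = self.heads[task_id](x)     # task-specific encoding
        feat = self.body(feat)            # shared feature processing
        return self.tails[task_id](feat)  # task-specific reconstruction

Here task_id selects which head/tail pair is active, while the shared body is reused across all tasks.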

Excellent performance
  • Compared with the state-of-the-art image processing models on different tasks, the IPT model performs better
  • Excels at multiple low-level vision tasks

    Compared with the state-of-the-art methods, the IPT model achieves the best performance.

  • Generalization Ability

    Generalization ability (Table 4) of the IPT model on color image denoising with different noise levels.

  • The performance of CNN and IPT models using different percentages of data

    When the pre-training data is limited, the CNN model obtains better performance. As the data volume increases, the Transformer-based IPT model gains significant performance improvements, and the curve (Table 5) also shows the promising potential of the IPT model.

  • Impressive inference results on real images

    • Image Super-resolution task

    The figure below shows super-resolution results with bicubic downsampling (×4) on Urban100. The IPT model recovers more details.

    • Image Denoising task

    It is worth pointing out that IPT won the CVPR 2023 NTIRE Image Denoising track championship.

    The figure below shows color image denoising results with noise level σ = 50.

    • Image Deraining task

    The figure below shows image deraining results on the Rain100L dataset.

Dependency

  • mindspore >=1.9
  • numpy =1.19.5
  • scikit-image =0.19.3
  • pyyaml =5.1
  • pillow =9.3.0
  • lmdb =1.3.0
  • h5py =3.7.0
  • imageio =2.25.1
  • munch =2.5.0

Python can be installed via Conda.

Install Miniconda:

cd /tmp
curl -O https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py37_4.10.3-Linux-$(arch).sh
bash Miniconda3-py37_4.10.3-Linux-$(arch).sh -b
cd -
. ~/miniconda3/etc/profile.d/conda.sh
conda init bash

Create a virtual environment, taking Python 3.7.5 as an example:

conda create -n mindspore_py37 python=3.7.5 -y
conda activate mindspore_py37

Check the Python version.

python --version

To install the dependency, please run:

pip install -r requirements.txt

MindSpore (>=1.9) can be easily installed by following the official instructions, where you can select the hardware platform that fits you best. To run in distributed mode, OpenMPI is required; a sketch of a distributed launch is shown below.
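For example, a multi-process training job could be launched through OpenMPI as sketched below; the worker count and the exact launch procedure depend on your setup, so consult the MindSpore distributed-training documentation.

# Hypothetical launch: run 8 training processes in parallel with OpenMPI.
mpirun -n 8 python3 train.py --config_path ./configs/basicvsr/train.yaml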

Get Started

We provide boot files for training and validation; choose a different model config to start. Please see the documents for more basic usage of MindEditing.

python3 train.py --config_path ./configs/basicvsr/train.yaml
# or
python3 val.py --config_path ./configs/basicvsr/val.yaml
  • Graph Mode and Pynative Mode

    Graph mode is optimized for efficiency and parallel computing using a compiled static graph, while pynative mode is optimized for flexibility and easy development. You may alter the parameter system.context_mode in the model config file to switch to pure pynative mode for development purposes, as in the sketch below.
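    For illustration, the relevant part of a model config might look like the following; the key layout and the value shown are assumptions, so check the yaml files under ./configs/ for the exact structure and accepted values.

    system:
      context_mode: pynative_mode  # assumed value; use the graph-mode setting for large-scale training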

News

MindEditing currently has a 0.x branch, and a 1.x branch will be added in the future. You'll find more features in the 1.x branch, so stay tuned.


  • April 6, 2023

    The model (MPFER) from Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations is coming soon. Stay tuned.

  • March 15, 2023

    The inference code and demos of Tunable Conv have been added as test cases; you can find them in ./tests/. The training code is coming soon. Tunable Conv ships four demo models: NAFNet for modulated image denoising, SwinIR for modulated image denoising and perceptual super-resolution, EDSR for modulated joint image denoising and deblurring, and StyleNet for modulated style transfer.

Parallel Performance

Increasing the number of parallel workers can speed up training. The following is an experiment with an example model on a 16-core CPU and 2x P100 GPUs:

num_parallel_workers: 8
epoch 1/100 step 1/133, loss = 0.045729052, duration_time = 00:01:07, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.027709303, duration_time = 00:01:20, step_time_avg = 6.66 secs, eta = 1 day(s) 00:36:02
epoch 1/100 step 3/133, loss = 0.027135072, duration_time = 00:01:33, step_time_avg = 8.74 secs, eta = 1 day(s) 08:17:56

num_parallel_workers: 16
epoch 1/100 step 1/133, loss = 0.04535071, duration_time = 00:00:47, step_time_avg = 0.00 secs, eta = 00:00:00
epoch 1/100 step 2/133, loss = 0.032363698, duration_time = 00:01:00, step_time_avg = 6.74 secs, eta = 1 day(s) 00:54:38
epoch 1/100 step 3/133, loss = 0.02718924, duration_time = 00:01:13, step_time_avg = 8.83 secs, eta = 1 day(s) 08:36:07
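The worker count is read from the model's yaml config. The placement sketched below is an assumption; check the config files under ./configs/ for the exact key location.

dataset:
  num_parallel_workers: 16  # number of parallel data-loading workers (assumed placement)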

Tutorials

The following tutorials are provided to help users learn to use MindEditing.

Model List

| model_name | task | Conference | Support platform | Download |
|---|---|---|---|---|
| IPT | Multi-Task | CVPR 2021 | Ascend/GPU | ckpt |
| BasicVSR | Video Super-Resolution | CVPR 2021 | Ascend/GPU | ckpt |
| BasicVSR++Light | Video Super-Resolution | CVPR 2022 | Ascend/GPU | ckpt |
| NOAHTCV | Image Denoising | CVPR 2021 (MAI Challenge) | Ascend/GPU | ckpt |
| RRDB | Image Super-Resolution | ECCVW 2018 | Ascend/GPU | ckpt |
| FSRCNN | Image Super-Resolution | ECCV 2016 | Ascend/GPU | ckpt |
| SRDiff | Image Super-Resolution | Neurocomputing 2022 | Ascend/GPU | ckpt |
| VRT | Multi-Task | arXiv (2022.01) | Ascend/GPU | ckpt |
| RVRT | Multi-Task | arXiv (2022.06) | Ascend/GPU | ckpt |
| TTVSR | Video Super-Resolution | CVPR 2022 | Ascend/GPU | ckpt |
| MIMO-Unet | Image Deblurring | ICCV 2021 | Ascend/GPU | ckpt |
| NAFNet | Image Deblurring | arXiv (2022.04) | Ascend/GPU | ckpt |
| CTSDG | Image Inpainting | ICCV 2021 | Ascend/GPU | ckpt |
| EMVD | Video Denoising | CVPR 2021 | Ascend/GPU | ckpt |
| Tunable_Conv | Tunable task (image processing) | arXiv (2023.04) | Ascend/GPU | ckpt |
| IFR+ | Video Frame Interpolation | CVPR 2022 | Ascend/GPU | ckpt |
| MPFER | 3D-based Multi-Frame Denoising (coming soon) | arXiv (2023.04) | GPU | ckpt |

Download: The model files are available in .ckpt and .om formats; you can download the corresponding files for your research work.

  • The .ckpt files can be downloaded by clicking the corresponding links in the Download column of the table above.
  • The .om files can be found here. For details about how to use the .om files, see deploy.
  • Multi-task model weights can be downloaded according to the task division of the corresponding model files; for the selection of model files, refer to the yaml files of the different models in the configs folder.
  • For models that require SPyNet or VGG pretrained weights, these can also be downloaded from the corresponding model links.

Please refer to the ModelZoo Homepage or the documentation under the docs folder for more details on the models.

License

This project is released under the Apache License 2.0 open-source license.

Feedbacks and Contact

The dynamic version is still under development. If you find any issue or have an idea for a new feature, please don't hesitate to contact us by filing an issue.

Acknowledgement

MindSpore is an open-source project that welcomes all contributions and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit to reimplement existing methods and to develop new computer vision methods.

If you find MindEditing useful in your research, please consider citing the following related papers:

@misc{mindediting2022,
    title={{MindEditing}: MindEditing for low-level vision tasks},
    author={MindEditing},
    howpublished={\url{https://github.com/mindspore-lab/mindediting}},
    year={2022}
}

Projects in MindSpore-Lab

  • MindCV: A toolbox of vision models and algorithms based on MindSpore.
  • MindNLP: An open-source NLP library based on MindSpore.
  • MindDiffusion: A collection of diffusion models based on MindSpore.
  • MindFace: An open-source toolkit based on MindSpore, containing the most advanced face recognition and detection models, such as ArcFace and RetinaFace.
  • MindAudio: An open-source all-in-one toolkit for the voice field based on MindSpore.
  • MindOCR: A toolbox of OCR models, algorithms, and pipelines based on MindSpore.
  • MindRL: A high-performance, scalable MindSpore reinforcement learning framework.
  • MindREC: A MindSpore large-scale recommender system library.
  • MindPose: An open-source toolbox for pose estimation based on MindSpore.
