[Docs] Refact readme #712

Merged · 6 commits · Jan 27, 2022
4 changes: 2 additions & 2 deletions .dev_scripts/github/update_model_index.py
```diff
@@ -151,8 +151,8 @@ def parse_md(md_file):
         collection_name = name
         while i < len(lines):
             # parse reference
-            if lines[i].startswith('<!-- [PAPER_URL:'):
-                url = re.match(r'<!-- \[PAPER_URL: (.*?)] -->', lines[i])
+            if lines[i].startswith('> ['):
+                url = re.match(r'> \[.*]\((.*)\)', lines[i])
                 url = url.groups()[0]
                 collection['Paper'].append(url)
                 i += 1
```
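Not part of the diff, but for context: the refactored READMEs now carry the paper reference as a markdown blockquote link under the top heading, so the script reads the URL from that line instead of the old `<!-- [PAPER_URL: ...] -->` comment. A minimal stand-alone sketch of what the new rule matches (the example line is taken from the DeepFillv1 README below):

```python
import re

# Illustrative only: the blockquote reference line used by the refactored READMEs.
line = '> [Generative Image Inpainting with Contextual Attention](https://arxiv.org/abs/1801.07892)'

if line.startswith('> ['):
    url = re.match(r'> \[.*]\((.*)\)', line).groups()[0]
    print(url)  # https://arxiv.org/abs/1801.07892
```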
40 changes: 20 additions & 20 deletions configs/inpainting/deepfillv1/README.md
@@ -1,32 +1,19 @@
# DeepFillv1 (CVPR'2018)

> [Generative Image Inpainting with Contextual Attention](https://arxiv.org/abs/1801.07892)

<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

Recent deep learning based approaches have shown promising results for the challenging task of inpainting large missing regions in an image. These methods can generate visually plausible image structures and textures, but often create distorted structures or blurry textures inconsistent with surrounding areas. This is mainly due to the ineffectiveness of convolutional neural networks in explicitly borrowing or copying information from distant spatial locations. On the other hand, traditional texture and patch synthesis approaches are particularly suitable when textures need to be borrowed from the surrounding regions. Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions. The model is a feed-forward, fully convolutional neural network which can process images with multiple holes at arbitrary locations and with variable sizes at test time. Experiments on multiple datasets including faces (CelebA, CelebA-HQ), textures (DTD) and natural images (ImageNet, Places2) demonstrate that our proposed approach generates higher-quality inpainting results than existing ones.
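A rough sketch of the contextual attention idea the abstract describes (not part of this PR and not mmediting's implementation; PyTorch assumed, patch size and softmax scale illustrative): hole features are reconstructed as a softmax-weighted combination of background patches matched by cosine similarity.

```python
import torch
import torch.nn.functional as F


def contextual_attention(foreground, background, patch_size=3, softmax_scale=10.0):
    """Reconstruct hole (foreground) features as a softmax-weighted combination of
    background patches, matched by cosine similarity (toy, single-scale version)."""
    b, c, _, _ = background.shape
    # Turn every background patch into a convolution kernel: (B, N, C, k, k).
    patches = F.unfold(background, kernel_size=patch_size, padding=patch_size // 2)
    patches = patches.transpose(1, 2).reshape(b, -1, c, patch_size, patch_size)
    outputs = []
    for i in range(b):
        kernels = patches[i]                                         # (N, C, k, k)
        norms = kernels.flatten(1).norm(dim=1).clamp(min=1e-8)
        # Cosine similarity between each foreground location and each background patch.
        scores = F.conv2d(foreground[i:i + 1], kernels / norms.view(-1, 1, 1, 1),
                          padding=patch_size // 2)                   # (1, N, H, W)
        attn = F.softmax(scores * softmax_scale, dim=1)
        # Paste the matched background patches back, weighted by the attention map.
        out = F.conv_transpose2d(attn, kernels, padding=patch_size // 2) / patch_size ** 2
        outputs.append(out)
    return torch.cat(outputs, dim=0)


fg = torch.randn(1, 64, 32, 32)   # features of the masked region
bg = torch.randn(1, 64, 32, 32)   # features of the known surroundings
print(contextual_attention(fg, bg).shape)   # torch.Size([1, 64, 32, 32])
```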

<!-- [IMAGE] -->
<p align="center">
<img src="https://user-images.githubusercontent.com/12726765/144174665-9675931f-e448-4475-a659-99b65e7d4a64.png" />
</p>

<!-- [PAPER_TITLE: Generative Image Inpainting with Contextual Attention] -->
<!-- [PAPER_URL: https://arxiv.org/abs/1801.07892] -->

## Citation

<!-- [ALGORITHM] -->

```bibtex
@inproceedings{yu2018generative,
title={Generative image inpainting with contextual attention},
author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5505--5514},
year={2018}
}
```
<div align=center >
<img src="https://user-images.githubusercontent.com/12726765/144174665-9675931f-e448-4475-a659-99b65e7d4a64.png" width="400"/>
</div >

## Results and models

@@ -41,3 +28,16 @@
| Method | Mask Type | Resolution | Train Iters | Test Set | l1 error | PSNR | SSIM | Download |
| :--------------------------------------------------------------------------: | :---------: | :--------: | :---------: | :--------: | :------: | :----: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [DeepFillv1](/configs/inpainting/deepfillv1/deepfillv1_256x256_4x4_celeba.py) | square bbox | 256x256 | 1500k | CelebA-val | 6.677 | 26.878 | 0.911 | [model](https://download.openmmlab.com/mmediting/inpainting/deepfillv1/deepfillv1_256x256_4x4_celeba_20200619-dd51a855.pth) \| [log](https://download.openmmlab.com/mmediting/inpainting/deepfillv1/deepfillv1_256x256_4x4_celeba_20200619-dd51a855.log.json) |


## Citation

```bibtex
@inproceedings{yu2018generative,
title={Generative image inpainting with contextual attention},
author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={5505--5514},
year={2018}
}
```
40 changes: 20 additions & 20 deletions configs/inpainting/deepfillv2/README.md
@@ -1,32 +1,19 @@
# DeepFillv2 (CVPR'2019)

> [Free-Form Image Inpainting with Gated Convolution](https://arxiv.org/abs/1806.03589)

<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

We present a generative image inpainting system to complete images with a free-form mask and guidance. The system is based on gated convolutions learned from millions of images without additional labelling efforts. The proposed gated convolution solves the issue of vanilla convolution, which treats all input pixels as valid, and generalizes partial convolution by providing a learnable dynamic feature selection mechanism for each channel at each spatial location across all layers. Moreover, as free-form masks may appear anywhere in images with any shape, global and local GANs designed for a single rectangular mask are not applicable. Thus, we also present a patch-based GAN loss, named SN-PatchGAN, by applying a spectral-normalized discriminator on dense image patches. SN-PatchGAN is simple in formulation, fast and stable in training. Results on automatic image inpainting and user-guided extension demonstrate that our system generates higher-quality and more flexible results than previous methods. Our system helps users quickly remove distracting objects, modify image layouts, clear watermarks and edit faces.
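A rough sketch of the gated convolution described above (not part of this PR; PyTorch assumed, layer sizes illustrative): a second convolution produces a sigmoid gate that acts as a learnable, per-channel, per-location soft mask on the feature response.

```python
import torch
import torch.nn as nn


class GatedConv2d(nn.Module):
    """Gated convolution: output = activation(feature(x)) * sigmoid(gate(x))."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.feature = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        self.gate = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        self.act = nn.ELU()

    def forward(self, x):
        # The gate learns, per channel and per spatial location, how much of the
        # feature response to let through: a soft, learnable feature selection.
        return self.act(self.feature(x)) * torch.sigmoid(self.gate(x))


layer = GatedConv2d(4, 48, kernel_size=3, padding=1)   # masked RGB image + mask channel in
x = torch.randn(1, 4, 64, 64)
print(layer(x).shape)                                   # torch.Size([1, 48, 64, 64])
```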

<!-- [IMAGE] -->
<p align="center">
<img src="https://user-images.githubusercontent.com/12726765/144175160-75473789-924f-490b-ab25-4c4f252fa55f.png" />
</p>

<!-- [PAPER_TITLE: Free-Form Image Inpainting with Gated Convolution] -->
<!-- [PAPER_URL: https://arxiv.org/abs/1806.03589] -->

## Citation

<!-- [ALGORITHM] -->

```bibtex
@inproceedings{yu2019free,
title={Free-form image inpainting with gated convolution},
author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={4471--4480},
year={2019}
}
```
<div align=center >
<img src="https://user-images.githubusercontent.com/12726765/144175160-75473789-924f-490b-ab25-4c4f252fa55f.png" width="400"/>
</div >

## Results and models

@@ -41,3 +28,16 @@
| Method | Mask Type | Resolution | Train Iters | Test Set | l1 error | PSNR | SSIM | Download |
| :--------------------------------------------------------------------------: | :-------: | :--------: | :---------: | :--------: | :------: | :----: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [DeepFillv2](/configs/inpainting/deepfillv2/deepfillv2_256x256_8x2_celeba.py) | free-form | 256x256 | 20k | CelebA-val | 5.411 | 25.721 | 0.871 | [model](https://download.openmmlab.com/mmediting/inpainting/deepfillv2/deepfillv2_256x256_8x2_celeba_20200619-c96e5f12.pth) \| [log](https://download.openmmlab.com/mmediting/inpainting/deepfillv2/deepfillv2_256x256_8x2_celeba_20200619-c96e5f12.log.json) |


## Citation

```bibtex
@inproceedings{yu2019free,
title={Free-form image inpainting with gated convolution},
author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
pages={4471--4480},
year={2019}
}
```
46 changes: 23 additions & 23 deletions configs/inpainting/global_local/README.md
@@ -1,35 +1,19 @@
# Global&Local (ToG'2017)

> [Globally and Locally Consistent Image Completion](http://iizuka.cs.tsukuba.ac.jp/projects/completion/data/completion_sig2017.pdf)

<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

We present a novel approach for image completion that results in images that are both locally and globally consistent. With a fully-convolutional neural network, we can complete images of arbitrary resolutions by filling in missing regions of any shape. To train this image completion network to be consistent, we use global and local context discriminators that are trained to distinguish real images from completed ones. The global discriminator looks at the entire image to assess if it is coherent as a whole, while the local discriminator looks only at a small area centered at the completed region to ensure the local consistency of the generated patches. The image completion network is then trained to fool both context discriminator networks, which requires it to generate images that are indistinguishable from real ones with regard to overall consistency as well as in details. We show that our approach can be used to complete a wide variety of scenes. Furthermore, in contrast with the patch-based approaches such as PatchMatch, our approach can generate fragments that do not appear elsewhere in the image, which allows us to naturally complete the image.
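A rough sketch of the two-discriminator idea (not from this PR; PyTorch assumed, channel widths illustrative): one branch scores the whole image, the other scores a crop around the completed region, and their features are fused for the final real/fake decision.

```python
import torch
import torch.nn as nn


class GlobalLocalDiscriminator(nn.Module):
    """Global branch sees the full image, local branch sees a crop around the
    completed region; both feature vectors are fused into one real/fake logit."""

    def __init__(self):
        super().__init__()

        def branch():
            return nn.Sequential(
                nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
                nn.Conv2d(128, 256, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())

        self.global_branch = branch()
        self.local_branch = branch()
        self.fc = nn.Linear(256 + 256, 1)

    def forward(self, full_image, local_patch):
        g = self.global_branch(full_image)     # (B, 256)
        p = self.local_branch(local_patch)     # (B, 256)
        return self.fc(torch.cat([g, p], dim=1))


disc = GlobalLocalDiscriminator()
full = torch.randn(2, 3, 256, 256)
patch = torch.randn(2, 3, 128, 128)            # crop centered on the completed region
print(disc(full, patch).shape)                 # torch.Size([2, 1])
```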

<!-- [IMAGE] -->
<p align="center">
<img src="https://user-images.githubusercontent.com/12726765/144175196-51dfda11-f7e1-4c7e-abed-42799f757bef.png" />
</p>

<!-- [PAPER_TITLE: Globally and Locally Consistent Image Completion] -->
<!-- [PAPER_URL: http://iizuka.cs.tsukuba.ac.jp/projects/completion/data/completion_sig2017.pdf] -->

## Citation

<!-- [ALGORITHM] -->

```bibtex
@article{iizuka2017globally,
title={Globally and locally consistent image completion},
author={Iizuka, Satoshi and Simo-Serra, Edgar and Ishikawa, Hiroshi},
journal={ACM Transactions on Graphics (ToG)},
volume={36},
number={4},
pages={1--14},
year={2017},
publisher={ACM New York, NY, USA}
}
```
<div align=center >
<img src="https://user-images.githubusercontent.com/12726765/144175196-51dfda11-f7e1-4c7e-abed-42799f757bef.png" width="400"/>
</div >

## Results and models

@@ -46,3 +30,19 @@
| Method | Mask Type | Resolution | Train Iters | Test Set | l1 error | PSNR | SSIM | Download |
| :-----------------------------------------------------------------------: | :---------: | :--------: | :---------: | :--------: | :------: | :----: | :---: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [Global&Local](/configs/inpainting/global_local/gl_256x256_8x12_celeba.py) | square bbox | 256x256 | 500k | CelebA-val | 6.678 | 26.780 | 0.904 | [model](https://download.openmmlab.com/mmediting/inpainting/global_local/gl_256x256_8x12_celeba_20200619-5af0493f.pth) \| [log](https://download.openmmlab.com/mmediting/inpainting/global_local/gl_256x256_8x12_celeba_20200619-5af0493f.log.json) |


## Citation

```bibtex
@article{iizuka2017globally,
title={Globally and locally consistent image completion},
author={Iizuka, Satoshi and Simo-Serra, Edgar and Ishikawa, Hiroshi},
journal={ACM Transactions on Graphics (ToG)},
volume={36},
number={4},
pages={1--14},
year={2017},
publisher={ACM New York, NY, USA}
}
```
40 changes: 20 additions & 20 deletions configs/inpainting/partial_conv/README.md
@@ -1,32 +1,19 @@
# PConv (ECCV'2018)

> [Image Inpainting for Irregular Holes Using Partial Convolutions](https://arxiv.org/abs/1804.07723)

<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels and the substitute values in the masked holes (typically the mean value). This often leads to artifacts such as color discrepancy and blurriness. Post-processing is usually used to reduce such artifacts, but it is expensive and may fail. We propose the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels. We further include a mechanism to automatically generate an updated mask for the next layer as part of the forward pass. Our model outperforms other methods for irregular masks. We show qualitative and quantitative comparisons with other methods to validate our approach.
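A simplified sketch of a partial convolution layer as described above (not part of this PR and not the official implementation; PyTorch assumed, bias handling simplified): the convolution runs only on valid pixels, the result is re-normalized by the number of valid pixels under each window, and an updated mask is returned for the next layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PartialConv2d(nn.Module):
    """Convolution over valid pixels only, re-normalized by the number of valid
    pixels under each window, with an automatically updated mask."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # Fixed all-ones kernel used only to count valid pixels under each window.
        self.register_buffer('ones', torch.ones(1, 1, kernel_size, kernel_size))
        self.stride, self.padding = stride, padding

    def forward(self, x, mask):
        # x: (B, C, H, W) features; mask: (B, 1, H, W) with 1 = valid pixel, 0 = hole.
        with torch.no_grad():
            valid = F.conv2d(mask, self.ones, stride=self.stride, padding=self.padding)
        out = self.conv(x * mask)
        # Re-normalize by the fraction of valid pixels; zero windows with none.
        out = out * (self.ones.numel() / valid.clamp(min=1e-8))
        out = out.masked_fill(valid == 0, 0.0)
        updated_mask = (valid > 0).float()
        return out, updated_mask


pconv = PartialConv2d(3, 16, kernel_size=3, padding=1)
img = torch.randn(1, 3, 64, 64)
mask = (torch.rand(1, 1, 64, 64) > 0.3).float()
out, new_mask = pconv(img, mask)
print(out.shape, new_mask.shape)   # torch.Size([1, 16, 64, 64]) torch.Size([1, 1, 64, 64])
```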

<!-- [IMAGE] -->
<p align="center">
<img src="https://user-images.githubusercontent.com/12726765/144175613-1bc9ad1b-072d-4c1f-a97d-1af5be2590bd.png" />
</p>

<!-- [PAPER_TITLE: Image Inpainting for Irregular Holes Using Partial Convolutions] -->
<!-- [PAPER_URL: https://arxiv.org/abs/1804.07723] -->

## Citation

<!-- [ALGORITHM] -->

```bibtex
@inproceedings{liu2018image,
title={Image inpainting for irregular holes using partial convolutions},
author={Liu, Guilin and Reda, Fitsum A and Shih, Kevin J and Wang, Ting-Chun and Tao, Andrew and Catanzaro, Bryan},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={85--100},
year={2018}
}
```
<div align=center >
<img src="https://user-images.githubusercontent.com/12726765/144175613-1bc9ad1b-072d-4c1f-a97d-1af5be2590bd.png" width="400"/>
</div >

## Results and models

@@ -41,3 +28,16 @@
| Method | Mask Type | Resolution | Train Iters | Test Set | l1 error | PSNR | SSIM | Download |
| :-------------------------------------------------------------------------: | :-------: | :--------: | :---------: | :--------: | :------: | :----: | :---: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [PConv](/configs/inpainting/partial_conv/pconv_256x256_stage2_4x2_celeba.py) | free-form | 256x256 | 500k | CelebA-val | 5.990 | 25.404 | 0.853 | [model](https://download.openmmlab.com/mmediting/inpainting/pconv/pconv_256x256_stage2_4x2_celeba_20200619-860f8b95.pth) \| [log](https://download.openmmlab.com/mmediting/inpainting/pconv/pconv_256x256_stage2_4x2_celeba_20200619-860f8b95.log.json) |


## Citation

```bibtex
@inproceedings{liu2018image,
title={Image inpainting for irregular holes using partial convolutions},
author={Liu, Guilin and Reda, Fitsum A and Shih, Kevin J and Wang, Ting-Chun and Tao, Andrew and Catanzaro, Bryan},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
pages={85--100},
year={2018}
}
```
40 changes: 20 additions & 20 deletions configs/mattors/dim/README.md
@@ -1,32 +1,19 @@
# DIM (CVPR'2017)

> [Deep Image Matting](https://arxiv.org/abs/1703.03872)

<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

Image matting is a fundamental computer vision problem and has many applications. Previous algorithms have poor performance when an image has similar foreground and background colors or complicated textures. The main reasons are that prior methods 1) only use low-level features and 2) lack high-level context. In this paper, we propose a novel deep learning based algorithm that can tackle both these problems. Our deep model has two parts. The first part is a deep convolutional encoder-decoder network that takes an image and the corresponding trimap as inputs and predicts the alpha matte of the image. The second part is a small convolutional network that refines the alpha matte predictions of the first network to have more accurate alpha values and sharper edges. In addition, we create a large-scale image matting dataset including 49300 training images and 1000 testing images. We evaluate our algorithm on the image matting benchmark, our testing set, and a wide variety of real images. Experimental results clearly demonstrate the superiority of our algorithm over previous methods.
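A toy sketch of the two-stage design described above (not from this PR; PyTorch assumed, the tiny networks are placeholders for the real encoder-decoder and refinement stages): stage one predicts a coarse alpha matte from the image plus trimap, stage two refines it.

```python
import torch
import torch.nn as nn


class SimpleDIM(nn.Module):
    """Stage 1: encoder-decoder predicts a coarse alpha matte from RGB + trimap.
    Stage 2: a small network refines the coarse matte (toy-sized placeholders)."""

    def __init__(self):
        super().__init__()
        self.coarse_net = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid())
        self.refine_net = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, image, trimap):
        x = torch.cat([image, trimap], dim=1)                  # (B, 4, H, W)
        coarse = self.coarse_net(x)                            # coarse alpha matte
        refined = self.refine_net(torch.cat([image, coarse], dim=1))
        return coarse, refined


model = SimpleDIM()
image = torch.randn(1, 3, 320, 320)
trimap = torch.rand(1, 1, 320, 320)    # 0 = background, 1 = foreground, in between = unknown
coarse, refined = model(image, trimap)
print(coarse.shape, refined.shape)     # both torch.Size([1, 1, 320, 320])
```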

<!-- [IMAGE] -->
<p align="center">
<img src="https://user-images.githubusercontent.com/12726765/144175771-05b4d8f5-1abc-48ee-a5f1-8cc89a156e27.png" />
</p>

<!-- [PAPER_TITLE: Deep Image Matting] -->
<!-- [PAPER_URL: https://arxiv.org/abs/1703.03872] -->

## Citation

<!-- [ALGORITHM] -->

```bibtex
@inproceedings{xu2017deep,
title={Deep image matting},
author={Xu, Ning and Price, Brian and Cohen, Scott and Huang, Thomas},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={2970--2979},
year={2017}
}
```
<div align=center >
<img src="https://user-images.githubusercontent.com/12726765/144175771-05b4d8f5-1abc-48ee-a5f1-8cc89a156e27.png" width="400"/>
</div >

## Results and models

@@ -47,3 +34,16 @@
> The performance of the model is not stable during training. Thus, the reported performance is not that of the last checkpoint, but the best performance across all validations during training.

> The best performance achieved during training varies widely across random seeds. You may need to run several experiments for each setting to obtain the performance above.


## Citation

```bibtex
@inproceedings{xu2017deep,
title={Deep image matting},
author={Xu, Ning and Price, Brian and Cohen, Scott and Huang, Thomas},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={2970--2979},
year={2017}
}
```