This repository contains download links for, and an introduction to, the Multi-View Multi-Distortion Image Dataset (MVMDD) for the IPSN 2020 paper "CollabAR: Edge-assisted Collaborative Image Recognition for Mobile Augmented Reality" by Zida Liu, Guohao Lan, Jovan Stojkovic, Yunfan Zhang, Carlee Joe-Wong, and Maria Gorlatova.
If you have any questions on this repository or the related paper, please create an issue or send me an email. Email address: zida.liu AT duke.edu.
Summary:
To study the impact of image distortion on multi-view augmented reality systems, we created the Multi-View Multi-Distortion Image Dataset (MVMDD). The dataset includes a pristine Multi-view image set (i.e., clear images without distortion) and an augmented distortion Multi-view image set. The detailed information about the collected MVMDD dataset is presented below.
The pristine images are collected using a commodity Nokia 7.1 smartphone. The resolution of the original images is 3024x4032. Six categories of everyday objects are considered: cup, phone, bottle, book, bag, and pen. Each category has six instances. For each instance, images are taken from six different views (six angles, with a 60-degree difference between any two adjacent views), at two background complexity levels (a clear white table background and a noisy background containing other non-target objects), and at three distances. We adjust the distance between the camera and the object so that the object appears at a different size in the image. At the three distances, the object occupies approximately the whole, half, and one-tenth of the total image area, respectively. The details are summarized in the table below:
Attribute | Count |
---|---|
Object categories | 6 |
Number of views | 6 |
Background complexity | 2 |
Size of object in an image | 3 |
Number of instances | 6 |
Total pristine images | 6 x 6 x 2 x 3 x 6 = 1,296 |
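The total can be double-checked by enumerating every combination of the five factors. The following is a minimal sketch; the factor labels and the function name are illustrative and are not identifiers defined by the dataset.

```python
from itertools import product

# Factor sizes taken from the table above; the dictionary keys are illustrative labels.
FACTORS = {
    "category": 6,
    "instance": 6,
    "view": 6,
    "background": 2,
    "distance": 3,
}


def enumerate_combinations(factors=FACTORS):
    """Return every index tuple, one per pristine image, using 1-based indices."""
    ranges = [range(1, count + 1) for count in factors.values()]
    return list(product(*ranges))


combinations = enumerate_combinations()
print(len(combinations))  # 6 * 6 * 6 * 2 * 3 = 1296
```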
We apply data augmentation techniques to the pristine image set to generate a new augmented image set. Specifically, three types of image distortion are considered: motion blur, Gaussian blur, and Gaussian noise. Images captured by smartphone or head-mounted AR cameras frequently contain motion blur caused by the motion of the user. Gaussian blur appears when the camera is out of focus or when the image is taken underwater or in a foggy environment. Gaussian noise is inevitable in images because of poor illumination conditions, digital zooming, and the use of a low-quality image sensor.
For each type of distortion, three distortion levels are considered. We use the following models to augment the images:
- Motion blur:
  - Sun, Jian, Wenfei Cao, Zongben Xu, and Jean Ponce. "Learning a convolutional neural network for non-uniform motion blur removal." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 769-777. 2015.
- Gaussian blur:
  - Flusser, Jan, Sajad Farokhi, Cyril Höschl, Tomáš Suk, Barbara Zitová, and Matteo Pedone. "Recognition of images degraded by Gaussian blur." IEEE Transactions on Image Processing 25, no. 2 (2015): 790-806.
- Gaussian noise:
  - Liu, Wei, and Weisi Lin. "Additive white Gaussian noise level estimation in SVD domain for images." IEEE Transactions on Image Processing 22, no. 3 (2012): 872-883.
- The pristine image set can be downloaded via https://1drv.ms/u/s!Aqyf-lNI69G1hBi5mn31KDNzuw2u?e=qxX2gs
- An augmented distortion image set can be downloaded via https://drive.google.com/file/d/1GHtqs2B3Unuhej-BnvZ2QbRCgCPULPvq/view?usp=sharing. It contains images at three different distortion levels for each distortion category.
Distortion category | Distortion parameter | Level 1 | Level 2 | Level 3 |
---|---|---|---|---|
Motion blur | Blur kernel length | 10 | 20 | 30 |
Gaussian blur | Aperture size | 11 | 21 | 31 |
Gaussian noise | Variance | 0.01 | 0.02 | 0.03 |
- Data augmentation source code is provided for generating your own augmented image set.
The pristine image set follows a hierarchical file structure shown below. The two sub-folders, Clear_Background and Complex_Background, correspond to the two background complexities. In each of the sub-folders, there are six folders that correspond to the six object categories.
- The tree structure of the dataset folder:
MVMDD
└───Clear_Background
│ │
│ └───bags
│ │ bag1_view1_distance1.jpg
│ │ bag1_view1_distance2.jpg
│ │ ...
│ └───books
│ └───bottles
│ └───cups
│ └───pens
│ └───phones
│
└───Complex_Background
│ │
│ └───bags
│ └───books
│ └───bottles
| ...
The images are named in the format (category)(instance number)_view(view number)_distance(distance number).jpg, where (category) is the object category prefix, such as bag, and:
- (instance number) corresponds to one of the six instances,
- (view number) corresponds to one of the six views,
- (distance number) corresponds to one of the three distances.
For instance, the image with name 'bag1_view1_distance1.jpg' corresponds to the image of instance #1 of bag captured at distance1 from view1.
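The naming convention can be parsed with a regular expression. The following is a minimal sketch: the pattern is inferred from the example file name above, and the names ImageInfo and parse_image_name are hypothetical helpers, not part of the dataset's source code.

```python
import re
from typing import NamedTuple


class ImageInfo(NamedTuple):
    category: str
    instance: int
    view: int
    distance: int


# Pattern inferred from the example 'bag1_view1_distance1.jpg' (an assumption).
_NAME_RE = re.compile(r"^([A-Za-z]+)(\d+)_view(\d+)_distance(\d+)\.jpg$")


def parse_image_name(filename: str) -> ImageInfo:
    """Split an MVMDD-style file name into category, instance, view, and distance."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename!r}")
    category, instance, view, distance = match.groups()
    return ImageInfo(category, int(instance), int(view), int(distance))


print(parse_image_name("bag1_view1_distance1.jpg"))
# ImageInfo(category='bag', instance=1, view=1, distance=1)
```

File names that do not match the pattern raise a ValueError, which makes unexpected files in a folder easy to spot.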
After downloading the pristine image set, one can create the distortion image set by running the Python script "distortion_generation.py". The script can be downloaded via https://github.com/CollabAR-Source/MVMDD/blob/master/distortion_generation.py
To generate distortion images, follow the procedure below:
- Before running the script, install the necessary libraries on your computer: OpenCV (the opencv-python package), scikit-image (imported as skimage), and NumPy.
- Then, put the script under the folder "MVMDD".
- Run the script as follows:
python .\distortion_generation.py -source_dir -distortion_type -distortion_degree
- source_dir: the directory that contains the pristine images.
- distortion_type: the type of distortion you would like to synthesize. There are three options available:
  - MB for motion blur
  - GB for Gaussian blur
  - GN for Gaussian noise
- distortion_degree: the distortion level you would like to set, given as the distortion parameter value (for example, 0.01 for Gaussian noise).
- The generated images will be saved in the generated folder.
The following is an example of generating Gaussian noise distorted images with distortion level 0.01 for all images in the ./Clear_Background folder: python .\distortion_generation.py .\Clear_Background\ GN 0.01.
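To regenerate every distortion type and level for a folder, the command above can be assembled in a loop. The following is a minimal sketch under two assumptions: the script takes its arguments positionally, as in the Gaussian noise example, and the motion blur and Gaussian blur parameter values from the distortion table are passed as distortion_degree in the same way. The names build_commands and run_all are hypothetical.

```python
import subprocess
import sys

# Parameter values per distortion type, copied from the distortion level table.
DISTORTION_LEVELS = {
    "MB": [10, 20, 30],        # motion blur: blur kernel length
    "GB": [11, 21, 31],        # Gaussian blur: aperture size
    "GN": [0.01, 0.02, 0.03],  # Gaussian noise: variance
}


def build_commands(source_dir, script="distortion_generation.py"):
    """Return one argument list per (distortion type, level) pair."""
    commands = []
    for distortion_type, levels in DISTORTION_LEVELS.items():
        for level in levels:
            commands.append(
                [sys.executable, script, source_dir, distortion_type, str(level)]
            )
    return commands


def run_all(source_dir):
    """Run the script once per command; assumes it is in the working directory."""
    for command in build_commands(source_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for command in build_commands("./Clear_Background"):
        print(" ".join(command))
```

Whether successive runs overwrite earlier output in the generated folder is not stated here, so it is worth checking the output of one run before batching all nine.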
We use the MVMDD dataset to build a collaborative image recognition system that improves the AR experience in heterogeneous environments. Here is a partial demonstration video of the system. You can find the full video and the demo paper here.
Please cite the following paper in your publications if the dataset helps your research.
@inproceedings{Liu20CollabAR,
title={{CollabAR}: Edge-assisted collaborative image recognition for mobile augmented reality},
author={Liu, Zida and Lan, Guohao and Stojkovic, Jovan and Zhang, Yunfan and Joe-Wong, Carlee and Gorlatova, Maria},
booktitle={Proceedings of the 19th ACM/IEEE Conference on Information Processing in Sensor Networks},
year={2020}
}
The authors of this dataset are Zida Liu, Juan Blanco, Guohao Lan, and Maria Gorlatova. This work was done in the Intelligent Interactive Internet of Things Lab at Duke University.
Contact Information of the contributors:
- zida.liu AT duke.edu
- juanmblanco AT me.com
- guohao.lan AT duke.edu
- maria.gorlatova AT duke.edu
This work is supported by the Lord Foundation of North Carolina and by NSF awards CSR-1903136 and CNS-1908051.