## Recommendations on Image Quality

After the pure colour analysis, we extracted EXIF data from the images and combined it with the colour feature data, which lets us check the effect of the camera model on each rock type. To show the results in a more intuitive and interactive way, we built a dashboard with Plotly Dash in which a user can select a rock type of interest and visualize the distribution of each colour feature across camera models. If a user wants to inspect peculiar values (or a range of values) of a feature, the system generates a spreadsheet listing the corresponding images so that the user can examine which ones have potential issues.
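
The export step can be sketched roughly as follows. This is a minimal illustration and not the dashboard's actual code: the record layout, the column names (`image`, `rock_type`, `camera`, `mean_px_red`), the file names and the values are all hypothetical, and CSV text stands in for the spreadsheet.

```python
import csv
import io

FIELDS = ["image", "rock_type", "camera", "mean_px_red"]  # hypothetical columns

def images_in_range(records, feature, low, high):
    """Return the records whose `feature` value lies in [low, high]."""
    return [r for r in records if low <= r[feature] <= high]

def to_csv(records, fieldnames=FIELDS):
    """Serialise the selected records as CSV text (stand-in for the spreadsheet)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical records: one row per labelled rock face.
records = [
    {"image": "img_001.jpg", "rock_type": "HEM", "camera": "camera_a", "mean_px_red": 92.4},
    {"image": "img_002.jpg", "rock_type": "HEM", "camera": "camera_b", "mean_px_red": 251.7},
    {"image": "img_003.jpg", "rock_type": "QR", "camera": "camera_a", "mean_px_red": 248.9},
]

# Select the rock faces whose red-channel mean falls in the suspicious range.
suspects = images_in_range(records, "mean_px_red", 240, 255)
report = to_csv(suspects)
```

Here `suspects` holds the two rows with a red-channel mean above 240, and `report` is the text that would be written to the spreadsheet file.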

### Labelling issues

The following figure shows the distribution of the average pixel intensity on the red channel for different rock types. There are small bumps at an intensity of around 250, which appear to indicate white space.

<img src="../../docs/00_images/03_eda_reports/04_meanpx_red_color.png" width="600" height="600"/>

We took a closer look at these peculiar values and found that rock faces labelled HEM and QR have a very high mean pixel intensity in the following images.

|image name|
|----------|
|20190504_bp-676-045-11_AS.jpg|
|20190504_bp-676-045-8_AS.JPG|
|20190504_bp-676-045-9_AS.JPG|
|20190504_bw-718-095-3_AS.JPG|
|190324_o_BP-676-048_179.jpg|

Let's take one of the images in the table above as an example. The following table shows the image as opened in File Explorer/Finder and the same file as opened in LabelMe. We can see that the polygon coordinates actually refer to a different image.

|System|Image|
|---|---|
|File Explorer/Finder|<img src="../../docs/00_images/03_eda_reports/05_sample_img.JPG" width="400" height="400"/>|
|LabelMe|<img src="../../docs/00_images/03_eda_reports/05_sample_img_lm.JPG" width="400" height="400"/>|

This becomes a problem when training our model, because we create a mask for each rock face by applying the polygon coordinates read from the tag file to the original image. If the mask does not contain the rock type we want, the model's performance will suffer.
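
To make the dependency on the polygon concrete, here is a minimal pure-Python sketch of turning a polygon into a mask and averaging a channel inside it. It is not the project's implementation: it uses even-odd ray casting on pixel centres, and the function names and the tiny 4x4 red channel are made up for illustration. A real pipeline would more likely rely on an image library.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def polygon_mask(width, height, polygon):
    """Boolean mask as a list of rows; each pixel centre is tested."""
    return [
        [point_in_polygon(col + 0.5, row + 0.5, polygon) for col in range(width)]
        for row in range(height)
    ]

def masked_mean(channel, mask):
    """Mean of the channel values where the mask is True (None if empty)."""
    values = [
        v
        for row_values, row_mask in zip(channel, mask)
        for v, m in zip(row_values, row_mask)
        if m
    ]
    return sum(values) / len(values) if values else None

# Hypothetical 4x4 red channel: a darker rock face on a near-white background.
red = [
    [250, 250, 250, 250],
    [250, 80, 90, 250],
    [250, 100, 110, 250],
    [250, 250, 250, 250],
]
good_polygon = [(1, 1), (3, 1), (3, 3), (1, 3)]   # covers the rock face
wrong_polygon = [(0, 0), (2, 0), (2, 1), (0, 1)]  # lands on the background

good_mean = masked_mean(red, polygon_mask(4, 4, good_polygon))    # 95.0
wrong_mean = masked_mean(red, polygon_mask(4, 4, wrong_polygon))  # 250.0
```

With the misplaced polygon, the mask covers only background pixels and the mean jumps to 250, the same kind of value seen in the bumps of the distribution plot.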

**Recommendation 1**: A LabelMe label file contains not only the coordinates we label manually but also the image itself. When you want to relabel the rocks in an image using LabelMe, delete the original label file (i.e. the .json file) before starting the relabelling task. We also suggest inspecting suspicious images with File Explorer/Finder instead of opening them with LabelMe.
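
A coarse automated check for this kind of mismatch could look like the sketch below. It assumes the usual LabelMe JSON layout with `imageWidth`, `imageHeight` and `shapes[].points` keys, and it assumes the real image size has already been read with an image library of your choice. The function name and the example label are hypothetical. Two different images of the same size would pass this check, so it complements a visual inspection and does not replace it.

```python
def label_problems(label, actual_width, actual_height):
    """Return a list of detected mismatches; an empty list means none were found."""
    problems = []
    recorded = (label.get("imageWidth"), label.get("imageHeight"))
    if recorded != (actual_width, actual_height):
        problems.append("recorded size differs from the image on disk")
    for shape in label.get("shapes", []):
        for x, y in shape["points"]:
            # A vertex outside the image means the polygon cannot belong to it.
            if not (0 <= x <= actual_width and 0 <= y <= actual_height):
                problems.append(f"polygon '{shape['label']}' leaves the image bounds")
                break
    return problems

# Hypothetical label content, as it would be loaded with json.load().
example_label = {
    "imagePath": "example.jpg",
    "imageWidth": 4000,
    "imageHeight": 3000,
    "shapes": [
        {"label": "HEM", "points": [[1200.0, 800.0], [3500.0, 900.0], [3400.0, 2500.0]]},
    ],
}

ok = label_problems(example_label, 4000, 3000)     # no mismatch detected
stale = label_problems(example_label, 1024, 768)   # size and polygon both flagged
```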

### Camera effect on rocks

The following figure shows the distribution of the mean pixel value on the red channel when we choose HEM as the rock type of interest and select all of the available camera models.

<img src="../../docs/00_images/03_eda_reports/06_meanpx_red_camera_hem.png" width="600" height="600"/>

Ideally, since we are examining only one rock type, we would expect its distribution to be similar across all cameras and narrow enough to be distinguished from other rock types. However, the figure above shows that the distributions differ between cameras. iPhone SE, SM-A530W and SM-G965W have similar distributions, but Canon PowerShot SX20 IS and SM-A520W have distributions that are shifted to the right, towards higher intensities.
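
One simple way to quantify such a shift, complementing the visual comparison, is to summarise the feature per camera. The sketch below uses only the standard library. The camera names and values are hypothetical placeholders rather than measurements from our data set, and `summarise_by_camera` is an illustrative helper, not part of the dashboard.

```python
from collections import defaultdict
from statistics import mean, pstdev

def summarise_by_camera(rows):
    """Group (camera, value) pairs and return count, mean and std per camera."""
    groups = defaultdict(list)
    for camera, value in rows:
        groups[camera].append(value)
    return {
        camera: {"n": len(values), "mean": mean(values), "std": pstdev(values)}
        for camera, values in groups.items()
    }

# Hypothetical red-channel means for rock faces of a single rock type.
rows = [
    ("camera_a", 70.0), ("camera_a", 80.0), ("camera_a", 90.0),
    ("camera_b", 140.0), ("camera_b", 150.0), ("camera_b", 160.0),
]

summary = summarise_by_camera(rows)
# Gap between the per-camera means; a large gap relative to the per-camera
# spread suggests a camera effect rather than a difference between rocks.
shift = summary["camera_b"]["mean"] - summary["camera_a"]["mean"]
```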

The following table shows two images that contain HEM but were taken with two different cameras (i.e. iPhone 6 and SM-A520W).

|Image|Camera|MeanPixelRed|
|---|---|----|
|<img src="../../docs/00_images/03_eda_reports/07_sample_hem_sm520.jpg" width="400" height="400"/>|SM-A520W|148.13|
|<img src="../../docs/00_images/03_eda_reports/07_sample_hem_iphone6.JPG" width="400" height="300"/>|iPhone 6|70.53|


The image taken with the SM-A520W has a much higher mean pixel value on the red channel than the one taken with the iPhone 6. By eye, we cannot tell that the two images show the same rock type, but both of them actually contain HEM.

**Recommendation 2**: Based on our findings, we suggest using only one type of camera to capture every rock face so that we can reduce or avoid the noise coming from the camera itself. A professional camera would be a better choice since it can capture more detail of the rocks. It is also necessary to provide more data on rock types other than HEM and QR, as the majority of the rocks are of those two types.

### Lighting condition

Some of the images were taken against the light (backlit), which makes the rock face dark and hard to see. For example, the following table shows a backlit image and the labels for that image in LabelMe.

|Image|Source|
|----|----|
|<img src="../../docs/00_images/03_eda_reports/08_sample_backlight.JPG" width="400" height="400"/>|Original image|
|<img src="../../docs/00_images/03_eda_reports/08_sample_backlight_lm.png" width="400" height="400"/>|LabelMe|

In the reference image, the area inside the rightmost polygon actually represents IFG, but we can barely see the dark surface because of the backlighting.

**Recommendation 3**: To improve image quality, we suggest avoiding photos taken against strong light (backlighting).
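
Dark, backlit rock faces could also be flagged automatically before labelling or training. The sketch below is one possible approach and not something we have validated on our data: it averages the Rec. 601 luma (0.299 R + 0.587 G + 0.114 B) over the pixels of a rock face and compares it with a threshold. The threshold of 50 and the function names are assumptions chosen purely for illustration.

```python
def mean_luma(pixels):
    """Average Rec. 601 luma over an iterable of (r, g, b) tuples."""
    pixels = list(pixels)
    total = sum(0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels)
    return total / len(pixels)

def is_underexposed(pixels, threshold=50.0):
    """Flag a region whose mean luma is below the (hypothetical) threshold."""
    return mean_luma(pixels) < threshold

# Hypothetical pixel samples from two rock faces.
dark_face = [(10, 12, 9), (14, 11, 10), (8, 9, 12)]
bright_face = [(180, 120, 100), (170, 115, 95), (190, 125, 105)]

flag_dark = is_underexposed(dark_face)      # True for this sample
flag_bright = is_underexposed(bright_face)  # False for this sample
```

The pixel list for a rock face would come from the region inside its polygon, so that the bright background behind a backlit rock does not hide the problem.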