# Compare two methods of assessing skin tone

This notebook demonstrates how to use the **preprocess** and **detection** functions to implement two methods of assessing skin tone in black & white photos, then uses **conf_matrix** and **mse** to assess which one worked better.

### Import packages

First, import relevant dependencies:

In [47]:
import os
import sys
import pandas as pd
import numpy as np
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
from tonelocator import detection, conf_matrix
from tonelocator.colorizer import colorizer

You are missing the file "colorization_release_v2.caffemodel" Download it and place into your "model" folder You can download this file from this location:
 https://www.dropbox.com/s/dx0qvhhp5hbcx7z/colorization_release_v2.caffemodel?dl=1


NameError: name 'exit' is not defined

In [46]:
model = r'tonelocator/tonelocator/colorizer/model/colorization_release_v2.caffemodel'
model = os.path.join(os.path.dirname(__name__), model)
print(model)

tonelocator/tonelocator/colorizer/model/colorization_release_v2.caffemodel


### Create array of image file names and list of photos

Next, create an array that lists the file names of all the photos you want to run the tonelocator complexion detection methods on. 

Replace 'folder' below with the file path to the folder your photos are in.

In [45]:
#TODO: replace with the correct folder and a set of unprocessed photos
folder = '../tonelocator/data/practice_set'

# Full path to every file in the folder
filenames = os.listdir(folder)
imgpaths = np.array([os.path.join(folder, file) for file in filenames])
print(imgpaths)

# Bare file names, in the same order as imgpaths
imgnames = np.array(filenames)
#print(imgnames)

['../tonelocator/data/practice_set/.DS_Store'
 '../tonelocator/data/practice_set/a_01.jpg'
 '../tonelocator/data/practice_set/a_03.jpg'
 '../tonelocator/data/practice_set/crop_05.jpg'
 '../tonelocator/data/practice_set/crop_04.jpg'
 '../tonelocator/data/practice_set/a_02.jpg'
 '../tonelocator/data/practice_set/crop_01.jpg'
 '../tonelocator/data/practice_set/a_05.jpg'
 '../tonelocator/data/practice_set/crop_03.jpg'
 '../tonelocator/data/practice_set/crop_02.jpg'
 '../tonelocator/data/practice_set/a_04.jpg']
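
The listing above includes `.DS_Store`, which is not a photo. One way to keep only image files is to filter on the file extension. This is a minimal sketch, not part of tonelocator: the helper name `list_image_files` and the extension set are assumptions chosen for illustration, and the demo runs against a temporary folder rather than the practice set.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match your photos.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}

def list_image_files(folder):
    """Return the sorted file names in `folder` that have an image extension."""
    return sorted(
        name for name in os.listdir(folder)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    )

# Demo on a temporary folder holding two photos and one hidden system file.
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ['.DS_Store', 'a_01.jpg', 'crop_01.jpg']:
        open(os.path.join(demo_folder, name), 'w').close()
    demo_names = list_image_files(demo_folder)

print(demo_names)
```

Sorting the names also makes the order reproducible, since `os.listdir` returns entries in arbitrary order.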


### Pre-process photos

Next, pre-process the photos into three sets of files: 
1. the color photos that we'll use to detect the "true" color composition of the photos 
2. the grayscale photos that we'll use to test our B&W Monk Scale detection methods on
3. colorized versions of the grayscale photos that we'll use to test B&W Monk Scale detection method 2

The **preprocess** module crops the photos and has an option to convert them to grayscale. The **colorizer** module is used to colorize B&W photos. 
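
To make the two preprocessing operations concrete, here is a rough NumPy-only sketch of cropping and grayscale conversion. It is not the **preprocess** implementation, whose cropping strategy and signature are not shown in this notebook: the function names `center_crop` and `to_grayscale`, the center-crop choice, the RGB channel order, and the standard luma weights are all assumptions.

```python
import numpy as np

def center_crop(image, height, width):
    """Cut a height x width window out of the middle of an H x W x C image."""
    top = (image.shape[0] - height) // 2
    left = (image.shape[1] - width) // 2
    return image[top:top + height, left:left + width]

def to_grayscale(image_rgb):
    """Weighted sum of the R, G, B channels (ITU-R BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return image_rgb[..., :3].astype(float) @ weights

# Demo on a synthetic all-white 4 x 6 RGB image.
demo = np.full((4, 6, 3), 255, dtype=np.uint8)
cropped = center_crop(demo, 2, 2)
gray = to_grayscale(cropped)
print(cropped.shape, gray.shape)
```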

In [None]:
# Create subfolders to contain preprocessed color photos,
# B&W versions of those photos, and colorized versions
# of the B&W photos
folder = '../example/photos'
!mkdir $folder/color
!mkdir $folder/grayscale
!mkdir $folder/colorized
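
The `!mkdir` lines rely on a Unix-style shell and fail if a folder already exists. A portable alternative is `os.makedirs` with `exist_ok=True`. The sketch below uses a hypothetical helper name, `make_output_folders`, and demonstrates it on a temporary directory rather than on `../example/photos`.

```python
import os
import tempfile

def make_output_folders(folder, names=('color', 'grayscale', 'colorized')):
    """Create one subfolder per name under `folder`; existing folders are left alone."""
    paths = []
    for name in names:
        path = os.path.join(folder, name)
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths

# Demo: calling the helper twice does not raise, unlike a repeated mkdir.
with tempfile.TemporaryDirectory() as demo_root:
    make_output_folders(demo_root)
    make_output_folders(demo_root)
    created_names = sorted(os.listdir(demo_root))

print(created_names)
```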

In [None]:
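# Preprocess color images and save one set in color and one set in B&W
# TODO: add code to save the preprocessed pics in their respective folders (color & grayscale)
# NOTE: this assumes preprocess has been imported from tonelocator and accepts
# an image path; its exact signature is not shown in this notebook.
arr_color = [preprocess(i) for i in imgpaths]
# TODO: build arr_grayscale with preprocess's grayscale option

#RETURN: 
#arr_color
#arr_grayscale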


In [21]:
# Colorize every file in the input folder and save the colorized image in the output folder
import cv2

for pic in imgnames:
    imgpath = os.path.join(folder, pic)
    outpath = os.path.join(folder, 'colorized', pic)
    # colorize_image returns the input image and its colorized version
    image, colorized = colorizer.colorize_image(imgpath)
    cv2.imwrite(outpath, colorized)

AttributeError: module 'tonelocator.colorizer' has no attribute 'colorizer'

### Detect Monk scale based on color photos - the 'true' result we'll compare our detection methods to

Next, use the **complexion_detection** function to detect the color composition of the color photos. This is the 'true' result we'll use as a baseline to compare against the two methods of detecting complexion from B&W photos. 

In [20]:
true_results = np.array([detection.complexion_detection(i, rounding_places=2) 
                                  for i in files])
true_df = pd.DataFrame(true_results)
true_df['picid'] = imgnames
# TODO: fix picid to be a numeric identifier
print(true_df)

     0    1     2     3     4     5     6     7     8    9        picid
0  0.0  0.0  0.00  0.00  0.06  0.02  0.02  0.02  0.00  0.0     a_01.jpg
1  0.0  0.0  0.00  0.01  0.12  0.02  0.03  0.01  0.01  0.0     a_03.jpg
2  0.0  0.0  0.06  0.01  0.02  0.05  0.01  0.00  0.00  0.0  crop_05.jpg
3  0.0  0.0  0.00  0.00  0.16  0.12  0.10  0.03  0.00  0.0  crop_04.jpg
4  0.0  0.0  0.00  0.01  0.18  0.03  0.02  0.01  0.00  0.0     a_02.jpg
5  0.0  0.0  0.00  0.00  0.35  0.10  0.08  0.04  0.01  0.0  crop_01.jpg
6  0.0  0.0  0.20  0.03  0.05  0.01  0.00  0.00  0.00  0.0     a_05.jpg
7  0.0  0.0  0.00  0.01  0.36  0.03  0.02  0.01  0.00  0.0  crop_03.jpg
8  0.0  0.0  0.00  0.01  0.34  0.09  0.06  0.01  0.00  0.0  crop_02.jpg
9  0.0  0.0  0.00  0.00  0.05  0.03  0.04  0.02  0.01  0.0     a_04.jpg
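
Each row of the table is one photo and each numbered column is one of the ten bins returned by `complexion_detection`. A quick way to summarize a row is to pick the bin with the largest value. The sketch below copies three rows from the output above; the column name `dominant_bin` is an assumption introduced here, and reading the values as per-bin scores is an interpretation, not something the notebook states.

```python
import pandas as pd

# Three rows copied from the true_df output above (a_01, crop_05, a_05).
rows = [
    [0.0, 0.0, 0.00, 0.00, 0.06, 0.02, 0.02, 0.02, 0.00, 0.0],
    [0.0, 0.0, 0.06, 0.01, 0.02, 0.05, 0.01, 0.00, 0.00, 0.0],
    [0.0, 0.0, 0.20, 0.03, 0.05, 0.01, 0.00, 0.00, 0.00, 0.0],
]
demo_df = pd.DataFrame(rows)
demo_df['picid'] = ['a_01.jpg', 'crop_05.jpg', 'a_05.jpg']

# Select the numeric bin columns, then take the column label of each row's maximum.
bin_columns = [c for c in demo_df.columns if c != 'picid']
demo_df['dominant_bin'] = demo_df[bin_columns].idxmax(axis=1)

print(demo_df[['picid', 'dominant_bin']])
```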


### METHOD 1: Detect Monk scale based on B&W versions of color photos and reference to B&W Monk scale

In [None]:
m1_results = np.array([
    detection.complexion_detection(i, rounding_places=2, grayscale=True)
    for i in arr_grayscale
])
m1pred_df = pd.DataFrame(m1_results)
m1pred_df['picid'] = arr_color
# TODO: fix picid to be a numeric identifier

### METHOD 2: Detect Monk scale based on colorized photo

In [None]:
m2_results = np.array([
    detection.complexion_detection(i, rounding_places=2, grayscale=False)
    for i in arr_colorized
])
m2pred_df = pd.DataFrame(m2_results)
m2pred_df['picid'] = arr_colorized
# TODO: fix picid to be a numeric identifier

### Compare effectiveness of each method

First, create a confusion matrix for each method. 

In [None]:
conf_matrix(true=true_df, pred=m1pred_df).plot()

In [None]:
conf_matrix(true=true_df, pred=m2pred_df).plot()
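
How `conf_matrix` builds its table is not shown in this notebook. One plausible reading, sketched below with made-up numbers, is to reduce each photo to its highest-scoring bin and cross-tabulate the true bins against the predicted ones. The helper `dominant_bin` and both demo frames are hypothetical and exist only for illustration.

```python
import pandas as pd

def dominant_bin(df):
    """Label of the highest-valued bin column for each row, ignoring picid."""
    return df.drop(columns='picid').astype(float).idxmax(axis=1).astype(int)

# Made-up three-bin results for three photos (not real tonelocator output).
true_demo = pd.DataFrame({0: [0.6, 0.1, 0.2], 1: [0.3, 0.7, 0.1],
                          2: [0.1, 0.2, 0.7], 'picid': ['a', 'b', 'c']})
pred_demo = pd.DataFrame({0: [0.5, 0.6, 0.1], 1: [0.4, 0.3, 0.2],
                          2: [0.1, 0.1, 0.7], 'picid': ['a', 'b', 'c']})

# Rows are true bins, columns are predicted bins, cells are photo counts.
table = pd.crosstab(dominant_bin(true_demo), dominant_bin(pred_demo),
                    rownames=['true'], colnames=['pred'])
print(table)
```

Counts on the diagonal are photos where the predicted bin matches the true bin; off-diagonal counts are misses.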

Calculate the MSE for each method:

In [None]:
print(mse(true=true_df, pred=m1pred_df, bybin=False))
print(mse(true=true_df, pred=m2pred_df, bybin=False))
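
For reference, mean squared error is the average of the squared differences between true and predicted values. The sketch below is a hypothetical stand-in, `mse_sketch`, and not tonelocator's `mse`; it assumes both frames share the same row order and that `bybin=True` means one value per bin column, which the notebook does not confirm.

```python
import numpy as np
import pandas as pd

def mse_sketch(true, pred, bybin=False):
    """Mean squared error over the bin columns; per column when bybin is True."""
    true_vals = true.drop(columns='picid').to_numpy(dtype=float)
    pred_vals = pred.drop(columns='picid').to_numpy(dtype=float)
    squared_error = (true_vals - pred_vals) ** 2
    if bybin:
        return squared_error.mean(axis=0)  # average over photos, one value per bin
    return squared_error.mean()            # average over every photo and bin

# Made-up two-bin results for two photos (not real tonelocator output).
true_demo = pd.DataFrame({0: [0.0, 0.2], 1: [0.5, 0.1], 'picid': ['a', 'b']})
pred_demo = pd.DataFrame({0: [0.0, 0.0], 1: [0.5, 0.3], 'picid': ['a', 'b']})

overall = mse_sketch(true_demo, pred_demo)
per_bin = mse_sketch(true_demo, pred_demo, bybin=True)
print(overall, per_bin)
```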

The MSE is higher for X method and the confusion matrix suggests the fit is better for X method; this is probably because XYZ. (A lower MSE means the predictions sit closer to the color-photo baseline.)