Color conversion in SR

How do we convert between color spaces in Python for super-resolution (SR)?

Here we use python3 cv2.cvtColor and compare it with some matlab functions.

Most operations are done on numpy.ndarray data, as returned by cv2.imread('Barns_grand_tetons.jpg').

There are usually two formats:

uint8, range [0, 255]: used when saving to .jpg or .png files, and the format of the raw data read from these files.
float32, range [0, 1]: used for intermediate processing to minimize information loss. cv2.cvtColor requires float32 here; float64 will raise errors.
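The switch between the two formats can be sketched as follows. The helper names to_float32 and to_uint8 are made up for this illustration, and the synthetic array merely stands in for an image returned by cv2.imread.

```python
import numpy as np


def to_float32(img_u8):
    """uint8 [0, 255] -> float32 [0, 1]."""
    return img_u8.astype(np.float32) / 255.


def to_uint8(img_f32):
    """float32 [0, 1] -> uint8 [0, 255], clipping and rounding before the cast."""
    return (img_f32.clip(0, 1) * 255.).round().astype(np.uint8)


# Synthetic H x W x C image covering every uint8 value (stand-in for cv2.imread output).
img_u8 = np.arange(256, dtype=np.uint8).reshape(16, 16, 1).repeat(3, axis=2)
img_f32 = to_float32(img_u8)
restored = to_uint8(img_f32)
```

Rounding before the cast matters: a plain astype(np.uint8) truncates, so a value such as 254.9999 would become 254 instead of 255.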

RGB <--> BGR

The cv2 package reads images as HxWxC arrays with the channels in BGR order. We can pass the following two flags to cv2.cvtColor to convert between BGR and RGB.

cv2.COLOR_RGB2BGR
cv2.COLOR_BGR2RGB
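For a 3-channel image the swap is just a reversal of the channel axis, so it can also be sketched in plain NumPy. The tiny synthetic image below is an assumption for illustration; the result should match what cv2.cvtColor gives with the BGR-to-RGB flag.

```python
import numpy as np

# Synthetic 2x2 image in BGR order: pure blue is (255, 0, 0) in BGR.
img_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
img_bgr[..., 0] = 255

# Reversing the last axis swaps B and R. ascontiguousarray makes a real copy,
# because the reversed slice is only a negative-stride view of the original.
img_rgb = np.ascontiguousarray(img_bgr[..., ::-1])
```

The same reversal converts RGB back to BGR, which is why the two cv2 flags behave identically in practice.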

RGB/BGR <--> YUV

YUV is a color encoding system used in analog video (wiki). Y is the luminance channel (the same as the GRAY image); U is the blue-difference chrominance component (based on B - Y); V is the red-difference chrominance component (based on R - Y). The U-V color plane is shown as follows.

The U and V values are from -0.5 to 0.5 and they are always kept in float format as intermediate results during image processing. Therefore, before converting to YUV, it is best to run img = img.astype(np.float32) / 255. first.

The formula is shown above (similar to SDTV with BT.601); 0.5 is added to U and V to shift them into the range [0, 1]. The flags used in cv2.cvtColor are as follows.

float32, [0, 1]
cv2.COLOR_RGB2YUV / cv2.COLOR_BGR2YUV
cv2.COLOR_YUV2RGB / cv2.COLOR_YUV2BGR

(matlab does not have the corresponding function.)
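A NumPy sketch of the forward conversion is given below. The coefficients 0.492 and 0.877 are the commonly quoted BT.601-style YUV scale factors and are an assumption here, not values taken from this page; the function name rgb2yuv is likewise only illustrative.

```python
import numpy as np


def rgb2yuv(img):
    """RGB float32 in [0, 1] -> YUV, with U and V shifted by 0.5.

    Assumed coefficients: Y as in GRAY, U = 0.492 * (B - Y), V = 0.877 * (R - Y).
    """
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y) + 0.5
    v = 0.877 * (r - y) + 0.5
    return np.stack([y, u, v], axis=-1).astype(np.float32)


# A neutral gray pixel has no chrominance, so U and V sit at the 0.5 offset.
gray_pixel = np.full((1, 1, 3), 0.5, dtype=np.float32)
yuv = rgb2yuv(gray_pixel)
```

For any pixel with R = G = B, the differences B - Y and R - Y vanish, which is a quick sanity check for a YUV implementation.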

RGB/BGR <--> YCbCr

Similar to YUV, YCbCr is another color encoding system, but used in digital video (wiki). Y is the luminance channel; Cb and Cr are the blue-difference and red-difference chrominance components, respectively.

However, the Python cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb) conversion is very different from the matlab rgb2ycbcr function (note also that cv2 orders the output channels as Y, Cr, Cb). The cv2.cvtColor version is similar to the JPEG conversion, where the Y channel is the same as that in YUV and GRAY, while the matlab rgb2ycbcr is similar to BT.601, where the Y channel is very different from that in YUV.
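The gap between the two Y definitions is easiest to see on pure black and pure white. The sketch below uses illustrative function names; the BT.601 weights are the same ones used in the rgb2ycbcr code further down this page.

```python
import numpy as np


def y_full_range(img):
    """JPEG-style luma (as in YUV and GRAY): uint8 RGB -> Y in [0, 255]."""
    return np.dot(img.astype(np.float64), [0.299, 0.587, 0.114])


def y_bt601(img):
    """BT.601 studio-range luma (matlab rgb2ycbcr convention): uint8 RGB -> Y in [16, 235]."""
    return np.dot(img.astype(np.float64), [65.481, 128.553, 24.966]) / 255.0 + 16.0


black = np.zeros((1, 1, 3), dtype=np.uint8)
white = np.full((1, 1, 3), 255, dtype=np.uint8)

# (Y of black, Y of white) under each convention.
full = (y_full_range(black).item(), y_full_range(white).item())
studio = (y_bt601(black).item(), y_bt601(white).item())
```

Up to floating-point error, full is (0, 255) while studio is (16, 235): the BT.601 version compresses the range and adds an offset of 16.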

Note that most SR algorithms calculate PSNR on the Y channel produced by matlab rgb2ycbcr, whose smallest value is 16 (for uint8 images, Y lies in [16, 235]).
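As a rough sketch of such an evaluation, PSNR on the Y channel could be computed as below. The function names are hypothetical, and details such as border cropping or rounding Y to integers differ between published evaluation scripts, so this is not a reproduction of any particular one.

```python
import numpy as np


def bt601_y(img):
    """uint8 RGB -> float64 Y in [16, 235], following the matlab rgb2ycbcr convention."""
    return np.dot(img.astype(np.float64), [65.481, 128.553, 24.966]) / 255.0 + 16.0


def psnr_y(img1, img2):
    """PSNR between the Y channels of two uint8 RGB images (peak value 255)."""
    mse = np.mean((bt601_y(img1) - bt601_y(img2)) ** 2)
    if mse == 0:
        return float('inf')
    return 10. * np.log10(255. ** 2 / mse)


# Synthetic ground truth plus a copy perturbed by at most 2 gray levels per channel.
rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
offsets = rng.integers(-2, 3, size=gt.shape)
noisy = np.clip(gt.astype(np.int16) + offsets, 0, 255).astype(np.uint8)
```

Calling psnr_y(gt, gt) gives infinity, and psnr_y(gt, noisy) gives a finite value in dB; both images must be in RGB order for the weights to line up.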

If we train our SR models with the Y channel in YCbCr, we will encounter artifacts when testing on images with pixel values smaller than 16, as shown in the following image.

TODO (add an image with artifacts.)

So, to mimic the matlab rgb2ycbcr function, we can use the following code.

import numpy as np


def rgb2ycbcr(img, only_y=True):
    '''same as matlab rgb2ycbcr
    only_y: only return Y channel
    Input:
        uint8, [0, 255]
        float, [0, 1]
    '''
    in_img_type = img.dtype
    # astype returns a new array, so assign it; this also keeps the
    # in-place scaling below from modifying the caller's image
    img = img.astype(np.float64)
    if in_img_type != np.uint8:
        img *= 255.
    # convert
    if only_y:
        rlt = np.dot(img, [65.481, 128.553, 24.966]) / 255.0 + 16.0
    else:
        rlt = np.matmul(img, [[65.481, -37.797, 112.0], [128.553, -74.203, -93.786],
                              [24.966, 112.0, -18.214]]) / 255.0 + [16, 128, 128]
    # restore the input range and dtype
    if in_img_type == np.uint8:
        rlt = rlt.round()
    else:
        rlt /= 255.
    return rlt.astype(in_img_type)

We compared the above implementation with the matlab rgb2ycbcr function on Barns_grand_tetons.jpg (using uint8, [0, 255]). The total pixel difference is 0.

RGB/BGR <--> GRAY

Often, GRAY is the same as the Y component in YUV. The formula is: GRAY = 0.299 * R + 0.587 * G + 0.114 * B.

When converting GRAY back to RGB, the gray channel is copied into each of the three RGB channels.

The flags used in cv2.cvtColor are as follows.

float32, [0, 1] OR uint8, [0, 255]
cv2.COLOR_RGB2GRAY / cv2.COLOR_BGR2GRAY
cv2.COLOR_GRAY2RGB / cv2.COLOR_GRAY2BGR
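The gray-to-RGB direction is simple enough to sketch directly in NumPy; the small array below is a made-up example.

```python
import numpy as np

gray = np.array([[0, 128], [200, 255]], dtype=np.uint8)  # H x W

# Add a channel axis and repeat it three times: H x W -> H x W x 3.
rgb = np.repeat(gray[..., np.newaxis], 3, axis=2)
```

Because all three channels are identical, the result is the same whether it is interpreted as RGB or BGR.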

Note: the matlab rgb2gray function uses a slightly different formula: GRAY = 0.2989 * R + 0.587 * G + 0.114 * B.

We compared cv2.cvtColor with cv2.COLOR_RGB2GRAY against the matlab rgb2gray function on Barns_grand_tetons.jpg (using uint8, [0, 255]). The total pixel difference is 31, which is very small (error ratio: 31/(1600*1195)=1.62e-3%).

We also reimplemented it in NumPy as follows, and the pixel difference is 528, which is larger than the 31 obtained with cv2. Therefore, we can use cv2.cvtColor with cv2.COLOR_RGB2GRAY in place of the matlab function rgb2gray.

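import numpy as np


def rgb2gray(img):
    assert img.dtype == np.uint8, 'np.uint8 is supposed. But received img dtype: %s.' % img.dtype
    in_img_type = img.dtype
    # astype returns a new array, so the result must be assigned
    img = img.astype(np.float64)
    # weighted sum of the R, G, B channels, rounded back to integers
    img_gray = np.dot(img[..., :3], [0.2989, 0.587, 0.114]).round()
    return img_gray.astype(in_img_type)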
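A "total pixel difference" of this kind can be counted as sketched below. The example isolates only the effect of the 0.299 versus 0.2989 weight on a random synthetic image; it does not reproduce the cv2 or matlab arithmetic, and gray_with_weights is a hypothetical helper.

```python
import numpy as np


def gray_with_weights(img, weights):
    """uint8 RGB -> uint8 gray using the given R, G, B weights."""
    return np.dot(img[..., :3].astype(np.float64), weights).round().astype(np.uint8)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

gray_a = gray_with_weights(img, [0.299, 0.587, 0.114])
gray_b = gray_with_weights(img, [0.2989, 0.587, 0.114])

# Number of pixels whose rounded values differ, and the largest difference.
num_diff = int(np.sum(gray_a != gray_b))
max_diff = int(np.abs(gray_a.astype(np.int16) - gray_b.astype(np.int16)).max())
```

The two weight sets differ by at most 0.0001 * 255 before rounding, so any pixel that does differ is off by exactly one gray level.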

Conclusion

How to choose?

We refer to GRAY and the Y component of YUV as gray, and to the Y component of YCbCr as y.

convert to gray

img = img.astype(np.float32)
img /= 255.
rlt = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
# OR, if img is in BGR order
rlt = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

convert to YUV. Most of the time, we want to process the image on the Y component and directly copy (or apply only very simple processing to) the UV components.

img = img.astype(np.float32)
img /= 255.
rlt = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)
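The "process Y, copy UV" workflow can be sketched in NumPy as below. The conversion uses the commonly quoted coefficients 0.492 and 0.877 as an assumption, and process_y is a hypothetical placeholder (an identity here) for whatever model or filter is applied to the luminance.

```python
import numpy as np


def rgb2yuv(img):
    """RGB float in [0, 1] -> YUV with U and V shifted by 0.5 (assumed coefficients)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return np.stack([y, 0.492 * (b - y) + 0.5, 0.877 * (r - y) + 0.5], axis=-1)


def yuv2rgb(img):
    """Inverse of rgb2yuv above."""
    y, u, v = img[..., 0], img[..., 1], img[..., 2]
    b = y + (u - 0.5) / 0.492
    r = y + (v - 0.5) / 0.877
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return np.stack([r, g, b], axis=-1)


def process_y(y):
    """Hypothetical placeholder for the real processing step (e.g. an SR model)."""
    return y


rng = np.random.default_rng(0)
img = rng.random((4, 4, 3)).astype(np.float32)  # synthetic RGB image in [0, 1]

yuv = rgb2yuv(img)
yuv_out = yuv.copy()
yuv_out[..., 0] = process_y(yuv[..., 0])  # only the Y channel is touched
rlt = yuv2rgb(yuv_out)                    # U and V are carried over unchanged
```

With the identity placeholder the round trip returns the input up to floating-point error, which is a useful check before plugging in real processing. For SR the U and V planes would additionally need to be resized to the output resolution, which this sketch leaves out.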

convert to y. Use this only when comparing with previous results.

Use our own implementation.
