# The aim for this node
---
* Understand how images are represented on computers. 
* Use *Pillow* and *OpenCV* to open image files in Python and extract information from them. 
* Select similar images in *CIFAR-100* based on their color histograms.

## 18.1 First thing first
---

## 18.2 - Digital Image 
---

> A digital screen is made up of many dots, and one color dot is called a pixel (picture element).     

Each pixel represents a color by adjusting the intensity of each of three primary colors: red, green, and blue (RGB).     
But why these three colors?

<img src="https://aiffelstaticprd.blob.core.windows.net/media/images/Untitled_m5CAo6l.max-800x600.png" width="300">

This is because the human retina mostly contains three types of color-sensing cells.     

The figure below shows the range of colors to which each type of cell responds.     
Some humans and some birds do have one more type of these cells, which allows them to distinguish colors more finely or to detect part of the ultraviolet range.

![image-2.png](attachment:image-2.png)

The simplest way to store an image for a digital screen, whose dots are each represented by the intensity of red, green, and blue, is to store the color value of every dot. This is called a raster or bitmap image. It usually uses 8 bits per color per point and expresses the intensity of each color as a value between 0 and 255 (2^8 = 256 levels).

A vector image, on the other hand, does not degrade when scaled, because it records the relative positions of points and lines as equations and recalculates how they map onto each pixel of the screen whenever the image is enlarged or reduced. Among the files we mainly deal with, photo files are in raster format, and fonts, which can be freely enlarged or reduced, are mostly in vector format.

<img src="https://aiffelstaticprd.blob.core.windows.net/media/original_images/Untitled_2_ewOwVgS.png" width="200">

In the end, just as there are many ways to save images displayed as pixels on a digital screen, color values that are eventually displayed in RGB do not need to be saved in RGB format. For example, during the transition from black-and-white TV to color TV, the YUV method (pictured below) was used: because human eyes are more sensitive to brightness than to differences in color, two color channels at 1/4 resolution were added to the existing black-and-white channel.

<img src="https://aiffelstaticprd.blob.core.windows.net/media/images/Untitled_4_XlgZRTu.max-800x600.png" width="100">

> HSV (H: hue, S: saturation, V: value, i.e. brightness) is often used because it is more intuitive when manipulating colors numerically on a digital screen. 

In print media, colors get darker as more ink is added, and mixing every color to produce the frequently used black wastes a lot of ink, so for practical reasons four colors are used: CMYK (Cyan, Magenta, Yellow, Black).

Each of these various ways of expressing colors is called a color space, and a single axis (R, G, B in RGB, respectively) that makes up each color space is called a channel.
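To make the idea of a color space concrete, here is a minimal sketch that expresses the same color in RGB and in HSV using Python's standard-library `colorsys` module. `colorsys` works on floats between 0 and 1, so the 8-bit values are scaled first; the helper name `rgb8_to_hsv` is made up for this illustration.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB (0-255 per channel) to HSV floats in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

pure_red = rgb8_to_hsv(255, 0, 0)
mid_gray = rgb8_to_hsv(128, 128, 128)

print(pure_red)   # hue 0.0, full saturation, full value
print(mid_gray)   # saturation is 0.0: a gray carries no hue
```

The three numbers in each tuple are the three channels of the HSV color space, just as R, G, and B are the channels of RGB.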

<img src="https://aiffelstaticprd.blob.core.windows.net/media/original_images/Untitled_3_J0Nflzl.png" width="200">

However, it takes up more space than expected to store such color information as it is. Therefore, in the case of the JPEG image format, which is commonly used for photo storage, images are compressed in a way that groups nearby pixels and lumps similar colors together.

Because this method loses color information, recompressing an image, for example by raising the compression rate when saving or by resaving it multiple times, makes the colors look dirty, a phenomenon often called digital weathering.

Conversely, the PNG image format, which is widely used for screenshots, compresses images without loss of color. Because the colors used in the image can be defined in advance and referenced through a palette, a simple image with few colors may be smaller as a PNG than as a JPEG file of the same resolution, but an image that uses many colors, such as a photo, easily takes up more space than a JPEG file. GIF images, familiar from animated "gifs", can be made to move by placing multiple frames within one image. They also store color information without loss, but are limited to a palette of at most 256 colors.
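The difference between lossy and lossless storage can be checked directly. The following is a rough sketch, assuming Pillow and NumPy are installed; it encodes a synthetic random-color image in memory (the helper `encoded` and the test image are made up for this illustration) and compares what comes back.

```python
import io

import numpy as np
from PIL import Image

def encoded(image, fmt):
    """Encode a Pillow image in memory and return the raw file bytes."""
    buffer = io.BytesIO()
    image.save(buffer, format=fmt)
    return buffer.getvalue()

# Synthetic 64x64 image with random colors (fixed seed for repeatability).
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
original = Image.fromarray(pixels)

png_bytes = encoded(original, "PNG")
jpeg_bytes = encoded(original, "JPEG")

# Decode both files again and compare them with the original pixel values.
png_back = np.array(Image.open(io.BytesIO(png_bytes)))
jpeg_back = np.array(Image.open(io.BytesIO(jpeg_bytes)))

print("PNG identical to original: ", np.array_equal(pixels, png_back))
print("JPEG identical to original:", np.array_equal(pixels, jpeg_back))
print("bytes (raw, PNG, JPEG):", pixels.nbytes, len(png_bytes), len(jpeg_bytes))
```

Random noise is a worst case for both formats, so the sizes printed here say little about real photos; the point is only that the PNG round trip returns exactly the original values while the JPEG round trip does not.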

# 18.3 How to use Pillow, the successor of PIL (Python Imaging Library)
---
Remember, an image is an array of data.

For example, an image 32 pixels wide and 32 pixels high with three RGB color channels can be created with Numpy as an array of shape [32, 32, 3]. Make sure the data type is uint8, that is, each value is an unsigned 8-bit integer between 0 and 255 (2 to the 8th power = 256 values).

In [3]:
import numpy as np
from PIL import Image

data = np.zeros([32, 32, 3], dtype=np.uint8)
image = Image.fromarray(data, 'RGB')
image.show()

In [4]:
data[:, :] = [255, 0, 0]
image = Image.fromarray(data, 'RGB')
image.show()

In [5]:
#- Problem 1 -#
# Create a white image 128 pixels wide and 128 pixels high and display it on the screen.
import numpy as np
from PIL import Image

data = np.zeros([128, 128, 3], dtype = np.uint8)
data[:, :] = [255, 255, 255]
image = Image.fromarray(data, 'RGB')
image.show()

In [6]:
#- Problem 2 -#
# Open the practice image, print its width and height, and save it in JPG format using .save().
from PIL import Image
import os

# Path to the practice file
image_path = os.getenv('HOME')+'/aiffel/python_image_proc/pillow_practice.png'

# Open the image
im = Image.open(image_path)
im.show()

# Print the width and height of the opened image
width, height = im.size
print(width, height)

# Convert to RGB, then save in JPG format
rgb_im = im.convert('RGB')
rgb_im.save('pillow_practice.jpg')


In [7]:
#- Problem 3 -#
# Use .resize() to change the image size to 100x200 and save it.

# Set new_size to the tuple (width, height) = (100, 200)
new_size = (100, 200)
# .resize() returns a new image, so keep the result
resized_im = im.resize(new_size)

# Convert to RGB and save the resized image
rgb_im = resized_im.convert('RGB')
rgb_im.save('/Users/kyurishin/aiffel/python_image_proc/pillow_size_100x200.jpg')

In [8]:
#- Problem 4 -#
# Use .crop() to cut out only the eye area and save it.

# Open the original practice image (the crop box below does not fit inside the 100x200 image)
im = Image.open(r"/Users/kyurishin/aiffel/python_image_proc/pillow_practice.png") 

# Size of the image in pixels (not mandatory) 
width, height = im.size 

# The crop box is (left, upper, right, lower); the original image is not changed 
im1 = im.crop((300, 100, 600, 400)) 

# Convert to RGB and save the cropped image 
im1.convert('RGB').save(r"/Users/kyurishin/aiffel/python_image_proc/pillow_size_eye.jpg") 
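
The behaviour of `.resize()` and `.crop()` used in Problems 3 and 4 can be checked without the practice file. This is a small sketch on a blank stand-in image (the 640x480 canvas is hypothetical, not the practice image): both methods return a new image, and the crop box is given as (left, upper, right, lower).

```python
from PIL import Image

# A blank 640x480 stand-in image (hypothetical; not the practice file).
canvas = Image.new("RGB", (640, 480), color=(255, 255, 255))

# resize() returns a new image; the original is left unchanged.
small = canvas.resize((100, 200))

# crop() takes a box of (left, upper, right, lower) pixel coordinates.
box = (300, 100, 600, 400)
patch = canvas.crop(box)

print(canvas.size)  # (640, 480)
print(small.size)   # (100, 200)
print(patch.size)   # (300, 300)
```

The size of the cropped patch is (right - left, lower - upper), which is why the box above gives a 300x300 image.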

# 18.4 Data Preprocessing using Pillow module

### Download CIFAR-100 data and extract it as individual image files
---

To build an image database, let's use the **CIFAR-100 dataset**.     
It contains 60,000 images in total at 32x32 pixel resolution, in 100 classes of 600 images each (500 for training and 100 for testing).     

If you check the decompressed result, you will find not individual image files but three files: meta, test, and train.     


1. Using only the train file, open it as described for python3 in the Python / Matlab versions section under Dataset layout on the dataset page.      
2. Call the extracted content (the dict returned by the loader) train, and take a look.

In [9]:
import os
import pickle

dir_path = os.getenv('HOME')+'/aiffel/python_image_proc/cifar-100-python'
train_file_path = os.path.join(dir_path,'train')

with open(train_file_path,'rb') as f:
    train = pickle.load(f, encoding='bytes')

type(train)

dict

Printing the contents of train as-is would dump too much at once to make sense of.      
Going back to the dataset page, the text that follows ("Loaded in this way, each of the batch files contains a dictionary") says this is a Python dictionary object,     
so let's first use the .keys() method to see which keys it has.

In [10]:
train.keys()

dict_keys([b'filenames', b'batch_label', b'fine_labels', b'coarse_labels', b'data'])

In [11]:
# Note that the keys are not strings (str) but bytes, which start with b.
# Keeping this in mind, let's first look at the file names (b'filenames').

type(train[b'filenames'])

list

In [12]:
# A list, as expected. Let's print just the first five.

train[b'filenames'][:5]

[b'bos_taurus_s_000507.png',
 b'stegosaurus_s_000125.png',
 b'mcintosh_s_000643.png',
 b'altar_boy_s_001435.png',
 b'cichlid_s_000031.png']
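
Because the file was unpickled with `encoding='bytes'`, both the keys and the file names are `bytes` objects. If plain strings are more convenient, they can be decoded. The sketch below works on a tiny stand-in dictionary with made-up label values rather than the real `train` object, and `decode_keys` is a hypothetical helper name.

```python
# A tiny stand-in for the unpickled dictionary (label values are made up).
raw = {
    b'filenames': [b'bos_taurus_s_000507.png', b'stegosaurus_s_000125.png'],
    b'fine_labels': [0, 1],
}

def decode_keys(d):
    """Return a copy of the dictionary whose bytes keys are decoded to str."""
    return {key.decode('utf-8'): value for key, value in d.items()}

decoded = decode_keys(raw)
names = [name.decode('utf-8') for name in decoded['filenames']]

print(list(decoded.keys()))
print(names)
```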

The file names came out cleanly. Where are the images that correspond to these file names? Let's consult the dataset page once more.

> data -- a 10000x3072 numpy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.

In [13]:
# It seems to be telling us to look at b'data' in the dictionary. Let's pull out the first few rows.

train[b'data'][:5]

array([[255, 255, 255, ...,  10,  59,  79],
       [255, 253, 253, ..., 253, 253, 255],
       [250, 248, 247, ..., 194, 207, 228],
       [124, 131, 135, ..., 232, 236, 231],
       [ 43,  32,  87, ...,  60,  29,  37]], dtype=uint8)

In [15]:
# We got a numpy array. Let's check the shape of the first row.

train[b'data'][0].shape

(3072,)

Looking at this number and the text above, 3072 appears to be 3 channels (red, green, blue) x 1024 (= 32 * 32) pixels. If so, reshaping this Numpy array properly **will restore the original image.**     

Let's reshape it to (32, 32, 3) and display the image on the screen. There is one thing to watch out for: as mentioned above, the first 1024 of the 3072 bytes are red (R), the next 1024 are green (G), and the last 1024 are blue (B).

Fortunately the channels are already in RGB order, but simply reshaping to the target shape is not enough. **Each block of 1024 values must fill one 32x32 plane, and this is repeated 3 times.** To fill the data starting from the first dimension, np.reshape has an argument called **order.** Passing **'F'** for it makes the reshape proceed in the desired way.

In [20]:
# Beware of order!!!
image_data = train[b'data'][0].reshape([32,32,3], order='F')
# Using Pillow, make Numpy array as Image object!
image = Image.fromarray(image_data)
# display it on the screen!
image.show()

The image comes out with its x axis and y axis swapped, so swap them back.     
<mark>np.swapaxes(0, 1)</mark> will come in useful!

In [21]:
# swap the x axis and y axis of the image (transpose)

image_data = image_data.swapaxes(0,1)
image = Image.fromarray(image_data)
image.show()
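
To see why `order='F'` followed by `swapaxes(0, 1)` gives the right picture, the sketch below applies both steps to a synthetic 3072-value row (random numbers standing in for a CIFAR row) and compares the result with an alternative formulation, which is not used in this section: reshape to (channel, row, column) and move the channel axis last.

```python
import numpy as np

# Synthetic stand-in for one CIFAR row: 3072 values laid out as
# [R plane (1024) | G plane (1024) | B plane (1024)], each plane row-major.
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=3 * 32 * 32, dtype=np.uint8)

# Approach used in this section: Fortran-order reshape, then swap axes 0 and 1.
via_fortran = flat.reshape([32, 32, 3], order='F').swapaxes(0, 1)

# Alternative: split into (channel, row, column), then move channels last.
via_transpose = flat.reshape(3, 32, 32).transpose(1, 2, 0)

print(np.array_equal(via_fortran, via_transpose))

# The pixel at row 0, column 1 holds entries 1, 1025 and 2049 of the flat row.
print(via_fortran[0, 1], flat[[1, 1025, 2049]])
```

Both formulations index the flat row as `channel * 1024 + row * 32 + column`, which matches the layout described in the dataset documentation quoted above.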

So far, we have learnt how to dig into the original CIFAR-100 dataset and pull out an image.    

But we want to save each image as an actual file.   

We have seen that the file names and the image data arrays are stored in the same order in the dataset, so it is enough to read the Numpy arrays one by one and save each of them as an image file. 

In [None]:
# With tqdm, you can visualise and check the progress of repetitive tasks.

import os
import pickle
from PIL import Image
from tqdm import tqdm

dir_path = os.getenv('HOME')+'/aiffel/python_image_proc/cifar-100-python'
train_file_path = os.path.join(dir_path,'train')

# Create a subdirectory (images) under cifar-100-python to store the images.
images_dir_path = os.path.join(dir_path,'images')
os.makedirs(images_dir_path, exist_ok=True)

# Write out the 50,000 32x32 training images as individual files.
with open(train_file_path,'rb') as f:
    train = pickle.load(f,encoding='bytes')

for i in tqdm(range(len(train[b'filenames']))):
    # file names are bytes, so decode them to str first
    filename = train[b'filenames'][i].decode()
    data = train[b'data'][i].reshape([32, 32, 3], order='F')
    image = Image.fromarray(data.swapaxes(0, 1))
    image.save(os.path.join(images_dir_path, filename))

# 18.5 - OpenCV (1) hello, OpenCV
---

OpenCV is an open-source library.     
It can be used from languages such as C++, Python, Java, and MATLAB, and it implements a wide range of advanced image-processing functions in an easy-to-use form.

Let's look at an example of how to extract a certain colour from an image.

[Changing Colourspaces](https://docs.opencv.org/master/d0/de3/tutorial_py_intro.html)

---

There are more than 150 colour-space conversion methods available in OpenCV.   
Two of the most used ones:    
1. BGR <-> Gray     flag used for this: cv.COLOR_BGR2GRAY
2. BGR <-> HSV      flag used for this: cv.COLOR_BGR2HSV
 
For colour conversion, we use the function cv.cvtColor(input_image, flag) where flag determines the type of conversion.

In [2]:
import cv2 as cv
flags = [i for i in dir(cv) if i.startswith('COLOR_')]
print(flags)

['COLOR_BAYER_BG2BGR', 'COLOR_BAYER_BG2BGRA', 'COLOR_BAYER_BG2BGR_EA', 'COLOR_BAYER_BG2BGR_VNG', 'COLOR_BAYER_BG2GRAY', 'COLOR_BAYER_BG2RGB', 'COLOR_BAYER_BG2RGBA', 'COLOR_BAYER_BG2RGB_EA', 'COLOR_BAYER_BG2RGB_VNG', 'COLOR_BAYER_GB2BGR', 'COLOR_BAYER_GB2BGRA', 'COLOR_BAYER_GB2BGR_EA', 'COLOR_BAYER_GB2BGR_VNG', 'COLOR_BAYER_GB2GRAY', 'COLOR_BAYER_GB2RGB', 'COLOR_BAYER_GB2RGBA', 'COLOR_BAYER_GB2RGB_EA', 'COLOR_BAYER_GB2RGB_VNG', 'COLOR_BAYER_GR2BGR', 'COLOR_BAYER_GR2BGRA', 'COLOR_BAYER_GR2BGR_EA', 'COLOR_BAYER_GR2BGR_VNG', 'COLOR_BAYER_GR2GRAY', 'COLOR_BAYER_GR2RGB', 'COLOR_BAYER_GR2RGBA', 'COLOR_BAYER_GR2RGB_EA', 'COLOR_BAYER_GR2RGB_VNG', 'COLOR_BAYER_RG2BGR', 'COLOR_BAYER_RG2BGRA', 'COLOR_BAYER_RG2BGR_EA', 'COLOR_BAYER_RG2BGR_VNG', 'COLOR_BAYER_RG2GRAY', 'COLOR_BAYER_RG2RGB', 'COLOR_BAYER_RG2RGBA', 'COLOR_BAYER_RG2RGB_EA', 'COLOR_BAYER_RG2RGB_VNG', 'COLOR_BGR2BGR555', 'COLOR_BGR2BGR565', 'COLOR_BGR2BGRA', 'COLOR_BGR2GRAY', 'COLOR_BGR2HLS', 'COLOR_BGR2HLS_FULL', 'COLOR_BGR2HSV', 'COLOR_

> For HSV, hue range is [0,179], saturation range is [0,255], and value range is [0,255]. Different software use different scales. So if you are comparing OpenCV values with them, you need to normalize these ranges.
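
As a rough sketch of the normalization mentioned in the note above, the helpers below convert between OpenCV's 8-bit HSV ranges and a scale with hue in degrees (0 to 360) and saturation and value in percent (0 to 100). That target scale is an assumption about the other software, and both function names are made up for this illustration.

```python
def opencv_hsv_to_standard(h, s, v):
    """Map OpenCV 8-bit HSV (H 0-179, S and V 0-255) to (degrees, percent, percent)."""
    return h * 2, s / 255 * 100, v / 255 * 100

def standard_hsv_to_opencv(h_deg, s_pct, v_pct):
    """Map (degrees, percent, percent) back to OpenCV's 8-bit HSV ranges."""
    h = round(h_deg / 2) % 180          # 360 degrees wraps around to 0
    s = round(s_pct / 100 * 255)
    v = round(v_pct / 100 * 255)
    return h, s, v

print(opencv_hsv_to_standard(60, 255, 255))   # (120, 100.0, 100.0)
print(standard_hsv_to_opencv(120, 100, 100))  # (60, 255, 255)
```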

# Object Tracking
---
Now that we know how to convert a BGR image to HSV, how do we use it to extract a coloured object?  
In HSV, it is easier to represent a colour than in BGR colour-space.     

**steps**
- Take each frame of the video
- Convert from BGR to HSV color-space
- Threshold the HSV image for a range of blue colour
- Extract the blue object alone; we can then do whatever we want with that image.

In [23]:
# import modules; numpy and cv2
import cv2 as cv
import numpy as np

cap = cv.VideoCapture(0)

while True:
    
    # take each frame; stop if the camera returns nothing
    ret, frame = cap.read()
    if not ret:
        break
    
    # convert BGR to HSV
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    
    # define range of blue color in HSV
    lower_blue = np.array([110,50,50])
    upper_blue = np.array([130,255,255])
    
    # threshold the HSV image to get only blue colors
    mask = cv.inRange(hsv, lower_blue, upper_blue)
    
    # bitwise-and mask and original image
    res = cv.bitwise_and(frame, frame, mask = mask)
    cv.imshow('frame', frame)
    cv.imshow('mask', mask)
    cv.imshow('res', res)
    
    # press Esc (key code 27) to quit
    k = cv.waitKey(5)&0xFF
    if k==27:
        break
        
cap.release()
cv.destroyAllWindows()

## How to find HSV values to track?

Use the same cv.cvtColor() function: instead of passing an image, pass just the BGR value you want.     
For example, to find the HSV value of green, try the following in a Python terminal:

In [3]:
import cv2 as cv
import numpy as np

green = np.uint8([[[0,255,0]]])
hsv_green = cv.cvtColor(green, cv.COLOR_BGR2HSV)
print(hsv_green)

[[[ 60 255 255]]]

# 18-6. OpenCV (2) Closer Look
---

<mark>import cv2 as cv</mark>
<mark>import numpy as np</mark>

OpenCV 

# 18-7. Practice: finding similar images with CIFAR-100
---
   - How to extract a colour histogram from an image
   - How to compare images using their histograms
   
> Histogram: Distribution of color values per pixel in the image

Let's draw a colour histogram, following the **Using Matplotlib** part of **Plotting Histograms** on the OpenCV example page.

In [None]:
import os
import pickle
import cv2
import numpy as np
from matplotlib import pyplot as plt
from tqdm import tqdm
from PIL import Image

# directory structure created during preprocessing
dir_path = os.getenv('HOME')+'/aiffel/python_image_proc/cifar-100-python'
train_file_path = os.path.join(dir_path, 'train')
images_dir_path = os.path.join(dir_path, 'images')

# take a file name as a parameter and display the image and its histogram
def draw_color_histogram_from_image(file_name):
    image_path = os.path.join(images_dir_path, file_name)

    # open the image (Pillow for display, OpenCV for the histogram)
    image = Image.open(image_path)
    cv_image = cv2.imread(image_path)

    # draw the image and the histogram side by side
    f = plt.figure(figsize=(10,3))
    im1 = f.add_subplot(1,2,1)
    im1.imshow(image)
    im1.set_title("Image")
    
    im2 = f.add_subplot(1,2,2)
    color = ('b','g','r')
    for i, col in enumerate(color):
        # histogram of the i-th channel of the image (0:blue, 1:green, 2:red)
        histr = cv2.calcHist([cv_image],[i],None,[256],[0,256])
        im2.plot(histr, color = col) # plot in the channel's own colour
    im2.set_title("Histogram")
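
Once each image is reduced to a histogram, comparing images becomes a matter of comparing those histograms. The sketch below is one possible way to do that, not necessarily the method this node goes on to use: it builds coarse per-channel histograms with NumPy's `np.histogram` instead of `cv2.calcHist`, and measures similarity with an L1 distance. The helper names, the choice of 4 bins, and the synthetic test images are all assumptions made for illustration.

```python
import numpy as np

def color_histogram(image, bins=4):
    """Concatenate per-channel histograms of an HxWx3 uint8 image, normalized to sum to 1."""
    channels = [
        np.histogram(image[:, :, c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(channels).astype(np.float64)
    return hist / hist.sum()

def histogram_distance(hist_a, hist_b):
    """L1 distance between two normalized histograms (0 means identical)."""
    return float(np.abs(hist_a - hist_b).sum())

# Three synthetic 32x32 images: two dark ones and one bright one.
rng = np.random.default_rng(0)
dark = rng.integers(0, 64, size=(32, 32, 3), dtype=np.uint8)
also_dark = rng.integers(0, 64, size=(32, 32, 3), dtype=np.uint8)
bright = rng.integers(192, 256, size=(32, 32, 3), dtype=np.uint8)

query = color_histogram(dark)
d_same = histogram_distance(query, color_histogram(also_dark))
d_diff = histogram_distance(query, color_histogram(bright))

print(d_same)  # small: both images fall into the same bins
print(d_diff)  # large: the images share no bins
```

Sorting the dataset by this distance to a query image and taking the smallest few would give the most similar images under this measure; coarser bins make the comparison more tolerant of small color differences.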