# Python Machine Learning Final Project

**Members: 蔡宇軒、張鈺欣、楊雅嵐** <br>
**Topic: Face Recognition for Korean Girl-Group Members**<br>

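## Introduction

In recent years, Korean culture and music have grown steadily more popular, and Korean artists have become far better known in Taiwan. We often see Korean girl groups singing and dancing on stage, and their strikingly beautiful faces are the envy of many. Yet we keep noticing subtle similarities among those faces. Cosmetic surgery is now an open secret and the techniques keep improving, and new girl groups are often promoted on the strength of a member's resemblance to someone in a currently popular group. Face recognition technology has been advancing just as quickly. We want our model not only to tell faces apart but also to judge whether two people look highly similar.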

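## Implementation

**<p style="color:red;">Every stage in the results section has a recorded video; click through to watch.**

**Stage 1:**
We collected roughly 600 photos for each of 5 stars, about 3,000 photos in total. Adjusting brightness and flipping each image horizontally then expands the set to about 9,000 images.
The data came from three sources: (1) searching for the star's name and downloading photos, (2) finding videos of the star and extracting suitable frames, and (3) downloading from Instagram. "Suitable" means no other person's face appears in the frame, since another face might be detected instead. When an extracted frame did contain other faces, we cropped the target star's face out by hand, one image at a time.

The Stage 1 result uses interact to identify which star appears in an image; see **Result 1: Demo with interact**.

**Stage 2:**
In this stage we want to recognize the five stars, and everyone else, in video, so we collected more data to add an "Others" class. We used an existing public dataset, UTKFace, which contains many different faces (men, women, the elderly, infants). We took 8,649 of its images and put them all in an Others folder as a new class. There are now six classes (the five stars plus Others) and 17,223 data points in total.

The final result is shown in **Result 4: Real-time face detection in video => recognizing multiple people in a video**.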


![](https://i.imgur.com/ZWyisGt.png)

In [1]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np


%matplotlib inline

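### Step1: Find all files in the folders

Each label (each star) has its own folder, so images are labeled by folder rather than by filename. The first step is therefore to read the file paths from the five stars' folders into five separate slots of a list.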

![](https://i.imgur.com/ZguC5v0.png)
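
The folder-as-label idea can be tried in isolation with a minimal sketch like the one below. It uses only the standard library; the helper name `collect_files_by_label` and the throwaway file names are assumptions made for illustration, not part of the project.

```python
import os
import tempfile


def collect_files_by_label(root, label_dirs):
    """Return {label_index: [file paths]} with one entry per label folder."""
    files_by_label = {}
    for index, name in enumerate(label_dirs):
        collected = []
        for r, _dirs, filenames in os.walk(os.path.join(root, name)):
            for filename in sorted(filenames):
                collected.append(os.path.join(r, filename))
        files_by_label[index] = collected
    return files_by_label


# Build a tiny throwaway folder tree (hypothetical file names) to try it out.
demo_root = tempfile.mkdtemp()
for folder, names in {"M": ["a.jpg", "b.jpg"], "J": ["c.jpg"]}.items():
    os.makedirs(os.path.join(demo_root, folder))
    for n in names:
        open(os.path.join(demo_root, folder, n), "w").close()

demo = collect_files_by_label(demo_root, ["M", "J"])
label_counts = {label: len(paths) for label, paths in demo.items()}
print(label_counts)  # {0: 2, 1: 1}
```

The position of a folder in `label_dirs` becomes its numeric label, which is the same convention the cell below relies on.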

In [2]:
import os

# Raw strings keep the Windows backslashes literal, so no escaping is needed
path = [r'.\images\M', r'.\images\J', r'.\images\N', r'.\images\S', r'.\images\Z']
files = [[], [], [], [], []]
# r = root, d = directories, f = files
for index in range(5):
    for r, d, f in os.walk(path[index]):
        for file in f:
            files[index].append(os.path.join(r, file))

# Uncomment to list the collected files per star:
# for i in range(5):
#     for f in files[i]:
#         print(f)
#     print('----')

In [3]:
import face_recognition
import cv2

### Step2: Preprocessing

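Each of the 5 stars has about 600 photos, roughly 3,000 in total, so we enlarge the dataset in two ways: <br>
1. Flip the image horizontally<br>
2. Change the image brightness<br>
Unlike a person, a computer treats the result of transform 1 or 2 as a very different photo, so this triples the data to roughly 9,000 images.<br>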
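
To make the two transforms concrete, here is a small sketch on a 2x2 grayscale "image" stored as nested lists. It is plain Python rather than OpenCV, and the function names are assumptions for illustration. The project itself applies `cv2.flip` and an HSV brightness shift to real photos.

```python
def flip_horizontal(image):
    """Mirror a row-major grayscale image left to right."""
    return [list(reversed(row)) for row in image]


def brighten(image, value=30):
    """Add `value` to every pixel, saturating at 255 instead of wrapping."""
    return [[min(255, pixel + value) for pixel in row] for row in image]


def augment(image, value=80):
    """Return the original plus its flipped and brightened variants."""
    return [image, flip_horizontal(image), brighten(image, value)]


tiny = [[10, 200],
        [30, 250]]
variants = augment(tiny)
print(len(variants))   # 3 images from 1
print(variants[1])     # [[200, 10], [250, 30]]
print(variants[2])     # [[90, 255], [110, 255]]
```

The saturation in `brighten` matters: with 8-bit pixels, adding 80 to 200 without a cap would wrap around to a dark value instead of staying white.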

![](https://i.imgur.com/s4MrMWL.png)
<br>



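After each photo is read and expanded into three images as above, we call face_recognition.face_encodings from the face_recognition library. This API detects the face in an image, locates the facial features, and turns them into a 128-dim vector. In other words, you feed in an image and it computes a feature vector describing the face.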



In [4]:
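# Increase brightness
def increase_brightness(img, value=30):
    # face_recognition.load_image_file returns RGB, so convert RGB <-> HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
    h, s, v = cv2.split(hsv)

    # Saturate at 255 so bright pixels do not wrap around in uint8
    lim = 255 - value
    v[v > lim] = 255
    v[v <= lim] += value

    final_hsv = cv2.merge((h, s, v))
    img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2RGB)
    return img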

In [5]:

# Feature vectors (128-dim) and labels (index of the star) to be collected
encodinglist = []
labels = []
count = 0
for index_of_person in range(5):
    for f in files[index_of_person]:
        # Step 1. Load the file once and build the three variants
        original = face_recognition.load_image_file(f)
        variants = [
            original,                                 # unchanged
            cv2.flip(original, 1),                    # horizontal flip
            increase_brightness(original, value=80),  # brighter
        ]
        for img in variants:
            # Step 2. Detect faces and compute their 128-dim encodings
            # with the library's pretrained network
            face_detect_and_encoding = face_recognition.face_encodings(img)
            # No face detected: skip this variant
            if len(face_detect_and_encoding) <= 0:
                continue
            # [0] keeps the first detected face
            face_encoding = face_detect_and_encoding[0]
            count = count + 1
            # Step 3. Store the encoding and its label in our lists
            encodinglist.append(face_encoding)
            labels.append(index_of_person)
print("Total:" + str(count))

Total:8876

### Step3: Building the models

In this step we use three models for face recognition, described one by one below.

#### <p style="color:red;"> 1. CNN Model 
Structure of the CNN model we built:
![](https://i.imgur.com/eqQkrn1.png)

In [9]:
%env KERAS_BACKEND=tensorflow
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.layers import Conv2D, MaxPool2D
from keras.optimizers import Adam ,SGD

from sklearn.model_selection import train_test_split

env: KERAS_BACKEND=tensorflow


In [12]:
# Inspect the encoded vectors
#print("x length:", len(encodinglist))
#print("y length:", len(labels))

data_length = len(encodinglist)

data_x = np.array(encodinglist)
data_y = np.array(labels)

# Treat each 128-dim encoding as a 128x1 "image" with one channel
data_x = data_x.reshape(data_length, 128, 1, 1)

# One-hot encode the 5 class labels
data_y = np_utils.to_categorical(data_y, 5)

x_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size=0.1, random_state=4)


print("x_train length:", len(y_train))
#print("y length:", len(y_test))

x_train length: 7988


The 128-dim vectors of the training data are now stored in arrays, so next we train the CNN classifier.

TODO: input normalize

In [13]:
model1 = Sequential()

model1.add(Conv2D(40, (4,1), padding='same', input_shape=(128,1,1))) # 40 filters, each 4x1
model1.add(Activation('relu'))
model1.add(MaxPool2D(pool_size=(2,1)))

model1.add(Conv2D(60, (4,1), padding='same'))
model1.add(Activation('relu'))
model1.add(MaxPool2D(pool_size=(2,1)))

model1.add(Conv2D(80, (4,1), padding='same'))
model1.add(Activation('relu'))
model1.add(MaxPool2D(pool_size=(2,1)))

model1.add(Conv2D(100, (4,1), padding='same'))
model1.add(Activation('relu'))
model1.add(MaxPool2D(pool_size=(2,1)))

model1.add(Flatten())
model1.add(Dense(150)) # after flattening, feed into a final fully connected layer
model1.add(Activation('relu'))

model1.add(Dense(5))
model1.add(Activation('softmax'))

# NOTE: with a softmax output, 'categorical_crossentropy' is the usual loss.
# 'binary_crossentropy' is kept because the reported numbers were produced with it.
model1.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model1.summary()

# NOTE: validation_data is the training set here, so val_acc is not a held-out score.
model1_out=model1.fit(x_train, y_train, batch_size=3, epochs=30,verbose =1,validation_data = (x_train,y_train))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_9 (Conv2D)            (None, 128, 1, 40)        200       
_________________________________________________________________
activation_13 (Activation)   (None, 128, 1, 40)        0         
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 64, 1, 40)         0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 64, 1, 60)         9660      
_________________________________________________________________
activation_14 (Activation)   (None, 64, 1, 60)         0         
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 32, 1, 60)         0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 32, 1, 80)         19280     
__________
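
The parameter counts and output lengths in the (truncated) summary above can be checked by hand. The sketch below assumes the standard Conv2D count of kernel weights plus one bias per filter, and that `padding='same'` leaves the length unchanged so that only pooling halves it. The helper names are made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def pooled_length(length, pool):
    """Output length after non-overlapping max pooling (floor division)."""
    return length // pool


# (filters, input channels) for the first three Conv2D layers, kernel 4x1
layers = [(40, 1), (60, 40), (80, 60)]
param_counts = [conv2d_params(4, 1, cin, f) for f, cin in layers]
print(param_counts)  # [200, 9660, 19280]

length = 128
lengths = []
for _ in range(4):  # four MaxPool2D(pool_size=(2,1)) layers
    length = pooled_length(length, 2)
    lengths.append(length)
print(lengths)  # [64, 32, 16, 8]
```

The first three values match the `Param #` column shown above, and the lengths match the first dimension of each pooling layer's output shape.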

In [14]:
model1_json = model1.to_json()
# The with-block closes the file after writing
with open('model1.json', 'w') as json_file:
    json_file.write(model1_json)
model1.save_weights('model1_weights.h5')

#### <p style="color:red;"> 2. Normal NN(Fully-connected)
Structure of the NN model we built:
![](https://i.imgur.com/mUABOij.png)

In [16]:
# Inspect the encoded vectors
print("x length:", len(encodinglist))
print("y length:", len(labels))

data_length = len(encodinglist)

NN_inputx = np.array(encodinglist)
NN_inputy = np.array(labels)

NN_inputy = np_utils.to_categorical(NN_inputy, 5)

NN_x_train, NN_x_test, NN_y_train, NN_y_test = train_test_split(NN_inputx, NN_inputy, test_size=0.1, random_state=4)


# Size of this model's own training split
print("x_train length:", len(NN_y_train))
#print("y length:", len(NN_y_test))

x length: 8876
y length: 8876
x_train length: 7988


In [17]:

model2 = Sequential()
model2.add(Dense(200, input_dim=128))
model2.add(Activation("relu"))
model2.add(Dense(100))
model2.add(Activation("relu"))
model2.add(Dense(5))
model2.add(Activation("softmax"))
model2.compile(loss='mse', optimizer='Adam', metrics=['accuracy'])
model2.summary()

# NOTE: validation_data is the training set here, so val_acc is not a held-out score.
model2_out = model2.fit(NN_x_train, NN_y_train,
                        batch_size = 3,
                        epochs = 30,
                        verbose = 1,
                        validation_data = (NN_x_train, NN_y_train))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_7 (Dense)              (None, 200)               25800     
_________________________________________________________________
activation_19 (Activation)   (None, 200)               0         
_________________________________________________________________
dense_8 (Dense)              (None, 100)               20100     
_________________________________________________________________
activation_20 (Activation)   (None, 100)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 5)                 505       
_________________________________________________________________
activation_21 (Activation)   (None, 5)                 0         
Total params: 46,405
Trainable params: 46,405
Non-trainable params: 0
_________________________________________________________________
Train 
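
The fully connected parameter counts in this summary can be reproduced with a few lines of arithmetic, assuming each Dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` is made up for illustration.

```python
def dense_params(inputs, units):
    """Weights (inputs x units) plus one bias per unit."""
    return (inputs + 1) * units


# Layer widths: 128-dim input, then Dense(200), Dense(100), Dense(5)
sizes = [128, 200, 100, 5]
per_layer = [dense_params(i, o) for i, o in zip(sizes, sizes[1:])]
print(per_layer)       # [25800, 20100, 505]
print(sum(per_layer))  # 46405
```

The total agrees with the `Total params: 46,405` line reported by `model2.summary()`.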

In [18]:
model2_json = model2.to_json()
# The with-block closes the file after writing
with open('model2.json', 'w') as json_file:
    json_file.write(model2_json)
model2.save_weights('model2_weights.h5')

#### <p style="color:red;"> 3. Euclidean Distance
![](https://i.imgur.com/rp63F6b.png)

In [19]:
# Inspect the encoded vectors
print("x length:", len(encodinglist))
print("y length:", len(labels))

data_length = len(encodinglist)

E_inputx = np.array(encodinglist)
E_inputy = np.array(labels)
E_x_train, E_x_test, E_y_train, E_y_test = train_test_split(E_inputx, E_inputy, test_size=0.1, random_state=4)


# Sizes of this method's own train/test split
print("train length:", len(E_y_train))
print("test length:", len(E_y_test))

x length: 8876
y length: 8876
train length: 7988
test length: 888


In [20]:
mean0,mean1,mean2,mean3,mean4=[],[],[],[],[]
for i in range(len(E_y_train)):
    if ((E_y_train[i]==0)):
        if(len(mean0)==0):
            mean0.append(E_x_train[i])
        else:
            mean0+=E_x_train[i]
    elif (E_y_train[i]==1):
        if(len(mean1)==0):
            mean1.append(E_x_train[i])
        else:
            mean1+=E_x_train[i]
    elif (E_y_train[i]==2):
        if(len(mean2)==0):
            mean2.append(E_x_train[i])
        else:
            mean2+=E_x_train[i]
    elif (E_y_train[i]==3):
        if(len(mean3)==0):
            mean3.append(E_x_train[i])
        else:
            mean3+=E_x_train[i]
    elif (E_y_train[i]==4):
        if(len(mean4)==0):
            mean4.append(E_x_train[i])
        else:
            mean4+=E_x_train[i]
            
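# NOTE: the divisor is the total number of encodings, not each star's own
# training count, so these are scaled-down class sums rather than true
# per-class means. This may contribute to the low accuracy reported below.
mean0/=len(encodinglist)
mean1/=len(encodinglist)
mean2/=len(encodinglist)
mean3/=len(encodinglist)
mean4/=len(encodinglist)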

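The cell above builds an element-wise 128-dim average vector for each of the five stars. Below, each test sample is compared with all five vectors and assigned to the star whose vector is nearest.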

In [21]:
from scipy.spatial import distance

right=0
wrong=0
for i in range(len(E_y_test)):
    dst0 = distance.euclidean(mean0,E_x_test[i])
    dst1 = distance.euclidean(mean1,E_x_test[i])
    dst2 = distance.euclidean(mean2,E_x_test[i])
    dst3 = distance.euclidean(mean3,E_x_test[i])
    dst4 = distance.euclidean(mean4,E_x_test[i])
    dst = [dst0,dst1,dst2,dst3,dst4]
    predict = dst.index(min(dst))
    #print(str(dst0)+" , "+str(dst1)+" , "+str(dst2)+" , "+str(dst3)+" , "+str(dst4))
    #print("predicted: "+str(predict)+" , actual: ",str(E_y_test[i]))
    if(predict==E_y_test[i]):
        right+=1
    else:
        wrong+=1
print("Euclidean Distance accuracy: "+str(right/(right+wrong))+" , correct:"+str(right)+" , wrong:"+str(wrong))

Euclidean Distance accuracy: 0.3704954954954955 , correct:329 , wrong:559
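
For comparison, here is a minimal sketch of nearest-mean classification in plain Python on toy 2-dim data. It differs from the cells above in one labeled respect: each class sum is divided by that class's own sample count. The function names and data are assumptions for illustration, and no claim is made about what accuracy this would give on the project's data.

```python
import math


def class_means(vectors, labels, num_classes):
    """Element-wise mean of the vectors belonging to each class."""
    dim = len(vectors[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, label in zip(vectors, labels):
        counts[label] += 1
        for j, value in enumerate(vec):
            sums[label][j] += value
    # Divide each class sum by that class's own count
    # (assumes every class has at least one sample)
    return [[s / counts[k] for s in sums[k]] for k in range(num_classes)]


def nearest_class(means, vec):
    """Index of the class mean with the smallest Euclidean distance."""
    distances = [math.dist(mean, vec) for mean in means]
    return distances.index(min(distances))


# Toy 2-dim data standing in for the 128-dim encodings
train_x = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
train_y = [0, 0, 1, 1]
means = class_means(train_x, train_y, num_classes=2)
print(nearest_class(means, [0.9, 0.9]))  # 1
print(nearest_class(means, [0.0, 0.1]))  # 0
```

`math.dist` requires Python 3.8 or later; on older versions `scipy.spatial.distance.euclidean`, which the notebook already imports, does the same job.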


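<p style="color:red;">As shown above, the CNN performs best at over 0.99, followed by the NN at about 0.97~0.98. Euclidean Distance falls short of expectations at only 0.37.

## Results

The test data must likewise be preprocessed into 128-dim vectors before being fed to our models. We do not reuse the test split from above; the test images are kept in separate folders because that makes the demos below easier.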

In [22]:
import os
Testpath = [r'.\images\Testdata\M', r'.\images\Testdata\J', r'.\images\Testdata\N', r'.\images\Testdata\S', r'.\images\Testdata\Z']
Testfiles = [[],[],[],[],[]]
# r=root, d=directories, f = files
for index in range(5):
    for r, d, f in os.walk(Testpath[index]):
        for file in f:
            Testfiles[index].append(os.path.join(r, file))

In [23]:
import matplotlib as mpl
import random
mpl.rc('font',family ='Noto Sans CJK TC')

%matplotlib inline

%env KERAS_BACKEND=tensorflow
import numpy as np
import matplotlib.pyplot as plt
import cv2
from keras.models import model_from_json
from keras.optimizers import SGD
import face_recognition
from keras.utils import np_utils

model1 = model_from_json(open('./model1.json').read())
model1.load_weights('./model1_weights.h5')
model1.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])

model2 = model_from_json(open('./model2.json').read())
model2.load_weights('./model2_weights.h5')
model2.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])

env: KERAS_BACKEND=tensorflow


In [24]:
encodinglist = []
labels = []
count = 0

for index_of_person in range(5):
    for f in Testfiles[index_of_person]:
        img = face_recognition.load_image_file(f)
        face_detect_and_encoding = face_recognition.face_encodings(img)
        # Skip images where no face is detected
        if len(face_detect_and_encoding) <= 0:
            continue
        # [0] keeps the first detected face
        face_encoding = face_detect_and_encoding[0]
        count = count + 1
        encodinglist.append(face_encoding)
        labels.append(index_of_person)

data_length = len(encodinglist)
data_x = np.array(encodinglist)
data_y = np.array(labels)
x_test = data_x.reshape(data_length, 128, 1, 1)
y_test = np_utils.to_categorical(data_y, 5)

print("x_test length:", len(y_test))

x_test length: 260


In [25]:
score_class=model1.evaluate(x_test,y_test)
print('Model1(CNN):  loss:',score_class[0],'accuracy:',score_class[1])
score_class2=model2.evaluate(data_x,y_test)
print('Model2(NN):  loss:',score_class2[0],'accuracy:',score_class2[1])
# right/wrong still hold the Euclidean Distance counts from the earlier 888-sample test split
print("Euclidean Distance accuracy: "+str(right/(right+wrong))+" , correct:"+str(right)+" , wrong:"+str(wrong))

Model1(CNN):  loss: 0.01993767430563961 accuracy: 0.9938461597149189
Model2(NN):  loss: 0.015858105281907998 accuracy: 0.9930769260113056
Euclidean Distance accuracy: 0.3704954954954955 , correct:329 , wrong:559


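<p style="color:red;"> **As shown above, on these 260 samples the CNN still performs best. Its margin over the NN is small because the test set is small, but the validation accuracy logged during training (computed on the training data, as set in validation_data) already shows the CNN at least 1% ahead on our larger dataset, so the demos below all use the CNN model.**<br>
The final lines of the training logs are copied here for convenience:<br>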
- NN:loss: 0.0059 - acc: 0.9815 - val_loss: 0.0070 - val_acc: 0.9782<br>
- CNN: loss: 0.0182 - acc: 0.9943 - val_loss: 0.0098 - val_acc: 0.9964<br>
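
One caveat when reading these accuracies: the models are compiled with `binary_crossentropy`, and in some Keras versions that makes the `accuracy` metric an element-wise binary accuracy rather than a per-sample class accuracy. The notebook does not state its Keras version, so whether this applies here is an assumption. The plain-Python sketch below, with made-up probabilities, only illustrates how far the two measures can differ.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose arg-max class matches the one-hot label."""
    hits = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if true_row.index(max(true_row)) == pred_row.index(max(pred_row)):
            hits += 1
    return hits / len(y_true)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual outputs on the right side of the threshold."""
    hits = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            hits += int((p > threshold) == (t == 1))
            total += 1
    return hits / total


# Two hypothetical 5-class samples; the second one is classified wrongly.
y_true = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0]]
y_pred = [[0.9, 0.1, 0.0, 0.0, 0.0], [0.7, 0.3, 0.0, 0.0, 0.0]]
print(categorical_accuracy(y_true, y_pred))  # 0.5
print(binary_accuracy(y_true, y_pred))       # 0.8
```

A misclassified sample still gets three of its five outputs "right" under the element-wise measure, so that number reads higher than the share of correctly identified faces.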

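## Result 1: Demo with interact

This demo uses interact; see the video:<br>
[Click here](https://www.youtube.com/watch?v=vaujS9WmjJw)<br>
When the user presses the button, a star is picked at random from the test data, and that star's image goes through two steps:<br>
1. Find the face and draw a box around it in the image<br>
2. Locate the facial features and predict which star the face belongs to<br>

Label and Predict, shown above the image, are the correct answer and our prediction respectively.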

![](https://i.imgur.com/dBIxRaA.gif)

In [29]:
from ipywidgets import interact_manual

def NameLabel(people):
    if(people==0):
        return "敏珠"
    elif(people==1):
        return "Jessica"
    elif(people==2):
        return "娜京"
    elif(people==3):
        return "Sinb"
    else:
        return "子瑜"

def ShowPredict():
    TestPeople = random.randint(0,4)
    TestImage = random.randint(0,(len(Testfiles[TestPeople])-1))
    NowTestfile = Testfiles[TestPeople][TestImage]
    print(NowTestfile)
    plt.figure(figsize=(15, 15))
    img = face_recognition.load_image_file(NowTestfile)
    
    height,width = img.shape[:2] #get image height and width
    if(height>1500 or width>1500):
        img = cv2.resize(img,(width//3,height//3),interpolation=cv2.INTER_CUBIC)
    elif(height>800 or width>800):
        img = cv2.resize(img,(width//2,height//2),interpolation=cv2.INTER_CUBIC)
   

    #=========================predict face
    Testface_encoding = face_recognition.face_encodings(img)
    Testdata_x = np.array(Testface_encoding)
    #https://stackoverflow.com/questions/43017017/keras-model-predict-for-a-single-image
    if(len(Testdata_x)>0):
        (top, right, bottom, left) = face_recognition.face_locations(img)[0]
        # rectangle(image, top-left corner, bottom-right corner, color, thickness)
        size = max(1, int(img.shape[0] / 100))
        cv2.rectangle(img, (left, top), (right, bottom), (0, 130, 255), size)

        # Keep only the first face; the leading 1 is the batch dimension
        Testdata_x = Testdata_x[:1].reshape(1,128, 1, 1)
        predict = model1.predict_classes(Testdata_x)
        predictLabel = NameLabel(predict[0])
    #===========================

        name = NameLabel(TestPeople)

        plt.title("Label: " + name +"   ,   Predict: "+predictLabel, fontsize=20)
        plt.axis("off")
        plt.imshow(img)
    else:
        plt.title("No face or facial features detected", fontsize=20)
        plt.imshow(img)


    
interact_manual(ShowPredict);

interactive(children=(Button(description='Run Interact', style=ButtonStyle()), Output()), _dom_classes=('widge…

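## Result 2: Real-time face detection in video => a single person in the video

Next we show the result on video; see:<br>
[Click here](https://www.youtube.com/watch?v=DKGVdCba3no)<br>

Each frame of a video of one star is treated as an input image for face prediction. If a face is detected, it is boxed in the frame and the predicted star's name is drawn directly on the video.<br>

The video demonstrates that the larger the frame, the slower the prediction. We shrink the frames by a factor of 2, 3, and so on, so there is a trade-off between real-time speed and video resolution.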
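
A rough sense of that trade-off comes from counting pixels: shrinking both sides by a factor f leaves about 1/f² of the pixels for the detector to scan. The sketch below assumes a 1920x1080 source, which is a made-up figure since the notebook does not give the video's resolution, and the helper names are illustrative only.

```python
def downscale(width, height, factor):
    """Frame size after integer downscaling, as in (width // f, height // f)."""
    return width // factor, height // factor


def pixel_ratio(width, height, factor):
    """Share of the original pixel count that remains after downscaling."""
    new_w, new_h = downscale(width, height, factor)
    return (new_w * new_h) / (width * height)


# A hypothetical 1920x1080 source video
for factor in (1, 2, 3, 4):
    size = downscale(1920, 1080, factor)
    print(factor, size, round(pixel_ratio(1920, 1080, factor), 3))
```

Dividing by 3, as the cell below does, keeps about a ninth of the pixels, at the cost of smaller faces that are harder to detect.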

![](https://i.imgur.com/ZyHGAlI.gif)

In [1]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import cv2
from keras.models import model_from_json
from keras.optimizers import SGD
import face_recognition

model1 = model_from_json(open('./model1.json').read())
model1.load_weights('./model1_weights.h5')
model1.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])

Using TensorFlow backend.


In [2]:
def NameLabel(people):
    if(people==0):
        return "Min-Joo"
    elif(people==1):
        return "Jessica"
    elif(people==2):
        return "Na-Gyung"
    elif(people==3):
        return "Sinb"
    else:
        return "Tzu-Yu"

cap = cv2.VideoCapture('./images/Videos/2.mp4')
font = cv2.FONT_HERSHEY_COMPLEX
color = (0, 255, 255)
while(cap.isOpened()):
    ret, frame = cap.read()
    # Stop when the video ends or a frame cannot be read
    if not ret:
        break
    height,width = frame.shape[:2]
    frame = cv2.resize(frame,(width//3,height//3),interpolation=cv2.INTER_CUBIC)
    # OpenCV frames are BGR; face_recognition expects RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    #=======================================detect face
    face_locations = face_recognition.face_locations(rgb_frame)
    if(len(face_locations)>0):
        (top, right, bottom, left) =  face_locations[0]
        # rectangle(image, top-left corner, bottom-right corner, color, thickness)
        size = max(1, int(frame.shape[0] / 100))
        cv2.rectangle(frame, (left, top), (right, bottom), color, size)

        #=========================predict face
        # Encode only the first detected face so the reshape below always fits
        Testface_encoding = face_recognition.face_encodings(rgb_frame, known_face_locations=[face_locations[0]])
        Testdata_x = np.array(Testface_encoding)
        #https://stackoverflow.com/questions/43017017/keras-model-predict-for-a-single-image
        if(Testdata_x.shape[0]>0):
            Testdata_x = Testdata_x.reshape(1,128, 1, 1)  # the leading 1 is the batch dimension
            predict = model1.predict_classes(Testdata_x)
            predictLabel = NameLabel(predict[0])
            cv2.putText(frame, predictLabel, (left+20 , bottom+20 ),font,0.5,color)
        #===========================

    cv2.imshow('frame',frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

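## Result 3: Real-time face detection in video => YouTube-style controls
This adds one more feature: as on YouTube, you can drag a bar to stop the video at a given frame and then let it play on automatically from there.<br>
[Click here](https://www.youtube.com/watch?v=m8LfDX0ZAPQ)<br>
There are two bars. Auto takes only 0 or 1, for pause and resume. Frame moves the video position to a chosen frame.

Interestingly, the prediction can change between consecutive frames, for example Sinb in one frame and Jessica in the next, and sometimes a face box is found but the facial features are not.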
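
One common way to damp this kind of frame-to-frame flicker is to report the majority label over the last few frames. This is not something the project does; the sketch below is only an illustration, and the class name `LabelSmoother` and the window size are assumptions.

```python
from collections import Counter, deque


class LabelSmoother:
    """Report the most common label among the last `window` predictions."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        self.history.append(label)
        # most_common(1) returns [(label, count)] for the top label
        return Counter(self.history).most_common(1)[0][0]


smoother = LabelSmoother(window=5)
raw = ["Sinb", "Sinb", "Jessica", "Sinb", "Sinb"]
smoothed = [smoother.update(label) for label in raw]
print(smoothed)  # ['Sinb', 'Sinb', 'Sinb', 'Sinb', 'Sinb']
```

With a single face per video this could wrap the per-frame prediction directly. With several faces, each face would need its own history, which in turn requires tracking faces across frames.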

![](https://i.imgur.com/DtKpF9K.gif)

In [6]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import cv2
from keras.models import model_from_json
from keras.optimizers import SGD
import face_recognition

model1 = model_from_json(open('./model1.json').read())
model1.load_weights('./model1_weights.h5')
model1.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])

In [18]:
def NameLabel(people):
    if(people==0):
        return "Min-Joo"
    elif(people==1):
        return "Jessica"
    elif(people==2):
        return "Na-Gyung"
    elif(people==3):
        return "Sinb"
    else:
        return "Tzu-Yu"
    
frame_no=0
cv2.namedWindow('frame', cv2.WINDOW_AUTOSIZE)

cv2.resizeWindow("frame", 640, 480)
def ChangeFrame(value):
    # Trackbar callback; positions are polled in the loop, so nothing to do here
    pass

cap = cv2.VideoCapture('./images/Videos/3.mp4')
font = cv2.FONT_HERSHEY_COMPLEX
color = (0, 255, 255)
frame_length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
cv2.createTrackbar('Frame','frame',0,frame_length,ChangeFrame)
cv2.createTrackbar('Auto','frame',0,1,ChangeFrame)
while(cap.isOpened()):
    # Auto == 1: keep playing. Auto == 0: jump to the frame chosen on the Frame bar
    if(cv2.getTrackbarPos('Auto','frame')!=1):
        cap.set(cv2.CAP_PROP_POS_FRAMES,cv2.getTrackbarPos('Frame','frame'))

    ret, frame = cap.read()
    # Stop when the video ends or a frame cannot be read
    if not ret:
        break
    height,width = frame.shape[:2]
    frame = cv2.resize(frame,(width//4,height//4),interpolation=cv2.INTER_CUBIC)
    # OpenCV frames are BGR; face_recognition expects RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    #=======================================detect face
    face_locations = face_recognition.face_locations(rgb_frame)
    if(len(face_locations)>0):
        (top, right, bottom, left) =  face_locations[0]
        # rectangle(image, top-left corner, bottom-right corner, color, thickness)
        size = max(1, int(frame.shape[0] / 100))
        cv2.rectangle(frame, (left, top), (right, bottom), color, size)

        #=========================predict face
        # Encode only the first detected face so the reshape below always fits
        Testface_encoding = face_recognition.face_encodings(rgb_frame, known_face_locations=[face_locations[0]])
        Testdata_x = np.array(Testface_encoding)
        #https://stackoverflow.com/questions/43017017/keras-model-predict-for-a-single-image
        if(Testdata_x.shape[0]>0):
            Testdata_x = Testdata_x.reshape(1,128, 1, 1)  # the leading 1 is the batch dimension
            predict = model1.predict_classes(Testdata_x)
            predictLabel = NameLabel(predict[0])
            cv2.putText(frame, predictLabel, (left+20 , bottom+20 ),font,0.5,color)
        #===========================

    cv2.imshow('frame',frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

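## <p style="color:red;">  Result 4: Real-time face detection in video => recognizing multiple people in a video
[**Click here**](https://www.youtube.com/watch?v=s5vHYns1m0U)<br>

This part is more complex than the earlier ones, because the goal is to **detect the faces of the five stars and label anyone else as Others**. We first tried the following two methods, and both failed:<br>
1. Read the CNN's raw output. If the output value for the predicted star is below some threshold, our confidence that the face is this star is too low, so we label it as Others.
2. Use the Euclidean Distance we set aside earlier as a secondary check. If the distance is too large, we decide the face is not this star and label it as Others.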
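
The two fallback rules can be sketched in plain Python as follows. The function names, thresholds, and sample numbers are hypothetical; the notebook's actual attempts are not shown beyond a commented-out 0.9999 check in the final cell.

```python
def label_by_confidence(probabilities, names, threshold):
    """Method 1: fall back to "Others" when the top softmax value is low."""
    best = probabilities.index(max(probabilities))
    if probabilities[best] < threshold:
        return "Others"
    return names[best]


def label_by_distance(distances, names, max_distance):
    """Method 2: fall back to "Others" when even the nearest mean is far."""
    best = distances.index(min(distances))
    if distances[best] > max_distance:
        return "Others"
    return names[best]


names = ["Min-Joo", "Jessica", "Na-Gyung", "Sinb", "Tzu-Yu"]
# Hypothetical model outputs, for illustration only
print(label_by_confidence([0.05, 0.90, 0.02, 0.02, 0.01], names, threshold=0.8))  # Jessica
print(label_by_confidence([0.30, 0.25, 0.20, 0.15, 0.10], names, threshold=0.8))  # Others
print(label_by_distance([0.9, 0.4, 0.8, 0.7, 0.95], names, max_distance=0.6))     # Jessica
print(label_by_distance([0.9, 0.85, 0.8, 0.7, 0.95], names, max_distance=0.6))    # Others
```

Both rules hinge on a single cutoff. A softmax network trained only on the five stars can still assign a high probability to an unseen face, which is one plausible reason a threshold alone was not enough.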

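Both methods misclassified far too often, so we decided to collect more data. We used an existing public dataset, [UTKFace](https://susanqq.github.io/UTKFace/), which contains many different faces (men, women, the elderly, infants). We took **8,649** of its images and put them all in an Others folder as a new class, so there are now **six classes (the five stars plus Others) and 17,223 data points in total**.<br>

This time we do not split the 17,223 data points into a test set. All of them are used for training, and we judge the result directly on video. The model largely does what we wanted, and the network's training accuracy reaches about 0.993.<br>

There is still room for improvement. With girl groups whose members look very alike, or with videos full of varied expressions and movement (for example [this one](https://www.youtube.com/watch?v=HYich6WOVhk)), the results on video are not very good. We suspect this is because video frames are blurrier and vary more in face angle and lighting. If both the training set and the test set were video, we believe the results would be better, but since the project was originally defined as recognizing Korean girl-group faces in images, we did not use video as training data.

### Training process for Result 4
The only thing to note is the extra Others class.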

In [14]:
import os

path = [r'.\images\M', r'.\images\J', r'.\images\N', r'.\images\S', r'.\images\Z', r'.\images\Others']
files = [[],[],[],[],[],[]]
# r=root, d=directories, f = files
for index in range(6):
    for r, d, f in os.walk(path[index]):
        for file in f:
            files[index].append(os.path.join(r, file))




In [15]:
import face_recognition
import cv2

def increase_brightness(img, value=30):
    # face_recognition.load_image_file returns RGB, so convert RGB <-> HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
    h, s, v = cv2.split(hsv)

    # Saturate at 255 so bright pixels do not wrap around in uint8
    lim = 255 - value
    v[v > lim] = 255
    v[v <= lim] += value

    final_hsv = cv2.merge((h, s, v))
    img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2RGB)
    return img

# Feature vectors (128-dim) and labels (class index) to be collected
encodinglist = []
labels = []
count = 0
for index_of_person in range(6):
    for f in files[index_of_person]:
        # Step 1. Load the file
        original = face_recognition.load_image_file(f)
        if index_of_person < 5:
            # The five stars also get the flipped and brightened variants
            variants = [original, cv2.flip(original, 1), increase_brightness(original, value=80)]
        else:
            # Others is used as-is, without augmentation
            variants = [original]

        for img in variants:
            # Step 2. Detect faces and compute their 128-dim encodings
            # with the library's pretrained network
            face_detect_and_encoding = face_recognition.face_encodings(img)
            # No face detected: skip this variant
            if len(face_detect_and_encoding) <= 0:
                continue
            # [0] keeps the first detected face
            face_encoding = face_detect_and_encoding[0]
            count = count + 1
            # Step 3. Store the encoding and its label in our lists
            encodinglist.append(face_encoding)
            labels.append(index_of_person)

print("TotalCount:" + str(count))

TotalCount:17223


In [16]:
%env KERAS_BACKEND=tensorflow
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.layers import Conv2D, MaxPool2D
from keras.optimizers import Adam ,SGD

from sklearn.model_selection import train_test_split


env: KERAS_BACKEND=tensorflow


In [17]:
# Inspect the encoded vectors
print("x length:", len(encodinglist))
print("y length:", len(labels))

data_length = len(encodinglist)

data_x = np.array(encodinglist)
data_y = np.array(labels)

x_train = data_x.reshape(data_length, 128, 1, 1)

# One-hot encode the 6 class labels (five stars plus Others)
y_train = np_utils.to_categorical(data_y, 6)

#x_train,x_test,y_train,y_test =train_test_split( data_x , data_y ,test_size = 0.15, random_state = 4)


print("x_train length:", len(y_train))
#print("y length:", len(y_test))

x length: 17223
y length: 17223
x_train length: 17223


In [18]:
model3 = Sequential()

model3.add(Conv2D(40, (4,1), padding='same', input_shape=(128,1,1))) # 40 filters, each 4x1
model3.add(Activation('relu'))
model3.add(MaxPool2D(pool_size=(2,1)))

model3.add(Conv2D(60, (4,1), padding='same'))
model3.add(Activation('relu'))
model3.add(MaxPool2D(pool_size=(2,1)))

model3.add(Conv2D(80, (4,1), padding='same'))
model3.add(Activation('relu'))
model3.add(MaxPool2D(pool_size=(2,1)))

model3.add(Conv2D(100, (4,1), padding='same'))
model3.add(Activation('relu'))
model3.add(MaxPool2D(pool_size=(2,1)))

model3.add(Flatten())
model3.add(Dense(150)) # after flattening, feed into a final fully connected layer
model3.add(Activation('relu'))

model3.add(Dense(6))
model3.add(Activation('softmax'))

# NOTE: with a softmax output, 'categorical_crossentropy' is the usual loss.
# 'binary_crossentropy' is kept because the reported numbers were produced with it.
model3.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model3.summary()

# NOTE: validation_data is the training set here, so val_acc is not a held-out score.
model3_out=model3.fit(x_train, y_train, batch_size=3, epochs=20,verbose =1,validation_data = (x_train,y_train))

model3_json = model3.to_json()
# The with-block closes the file after writing
with open('model3.json', 'w') as json_file:
    json_file.write(model3_json)
model3.save_weights('model3_weights.h5')

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 128, 1, 40)        200       
_________________________________________________________________
activation_1 (Activation)    (None, 128, 1, 40)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 64, 1, 40)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 64, 1, 60)         9660      
_________________________________________________________________
activation_2 (Activation)    (None, 64, 1, 60)         0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 32, 1, 60)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 1, 80)         19280     
__________

As shown above, the training accuracy is about 0.993, which is a fairly good result.

### Running Result 4

Face detection and recognition for multiple people in the same video.

In [2]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import cv2
from keras.models import model_from_json
from keras.optimizers import SGD
import face_recognition

model3 = model_from_json(open('./model3.json').read())
model3.load_weights('./model3_weights.h5')
model3.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])

Using TensorFlow backend.


In [17]:
def NameLabel(people):
    if(people==0):
        return "Min-Joo"
    elif(people==1):
        return "Jessica"
    elif(people==2):
        return "Na-Gyung"
    elif(people==3):
        return "Sinb"
    elif(people==4):
        return "Tzu-Yu"
    else:
        return "Others"
    
frame_no=0
cv2.namedWindow('frame', cv2.WINDOW_AUTOSIZE)

def ChangeFrame(value):
    # Trackbar callback; positions are polled in the loop, so nothing to do here
    pass

cap = cv2.VideoCapture('./images/Videos/M_people3.mp4')
font = cv2.FONT_HERSHEY_COMPLEX
color = (0, 255, 255)
frame_length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
cv2.createTrackbar('Frame','frame',0,frame_length,ChangeFrame)
cv2.createTrackbar('Auto','frame',0,1,ChangeFrame)
while(cap.isOpened()):
    # Auto == 1: keep playing. Auto == 0: jump to the frame chosen on the Frame bar
    if(cv2.getTrackbarPos('Auto','frame')!=1):
        cap.set(cv2.CAP_PROP_POS_FRAMES,cv2.getTrackbarPos('Frame','frame'))

    ret, frame = cap.read()
    # Stop when the video ends or a frame cannot be read
    if not ret:
        break
    height,width = frame.shape[:2]
    frame = cv2.resize(frame,(width//3,height//3),interpolation=cv2.INTER_CUBIC)
    # OpenCV frames are BGR; face_recognition expects RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    #=======================================predict faces
    # Detect once, then encode exactly those boxes so both lists line up
    face_locations = face_recognition.face_locations(rgb_frame)
    Testface_encoding = face_recognition.face_encodings(rgb_frame, known_face_locations=face_locations)
    #https://stackoverflow.com/questions/43017017/keras-model-predict-for-a-single-image
    for People in range(len(Testface_encoding)):
        (top, right, bottom, left) =  face_locations[People]
        # rectangle(image, top-left corner, bottom-right corner, color, thickness)
        size = max(1, int(frame.shape[0] / 100))
        cv2.rectangle(frame, (left, top), (right, bottom), color, size)

        Testdata_x = np.array(Testface_encoding[People])
        Testdata_x = Testdata_x.reshape(1,128, 1, 1)  # the leading 1 is the batch dimension
        predict = model3.predict_classes(Testdata_x)
        # Abandoned confidence-threshold attempt, kept for reference:
        #predictvalue = model1.predict(Testdata_x)
        #if(predictvalue[0][predict]>0.9999):
        predictLabel = NameLabel(predict[0])
        cv2.putText(frame, predictLabel, (left+20 , bottom+20 ),font,0.5,color)
    #===========================

    cv2.imshow('frame',frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()