# Spoofed Speech Detection via Maximum Likelihood Estimation of Gaussian Mixture Models
The goal of synthetic speech detection is to determine whether a speech segment $S$ is natural or synthetic/converted speech.

This notebook implements a Gaussian mixture model maximum likelihood (GMM-ML) classifier for synthetic (spoofed) speech detection. The approach uses standard mel-frequency cepstral coefficient (MFCC) features and gives the best performance on the [ASVspoof 2015 dataset](https://www.idiap.ch/dataset/avspoof) among the standard classifiers (GMM-SV, GMM-UBM, ...). For more background, see: *Hanilçi, Cemal, Tomi Kinnunen, Md Sahidullah, and Aleksandr Sizov. "Classifiers for synthetic speech detection: a comparison." In INTERSPEECH 2015*. The scripts use the Python package [Bob.Bio.SPEAR 2.0.4](https://pypi.python.org/pypi/bob.bio.spear/2.0.4) for speaker recognition.
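
As a reading aid, the decision rule that the scoring cell of this notebook applies can be sketched as follows. The symbols $x_t$, $T$, $M$, $\lambda_{\text{nat}}$ and $\lambda_{\text{synt}}$ are notation introduced here for illustration. The sketch assumes that each GMM returns a log-likelihood averaged over the $T$ frames of the segment; if a sum is returned instead, the $1/T$ factor drops out without changing the sign test.

$$
p(x \mid \lambda) = \sum_{m=1}^{M} w_m \, \mathcal{N}(x;\, \mu_m, \Sigma_m),
\qquad
\Lambda(S) = \frac{1}{T} \sum_{t=1}^{T} \Big[ \log p(x_t \mid \lambda_{\text{nat}}) - \log p(x_t \mid \lambda_{\text{synt}}) \Big]
$$

The segment $S$ is labelled natural when $\Lambda(S) > 0$ and synthetic otherwise, which matches the `llr_gmm_score[k] > 0` test used later in the notebook.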

This work is part of the ["DDoS Resilient Emergency Dispatch Center"](https://www.dhs.gov/science-and-technology/news/2015/09/04/dhs-st-awards-university-houston-26m-cyber-security-research) project at the University of Houston, funded by the Department of Homeland Security (DHS).


April 19, 2015

Lorenzo Rossi

(lorenzo **[dot]** rossi **[at]** gmail **[dot]** com)

In [70]:
import os
import time
import numpy as np
import pandas as pd
import scipy.io.wavfile  # the wavfile submodule has to be imported explicitly
from bob.bio.spear import preprocessor, extractor
from bob.bio.gmm import algorithm
from bob.io.base import HDF5File
from bob.learn import em
from sklearn.metrics import classification_report, roc_curve, roc_auc_score


def read(filename):
  """Read a .wav file and return [sampling rate, samples]; int16 samples are cast to float."""

  rate, data = scipy.io.wavfile.read(str(filename)) # the data is read in its native format
  if data.dtype == 'int16':
    data = data.astype(float)  # np.cast is no longer available in recent NumPy releases
  return [rate, data]


WAV_FOLDER = '/data/chendan/wav/' #'ASV2015dataset/wav/' # Path to folder containing speakers .wav subfolders
LABEL_FOLDER = '/data/chendan/protocal/CM_protocol/' #'ASV2015dataset/CM_protocol/' # Path to ground truth csv files

EXT = '.wav'
%matplotlib inline

## Loading the Ground Truth
Load the dataframes (tables) with the labels for the training, development and evaluation (hold-out) sets. Each subfolder corresponds to a different speaker. For example, T1 and D4 are the subfolders holding the utterances and spoofed segments of speakers T1 and D4, in the training and development sets respectively. Note that number of evaluation samples >> number of development samples >> number of training samples.

You can either select the speakers in each set one by one, *e.g.*:
```
train_subfls = ['T1', 'T2']
``` 
will only load segments from speakers T1 and T2 for training,

or use all the available speakers in a certain subset by leaving the list empty, *e.g.*:
```
devel_subfls = [] 
```
will load all the available Dx speaker segments for the development stage. If you are running this notebook for the first time, you may want to start with only two or so speakers per set for quick testing. The scripts may take several hours to run on the full-size datasets.
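
The "empty list means all speakers" convention described above can be tried on a small in-memory table before touching the real protocol files. The sketch below is illustrative only: `PROTOCOL_TEXT` and `load_protocol` are hypothetical names, and the rows are made up to mimic the space-separated layout (folder, file, method, source) read by the next cell.

```python
import io
import pandas as pd

# Toy protocol text in the same space-separated layout as the CM protocol files
# (folder, file, method, source). The rows are invented for illustration.
PROTOCOL_TEXT = """\
T1 T1_0001 human human
T1 T1_0002 S1 spoof
T2 T2_0001 human human
T3 T3_0001 S2 spoof
"""

def load_protocol(text, speakers):
    """Return the protocol rows of the given speakers; an empty list keeps all rows."""
    table = pd.read_csv(io.StringIO(text), sep=' ', header=None,
                        names=['folder', 'file', 'method', 'source'])
    if len(speakers):
        table = table[table.folder.isin(speakers)]
    return table.sort_values(['folder', 'file']).reset_index(drop=True)

subset = load_protocol(PROTOCOL_TEXT, ['T1', 'T2'])   # only speakers T1 and T2
everything = load_protocol(PROTOCOL_TEXT, [])         # empty list: all speakers
print(len(subset), len(everything))
```

Sorting by folder and file gives a deterministic row order, which is convenient when scores are later written back to the table row by row.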

In [80]:
train_subfls = ['T1','T3']#, 'T2', 'T3', 'T4', 'T5', 'T6', 'T7', 'T8', 'T9', 'T13']  #T13 used instead of T10 for gender balance
devel_subfls = ['D1','D3']#, 'D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'D9', 'D10']
evalu_subfls = ['E1']#, 'E2', 'E3', 'E4', 'E5', 'E6','E7',  'E8', 'E9', 'E10']
train = pd.read_csv(LABEL_FOLDER + 'cm_train.trn', sep=' ', header=None, names=['folder','file','method','source'])
if len(train_subfls): train = train[train.folder.isin(train_subfls)]
train.sort_values(['folder', 'file'], inplace=True)
# print(train)
devel = pd.read_csv(LABEL_FOLDER + 'cm_develop.ndx', sep=' ', header=None, names=['folder','file','method','source'])
if len(devel_subfls): devel = devel[devel.folder.isin(devel_subfls)]
devel.sort_values(['folder', 'file'], inplace=True)
print(devel)
evalu = pd.read_csv(LABEL_FOLDER +'cm_evaluation_0103.ndx', sep=' ', header=None, names=['folder','file','method','source'])
if len(evalu_subfls): evalu = evalu[evalu.folder.isin(evalu_subfls)]

evalu.sort_values(['folder', 'file'], inplace=True)

label_2_class = {'human':1, 'spoof':0}

print('training samples:',len(train))
print('development samples:',len(devel))
print('evaluation samples:',len(evalu))

     folder        file method source
2597     D1  D1_1002598  human  human
2598     D1  D1_1002599  human  human
2599     D1  D1_1002600  human  human
2600     D1  D1_1002601  human  human
2601     D1  D1_1002602  human  human
...     ...         ...    ...    ...
7767     D3  D3_1007768     S1  spoof
7768     D3  D3_1007769     S2  spoof
7769     D3  D3_1007770     S3  spoof
7770     D3  D3_1007771     S4  spoof
7771     D3  D3_1007772     S5  spoof

[3050 rows x 4 columns]
training samples: 1310
development samples: 3050
evaluation samples: 20


## Speech Preprocessing and MFCC Extraction
Silence removal and MFCC feature extraction for the training segments. For more details on the bob.bio.spear components involved, see:
https://www.idiap.ch/software/bob/docs/latest/bioidiap/bob.bio.spear/master/implemented.html

You can also skip this stage and load a set of features (see **Loading features** cell).
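
To give an intuition for what energy-based silence removal does, here is a minimal NumPy sketch. It is not the rule implemented by `preprocessor.Energy_Thr`: the function `energy_vad`, its frame sizes and its thresholding formula are assumptions chosen for illustration. A frame is kept as speech when its log energy lies above a fixed fraction of the range between the quietest and the loudest frame.

```python
import numpy as np

def energy_vad(signal, rate, ratio_threshold=0.1, win_ms=20, shift_ms=10):
    """Label each frame 1 (speech) or 0 (silence) from its log energy.

    Simplified illustration: a frame counts as speech when its log energy is
    above min + ratio_threshold * (max - min) over the whole signal.
    """
    win = int(rate * win_ms / 1000)
    shift = int(rate * shift_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - win) // shift)
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * shift:i * shift + win].astype(float)
        energy[i] = np.log(np.sum(frame ** 2) + 1e-10)
    threshold = energy.min() + ratio_threshold * (energy.max() - energy.min())
    return (energy > threshold).astype(int)

# Synthetic test signal: 0.5 s of near silence followed by 0.5 s of a loud tone.
rate = 16000
t = np.arange(rate // 2) / rate
rng = np.random.default_rng(0)
silence = 1e-4 * rng.standard_normal(rate // 2)
tone = np.sin(2 * np.pi * 440 * t)
labels = energy_vad(np.concatenate([silence, tone]), rate)
print(labels.sum(), "speech frames out of", len(labels))
```

Only the frames labelled 1 would be passed on to the cepstral extractor, so a threshold that is too high can leave a file with no frames at all, which is the case the notebook handles by retrying with a lower ratio.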

In [72]:
# Parameters
n_ceps = 60 # number of cepstral coefficients (implicit in extractor)
silence_removal_ratio = .1

In [73]:
subfolders = train_subfls
ground_truth = train

# initialize feature matrix
features = []
y = np.zeros((len(ground_truth),))
print("Extracting features for training stage.")

vad = preprocessor.Energy_Thr(ratio_threshold=silence_removal_ratio)
cepstrum = extractor.Cepstral()

k = 0
start_time = time.perf_counter()  # time.clock() was removed in Python 3.8

n_subfls = len(subfolders)
for folder in subfolders[0:n_subfls]:
    print(folder, end=", ")
    folder = "".join((WAV_FOLDER,folder,'/'))
    f_list = os.listdir(folder)
    for f_name in f_list:
        # ground truth: use the file field of the protocol table to look up
        # the source field (spoof or human) of the corresponding .wav file
        try: 
            label = ground_truth[ground_truth.file==f_name[:-len(EXT)]].source.values[0]
        except IndexError:
            continue  # file not listed in the protocol
        # silence removal
        x = read(folder+f_name)
        vad_data = vad(x)
        if not vad_data:
            continue  # skip the file so that features and labels stay aligned

        if not vad_data[2].max():
            # no frame was labelled as speech: retry once with a lower threshold
            vad = preprocessor.Energy_Thr(ratio_threshold=silence_removal_ratio*.8)
            vad_data = vad(x)
            vad = preprocessor.Energy_Thr(ratio_threshold=silence_removal_ratio)
        # MFCC extraction 
        mfcc = cepstrum(vad_data)
        features.append(mfcc)
        y[k] = label_2_class[label]
        k += 1

y = y[:k]  # keep only the labels of the files that were actually processed
Xf = np.array(features, dtype=object)  # one (frames x coefficients) matrix per file
print(k,"files processed in",(time.perf_counter()-start_time)/60,"minutes.")

Extracting features for training stage.
T1, bob.bio.spear@2020-01-03 13:09:30,422 -- INFO: After thresholded Energy-based VAD there are 209 frames remaining over 407
bob.bio.spear@2020-01-03 13:09:30,528 -- INFO: After thresholded Energy-based VAD there are 143 frames remaining over 273
bob.bio.spear@2020-01-03 13:09:30,601 -- INFO: After thresholded Energy-based VAD there are 140 frames remaining over 460


In [74]:
print(Xf[y==0].shape)

(1010,)


### Saving features

In [25]:
np.save('X1.npy',Xf)
np.save('y1.npy',y)
print('Feature and label matrices saved to disk')

Feature and label matrices saved to disk


### Loading features

In [9]:
# Load already extracted features to skip the preprocessing-extraction stage
# Xf = np.load('train_features_10.npy')
# y = np.load('y_10.npy')
Xf = np.load('X.npy',allow_pickle=True)
y = np.load('y.npy',allow_pickle=True)

In [23]:
print(Xf.shape)

(655,)
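
The feature array has one entry per file, and each entry is a matrix with a different number of frames, which is why it is stored as a NumPy object array and loaded with `allow_pickle=True`. The round trip can be checked in isolation with the sketch below; the file name `features_demo.npy` and the toy matrices are assumptions made for the example.

```python
import os
import tempfile
import numpy as np

# Two toy "utterances" with different numbers of frames but the same number of
# coefficients, mimicking per-file MFCC matrices.
utterances = [np.zeros((3, 4)), np.ones((5, 4))]

# Build a 1-D object array explicitly so each element stays a separate matrix.
ragged = np.empty(len(utterances), dtype=object)
for i, u in enumerate(utterances):
    ragged[i] = u

path = os.path.join(tempfile.mkdtemp(), 'features_demo.npy')
np.save(path, ragged)                         # object arrays are pickled on save
restored = np.load(path, allow_pickle=True)   # hence allow_pickle=True on load

stacked = np.vstack(restored)                 # frames of all utterances, row-wise
print(restored.shape, stacked.shape)
```

Stacking with `np.vstack` is the same step the training cells use to turn the per-file matrices of one class into a single frames-by-coefficients matrix.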


# GMM - ML Classification

## GMM Training
Train the GMMs for natural and synthetic speech. For documentation on the bob.learn.em k-means and GMM machines, see:
https://pythonhosted.org/bob.learn.em/guide.html

You can also skip the training stage and load an already trained GMM model (see cell **Loading GMM Model**).
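
For readers without a working bob installation, the same training idea (k-means initialisation followed by EM, one GMM per class) can be sketched with scikit-learn. This is a swapped-in technique, not the bob.learn.em code used in this notebook: `GaussianMixture`, the diagonal covariance choice, the toy data and the names `fit_gmm`, `gmm_nat_demo` and `gmm_synt_demo` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy stand-ins for stacked MFCC frames (rows = frames, columns = coefficients).
frames_nat = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
frames_synt = rng.normal(loc=1.5, scale=1.0, size=(500, 4))

def fit_gmm(frames, n_components=4, max_iter=25):
    """Fit a diagonal-covariance GMM by EM, with means initialised by k-means."""
    gmm = GaussianMixture(n_components=n_components, covariance_type='diag',
                          init_params='kmeans', max_iter=max_iter, tol=1e-5,
                          random_state=0)
    return gmm.fit(frames)

gmm_nat_demo = fit_gmm(frames_nat)     # model of the "natural" class
gmm_synt_demo = fit_gmm(frames_synt)   # model of the "synthetic" class

# score() returns the average per-frame log-likelihood of the given frames.
held_out = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
llr = gmm_nat_demo.score(held_out) - gmm_synt_demo.score(held_out)
print("LLR for held-out natural-like frames:", llr)
```

A positive difference means the natural-speech model explains the frames better than the synthetic-speech model, which is the quantity the scoring stage thresholds at zero.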

In [75]:
# Parameters of the GMM machines
n_gaussians = 128 # number of Gaussians
max_iterats = 25 # maximum number of iterations

### GMM for natural speech 

In [76]:
# Initialize and train k-means machine: the means will initialize EM algorithm for GMM machine
start_time = time.perf_counter()  # time.clock() was removed in Python 3.8
kmeans_nat = em.KMeansMachine(n_gaussians,n_ceps)
kmeansTrainer = em.KMeansTrainer()
print(kmeans_nat.shape,Xf[y==1].shape)
em.train(kmeansTrainer, kmeans_nat, np.vstack(Xf[y==1]), max_iterations = max_iterats, convergence_threshold = 1e-5)
#kmeans_nat.means

# initialize and train GMM machine
gmm_nat = em.GMMMachine(n_gaussians,n_ceps)
trainer = em.ML_GMMTrainer(True, True, True)  # update means, variances and weights
gmm_nat.means = kmeans_nat.means
em.train(trainer, gmm_nat, np.vstack(Xf[y==1]), max_iterations = max_iterats, convergence_threshold = 1e-5)
#gmm_nat.save(HDF5File('gmm_nat.hdf5', 'w'))
print("Done in:", (time.perf_counter() - start_time)/60, "minutes")
print(gmm_nat)

(128, 60) (300,)
Done in: 0.7205086166666661 minutes
<bob.learn.em.GMMMachine object at 0x7f9649115af0>

### GMM for synthetic speech

In [77]:
# initialize and train k-means machine: the means will initialize EM algorithm for GMM machine
start_time = time.perf_counter()  # time.clock() was removed in Python 3.8
kmeans_synt = em.KMeansMachine(n_gaussians,n_ceps)
kmeansTrainer = em.KMeansTrainer()
print(kmeans_synt.shape,Xf[y==0].shape)
em.train(kmeansTrainer, kmeans_synt, np.vstack(Xf[y==0]), max_iterations = max_iterats, convergence_threshold = 1e-5)

# initialize and train GMM machine
gmm_synt = em.GMMMachine(n_gaussians,n_ceps)
trainer = em.ML_GMMTrainer(True, True, True)  # update means, variances and weights
gmm_synt.means = kmeans_synt.means
em.train(trainer, gmm_synt, np.vstack(Xf[y==0]), max_iterations = max_iterats, convergence_threshold = 1e-5)
print("Done in:", (time.perf_counter() - start_time)/60, "minutes")
#gmm_synt.save(HDF5File('gmm_synt.hdf5', 'w'))
print(gmm_synt)

(128, 60) (1010,)
Done in: 2.4534923833333324 minutes
<bob.learn.em.GMMMachine object at 0x7f9649c09470>

### Loading GMM model

In [4]:
gmm_nat = em.GMMMachine()
gmm_nat.load(HDF5File('gmm_nat.hdf5', 'r'))
gmm_synt = em.GMMMachine()
gmm_synt.load(HDF5File('gmm_synt.hdf5','r'))

In [None]:
# Run this cell after the GMM-ML Scoring cell, which defines llr_gmm_score and z_gmm
np.save('p_gmm_ml_eval_10.npy',llr_gmm_score)
np.save('z_gmm_ml_eval_est_10.npy',z_gmm)

## GMM-ML Scoring
Extract the features for the testing data, compute the log-likelihood ratio between the natural and synthetic GMMs, and then compute the ROC AUC and the estimated EER.
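
One common way to turn log-likelihood ratio scores into a ROC AUC and an estimated equal error rate (EER) is sketched below with the scikit-learn functions imported at the top of this notebook. The helper `estimate_eer` and the synthetic scores are assumptions for illustration; the EER is approximated at the ROC point where the false positive and false negative rates are closest, which is only one of several possible estimators.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def estimate_eer(labels, scores):
    """Estimate the equal error rate from the ROC curve.

    The EER is approximated at the ROC point where the false positive rate and
    the false negative rate (1 - TPR) are closest to each other.
    """
    fpr, tpr, thresholds = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.argmin(np.abs(fpr - fnr))
    return (fpr[idx] + fnr[idx]) / 2.0, thresholds[idx]

# Toy scores: natural speech (label 1) tends to get a higher LLR than spoofed (label 0).
rng = np.random.default_rng(0)
labels = np.concatenate([np.ones(200), np.zeros(200)])
scores = np.concatenate([rng.normal(1.0, 1.0, 200), rng.normal(-1.0, 1.0, 200)])

auc = roc_auc_score(labels, scores)
eer, eer_threshold = estimate_eer(labels, scores)
print("ROC AUC:", auc, "estimated EER:", eer)
```

With the notebook's variables, the same helper would be called on the `z` and `score_gmm` columns that the scoring cell adds to the ground-truth table.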

In [82]:
status = 'evalu' # 'devel'(= test) OR 'evalu'(= hold out)
start_time = time.perf_counter()  # time.clock() was removed in Python 3.8

if status == 'devel':
    subfolders = devel_subfls
    ground_truth = devel
elif status == 'evalu':
    subfolders = evalu_subfls
    ground_truth = evalu
n_subfls = len(subfolders)
# initialize score and class arrays
llr_gmm_score = np.zeros(len(ground_truth),)
z_gmm = np.zeros(len(ground_truth),)
print(status)

vad = preprocessor.Energy_Thr(ratio_threshold=.1)
cepstrum = extractor.Cepstral()

k = 0    # number of files actually scored
row = 0  # row of ground_truth being visited; keeps scores aligned with labels
thr = .5
speaker_list = ground_truth.folder.unique()

for speaker_id in speaker_list:
    f_list = list(ground_truth[ground_truth.folder==speaker_id].file)
    folder = "".join([WAV_FOLDER,speaker_id,'/'])
    print(f_list)
    print(speaker_id, end=',')

    for f in f_list:
        f_name = "".join([folder,f,EXT])
        if os.path.exists(f_name):
            x = read(f_name)
            # voice activity detection
            vad_data = vad(x)
            if vad_data:
                if not vad_data[2].max():
                    # no frame was labelled as speech: retry once with a lower threshold
                    vad = preprocessor.Energy_Thr(ratio_threshold=.08)
                    vad_data = vad(x)
                    vad = preprocessor.Energy_Thr(ratio_threshold=.1)
                # MFCC extraction 
                mfcc = cepstrum(vad_data)
                # Log likelihood ratio computation: positive values favour natural speech
                llr_gmm_score[row] = gmm_nat(mfcc)-gmm_synt(mfcc)
                z_gmm[row] = int(llr_gmm_score[row]>0)
                k += 1
        row += 1  # missing files and files rejected by the VAD keep a score of 0
        
ground_truth['z'] = ground_truth.source.map(lambda x: int(x=='human'))
ground_truth['z_gmm'] = z_gmm
ground_truth['score_gmm'] = llr_gmm_score
#print(roc_auc_score(ground_truth.z, ground_truth.z_gmm))
print(k,"files processed in",(time.perf_counter()-start_time)/60,"minutes.")

evalu
['ELJ10001', 'ELJ10002', 'ELJ10003', 'ELJ10004', 'ELJ10005', 'ELJ10006', 'ELJ10007', 'ELJ10008', 'ELJ10009', 'ELJ10010', 'ELJ10011', 'ELJ10012', 'ELJ10013', 'ELJ10014', 'ELJ10015', 'ELJ10016', 'ELJ10017', 'ELJ10018', 'ELJ10019', 'ELJ10020']
ELJ1,bob.bio.spear@2020-01-03 14:17:37,622 -- INFO: After thresholded Energy-based VAD there are 239 frames remaining over 338
bob.bio.spear@2020-01-03 14:17:37,749 -- INFO: After thresholded Energy-based VAD there are 168 frames remaining over 255
bob.bio.spear@2020-01-03 14:17:37,840 -- INFO: After thresholded Energy-based VAD there are 284 frames remaining over 302
bob.bio.spear@2020-01-03 14:17:37,988 -- INFO: After thresholded Energy-based VAD there are 222 frames remaining over 283
bob.bio.spear@2020-01-03 14:17:38,105 -- INFO: After thresholded Energy-based VAD there are 115 frames remaining over 234
bob.bio.spear@2020-01-03 14:17:38,168 -- INFO: After thresholded Energy-based VAD there are 171 frames remaining over 260
bob.bio.spear@2020-01-03 14:17:38,260 -- INFO: After thresholded Energy-based VAD there are 167 frames remaining over 275
bob.bio.spear@2020-01-03 14:17:38,350 -- INFO: After thresholded Energy-based VAD there are 183 frames remaining over 255
bob.bio.spear@2020-01-03 14:17:38,447 -- INFO: After thresholded Energy-based VAD there are 197 frames remaining over 296
bob.bio.spear@2020-01-03 14:17:38,552 -- INFO: After thresholded Energy-based VAD there are 194 frames remaining over 265
bob.bio.spear@2020-01-03



In [85]:
# Performance evaluation
# Convention in this cell: positive = human (z == 1), negative = spoofed (z == 0)

# actual number of human files
humans = sum(ground_truth.z==1)
print("Actual humans:", humans)

# actual number of spoofed files
spoofed = sum(ground_truth.z==0)
print("Actual spoofed:", spoofed)

# predicted number of human files
pre_humans = sum(ground_truth.z_gmm==1)
print("Predicted humans:", pre_humans)

# predicted number of spoofed files
pre_spoofed = sum(ground_truth.z_gmm==0)
print("Predicted spoofed:", pre_spoofed)

# confusion-matrix counts
tp = 0
tn = 0
fp = 0
fn = 0
for i in ground_truth.index:
    if ground_truth.z[i] == 1 and ground_truth.z_gmm[i]==1:
        tp += 1
    elif ground_truth.z[i] == 1 and ground_truth.z_gmm[i]==0:
        fn += 1
    elif ground_truth.z[i] == 0 and ground_truth.z_gmm[i]==0:
        tn += 1
    elif ground_truth.z[i] == 0 and ground_truth.z_gmm[i]==1:
        fp += 1
print("TP_Num:", tp)
print("TN_Num:", tn)
print("FP_Num:", fp)
print("FN_Num:", fn)

# These ratios raise ZeroDivisionError when one class is absent from ground_truth
# (tp + fn == 0 when there are no human files, fp + tn == 0 when there are no spoofed files)
tpr = tp/(tp+fn)
fpr = fp/(fp+tn)
tnr = tn/(fp+tn)
fnr = fn/(tp+fn)
acc = (tp+tn)/(tp+fp+tn+fn)
print("TPR:", tpr)
print("FPR:", fpr)
print("TNR:", tnr)
print("FNR:", fnr)
print("Accuracy:", acc)

# humans = z_gmm[z_dvl==0]
# spoofed = z_gmm[z_dvl==1]
# fnr = 100*(1-(humans<thr).sum()/len(humans))
# fpr = 100*(1-(spoofed>=thr).sum()/len(spoofed))
# print("ROC AUC score:", roc_auc_score(z_dvl,z_gmm))
# print("False negative rate %:", fnr)
# print("False positive rate %:", fpr)
# print("EER %: <=", (fnr+fpr)/2)

Actual humans: 0
Actual spoofed: 20
Predicted humans: 11
Predicted spoofed: 9
TP_Num: 0
TN_Num: 9
FP_Num: 11
FN_Num: 0


ZeroDivisionError: division by zero
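
The `ZeroDivisionError` above occurs because this subset contains no human files, so `tp + fn` is zero. A minimal sketch of a guarded, vectorized variant follows, assuming the same label convention (1 = human, 0 = spoofed). The helper name `confusion_rates` is hypothetical and not part of the notebook, and reporting undefined rates as NaN is one possible design choice among several.

```python
import numpy as np

def confusion_rates(z_true, z_pred):
    """Confusion-matrix counts and rates; a rate is NaN when its denominator is zero."""
    z_true = np.asarray(z_true).astype(int)
    z_pred = np.asarray(z_pred).astype(int)
    tp = int(np.sum((z_true == 1) & (z_pred == 1)))
    fn = int(np.sum((z_true == 1) & (z_pred == 0)))
    tn = int(np.sum((z_true == 0) & (z_pred == 0)))
    fp = int(np.sum((z_true == 0) & (z_pred == 1)))

    def ratio(num, den):
        # avoid ZeroDivisionError when a class is missing from the data
        return num / den if den else float('nan')

    return {
        'tp': tp, 'tn': tn, 'fp': fp, 'fn': fn,
        'tpr': ratio(tp, tp + fn), 'fnr': ratio(fn, tp + fn),
        'fpr': ratio(fp, fp + tn), 'tnr': ratio(tn, fp + tn),
        'acc': ratio(tp + tn, tp + tn + fp + fn),
    }

# Toy labels reproducing the counts printed above: 20 spoofed files, 11 of them predicted human
toy_rates = confusion_rates(np.zeros(20), [1] * 11 + [0] * 9)
print(toy_rates)
```

With `ground_truth` available, the call would be `confusion_rates(ground_truth.z, ground_truth.z_gmm)`.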

In [47]:
ground_truth

| Unnamed: 0 | folder | file | method | source | z | z_gmm | score_gmm |
|---|---|---|---|---|---|---|---|
| 2597 | D1 | D1_1002598 | human | human | 1 | 1.0 | 1.447147 |
| 2598 | D1 | D1_1002599 | human | human | 1 | 1.0 | 2.336380 |
| 2599 | D1 | D1_1002600 | human | human | 1 | 1.0 | 3.082637 |
| 2600 | D1 | D1_1002601 | human | human | 1 | 1.0 | 3.716948 |
| 2601 | D1 | D1_1002602 | human | human | 1 | 1.0 | 0.914256 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 7767 | D3 | D3_1007768 | S1 | spoof | 0 | 0.0 | -2.641017 |
| 7768 | D3 | D3_1007769 | S2 | spoof | 0 | 0.0 | -0.571958 |
| 7769 | D3 | D3_1007770 | S3 | spoof | 0 | 0.0 | -5.441518 |
| 7770 | D3 | D3_1007771 | S4 | spoof | 0 | 0.0 | -4.762759 |


### EER computation

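The equal error rate ($EER$) is the operating point at which the false negative rate ($FNR$) equals the false positive rate ($FPR$). It is an error metric commonly used to characterize biometric systems.

The threshold $thr$ is set by hand here: adjust it until $|FNR-FPR|$ is small, then take the average of the two rates as the estimate of the $EER$. The cell below reports the estimate as exact once the gap falls below 0.25 percentage points, and as approximate otherwise.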
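
The manual search for the threshold can also be automated. The following is a minimal sketch, assuming higher scores indicate human speech and that every observed score is tried as a candidate threshold; the helper name `estimate_eer` and the toy scores are hypothetical and not part of the notebook.

```python
import numpy as np

def estimate_eer(scores, is_human):
    """Try every observed score as threshold; return (EER estimate in %, threshold)."""
    scores = np.asarray(scores, dtype=float)
    is_human = np.asarray(is_human, dtype=bool)
    human_scores = scores[is_human]
    spoof_scores = scores[~is_human]
    best = None
    for thr in np.unique(scores):
        spoof_accept = 100 * np.mean(spoof_scores > thr)    # spoofed files scored as human
        human_reject = 100 * np.mean(human_scores <= thr)   # human files scored as spoofed
        gap = abs(spoof_accept - human_reject)
        if best is None or gap < best[0]:
            best = (gap, (spoof_accept + human_reject) / 2, thr)
    return best[1], best[2]

# Toy scores: four human files followed by four spoofed files
toy_scores = [2.0, 1.0, 0.5, -0.5, -2.0, -1.0, -0.2, 0.7]
toy_is_human = [1, 1, 1, 1, 0, 0, 0, 0]
toy_eer, toy_thr = estimate_eer(toy_scores, toy_is_human)
print("EER ~", toy_eer, "% at thr =", toy_thr)
```

With few files the two error rates move in coarse steps, so the smallest gap may still be well above zero and the result should be read as an estimate.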

In [55]:
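thr = -.115
pz = llr_gmm_score
z_true = np.array(ground_truth.z)
# z == 1 marks human speech and z == 0 marks spoofed speech
humans = pz[z_true==1]
spoofed = pz[z_true==0]
# In this cell "positive" means spoofed (unlike the performance evaluation cell above,
# where positive = human): a false negative is a spoofed file scored above thr,
# a false positive is a human file scored at or below thr
fnr = 100*(spoofed>thr).sum()/len(spoofed)
fpr = 100*(humans<=thr).sum()/len(humans)
print("False negative vs positive rates %:", fnr, fpr)
print("FNR - FPR %:", fnr-fpr)
# report the EER as exact only when the two rates differ by less than 0.25 percentage points
if np.abs(fnr-fpr) <.25:
    print("EER =", (fnr+fpr)/2,"%")
else:
    print("EER ~", (fnr+fpr)/2,"%")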

False negative vs positive rates %: 4.701754385964913 10.0
FNR - FPR %: -5.298245614035087
EER ~ 7.350877192982456 %
