zhenghuatan/GMM-UBM_MAP_SV

Python code for GMM-UBM and MAP adaptation based speaker verification

Citation:
[1] Z.-H. Tan, A. K. Sarkar and N. Dehak, "rVAD: an unsupervised segment-based robust voice activity detection method," Computer Speech and Language, 2019.
In that paper, speaker verification is used as one downstream application of VAD.

The code was tested on Python 2.7.

(0)/ Workflow of the code:

 feature extraction -->> GMM-UBM-training -->> GMM-UBM+MAP (target model) -->>  Scoring [log likelihood ratio]



(1)/ Feature extraction (MFCC+RASTA, VAD, CMN):
=========================================================================
     1.1) First, create the list file for feature extraction, i.e. "feat.lst"
      Contents:
      [1st column] --> source wave file, [2nd column] --> destination feature file

      e.g.
      wav/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.htk
      wav/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.htk
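
      A list in this format can be generated rather than typed by hand. The sketch below is only an illustration and is not part of the repository: the function name "build_feat_list" and the choice to mirror the wave directory tree under the feature directory are assumptions.

```python
import os


def build_feat_list(wav_root, feat_root, list_path="feat.lst"):
    """Write one 'source.wav,destination.htk' row per wave file under wav_root.

    Hypothetical helper: the destination path mirrors the sub-directory
    layout of wav_root below feat_root, with the extension changed to .htk.
    """
    rows = []
    for dirpath, _dirnames, filenames in os.walk(wav_root):
        for name in filenames:
            if not name.lower().endswith(".wav"):
                continue  # ignore anything that is not a wave file
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, wav_root)
            dst = os.path.join(feat_root, os.path.splitext(rel)[0] + ".htk")
            rows.append(src + "," + dst)
    rows.sort()  # stable order, independent of the file system
    with open(list_path, "w") as handle:
        for row in rows:
            handle.write(row + "\n")
    return rows
```

      For example, build_feat_list("wav", "feat") would write "feat.lst" in the current directory.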

     1.2) Run the following command in a Bash shell
     >> OMP_NUM_THREADS=1 python featureExtract.py 

       

      [#] Change the following feature-extraction parameters in "featureExtract.py" as required
      e.g. (default)
      winlen, ovrlen, pre_coef, nfilter, nftt = 0.025, 0.01, 0.97, 20, 512  #[window size (sec)], [frame shift (sec)], [pre-emphasis coeff.],
                                                                             #[no. of filters in MFCC], [N-point FFT]
      [#] If you do not want to apply the default RASTA filtering to the MFCCs
           - comment out the following line in "mfcc.py":
            t=rastaFilter(t).T
            and replace it with "t=t.T"

      [#] Default VAD: energy threshold, i.e. "opts==1"
           - To use rVAD labels generated by MATLAB instead,
           - set "opts" to 0 and then follow the instructions inside "featureExtract.py" to plug in the VAD file

      [#] To disable VAD
          - set "opts" to any value other than 0 or 1, e.g. 3

      [#] To disable "cmn", comment out the following line in "featureExtract.py"
          - f=cmvn(f), i.e. "#f=cmvn(f)"
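
      For readers unfamiliar with the normalization step, the sketch below shows per-utterance cepstral mean and variance normalization in plain Python. It is an assumption about what a function named "cmvn" typically does, not a copy of the repository code; the name "cmvn_sketch" and the "eps" guard are illustrative.

```python
import math


def cmvn_sketch(frames, eps=1e-8):
    """Normalize every feature dimension to zero mean and unit variance.

    frames is a list of equal-length feature vectors (one per frame).
    eps is a hypothetical guard against division by zero for constant
    dimensions.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / float(n) for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / float(n))
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]
```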
      
        

(2)/ GMM-UBM training
================================================================================
     2.1) First, create the list file for the GMM/UBM training data, i.e. "UBM.lst"
          e.g. [each row contains one feature file]

         feat/TIMIT/TEST/DR1/FAKS0/SA1.htk
         feat/TIMIT/TEST/DR1/FAKS0/SA2.htk
         feat/TIMIT/TEST/DR1/FAKS0/SI1573.htk
         feat/TIMIT/TEST/DR1/FAKS0/SI2203.htk

        **Important note: files that contain only a single frame/feature vector are discarded before UBM training starts,
                         because of the way array/matrix elements are indexed in Python.
         
    2.2) Run the following command in a Bash shell
         >> OMP_NUM_THREADS=1 python GMMtrn.py

   [#] Default parameters in "GMMtrn.py" (edit them as required; they select different ways of training the GMM)
    nmix, dsfactor, rmd, emIter = 4, 10, 0, 5
                       # nmix     = [no. of mixtures, power of 2]
                       # dsfactor = decimation of frames per file during intermediate UBM training (speed-up)
                       # rmd      = 1: 1) randomize frames --> 2) decimate [the log-likelihood may not increase in EM for the intermediate models]
                       # emIter   = [no. of EM iterations]

  [#] Default directory for saving the GMM (change it as required)
      ubmDir= 'GMM' + str(nmix)  
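
  The effect of "dsfactor" and "rmd" can be pictured with the short sketch below. It is only an illustration of decimation with optional shuffling; the function name "select_frames" and the "seed" argument are assumptions, and the actual selection logic in "GMMtrn.py" may differ.

```python
import random


def select_frames(frames, dsfactor=10, rmd=0, seed=0):
    """Keep every dsfactor-th frame, optionally after shuffling (rmd == 1).

    seed is a hypothetical argument that makes the shuffle repeatable.
    """
    frames = list(frames)
    if rmd == 1:
        # 1) randomize the frame order before 2) decimating
        random.Random(seed).shuffle(frames)
    return frames[::dsfactor]
```

  With dsfactor = 10, a file of 25 frames contributes 3 frames to the intermediate training step.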


(3)/ GMM-UBM+MAP (target model) 
=================================================================================
    3.1) First, prepare the list file for the target models derived from the UBM, i.e. "target.ndx"
         e.g. [1st column] --> target model id, [2nd column] --> feature file
 
          m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084154554_m0001_31.htk
          m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084155412_m0001_31.htk
          m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156114_m0001_31.htk
          m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156879_m0001_32.htk
          m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084157752_m0001_32.htk
          m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084158439_m0001_32.htk
          m0001_33,feat/reddots_r2015q4_v1/pcm/m0001/20150130084159156_m0001_33.htk
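
          Several rows may share one model id, as in the example, so the feature files have to be grouped per model before adaptation. A minimal sketch of that grouping follows; the function name "read_ndx" is hypothetical and the repository may parse the file differently.

```python
from collections import OrderedDict


def read_ndx(lines):
    """Group 'model_id,feature_file' rows by model id, keeping file order."""
    models = OrderedDict()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank rows
        model_id, feat_path = line.split(",", 1)
        models.setdefault(model_id, []).append(feat_path)
    return models
```

          Usage: open "target.ndx" and pass the file object to read_ndx; iterating over the result yields each model id with its list of feature files.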

          **Important note: make sure that no file contains only a single frame/feature vector.
                         Either remove such files from the list or duplicate the frame so that there are at least two,
                         otherwise an error will occur because of the way array/matrix elements are indexed in Python.
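
          One way to find such files is to read the frame count from the header of each feature file. The sketch below assumes the ".htk" files carry the standard 12-byte big-endian HTK header (nSamples, sampPeriod, sampSize, parmKind); whether the repository writes exactly this layout is an assumption, and both function names are hypothetical.

```python
import struct


def htk_num_frames(path):
    """Return nSamples from a standard 12-byte big-endian HTK header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    if len(header) < 12:
        raise ValueError("truncated HTK header: " + path)
    n_samples, _period, _size, _kind = struct.unpack(">iihh", header)
    return n_samples


def drop_short_files(rows, min_frames=2):
    """Keep only list rows whose feature file has at least min_frames frames.

    The feature file is taken to be the last comma-separated column, so the
    same helper works for one-column and two-column list files.
    """
    kept = []
    for row in rows:
        feat_path = row.strip().split(",")[-1]
        if htk_num_frames(feat_path) >= min_frames:
            kept.append(row)
    return kept
```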

    3.2) Run the following command in a Bash shell
         >> OMP_NUM_THREADS=1 python TargetTRN.py

   [#] Default parameters for MAP (change them as required in "TargetTRN.py")
       MapItr, Tau = 3, 10.0 #[no. of MAP iterations], [value of relevance factor]

   [#] By default the UBM model is stored in the current directory in a folder named e.g. GMM512 (change it as required)
       ubmDir= 'GMM' + str(nmix)
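
   To show how the relevance factor enters, here is a small sketch of relevance MAP adaptation for a diagonal-covariance GMM. It rests on several assumptions that the README does not confirm: only the means are adapted, the UBM means stay fixed as the prior in every iteration, and the frame posteriors are recomputed with the current adapted means. All function names are hypothetical.

```python
import math


def posteriors(x, weights, means, variances):
    """Component posteriors P(k | x) for a diagonal GMM, in the log domain."""
    logs = []
    for w, m, v in zip(weights, means, variances):
        quad = sum(
            math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd
            for xd, md, vd in zip(x, m, v)
        )
        logs.append(math.log(w) - 0.5 * quad)
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, ubm_means, variances, tau=10.0, n_iter=3):
    """Mean-only relevance MAP (assumed variant, see the text above)."""
    means = [list(m) for m in ubm_means]
    dim = len(ubm_means[0])
    for _ in range(n_iter):
        counts = [0.0] * len(means)
        sums = [[0.0] * dim for _ in means]
        for x in frames:
            for k, p in enumerate(posteriors(x, weights, means, variances)):
                counts[k] += p
                for d in range(dim):
                    sums[k][d] += p * x[d]
        for k in range(len(means)):
            if counts[k] <= 0.0:
                continue  # no data for this component: keep current mean
            alpha = counts[k] / (counts[k] + tau)  # data-dependent weight
            for d in range(dim):
                data_mean = sums[k][d] / counts[k]
                means[k][d] = alpha * data_mean + (1.0 - alpha) * ubm_means[k][d]
    return means
```

   A larger Tau keeps the adapted means closer to the UBM; with Tau = 10.0, a component that is responsible for 10 frames moves halfway from the UBM mean to the mean of its data.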


(4)/  Scoring [log likelihood ratio]
=============================================================================
  4.1) First, prepare the trial list file, i.e. "m_part_01.ndx"
       e.g. [1st column] --> claimant model id, [2nd column] --> test trial feature file

       m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213253016_m0001_36.htk 
       m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213254935_m0001_32.htk 
       m0060_40,feat/reddots_r2015q4_v1/pcm/m0067/20150611185843833_m0067_36.htk

       **Important note: make sure that no file contains only a single frame/feature vector.
                         Either remove such files from the list or duplicate the frame so that there are at least two,
                         otherwise an error will occur because of the way array/matrix elements are indexed in Python.
 
  4.2) Set the number of threads for parallel scoring (default)
        CORES=2

  4.3) Set the name and directory of the score file (default)
       Scorefile='score.txt'   #output file : scores


  4.4) Run the following command in a Bash shell
       >> OMP_NUM_THREADS=1 python Scoring.py
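
  The score for one trial can be sketched as follows: the average per-frame log-likelihood of the test frames under the claimed target model minus the same quantity under the UBM. Averaging over frames and the (weights, means, variances) tuple layout are assumptions for illustration, and the function names are hypothetical; "Scoring.py" may organize this differently.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        logs = []
        for w, m, v in zip(weights, means, variances):
            quad = sum(
                math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd
                for xd, md, vd in zip(x, m, v)
            )
            logs.append(math.log(w) - 0.5 * quad)
        top = max(logs)  # log-sum-exp over the mixture components
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total / len(frames)


def llr_score(frames, target, ubm):
    """Log-likelihood ratio; target and ubm are (weights, means, variances)."""
    return gmm_avg_loglik(frames, *target) - gmm_avg_loglik(frames, *ubm)
```

  A positive score means the test frames fit the claimed target model better than the UBM; the accept/reject threshold is left to the user.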
