How to embedding audio stream data to k-vector (512) #9

buaapengbo · 2018-11-29T07:05:02Z

Hi, thank you for open-sourcing this!

I read your paper and tests/integration_test.py. My question is how you embed the audio stream data into vectors with D = 512.
It is essentially the same as the question here:
The way you generate train data or test data from an audio stream.

Is it something like librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40)?
Your paper says:
In this system, audio signals are first transformed into frames of width 25ms and step 10ms, and log-mel-filterbank energies of dimension 40 are extracted from each frame as the network input. These frames form overlapping sliding windows of a fixed length, on which we run the LSTM network. The last-frame output of the LSTM is then used as the d-vector representation of this sliding window
How can I reproduce this part?

I would appreciate any pointers and look forward to your response!
Thanks,
Bo


wq2012 · 2018-11-29T15:50:59Z

The feature extraction system and d-vector system at Google are proprietary code, and cannot be open-sourced. You need to either find a third-party implementation, or use your own implementation. This repo is dedicated to the UIS-RNN system.
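For readers writing their own front end, the feature description quoted above (25 ms frames, 10 ms step, 40 log-mel-filterbank energies, overlapping sliding windows) can be sketched with NumPy alone. This is only an illustrative sketch, not Google's implementation: the function names, the 16 kHz sample rate, the Hann window, the FFT size, the log floor, and the window and step lengths in frames are all assumptions, since the quoted passage does not specify them. Note also that MFCCs apply an additional DCT on top of log-mel energies, so `librosa.feature.mfcc` would not produce the same feature as the one the paper describes.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=np.float64) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=np.float64) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale from 0 Hz to Nyquist."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        left, center, right = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_frames(signal, sample_rate=16000, win_ms=25.0, hop_ms=10.0, n_mels=40):
    """Return an array of shape (n_frames, n_mels) of log-mel-filterbank energies."""
    signal = np.asarray(signal, dtype=np.float64)
    win = int(round(sample_rate * win_ms / 1000.0))   # 25 ms frame width
    hop = int(round(sample_rate * hop_ms / 1000.0))   # 10 ms frame step
    if signal.size < win:
        raise ValueError("signal is shorter than one frame")
    n_fft = 1 << (win - 1).bit_length()               # next power of two (assumption)
    n_frames = 1 + (signal.size - win) // hop
    index = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[index] * np.hanning(win)          # Hann window (assumption)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return np.log(mel_energy + 1e-6)                  # small floor avoids log(0)

def sliding_windows(features, window_frames, step_frames):
    """Group frames into overlapping fixed-length windows: (n_windows, window_frames, n_mels)."""
    n_frames = features.shape[0]
    if n_frames < window_frames:
        return np.empty((0, window_frames, features.shape[1]))
    starts = range(0, n_frames - window_frames + 1, step_frames)
    return np.stack([features[s:s + window_frames] for s in starts])

if __name__ == "__main__":
    audio = np.random.default_rng(0).standard_normal(16000)  # 1 s of noise as a stand-in
    feats = log_mel_frames(audio)
    windows = sliding_windows(feats, window_frames=24, step_frames=12)  # hypothetical sizes
    print(feats.shape, windows.shape)
```

Each window would then be passed to a separately trained speaker-encoder LSTM, whose last-frame output serves as the d-vector for that window. That network is not part of this repo and is not covered by the sketch.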

wq2012 added the question Further information is requested label Nov 29, 2018

wq2012 closed this as completed Nov 29, 2018
