add integration test for asr egs #114

Merged
merged 74 commits into from
Nov 15, 2019
Commits
80e0618
fix integration with ops feat
zh794390558 Sep 26, 2019
c16f21d
fix for interation test
zh794390558 Sep 26, 2019
152c792
add speech utils
zh794390558 Sep 26, 2019
95c3477
fix path of FE shell
Sep 27, 2019
28dfbf3
add kaldi prepare
Sep 27, 2019
3cbfa83
fix kaldi tools compile
zh794390558 Sep 27, 2019
5e7131e
format
zh794390558 Sep 27, 2019
813e1ea
fix path
Sep 27, 2019
dd71db4
Merge branch 'ci' of https://github.com/didi/delta into ci
Sep 27, 2019
fc56c06
add tools for kaldi
zh794390558 Sep 27, 2019
6ecf309
Merge branch 'ci' of https://github.com/didi/delta into ci
Sep 29, 2019
fc4fbb0
Merge branch 'ci' of https://github.com/didi/delta into ci
Sep 29, 2019
1b7402f
Merge branch 'ci' of https://github.com/didi/delta into ci
Sep 29, 2019
8d88070
fix kaldi install
zh794390558 Sep 30, 2019
39e1164
update hparam.py
GaryGao99 Sep 30, 2019
13d563a
add make stft features
Oct 8, 2019
ee42e09
add cmvn
Oct 8, 2019
d098564
add snip_edges to stft
Oct 11, 2019
b4fbe3f
Merge branch 'master' into ci
zh794390558 Oct 11, 2019
7702415
change delta_delta shape
Oct 12, 2019
dd5c76c
fix fbank as kaldi
Oct 14, 2019
80826f5
Merge branch 'ci' of https://github.com/didi/delta into ci
Oct 14, 2019
3a4514b
set default fbank features as kaldi
Oct 15, 2019
79bf62b
fix spectrum_test real value
Oct 15, 2019
23b4810
rm do_preemphasis2 and fix fbank_test
Oct 15, 2019
18f5f92
fix delta_delta shape
Oct 16, 2019
666f928
fix sample rate and audio data format
Oct 16, 2019
e9d1f4e
fix sr and test
Oct 16, 2019
c39bf8b
fix high-freq
Oct 22, 2019
4813aec
Merge branch 'master' into ci
zh794390558 Oct 23, 2019
232d4af
Merge branch 'master' into ci
zh794390558 Oct 30, 2019
669f991
change MAIN_ROOT to PACKAGE_ROOT_DIR
Oct 30, 2019
0c42185
Merge branch 'master' into ci
zh794390558 Oct 30, 2019
f9ff8a5
fix make_fbank on tf2.0
Oct 30, 2019
6e70b00
Merge branch 'ci' of https://github.com/didi/delta into ci
Oct 30, 2019
d524482
fix apply-cmvn
Nov 1, 2019
329e793
fix dump.sh
Nov 7, 2019
61e1688
delete print
Nov 7, 2019
3bb78b4
add mfcc features
Nov 8, 2019
5541c8a
add mfcc FE
Nov 8, 2019
b9dbec4
add make_mfcc.sh
Nov 11, 2019
4e25833
add add_noise_rir_aecres
Nov 11, 2019
789c363
fix tf import
Nov 11, 2019
34cbd44
fix loader setting
Nov 11, 2019
badc0c7
fix import get_session()
Nov 11, 2019
751e16a
fix kaldi tools install script
zh794390558 Nov 11, 2019
70b1660
fix ci
zh794390558 Nov 11, 2019
ece530f
fix plp_test
Nov 11, 2019
ddb51df
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 11, 2019
ecc8aba
Merge branch 'master' into ci
zh794390558 Nov 11, 2019
c3a9b92
fix kaldi install
zh794390558 Nov 11, 2019
d33ab52
fix
zh794390558 Nov 11, 2019
60434fd
fix makefile
Nov 11, 2019
0e8eb0f
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 11, 2019
b8ed921
fix apt tools
zh794390558 Nov 11, 2019
07ae1dd
fix kaldi install
zh794390558 Nov 11, 2019
296139d
fix
zh794390558 Nov 11, 2019
e4290bf
fix test
Nov 11, 2019
9eac2ce
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 11, 2019
b3c7457
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 11, 2019
53d93bb
fix format && params
Nov 11, 2019
be206de
fix old params
Nov 12, 2019
1a6af63
fix path error
Nov 15, 2019
62c0db6
delete old FE test files
Nov 15, 2019
49a65a2
fix delete test files
Nov 15, 2019
b69685e
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 15, 2019
ac239e0
fix test files
Nov 15, 2019
d19bdf0
fix dpl spk examples
zh794390558 Nov 15, 2019
a040ef4
fix
Nov 15, 2019
4f5c362
fix file mode
zh794390558 Nov 15, 2019
8d98607
fix get_session import
Nov 15, 2019
0a0d826
Merge branch 'ci' of https://github.com/didi/delta into ci
Nov 15, 2019
eaecf1e
fix test
zh794390558 Nov 15, 2019
749ee7d
format
zh794390558 Nov 15, 2019
3 changes: 1 addition & 2 deletions .travis.yml
@@ -11,10 +11,9 @@ before_install:
- docker run -it -d --name travis_con --user root -v ${DELTA_PATH}:${DOCKER_DELTA} ${CI_IMAGE} bash
- docker exec travis_con bash -c "gcc -v && g++ -v"
- docker exec travis_con bash -c "cd ${DOCKER_DELTA}; source env.sh"
- docker exec travis_con bash -c "cd ${DOCKER_DELTA}/tools; touch test.done"
#- docker exec travis_con bash -c "cd ${DOCKER_DELTA}/tools; make basic check_install test"
- docker exec travis_con bash -c "cd ${DOCKER_DELTA}/tools; make basic check_install"
- docker exec travis_con bash -c "cd ${DOCKER_DELTA}/tools; git clone --depth=1 https://github.com/kaldi-asr/kaldi.git"
- docker exec travis_con bash -c "cd ${DOCKER_DELTA}/tools/install; bash prepare_kaldi.sh"

jobs:
include:
2 changes: 2 additions & 0 deletions MAINTAINERS
@@ -0,0 +1,2 @@
Hui Zhang <zhtclz@foxmail.com>
Chengyun Deng <deng_chengyun@126.com>
2 changes: 0 additions & 2 deletions delta/__init__.py
@@ -13,8 +13,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import os


PACKAGE_ROOT_DIR = os.path.dirname(os.path.abspath(__file__))
12 changes: 9 additions & 3 deletions delta/data/feat/speech_feature.py
@@ -15,11 +15,9 @@
# ==============================================================================
''' speech feat entrypoint unittest'''
import os

import numpy as np
import delta.compat as tf
from absl import logging

from delta.data.feat import speech_ops
from delta.layers.ops import py_x_ops
from delta.data.feat import python_speech_features as psf
@@ -86,7 +84,15 @@ def _freq_feat_graph(feat_name, **kwargs):
spec = py_x_ops.spectrum(
waveforms[:, 0],
tf.cast(sample_rate, tf.dtypes.float32),
output_type=1) #output_type: 1, power spec; 2 log power spec
window_length=0.025,
frame_length=0.010,
output_type=1,
snip_edges=1,
raw_energy=1,
preEph_coeff=0.97,
window_type='povey',
remove_dc_offset=True,
is_fbank=False) #output_type: 1, power spec; 2 log power spec
spec = tf.sqrt(spec)
# shape must be [T, D, C]
spec = tf.expand_dims(spec, -1)
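Review note: the new `py_x_ops.spectrum` arguments pair a 25 ms window with a 10 ms value passed as `frame_length` (which appears to act as the frame shift) and `snip_edges=1`. The commit messages mention matching Kaldi, but the diff does not show the op's framing formula, so the sketch below only assumes Kaldi-style framing. The function `num_frames` and its parameter names are hypothetical and not part of DELTA.

```python
def num_frames(num_samples, sample_rate, window_length=0.025,
               frame_shift=0.010, snip_edges=True):
  """Kaldi-style frame count (an assumption about the op's behaviour)."""
  # Convert durations in seconds to whole samples.
  window = int(round(window_length * sample_rate))
  shift = int(round(frame_shift * sample_rate))
  if snip_edges:
    # Only frames that fit entirely inside the signal are kept.
    if num_samples < window:
      return 0
    return 1 + (num_samples - window) // shift
  # Without snipping, the count depends only on the shift (rounded).
  return (num_samples + shift // 2) // shift


if __name__ == '__main__':
  # One second of 16 kHz audio.
  print(num_frames(16000, 16000))
  print(num_frames(16000, 16000, snip_edges=False))
```

Under this assumption, turning `snip_edges` on yields slightly fewer frames than turning it off, which is one reason expected values in feature tests can change when the flag changes.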
6 changes: 2 additions & 4 deletions delta/data/feat/speech_feature_test.py
@@ -16,12 +16,10 @@
''' speech feature entrypoint unittest'''
import os
from pathlib import Path

import librosa
import numpy as np
import delta.compat as tf
from absl import logging

from delta.data.feat import speech_ops
from delta.data.feat import speech_feature
from delta import PACKAGE_ROOT_DIR
@@ -42,9 +40,9 @@ def setUp(self):

package_root = Path(PACKAGE_ROOT_DIR)
self.wavfile = str(
package_root.joinpath('data/feat/python_speech_features/english.wav'))
package_root.joinpath('data/feat/python_speech_features/english.wav'))
self.featfile = str(
package_root.joinpath('data/feat/python_speech_features/english.npy'))
package_root.joinpath('data/feat/python_speech_features/english.npy'))

def tearDown(self):
''' tear down '''
3 changes: 1 addition & 2 deletions delta/data/feat/tf_speech_feature_test.py
@@ -33,8 +33,7 @@ def setUp(self):
package_root = Path(PACKAGE_ROOT_DIR)
self.params = tffeat.speech_params(sr=8000, bins=40, cmvn=False)
self.wavpath = str(
package_root.joinpath(
'data/feat/python_speech_features/english.wav'))
package_root.joinpath('data/feat/python_speech_features/english.wav'))
self.sr_true, self.audio_true = load_wav(str(self.wavpath), sr=8000)

def test_extract_feature(self):
91 changes: 91 additions & 0 deletions delta/data/frontend/add_noise_end_to_end.py
@@ -0,0 +1,91 @@
# Copyright (C) 2017 Beijing Didi Infinity Technology and Development Co.,Ltd.
# All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import delta.compat as tf
from delta.utils.hparam import HParams
from delta.data.frontend.read_wav import ReadWav
from delta.data.frontend.add_rir_noise_aecres import Add_rir_noise_aecres
from delta.data.frontend.write_wav import WriteWav
from delta.data.frontend.base_frontend import BaseFrontend


class AddNoiseEndToEnd(BaseFrontend):

  def __init__(self, config: dict):
    super().__init__(config)
    self.add_noise = Add_rir_noise_aecres(config)
    self.read_wav = ReadWav(config)
    self.write_wav = WriteWav(config)

  @classmethod
  def params(cls, config=None):
    """
    Set params.
    :param config: contains ten optional parameters:
        --sample_rate     : Sample frequency of waveform data. (int, default = 16000)
        --if_add_rir      : If true, add rir to audio data. (bool, default = False)
        --rir_filelist    : File list path of rir. (string, default = 'rirlist.scp')
        --if_add_noise    : If true, add random noise to audio data. (bool, default = False)
        --snr_min         : Minimum SNR at which noise is added to the signal. (float, default = 0)
        --snr_max         : Maximum SNR at which noise is added to the signal. (float, default = 30)
        --noise_filelist  : File list path of noise. (string, default = 'noiselist.scp')
        --if_add_aecres   : If true, add aecres to audio data. (bool, default = False)
        --aecres_filelist : File list path of aecres. (string, default = 'aecreslist.scp')
        --audio_channels  : Number of audio channels. (int, default = 1)
    :return: An object of class HParams, which is a set of hyperparameters as name-value pairs.
    """

    sample_rate = 16000
    if_add_rir = False
    rir_filelist = 'rirlist.scp'
    if_add_noise = False
    noise_filelist = 'noiselist.scp'
    snr_min = 0
    snr_max = 30
    if_add_aecres = False
    aecres_filelist = 'aecreslist.scp'
    audio_channels = 1

    hparams = HParams(cls=cls)
    hparams.add_hparam('sample_rate', sample_rate)
    hparams.add_hparam('if_add_rir', if_add_rir)
    hparams.add_hparam('if_add_noise', if_add_noise)
    hparams.add_hparam('rir_filelist', rir_filelist)
    hparams.add_hparam('noise_filelist', noise_filelist)
    hparams.add_hparam('snr_min', snr_min)
    hparams.add_hparam('snr_max', snr_max)
    hparams.add_hparam('if_add_aecres', if_add_aecres)
    hparams.add_hparam('aecres_filelist', aecres_filelist)
    hparams.add_hparam('audio_channels', audio_channels)

    if config is not None:
      hparams.override_from_dict(config)

    return hparams

  def call(self, in_wavfile, out_wavfile):
    """
    Read a clean wav and write a noisy wav.
    :param in_wavfile: path of the clean wav file.
    :param out_wavfile: path of the noisy wav file.
    :return: the write-wav operation.
    """

    with tf.name_scope('add_noise_end_to_end'):
      input_data, sample_rate = self.read_wav(in_wavfile)
      # Scale the noisy samples by 1 / 32768 before writing them out.
      noisy_data = self.add_noise(input_data, sample_rate) / 32768
      write_op = self.write_wav(out_wavfile, noisy_data, sample_rate)

    return write_op
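Review note: the division by 32768 in `call` suggests that the add-noise op returns samples in the 16-bit PCM range while the wav writer expects floats in [-1, 1). That is an inference from the code, not something the PR states. A minimal pure-Python sketch of that scaling follows; the helper names `pcm16_to_float` and `float_to_pcm16` are hypothetical and not part of DELTA.

```python
def pcm16_to_float(samples):
  """Map 16-bit PCM sample values to floats in [-1.0, 1.0)."""
  return [s / 32768.0 for s in samples]


def float_to_pcm16(samples):
  """Map floats back to 16-bit PCM values, clipping to the int16 range."""
  return [
      max(-32768, min(32767, int(round(s * 32768.0)))) for s in samples
  ]


if __name__ == '__main__':
  print(pcm16_to_float([-32768, 0, 16384]))
  print(float_to_pcm16([1.0, -1.0, 0.5]))
```

Note that a float value of exactly 1.0 cannot be represented in int16, so the reverse mapping clips it to 32767.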
64 changes: 64 additions & 0 deletions delta/data/frontend/add_noise_end_to_end_test.py
@@ -0,0 +1,64 @@
# Copyright (C) 2017 Beijing Didi Infinity Technology and Development Co.,Ltd.
# All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import os
from pathlib import Path
import delta.compat as tf
from delta.data.frontend.add_noise_end_to_end import AddNoiseEndToEnd
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
from delta import PACKAGE_ROOT_DIR


def change_file_path(scp_path, filetype, newfilePath):
  # Prefix every entry of the list file with scp_path and save it as a new list.
  with open(scp_path + filetype, 'r') as f:
    s = f.readlines()
  with open(scp_path + newfilePath, 'w') as f:
    for line in s:
      f.write(scp_path + line)


class AddNoiseEndToEndTest(tf.test.TestCase):

  def test_add_noise_end_to_end(self):

    wav_path = str(
        Path(PACKAGE_ROOT_DIR).joinpath('layers/ops/data/sm1_cln.wav'))

    # reset path of noise && rir
    data_path = str(Path(PACKAGE_ROOT_DIR).joinpath('layers/ops/data')) + '/'
    noise_file = data_path + 'noiselist_new.scp'
    change_file_path(data_path, 'noiselist.scp', 'noiselist_new.scp')
    rir_file = data_path + 'rirlist_new.scp'
    change_file_path(data_path, 'rirlist.scp', 'rirlist_new.scp')

    with self.cached_session(use_gpu=False, force_gpu=False) as sess:
      config = {
          'if_add_noise': True,
          'noise_filelist': noise_file,
          'if_add_rir': True,
          'rir_filelist': rir_file
      }
      noisy_path = wav_path[:-4] + '_noisy.wav'
      add_noise_end_to_end = AddNoiseEndToEnd.params(config).instantiate()
      writewav_op = add_noise_end_to_end(wav_path, noisy_path)
      sess.run(writewav_op)


if __name__ == '__main__':

  tf.test.main()
100 changes: 100 additions & 0 deletions delta/data/frontend/add_rir_noise_aecres.py
@@ -0,0 +1,100 @@
# Copyright (C) 2017 Beijing Didi Infinity Technology and Development Co.,Ltd.
# All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import delta.compat as tf
from delta.utils.hparam import HParams
from delta.layers.ops import py_x_ops
from delta.data.frontend.base_frontend import BaseFrontend


class Add_rir_noise_aecres(BaseFrontend):

  def __init__(self, config: dict):
    super().__init__(config)

  @classmethod
  def params(cls, config=None):
    """
    Set params.
    :param config: contains nine optional parameters:
        --sample_rate     : Sample frequency of waveform data. (int, default = 16000)
        --if_add_rir      : If true, add rir to audio data. (bool, default = False)
        --rir_filelist    : File list path of rir. (string, default = 'rirlist.scp')
        --if_add_noise    : If true, add random noise to audio data. (bool, default = False)
        --snr_min         : Minimum SNR at which noise is added to the signal. (float, default = 0)
        --snr_max         : Maximum SNR at which noise is added to the signal. (float, default = 30)
        --noise_filelist  : File list path of noise. (string, default = 'noiselist.scp')
        --if_add_aecres   : If true, add aecres to audio data. (bool, default = False)
        --aecres_filelist : File list path of aecres. (string, default = 'aecreslist.scp')
    :return: An object of class HParams, which is a set of hyperparameters as name-value pairs.
    """

    sample_rate = 16000
    if_add_rir = False
    rir_filelist = 'rirlist.scp'
    if_add_noise = False
    noise_filelist = 'noiselist.scp'
    snr_min = 0
    snr_max = 30
    if_add_aecres = False
    aecres_filelist = 'aecreslist.scp'

    hparams = HParams(cls=cls)
    hparams.add_hparam('sample_rate', sample_rate)
    hparams.add_hparam('if_add_rir', if_add_rir)
    hparams.add_hparam('if_add_noise', if_add_noise)
    hparams.add_hparam('rir_filelist', rir_filelist)
    hparams.add_hparam('noise_filelist', noise_filelist)
    hparams.add_hparam('snr_min', snr_min)
    hparams.add_hparam('snr_max', snr_max)
    hparams.add_hparam('if_add_aecres', if_add_aecres)
    hparams.add_hparam('aecres_filelist', aecres_filelist)

    if config is not None:
      hparams.override_from_dict(config)

    return hparams

  def call(self, audio_data, sample_rate=None):
    """
    Add rir, noise or aecres to audio data.
    :param audio_data: the audio signal to which rir, noise or aecres is added. Should be a (1, N) tensor.
    :param sample_rate: [optional] the sample rate of the signal we are working with, default is 16kHz.
    :return: A float tensor of size N containing the audio with noise added.
    """

    p = self.config
    with tf.name_scope('add_rir_noise_aecres'):
      # Fall back to the configured sample rate when none is passed in.
      if sample_rate is None:
        sample_rate = tf.constant(p.sample_rate, dtype=tf.int32)

      # The runtime sample rate must match the configured one.
      assert_op = tf.assert_equal(
          tf.constant(p.sample_rate), tf.cast(sample_rate, dtype=tf.int32))
      with tf.control_dependencies([assert_op]):
        sample_rate = tf.cast(sample_rate, dtype=float)
        add_rir_noise_aecres_out = py_x_ops.add_rir_noise_aecres(
            audio_data,
            sample_rate,
            if_add_rir=p.if_add_rir,
            rir_filelist=p.rir_filelist,
            if_add_noise=p.if_add_noise,
            snr_min=p.snr_min,
            snr_max=p.snr_max,
            noise_filelist=p.noise_filelist,
            if_add_aecres=p.if_add_aecres,
            aecres_filelist=p.aecres_filelist)

        return tf.squeeze(add_rir_noise_aecres_out)
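Review note: the mixing itself happens inside the compiled op `py_x_ops.add_rir_noise_aecres`, whose source is not part of this diff. As a rough illustration of what adding noise at an SNR drawn between `snr_min` and `snr_max` usually means, here is a pure-Python sketch. The functions `mean_power` and `mix_at_snr` are hypothetical, the SNR is assumed to be in dB, and nothing here claims to mirror the op's actual implementation.

```python
import math
import random


def mean_power(samples):
  """Average squared amplitude of a non-empty sample list."""
  return sum(s * s for s in samples) / len(samples)


def mix_at_snr(clean, noise, snr_db):
  """Add noise to clean so the result has the requested SNR in dB."""
  if not clean or not noise:
    return list(clean)
  # Repeat the noise cyclically so it covers the whole clean signal.
  noise = [noise[i % len(noise)] for i in range(len(clean))]
  clean_power = mean_power(clean)
  noise_power = mean_power(noise)
  if clean_power == 0.0 or noise_power == 0.0:
    return list(clean)
  # Solve clean_power / (scale^2 * noise_power) = 10^(snr_db / 10) for scale.
  scale = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
  return [c + scale * n for c, n in zip(clean, noise)]


if __name__ == '__main__':
  rng = random.Random(0)
  clean = [math.sin(0.1 * i) for i in range(1000)]
  noise = [rng.uniform(-1.0, 1.0) for _ in range(300)]
  # Draw the SNR uniformly from [snr_min, snr_max], using the defaults above.
  snr_db = rng.uniform(0, 30)
  noisy = mix_at_snr(clean, noise, snr_db)
  print(len(noisy))
```

A lower SNR means a larger noise scale, so `snr_min = 0` corresponds to noise as strong as the signal under this definition.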