Support Whisper-style training as a new task S2T #5120

Merged Sep 23, 2023
Changes from 16 commits
Commits (72)
6273859
create task template for s2t1
pyf98 Apr 11, 2023
25cc776
create recipe for s2t1
pyf98 Apr 11, 2023
3ee7223
set default cmd
pyf98 Apr 11, 2023
be97590
set default slurm
pyf98 Apr 11, 2023
e1ace5e
add dependency check in local/path.sh
pyf98 Apr 12, 2023
00f0b19
add data prep
pyf98 Apr 12, 2023
767c850
update data prep stages
pyf98 Apr 13, 2023
fe2530c
add py files for s2t
pyf98 Apr 15, 2023
ac507ee
remove failed config
pyf98 Apr 16, 2023
0cd83cb
add train config
pyf98 Apr 17, 2023
14124c1
Merge branch 'master' into whisper-public
pyf98 Apr 17, 2023
9554286
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 17, 2023
b2d9726
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 Apr 19, 2023
5a0c46e
add inference code
pyf98 Apr 23, 2023
e1f06f6
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 Apr 23, 2023
46234a4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2023
ff4e68b
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 May 1, 2023
9ed7322
add mixed_v1
pyf98 May 3, 2023
91043d9
update gigaspeech script
pyf98 May 3, 2023
ef82b27
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2023
f585752
add aishell prep
pyf98 May 3, 2023
3152d71
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2023
08b7e9c
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 May 7, 2023
08bbabe
add data prep scripts; simplify s2t
pyf98 May 7, 2023
2f6a330
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 7, 2023
d228424
fix issue for passing symbols in cmd
pyf98 May 7, 2023
b6db03c
Update egs2/must_c_v2/s2t1/conf/tuning/train_s2t_ebf_lr1e-3_warmup5k.…
pyf98 May 11, 2023
24245cb
Merge branch 'master' into whisper-public
pyf98 May 11, 2023
7693de5
simplify s2t
pyf98 May 11, 2023
37f7122
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 11, 2023
1ed5892
update scoring
pyf98 May 13, 2023
f1d1c88
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 May 13, 2023
79ffebc
add prep scripts
pyf98 May 13, 2023
ce4516e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2023
3e4b797
add data prep and config
pyf98 May 13, 2023
17ec129
Merge branch 'whisper-public' of github.com:pyf98/espnet into whisper…
pyf98 May 13, 2023
b21826d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2023
19965b8
add second version
pyf98 May 16, 2023
9bcdd73
add mixed_v2 scripts
pyf98 May 30, 2023
94d3ca5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 30, 2023
3c4000e
Merge branch 'master' into whisper-public
pyf98 Jun 5, 2023
fa097d7
add configs in v2
pyf98 Jun 5, 2023
79408d7
Merge branch 'whisper-public' of github.com:pyf98/espnet into whisper…
pyf98 Jun 5, 2023
63ec788
add layer drop in transformer encoder
pyf98 Jul 5, 2023
778fbd3
add v3
pyf98 Jul 19, 2023
a032c3b
fix conflicts
pyf98 Jul 19, 2023
4e029d6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 19, 2023
d6d064c
remove version check
pyf98 Jul 19, 2023
515f239
merge master
pyf98 Jul 27, 2023
6be0143
update run.sh
pyf98 Jul 27, 2023
b3ed20c
add LID inference code
pyf98 Jul 27, 2023
9db3528
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 27, 2023
3c70493
add s2t to pack
pyf98 Jul 29, 2023
ad7aa6c
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 Jul 29, 2023
dc61487
Merge branch 'master' of github.com:espnet/espnet into whisper-public
pyf98 Jul 29, 2023
2ec7cc4
update v2 and v3
pyf98 Jul 29, 2023
5b394c1
add readme in v2 and v3
pyf98 Jul 29, 2023
d564648
recover preprocessor
Jul 29, 2023
43f20e0
fix too long line
pyf98 Jul 29, 2023
c4cbeec
fix format
pyf98 Jul 29, 2023
44f62ba
fix python format
pyf98 Jul 30, 2023
eb31120
add tests
pyf98 Jul 30, 2023
da20167
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2023
af0d786
init s2t1 in mini_an4
pyf98 Jul 30, 2023
59276cc
Merge branch 'whisper-public' of github.com:pyf98/espnet into whisper…
pyf98 Jul 30, 2023
d3100b2
add integration tests
pyf98 Jul 30, 2023
77f4cfa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2023
2750f35
fix python format
pyf98 Jul 30, 2023
496f3f3
fix shell
pyf98 Jul 30, 2023
dba0d76
fix shell error
pyf98 Jul 30, 2023
d42e241
remove transducer from s2t model
pyf98 Jul 30, 2023
8f70993
fix error in test
pyf98 Jul 30, 2023
110 changes: 110 additions & 0 deletions egs2/TEMPLATE/s2t1/cmd.sh
@@ -0,0 +1,110 @@
# ====== About run.pl, queue.pl, slurm.pl, and ssh.pl ======
# Usage: <cmd>.pl [options] JOB=1:<nj> <log> <command...>
# e.g.
# run.pl --mem 4G JOB=1:10 echo.JOB.log echo JOB
#
# Options:
# --time <time>: Limit the maximum time to execute.
# --mem <mem>: Limit the maximum memory usage.
# --max-jobs-run <njob>: Limit the number of parallel jobs. This is ignored for non-array jobs.
# --num-threads <nthreads>: Specify the number of CPU cores.
# --gpu <ngpu>: Specify the number of GPU devices.
# --config: Change the configuration file from the default.
#
# "JOB=1:10" is used for "array jobs" and controls the number of parallel jobs.
# The string left of "=", i.e. "JOB", is replaced by <N> (the Nth job) in the command and the log file name,
# e.g. "echo JOB" becomes "echo 3" for the 3rd job and "echo 8" for the 8th job, respectively.
# Note that the range must start from a positive number, so you can't use "JOB=0:10", for example.
#
# run.pl, queue.pl, slurm.pl, and ssh.pl share a unified interface that does not depend on the backend.
# These options are mapped to backend-specific options,
# as configured in "conf/queue.conf" and "conf/slurm.conf" by default.
# If jobs fail, your configuration might not match your environment.
#
#
# The official documentation for run.pl, queue.pl, slurm.pl, and ssh.pl:
# "Parallelization in Kaldi": http://kaldi-asr.org/doc/queue.html
# =========================================================


# Select the backend used by run.sh from "local", "stdout", "sge", "pbs", "slurm", or "ssh"
cmd_backend='local'

# Local machine, without any Job scheduling system
if [ "${cmd_backend}" = local ]; then

# Used for the other jobs
export train_cmd="run.pl"
# Used for "*_train.py": "--gpu" is appended optionally by run.sh
export cuda_cmd="run.pl"
# Used for "*_recog.py"
export decode_cmd="run.pl"

# Local machine logging to stdout and log file, without any Job scheduling system
elif [ "${cmd_backend}" = stdout ]; then

# Used for the other jobs
export train_cmd="stdout.pl"
# Used for "*_train.py": "--gpu" is appended optionally by run.sh
export cuda_cmd="stdout.pl"
# Used for "*_recog.py"
export decode_cmd="stdout.pl"


# "qsub" (Sun Grid Engine, or derivation of it)
elif [ "${cmd_backend}" = sge ]; then
# The default setting is written in conf/queue.conf.
# You must change "-q g.q" to the "queue" of your environment.
# To know the "queue" names, type "qhost -q"
# Note that to use "--gpu *", you have to set up "complex_value" for the system scheduler.

export train_cmd="queue.pl"
export cuda_cmd="queue.pl"
export decode_cmd="queue.pl"


# "qsub" (Torque/PBS.)
elif [ "${cmd_backend}" = pbs ]; then
# The default setting is written in conf/pbs.conf.

export train_cmd="pbs.pl"
export cuda_cmd="pbs.pl"
export decode_cmd="pbs.pl"


# "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" to the "partitions" of your environment.
# To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

export train_cmd="slurm.pl"
export cuda_cmd="slurm.pl"
export decode_cmd="slurm.pl"

elif [ "${cmd_backend}" = ssh ]; then
# You have to create ".queue/machines" to specify the host to execute jobs.
# e.g. .queue/machines
# host1
# host2
# host3
# This assumes you can log in to them without a password, i.e., you have set up SSH keys.

export train_cmd="ssh.pl"
export cuda_cmd="ssh.pl"
export decode_cmd="ssh.pl"

# This is an example of specifying several unique options in the JHU CLSP cluster setup.
# Users can modify/add their own command options according to their cluster environments.
elif [ "${cmd_backend}" = jhu ]; then

export train_cmd="queue.pl --mem 2G"
export cuda_cmd="queue-freegpu.pl --mem 2G --gpu 1 --config conf/queue.conf"
export decode_cmd="queue.pl --mem 4G"

else
echo "$0: Error: Unknown cmd_backend=${cmd_backend}" 1>&2
return 1
fi
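
To make the unified interface above concrete, here is a minimal sketch of how a recipe launches an array job through the exported variable; the experiment directory, log path, and command are hypothetical.

#!/usr/bin/env bash
# Minimal sketch (hypothetical log path and command); with cmd_backend='local',
# ${decode_cmd} resolves to run.pl.
. ./cmd.sh

# JOB expands to 1..4 both in the log file name and inside the command,
# so this writes exp/demo/log/decode.1.log ... decode.4.log.
${decode_cmd} JOB=1:4 exp/demo/log/decode.JOB.log \
    echo "decoding split JOB"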
2 changes: 2 additions & 0 deletions egs2/TEMPLATE/s2t1/conf/fbank.conf
@@ -0,0 +1,2 @@
--sample-frequency=16000
--num-mel-bins=80
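
These two flags are standard Kaldi feature-extraction options; a recipe would typically hand this file to the extractor via --config (e.g., through steps/make_fbank.sh). A hedged sketch with hypothetical data paths:

# Hedged sketch (hypothetical scp/ark paths): fbank.conf configures 16 kHz input
# and 80 mel bins for Kaldi's filterbank extractor.
compute-fbank-feats --config=conf/fbank.conf \
    scp:data/train/wav.scp ark:data/train/feats.ark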
11 changes: 11 additions & 0 deletions egs2/TEMPLATE/s2t1/conf/pbs.conf
@@ -0,0 +1,11 @@
# Default configuration
command qsub -V -v PATH -S /bin/bash
option name=* -N $0
option mem=* -l mem=$0
option mem=0 # Do not add anything to qsub_opts
option num_threads=* -l ncpus=$0
option num_threads=1 # Do not add anything to qsub_opts
option num_nodes=* -l nodes=$0:ppn=1
default gpu=0
option gpu=0
option gpu=* -l ngpus=$0
1 change: 1 addition & 0 deletions egs2/TEMPLATE/s2t1/conf/pitch.conf
@@ -0,0 +1 @@
--sample-frequency=16000
12 changes: 12 additions & 0 deletions egs2/TEMPLATE/s2t1/conf/queue.conf
@@ -0,0 +1,12 @@
# Default configuration
command qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64*
option name=* -N $0
option mem=* -l mem_free=$0,ram_free=$0
option mem=0 # Do not add anything to qsub_opts
option num_threads=* -pe smp $0
option num_threads=1 # Do not add anything to qsub_opts
option max_jobs_run=* -tc $0
option num_nodes=* -pe mpi $0 # You must set this PE as allocation_rule=1
default gpu=0
option gpu=0
option gpu=* -l gpu=$0 -q g.q
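
Each "option <name>=<pattern> <flags>" line rewrites a generic <cmd>.pl option into qsub arguments, with $0 replaced by the matched value. A hedged illustration (the exact qsub line may differ by site):

# Submitting an 8-way array job with memory and thread limits:
queue.pl --mem 4G --num-threads 2 JOB=1:8 exp/log/train.JOB.log my_job.sh
# Per "mem=* -> -l mem_free=$0,ram_free=$0" and "num_threads=* -> -pe smp $0",
# this submits roughly:
#   qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* \
#        -l mem_free=4G,ram_free=4G -pe smp 2 ...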
14 changes: 14 additions & 0 deletions egs2/TEMPLATE/s2t1/conf/slurm.conf
@@ -0,0 +1,14 @@
# Default configuration
command sbatch --export=PATH
option name=* --job-name $0
option time=* --time $0
option mem=* --mem-per-cpu $0
option mem=0
option num_threads=* --cpus-per-task $0
option num_threads=1 --cpus-per-task 1
option num_nodes=* --nodes $0
default gpu=0
option gpu=0 -p cpu
option gpu=* -p gpu --gres=gpu:$0 -c $0  # Recommend allocating at least as many CPUs as GPUs
# note: the --max-jobs-run option is supported as a special case
# by slurm.pl and you don't have to handle it in the config file.
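
The same mapping format applies here, with GPU jobs routed to the "gpu" partition. A hedged illustration (exact sbatch flags may differ by cluster):

# Requesting 2 GPUs and 8G memory per CPU for each of 4 array tasks:
slurm.pl --gpu 2 --mem 8G JOB=1:4 exp/log/train.JOB.log train_script.sh
# "gpu=2" maps to "-p gpu --gres=gpu:2 -c 2" and "mem=8G" to "--mem-per-cpu 8G",
# i.e. roughly: sbatch --export=PATH -p gpu --gres=gpu:2 -c 2 --mem-per-cpu 8G ...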
1 change: 1 addition & 0 deletions egs2/TEMPLATE/s2t1/db.sh
Empty file.
23 changes: 23 additions & 0 deletions egs2/TEMPLATE/s2t1/path.sh
@@ -0,0 +1,23 @@
MAIN_ROOT=$PWD/../../..

export PATH=$PWD/utils/:$PATH
export LC_ALL=C

if [ -f "${MAIN_ROOT}"/tools/activate_python.sh ]; then
. "${MAIN_ROOT}"/tools/activate_python.sh
else
echo "[INFO] "${MAIN_ROOT}"/tools/activate_python.sh is not present"
fi
. "${MAIN_ROOT}"/tools/extra_path.sh

export OMP_NUM_THREADS=1

# NOTE(kan-bayashi): Use UTF-8 in Python to avoid UnicodeDecodeError when LC_ALL=C
export PYTHONIOENCODING=UTF-8

# You need to change or unset NCCL_SOCKET_IFNAME according to your network environment
# https://docs.nvidia.com/deeplearning/sdk/nccl-developer-guide/docs/env.html#nccl-socket-ifname
export NCCL_SOCKET_IFNAME="^lo,docker,virbr,vmnet,vboxnet"

# NOTE(kamo): Source at the last to overwrite the setting
. local/path.sh
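
Because local/path.sh is sourced last, each recipe can override the settings above. A hypothetical override sketch (the tool directory and values are illustrative only; the actual s2t1 local/path.sh adds a dependency check, per commit e1ace5e):

# local/path.sh -- hypothetical sketch; paths and values are illustrative.
# Prepend a recipe-specific tool to PATH and relax the thread limit.
export PATH="${MAIN_ROOT}/tools/some_recipe_tool/bin:${PATH}"
export OMP_NUM_THREADS=2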
1 change: 1 addition & 0 deletions egs2/TEMPLATE/s2t1/pyscripts