
Tedlium chain config question #914

Closed · vince62s opened this issue Jul 24, 2016 · 28 comments

@vince62s (Contributor)

When running the chain config script for TEDLIUM, I noticed that run_tdnn.sh and run_ivector_common.sh are out of sync: the former calls the latter with --speed-perturb true, an option that is not defined in the repo version of the latter script.
Also, Dan already fixed this one: ali_dir=tri3_ali_sp.
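(For reference, a minimal sketch of the kind of fix I'd expect; the variable and default below are my assumption, not the actual commit. Declaring the variable in run_ivector_common.sh before sourcing utils/parse_options.sh makes the flag valid.)

speed_perturb=true          # assumed default; declaring it makes --speed-perturb a recognized option
. utils/parse_options.sh    # parses --speed-perturb true/false from the command line into $speed_perturb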

Now my question is: is the repo version of the chain script the latest version corresponding to the paper, and consequently the best config according to the various testing done on SWBD?

thanks.
Vincent

@galv (Contributor) commented Jul 24, 2016

Sounds like the --speed-perturb thing would cause a crash. Let me check it out.

Regarding ali_dir=tri3_ali_sp, Vijay actually advised me to use non-speed-perturbed alignments for tree building. I can't remember if $ali_dir was used anywhere else.

The repo version was the last version as of the first submission. We got about a 1% absolute WER decrease for the camera-ready submission with some data cleanup scripts by Vimal Manohar, but that hasn't made it into master. I'd recommend waiting on Dan's changes to tedlium to explore the effect of data cleanup for TEDLIUM 2.

@vince62s (Contributor, Author)

Okay, thanks. Just out of curiosity, what was the SWBD baseline script you started from? Because now there are so many.

@galv (Contributor) commented Jul 25, 2016

Initially, it was based on swbd/s5c/local/chain/run_tdnn_2y.sh. However, this used the "jesus" nonlinearity, which we later abandoned. Since then, it has developed organically, partly with hints from Vijay and Dan. Is there anything in particular you're wondering about?

@vince62s (Contributor, Author)

Nothing in particular, just making sure these were the best known parameters. Actually it has 7 layers right now with the current splice indexes (run_tdnn_2y.sh has only 5).
Btw, with the same ReLU dim (500) and the same splice indexes ("-1,0,1 -1,0,1,2 -3,0,3 -3,0,3 -3,0,3 -6,-3,0 0"), the chain model ends up with more parameters than the nnet3 equivalent. Might be due to the output dim(?).
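(As a sanity check on the layer count: one space-separated entry in the splice string corresponds to one TDNN layer, so a trivial sketch is:)

splice_indexes="-1,0,1 -1,0,1,2 -3,0,3 -3,0,3 -3,0,3 -6,-3,0 0"
echo $splice_indexes | wc -w    # prints 7, i.e. seven layers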

Anyway, on TEDLIUM 2 it's giving me the best results right now (7.98M parameters vs 6.39M for nnet3):
%WER 12.4 | 507 17792 | 89.8 7.4 2.8 2.2 12.4 83.6 | -0.162 | exp/chain/tdnn/decode_dev/score_8_0.5/ctm.filt.filt.sys
%WER 11.6 | 507 17792 | 90.5 6.9 2.5 2.1 11.6 80.1 | -0.247 | exp/chain/tdnn/decode_dev_rescore/score_8_0.0/ctm.filt.filt.sys
%WER 11.4 | 1155 27512 | 90.2 6.8 3.0 1.6 11.4 78.5 | -0.042 | exp/chain/tdnn/decode_test/score_9_0.0/ctm.filt.filt.sys
%WER 10.4 | 1155 27512 | 91.1 6.2 2.7 1.5 10.4 74.6 | -0.167 | exp/chain/tdnn/decode_test_rescore/score_8_0.0/ctm.filt.filt.sys

numbers for nnet3 [13.2% 12.1% 12.2% 10.9%]

@danpovey (Contributor)

Cool. Hopefully the chain models will be even better once we get the transcript clean-up finished. TEDLIUM [at least release 1] has extremely noisy transcripts, which will tend to affect the chain model a lot.

Dan


@vince62s (Contributor, Author) commented Aug 5, 2016

I would like to try the left-biphone config with TEDLIUM. If my understanding is correct, it is just a matter of changing the following line:
utils/mkgraph.sh --left-biphone
or do I need to do more?
Thanks.

@danpovey (Contributor) commented Aug 5, 2016

You need to do more.
See: diff ../../swbd/s5c/local/chain/run_tdnn_7{b,d}.sh

The key changes are: firstly, if you use a shared tree-dir you should change it; but also add the option

--context-opts "--context-width=2 --central-position=1" \

to steps/nnet3/chain/build_tree.sh.
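For example, the tree-building call might end up looking roughly like this (a sketch only; the number of leaves, the alignment dir and the tree dir are illustrative placeholders, not values taken from the swbd script):

# hypothetical left-biphone tree build; 4000 leaves and the directories are illustrative
steps/nnet3/chain/build_tree.sh --frame-subsampling-factor 3 \
  --context-opts "--context-width=2 --central-position=1" \
  --cmd "$train_cmd" 4000 data/train_sp data/lang exp/tri3_ali exp/chain/tree_bi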


@vince62s (Contributor, Author) commented Aug 7, 2016

OK, the left-biphone results are very close to the "regular" chain:

%WER 12.5 | 507 17792 | 89.6 7.2 3.2 2.2 12.5 84.8 | -0.077 | exp/chain/tdnn/decode_dev/score_9_0.0/ctm.filt.filt.sys
%WER 11.6 | 507 17792 | 90.5 6.6 2.9 2.1 11.6 82.1 | -0.214 | exp/chain/tdnn/decode_dev_rescore/score_8_0.0/ctm.filt.filt.sys
%WER 11.4 | 1155 27512 | 90.3 7.0 2.7 1.7 11.4 77.9 | -0.067 | exp/chain/tdnn/decode_test/score_8_0.0/ctm.filt.filt.sys
%WER 10.3 | 1155 27512 | 91.2 6.0 2.9 1.4 10.3 73.9 | -0.153 | exp/chain/tdnn/decode_test_rescore/score_8_0.0/ctm.filt.filt.sys

@danpovey (Contributor) commented Aug 7, 2016

OK, it's hard to interpret these without seeing the baseline; but anyway, please commit the change.
BTW, when I tried the left-biphone models on the AMI s5b recipe, there was a clear improvement, something like 0.4% absolute on average across all the conditions.

Dan


@vince62s (Contributor, Author) commented Aug 7, 2016

The baseline was 5 comments above the last one, in this same thread.
Not committing, since there is no script for nnet3 or chain right now.
But just letting you know for when you commit your script: the baseline could become left-biphone then.

@danpovey (Contributor) commented Aug 7, 2016

OK thanks.
I'm starting from the AMI s5b scripts when creating the TEDLIUM nnet3 and chain scripts. The AMI s5b scripts currently use left-biphone, which was quite a bit better on that setup.
Of course if the results are worse than the existing scripts we'll have to do some more investigation.

@vince62s (Contributor, Author) commented Aug 9, 2016

@danpovey @vijayaditya, I noticed that the splice indexes copied over from the AMI s5b recipe are different from what was usually used: it has two final "0" layers. Are there any improvements using this splicing versus the previous ones?
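(Illustrative sketch only, not the actual AMI string: a splice string whose last two entries are plain "0" adds two final layers that splice no extra temporal context.)

splice_indexes="-1,0,1 -1,0,1,2 -3,0,3 -3,0,3 -3,0,3 -6,-3,0 0 0"   # hypothetical; the trailing "0 0" means two plain final layers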

@vijayaditya (Contributor)

IIRC the TDNN in the AMI recipe has more layers, as I wanted to create a comparison with a recipe involving a different neural network toolkit. I think it was also giving me better results at that time.

--Vijay


@danpovey (Contributor) commented Aug 9, 2016

I just copied it over. @vince62s, I haven't really done any tuning on this recipe at all; I've just guessed at all the parameter settings. If you have time to do any tuning, it would be great.
I'm about to check in some changes to the nnet3 and chain TDNN setups for tedlium s5_r2, which will include the results.

Dan


@vince62s (Contributor, Author)

I am running the new baseline now, to see if I match the results first.
BTW, I had to comment out a line in run_tdnn.sh:

touch $dir/egs/.nodelete # keep egs around when that run dies.

otherwise it hangs, because before the first run this directory does not exist.
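(An alternative to commenting the line out, just a sketch rather than the committed fix, would be to create the directory first so the touch cannot fail on a fresh run:)

mkdir -p $dir/egs             # make sure egs/ exists on the first run
touch $dir/egs/.nodelete      # keep egs around when that run dies.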

When done, I will try various setups.

@danpovey (Contributor)

thanks, fixing that bug.


@vince62s (Contributor, Author)

Hmm, weird: my chain results without cleanup are different from yours:
%WER 10.6 | 507 17783 | 90.8 6.5 2.7 1.5 10.6 80.1 | -0.007 | exp/chain/tdnn_sp_bi/decode_dev/score_8_0.0/ctm.filt.filt.sys
%WER 10.1 | 507 17783 | 91.0 5.5 3.5 1.1 10.1 80.3 | 0.096 | exp/chain/tdnn_sp_bi/decode_dev_rescore/score_10_0.0/ctm.filt.filt.sys
%WER 10.5 | 1155 27500 | 90.8 6.2 3.1 1.3 10.5 76.5 | 0.048 | exp/chain/tdnn_sp_bi/decode_test/score_9_0.0/ctm.filt.filt.sys
%WER 9.9 | 1155 27500 | 91.1 5.4 3.4 1.1 9.9 73.0 | 0.073 | exp/chain/tdnn_sp_bi/decode_test_rescore/score_10_0.0/ctm.filt.filt.sys

yours are:
%WER 11.0 | 507 17783 | 90.9 6.5 2.6 1.9 11.0 80.5 | 0.004 | exp/chain/tdnn_sp_bi/decode_dev/score_8_0.0/ctm.filt.filt.sys
%WER 10.6 | 507 17783 | 90.7 5.5 3.8 1.3 10.6 79.3 | 0.070 | exp/chain/tdnn_sp_bi/decode_dev_rescore/score_10_0.0/ctm.filt.filt.sys
%WER 10.1 | 1155 27500 | 91.2 6.0 2.8 1.3 10.1 75.5 | -0.004 | exp/chain/tdnn_sp_bi/decode_test/score_8_0.0/ctm.filt.filt.sys
%WER 9.8 | 1155 27500 | 91.2 5.2 3.7 1.0 9.8 73.2 | 0.055 | exp/chain/tdnn_sp_bi/decode_test_rescore/score_10_0.0/ctm.filt.filt.sys

@danpovey (Contributor)

It looks like, for you, the LM rescoring makes much less difference.
Did you regenerate your language models based on the latest scripts? Because I pruned the model smaller than yours, so the graphs wouldn't be as large.

Dan


@vince62s (Contributor, Author)

Yes, I rebuilt everything.

@vince62s (Contributor, Author)

Actually I have to check the lexicon I took...

@vince62s (Contributor, Author)

I think you misread: my rescoring gains are -0.5% and -0.6%, while yours are -0.4% and -0.3%.
I checked my vocab, which is different from yours.
You take the Cantab-tedlium.dct, which is 150,000 words (from the original recipe script), while I take the merger of that one and the tedlium-release-2 one, which leads to 191,010 words.
That does explain the small improvement on the dev set, but it does not explain the worse results on the test set; the difference of 10.5 versus 10.1 seems too big.
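(Roughly how such a merge can be done, as a sketch only; the file names and output path are assumptions about my local setup, and handling of alternative pronunciations is ignored:)

cat db/cantab-TEDLIUM/cantab-TEDLIUM.dct db/TEDLIUM_release2/TEDLIUM.152k.dic \
  | sort -u > data/local/dict/lexicon_merged.txt
wc -l < data/local/dict/lexicon_merged.txt    # roughly 191k entries after merging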

@vince62s (Contributor, Author)

I also think the method is different. I guess you trained on cleaned data, decoded the cleaned setup first, and then just decoded the non-cleaned setup with the model pre-trained on cleaned data, since I read this:
local/chain/run_tdnn.sh --train-set train --gmm tri3 --nnet3-affix "" --stage 20
Stage 20 is just decoding.

My run was on non-cleaned data all the way through, including training.

@danpovey (Contributor)

Firstly, ignore the --stage; that shouldn't have been there. I didn't do anything weird or complicated, and I would never do that without making it clear in the README. Anyway, the cleanup only happens on training data, not test, so what you're saying isn't feasible.

The runs aren't exactly replicable anyway, but if your LMs and lexicons are not the same, it's even less expected that the results would be exactly the same. If you can verify that some change in the lexicon-building procedure or LM-building procedure would be better, then we can talk about making that change.

Dan


@vince62s (Contributor, Author)

OK, but then I want to replicate your results first.
Can I just rebuild the LM + L + G and then make the graph, without retraining everything?

@danpovey (Contributor)

That should take you most of the way there; however, you should verify that the phones.txt does not change (it shouldn't, just check). But you decode with the lexicon that has pronprobs, so you'd need to rerun the things in stage 5 of the run.sh that get the pronprobs.

The training of the models wouldn't be exactly the same, but it should be about the same.

Dan
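A rough sketch of that rebuild-without-retraining flow (the directory names and OOV symbol are illustrative, and the dict/pronprob preparation is whatever stage 5 of run.sh does, not spelled out here):

# 1) rebuild the lang directory from the updated dict (stage 5 of run.sh adds the pronprobs)
utils/prepare_lang.sh data/local/dict "<unk>" data/local/lang data/lang
# 2) verify the phone set is unchanged before reusing the trained model
diff data/lang/phones.txt data/lang_old/phones.txt && echo "phones.txt unchanged"
# 3) regenerate the decoding graph against the existing chain model
#    (the lang dir passed here must be the one containing the decoding G.fst)
utils/mkgraph.sh --self-loop-scale 1.0 data/lang exp/chain/tdnn_sp_bi exp/chain/tdnn_sp_bi/graph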


@vince62s (Contributor, Author)

Very different results, so I'm re-running from scratch; I want a clean start.

@vince62s (Contributor, Author)

My results, from scratch (no cleanup):
For the dev set, -0.1 difference with yours before rescoring.
For the test set, +0.3 difference with yours before rescoring.
Does this make sense? I thought it would be the same.
The only difference is that I run on a single machine, with the initial and final nnet-training job numbers set to 2 (sketch after the results).
%WER 10.9 | 507 17783 | 90.4 6.0 3.5 1.3 10.9 81.3 | 0.048 | exp/chain/tdnn_sp_bi/decode_dev/score_9_0.0/ctm.filt.filt.sys
%WER 10.6 | 507 17783 | 90.5 5.4 4.1 1.1 10.6 80.1 | 0.060 | exp/chain/tdnn_sp_bi/decode_dev_rescore/score_10_0.0/ctm.filt.filt.sys
%WER 10.4 | 1155 27500 | 90.8 5.9 3.3 1.2 10.4 75.7 | 0.082 | exp/chain/tdnn_sp_bi/decode_test/score_9_0.0/ctm.filt.filt.sys
%WER 9.9 | 1155 27500 | 91.0 5.2 3.8 1.0 9.9 73.9 | 0.090 | exp/chain/tdnn_sp_bi/decode_test_rescore/score_10_0.0/ctm.filt.filt.sys
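(For context, the job counts I changed are the ones passed to the chain training script; a sketch with option names from steps/nnet3/chain/train.py, whose exact placement inside local/chain/run_tdnn.sh is an assumption on my part:)

steps/nnet3/chain/train.py \
  --trainer.optimization.num-jobs-initial 2 \
  --trainer.optimization.num-jobs-final 2 \
  ${other_train_opts}    # placeholder for the rest of the recipe's unchanged options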

@danpovey (Contributor)

If the num-jobs-nnet values are different, we don't expect the WERs to be the same. [Even if they are the same, things aren't 100% reproducible anyway; this is unavoidable due to the way GPUs work.] So it's OK.

Dan

