
New beam-search framework: ScorerInterface, CPU/GPU float16/32/64 decoding, and new language models (SeqRNNLM and TransformerLM) #1092

Merged: 52 commits, Aug 19, 2019

Conversation

@ShigekiKarita (Member) commented Aug 15, 2019

This PR refactors beam search as discussed in #489

I will implement these essential features so as not to break the current ones:

  • custom language model training with lm_train.py --model-module xxx
  • ASR decoding with a custom language model via asr_recog.py --lm xxx (the new option is --lm rather than --rnnlm; I will leave the existing --rnnlm as-is)

In this PR, I will not support the other existing features with custom LMs, such as ASR training with custom LMs, GPU batch decoding, streaming decoding, and word LMs.
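For context, every decoder and LM in the new framework implements a common scorer API that beam search queries step by step. A minimal sketch of the idea (illustrative, not the exact code; the actual definition lives in espnet/nets/scorer_interface.py):

import torch

class ScorerInterface:
    """What beam search expects from any scorer (decoder, LM, CTC, ...)."""

    def init_state(self, x: torch.Tensor):
        """Return the initial decoding state for encoder features x."""
        return None

    def score(self, y: torch.Tensor, state, x: torch.Tensor):
        """Score the hypothesis prefix y (token ids).

        Returns (log-probabilities over the vocabulary, updated state).
        """
        raise NotImplementedError

    def final_score(self, state) -> float:
        """Score added once when a hypothesis is finalized (default: 0)."""
        return 0.0

Beam search can then combine any set of such scorers with per-scorer weights, which is what makes custom LMs pluggable.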

@ShigekiKarita added the WIP (work in process) and New Features labels Aug 15, 2019
codecov bot commented Aug 15, 2019

Codecov Report

Merging #1092 into master will increase coverage by 2.29%.
The diff coverage is 87.11%.


@@            Coverage Diff             @@
##           master    #1092      +/-   ##
==========================================
+ Coverage   76.36%   78.65%   +2.29%     
==========================================
  Files          83       91       +8     
  Lines        7810     8045     +235     
==========================================
+ Hits         5964     6328     +364     
+ Misses       1846     1717     -129
Impacted Files | Coverage Δ
espnet/nets/scorers/ctc.py | 100% <100%> (ø)
espnet/nets/pytorch_backend/transformer/decoder.py | 95.74% <100%> (+0.62%) ⬆️
espnet/nets/pytorch_backend/ctc.py | 96.07% <100%> (ø) ⬆️
espnet/nets/pytorch_backend/e2e_asr_transformer.py | 82.52% <100%> (+12.68%) ⬆️
espnet/nets/pytorch_backend/transformer/mask.py | 100% <100%> (ø)
espnet/nets/scorers/length_bonus.py | 100% <100%> (ø)
espnet/nets/pytorch_backend/rnn/attentions.py | 98.15% <100%> (ø) ⬆️
espnet/nets/pytorch_backend/e2e_asr.py | 68.42% <100%> (+0.08%) ⬆️
...pnet/nets/pytorch_backend/transformer/attention.py | 100% <100%> (ø) ⬆️
espnet/lm/chainer_backend/lm.py | 34.58% <33.33%> (-0.05%) ⬇️
... and 20 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@sw005320 (Contributor) commented

> and I will not support the other existing features with custom LMs like ASR training with custom LMs, GPU batch ASR decoding, etc.

Actually, the current GPU batch ASR decoding does nothing special for rnnlm. See:

def buff_predict(self, state, x, n):
    if self.predictor.__class__.__name__ == 'RNNLM':
        return self.predict(state, x)
    new_state = []
    new_log_y = []
    for i in range(n):
        state_i = None if state is None else state[i]
        state_i, log_y = self.predict(state_i, x[i].unsqueeze(0))
        new_state.append(state_i)
        new_log_y.append(log_y)
    return new_state, torch.cat(new_log_y)

So, you could just do the same thing for now.
I want to revisit it after #980.

@sw005320 changed the base branch from master to v.0.6.0 on August 15, 2019 16:53
@sw005320 added this to the v.0.6.0 milestone Aug 15, 2019
@ShigekiKarita (Member, Author) commented

@sw005320 As there is no breaking change, I think master is better.

@sw005320 (Contributor) commented

> @sw005320 As there is no breaking change, I think master is better.

OK.

@sw005320 changed the base branch from v.0.6.0 to master on August 15, 2019 17:22
@sw005320 modified the milestones: v.0.6.0, v.0.5.2 Aug 15, 2019
@ShigekiKarita (Member, Author) commented

I have finished the easy ones, Transformer and length penalty, because they are stateless in the current implementation. Tomorrow I will implement the stateful ones (RNN ASR, RNN LM, CTC prefix score). Currently, the old and new implementations produce exactly the same scores. However, IMO, I do not want to reimplement this local pruning with ctc_beam (its value is hard-coded):

if lpz is not None:
    local_best_scores, local_best_ids = torch.topk(
        local_att_scores, ctc_beam, dim=1)
    ctc_scores, ctc_states = ctc_prefix_score(
        hyp['yseq'], local_best_ids[0], hyp['ctc_state_prev'])
    local_scores = \
        (1.0 - ctc_weight) * local_att_scores[:, local_best_ids[0]] \
        + ctc_weight * torch.from_numpy(ctc_scores - hyp['ctc_score_prev'])
    if rnnlm:
        local_scores += recog_args.lm_weight * local_lm_scores[:, local_best_ids[0]]
    local_best_scores, joint_best_ids = torch.topk(local_scores, beam, dim=1)
    local_best_ids = local_best_ids[:, joint_best_ids[0]]
else:
    local_best_scores, local_best_ids = torch.topk(local_scores, beam, dim=1)

I have no idea how to share scores between the decoder and CTC modules because every decoder runs independently in my implementation:
# scoring
for k, (d, w) in dec_weights.items():
    scores[k], states[k] = d.score(hyp.yseq, hyp.states[k], x)
    wscores += w * scores[k]

Does this ctc_beam really matter? If so, I will reconsider how to do that...

@sw005320 (Contributor) commented

> Does this ctc_beam really matter? If so, I will reconsider how to do that...

ctc_beam has a big impact when we have large numbers of output units (e.g., Japanese, Chinese, and BPE). Without it, the beam search becomes significantly slower. Actually, the current batch beam search does not have ctc_beam and suffers significant speed degradation for Japanese, Chinese, and BPE. One of the benefits of #980 is to fix this.

@ShigekiKarita (Member, Author) commented

OK, I see how to do it. For simplicity, I will add a new option like pre_beam_ratio that matches this:

ctc_beam = min(lpz.shape[-1], int(beam * CTC_SCORING_RATIO))

So beam_search first performs top-k with the pre-beam size on the pre-decoder scores (s2s, NN-LM), then adds the post-decoder scores (CTC, FST-LM?), and finally performs top-k with the full beam size.
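In other words (a self-contained toy sketch of the two-stage pruning; the function name, the 1.5 default, and the dummy CTC scorer are mine for illustration, not the PR's actual code):

import torch

def two_stage_topk(att_scores, ctc_score_fn, beam, pre_beam_ratio=1.5, ctc_weight=0.3):
    """Prune with cheap scores first, then rescore only the survivors."""
    # 1) pre-beam: top-k over the cheap scores (s2s decoder, NN-LM)
    pre_beam = min(att_scores.size(1), int(pre_beam_ratio * beam))
    _, part_ids = torch.topk(att_scores, pre_beam, dim=1)
    # 2) run the expensive scorer (e.g., CTC prefix score) only on the survivors
    joint = (1.0 - ctc_weight) * att_scores[:, part_ids[0]] \
        + ctc_weight * ctc_score_fn(part_ids[0])
    # 3) final top-k with the full beam size over the combined scores
    best_scores, joint_ids = torch.topk(joint, beam, dim=1)
    return best_scores, part_ids[:, joint_ids[0]]

# toy usage: random attention scores and a dummy CTC scorer
def dummy_ctc(ids):
    return torch.log_softmax(torch.randn(1, ids.numel()), dim=1)

att = torch.log_softmax(torch.randn(1, 5000), dim=1)
scores, ids = two_stage_topk(att, dummy_ctc, beam=10)

This keeps the expensive CTC prefix scoring at O(pre_beam) candidates per step instead of the full vocabulary, which is why it matters for large output units.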

@ShigekiKarita (Member, Author) commented

@kan-bayashi I have a suggestion about flake8-docstrings. I like it, but it is not checked by our CI, so I want CI to check it as follows:

  1. pip install flake8-docstrings in ci/install.sh
  2. change ci/test_python.sh like
# --extend-ignore for wip files for flake8-docstrings
flake8 --extend-ignore=D espnet test utils

# white list of files that should support flake8-docstrings
flake8 \
  espnet/nets/beam_search.py \
  espnet/nets/lm_interface.py \
  espnet/nets/scorer_interface.py \
  ...

See how to temporarily disable flake8-docstrings: https://stackoverflow.com/questions/41860123/how-to-ignore-installed-flake8-plugin-easily-for-one-time

@kan-bayashi (Member) commented

That's a nice idea.
Maybe a blacklist is better?
Anyway, let us introduce it.

@ShigekiKarita (Member, Author) commented

I agree. I made the blacklist. No more invalid docstrings 🍺

Now I need to fix the perplexity (ppl) things in lm.py. Just a moment.

@ShigekiKarita (Member, Author) commented

I believe it is done. Additionally, I split the flake8 part of ci/test_python.sh into ci/test_flake8.sh because it also helps rapid docstring annotation locally. I also updated doc/README.md accordingly.

@ShigekiKarita (Member, Author) commented

Fixed!

@kan-bayashi (Member) left a review:

LGTM

@sw005320 (Contributor) commented

Many thanks!

@sw005320 merged commit 4dbb467 into espnet:master Aug 19, 2019
@ShigekiKarita changed the title from "New beam-search framework: ScorerInterface" to "New beam-search framework: ScorerInterface and new language models (SeqRNNLM and TransformerLM)" Aug 20, 2019
@ShigekiKarita changed the title from "New beam-search framework: ScorerInterface and new language models (SeqRNNLM and TransformerLM)" to "New beam-search framework: ScorerInterface, CPU/GPU float16/32/64 decoding, and new language models (SeqRNNLM and TransformerLM)" Aug 21, 2019