Update x_ops.cc #51

Merged: 3 commits merged into tensorflow:master on Mar 28, 2019
Conversation

zh794390558 (Contributor)

No description provided.

@googlebot added the cla: yes (For the CLA bot) label on Mar 26, 2019
drpngx previously approved these changes Mar 28, 2019

@drpngx (Contributor) left a comment:

Thanks!

@lingvo-bot lingvo-bot merged commit d91f97e into tensorflow:master Mar 28, 2019
lingvo-bot pushed a commit that referenced this pull request Mar 28, 2019
PiperOrigin-RevId: 240686612
copybara-service bot pushed a commit that referenced this pull request Oct 24, 2021
…els.

Highlights:

1. Introduced an additional terminating condition, "force_last_chunk_eoc_in_topk". This option is similar in spirit to the existing "force_eos_in_topk". If force_last_chunk_eoc_in_topk is set to True, then whenever the local score of the last chunk eoc is not high enough to enter the top-(k+2) extensions of the input hyp, we force it to be included (see the first sketch after this list). This makes it easier to terminate by last chunk eoc. Even with the rest of the changes in place, this modification was crucial to bringing the TPU WER for voice search down from around 17.0% to 6.4% (see b/199517196, #70). In the unit test beam_search_equivalence_test.py, this option helps the voice search model terminate on the "triumphal ornaments" utterance even for the CPU beam search helper. The implementation of this logic was recently checked in for the CPU beam search helper in cl/402971620, and here we test the equivalence with a real model and data.

2. Wrote a new merge_hyps operation to perform exact path merging (see the second sketch after this list). This is achieved by maintaining the token history for active hyps, similar to the "prev_labels" field of the Hyp structure in CPU beam search. The proposed implementation does not use any loops and should be time-efficient. The history is overwritten at each step; we do not store it across all decoding steps. The time and memory complexity of merge_hyps is on the order of O(num_beams * num_hyps_per_beam^2 * target_seq_len); in our typical RNN-T use case (num_beams=32, num_hyps_per_beam=4 or 8, target_seq_len=256 for EMBR training), this should be affordable. Note that the previous merge_hyps implementation used hash maps to approximately maintain hyp history, and hash map collisions were observed to occur frequently (see b/199517196, #51), causing merged scores to be wrong.
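A minimal sketch of the force_last_chunk_eoc_in_topk logic from change 1 (NumPy; the function, names, and shapes are hypothetical illustrations, not the actual x_ops.cc code): if a hyp's last-chunk eoc extension does not enter the top-(k+2) on its own score, the lowest-scoring slot is overwritten with it so termination stays reachable.

```python
import numpy as np

def force_eoc_into_topk(ext_scores, topk_ids, topk_scores, eoc_id, is_last_chunk):
  """ext_scores: [vocab] local extension scores for one hyp.

  topk_ids, topk_scores: the hyp's pre-selected top-(k+2) extension ids/scores.
  """
  if is_last_chunk and eoc_id not in topk_ids:
    worst = np.argmin(topk_scores)           # lowest-scoring slot to sacrifice
    topk_ids[worst] = eoc_id                 # force the last-chunk eoc in
    topk_scores[worst] = ext_scores[eoc_id]  # keep its true (low) local score
  return topk_ids, topk_scores
```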
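A minimal NumPy sketch of the loop-free exact path merging from change 2 (hypothetical shapes and names, shown for a single beam): hyps with identical token histories represent the same path, so their probabilities are summed into the first occurrence and the duplicates are invalidated.

```python
import numpy as np

def merge_hyps(histories, scores, invalid_score=-1e30):
  """histories: [num_hyps, target_seq_len] token ids, padded with 0.

  scores: [num_hyps] log-probabilities of the active hyps.
  """
  num_hyps = histories.shape[0]
  # same[i, j] is True iff hyps i and j have the exact same token history.
  same = np.all(histories[:, None, :] == histories[None, :, :], axis=-1)
  # Sum the probabilities of identical paths: log-sum-exp over each group.
  merged = np.logaddexp.reduce(
      np.where(same, scores[None, :], -np.inf), axis=-1)
  # Keep the first hyp of each duplicate group; invalidate the rest.
  keep = np.argmax(same, axis=-1) == np.arange(num_hyps)
  return np.where(keep, merged, invalid_score)
```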

Other changes:

3. Provided an option to disable the fast gather of float32 and bool matrices via matmul throughout the TPU beam search helper. It was observed that matmul may cause severe loss of precision; see cl/402747554 and b/199517196 (#52). A sketch of the two gather variants follows this list.

4. Switched the order of merge_hyps and the determination of terminating hyps. Previously we merged hyps before terminating, so it was possible for a last chunk eoc (last frame blank) to be merged and its extension killed, making it harder for the hyp to terminate.

5. Moved the last chunk eoc off the beam regardless of whether it can terminate. The reason is that if an eoc extension survives pruning, it advances the frame index for RNN-T models, and if a last chunk eoc survives pruning, it will try to move the frame index out of bounds (though we do have a safeguard for this in steps/rnnt_features.py).

6. Due to changes 4 and 5, we now handle terminations before merging hyps, so terminating scores are not merged into active hyps. I think this is the correct behavior, and it is consistent with CPU beam search. Another advantage of this change is that we can now perform top-k selection directly on the remaining extensions (there are k * (k+2) of them), which no longer include any eos or eoc tokens (see the selection sketch after this list). This allows us to completely remove the confusing 2*k pre-selection logic in the previous implementation.

7. Introduced a new option, "combine_eos_and_eoc_strategy". When both eoc and eos can terminate the hypothesis, this option specifies whether to use the higher score (what TPU beam search previously used) or the lower score (what CPU beam search has been using); see the last sketch after this list.
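A minimal TensorFlow sketch of the two gather variants behind change 3 (the helper and flag names are hypothetical, not the actual implementation): on TPU, gathering rows by multiplying with a one-hot matrix can be faster than tf.gather, but the float32 matmul accumulation can lose precision.

```python
import tensorflow as tf

def gather_rows(values, ids, use_matmul_gather=True):
  """values: [num_rows, dims] float32; ids: [num_ids] row indices to gather."""
  if use_matmul_gather:
    # Fast on TPU, but the matmul accumulation may lose precision.
    one_hot = tf.one_hot(ids, tf.shape(values)[0], dtype=values.dtype)
    return tf.matmul(one_hot, values)
  return tf.gather(values, ids)  # exact, possibly slower on TPU
```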
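A sketch of the simplified selection from change 6 (NumPy, hypothetical shapes): with terminations already handled and eos/eoc removed, top-k runs directly over the k * (k+2) surviving extensions per beam, with no 2*k pre-selection pass.

```python
import numpy as np

def select_topk(ext_scores, k):
  """ext_scores: [num_beams, k * (k + 2)] scores of the remaining extensions."""
  top_ids = np.argsort(-ext_scores, axis=-1)[:, :k]  # best k per beam
  return top_ids, np.take_along_axis(ext_scores, top_ids, axis=-1)
```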
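A sketch of combine_eos_and_eoc_strategy from change 7 (hypothetical helper and strategy names): when both eos and eoc can terminate a hyp, the option picks which of the two scores the terminated hyp keeps.

```python
def combine_eos_and_eoc(eos_score, eoc_score, strategy):
  if strategy == 'max':  # previous TPU beam search behavior
    return max(eos_score, eoc_score)
  if strategy == 'min':  # CPU beam search behavior
    return min(eos_score, eoc_score)
  raise ValueError('Unknown combine_eos_and_eoc_strategy: %s' % strategy)
```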

PiperOrigin-RevId: 405278044