
beam_width & maximum_iterations in BeamSearchDecoder.call instead of init? #2087

@StarWang

Description


Describe the feature and the current behavior/state.
Hi TFA team, the current seq2seq beam search decoder implementation takes in beam_width & maximum_iterations during decoder initialization init() instead of dynamic calls call. This makes it difficult for tuning beam_width when there's a trained model. However, it's a common request that in practice, when we have a trained model, we would like to tune the beam_width & maximum_iterations for best inference latency and performance without retraining or tweaking the model.

Can you support this request? Or is there any way to fulfill the above request in a simple way?
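For illustration, here is a minimal Python sketch of the difference between the two signatures. The class and method names are hypothetical, not the real tfa.seq2seq API; the point is only that a per-call beam_width lets one decoder instance serve different widths at inference time.

```python
# Hypothetical sketch (not the actual tfa.seq2seq API): contrasting a
# beam width fixed at construction with one passed per call, which is
# what this request asks for.

class InitTimeDecoder:
    """Current pattern: beam_width is frozen at construction."""
    def __init__(self, beam_width, maximum_iterations):
        self.beam_width = beam_width
        self.maximum_iterations = maximum_iterations

    def call(self, start_tokens):
        # Dummy "search": return one candidate per beam.
        return [list(start_tokens) for _ in range(self.beam_width)]


class CallTimeDecoder:
    """Requested pattern: beam_width is a per-call argument, so it can
    be tuned at inference time without rebuilding the decoder."""
    def call(self, start_tokens, beam_width, maximum_iterations):
        return [list(start_tokens) for _ in range(beam_width)]


decoder = CallTimeDecoder()
# The same decoder instance can be queried with different widths:
narrow = decoder.call([1, 2], beam_width=2, maximum_iterations=10)
wide = decoder.call([1, 2], beam_width=8, maximum_iterations=10)
print(len(narrow), len(wide))  # 2 8
```

With the current init-time pattern, tuning beam_width requires constructing (and re-exporting) a new decoder for every candidate width, which is the friction described above.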

Relevant information

  • Are you willing to contribute it (yes/no): no
  • Are you willing to maintain it going forward? (yes/no): no
  • Is there a relevant academic paper? (if so, where): no
  • Is there already an implementation in another framework? (if so, where): no
  • Was it part of tf.contrib? (if so, where): no

Which API type would this fall under (layer, metric, optimizer, etc.)
Addons - Seq2seq

Who will benefit with this feature?
TensorFlow Addons users who want to tune beam_width via an input parameter at inference time (TF Serving, JNI, etc.) instead of tweaking their models.

Any other info.
