Description
Describe the feature and the current behavior/state.
Hi TFA team, the current seq2seq beam search decoder implementation takes `beam_width` and `maximum_iterations` at decoder construction time (`__init__()`) rather than at call time (`__call__()`). This makes it difficult to tune `beam_width` once a model is trained. In practice, it is a common request that, given a trained model, we would like to tune `beam_width` and `maximum_iterations` for the best inference latency and quality without retraining or tweaking the model.
Can you support this request? Or is there a simple way to achieve this with the existing API?
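One possible workaround under the current API (just a sketch; `decode_with` is a hypothetical helper, and it assumes the trained weights live in `cell`, `output_layer`, and `embedding` from the snippet above): since the decoder object holds no weights of its own, it can be rebuilt around the same trained layers for each candidate setting:

```python
def decode_with(beam_width, maximum_iterations):
    # Hypothetical helper: rebuild the decoder (cheap, no retraining)
    # around the already-trained cell / output_layer / embedding.
    dec = tfa.seq2seq.BeamSearchDecoder(
        cell,
        beam_width=beam_width,
        output_layer=output_layer,
        maximum_iterations=maximum_iterations,
    )
    init = cell.get_initial_state(
        batch_size=batch_size * beam_width, dtype=tf.float32)
    return dec(embedding, start_tokens=start_tokens,
               end_token=end_token, initial_state=init)

# Sweep beam widths at inference time to trade off latency vs. quality.
for bw in (1, 3, 5, 10):
    outputs, _, _ = decode_with(bw, maximum_iterations=50)
    print(bw, outputs.predicted_ids.shape)
```

But this only works while Python can rebuild the decoder; it does not help once the model is exported as a SavedModel for TF Serving or JNI, which is exactly the case this request targets.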
Relevant information
- Are you willing to contribute it (yes/no): no
- Are you willing to maintain it going forward? (yes/no): no
- Is there a relevant academic paper? (if so, where): no
- Is there already an implementation in another framework? (if so, where): no
- Was it part of tf.contrib? (if so, where): no
Which API type would this fall under (layer, metric, optimizer, etc.)
Addons - Seq2seq
Who will benefit from this feature?
TensorFlow Addons users who want to tune `beam_width` as an input parameter at inference time (TF Serving, JNI, etc.), instead of tweaking models
Any other info.