This repository was archived by the owner on Jul 7, 2023. It is now read-only.

[BUG] Decode with attention_lm (with or without MOE) #439

@vince62s

Description

This issue has been reported in various forms in #302, #337, #53, #86 and #299.

It is unclear whether t2t-decoder is supposed to work with attention_lm models.

As far as I have observed, it does not work.
The subsequent question is: what is the output supposed to be?
In a "normal" situation, the output should be the input string shifted one token to the right.

Can someone @lukaszkaiser @rsepassi check this issue?

The output for attention_lm is nonsense.
The output for attention_lm_moe is also nonsense AND the log carries some strange warnings.

Thanks!
