This repository was archived by the owner on Jul 7, 2023. It is now read-only.

[BUG] Decode with attention_lm (with or without MOE) #439

@vince62s

Description

This issue has been reported in various forms in #302, #337, #53, #86 and #299.

It is unclear whether t2t-decoder is supposed to work with attention_lm models.

As far as I have observed, it does not work.
The subsequent question is: what is the output supposed to be?
In a "normal" situation, the output should be the input string shifted one token to the right.

Can someone @lukaszkaiser @rsepassi check this issue?

The output for attention_lm is nonsense.
The output for attention_lm_moe is also nonsense AND the log carries some strange warnings.

Thanks!
