
Marian models broken with latest transformers #26271

Closed
orendar opened this issue Sep 19, 2023 · 6 comments

Comments

@orendar

orendar commented Sep 19, 2023

System Info

Transformers >= 4.31 (and possibly some earlier versions as well), including master.

Who can help?

@Joaoga I noticed that you merged an update to Marian models.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Thank you for your great work :)

I can confirm that pretrained Marian models are broken on transformers 4.31 and on master, and that they are working on 4.28.

For reproduction, see for example the following snippet, taken from the Helsinki-NLP/opus-mt-tc-big-he-en model card: on 4.28 I get the expected output, while on 4.31 and master I get random gibberish. I don't know whether this also happens with other models and languages, but I suspect it does.

from transformers import MarianMTModel, MarianTokenizer
import torch

src_text = [
    "היא שכחה לכתוב לו.",
    "אני רוצה לדעת מיד כשמשהו יקרה."
]

model_name = "Helsinki-NLP/opus-mt-tc-big-he-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name, torch_dtype=torch.bfloat16).to('cuda')
translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True).to('cuda'))

for t in translated:
    print(tokenizer.decode(t, skip_special_tokens=True))

# expected output:
#     She forgot to write to him.
#     I want to know as soon as something happens.

Expected behavior

See above

@ArthurZucker
Collaborator

Hey, this is related to Helsinki-NLP/Tatoeba-Challenge#35 and a duplicate of #26216. The conversion script was a bit faulty and the models need a re-upload. I can try to open PRs for the relevant models.
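
For anyone who wants to try one of those Hub PRs before it is merged, a minimal sketch; the revision "refs/pr/4" below is only a placeholder and should be replaced with the actual PR opened on the model repo:

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-tc-big-he-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
# "refs/pr/4" is a placeholder revision; point it at the actual Hub PR for this repo.
model = MarianMTModel.from_pretrained(model_name, revision="refs/pr/4")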

@orendar
Author

orendar commented Sep 20, 2023

That would be greatly appreciated, thank you! Right now I am running on transformers 4.28, but I am sure many others are interested in using the 1k+ translation models on the latest version.

@ArthurZucker
Collaborator

I pushed a lot of PRs on the Hub; if you are still seeing this, feel free to ping me. I think some of the big models have not been updated yet, we'll see.
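
If a model has already been re-uploaded but you still see gibberish, it may be worth ruling out a stale local cache first; a minimal sketch, assuming the fixed weights are on the model's main branch:

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-tc-big-he-en"
# force_download=True bypasses any previously cached (pre-fix) files.
tokenizer = MarianTokenizer.from_pretrained(model_name, force_download=True)
model = MarianMTModel.from_pretrained(model_name, force_download=True)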


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@mgrbyte

mgrbyte commented Feb 14, 2024

Hello, we have Marian-converted models that are experiencing this issue (1 and 2).

Could anyone recommend a solution please?

These models are known to work with transformers versions greater than 4.26.1 and up to 4.30.2; after that, garbage is returned both by the model's generate() method and by transformers.pipeline.
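
For reference, the kind of check we run looks roughly like the sketch below (shown here with the opus-mt-tc-big-he-en checkpoint from the snippet above as a stand-in for our own models):

from transformers import pipeline

# Stand-in checkpoint; our own Marian-converted models behave the same way.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-big-he-en")
print(translator("היא שכחה לכתוב לו.")[0]["translation_text"])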

Many thanks!

Edit: Sorry, just found #26216 which seems to be the correct place for this!

@ArthurZucker
Collaborator

No worries, I think @LysandreJik is planning to finish updating the models that were missed in the great refactor!
