
[Pytorch] Unexpected task example translation : text-generation instead of Translation in model card and Hub #25931

Closed
SoyGema opened this issue Sep 3, 2023 · 3 comments

@SoyGema
Contributor

SoyGema commented Sep 3, 2023

Hello there!
Thanks for making the translation example with PyTorch.
🙏🙏 The documentation is amazing and the script is very well structured! 🙏🙏

System Info

- `transformers` version: 4.32.0.dev0
- Platform: macOS-13.4.1-arm64-arm-64bit
- Python version: 3.10.10
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.2
- Accelerate version: 0.21.0
- Accelerate config:    not found
- PyTorch version (GPU?): 2.0.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed

Who can help?

@patil-suraj

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Context

Fine-tuning an English-Hindi translation model with t5-small and the opus100 dataset,
running the example script run_translation.py from the transformers repository.

A small modification was made to shrink the dataset for faster end-to-end testing.

Checked the README.md recommendations for T5-family models:

  • 1. Add the --source_prefix flag
  • 2. Set the three flags --source_lang, --target_lang, and --source_prefix accordingly
python run_translation.py \
    --model_name_or_path t5-small \
    --do_train \
    --do_eval \
    --source_lang en \
    --target_lang hi \
    --source_prefix "translate English to Hindi: " \
    --dataset_name opus100 \
    --dataset_config_name en-hi \
    --output_dir=/tmp/english-hindi \
    --per_device_train_batch_size=4 \
    --per_device_eval_batch_size=4 \
    --overwrite_output_dir \
    --num_train_epochs=3 \
    --push_to_hub=True \
    --predict_with_generate=True \
    --report_to all \
    --do_predict

The model trains correctly and is also connected to W&B.
Model card trace once the model is trained:

[INFO|modelcard.py:452] 2023-09-02 23:08:32,386 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Sequence-to-sequence Language Modeling', 'type': 'text2text-generation'}, 'dataset': {'name': 'opus100', 'type': 'opus100', 'config': 'en-hi', 'split': 'validation', 'args': 'en-hi'}}
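For reference, the dropped entry above seems to lack only a metrics field (it already has task and dataset). A sketch of the shape that would presumably be kept, with a purely illustrative metric value:

# Illustrative only: the same result dict as in the log above, plus the
# 'metrics' field that appears to be the missing piece. The BLEU value is made up.
result = {
    "task": {"name": "Sequence-to-sequence Language Modeling", "type": "text2text-generation"},
    "dataset": {"name": "opus100", "type": "opus100", "config": "en-hi",
                "split": "validation", "args": "en-hi"},
    "metrics": [{"name": "Bleu", "type": "bleu", "value": 0.0}],  # illustrative value
}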

The model is pushed to the Hub.

Expected behavior

  • Correct task recognition and inference: somehow the task is uploaded to the Hub as text-generation and not as a translation task.
    Inference shows text-generation as well, and the model card seems to point at that too.
    While searching, I visited/read the forum, but I think that thread refers to the BLEU generation metric and not the task (if I'm understanding it correctly). I've also checked the Tasks docs, but I think they explain how to add a task, not how to change one - please let me know if I should follow that path - and the Troubleshoot page, but couldn't find anything.

[Screenshot: text-generation-instead-translation]
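For reference, the tag the Hub actually assigned can be checked programmatically. A minimal sketch, assuming a repo id that is purely illustrative and not the real one:

# Minimal sketch: ask the Hub which task it associates with the pushed repo.
# The repo id below is hypothetical; replace it with the pushed model's id.
from huggingface_hub import HfApi

info = HfApi().model_info("SoyGema/english-hindi")  # hypothetical repo id
print(info.pipeline_tag)  # shows e.g. "text2text-generation" instead of "translation"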

Tangential note:
I'm aware that the BLEU score is 0. I tried other languages and modified some of the logic in the compute_metrics function, and also tried a language for which BLEU was computed correctly; however, the model was still uploaded as text-generation. If further experimentation confirms a hypothesis I have about this logic and BLEU (that it affects languages with non-Latin alphabets), I will let you know, but I ran those experiments to check whether the BLEU behaviour was somehow related to the task issue.

[Screenshot 2023-09-03 at 10:14:15]
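A sketch of the kind of BLEU experiment mentioned above (not the script's exact compute_metrics): sacreBLEU's tokenizer choice is one variable that can matter for non-Latin scripts, so comparing tokenizers is one way to test that hypothesis. The Hindi strings below are illustrative:

# Sketch, assuming the `evaluate` library with the sacrebleu metric:
# compare sacreBLEU tokenizers on a non-Latin-script pair. Strings are illustrative.
import evaluate

metric = evaluate.load("sacrebleu")
preds = ["यह एक परीक्षण है"]            # illustrative model output
refs = [["यह एक परीक्षण वाक्य है"]]      # illustrative reference

for tok in ("13a", "intl", "char"):
    score = metric.compute(predictions=preds, references=refs, tokenize=tok)
    print(tok, score["score"])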

Any help clarifying this and pointing the model at the translation task would be much appreciated.
And if some change to the script or docs comes out of this, I'd be happy to contribute.
Thanks for making transformers 🤗, for the time dedicated to this issue, and have a nice day!

@SoyGema SoyGema changed the title [Pytorch] Unexpected task example translation : Generation instead of Translation in model card and Hub [Pytorch] Unexpected task example translation : text-generation instead of Translation in model card and Hub Sep 3, 2023
@amyeroberts
Collaborator

Hi @SoyGema, thanks for raising this issue!

If you want to change the task that a model is mapped to, you can do so by clicking on the Edit model card button

[Screenshot 2023-09-04 at 14:52:39]

and then selecting the desired pipeline_tag.

[Screenshot 2023-09-04 at 14:53:26]

Why this is automapped to text2text generation when the task is specified in the script, I'm not sure. However, this tag isn't technically incorrect - T5 is an encoder-decoder model and this is a text generation task. cc @Narsil @muellerzr do either of you know?

With regard to your questions about BLEU, that is best placed in our forums. We try to reserve GitHub issues for feature requests and bug reports.

@Narsil
Contributor

Narsil commented Sep 7, 2023

Why this is automapped to text2text generation when the task is specified in the script, I'm not sure. However, this tag isn't technically incorrect - T5 is an encoder-decoder model and this is a text generation task. cc @Narsil @muellerzr do either of you know?

The hub infers tasks automatically from the config.json@architectures when it's missing from the README.
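(For completeness: one way to pin the task explicitly, so the Hub does not have to infer it from the architectures field, is to write pipeline_tag into the README metadata. A minimal sketch using huggingface_hub; the repo id is illustrative:)

# Sketch: set pipeline_tag in the model card's YAML metadata so the Hub uses it
# instead of inferring a task from config.json. Repo id is hypothetical.
from huggingface_hub import metadata_update

metadata_update(
    repo_id="SoyGema/english-hindi",          # hypothetical repo id
    metadata={"pipeline_tag": "translation"},
    overwrite=True,
)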

@SoyGema
Contributor Author

SoyGema commented Sep 9, 2023

Thanks for the support given in this issue. I consider it complete, as the main challenge has been addressed and some derivative questions as well. With that, and the fact that I tend to own the issues I open, I'm proceeding to close it. Feel free to reopen if necessary. Thanks so much!!

@SoyGema SoyGema closed this as completed Sep 9, 2023