Getting machine-translated prompts of xP3mt #24
Comments
The prompts are here: https://github.com/Muennighoff/promptsource/blob/xp3mt/promptsource/templates/paws-x/es/templates.yaml
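As a rough sketch, that templates.yaml can be read directly with PyYAML. The promptsource YAML files use custom `!Template` / `!TemplateMetadata` tags, so a plain `yaml.safe_load` fails; the loader below maps any `!`-tagged node to a plain dict. The top-level `templates` mapping of id → fields such as `name` and `jinja` is my reading of the file layout, not guaranteed:

```python
import yaml

class _TemplateLoader(yaml.SafeLoader):
    """SafeLoader that turns promptsource's !Template / !TemplateMetadata
    tags into plain dicts instead of Python objects."""

def _tag_to_dict(loader, tag_suffix, node):
    # Any !SomeTag mapping node becomes an ordinary dict.
    return loader.construct_mapping(node, deep=True)

_TemplateLoader.add_multi_constructor("!", _tag_to_dict)

def load_templates(path):
    """Return the templates mapping (id -> fields like 'name', 'jinja')
    from a promptsource templates.yaml file."""
    with open(path, encoding="utf-8") as f:
        doc = yaml.load(f, Loader=_TemplateLoader)
    return doc["templates"]
```

From there you can inspect each template's `jinja` string to see the exact machine-translated prompt text.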
You can just download the paws-x files: https://huggingface.co/datasets/bigscience/xP3mt/tree/main/es e.g. https://huggingface.co/datasets/bigscience/xP3mt/blob/main/es/xp3_paws-x_es_train_task_description-no-label_esmt.jsonl Also see the usage guidelines here that may help: https://huggingface.co/datasets/Muennighoff/xP3x#usage
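Once one of those per-task .jsonl files is downloaded, each line is a standalone JSON object, so a few lines of standard-library Python are enough to read it. The `"inputs"` / `"targets"` field names mentioned in the comment are an assumption about the xP3mt layout; check one line of the actual file first:

```python
import json

def load_xp3_jsonl(path):
    """Parse a JSON-lines file: one JSON object per line.
    In xP3mt these objects appear to carry "inputs" and "targets"
    fields (an assumption; verify against the real file)."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                examples.append(json.loads(line))
    return examples

# Illustrative usage, with the file name from this thread:
# pairs = load_xp3_jsonl("xp3_paws-x_es_train_task_description-no-label_esmt.jsonl")
# print(pairs[0].get("inputs"), pairs[0].get("targets"))
```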
We use Google Machine Translate to translate the prompts and then place them in the same position for all languages. For right-to-left languages like Arabic, everything is the same (i.e., they are processed from the beginning of the sentence to the end). Browsers usually handle displaying the text right-to-left, so we can treat it as left-to-right during modelling.
Thank you for the quick response and the pointer. It is very helpful. In the templates
Yes, you can use accuracy. The metric field in that file is never used.
Great. Thank you for all your help!
Hi,
Thank you for the very interesting work and releasing the code. It is very helpful!
Is there a way I can get the machine-translated prompts per task?
For example, how would I get the Spanish (es) prompt for Paws-x only?
bigscience/xP3mt seems to contain the input, output pairs in Spanish for all the training tasks. Is there a way I can get the input, output pairs for Paws-x only?

data/xp3/prepare_xp3_train.py, with USE_ENGLISH_PROMPTS set to False, seems to load prompts in different languages from PromptSource, but PromptSource only has prompts in English for Paws-x (https://github.com/bigscience-workshop/promptsource/tree/main/promptsource/templates/paws-x).

Also, more generally, how do you do machine translation for prompts if the language is right-to-left instead of left-to-right, or has a different word order, like subject-object-verb instead of subject-verb-object? Would the target come before the input, or would you reorder the sentences in the input (i.e., the premise or hypothesis) in the prompt? And if the target comes before the input, how would the model work, since it generates from left to right?
Thank you,
Derek