Internationalization (I18N) enablement #15

Open
amfred opened this issue Jan 29, 2021 · 19 comments
Labels
good first issue Good for newcomers stale

Comments

@amfred
Member

amfred commented Jan 29, 2021

We would like to be able to translate what users see in the future. Enabling the repos for translation is step 1 of that.

@RickPoleshuck

This seems like a good first project. Does GitHub let you add dependencies between issues? My wife might help with Spanish translations, but that issue seems to depend on this one.

@amfred
Member Author

amfred commented Mar 2, 2021

That's great that we might have a Spanish translator!

You're right - we at least need to get all of the English strings in one file first. Issue #16
Then it's not hard to create another copy of the file with the Spanish in it.
The exact file format depends on the translation library we choose; still open to suggestions.
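
As a rough sketch of the "one file per language" idea, assuming flat key-to-string catalogs (the keys, strings, and the `missingKeys` helper below are made up for illustration and are not tied to any particular library), a small check can list which English strings still lack a Spanish translation:

```typescript
// Hypothetical message catalogs: one object per language, same keys in each.
type Catalog = Record<string, string>;

// English is the source of truth.
const en: Catalog = {
  "nav.home": "Home",
  "nav.signIn": "Sign in",
  "case.submit": "Submit case",
};

// Spanish copy of the same file, still incomplete.
const es: Catalog = {
  "nav.home": "Inicio",
  "nav.signIn": "Iniciar sesión",
};

// Return the keys present in the source catalog but absent from the translation.
function missingKeys(source: Catalog, translation: Catalog): string[] {
  return Object.keys(source).filter((key) => !(key in translation));
}

console.log(missingKeys(en, es)); // ["case.submit"]
```

A check like this could run in CI so translators can see what is left to do, whichever library we end up choosing.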

Heh. GitHub Enterprise + ZenHub (which is what I'm used to) lets us add dependencies, but I don't see how to do it here. Does anyone else know how to do that?

@amfred
Member Author

amfred commented Mar 2, 2021

By the way, the intent of this issue #15 is to choose the open source translation library. Then #16 is to put the English strings into the format that the library needs. And #17 is for a translation of that file into Spanish (or any other language we can get).

@amfred
Member Author

amfred commented Mar 2, 2021

We might well have to choose different translation libraries/plug-ins for different repos because they use different programming languages, but if we choose one for the UI to start with that uses a standard file format, maybe we can at least use the same file format for the other repos.

@RickPoleshuck

RickPoleshuck commented Mar 2, 2021 via email

@RickPoleshuck

RickPoleshuck commented Mar 2, 2021 via email

@blueivywave

It sounds like the intent of this task is to provide content for multiple languages. Correct?

If that is the case, what APIs are being considered for this, or are being evaluated?

@amfred
Member Author

amfred commented Mar 2, 2021

The lower level components like the Bias Detection Engine might end up generating text too, so I could see us passing a language code to them and getting back translated text. It's fine to start with the UI first since we know that needs it for sure.
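
A minimal sketch of what "pass a language code, get back translated text" could look like, assuming in-memory catalogs keyed by language code with English as the fallback. The `translate` function, the message keys, and the strings are all hypothetical and do not reflect the actual Bias Detection Engine interface:

```typescript
type Catalog = Record<string, string>;

// Hypothetical catalogs, keyed by language code.
const catalogs: Record<string, Catalog> = {
  en: {
    "bias.none": "No bias detected",
    "bias.found": "Potential bias detected",
  },
  es: {
    "bias.none": "No se detectó sesgo",
  },
};

const DEFAULT_LANG = "en";

// Look up a message for the requested language, falling back to English,
// and finally to the key itself so a missing string is visible rather than blank.
function translate(lang: string, key: string): string {
  const text: string | undefined =
    catalogs[lang]?.[key] ?? catalogs[DEFAULT_LANG]?.[key];
  return text ?? key;
}

console.log(translate("es", "bias.none"));  // "No se detectó sesgo"
console.log(translate("es", "bias.found")); // "Potential bias detected" (English fallback)
console.log(translate("fr", "bias.none"));  // "No bias detected" (unknown language)
```

Falling back to English keeps a partially translated language usable while the translation is still in progress.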

@amfred
Member Author

amfred commented Mar 2, 2021

> It sounds like the intent of this task is to provide content for multiple languages. Correct?
>
> If that is the case, what APIs are being considered for this, or are being evaluated?

You mean like Google translate?

@RickPoleshuck

RickPoleshuck commented Mar 2, 2021 via email

@blueivywave

> We might well have to choose different translation libraries/plug-ins for different repos because they use different programming languages, but if we choose one for the UI to start with that uses a standard file format, maybe we can at least use the same file format for the other repos.

Have you looked into IBM's Language Translator? It instantly translates web content. Here is the link: https://www.ibm.com/watson/services/language-translator/

@RickPoleshuck

RickPoleshuck commented Mar 3, 2021 via email

@amfred
Member Author

amfred commented Mar 12, 2021

I recently learned that Angular has a built-in translation framework, so I think we should start there with the UI, because it's built in Angular.
https://angular.io/guide/i18n

Each language has a separate file, in XML or JSON format, containing the translated strings.
If we use JSON as our file format, maybe there's also a JSON-based translation framework that would work with our Java/Maven Aggregator, for example.
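
To make the shared-file idea concrete, here is a small sketch that assumes a flat JSON object mapping message keys to strings with `{name}`-style placeholders. This is not Angular's own i18n API or file schema; the `es.json` contents and the `format` helper are hypothetical and only illustrate a format simple enough to be read from other languages as well:

```typescript
// Contents of a hypothetical es.json file, inlined as a string so the sketch is self-contained.
const esJson = `{
  "greeting": "Hola, {name}",
  "results.count": "{count} resultados"
}`;

// Any JSON parser can read this shape: a flat object of key -> string.
const messages: Record<string, string> = JSON.parse(esJson);

// Replace {placeholder} tokens with values; unknown placeholders are left untouched.
function format(
  template: string,
  params: Record<string, string | number>
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    name in params ? String(params[name]) : match
  );
}

console.log(format(messages["greeting"], { name: "Ana" }));    // "Hola, Ana"
console.log(format(messages["results.count"], { count: 3 }));  // "3 resultados"
```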

@RickPoleshuck

This article, https://phrase.com/blog/posts/best-libraries-for-angular-i18n/ , says that Angular's built-in i18n has caught up with ngx-translate. I am happy with either.

@RickPoleshuck

I would get started with the implementation, but I haven't fully set up my development environment. authentication.service.ts and client_id are still confusing to me. Does it make sense for me to ask for help?

@upkarlidder upkarlidder added this to To do in Open Sentencing Jul 30, 2021
@carleyreardon3

Not sure if you are still looking for ideas on this, or whether this is something you would consider. If you do use machine translation and want to make sure the model works well on the kind of language you're using (legal language, etc.), you could try fine-tuning a machine translation model on a set of English sentences paired with a small set of manual translations. Hugging Face has a ton of machine translation models that are fairly easy to use. For example, here is one that translates English to Arabic: https://huggingface.co/Helsinki-NLP/opus-mt-en-ar

@github-actions

github-actions bot commented Sep 3, 2021

👋 Hi! This issue has been marked stale due to inactivity. If no further activity occurs, it will automatically be closed.

@github-actions github-actions bot added the stale label Sep 3, 2021
@demilolu
Contributor

demilolu commented Sep 9, 2021

@carleyreardon3 We're still looking at this. We're also doing some translation for our Five Fifths Voter project. For that we're looking at Watson Language Translator, but we're also thinking about how to crowdsource human translators/reviewers. For Hugging Face, I forget: is it available via an API? I think the issue with BERT or similar transformer models is the computation required.

We might need a unified view of how we deal with translation across all our projects :). cc @upkarlidder

@github-actions github-actions bot removed the stale label Oct 29, 2021
@github-actions

👋 Hi! This issue has been marked stale due to inactivity. If no further activity occurs, it will automatically be closed.

@github-actions github-actions bot added the stale label Nov 28, 2021
5 participants