[WIP] fetch translations online #30759

Open: wants to merge 14 commits into base: master

@mart-e
Contributor

mart-e commented Feb 1, 2019

This is a PoC to test the feasibility of such an approach. Nothing is decided yet; I am creating a PR to make it easier to discuss.

Fetch the latest translations from an external provider (e.g. nightly/CDN server) instead of relying on local files. This would make it possible to:

  • fetch translations more regularly than once per week
  • update translations without having to wait for the source code to be updated
  • reduce the size of the repository/install files by removing .po files from this repository

Questions:

  • how to manage SaaS installations (you do not want an action on one db to affect the files used by another)
    --> solution used: two extraction methods, save to fs or save to ir.attachment
  • how to manage no-longer-supported versions and the nightly server (we won't serve v14 translation files in 2042)?
  • triple-check everything security-related, since this involves executing downloaded files

This PR uses #38859 as base (still to merge).

Misc:

Todo/Problems to solve:

  • we use the .pot files as reference, but this no longer works since the files are no longer loaded from the fs; this is why a test is still failing (fr_BE.po is outdated while the .pot is right)
  • use a last_fetch_date to avoid refreshing when it is not needed
@mart-e mart-e requested review from odony and rco-odoo Feb 1, 2019
@C3POdoo C3POdoo added the RD label Feb 1, 2019
@robodoo robodoo added the seen 🙂 label Feb 1, 2019
@Yenthe666

Collaborator

Yenthe666 commented Feb 1, 2019

🍰 oh man, this one is awesome.
@mart-e is this what we talked about at Odoo Experience 2 years ago where you said you heavily wanted to reduce the amount of source files for translations?

@@ -47,6 +61,7 @@ class Lang(models.Model):
"Provided ',' as the thousand separator in each case.")
decimal_point = fields.Char(string='Decimal Separator', required=True, default='.', trim=False)
thousands_sep = fields.Char(string='Thousands Separator', default=',', trim=False)
last_fetch_date = fields.Datetime(string='Last Translations Fetch')

@rco-odoo

rco-odoo Feb 1, 2019

Member

This should be tracked per fetch location and per lang...
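
One way to read this suggestion, as a minimal sketch only: keep the last fetch time keyed by (location, lang) instead of a single date on the language. The dict, `needs_fetch`, `REFRESH_INTERVAL`, and the example URL are hypothetical and not taken from this PR; in Odoo this state would presumably live on a model rather than in a dict.

```python
from datetime import datetime, timedelta

# Hypothetical minimum delay between two fetches of the same (location, lang).
REFRESH_INTERVAL = timedelta(days=1)


def needs_fetch(last_fetch, location, lang, now, interval=REFRESH_INTERVAL):
    """Return True if (location, lang) was never fetched or is older than interval."""
    fetched_at = last_fetch.get((location, lang))
    return fetched_at is None or now - fetched_at >= interval


# Usage: one timestamp per (translation server, language) pair.
last_fetch = {("https://example.com/i18n", "fr_BE"): datetime(2019, 2, 1, 8, 0)}
now = datetime(2019, 2, 1, 12, 0)
print(needs_fetch(last_fetch, "https://example.com/i18n", "fr_BE", now))
print(needs_fetch(last_fetch, "https://example.com/i18n", "nl_BE", now))
```

Keying on both values means that a fetch from one translation server does not suppress a fetch of the same language from another server.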

@Yenthe666

Collaborator

Yenthe666 commented Feb 1, 2019

@mart-e I don't have the time to go through the whole PR but I have a question.
What happens in the following case?
Transifex term in Dutch on 01/02/2019: Documents
Term in Dutch on the local instance on 01/02/2019: Documents

On 02/02/2019 somebody locally edits the source term because they want to give it another name: Files instead of Documents.
On 03/02/2019 this PR's sync runs and fetches translations. Will the local translation be kept because it was modified locally (and thus differs from the Transifex term)? Will this work like a noupdate mechanism?

@rim-odoo

Contributor

rim-odoo commented Feb 1, 2019

👍 for the purpose of this PR !

@sbidoul

Contributor

sbidoul commented Feb 1, 2019

@mart-e do you consider supporting custom addons too?

@mart-e

Contributor Author

mart-e commented Feb 1, 2019

@Yenthe666 ideally, it will be invisible to the user: whether the terms are fetched online or from the fs, it should not change the way you interact with translations and Transifex.

@sbidoul of course, this is why there is a field i18n_location on ir.module.module. The implementation may still change but the goal is to support multiple translation servers.

odoo/addons/base/models/res_lang.py Outdated
bio = BytesIO()
bio.write(stream.content)
bio.seek(0)
self._extract_i18n_file_content(bio, lang, urls[url], extraction_method)

@xmo-odoo

xmo-odoo Feb 1, 2019

Collaborator

With stream.raw the tarfile can be decompressed on the fly (streaming). Requires opening with r|* to ensure it never seeks.
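
A minimal sketch of the streaming approach described here, assuming the archive is read in a single forward pass. `ForwardOnlyStream`, `iter_archive_files`, and `build_sample_archive` are hypothetical helpers written for this illustration; the stand-in stream only exposes read(), which is roughly how a raw HTTP body behaves.

```python
import io
import tarfile


class ForwardOnlyStream:
    """Expose only read(), like a raw HTTP body that cannot seek."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)


def iter_archive_files(fileobj):
    """Yield (name, content) for each regular file in one forward pass."""
    # "r|*" is stream mode with transparent compression detection: no seek().
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if member.isfile():
                # In stream mode each member must be read before moving on.
                yield member.name, archive.extractfile(member).read()


def build_sample_archive(files):
    """Build an in-memory .tar.xz from a {name: bytes} mapping (demo only)."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w:xz") as archive:
        for name, body in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(body)
            archive.addfile(info, io.BytesIO(body))
    return raw.getvalue()


sample = build_sample_archive({"base/i18n/fr.po": b'msgid "Documents"\n'})
for name, content in iter_archive_files(ForwardOnlyStream(sample)):
    print(name, len(content))
```

With requests, the same function could presumably be given `stream.raw` from a response obtained with `stream=True`, so the download never has to be buffered in a BytesIO first.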

"""
with tarfile.open(mode='r:xz', fileobj=fileobj) as tar_content:
with tempfile.TemporaryDirectory() as tmp:
for filename in tar_content.getnames():

@xmo-odoo

xmo-odoo Feb 1, 2019

Collaborator

Might be worth iterating on the TarFile directly to get TarInfo objects so we can check the entry's size, & keep a running tally or somesuch to avoid decompression bombs? (using xz, 6.5GB of zeroes compress to <1MB).

Might also need to check if the entry isfile(), extract/extractfile apparently blows up with non-file entries.
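
A rough sketch of the two checks suggested above: skip entries that are not regular files, and keep a running total of the declared sizes so extraction stops once a budget is exceeded. `read_members_with_budget`, `ArchiveTooLarge`, and the 50 MB default are assumptions made for this example, not part of the PR.

```python
import io
import tarfile

# Hypothetical cap on the total uncompressed size accepted from one archive.
MAX_TOTAL_BYTES = 50 * 1024 * 1024


class ArchiveTooLarge(Exception):
    """Raised when the declared uncompressed size exceeds the budget."""


def read_members_with_budget(fileobj, max_total_bytes=MAX_TOTAL_BYTES):
    """Return {name: bytes} for regular files, refusing oversized archives."""
    total = 0
    contents = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        # Iterating the TarFile yields TarInfo objects one header at a time.
        for member in archive:
            if not member.isfile():
                continue  # directories, links, devices: nothing to read
            total += member.size
            if total > max_total_bytes:
                raise ArchiveTooLarge(
                    "archive exceeds %d bytes uncompressed" % max_total_bytes
                )
            contents[member.name] = archive.extractfile(member).read()
    return contents
```

The size is checked before the entry is read, so an oversized entry is rejected without its content being loaded into memory.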

Contributor

naglis left a comment

I find it an awesome feature. Great effort! 👍

odoo/addons/base/models/res_lang.py Outdated
for url in urls:
full_url = self._get_i18n_url(url, lang)
try:
stream = requests.get(full_url, stream=True)

@naglis

naglis Feb 9, 2019

Contributor

Since it is likely there will be multiple requests to the same host, I'd suggest using requests.Session to reuse the underlying TCP connection (potential performance increase).

odoo/addons/base/models/res_lang.py Outdated
for url in urls:
full_url = self._get_i18n_url(url, lang)
try:
stream = requests.get(full_url, stream=True)

@naglis

naglis Feb 9, 2019

Contributor

From requests docs:

If you set stream to True when making a request, Requests cannot release the connection back to the pool unless you consume all the data or call Response.close. This can lead to inefficiency with connections. If you find yourself partially reading request bodies (or not reading them at all) while using stream=True, you should make the request within a with statement to ensure it’s always closed:

with requests.get('https://httpbin.org/get', stream=True) as r:
    # Do things with the response here.

So I think it would be safer to use the response as a context manager.
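
A hedged sketch combining the two suggestions from this review: one shared requests.Session for all URLs, and each streamed response used as a context manager so it is always closed. `fetch_translation_archives` and its parameters are hypothetical and not taken from the PR; the session is injectable and requests is imported lazily only so that the function can be exercised with a stub.

```python
def fetch_translation_archives(urls, session=None, timeout=30):
    """Download each URL through one session; return {url: bytes} for OK responses."""
    own_session = session is None
    if own_session:
        import requests  # third-party; lazy import so a stub session can be passed

        session = requests.Session()  # reuses the TCP connection per host
    archives = {}
    try:
        for url in urls:
            # The with block closes the response even if reading it fails,
            # which releases the connection back to the pool.
            with session.get(url, stream=True, timeout=timeout) as response:
                if not response.ok:  # True for any status code below 400
                    continue
                archives[url] = response.content
    finally:
        if own_session:
            session.close()
    return archives
```

Skipping failed URLs silently is a simplification for the example; real code would probably log them.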

full_url = self._get_i18n_url(url, lang)
try:
stream = requests.get(full_url, stream=True)
if stream.status_code != 200:

This comment was marked as outdated.

@naglis

naglis Feb 9, 2019

Contributor

What if we used if not stream.ok to be more lenient?

@mart-e

Contributor Author

mart-e commented Feb 11, 2019

@naglis Thank you for the review but keep in mind this is very early work. It does not work yet and it will probably still change a lot so no need to spend too much time on details of implementation at the moment.

@mart-e mart-e force-pushed the odoo-dev:master-download-languages-mat branch 6 times, most recently Feb 19, 2019
Collaborator

xmo-odoo left a comment

I am cow, hear me moo

@mart-e mart-e requested a review from xmo-odoo Mar 5, 2019
@mart-e mart-e force-pushed the odoo-dev:master-download-languages-mat branch from c38baa1 to dabe55b Oct 21, 2019
@KangOl KangOl force-pushed the odoo:master branch from 86c80d3 to ab6d0c3 Nov 6, 2019
@mart-e mart-e force-pushed the odoo-dev:master-download-languages-mat branch 13 times, most recently from 1032ac6 to 24885e8 Nov 6, 2019
mart-e added 14 commits Oct 16, 2019
Only activated languages should be used in "amount to text" features.
If a language code of a not-used language is used, it should be
ignored for consistency with the rest of the interface.
load_lang was a kind of hybrid method trying to activate or create a
language if not found. This was error-prone.
Instead rely on two methods with clear purpose:
ResLang._create_lang(lang, lang_name=None)
  - create a new res.lang entry using the locale of the server
    return the res.lang record to match the API of _activate_lang

ResLang._activate_lang(code)
  - activate the given code lang

Most of the time, _activate_lang is what is expected

tools.trans_load_data and IrTranslation._load_module_terms no longer
activate the language if not active.
Loading the translations should be explicit on an activated language,
it is too error prone to silently activate/create a language if not
found.
Remove lang_name from trans_load_data as no longer needed.
Instead of relying on the context content, pass explicit values for
overwrite and create_empty_translations
Apply this to trans_load and trans_load_data
Adapt the test that was relying on empty translations.
To keep up with the changes
Instead, updating the translations of a module should be done directly
on the ir.module.module record
It was misleading as only forced for translations of type 'code' but
for the other translations, it was retrieved from the imported file
(the comment in a .po file or column in a .csv)
Instead of previous long methods, use a class to clarify what the
export actually does.
Remove the 'all_installed' possibility in modules as it was not
working (creating a query with 2 WHERE clauses).
Simplify writer by deducing modules from exported translations
So much performance
Fetch the latest translations from an external provider (e.g. nightly/CDN
server) instead of relying on local files. This would make it possible to:
- fetch translations more regularly than once per week
- be able to update translations without having to wait for the source code to
  be updated
- reduce the size of the repository/install files by removing .po files from
  this repository

Changes in the way the translations are loaded.
Instead of having public methods that do several unrelated things, split into
small private methods with only one purpose:
* load_lang: no longer activates or creates a language; there is now one method
  _activate_lang and one method _create_lang

* trans_load_data: no longer activates the language if inactive, should be
  tested and activated explicitly

* import_lang: used to create a new language if it was not present or
  activated. Will now raise an error.
Remove the wrapper to keep reading inside the i18n folder
Remove translation as was only checking that the translations are
correctly loaded anyway
@mart-e mart-e force-pushed the odoo-dev:master-download-languages-mat branch from 6a5cb04 to d904db9 Nov 8, 2019
10 participants