This repository has been archived by the owner on Sep 4, 2023. It is now read-only.
    // if we already sent this message, we just skip it
    return;
}
const text = node.textContent;
if (text.trim().length) {
    /*
     * send the content back to mediator in order to have the translation
     * requested by it
     */
    const payload = {
        text,
        type: "inpage",
        attrId: [
            this.processingNodeMap,
            key
        ],
    };
    this.notifyMediator("translate", payload);
    this.messagesSent.add(key);
}
}
This leads to very poor translation quality because the system does not have sentence context. Even with sentence context, keeping text spans as is prevents reordering and makes the translation problem impossible. For example, the French word chien translates to dog. In this HTML, what is the translation of h? <span id="0">c</span><span id="1">h</span><span id="2">i</span><span id="3">e</span><span id="4">n</span>
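To make the problem concrete, here is a minimal sketch that splits the example markup into the fragments a per-text-node approach would see. The regular expression is only a stand-in for a real DOM walk and is adequate only for this tiny, flat, well-formed snippet; `textNodesOf` is a hypothetical helper, not part of the extension.

```javascript
// Hypothetical helper: list the text fragments found between tags.
// A regex is NOT a general HTML parser; it is enough for this flat snippet.
function textNodesOf(html) {
  return html
    .split(/<[^>]*>/) // cut the string at every tag
    .filter((piece) => piece.length > 0); // drop empty gaps between adjacent tags
}

const html =
  '<span id="0">c</span><span id="1">h</span><span id="2">i</span>' +
  '<span id="3">e</span><span id="4">n</span>';

const pieces = textNodesOf(html);
console.log(pieces); // five one-letter fragments, none translatable on its own
console.log(pieces.join("")); // the word only exists once the fragments are joined
```

Each fragment would be sent as a separate request, so the engine never sees the word the fragments spell.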
Since block elements are sentence-breaking, individual block elements can be sent for translation using their innerHTML. The HTML parser also knows to break sentences at block boundaries so larger elements can also be sent in. It does assume well-formed HTML though; Firefox is better at fixing HTML and this ensures consistency between rendering and how the engine perceives tags. Well-formed implies tags that open also close inside the same block of text; #23 is a blocker.
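A rough sketch of that idea, assuming a fixed tag list is a good enough approximation of "block element" (a real implementation would more likely inspect the computed `display` style). The helper names and the `isHTML` field are hypothetical; `text` and `type` mirror the payload in the snippet above, and `attrId` is left out for brevity.

```javascript
// Assumption: this tag list approximates "block element" for illustration.
const BLOCK_TAGS = new Set(["P", "DIV", "LI", "H1", "H2", "H3", "BLOCKQUOTE", "TD"]);

// Hypothetical helper: return the outermost block elements under `element`.
// Nested blocks stay inside their parent's innerHTML, where the HTML-aware
// engine is expected to break sentences at block boundaries itself.
function outermostBlocks(element) {
  if (BLOCK_TAGS.has(element.tagName)) return [element];
  return Array.from(element.children).flatMap(outermostBlocks);
}

// Build one payload per block, carrying markup instead of bare text.
// `isHTML` is a made-up field name, not the extension's actual API.
function htmlPayloads(root) {
  return outermostBlocks(root).map((el) => ({
    text: el.innerHTML,
    type: "inpage",
    isHTML: true,
  }));
}

// Plain objects stand in for DOM elements so the sketch runs outside a browser.
const root = {
  tagName: "BODY",
  children: [
    { tagName: "P", innerHTML: "Le <b>chien</b> dort.", children: [{ tagName: "B", children: [] }] },
    { tagName: "SPAN", children: [{ tagName: "DIV", innerHTML: "Bonjour", children: [] }] },
  ],
};
console.log(htmlPayloads(root).map((p) => p.text));
```

Sending the outermost block is only one possible granularity; text that sits outside every block element is not covered by this sketch.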
#51 is a partial blocker. Specifically this part needs to be fixed first:
Even if HTML were being submitted, it would not be used properly (and would cause an abort()) because the model doesn't produce alignment information. In the model configuration YAML, the line alignment: soft is missing.
The team has very little confidence in bergamot-translator's current embedded HTML translation capabilities, due to the very well documented issues, the long time it took to have it implemented, and the constant belittling displayed by your team regarding the way we were trying to solve this problem while your team was still unable to provide the right tools (again, also well documented).
Still, we decided to give it another try in a couple of weeks and submit it to QA. If we still see issues like page defacing, stripping of tags, etc., we will abandon this altogether and remain with textNodes, which will be the approach used in the user test we are mandated to run internally.
#111 imported the latest API changes in bergamot-translator, which will enable the extension to use the HTML translation feature.
Now, the extension needs to parse the content to be translated and send a boolean flag per batch item, indicating whether the content is HTML or not, to get the translations.
Now, the extension needs to parse the content to be translated
The extension (developer) knows whether it is translating Node.textContent, Node.value or Node.innerHTML, if I'm correct. I don't think there is ever a need to parse the content to determine this flag's value.
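As a sketch of that point, the flag can be derived from which DOM property was read, with no inspection of the content itself. The `batchItem` helper and the `isHTML` field name are illustrative assumptions, not the names used by bergamot-translator or the extension.

```javascript
// Hypothetical mapping from the DOM property that was read to the
// per-item flag; only innerHTML carries markup.
const SOURCE_IS_HTML = new Map([
  ["textContent", false],
  ["value", false],
  ["innerHTML", true],
]);

// Build one batch item from a node and the property chosen by the caller.
// `isHTML` is an illustrative field name.
function batchItem(node, source) {
  if (!SOURCE_IS_HTML.has(source)) {
    throw new Error(`unsupported source property: ${source}`);
  }
  return { text: node[source], isHTML: SOURCE_IS_HTML.get(source) };
}

// Plain objects stand in for DOM nodes so the sketch runs outside a browser.
console.log(batchItem({ innerHTML: "Le <b>chien</b>" }, "innerHTML"));
console.log(batchItem({ value: "chien" }, "value"));
```

Because the caller already decides which property to read, the flag falls out of that decision.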
Currently the code sends text snippets for translation: firefox-translations/extension/view/js/InPageTranslation.js, lines 135 to 158 in c21b61f.
Once the alignment issue in #51 is fixed, HTML processing coming out of the engine should be consistent with https://translate.ikhoefgeen.nl/ .
Quality issues with HTML processing should be raised on https://github.com/browsermt/bergamot-translator