html2po misses translatable text #234

Open
transl8bzimport opened this Issue Aug 11, 2006 · 5 comments


transl8bzimport commented Aug 11, 2006

Version: 0.9a1

Originally posted by Clytie Siddall:

Today I used html2po to convert a webpage for translation. It did fairly well, though it still brought a fair
bit of code along. More importantly, it missed some text. The missed section is enclosed in double braces
below (html2po reported no errors):

Hỏi: 6.3  Tôi liên lạc
với SIL như thế nào?

Đáp:</
span> Nơi Mạng chính của chúng tôi là: {{ http://www.sil.org

Our site about complex
scripts is:  http://scripts.sil.orgInformation about this license (including contact email
information) is at:  http://scripts.sil.org/OFL


}}7  About using the OFL for your original fonts

Member

dwaynebailey commented Aug 11, 2006

If you see instances of HTML snippets that should not have been included, please
file a bug with the associated snippet of HTML. Please give us more than what
appears in the PO file.

Member

dwaynebailey commented Aug 11, 2006

Some of that appears to already be in vi. Is this the output of po2html, i.e. is
it merging in vi translations? Also, was that text available for translation, i.e.
did html2po pull it out of the original HTML correctly?

Can you attach a snippet of the original HTML file so that we can see the
original formatting?

Member

dwaynebailey commented Nov 22, 2006

This is a pretty ugly piece of HTML

Even after sending it through ‘tidy’ to clean up the HTML and even trying to
produce XHTML the bit of text is still missed.

The issue is that all of the text appears in [tag name lost] tags, but we only
extract the first section, which is also in [tag name lost] tags, and thus skip
the next, which is outside the [tag name lost] but in the [tag name lost].

This summarises the problem:

Got this text

Misses this
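
A rough sketch of the failure described above, for anyone trying to reproduce it. The tag names did not survive in this comment, so `<td>` (outer) and `<p>` (inner) are assumed stand-ins, and neither class is html2po's actual code. `InnerOnlyExtractor` keeps only text inside the inner tag, which reproduces the miss. `BlockAwareExtractor` starts a new unit at every block boundary, so the loose text after the inner tag is kept too.

```python
from html.parser import HTMLParser

# Hypothetical stand-in markup: <td> and <p> are assumed tag names,
# not necessarily the ones used on the original page.
SAMPLE = "<table><tr><td><p>Got this text</p>Misses this</td></tr></table>"


class InnerOnlyExtractor(HTMLParser):
    """Collects text only while inside the inner tag (reproduces the miss)."""

    def __init__(self, inner="p"):
        super().__init__()
        self.inner = inner
        self.depth = 0
        self.units = []

    def handle_starttag(self, tag, attrs):
        if tag == self.inner:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.inner and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Text outside the inner tag is silently dropped here.
        if self.depth and data.strip():
            self.units.append(data.strip())


class BlockAwareExtractor(HTMLParser):
    """Starts a new unit at every block boundary, so loose text is kept."""

    def __init__(self, blocks=("td", "p")):
        super().__init__()
        self.blocks = set(blocks)
        self.buffer = []
        self.units = []

    def _flush(self):
        # Collapse whitespace and store the pending text, if any.
        text = " ".join("".join(self.buffer).split())
        if text:
            self.units.append(text)
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.blocks:
            self._flush()

    def handle_endtag(self, tag):
        if tag in self.blocks:
            self._flush()

    def handle_data(self, data):
        self.buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def extract(parser_cls, html):
    """Run one of the extractors over html and return its text units."""
    parser = parser_cls()
    parser.feed(html)
    parser.close()
    return parser.units
```

Calling `extract(InnerOnlyExtractor, SAMPLE)` and `extract(BlockAwareExtractor, SAMPLE)` side by side shows the difference between the two approaches on the stand-in markup.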

transl8bzimport commented Nov 23, 2006

Originally posted by Clytie Siddall:

Talk to SIL about it. ;)

But how much HTML is well-written? If we translators only work on good HTML, we won’t be too busy. :(

So html2po has to be able to handle some of the common eccentricities practised by so-called website
authors.

Think how long it takes us to train people to check their translation files properly. Imagine trying to do that
with all the website authors out there. And we have no incentive. Not even a nice, quiet cattle-prod.

Although I’ve suggested building motivating electric shocks into PCs… ;)

Contributor

clouserw commented Jun 26, 2008

Dan patched html2po to handle some JavaScript and PHP parsing problems we were having. The changes aren’t committed yet but should land soon. They are currently waiting for my review at https://bugzilla.mozilla.org/show_bug.cgi?id=437342; anyone else, please feel free to comment as well. Also CCing Dan here.
