Deprecate the use of = (equals) as a dots operand in opcodes or fix documentation #500

Closed
egli opened this Issue Dec 20, 2017 · 8 comments

@egli
Member

egli commented Dec 20, 2017

The documentation states that

the = shortcut for dot patterns is deprecated. Dot patterns should be written out. Otherwise
back-translation may not be correct.

This text has been in the documentation since Oct 1 2013 (140a04a) when I added this. I probably copied this from a mail from @johnjboyer but I can't find it in the mailing list archive now.

@hammera did some tests and concluded that the use of the equal sign as a dot operand did not cause problems with backtranslation.

In the Hungarian translation tables, rules of this type have never produced back-translation issues. There is no difference in back-translation quality whether I use, for example, a begword pilis = rule or a begword pilis 1234-24-123-24-234 rule.

So the question is whether this should really be deprecated or maybe the documentation should just be fixed.

@hammera

Contributor

hammera commented Dec 20, 2017

Hi Chris,

Here is some lou_allround output for the en-us-g2.ctb table. I tested a few of the deprecated equals-style rules:

Command: t
Enter the name of a table: en-us-g2.ctb
Command: r
Type something, press enter, and view the results.
A blank line returns to command entry.

wiseacr
Translation:
wiseacr
Back-translation:
wiseacr
Perfect roundtrip!

dledum      
Translation:
dledum
Back-translation:
dledum
Perfect roundtrip!

tweedledum
Translation:
twe$l$um
Back-translation:
tweedledum
Perfect roundtrip!

Rosenlaer
Translation:
,ros5laer
Back-translation:
Rosenlaer
Perfect roundtrip!

Shanghai
Translation:
,%anghai
Back-translation:
Shanghai
Perfect roundtrip!

The US English grade 2 table uses an estimated 283 equals-style rules; I will attach the list.
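The round-trip check that lou_allround performs above could also be scripted. The following is a minimal sketch, not part of the original test: the helper names roundtrip and louis_roundtrip are hypothetical, and the second helper assumes the liblouis Python bindings (the louis module) and the en-us-g2.ctb table are installed.

```python
def roundtrip(text, translate, back_translate):
    """Translate, back-translate, and report whether the text survived.

    Returns (braille, back, ok), where ok is True for a perfect roundtrip.
    """
    braille = translate(text)
    back = back_translate(braille)
    return braille, back, back == text


def louis_roundtrip(text, table="en-us-g2.ctb"):
    """Same check via the liblouis Python bindings (assumed to be installed)."""
    import louis  # imported lazily so roundtrip() is usable without liblouis

    tables = [table]
    return roundtrip(
        text,
        lambda s: louis.translateString(tables, s),
        lambda s: louis.backTranslateString(tables, s),
    )
```

With the bindings available, louis_roundtrip("tweedledum") would return the braille, the back-translation, and a flag matching the "Perfect roundtrip!" message. As the later comments in this issue show, a passing check on sample words does not prove that every = rule is safe.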

Attila

@hammera

Contributor

hammera commented Dec 20, 2017

@egli

Member

egli commented Dec 21, 2017

Maybe we should ask @johnjboyer if he remembers why he said those should be deprecated

@BueVest

Collaborator

BueVest commented Feb 14, 2018

This works most of the time because a given dot pattern back-translates to its original characters unless a rule tells the back-translator otherwise.

It will not work when the "=" rule exists to circumvent another rule that would also affect back-translation, e.g. if a table contains the following rules:

always ooo 135-135
word foobar =

The dot pattern 124-135-135-12-1-1235 will back-translate to fooobar as specified by the first rule (see the attached yaml test).

Not a very practical example, I admit, but it demonstrates the problem.
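To make the failure concrete, here is a toy model of the two rules. It is only a sketch under stated assumptions: the letter cells are the standard English ones (f=124, o=135, b=12, a=1, r=1235), and the greedy longest-match loop is a simplification, not Liblouis' actual back-translation algorithm. The names LETTERS, RULES, forward, and back_translate are made up for this illustration.

```python
# Assumed basic letter definitions (standard English braille cells).
LETTERS = {"f": "124", "o": "135", "b": "12", "a": "1", "r": "1235"}

# "always ooo 135-135", keyed by its cell sequence for back-translation.
RULES = {("135", "135"): "ooo"}


def forward(word, letters):
    """Model of 'word foobar =': each character maps to its basic cell."""
    return [letters[ch] for ch in word]


def back_translate(cells, rules, letters):
    """Greedy back-translation: multi-cell rules win over single letters."""
    reverse_letters = {dots: ch for ch, dots in letters.items()}
    out, i = [], 0
    while i < len(cells):
        # Try the longest rule patterns first.
        for pattern, text in sorted(rules.items(), key=lambda kv: -len(kv[0])):
            if tuple(cells[i:i + len(pattern)]) == pattern:
                out.append(text)
                i += len(pattern)
                break
        else:
            # No rule matched here: fall back to the basic letter.
            out.append(reverse_letters[cells[i]])
            i += 1
    return "".join(out)


cells = forward("foobar", LETTERS)
print("-".join(cells))                        # 124-135-135-12-1-1235
print(back_translate(cells, RULES, LETTERS))  # fooobar
```

The forward direction is fine because the word rule shields foobar from the ooo rule. Nothing shields the cells 135-135 on the way back, so the always rule fires.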

I think the = operand is a good idea in principle, because it is a clear way of saying that the string should be matched by the basic dot pattern. If it were to work for back-translation as well, perhaps the = sign should be replaced internally by the corresponding dot pattern at compile time.

Just an idea.
broken_equals_operand.yaml.txt
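The compile-time replacement suggested above might look roughly like the sketch below. This is an illustration only, not how the Liblouis table compiler is written: expand_equals is a hypothetical helper, it handles only simple three-field rules, and the letter cells are assumed to be the standard ones (p=1234, i=24, l=123, s=234).

```python
# Assumed basic letter definitions for the example word.
LETTERS = {"p": "1234", "i": "24", "l": "123", "s": "234"}


def expand_equals(rule, letters):
    """Rewrite 'opcode chars =' as 'opcode chars dots-dots-...'.

    Rules that already spell out their dots are returned unchanged.
    """
    opcode, chars, dots = rule.split()
    if dots == "=":
        dots = "-".join(letters[ch] for ch in chars)
    return f"{opcode} {chars} {dots}"


print(expand_equals("begword pilis =", LETTERS))
# begword pilis 1234-24-123-24-234
```

After such an expansion the back-translator would see an ordinary rule with explicit dots, which is the form the documentation recommended writing by hand.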

@egli egli added this to the 3.5 milestone Feb 19, 2018

@egli egli self-assigned this Feb 19, 2018

@bertfrees

Member

bertfrees commented Feb 19, 2018

Agree with Bue that the = operand is useful and Bue's test shows that there are indeed cases where it breaks back-translation. The issue isn't as bad as the documentation makes it out to be though. Maybe it shouldn't be deprecated, but a small warning is still appropriate.

There is nothing that keeps Liblouis from handling the = correctly during back-translation. It's a shortcoming of Liblouis, not a shortcoming of the table format.

@egli

Member

egli commented Feb 20, 2018

So am I correct in summarizing that the solution is to change the documentation? Change it from saying that = is deprecated to maybe give a warning that there might be a bug hiding when using = together with back-translation?

@bertfrees

Member

bertfrees commented Feb 20, 2018

Yes. And also add Bue's test to the test suite.

@egli

Member

egli commented Feb 20, 2018

OK, cool, thanks

@egli egli closed this in 7170822 Feb 20, 2018
