Inconsistent precedence of character definition opcodes when capsletter is in use #384

rimas-kudelis · 2017-08-24T19:29:40Z

I have a table like this:

$ cat a.tbl 
uplow Aa 1

uppercase B 12
uppercase B 1
lowercase b 12
lowercase b 1

uplow Cc 14

uplow Dd 145
uplow Dd 14

uplow Ee 15

uplow Ff 124
uplow Ff 15,15

uplow Gg 1245

uplow Hh 125,125
uplow Hh 1245,1245

#sign U 136
#capsletter 136

With this table, as long as the capsletter opcode is not in use, the precedence of conflicting definitions of the same character appears to be top-to-bottom, that is, the topmost definition wins:

$ echo "Aa Bb Cc Dd Ee Ff Gg Hh" | lou_translate -f a.tbl 
aa BB cc dd ee ff gg hh

However, if I uncomment the two capsletter lines, a funny thing starts happening: for uppercase letters, this precedence rule gets reversed and the bottommost rule becomes the winning one:

$ echo "Aa Bb Cc Dd Ee Ff Gg Hh" | lou_translate -f a.tbl 
Uaa UaB Ucc Ucd Uee Uef Ugg Ugh

Am I wrong, or is this a bug?
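To make the two observed outputs easier to compare, here is a toy model of the table above. It is only a sketch and says nothing about how liblouis resolves rules internally: the `resolve` and `expand` helpers are hypothetical, and the reading of `uplow X x A,B` as "A for uppercase, B for lowercase" is an assumption. It resolves every character once with a "first definition wins" policy and once with a "last definition wins" policy.

```python
# Toy model only: hypothetical helpers, not liblouis internals.
RULES = [
    ("uplow", "Aa", "1"),
    ("uppercase", "B", "12"),
    ("uppercase", "B", "1"),
    ("lowercase", "b", "12"),
    ("lowercase", "b", "1"),
    ("uplow", "Cc", "14"),
    ("uplow", "Dd", "145"),
    ("uplow", "Dd", "14"),
    ("uplow", "Ee", "15"),
    ("uplow", "Ff", "124"),
    ("uplow", "Ff", "15,15"),
    ("uplow", "Gg", "1245"),
    ("uplow", "Hh", "125,125"),
    ("uplow", "Hh", "1245,1245"),
]


def expand(rules):
    """Yield (character, dots) pairs in table order."""
    for opcode, chars, dots in rules:
        if opcode == "uplow":
            upper, lower = chars
            # Assumption: "A,B" means A for uppercase and B for lowercase;
            # a single pattern is shared by both cases.
            up_dots, _, low_dots = dots.partition(",")
            yield upper, up_dots
            yield lower, low_dots or up_dots
        else:
            yield chars, dots


def resolve(rules, policy):
    """Map each character to its dots under a 'first' or 'last' wins policy."""
    table = {}
    for char, dots in expand(rules):
        if policy == "first":
            table.setdefault(char, dots)  # keep the earliest definition
        else:
            table[char] = dots            # overwrite with the latest definition
    return table


first = resolve(RULES, "first")
last = resolve(RULES, "last")

for char in "BbDdFfHh":
    print(f"{char}: first-wins {first[char]:>4}   last-wins {last[char]:>4}")
```

Read against the two outputs above, the lowercase letters match the first-wins column in both runs, while the uppercase letters match the last-wins column once capsletter is enabled.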


bertfrees · 2017-08-24T21:59:44Z

Not sure what exactly is happening here. That needs some investigation.

Whether or not it is a "bug" is hard to say because the documentation doesn't mention this specific case, and the behavior has probably always been like this, so it's hard to find out whether it is intentional.

If it turns out the current behavior has no clear purpose and is really too confusing, we can change it (and document the new behavior).

It is kind of understandable though that this is currently not documented. Defining the same character twice with different dot patterns is really something you aren't expected to do. Why would you ever want to do this in real life?

rimas-kudelis · 2017-08-25T05:02:50Z

The real life example is including a generalized table file, and overriding some of its definitions from within the including table.

In my particular case, I want to create fallbackAccentedLatinLetterDef6Dots.uti and fallbackAccentedLatinLetterDef8Dots.uti files (better name suggestions are welcome) which will map all accented Latin letters and ligatures to their non-accented equivalents, like this:

noback uplow     \x00c5\x00e5 1        Åå LATIN CAPITAL LETTER A WITH RING ABOVE,LATIN SMALL LETTER A WITH RING ABOVE
noback uplow     \x00c6\x00e6 1-15     Ææ LATIN CAPITAL LETTER AE,LATIN SMALL LETTER AE

There is a huge number of characters like this in Unicode, and having such a generic table would provide at least some degree of support for all of them without cluttering the language table with hundreds of barely relevant definitions. But there has to be a guaranteed way to override such definitions from within the including table, e.g. to define characters that exist in the national alphabet.

In fact, I just checked and this is exactly what the Latvian table already does. For historical reasons, Latvians use a slightly different mapping of three Latin letters, and here's the relevant excerpt from Lv-Lv-g1.utb:

# define the dot combinations that are different from the default.
# placed before the include to take precedence.
uplow Uu 34                      letter U *** Different from other langs ***
uplow Vv 2456                    letter V *** Different from other langs ***
uplow Zz 345                     letter Z *** Different from other langs ***
include latinLetterDef6Dots.uti

And it doesn't work as expected:

$ echo 'Tt Uu Vv Zz' | lou_translate -f lv.tbl 
,tt ,u/ ,vw ,z>
$ echo 'Tt Uu Vv Zz' | lou_translate -f unicode.dis,lv.tbl
⠠⠞⠞ ⠠⠥⠌ ⠠⠧⠺ ⠠⠵⠜

Note how capital letters U, V and Z get different dot mappings from lowercase ones, which is clearly unintended and unexpected.
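The mismatch can be reproduced on paper with a small sketch. It assumes that the included file defines U, V and Z with the standard patterns 136, 1236 and 1356 (inferred from the output above, not copied from latinLetterDef6Dots.uti), and that lowercase lookups take the first definition while uppercase lookups take the last one, which is the behavior observed in this issue. The `pick` and `case_mismatches` helpers are hypothetical.

```python
# Sketch under stated assumptions; not liblouis code.
# Overrides as they appear in the Latvian excerpt, placed before the include.
OVERRIDES = [("Uu", "34"), ("Vv", "2456"), ("Zz", "345")]

# Assumed stand-in for the relevant lines of latinLetterDef6Dots.uti.
INCLUDED = [("Tt", "2345"), ("Uu", "136"), ("Vv", "1236"), ("Zz", "1356")]


def pick(rules, pair, policy):
    """Return the dots for a letter pair under a 'first' or 'last' wins policy."""
    matches = [dots for chars, dots in rules if chars == pair]
    return matches[0] if policy == "first" else matches[-1]


def case_mismatches(rules):
    """Find letter pairs whose upper- and lowercase forms resolve differently."""
    result = {}
    for pair in dict.fromkeys(chars for chars, _ in rules):
        lower = pick(rules, pair, "first")  # observed: topmost rule wins
        upper = pick(rules, pair, "last")   # observed: bottommost rule wins
        if lower != upper:
            result[pair] = (upper, lower)
    return result


# The include is expanded in place, so its rules follow the overrides.
rules = OVERRIDES + INCLUDED
mismatches = case_mismatches(rules)

for pair, (upper, lower) in mismatches.items():
    print(f"{pair}: uppercase uses {upper}, lowercase uses {lower}")
```

Only the three overridden pairs are reported; T is defined once, so both policies agree on it.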

bertfrees · 2017-08-25T09:55:34Z

Thanks. OK I guess this is a valid use case indeed. In general I think including another table and overriding parts of it is not a good idea, but if it is done in a controlled way, like in your example where the included table has a very specific function, it is OK.

So the next step is to find out what exactly happens in the code and document it. Don't hesitate if you want to help with this.

Also if you are sure that the Latvian table does not work correctly, could you please add some tests?

rimas-kudelis · 2017-08-31T10:30:27Z

@bertfrees I don't really speak or write Latvian, but I guess I could prepare a few simplistic tests, like I did for Lithuanian.

bertfrees · 2017-08-31T10:37:24Z

Oh, didn't realize that. Anyway, a few simplistic tests would already be great. Thanks!

…ions to it. The tests and corrections are based on the information supplied in the World Braille Usage report, third edition. The second test currently fails due to liblouis#384.

rimas-kudelis · 2017-09-13T19:23:06Z

@bertfrees as you can see, I've made a couple pull requests with failing tests related to this issue. Hope this can progress.

rimas-kudelis · 2017-09-13T19:56:15Z

Another observation: the dots reported by lou_trace differ from the ones present in the output:

$ lou_trace -f unicode.dis,lv.tbl
Aa Uu Vv
⠠⠁⠁ ⠠⠥⠌ ⠠⠧⠺
1.	uppercase	A	1
2.	lowercase	a	1
3.	space	 	0
4.	uppercase	U	34
5.	lowercase	u	34
6.	space	 	0
7.	uppercase	V	2456
8.	lowercase	v	2456
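The discrepancy is easier to see once the output cells are converted back to dot numbers. The sketch below relies only on the layout of the Unicode Braille Patterns block, where dot n corresponds to bit n-1 above U+2800; the helper names and the two dictionaries are illustrative and simply transcribe the trace and output shown above.

```python
BRAILLE_BASE = 0x2800  # U+2800, the blank braille cell


def dots_to_cell(dots: str) -> str:
    """Convert a dot list such as "34" to the matching Unicode braille cell."""
    bits = 0
    for digit in dots:
        bits |= 1 << (int(digit) - 1)  # dot n sets bit n-1
    return chr(BRAILLE_BASE + bits)


def cell_to_dots(cell: str) -> str:
    """Convert a Unicode braille cell back to its dot list ("0" for blank)."""
    bits = ord(cell) - BRAILLE_BASE
    return "".join(str(n) for n in range(1, 9) if bits & (1 << (n - 1))) or "0"


# Dots reported by lou_trace, transcribed from the listing above.
traced = {"U": "34", "u": "34", "V": "2456", "v": "2456"}
# Cells that actually appear in the output line, transcribed likewise.
emitted = {"U": "⠥", "u": "⠌", "V": "⠧", "v": "⠺"}

report = {char: (traced[char], cell_to_dots(emitted[char])) for char in traced}

for char, (trace_dots, output_dots) in report.items():
    status = "ok" if trace_dots == output_dots else "MISMATCH"
    print(f"{char}: trace {trace_dots:>4}  output {output_dots:>4}  {status}")
```

For the uppercase letters the output cells decode to 136 and 1236, not to the 34 and 2456 that the trace reports.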

bertfrees · 2017-09-13T22:04:47Z

Brilliant, thank you!

…ions to it. The tests and corrections are based on the information supplied in the World Braille Usage report, third edition. The second test currently fails due to #384.

bertfrees · 2018-11-12T16:26:20Z

See 6bf242e

rimas-kudelis · 2018-11-18T20:29:49Z

@bertfrees that patch probably works around the problem for the Latvian case, but doesn't really fix the underlying issue. I suggest at least mentioning this issue in a comment in that table.
Also, this is still a bug, isn't it? Will you have time to look at what exactly is happening here and why?

bertfrees · 2018-11-18T22:09:02Z

Yes, it's just a workaround. OK, I'll add a reference to this issue.

I don't have time now, but maybe after the release...

bertfrees · 2019-06-21T21:56:02Z

I've added the label "needs test" because there needs to be a YAML file whose purpose is to give an overview of the various precedence rules. Initially it should simply document the current behavior including bugs (and obvious bugs can be fixed of course), but it will also provide us a way to look at the whole picture and find inconsistencies and more subtle issues, and define a new expected behavior based on this.

This has to do with issue liblouis#384.

bertfrees · 2020-02-17T14:31:55Z

I've created the start of this "precedence.yaml" test in 80f8d5a, and also merged issue-384.yaml into it.

jrbowden · 2023-02-28T14:50:18Z

I have the same use case: needing to override parts of English tables for example for special treatment of accented letters.


bertfrees · 2023-12-01T13:11:46Z

Will be fixed by #1481

rimas-kudelis mentioned this issue Sep 13, 2017

Tests and small fixes for the Latvian table #409

Closed

rimas-kudelis added a commit to rimas-kudelis/liblouis that referenced this issue Sep 13, 2017

Added test for issue liblouis#384

5f915be

rimas-kudelis added a commit to rimas-kudelis/liblouis that referenced this issue Sep 13, 2017

Added test for issue liblouis#384

3aae421

bertfrees added the bug Bug in the code (not in a table) label Sep 14, 2017

egli added a commit to rimas-kudelis/liblouis that referenced this issue Sep 15, 2017

Add a reference to liblouis#384

645d5a6

bertfrees pushed a commit that referenced this issue Oct 6, 2017

Added test for issue #384

745935e

bertfrees pushed a commit that referenced this issue Oct 6, 2017

Add a reference to #384

551a5ec

rimas-kudelis mentioned this issue Nov 20, 2017

Added Lithuanian 6-dot table #457

Merged

4 tasks

rimas-kudelis changed the title ~~Inconsistent precedency of character definition opcodes when capsletter is in use~~ Inconsistent precedence of character definition opcodes when capsletter is in use Nov 21, 2017

bertfrees added the needs test A YAML test is needed (and should be committed) to explain the bug or expected behavior of a table label Jun 21, 2019

bertfrees mentioned this issue Jul 22, 2019

Irish may2019 #793

Merged

bertfrees added a commit to Ronan555/liblouis that referenced this issue Aug 3, 2019

Fix uppercase ÁÉÍÓÚ

c8215c2

This has to do with issue liblouis#384.

bertfrees changed the title ~~Inconsistent precedence of character definition opcodes when capsletter is in use~~ Inconsistent precedence of character definition opcodes when capsletter is in use Aug 14, 2019

bertfrees added a commit to Ronan555/liblouis that referenced this issue Aug 22, 2019

Fix uppercase ÁÉÍÓÚ

e89019b

This has to do with issue liblouis#384.

bertfrees added a commit to Ronan555/liblouis that referenced this issue Aug 27, 2019

Fix uppercase ÁÉÍÓÚ

c647d33

This has to do with issue liblouis#384.

bertfrees mentioned this issue Nov 19, 2019

Add an includediff opcode #868

Closed

bertfrees added help wanted Maintainers want help because they don't have the knowledge or the time, or for another reason prio:low Low priority - minor issue, might never be fixed (but a reminder is kept) labels Feb 17, 2020

bertfrees removed the needs test A YAML test is needed (and should be committed) to explain the bug or expected behavior of a table label Feb 17, 2020

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 24, 2022

Repeat certain rules in uk.utb to work around issue liblouis#384

eb3fc58

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 24, 2022

Small correction in precedence.yaml (related to issue liblouis#384)

6f9f433

Futyn-Maker mentioned this issue Aug 24, 2022

Problems in the Ukrainian braille table and priority of base rules #1238

Merged

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 24, 2022

Repeat certain rules in uk.utb to work around issue liblouis#384

d7cf456

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 24, 2022

Small correction in precedence.yaml (related to issue liblouis#384)

6529faa

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 29, 2022

Repeat certain rules in uk.utb to work around issue liblouis#384

0e5d673

bertfrees added a commit to Futyn-Maker/liblouis that referenced this issue Aug 29, 2022

Small correction in precedence.yaml (related to issue liblouis#384)

807e5bb

bertfrees mentioned this issue Mar 3, 2023

Improved Vietnamese Braille Tables #1282

Merged

bertfrees added a commit to danghoaiphuc/liblouis that referenced this issue Mar 4, 2023

Refer to issue liblouis#384 in vi-saigon-g1.ctb

9e6557d

liblouis#384

bertfrees added a commit to danghoaiphuc/liblouis that referenced this issue Mar 4, 2023

Refer to issue liblouis#384 in vi-saigon-g1.ctb

a192952

liblouis#384

bertfrees added a commit to danghoaiphuc/liblouis that referenced this issue Mar 4, 2023

Refer to issue liblouis#384 in vi-saigon-g1.ctb

e6c688a

liblouis#384

bertfrees self-assigned this Jun 5, 2023

bertfrees added this to the 3.27 milestone Jun 5, 2023

bertfrees modified the milestones: 3.27, 3.28 Aug 24, 2023

bertfrees removed help wanted Maintainers want help because they don't have the knowledge or the time, or for another reason prio:low Low priority - minor issue, might never be fixed (but a reminder is kept) labels Nov 24, 2023

bertfrees removed their assignment Dec 1, 2023

bertfrees closed this as completed Dec 4, 2023
