Numericnocontchars are not preceded by nocontractsign when in a begcaps/endcaps block #631

BueVest · 2018-08-31T08:37:07Z

This issue is a spin-off of the discussion in #611 and the tests in issue-400.yaml.

If the first letter after a number (with no intervening space) is a numericnocontchar, a nocontractsign is always needed. Otherwise, both the reader and the back-translation routine will interpret the letter as a digit. However, this currently does not happen if the construct is part of a begcaps/endcaps block.
Once a caps block has been started, it is presumably not canceled by anything other than an endcaps indicator (unlike capsword). So inserting a nocontractsign should not pose any problem for interpreting the caps block correctly.
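
To make the expected rule concrete, here is a minimal Python sketch. It is not liblouis code: it works on plain text rather than braille cells, and the function name and the `;` stand-in for the nocontractsign are assumptions chosen only for illustration.

```python
def mark_letters_after_digits(text, numericnocontchars, sign=";"):
    """Insert `sign` before any listed character that directly follows a digit.

    Toy model only: `sign` is a hypothetical stand-in for the nocontractsign,
    and membership is tested on text characters (so it is case sensitive).
    """
    out = []
    prev = ""
    for ch in text:
        # A listed character right after a digit would otherwise read as a digit.
        if prev.isdigit() and ch in numericnocontchars:
            out.append(sign)
        out.append(ch)
        prev = ch
    return "".join(out)


marked = mark_letters_after_digits("abc123abc", "abcdefghij")
unmarked = mark_letters_after_digits("abc123xyz", "abcdefghij")
```

In this model the rule does not depend on any surrounding capitalization state, which is the behaviour the issue asks for inside a begcaps/endcaps block.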

BueVest · 2018-09-05T20:54:29Z

Sorry, @bertfrees, but this happens during forward translation, not during back-translation.
I will see if I can make any sense of the code.

bertfrees · 2018-09-26T18:28:52Z

To do:

Reference this issue from the issue-400.yaml test:

  # number does not cancel a block in capitals
  - - "ABC123ABC"
    - ",abc#abc;abc."
    - {xfail: missing nocontractsign after number}

BueVest · 2018-10-02T20:47:28Z

The test in question seems to fail for a simple reason: a, b and c are members of numericnocontchars, while A, B and C are not. We have a similar situation in the capsletter tests in line 29 ff, but here, we apparently don't want an extra nocontractsign because we already have the capsletter sign to separate digits from letters.
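
The case-sensitivity point can be illustrated with a small Python sketch. This is a toy check on text characters, not liblouis code; the function name is hypothetical, and the lower-case set is assumed to match what the test table declares.

```python
# Assumed content of the test table's numericnocontchars (lower case only).
LOWER_ONLY = set("abcdefghij")


def letters_after_digits_are_listed(text, chars):
    """Return True if every letter directly following a digit is in `chars`."""
    return all(
        cur in chars
        for prev, cur in zip(text, text[1:])
        if prev.isdigit() and cur.isalpha()
    )


lower_ok = letters_after_digits_are_listed("abc123abc", LOWER_ONLY)
upper_ok = letters_after_digits_are_listed("ABC123ABC", LOWER_ONLY)
```

Here `lower_ok` is `True` and `upper_ok` is `False`: the capital A after the number is simply not a member of the set, so no nocontractsign would be triggered for it.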

I thought there was a genuine bug, but it appears that the problem lies rather in the defined behaviour. With capsletter and capsword, capital letters after digits are automatically marked with one of the two indicators, but if the number appears in the middle of a capsphrase, there is no indicator to separate digits from capital letters.

I suppose this is mostly a UEB problem, if it is a problem. So, perhaps we should hear what the UEB people have to say about it.

In Danish, all letters (cap and small) are numericnocontchars, which I currently address through context lines. And we don't have the concept of capsphrase in Danish Braille, only capsletter and capsword.

So the test in question is xfail for a good reason, namely the problematic combination of capsphrase and numericnocontchars.

I am still willing to help address it, but what should we do? Remove the test? Change the table to include
numericnocontchars abcdefghijABCDEFGHIJ?

bertfrees · 2018-10-22T12:41:46Z

No, I think we should keep the test. Maybe it's not a bug, but then at least we document a possible pitfall. The documentation should maybe also be clear about the fact that numericnocontchars is case-sensitive.

numericnocontchars abcdefghijABCDEFGHIJ might work. But you might end up with a nocontractsign followed by a capsign in some situations, so you'd have to remove one of them in a second pass.
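
A rough Python sketch of what such a second pass could look like, operating on an already translated string. Everything here is an assumption for illustration: the function name, the `;` and `,` stand-ins for nocontractsign and capsign, and the choice to drop the nocontractsign rather than the capsign.

```python
def drop_redundant_nocontractsign(braille, nocontractsign=";", capsign=","):
    """Remove a nocontractsign that is immediately followed by a capsign.

    Toy model: both signs are hypothetical single-character stand-ins, and
    the capsign is assumed to be enough to separate digits from letters.
    """
    return braille.replace(nocontractsign + capsign, capsign)


cleaned = drop_redundant_nocontractsign("#abc;,a")
untouched = drop_redundant_nocontractsign("#abc;abc")
```

Which of the two indicators should survive would have to follow the braille standard of the language in question.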

You could also argue that it should work out of the box: that numericnocontchars should compare dots rather than text characters. I think this would make the most sense.
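
As a sketch of the difference between comparing characters and comparing dots, consider the following Python toy. It is not how liblouis is implemented: the `DOTS` mapping and function name are hypothetical, and it assumes a table in which upper- and lower-case letters map to the same cell.

```python
# Hypothetical character-to-dots mapping; upper and lower case share a cell.
DOTS = {
    "a": "1", "b": "12", "c": "14", "x": "1346",
    "A": "1", "B": "12", "C": "14", "X": "1346",
}


def is_listed_by_dots(ch, listed, dots=DOTS):
    """Return True if `ch` has the same dot pattern as any character in `listed`."""
    listed_patterns = {dots[c] for c in listed}
    return dots.get(ch) in listed_patterns


by_text = "A" in "abc"                    # character comparison
by_dots = is_listed_by_dots("A", "abc")   # dot-pattern comparison
```

With character comparison, `A` is not matched by a list containing only `abc`; with dot comparison it is, because both letters share dot 1, while `X` stays unmatched either way.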

An alternative solution could be to not treat digits as capital letters by default (and use capsmodechars 123456789 to override this), so that the capsign would have to be repeated, but this of course would need to match the braille standard of the language in question.

BueVest · 2018-10-22T20:41:43Z

Yes, I agree. We should document the current behavior. Then, since capsphrase is mainly a UEB thing, if the UEB people see the need to change it, they can do so at a later stage, as long as they document what they do (smile). However, as long as we use the tests to document a feature, I would prefer to have positive tests, i.e. tests that demonstrate how things work rather than how things do not work, especially since we must assume that the current behavior is the expected behavior for the time being. So I suggest simply changing the expected output to the actual output, adding an explanation that numericnocontchars is case-sensitive, and then getting rid of the xfails. Should I do that and then add a few sentences to the documentation?

bertfrees · 2018-10-23T09:00:44Z

Sure, that would be great, thanks! I'd maybe also add a test to show how to work around the issue (e.g. with numericnocontchars abcdefghijABCDEFGHIJ), and maybe one specific to UEB (in tests/braille-specs/ueb-issue-x.yaml, and create a new issue).

bertfrees · 2018-10-23T09:06:33Z

It doesn't seem to be an issue in UEB though. "ABC123ABC" translates to ",,abc#abc,,abc"...

bertfrees added the "back-translation" label Sep 4, 2018

bertfrees removed the "back-translation" label Sep 6, 2018

bertfrees added the "bug" label Sep 26, 2018

BueVest added a commit to BueVest/liblouis that referenced this issue Sep 30, 2018

Added reference to liblouis#631 in issue-400.yaml.

e37052e

bertfrees mentioned this issue Oct 1, 2018

Ensure nocontractsign before numericnocontchars in capital blocks #648

Merged

egli added the "documentation" label and removed the "bug" label Nov 12, 2018

bertfrees assigned BueVest Aug 14, 2019

BueVest mentioned this issue Sep 7, 2019

Buevest#631 #845

Merged

bertfrees added this to the 3.12 milestone Sep 7, 2019

egli closed this as completed in #845 Sep 9, 2019

This was referenced Oct 1, 2023

Missing indicator for letters after numbers in capital passage #1323

Closed

UEB: Ensure letter indicators are added to letters A-J following numbers in a capitalized passage #1457

Merged
