Numericnocontchars are not preceded by nocontractsign when in a begcaps/endcaps block #631
Comments
Sorry, @bertfrees, but this happens during forward translation, not during back-translation.
To do:
The test in question seems to fail for a simple reason: a, b and c are members of numericnocontchars, while A, B and C are not. We have a similar situation in the capsletter tests in line 29 ff., but there we apparently don't want an extra nocontractsign because we already have the capsletter sign to separate digits from letters.

I thought there was a genuine bug, but it appears that the problem lies rather in the defined behaviour. With capsletter and capsword, capital letters after digits will automatically be marked with one of the two indicators, but if the number appears in the middle of a capsphrase, there will be no indicator to separate digits from capital letters. I suppose this is mostly a UEB problem, if it is a problem at all. So perhaps we should hear what the UEB people have to say about it.

In Danish, all letters (capital and small) are numericnocontchars, which I currently address through context lines. And we don't have the concept of capsphrase in Danish Braille, only capsletter and capsword.

So the test in question is xfail for a good reason, namely the problematic combination of capsphrase and numericnocontchars. I am still willing to help address it, but what should we do? Remove the test? Change the table to include
No I think we should keep the test. Maybe it's not a bug, but then at least we document a possible pitfall. The documentation should maybe also be clear about the fact that
You could also argue that it should work out of the box: that An alternative solution could be to not treat digits as capital letters by default (and use |
Yes, I agree. We should document the current behavior. Then, since capsphrase is mainly a UEB thing, if the UEB people see the need to change it, they can do so at a later stage, as long as they document what they do (smile).
However, as long as we use the tests to document a feature, I would prefer to have positive tests, i.e. tests that demonstrate how things work rather than how things do not work, especially since we must assume that the current behavior is the expected behavior for the time being.
So, I suggest simply changing the expected output to the actual output, with an explanation that numericnocontchars is case sensitive, and then getting rid of the xfails.
Should I do that and then add a few sentences to the documentation?
Sure, that would be great, thanks! I'd maybe also add a test to show how to work around the issue (with |
It doesn't seem to be an issue in UEB though. "ABC123ABC" translates to ",,abc#abc,,abc"...
This issue is a spin-off of the discussion in #611 and the tests in issue-400.yaml.
If the first letter after a number (with no intervening space) is a numericnocontchar, a nocontractsign is always needed. Otherwise the reader and the back-translation routine will interpret the letter as a digit. However, this currently does not happen if the construct is part of a begcaps/endcaps block.
Once a caps block has been started, it is presumably not canceled by anything other than an endcaps indicator (as opposed to capsword). So, inserting a nocontractsign should not pose any problem to interpreting the caps block correctly.
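To make the rule and the case-sensitivity pitfall concrete, here is a toy Python model. It is only a sketch of the behaviour described in this thread, not liblouis code: the function name `mark_letters_after_digits`, the `;` placeholder for the nocontractsign, and the two character sets are all assumptions made for illustration, and it works on print characters rather than braille cells.

```python
def mark_letters_after_digits(text, numeric_no_cont_chars, sign=";"):
    """Insert `sign` before any character that directly follows a digit
    and is a member of `numeric_no_cont_chars` (hypothetical toy model)."""
    out = []
    prev_was_digit = False
    for ch in text:
        # Membership is an exact character match, so it is case sensitive.
        if prev_was_digit and ch in numeric_no_cont_chars:
            out.append(sign)
        out.append(ch)
        prev_was_digit = ch.isdigit()
    return "".join(out)


# Hypothetical sets: lowercase a-j only, versus both cases.
LOWER_ONLY = set("abcdefghij")
BOTH_CASES = LOWER_ONLY | set("ABCDEFGHIJ")

print(mark_letters_after_digits("123abc", LOWER_ONLY))  # 123;abc
print(mark_letters_after_digits("123ABC", LOWER_ONLY))  # 123ABC
print(mark_letters_after_digits("123ABC", BOTH_CASES))  # 123;ABC
```

With only lowercase members, a capital letter after a number gets no separator unless some other indicator (such as a capsletter or capsword sign) happens to be emitted there, which is the gap inside a caps block that this issue describes.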