In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level (2nd attempt) #14558

CyrilleB79 · 2023-01-17T16:35:25Z

Context

A first PR (#14459) was merged to fix #14417. Unfortunately, an issue was found (see #14473), so it was reverted in #14477.

This PR is a second attempt to fix #14417 without causing #14473. It will remain a draft until I have more information on #14473 from @OzancanKaratas, as requested in #14473 (comment), or from anyone else able to reproduce it.

Link to issue number:

Fixes #14417

Summary of the issue:

Preliminary note for review

Keep in mind the following: in NVDA with CLDR enabled and with no custom user symbol defined, the symbol level for symbol X is defined as follows:

1. Look at the locale symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
2. Look at the locale CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
3. Look at the English symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
4. Look at the English CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, use the default symbol level (don't remember if it is None or All).
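The lookup order above can be pictured with a minimal Python sketch. It is not NVDA's actual code: `resolve_level`, the dictionaries standing in for the four files, the example levels and the `DEFAULT_LEVEL` placeholder are all assumptions made for illustration (the description itself leaves the real default open).

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for one symbol source: symbol -> level, or None
# when the symbol is listed in the file without a level.
Source = Dict[str, Optional[str]]

# Placeholder only: the PR description does not say which default applies.
DEFAULT_LEVEL = "none"


def resolve_level(symbol: str, sources: List[Source], default: str = DEFAULT_LEVEL) -> str:
    """Return the level from the first source that defines one for the symbol."""
    for source in sources:
        level = source.get(symbol)
        if level is not None:
            return level  # first file with an explicit level wins
    return default  # no file defines a level for this symbol


# Illustrative data, in lookup order.
locale_symbols: Source = {"?": "all", "_": None}  # "_" listed without a level
locale_cldr: Source = {"🐍": "none"}
english_symbols: Source = {"?": "all", "_": "most"}
english_cldr: Source = {"🐍": "none"}
sources = [locale_symbols, locale_cldr, english_symbols, english_cldr]

print(resolve_level("?", sources))
print(resolve_level("_", sources))
print(resolve_level("🐍", sources))
```

In this sketch, "_" falls through the locale symbol file (no level there) and the locale CLDR file (not listed) and ends up with the level from the English symbol file.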

Description of the issue

Hindi has no symbols defined in its symbol file, only the copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. There is, however, a Hindi CLDR file.

Currently, CLDR files are generated with level "None" for all symbols.

Usually, in locales with a CLDR file and a normal symbol file, less common characters that are only in CLDR are reported at level None, i.e. whatever the user's punctuation level setting. But common punctuation symbols (dot, question mark, etc.) are added by translators to the locale symbol file, which allows these symbols to be reported only at a higher punctuation level.

For Hindi (or any language with no symbols translated yet), all the characters present in the CLDR file are reported at "None" level and above (i.e. at any level), because the level is not redefined in the locale (Hindi) symbol file.

In such a situation, using the level of the locale CLDR (None) is not a good strategy. It would be better to take advantage of the levels defined for the symbols in the English symbol file.
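To make the Hindi case concrete, here is a small Python sketch of the behaviour described in this section. Everything in it is hypothetical: the `LEVEL_ORDER` list, the `is_spoken` and `resolve` helpers and the example data are not taken from NVDA. In particular, the "after" data assumes that the locale CLDR source no longer supplies a level for the symbol; the description does not state how nvda-cldr#4 achieves the change.

```python
from typing import Dict, List, Optional

# Assumed ordering of punctuation levels, lowest first.
LEVEL_ORDER = ["none", "some", "most", "all"]


def is_spoken(symbol_level: str, user_level: str) -> bool:
    """A symbol is spoken when the user's level is at least the symbol's level."""
    return LEVEL_ORDER.index(user_level) >= LEVEL_ORDER.index(symbol_level)


def resolve(symbol: str, sources: List[Dict[str, Optional[str]]]) -> str:
    """Return the first explicitly defined level, in lookup order."""
    for source in sources:
        level = source.get(symbol)
        if level is not None:
            return level
    return "none"  # placeholder default, not confirmed by the description


# Order: Hindi symbol file (empty), Hindi CLDR, English symbol file, English CLDR.
before = [{}, {",": "none"}, {",": "all"}, {}]  # CLDR forces level "none"
after = [{}, {",": None}, {",": "all"}, {}]     # assumption: CLDR gives no level

print(is_spoken(resolve(",", before), "none"))  # comma read at user level "none"
print(is_spoken(resolve(",", after), "none"))   # comma follows the English level
```

With the "before" data, the comma is spoken even when the user chooses level "none", which is the symptom reported in #14417; with the "after" data, the English level applies instead.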

Description of user facing changes

CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach

Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.

Testing strategy:

- With the document from #14417 (NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none), tested that punctuation is not reported when the symbol level is None.
- For Hindi (hi) and Amharic (am) with eSpeak, tested the following punctuation and emoji: _ ) , 🐍
  - The emoji 🐍 is reported with a non-English word even at level None.
  - The symbols _ ) , are reported with non-English words only at higher punctuation levels.
  Note: I do not speak these languages, so I just check that the reported word is not English; in Hindi however, "comma" is pronounced the same way as in English but written with Hindi characters.
  Note2: Hindi is a language with an existing but empty symbol file; Amharic is a language with no symbol file.
- Check that #14473 (NVDA cannot read CLDR characters after #14459) is no longer reproducible. Note: I was not able to reproduce #14473, so @OzancanKaratas or someone else able to reproduce it should confirm that the issue is no longer present.

Known issues with pull request:

None

Change log entries:

Note: currently being discussed in the comments.

Bug fixes
In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level (#14417)

Code Review Checklist:

Pull Request description:
- description is up to date
- change log entries
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
API is compatible with existing add-ons.
Documentation:
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English
Security precautions taken.

AppVeyorBot · 2023-01-17T17:31:53Z

Build (for testing PR): https://ci.appveyor.com/api/buildjobs/dequube0p50qf3lp/artifacts/output/nvda_snapshot_pr14558-27558,58c7b80c.exe
PASS: Lint check.
PASS: Unit tests.
PASS: Translation comments check.
PASS: System tests (tags: installer NVDA).
CI timing (mins):
INIT 0.0,
INSTALL_START 1.3,
INSTALL_END 1.1,
BUILD_START 0.0,
BUILD_END 24.1,
TESTSETUP_START 0.0,
TESTSETUP_END 0.3,
TEST_START 0.0,
TEST_END 22.4,
FINISH_END 0.2

See test results for failed build of commit 58c7b80c81

CyrilleB79 · 2023-01-17T23:28:02Z

Hi @OzancanKaratas

I have run NVDA in Turkish with eSpeak Turkish.
With the build of this PR, the Emoji "😄" is reported, i.e. I have checked that something is reported and that the Google translation of what is reported seems to match (since I do not speak Turkish).

Could you try this build? Please try to read "😄" as well as all you have tried in #14473.

In case you are not able to have emojis reported (as described in #14473), could you please indicate the emojis you are trying to read and, above all, provide a full log in debug mode? If the log contains sensitive information (it should not if you are careful), you may want to send it to me privately rather than posting it here.

Many thanks for your help.

OzancanKaratas · 2023-01-18T15:55:05Z

I'm still reproducing the issue.

One possible solution I can think of: Copy the English symbol file into untranslated languages. Thus, the reading experience of untranslated languages can be enhanced without touching the translated languages.

CyrilleB79 · 2023-01-18T21:38:26Z

> I'm still reproducing the issue.

Thanks. That was expected since I have not modified anything except merging from the latest master branch.

> One possible solution I can think of: Copy the English symbol file into untranslated languages. Thus, the reading experience of untranslated languages can be enhanced without touching the translated languages.

Before talking about a solution, I would like to understand the issue, i.e. why you have this issue and why I cannot reproduce it with Turkish NVDA and Turkish eSpeak on my side. For that, as asked in #14558 (comment), could you please provide a full log in debug mode while running your emoji tests with this build? Thanks in advance.

OzancanKaratas · 2023-01-19T10:54:00Z

Unfortunately this is not an issue where the log file can be provided. Emojis are not announced by NVDA. I can't hear anything when navigating with the arrow keys.

CyrilleB79 · 2023-01-19T13:52:12Z

> Unfortunately this is not an issue where the log file can be provided. Emojis are not announced by NVDA. I can't hear anything when navigating with the arrow keys.

The point is that this PR causes an issue on your side and that I cannot reproduce it on my side, even with NVDA and the synthesizer in Turkish. Since the issue seems to be 100% reproducible on your side, it's very likely that I am doing something differently from you.

Even if the emojis are not spoken on your side, I hope that a full log in debug mode from you will help me understand the conditions in which the issue occurs so that I can finally reproduce it on my side.
For example (but this is not exhaustive), the log would allow me to use the same options as you, the same synth/voice, the same document or webpage and the same emoji. The log may also give me unexpected indications that I have not thought of yet.

So please, if you do not mind, could you provide a full log in debug mode when reproducing the issue? Thanks.

CyrilleB79 · 2023-01-30T08:20:22Z

@OzancanKaratas just a reminder: could you please provide a full log in debug mode? As explained in my previous message (pasted again below), it may be useful in any case for me to reproduce the issue that you have found. Many thanks.

> > Unfortunately this is not an issue where the log file can be provided. Emojis are not announced by NVDA. I can't hear anything when navigating with the arrow keys.
>
> The point is that this PR causes an issue on your side and that I cannot reproduce it on my side, even with NVDA and the synthesizer in Turkish. Since the issue seems to be 100% reproducible on your side, it's very likely that I am doing something differently from you.
>
> Even if the emojis are not spoken on your side, I hope that a full log in debug mode from you will help me understand the conditions in which the issue occurs so that I can finally reproduce it on my side. For example (but this is not exhaustive), the log would allow me to use the same options as you, the same synth/voice, the same document or webpage and the same emoji. The log may also give me unexpected indications that I have not thought of yet.
>
> So please, if you do not mind, could you provide a full log in debug mode when reproducing the issue? Thanks.

CyrilleB79 · 2023-02-27T20:46:07Z

Setting this PR as ready to merge as per #14417 (comment).

Note for @OzancanKaratas or any other tester:
This PR probably has an issue, as reported earlier by @OzancanKaratas. Since I do not have enough information to reproduce it, the strategy is now to merge early in alpha, hoping to get another report with more information before the release so that the issue can be fixed.
Of course, if anybody finds an issue in this PR before merging, please report it here as soon as possible so that it can be taken into account before the merge into alpha.

seanbudd · 2023-02-27T22:36:34Z

@CyrilleB79 would an accurate alternative changelog description be:

"In Hindi, punctuation will be reported according to the default English symbol level"?

CyrilleB79 · 2023-02-28T07:50:36Z

@seanbudd, there are two paths:

- either we want a change log entry indicating that what was reported in #14417 (NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none) has been fixed; in this case I would keep the current change log item, i.e.:
  In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level. (#14417)
- or we want to describe more generally what was done in this PR, with the following entry:
  Punctuation will be reported according to the default English symbol level for symbols which do not have a symbol description in the current locale. (, #14558, #14417)

We may also add the two entries if you want.

seanbudd · 2023-03-02T20:55:31Z

Is "In Hindi, NVDA will no longer read punctuation symbols whatever the punctuation level." also accurate? I've found it hard to understand the current message and its use of "anymore".

CyrilleB79 · 2023-03-03T10:56:59Z

@seanbudd, you write:

> Is "In Hindi, NVDA will no longer read punctuation symbols whatever the punctuation level." also accurate? I've found it hard to understand the current message and its use of "anymore".

Sorry, I do not understand this last comment.
Do you mean that the message "In Hindi, NVDA will no longer read punctuation symbols whatever the punctuation level." is not understandable or is not correct?

To be clearer (if it's just a matter of understandability), I can rephrase as follows:
"In Hindi, NVDA will no longer read punctuation symbols whatever the punctuation level reporting chosen by the user."

Also regarding #14558 (comment), it's not clear to me if you want to use one change log entry (and which one) or both.

Adriani90 · 2023-03-03T14:12:34Z

How about just saying:
"in Hindi, NVDA prevents reporting of punctuation symbols when the symbol level is set to "none", and reports now punctuation symbols according to the symbol levels defined in the english symbols file."

CyrilleB79 · 2023-03-03T20:58:35Z

> How about just saying:

No, that's not true because:
In part 1:

> "in Hindi, NVDA prevents reporting of punctuation symbols when the symbol level is set to "none",

Not true in the case of English symbols reported at "none" level.

And in part 2:

> and reports now punctuation symbols according to the symbol levels defined in the english symbols file."

That's only true until Indian translators complete the translation work.

Adriani90 · 2023-03-04T10:10:08Z

Ok this seems a tricky one.
I would say since the behavior of reporting symbols is very dynamic and depends also on translations, could we not just say

"for the Hindi language, NVDA now reports less punctuation symbols than before when the symbol level is set to none."

XLTechie · 2023-03-04T10:41:50Z

> "for the Hindi language, NVDA now reports less punctuation symbols than before when the symbol level is set to none."

The PR description is a little confusing about what is actually being done here. At least to me. It is hard to make a good changelog description, when the PR description is easily misunderstood. For example, what does this mean? " This is not adapted and it would be better to take advantage of the symbol levels that are defined in the English symbol file. " What is not adapted? Is "adapted" the right word?

XLTechie · 2023-03-04T10:49:56Z

It sounds from the PR description, though as I said I find it a little hard to understand, that: For languages that have a "stub" symbol translation (a file with no symbols), or no file at all, symbols are treated as CLDR are treated. Since that is not ideal, this PR causes untranslated symbols to follow the English rules for symbol level. If all of that is a correct understanding, then it's not that symbols in Hindi will go unannounced, or that less of them will be announced (since none were before). It's that: Punctuation in languages without a symbol translation file, will be spoken as it is in the English language, and will follow the punctuation levels from the English symbol file. That's my best guess of what was done, based on the description of the PR. :)

CyrilleB79 · 2023-03-04T22:12:25Z

> It sounds from the PR description, though as I said I find it a little hard to understand, that: For languages that have a "stub" symbol translation (a file with no symbols), or no file at all, symbols are treated as CLDR are treated. Since that is not ideal, this PR causes untranslated symbols to follow the English rules for symbol level.

Correct.

> If all of that is a correct understanding, then it's not that symbols in Hindi will go unannounced, or that less of them will be announced (since none were before).

No, actually, too many symbols were announced before (as described in #14417), since a level of None is defined in hi/cldr.dic for each character.

> It's that: Punctuation in languages without a symbol translation file, will be spoken as it is in the English language, and will follow the punctuation levels from the English symbol file.

This is correct and is the second change log proposal that I tried to phrase in #14558 (comment), i.e.:
"Punctuation will be reported according to the default English symbol level for symbols which do not have a symbol description in the current locale. (, #14558, #14417)"

> That's my best guess of what was done, based on the description of the PR. :)

Note: I have updated the description of the PR. I hope that it is clearer now.

seanbudd · 2023-03-23T05:13:28Z

user_docs/en/changes.t2t

@@ -33,6 +33,7 @@ This only worked for Bluetooth Serial ports before. (#14524)
 - NVDA no longer occasionally causes Mozilla Firefox to crash or stop responding. (#14647)
 - In Mozilla Firefox and Google Chrome, typed characters are no longer reported in some text boxes even when speak typed characters is disabled. (#14666)
 - You can now use browse mode in Chromium Embedded Controls where it was not possible previously. (#13493, #8553)
+- For symbols which do not have a symbol description in the current locale, the default English symbol level will be used. (#14558, #14417)


@CyrilleB79

I've rewritten your alternative proposal, can this be merged with this changelog entry?

Punctuation will be reported according to the default English symbol level for symbols which do not have a symbol description in the current locale. (, #14558, #14417)

> @CyrilleB79
>
> I've rewritten your alternative proposal, can this be merged with this changelog entry?
>
> Punctuation will be reported according to the default English symbol level for symbols which do not have a symbol description in the current locale. (, #14558, #14417)

Yes, this description is correct; this PR can be merged. Thanks.

In Hindi, NVDA will not read anymore punctuation symbols whatever the…

0ec72d0

… punctuation level Fixes nvaccess#14417 Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4

Merge branch 'master' into updateCldr2

26e925b

CyrilleB79 force-pushed the updateCldr2 branch from 5be0a4e to 26e925b Compare January 17, 2023 22:04

CyrilleB79 mentioned this pull request Jan 18, 2023

NVDA cannot read CLDR characters after #14459 #14473

Closed

CyrilleB79 mentioned this pull request Feb 21, 2023

NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none #14417

Closed

CyrilleB79 marked this pull request as ready for review February 27, 2023 20:46

CyrilleB79 requested a review from a team as a code owner February 27, 2023 20:46

CyrilleB79 requested a review from seanbudd February 27, 2023 20:46

seanbudd added the merge-early Merge Early in a developer cycle label Feb 27, 2023

seanbudd added this to the 2023.2 milestone Feb 27, 2023

seanbudd approved these changes Feb 27, 2023

Merge remote-tracking branch 'origin/master' into updateCldr2

08fed78

update changes

8d9340d

seanbudd reviewed Mar 23, 2023

seanbudd merged commit ecdddb2 into nvaccess:master Mar 23, 2023

CyrilleB79 deleted the updateCldr2 branch April 18, 2023 08:38
