NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none #14417

Sanjog1605 · 2022-12-02T09:12:11Z

When reading text in any of the Indian languages, NVDA reports all the punctuation symbols even when the punctuation level is set to none. This happens irrespective of which synthesizer is being used.

Steps to reproduce

Open the attached word document.
Change NVDA synthesizer to eSpeak if you are not using it already.
Set the punctuation level to none by pressing NVDA+P combination.
Read the document.
Sample Hindi Text.docx

Expected Result

When symbol level is set to none, NVDA should not report any of the punctuation marks.

Actual Result

NVDA reports some of the punctuation marks even when punctuation level is set to none.

System Information

OS: Windows 11 Home
Microsoft Office version: Microsoft 365 MSO (Version 2210 Build 16.0.15726.20188) 64-bit
NVDA version: 2022.4 Beta 2

Remarks

The issue is affecting thousands of students with visual disabilities who are studying in Hindi medium schools. Their reading experience is poor, and they sometimes cannot understand the text when it contains a lot of symbols. The issue is present even in older versions of NVDA; these students have only noticed it now because they have just started using their laptops for reading and writing.

Brian1Gaff · 2022-12-03T09:59:59Z

Does this affect any other screenreader, such as the Built in Narrator or the Jaws demo? Brian

Sanjog1605 · 2022-12-03T12:18:34Z

I have tried with Narrator but everything seems to be working fine. The problem occurs with NVDA. I haven’t checked with JAWS.

Commit message: Use symbolLevel=none only for English; for other languages, inherit it from the level below. (#4)

Fixes nvaccess/nvda#14417

Summary of the issue:

Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. However, there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none. This is not appropriate; it would be better to take advantage of the symbol levels that are defined in the English symbol file.

Description of user facing changes

In NVDA, the locale CLDR dic file inherits symbol levels from the files coming after it, i.e. English symbols and English CLDR. When the locale symbol file does not define a character's level, this makes it possible to:

Use the level defined for this symbol in the English symbol file, if it is defined there.
Use "none" (coming from the English CLDR dic file) if the character is not defined in the English symbol file but is defined in CLDR.

Description of development approach

For all languages except English ("en"), generate the cldr.dic file with "-" in the level field, meaning that the level is inherited from previous files. For the English cldr.dic file, use "none" for the symbol level, as was already the case before this PR.
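The development approach described in this commit message can be illustrated with a minimal Python sketch. It is not the actual nvda-cldr generator: the function name `cldr_dic_line` is hypothetical, and the tab-separated "symbol, description, level" layout is an assumption made for illustration. Only the rule itself ("none" for English, "-" for every other locale) comes from the commit message.

```python
def cldr_dic_line(symbol: str, description: str, locale: str) -> str:
    """Build one illustrative entry for a generated cldr.dic file.

    Assumption: an entry is "symbol<TAB>description<TAB>level".
    """
    # English keeps an explicit level of "none", as before the PR.
    # Every other locale writes "-", meaning the level is inherited.
    level = "none" if locale == "en" else "-"
    return f"{symbol}\t{description}\t{level}"


# Illustrative entries only; the descriptions are placeholders.
english_entry = cldr_dic_line("?", "question mark", "en")
hindi_entry = cldr_dic_line("?", "question mark", "hi")
```

With this rule, a Hindi entry no longer pins the symbol to level "none"; the level has to come from another file instead.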

CyrilleB79 · 2022-12-19T14:48:41Z

@seanbudd , this issue has been closed a bit too early. It should be closed when #14459 is merged instead. Thanks.

CyrilleB79 · 2022-12-19T22:28:34Z

@Sanjog1605, as a Hindi speaker, you may want to test this build to validate #14459. I have done some tests, but I am not a Hindi speaker at all.

… punctuation level (#14459)

Fixes #14417

Summary of the issue:

Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. However, there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none. This is not appropriate; it would be better to take advantage of the symbol levels that are defined in the English symbol file.

Description of user facing changes

CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach

Update the nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.

Sanjog1605 · 2022-12-23T16:19:37Z

Apologies for the late reply. I have been unwell for the last 3 days. Give me a week to test it. I do not have my testing machine with me.

CyrilleB79 · 2022-12-30T21:53:47Z

@seanbudd please reopen this issue since #14459 was reverted.

Sanjog1605 · 2023-01-02T13:29:47Z

I confirm that the given NVDA version for testing solves the problem of punctuation reporting for Hindi language. Thank you so much for solving the problem so quickly. Will this solution come out in the 2023 version of NVDA? As this is the snapshot version, I have not given this to the students. A tentative time frame of the release will be much appreciated.

Sanjog1605 · 2023-02-14T10:53:56Z

I have updated NVDA to version 2023.1 beta 1. When I tested for the mentioned issue, NVDA still reads punctuation while reading Hindi text. However, in the NVDA alpha version given to me for testing, everything is working fine. Looking at the version number, I thought that the issue will be resolved in 2023 version. I request you to let me know the approximate implementation time and in which NVDA version it will be implemented as many students are still struggling to read Hindi.

CyrilleB79 · 2023-02-21T12:26:39Z

> I have updated NVDA to version 2023.1 beta 1. When I tested for the mentioned issue, NVDA still reads punctuation while reading Hindi text. However, in the NVDA alpha version given to me for testing, everything is working fine. Looking at the version number, I thought that the issue will be resolved in 2023 version. I request you to let me know the approximate implementation time and in which NVDA version it will be implemented as many students are still struggling to read Hindi.

Hi @Sanjog1605

Unfortunately, the fix in #14459 caused an issue for Turkish (see #14473), so it has been reverted.
Personally, I have no visibility on when this issue can be resolved, since I have not been able to reproduce #14473 on my side and have not yet received the additional information requested in #14558.

seanbudd · 2023-02-21T19:50:09Z

Hi @CyrilleB79, without sufficient information from this reporter, let's re-open this PR against master. If the issue is reproducible, we will get a valid report by the next release.

XLTechie · 2023-02-21T23:39:42Z

@CyrilleB79 I have not studied this fix, but is this something that could be implemented in an add-on for Hindi users?

Adriani90 · 2023-02-22T10:00:24Z

Or maybe a global plugin that ships with NVDA and applies only when the NVDA language is set to Hindi? Is that possible?

Sanjog1605 · 2023-02-22T10:02:28Z

@seanbudd, if you need anything more from my side, please do let me know. I urge NVDA team to work on this issue as many Hindi speakers and students who are studying in Hindi medium schools are affected by this issue. The reason this came up now and not in the past is that the students have just now started using PCs in their studies, with the intervention of Bookshare in India. Bookshare as you might know is an e-library for persons with print disabilities, which apart from providing accessible school textbooks to the students in India, is also involved with digital literacy program where we teach students how to use PC and Android phones to study. This is how the issue came up as many students reported to me that NVDA reads some of the punctuation marks while reading Hindi.

… punctuation level (2nd attempt) (#14558)

A first PR (#14459) had been merged to fix #14417. Unfortunately an issue was found (see #14473), so it was reverted in #14477. This PR is a second attempt to fix #14417 without causing #14473. It will remain a draft until I can get more information on #14473 from @OzancanKaratas, as requested in #14473 (comment), or from anyone else able to reproduce it.

Link to issue number:

Fixes #14417

Summary of the issue:

Preliminary note for review

Keep in mind the following: in NVDA with CLDR enabled and with no custom user symbol defined, the symbol level for symbol X is determined as follows:

Look at the locale symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the locale CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the English symbol file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
Look at the English CLDR file: if X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, use the default symbol level (don't remember if it is None or All).

Description of the issue

Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. However, there is a Hindi CLDR file.

Currently, CLDR files are generated with level "None" for all symbols. Usually, in locales with a CLDR file and a normal symbol file, less common characters that are only in CLDR are reported at level None, i.e. whatever the user's punctuation level setting. But common punctuation symbols (dot, question mark, etc.) are added by translators in the locale symbol file, which allows these symbols to be reported only at a higher punctuation level.

For Hindi (or any language with no symbols currently translated), all the characters present in the CLDR file are reported at "None" level and above (i.e. at any level), because the level is not redefined in the locale (Hindi) symbol file. In such a situation, using the level of the locale CLDR (None) is not a good strategy. It would be better to take advantage of the levels defined for the symbols in the English symbol file.

Description of user facing changes

CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach

Update the nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.
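The four-step lookup in the preliminary note can be sketched in Python as follows. This is only an illustration of the described order, not NVDA's actual implementation: `resolve_level`, the dictionaries and the levels in them are hypothetical, and `DEFAULT_LEVEL` is a placeholder because the description itself leaves the real default open. A value of `None` stands for a level that is not defined in that file (for example "-" in a generated CLDR file).

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for one symbol file:
# symbol -> level, or None when the file defines no level for it.
SymbolFile = Dict[str, Optional[str]]

DEFAULT_LEVEL = "all"  # placeholder; the real default is not stated in the thread


def resolve_level(symbol: str, files: List[SymbolFile],
                  default: str = DEFAULT_LEVEL) -> str:
    """Return the first level defined for symbol, searching files in order."""
    for symbol_file in files:
        level = symbol_file.get(symbol)
        if level is not None:
            return level
    return default


# Illustrative data only.
hindi_symbols: SymbolFile = {}                       # empty symbol file
hindi_cldr_before: SymbolFile = {"?": "none", "😀": "none"}
hindi_cldr_after: SymbolFile = {"?": None, "😀": None}  # levels inherited
english_symbols: SymbolFile = {"?": "all"}
english_cldr: SymbolFile = {"😀": "none"}

before = resolve_level("?", [hindi_symbols, hindi_cldr_before,
                             english_symbols, english_cldr])
after = resolve_level("?", [hindi_symbols, hindi_cldr_after,
                            english_symbols, english_cldr])
emoji_after = resolve_level("😀", [hindi_symbols, hindi_cldr_after,
                                   english_symbols, english_cldr])
```

In this toy data, the question mark resolves to "none" while the Hindi CLDR file sets an explicit level, and to the English symbol file's level once the Hindi CLDR level is inherited. The emoji still resolves to "none" through the English CLDR file.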

CyrilleB79 mentioned this issue Dec 5, 2022

Inherit symbol level for languages other than English nvaccess/nvda-cldr#4

Merged

seanbudd added the p4 and triaged labels Dec 6, 2022

CyrilleB79 mentioned this issue Dec 9, 2022

Do not ignore locale CLDR file when locale symbol file is missing #14433

Merged

seanbudd closed this as completed in nvaccess/nvda-cldr#4 Dec 18, 2022

CyrilleB79 added a commit to CyrilleB79/nvda that referenced this issue Dec 19, 2022

NVDA will not read punctuation symbols in Hindi even if punctuation l…

a9b4ffc

…evel is set to none. Fixes nvaccess#14417 Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4

CyrilleB79 added a commit to CyrilleB79/nvda that referenced this issue Dec 19, 2022

In Hindi, NVDA will not read anymore punctuation symbols whatever the…

0ec72d0

… punctuation level Fixes nvaccess#14417 Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4

CyrilleB79 mentioned this issue Dec 19, 2022

In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level #14459

Merged

seanbudd reopened this Dec 19, 2022

seanbudd closed this as completed in #14459 Dec 21, 2022

nvaccessAuto added this to the 2023.1 milestone Dec 21, 2022

CyrilleB79 mentioned this issue Dec 30, 2022

Revert "In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level" #14477

Merged

seanbudd reopened this Jan 2, 2023

seanbudd removed this from the 2023.1 milestone Jan 2, 2023

This was referenced Jan 17, 2023

In Hindi, NVDA will not read anymore punctuation symbols whatever the punctuation level (2nd attempt) #14558

Merged

NVDA cannot read CLDR characters after #14459 #14473

Closed

seanbudd closed this as completed in #14558 Mar 23, 2023

nvaccessAuto added this to the 2023.2 milestone Mar 23, 2023
