NVDA reads punctuation symbols in Indian languages even if punctuation level is set to none #14417

Closed
Sanjog1605 opened this issue Dec 2, 2022 · 13 comments · Fixed by nvaccess/nvda-cldr#4, #14459 or #14558
Labels
p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority triaged Has been triaged, issue is waiting for implementation.

@Sanjog1605

When reading text in any Indian language, NVDA reports all the punctuation symbols even when the punctuation level is set to none. This happens irrespective of which synthesizer is being used.

Steps to reproduce

  1. Open the attached word document.
  2. Change NVDA synthesizer to eSpeak if you are not using it already.
  3. Set the punctuation level to none by pressing NVDA+P combination.
  4. Read the document.
    Sample Hindi Text.docx

Expected Result

When symbol level is set to none, NVDA should not report any of the punctuation marks.

Actual Result

NVDA reports some of the punctuation marks even when punctuation level is set to none.

System Information

OS: Windows 11 Home
Microsoft Office version: Microsoft 365 MSO (Version 2210 Build 16.0.15726.20188) 64-bit
NVDA version: 2022.4 Beta 2

Remarks

The issue is affecting thousands of students with visual disabilities who are studying in Hindi medium schools. Their reading experience is poor, and they sometimes cannot understand the text when it contains a lot of symbols. The issue is present even in older versions of NVDA; these students have only noticed it now because they have just started using their laptops for reading and writing.

@Brian1Gaff

Brian1Gaff commented Dec 3, 2022 via email

@Sanjog1605
Author

Sanjog1605 commented Dec 3, 2022 via email

@seanbudd seanbudd added p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority triaged Has been triaged, issue is waiting for implementation. labels Dec 6, 2022
seanbudd pushed a commit to nvaccess/nvda-cldr that referenced this issue Dec 18, 2022
…t from the level below. (#4)

Fixes nvaccess/nvda#14417

Summary of the issue:
Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place.
But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none.

This is not appropriate, and it would be better to take advantage of the symbol levels that are defined in the English symbol file.

Description of user facing changes
In NVDA, a locale CLDR dic file inherits symbol levels from the files that come after it, i.e. English symbols and English CLDR. When the locale symbol file does not define a character's level, this makes it possible to:

Use the level defined in the English symbol file, if the character is defined there
Use "none" (coming from the English CLDR dic file) if the character is not defined in the English symbol file but is defined in CLDR.
Description of development approach
For all languages except English ("en"), generate the cldr.dic file with "-" in the level field, meaning that the level is inherited from previous files.
For English cldr.dic file, use "none" for the symbol level, as it was already before this PR.
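
The approach above can be sketched roughly as follows. This is a minimal illustration, not the actual nvda-cldr generator: the function names and the three-field, tab-separated entry layout (symbol, description, level) are assumptions made for the example.

```python
def cldr_level_field(locale: str) -> str:
    """Return the level field for a generated cldr.dic entry (hypothetical helper)."""
    # English is the last file in the fallback chain, so it states a level explicitly.
    # Every other locale writes "-" so that the level is inherited.
    return "none" if locale == "en" else "-"


def format_cldr_entry(symbol: str, description: str, locale: str) -> str:
    """Build one tab-separated entry: symbol, description, level (assumed layout)."""
    return "\t".join((symbol, description, cldr_level_field(locale)))


# Same symbol, two locales: only the English entry carries an explicit level.
print(repr(format_cldr_entry("?", "question mark", "en")))
print(repr(format_cldr_entry("?", "question mark", "hi")))
```

With this shape, the only per-locale difference in the generated files is the level field, which matches the description above.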
github-actions bot pushed a commit to nvaccess/nvda-cldr that referenced this issue Dec 18, 2022
Commit message:
Use symbolLevel=none only for English; for other languages, inherit it from the level below. (#4)
Fixes nvaccess/nvda#14417

CyrilleB79 added a commit to CyrilleB79/nvda that referenced this issue Dec 19, 2022
…evel is set to none.

Fixes nvaccess#14417

Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4
CyrilleB79 added a commit to CyrilleB79/nvda that referenced this issue Dec 19, 2022
… punctuation level

Fixes nvaccess#14417

Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4
@CyrilleB79
Collaborator

@seanbudd , this issue has been closed a bit too early. It should be closed when #14459 is merged instead. Thanks.

@CyrilleB79
Collaborator

@Sanjog1605, as a Hindi speaker, you may want to test this build to validate #14459. I have done some tests but am not a Hindi speaker at all.

@seanbudd seanbudd reopened this Dec 19, 2022
@nvaccessAuto nvaccessAuto added this to the 2023.1 milestone Dec 21, 2022
seanbudd pushed a commit that referenced this issue Dec 21, 2022
… punctuation level (#14459)

Fixes #14417

Summary of the issue:
Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place.
But there is a Hindi CLDR file. Thus the symbol level for symbols such as common punctuation (dot, question mark, etc.) is the one from CLDR, i.e. none.
This is not appropriate, and it would be better to take advantage of the symbol levels that are defined in the English symbol file.

Description of user facing changes
CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach
Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.
@Sanjog1605
Author

Sanjog1605 commented Dec 23, 2022 via email

@CyrilleB79
Collaborator

@seanbudd please reopen this issue since #14459 was reverted.

@Sanjog1605
Author

I confirm that the given NVDA version for testing solves the problem of punctuation reporting for Hindi language. Thank you so much for solving the problem so quickly. Will this solution come out in the 2023 version of NVDA? As this is the snapshot version, I have not given this to the students. A tentative time frame of the release will be much appreciated.

@Sanjog1605
Author

I have updated NVDA to version 2023.1 beta 1. When I tested for the mentioned issue, NVDA still reads punctuation while reading Hindi text. However, in the NVDA alpha version given to me for testing, everything is working fine. Looking at the version number, I thought that the issue would be resolved in the 2023 version. Please let me know the approximate implementation time and in which NVDA version the fix will be included, as many students are still struggling to read Hindi.

@CyrilleB79
Collaborator

Hi @Sanjog1605

Unfortunately the fix in #14459 caused an issue for Turkish (see #14473), so it has been reverted.
Personally, I have no visibility on when this issue can be resolved, since I have not been able to reproduce #14473 on my side and have not yet received the additional information requested in #14558.

@seanbudd
Member

Hi @CyrilleB79, without sufficient information from this reporter, let's re-open this PR against master. If the issue is reproducible, we will get a valid report by the next release.

@XLTechie
Collaborator

XLTechie commented Feb 21, 2023 via email

@Adriani90
Collaborator

Or maybe a global plugin that ships with NVDA and applies only when the NVDA language is set to Hindi? Is that possible?

@Sanjog1605
Author

@seanbudd, if you need anything more from my side, please do let me know. I urge the NVDA team to work on this issue, as many Hindi speakers and students who are studying in Hindi medium schools are affected by it. The reason this came up now and not in the past is that the students have only just started using PCs in their studies, with the intervention of Bookshare in India. Bookshare, as you might know, is an e-library for persons with print disabilities which, apart from providing accessible school textbooks to students in India, also runs a digital literacy program where we teach students how to use PCs and Android phones to study. This is how the issue came up: many students reported to me that NVDA reads some of the punctuation marks while reading Hindi.

seanbudd pushed a commit that referenced this issue Mar 23, 2023
… punctuation level (2nd attempt) (#14558)

A first PR (#14459) had been merged to fix #14417. Unfortunately an issue was found (see #14473) so it has been reverted in #14477.

This PR is a second attempt to fix #14417 without causing #14473. It will remain a draft until I can have more information on #14473 from @OzancanKaratas, as requested in #14473 (comment), or from anyone else able to reproduce.

Link to issue number:
Fixes #14417

Summary of the issue:
Preliminary note for review
Keep in mind the following: in NVDA with CLDR enabled and with no custom user symbol defined, symbol level for symbol X is defined as follows:

  1. Look at the locale symbol file:
     If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  2. Look at the locale CLDR file:
     If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  3. Look at the English symbol file:
     If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, look at the next file.
  4. Look at the English CLDR file:
     If X is defined in this file and a symbol level is defined for X, then this level applies for X. Else, use the default symbol level (don't remember if it is None or All).
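
The lookup order above can be sketched as a simple fallback chain. This is only an illustration under stated assumptions: the `resolve_level` function, the dict-per-file representation, and the sample levels are hypothetical and are not NVDA's actual implementation. The default is left as a parameter because its real value is not known here.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical representation: one mapping per file, symbol -> level.
# None stands for "no level defined here" (written as "-" in a dic file).
SymbolFile = Mapping[str, Optional[str]]


def resolve_level(symbol: str, files: Sequence[SymbolFile], default: str) -> str:
    """Return the level from the first file that defines one for the symbol."""
    for entries in files:
        level = entries.get(symbol)
        if level is not None:
            return level
    return default


# Illustrative data only; real files hold many more entries and other levels.
locale_symbols: SymbolFile = {}            # empty locale symbol file
locale_cldr: SymbolFile = {"?": None}      # defined, but level inherited
english_symbols: SymbolFile = {"?": "all"}
english_cldr: SymbolFile = {"?": "none"}
chain = [locale_symbols, locale_cldr, english_symbols, english_cldr]

# "none" is only a placeholder default for this example.
print(resolve_level("?", chain, default="none"))
```

In this sketch, a locale CLDR entry with an explicit level would win over the English symbol file, whereas an entry without a level lets the lookup continue down the chain.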
Description of the issue
Hindi has no symbols defined in its symbol file, only a copyright header; it seems that the file was prepared for translation but no actual symbol translation took place. But there is a Hindi CLDR file.

Currently, CLDR files are generated with level "None" for all symbols.

Usually, in locales with a CLDR file and a normal symbol file, less common characters that are only in CLDR are reported at level None, i.e. whatever the punctuation level setting of the user. But common punctuation symbols (dot, question mark, etc.) are added by translators in the locale symbol file, which allows these symbols to be reported at a higher punctuation level.

For Hindi (or any language with no symbols currently translated), all the characters present in the CLDR file are reported at "None" level and above (i.e. at any level), because the level is not redefined in the locale (Hindi) symbol file.

In such a situation, using the level of the locale CLDR (None) is not a good strategy. It would be better to take advantage of the levels defined for the symbols in the English symbol file.
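
To make the "None level and above" behaviour concrete, here is a small sketch of the comparison involved. The `Level` enum and `is_spoken` function are hypothetical names for illustration; the numeric values are arbitrary and only their relative order matters.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical values for illustration; only the ordering is meaningful.
    NONE = 0
    SOME = 1
    MOST = 2
    ALL = 3


def is_spoken(symbol_level: Level, user_level: Level) -> bool:
    """A symbol is announced when the user's level is at or above the symbol's level."""
    return user_level >= symbol_level


# A symbol whose level is NONE is announced even when the user selects "none".
print(is_spoken(Level.NONE, Level.NONE))
# A symbol whose level is ALL stays silent when the user selects "none".
print(is_spoken(Level.ALL, Level.NONE))
```

This is why inheriting a higher level from the English symbol file, rather than taking None from the locale CLDR file, silences common punctuation at the "none" setting.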

Description of user facing changes
CLDR data will be available for languages which had no symbol file (am, et, kk, ne, th, ur) or an empty symbol file (hi). For these languages, since there is no locale symbol file definition, the level defined in the English symbol file will be honoured.

Description of development approach
Update nvda-cldr repository to get the changes implemented in nvaccess/nvda-cldr#4.
@nvaccessAuto nvaccessAuto added this to the 2023.2 milestone Mar 23, 2023
7 participants