Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly report accented letters after cap #11977

Merged
merged 14 commits into from Apr 21, 2021

Conversation

CyrilleB79
Copy link
Collaborator

@CyrilleB79 CyrilleB79 commented Dec 27, 2020

Link to issue number:

Fixes #11948

Summary of the issue:

As described in #11948, some letter with diacritic is not spelt correctly with some synth if "Say cap before capitale" is checked, e. g. 'À' is spelt 'cap A' instead of 'cap A grave'.
Looking at getSpellingSpeech function in speech\__init__.py, it appears that this is due to the fact that the string 'cap À' is passed to the synth in the speech sequence in such a case. But the synth is only asked to use character mode for the strings whose length is 1. This is wrong in this case, because 'cap' should be spoken with character mode off, but 'À' should be spoken with character mode on.

Description of how this pull request fixes the issue:

In getSpellingSpeech, in the string sequence, I have separated the 'cap' message string from the character to be spelt. If the character to be spelt is of length 1, i.e. not replaced by symbol description or modified pronunciation, character mode is activated. If a string in the speech sequence has a length > 1, because it is a capitalization message or because it corresponds to a character description or symbol replacement, character mode is disabled.

Note that the 'cap' message can be of two types:

  • with 'cap' indication before as in English: 'cap %s'
  • with 'cap' indication after as in French: '%s majuscule'
    Thus this message is split in 2 parts in the speech sequence.

I have also taken advantage of the rework I had to do in getSpellingSpeech to split this function that was too complex. I have thus removed the corresponding flake8 exception (noqa comment).

Testing performed:

  • Tested with 'Œ' character in French that it was pronounced as a character when spelt with eSpeak and with OneCore, what was not the case before this PR.
  • Checked the speech sequence in the log: 1-character long string are correctly preceded by a CharacterMode(True) command if not already in character mode.

Known issues with pull request:

  • This PR does not fix all character spelling issues. E.g. OneCore seems not to spell 'Á' correctly even if set in char mode, but the 'Say cap before capitale' option has not impact on this issue.
  • This PR assumes that the translated capitalization message has 0 or at least 2 characters before the '%s'; and similarly, the translated capitalization message should have 0 or at least 2 characters after the '%s'. Should it not be the case, some parts of this message would be spelt by the synth instead of being normally read. I have checked with a python script
    that this assumption is verified in all the existing translations of NVDA.
    If this does not seems robust enough, I may add a unit test such as the check_pot to check in all the .po files that this condition is met. Just let me know.

Change log entry:

Section: Bug fixes

NVDA should no longer spell incorrectly some letters with accents or other diacritic if 'Say cap before capitale' option is checked. (#11948)

Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really helpful if this added unit tests for these modified functions.

source/speech/__init__.py Outdated Show resolved Hide resolved
source/speech/__init__.py Outdated Show resolved Hide resolved
source/speech/__init__.py Outdated Show resolved Hide resolved
@CyrilleB79
Copy link
Collaborator Author

It would be really helpful if this added unit tests for these modified functions.

OK, I will have a look at it.
Since getSpellingSpeech has been split in various functions, do you expect unit tests only on getSpellingSpeech with various use cases to test all the branches (use cases) of the old getSpellingSpeech?
Or do you also expect unit tests on the newly created helper functions called by getsSpellingSpeech (_getSpellingSpeechAddCharMode, _getSpellingCharWithCaseInfo, _getSpellingSpeechWithoutCharMode)?

Passing this PR as draft while developing unit tests.

@CyrilleB79 CyrilleB79 marked this pull request as draft March 25, 2021 10:44
@feerrenrut
Copy link
Contributor

Since getSpellingSpeech has been split in various functions, do you expect unit tests only on getSpellingSpeech with various use cases to test all the branches (use cases) of the old getSpellingSpeech?
Or do you also expect unit tests on the newly created helper functions called by getsSpellingSpeech (_getSpellingSpeechAddCharMode, _getSpellingCharWithCaseInfo, _getSpellingSpeechWithoutCharMode)?

It's probably clearer to test the three internal functions separately.

@CyrilleB79
Copy link
Collaborator Author

CyrilleB79 commented Apr 4, 2021

Hello @feerrenrut

I am trying to set up unit test for the new functions. I am not used to unit test development and have not ever done it for NVDA. Thus, I need some guidance.

I have created a test_speech.py file containing a Test_getSpellingSpeechAddCharMode class.
This class contains a test_symbolNameAtTheEnd function as follow:

def test_symbolNameAtTheEnd(self):
	seq = makeGen(['h', EndUtteranceCommand(), 'e', EndUtteranceCommand(), 'l', EndUtteranceCommand(), 'l', EndUtteranceCommand(), 'o', EndUtteranceCommand(), 'bang', EndUtteranceCommand()])
	expected = repr([CharacterModeCommand(True), 'h', EndUtteranceCommand(), 'e', EndUtteranceCommand(), 'l', EndUtteranceCommand(), 'l', EndUtteranceCommand(), 'o', EndUtteranceCommand(), CharacterModeCommand(False), 'bang', EndUtteranceCommand()])
	output = _getSpellingSpeechAddCharMode(seq)
	self.assertEqual(repr(list(output)), expected)

When running the unit test, I get the following error:

Traceback (most recent call last):
 File "D:\Cyrille\Dev\GIT\nvda\tests\unit\test_speech.py", line 29, in test_symbolNameAtTheEnd
 self.assertEqual(repr(list(output)), expected)
 File "D:\Cyrille\Dev\GIT\nvda\source\speech\__init__.py", line 222, in _getSpellingSpeechAddCharMode
 synthConfig = config.conf["speech"][synth.name]
 AttributeError: 'NoneType' object has no attribute 'name'

There is probably some sort of stub in unit tests that allow to avoid having a real synth driver.
Where and how should I create a synth stub? The same for config that will probably be missing.

Also, any feedback is welcome. All this is quite new for me.

Thanks for your help.

@feerrenrut
Copy link
Contributor

Ok, I see the issue, several of these newly introduced methods are coupled to config. Rather than look up the synth driver, and config from directly in the method, pass them in as arguments. If this is done for all of these functions then it is only the "glue" function getSpellingSpeech that is coupled to config. This will also make it easier to be sure you have tested all the combinations.

The test would become like:

def test_symbolNameAtTheEnd(self):
	seq = makeGen([
		'h', EndUtteranceCommand(),
		'e', EndUtteranceCommand(),
		'l', EndUtteranceCommand(),
		'l', EndUtteranceCommand(),
		'o', EndUtteranceCommand(),
		'bang', EndUtteranceCommand()
	])
	expected = repr([
		CharacterModeCommand(True),
		'h', EndUtteranceCommand(),
		'e', EndUtteranceCommand(),
		'l', EndUtteranceCommand(),
		'l', EndUtteranceCommand(),
		'o', EndUtteranceCommand(),
		CharacterModeCommand(False),
		'bang', EndUtteranceCommand()
	])
	output = _getSpellingSpeechAddCharMode(seq, useSpellingFunctionality = True)
	self.assertEqual(repr(list(output)), expected)

You would also add a test for useSpellingFunctionality = False

@CyrilleB79 CyrilleB79 marked this pull request as ready for review April 10, 2021 22:14
@CyrilleB79
Copy link
Collaborator Author

@feerrenrut, I have added unit tests for _getSpellingSpeechAddCharMode and _getSpellingCharAddCapNotification.

Regarding _getSpellingCharAddCapNotification, it would be interesting to add another test case for languages such as French where the 'cap %s' message is translated with %s before ('%s majuscule' in French). I guess I should override the _ function in some way. How should I do?

Regarding _getSpellingSpeechWithoutCharMode it contains the majority of the code that was in getSpellingSpeech and that I did not modified. Would you recommend to add unit tests for this function also? Passing synth config parameters as additional arguments do not seem to simplify the code. What do you think?
Any other idea of refactor?

Thanks!

@feerrenrut
Copy link
Contributor

it would be interesting to add another test case for languages such as French where the 'cap %s' message is translated with %s before ('%s majuscule' in French). I guess I should override the _ function in some way. How should I do?

Yes. Since you know that there are variations that must be handled, you could do it like that. Rather than call it a French test, which implies that locale is being set, I'd say "test variations on placeholder position (which occurs in other languages like French)". It would be good the code didn't have to deconstruct those strings for these variations. I don't fully understand the need for it and was hoping that the unit tests will demonstrate it to me.

Regarding _getSpellingSpeechWithoutCharMode it contains the majority of the code that was in getSpellingSpeech and that I did not modified. Would you recommend to add unit tests for this function also?

Yes, please do. Since you have taken the time to understand it, and what the expected output is, it would be great to document that via unit tests.

Passing synth config parameters as additional arguments do not seem to simplify the code. What do you think?
Any other idea of refactor?

While it might not immediately seem like it, I think it is a good thing to expose the underlying complexity. The fact is that this code depends on many variables. It is better for this to be explicit. One additional advantage is that it is then clear that it doesn't also modify config.

@CyrilleB79
Copy link
Collaborator Author

it would be interesting to add another test case for languages such as French where the 'cap %s' message is translated with %s before ('%s majuscule' in French). I guess I should override the _ function in some way. How should I do?

Yes. Since you know that there are variations that must be handled, you could do it like that. Rather than call it a French test, which implies that locale is being set, I'd say "test variations on placeholder position (which occurs in other languages like French)". It would be good the code didn't have to deconstruct those strings for these variations. I don't fully understand the need for it and was hoping that the unit tests will demonstrate it to me.

The 'cap %s' string has to be deconstructed in some way because:

  • the word 'cap' has to be pronounced as word by the synth (i.e. has to follow a CharacterModeCommand(False) command)
  • '%s' has to be pronounced as character by the synth (i.e. has to follow CharacterModeCommand(True) command).
    I agree this seems ugly code but I did not find a more elegant way to do this.

Regarding _getSpellingSpeechWithoutCharMode it contains the majority of the code that was in getSpellingSpeech and that I did not modified. Would you recommend to add unit tests for this function also?

Yes, please do. Since you have taken the time to understand it, and what the expected output is, it would be great to document that via unit tests.

OK, will do.

@AppVeyorBot
Copy link

See test results for failed build of commit 6b11603f1b

@CyrilleB79
Copy link
Collaborator Author

As requested, I have refactored a bit _getSpellingSpeechWithoutCharMode and implemented some unit tests for it.

For now I have skipped test_capNotificationsWithPlaceHolderAfter since, as explained before, I do not know how to implement it correctly. The way it is implemented however gives the idea of the goal of this test.
If you have a suggestion to implement it correctly, I will be happy to include it. If not, I may remove this test.

Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have given some suggestions, let me know if you'd like more guidance.

source/speech/__init__.py Outdated Show resolved Hide resolved
tests/unit/test_speech.py Outdated Show resolved Hide resolved
tests/unit/test_speech.py Outdated Show resolved Hide resolved
@AppVeyorBot
Copy link

See test results for failed build of commit bf01e4df41

CyrilleB79 and others added 3 commits April 16, 2021 00:01
Install a fake gettext translation class while overriding translation return values
@CyrilleB79
Copy link
Collaborator Author

@feerrenrut, thanks for your last commit.
I have reviewed your changes that seems OK to me. I have closed the 2 last open conversations with my comments.

If you do not have any other point to comment, this PR is ready from my side.

Use a tear down method (run for each test) instead
@CyrilleB79 CyrilleB79 requested a review from a team as a code owner April 21, 2021 08:02
Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CyrilleB79

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants