Sentence navigation #16384
See test results for failed build of commit a1f18e45a7 |
I have a technical question regarding how to deal with non-breaking prefixes.
Non-breaking prefixes are obviously language-dependent. So far I have just hardcoded English non-breaking prefixes, but that obviously needs to change before merging. I think we should route non-breaking prefixes through the translation system so that they get translated into all NVDA languages. But I see a few issues here to call out:
|
Thanks @mltony, Based on Sentence-Nav I want to know:
|
No. But it should be very straightforward to add it later provided NVAccess is aligned.
It currently uses the same regex as for text paragraph navigation, which is configurable. But it's only a single regex.
I don't override MSWord sentence navigation here. It should be easy to override later - again as long as NVAccess is aligned. In general I think getting this PR approved and merged is going to be somewhat challenging because there are many unknowns regarding non-breaking prefixes. So at this point I would like to focus on core sentence navigation, and postpone bells and whistles to later PRs. |
@mltony wrote:
At end of line, after two spaces:
|
How can we send that tuple to translators? If translation system supports tuples of variable length then should be doable, but please show me an example. |
@mltony I have no idea. Nor do I know how you'll move a long and fragile regular
expression string through the translation process.
Especially with the translation system the way it is now, I know even
less about it than I did before.
Apologies for the inapplicable idea.
|
This is now ready for review. |
Hi @mltony
There seems to be no language difference, I used the English and Chinese versions of the NVDA User Guide for testing. |
@cary-rowen, try resetting your value for text paragraph regex. It should work with the default value. |
See test results for failed build of commit 24be7c8425 |
@mltony did you test this in browse mode in pdf documents, with Adobe reader, outlook messages in MS outlook and browse mode in MS Word? Does it give an error in browse mode in MS Excel? Or did you implement a message that this is not supported there? |
@mltony Re non-breaking prefixes, it might be worth adding all so-called "period words" to a database first and importing them from there. Maybe there could be a file similar to symbols.dic called periodwords.dic. At least with such a file you could pass it to the translation system and we would get all of the international prefixes translated, while translators can also add their language-specific period words to their own language's periodwords.dic file. |
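To make the periodwords.dic idea a bit more concrete, here is a minimal sketch of a loader. The file format (one entry per line, `#` starting a comment) and the name `parsePeriodWords` are assumptions for illustration only; they are not the format of symbols.dic and not anything that exists in NVDA.

```python
from typing import Iterable, List


def parsePeriodWords(lines: Iterable[str]) -> List[str]:
	"""Parse a hypothetical periodwords.dic.
	Assumed format: one entry per line, anything after '#' is a comment.
	"""
	words: List[str] = []
	for line in lines:
		# Drop the comment part, then surrounding whitespace.
		entry = line.split("#", 1)[0].strip()
		# Skip blank lines and duplicates, keep first-seen order.
		if entry and entry not in words:
			words.append(entry)
	return words


sample = ["# English period words", "Mr", "Dr  # doctor", "", "Mr"]
print(parsePeriodWords(sample))
```

A translator would then only edit a plain list for their language, and the regex would be generated from it rather than translated as a string.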
|
See test results for failed build of commit ee7ed21aad |
Hi @mltony Thank you for your efforts on this. I will continue to pay attention to this feature and provide the following test work for your reference.
Test in Windows 10 Notepad:
Steps
Actual behavior:
expected behavior
Test in NVDA Speech Viewer or Bookworm:
Sentence navigation is not working. If you use the Sentence-nav add-on, you need to enable overriding MS Word's sentence navigation behavior in settings. Thanks |
|
See test results for failed build of commit cf90f52694 |
Yes.
OK, regarding this behavior, if it is very difficult then I don't think there is a need to put too much effort into it.
Yes, I reproduce this on Notepad on Windows 10 as well. As an improvement on step 4, I think another behavior might be better: Personally, I would like it to report the first sentence, ie: when there are no more sentences our caret should be placed at the beginning of the last sentence, no matter how many times I press Alt+DownArrow. This is consistent with the paragraph navigation pattern in document navigation.
That's weird, can you confirm this again, it seems to me that Speech Viewer behaves the same as Bookworm, i.e. sentence navigation doesn't work at all. The test I performed was with Speech Viewer open: I asked NVDA to report the Chinese sentences given in the comments above. Then try sentence navigation in Speech Viewer. |
Hi @mltony While reminding you not to forget the above comment, I thought of another question. I'm wondering, is it necessary that we share the regex option with paragraph navigation? Is there a way we can make the regex options more understandable? Thanks |
I have not reviewed this PR. But, contrary to what I wrote earlier, I now understand why the regexp for text paragraph navigation and the one for sentence navigation may need to be separately configurable. Of course, since we do not have another criterion now, we may also wait for a real need before duplicating these fields. Re the possibility to better understand the regexp, we may use a verbose regexp in the field, rather than a normal one. Verbose regexps ignore comments, spaces and newline characters (unless escaped). We should however take into account that verbose regexps are less well known than normal ones; I discovered this option only some weeks ago... |
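For readers who have not met verbose regexps, a small sketch of the idea using Python's `re.VERBOSE` flag. The pattern below is a made-up example, not NVDA's default paragraph or sentence regex.

```python
import re

# Hypothetical pattern, written compactly and then in verbose form.
compact = re.compile(r"[.!?]\s+[A-Z]")

verbose = re.compile(
	r"""
	[.!?]   # a sentence-ending punctuation mark
	\s+     # followed by some whitespace
	[A-Z]   # and then an uppercase letter
	""",
	re.VERBOSE,
)

text = "It works. Try it! ok"
compactMatches = [m.group() for m in compact.finditer(text)]
verboseMatches = [m.group() for m in verbose.finditer(text)]
print(compactMatches == verboseMatches)
```

Both forms match the same text; the verbose one simply lets each part of the expression carry its own comment, which may help users who edit the field.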
I agree that, in the case of sentence navigation, the caret should always stay at the beginning of the sentence, as it does with paragraph navigation when the paragraph is handled by NVDA in document navigation settings, no matter if it is the first or the last sentence or any sentence in the middle. |
No, I haven't yet. I just have the above thoughts from the perspective of ease of understanding. |
To be honest, I don’t really like using speech to indicate boundaries. I prefer to use beeps or play sound to indicate boundaries. A similar request has already been made by #13612. |
By the way, the sentence-nav add-on is better, and there is also a beep when navigating from one paragraph to another. |
There is no consistency in paragraph navigation between apps. LibreOffice behaves differently compared to MS Word, for example. NVDA paragraph navigation is the most consistent paragraph handling we have so far.
On 24.04.2024 at 17:36, Rowen wrote:
Instead of repeating the first or the last sentence, there could be a message reporting "no next sentence" or "no previous sentence". This is also consistent with NVDA paragraph navigation.
To be honest, I don’t really like using speech to indicate boundaries.
At this point, NVDA's paragraph navigation is inconsistent with other behaviors, such as paragraph navigation in Word.
I prefer to use beeps or play sound to indicate boundaries.
A similar request has already been made by #13612.
|
Indicating boundaries by sounds is a valid issue, but out of scope for this PR, I guess.
|
You confuse me: this scheme has never been done elsewhere except for what NVDA implemented in #13798, using speech to indicate the last and first paragraphs. Where does your so-called "consistency" come from in this case?
What we have been discussing is how to prompt the user when navigating to the first and last sentences during sentence navigation. Speech or playing a sound are the two options under this issue. The same was done in #13798, so I understand even less what you mean by out of scope. |
What I wanted to say is that when paragraphs are handled by applications, in some applications NVDA repeats the first or last paragraph, in some applications NVDA is silent, in some applications a windows bing sound is heard, etc. For now, the boundary reporting for paragraphs in NVDA is consistent all over the place when paragraphs are handled by NVDA. |
Minor recommended change to user guide text, otherwise docs read well.
|
||
This combo box allows you to select whether NVDA will try to identify and reconstruct sentences spanning across multiple paragraphs when navigating by sentence via``alt+upArrow`` and ``alt+downArrow``. | ||
Sentence reconstruction comes in handy in two typical use cases: | ||
- In most Most PDF files text is incorrectly split into paragraphs. Every line typically denotes a paragraph and thus most sentences span many adjacent paragraphs and in order to identify complete sentences NVDA must look into adjacent paragraphs. |
Duplicate word: most (In most Most PDF files...). Also "Every line typically denotes a...." should be on a new line. I would also split into two sentences as:
Every line typically denotes a paragraph and thus most sentences span many adjacent paragraphs.
In order to identify complete sentences NVDA must look into adjacent paragraphs.
I have done a first review of a part of the code and listed comments/suggestions here.
However, I'll also provide more important remarks in my next comment.
@@ -2645,9 +2647,60 @@ def makeSettings(self, settingsSizer: wx.BoxSizer) -> None: | |||
) | |||
self.bindHelpEvent("ParagraphStyle", self.paragraphStyleCombo) | |||
|
|||
# Translators: This is a label for sentence reconstruction in the document navigation dialog | |||
label = _("&Sentencew Sentence reconstruction:") |
Fix typo here:
label = _("&Sentencew Sentence reconstruction:") | |
label = _("&Sentence reconstruction:") |
# Look behind clause ensures that we have a text character before text punctuation mark. | ||
# We have a positive lookBehind \w that resolves to a text character in any language, | ||
# coupled with negative lookBehind \d that excludes digits. | ||
lookBehind=r'(?<=\w)(?<!\d)', | ||
plb=r"(?<=\w)(?<!\d)", |
`nlb` is quite unclear. Could we use a more explicit token name?
E.g. `letterCharBefore` indicates that it is a look behind assertion (with `Before`) and what it should contain (a letter character).
plb=r"(?<=\w)(?<!\d)", | |
letterCharBefore=r"(?<=\w)(?<!\d)", |
Also, is the negative look behind assertion needed? If we already know that the character before is a letter (`\w`), we probably do not need to test that it is not a digit (`\d`). Or am I missing something?
So the expression would rather become as follows:
plb=r"(?<=\w)(?<!\d)", | |
letterCharBefore=r"(?<=\w)", |
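A quick check that may help settle the question above: in Python's `re` module, `\w` also matches digits (and underscore), so the two assertions are not equivalent. The sketch below uses a bare `\.` after the look-behind purely for illustration; it is not the full expression from the PR.

```python
import re

# Look-behind as in the PR: a word character that is not a digit.
withDigitCheck = re.compile(r"(?<=\w)(?<!\d)\.")
# Simplified look-behind as suggested above.
withoutDigitCheck = re.compile(r"(?<=\w)\.")

text = "Version 3.5 is out."
stopsWithCheck = [m.start() for m in withDigitCheck.finditer(text)]
stopsWithoutCheck = [m.start() for m in withoutDigitCheck.finditer(text)]
print(stopsWithCheck)
print(stopsWithoutCheck)
```

With the digit check only the final dot is reported; without it the dot inside `3.5` is reported as well, so dropping `(?<!\d)` would change behavior for decimal numbers.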
# Language-specific exceptions: characters suggesting that the following dot is not indicative | ||
# of a sentence stop. | ||
# This is a negative look-behind and will be inserted later when language is specified. | ||
nlb="{nonBreakingRegex}", |
Probably the same here: better to use something more explicit indicating what the expression matches.
nlb="{nonBreakingRegex}", | |
nonBreakingRegex="{nonBreakingRegex}", |
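For context, a minimal sketch of how a `{nonBreakingRegex}` placeholder could be filled from a list of prefixes. The helper name `buildNonBreakingRegex`, the template and the prefix list are all hypothetical and not taken from the PR. One look-behind is emitted per prefix because, as far as I know, Python's `re` requires look-behind patterns to have a fixed width, so a single `(?<!Mr|Prof)` would be rejected.

```python
import re
from typing import Iterable


def buildNonBreakingRegex(prefixes: Iterable[str]) -> str:
	"""Return one negative look-behind per prefix (hypothetical helper)."""
	return "".join(f"(?<!{re.escape(prefix)})" for prefix in prefixes)


# Hypothetical, much simplified sentence-stop template.
template = r"(?<=\w){nonBreakingRegex}\.\s+"
pattern = re.compile(
	template.format(nonBreakingRegex=buildNonBreakingRegex(["Mr", "Dr", "e.g"])),
)

text = "Dr. Smith met Mr. Jones. They talked."
stops = [m.start() for m in pattern.finditer(text)]
print(stops)
```

Only the dot after "Jones" is treated as a sentence stop here; the dots after "Dr" and "Mr" are excluded by the generated look-behinds.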
try: | ||
sentenceNavigation.getSentenceStopRegex(nonBreakingPrefixes=nbp) | ||
except re.error as e: | ||
log.debugWarning("Failed to compile text paragraph regex", exc_info=True) |
log.debugWarning("Failed to compile text paragraph regex", exc_info=True) | |
log.debugWarning("Failed to compile sentence stop regex", exc_info=True) |
except re.error as e: | ||
log.debugWarning("Failed to compile text paragraph regex", exc_info=True) | ||
gui.messageBox( | ||
# Translators: Message shown when invalid text paragraph regex entered |
# Translators: Message shown when invalid text paragraph regex entered | |
# Translators: Message shown when invalid sentence stop regex entered |
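The validation pattern under review here (compile the user-supplied expression and report a failure instead of saving it) can be sketched in isolation as follows. `validateRegex` is a hypothetical helper, not a function from the PR; the real code logs and shows a message box instead of returning a string.

```python
import re
from typing import Optional


def validateRegex(pattern: str) -> Optional[str]:
	"""Return None if the pattern compiles, otherwise an error description."""
	try:
		re.compile(pattern)
	except re.error as e:
		# re.error carries a human-readable message describing the problem.
		return f"Invalid regular expression: {e}"
	return None


print(validateRegex(r"(?<=\w)\."))
print(validateRegex("(unclosed"))
```

Keeping the wording of the log entry and the user-facing message specific to the sentence stop regex, as suggested above, makes it clear which field failed.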
@@ -2502,6 +2502,26 @@ Note that this paragraph style cannot be used in Microsoft Word or Microsoft Out | |||
|
|||
You may toggle through the available paragraph styles from anywhere by assigning a key in the [Input Gestures dialog #InputGestures]. | |||
|
|||
==== Sentence reconstruction = ====[SentenceReconstruction] |
Typo:
==== Sentence reconstruction = ====[SentenceReconstruction] | |
==== Sentence reconstruction ====[SentenceReconstruction] |
|
||
This combo box allows you to select whether NVDA will try to identify and reconstruct sentences spanning across multiple paragraphs when navigating by sentence via``alt+upArrow`` and ``alt+downArrow``. | ||
Sentence reconstruction comes in handy in two typical use cases: | ||
- In most Most PDF files text is incorrectly split into paragraphs. Every line typically denotes a paragraph and thus most sentences span many adjacent paragraphs and in order to identify complete sentences NVDA must look into adjacent paragraphs. |
I have already seen such PDFs, not "most of them" though.
My question is: is this issue coming from the PDF only, or also from how NVDA renders it in the virtual buffer? Said otherwise: are the words also split into separate lines with Jaws and Narrator? If not, could NVDA implement a fix that would not even be visible from a user point of view?
Paragraph reconstruction is a nice work around provided for Sentence Nav (add-on or core feature). But have we investigated the root cause of this and decided that we cannot provide better than this workaround?
def _displayStringLabels(self): | ||
return { | ||
# Translators: Sentence reconstruction mode | ||
SentenceReconstructionFlag.NEVER: _("Never reconstruct sentences across paragraphs"), |
In the GUI, the label is already "Sentence reconstruction"; thus, there is no need to duplicate these words in the combo-box's values.
SentenceReconstructionFlag.NEVER: _("Never reconstruct sentences across paragraphs"), | |
SentenceReconstructionFlag.NEVER: _("Never"), |
# Translators: Sentence reconstruction mode | ||
SentenceReconstructionFlag.NEVER: _("Never reconstruct sentences across paragraphs"), | ||
# Translators: Sentence reconstruction mode | ||
SentenceReconstructionFlag.SAME_STYLE_PARAGRAPHS: _("Reconstruct sentences across same style paragraphs"), |
Idem:
SentenceReconstructionFlag.SAME_STYLE_PARAGRAPHS: _("Reconstruct sentences across same style paragraphs"), | |
SentenceReconstructionFlag.SAME_STYLE_PARAGRAPHS: _("Across same style paragraphs"), |
# Translators: Sentence reconstruction mode | ||
SentenceReconstructionFlag.SAME_STYLE_PARAGRAPHS: _("Reconstruct sentences across same style paragraphs"), | ||
# Translators: Sentence reconstruction mode | ||
SentenceReconstructionFlag.ANY_PARAGRAPHS: _("Reconstruct sentences across any paragraphs"), |
Idem:
SentenceReconstructionFlag.ANY_PARAGRAPHS: _("Reconstruct sentences across any paragraphs"), | |
SentenceReconstructionFlag.ANY_PARAGRAPHS: _("Across any paragraphs"), |
Here are some more general remarks.
Test in Chrome's browse mode
Pressing
Documentation navigation panel
Same style of error as above.
In Notepad
The feature is not working, with nothing logged, in Notepad (Windows 10). Is it expected to work there?
Exception listing
The exceptions for sentence stop are listed for all expressions in all languages. I am not totally convinced by this approach. However, if we do not find a better solution and if it is validated by NV Access:
Unknown languages
The following languages are recognized as unknown in the settings panel:
IMO, something should be done to have fewer unrecognized languages. At least English variants should fall back to English (neutral), and the main Chinese languages should be recognized. More generally, if possible, we may log a warning and display the language code in case the language name is not recognized by Windows/NVDA, rather than defaulting to Unknown language. |
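The fallback suggested above could look roughly like this. It is only a sketch: `resolveLanguageName` and the `knownNames` mapping are hypothetical stand-ins, not NVDA's language handling code.

```python
from typing import Mapping


def resolveLanguageName(code: str, knownNames: Mapping[str, str]) -> str:
	"""Hypothetical lookup: exact code, then base language, then the raw code."""
	normalized = code.replace("-", "_")
	if normalized in knownNames:
		return knownNames[normalized]
	# Variants such as en_GB fall back to their base language (en).
	base = normalized.split("_", 1)[0]
	if base in knownNames:
		return knownNames[base]
	# Show the code itself rather than a generic "Unknown language".
	return code


names = {"en": "English", "zh_CN": "Chinese (Simplified)"}
print(resolveLanguageName("en_GB", names))
print(resolveLanguageName("zh-CN", names))
print(resolveLanguageName("xx_YY", names))
```

Showing the raw code as a last resort keeps the text fields distinguishable even when no friendly name is available.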
prettyLanguage = prettyLanguage or _("Unknown language") | ||
# Translators: This is the label for a textfield in the | ||
# Document Navigation settings panel. | ||
label = _("Non-breaking prefixes [%s]") |
Why is it called prefix?
label = _("Non-breaking prefixes [%s]") | |
label = _("Sentence stop exceptions [%s]") |
or something else similar or even more suitable.
@CyrilleB79, thanks for comments, I'll address them soon;
You would need to delete |
Hi @mltony - this is just a partial review, as I've already accumulated a large amount of feedback
return punctuationMarksRegex.search(info.text) is not None | ||
# sentence regex always matches beginning and end of string. We add some words at the end to test | ||
# whether there is third match in the middle, that would be suggestive of a complete sentence. | ||
text = info.text + " traiiling word" |
I would encourage investigating alternative approaches here. is there a case where this isn't just +1? can you describe it in more detail in a comment.
otherwise, fixing the typo would be good
text = info.text + " traiiling word" | |
text = info.text + " trailing word" |
| Next sentence | alt+downArrow | alt+downArrow | Moves the caret to the next sentence and announces it. | | ||
| Previous sentence | alt+upArrow | alt+upArrow | Moves the caret to the previous sentence and announces it. | |
please add code formatting to the shortcuts
@@ -8,6 +8,7 @@ What's New in NVDA | |||
== New Features == | |||
- New key commands: | |||
- New Quick Navigation command ``p`` for jumping to next/previous text paragraph in browse mode. (#15998, @mltony) | |||
- Sentence navigation via ``alt+Up/DownArrow``. (#16384, @mltony) |
this will need to be moved when the 2024.3 change log is created
self._rangeObj = sentence._rangeObj | ||
else: | ||
UIAUnit = UIAHandler.NVDAUnitsToUIAUnits[unit] | ||
self._rangeObj.ExpandToEnclosingUnit(UIAUnit) | ||
|
||
def move( | ||
self, | ||
unit: str, | ||
direction: int, | ||
endPoint: Optional[str] = None, | ||
): |
please add a return type here and docstring
# whether there is third match in the middle, that would be suggestive of a complete sentence. | ||
text = info.text + " traiiling word" | ||
matches = punctuationMarksRegex.findall(text) | ||
return len(matches) >= 3 |
Is this a change of behaviour, so that 3 sentences are now required for text to be considered a paragraph, rather than just 1?
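If I read the diff correctly, the `3` counts regex matches rather than sentences: two matches always come from the beginning and end of the string, and a third one indicates a sentence stop somewhere in between. A minimal sketch with a hypothetical stand-in regex (the real sentence-stop regex in the PR is more elaborate):

```python
import re

# Hypothetical stand-in: matches at the start of the string, at whitespace
# following sentence punctuation, and at the end of the string.
sentenceStopRegex = re.compile(r"^|(?<=[.!?])\s+|$")


def containsCompleteSentence(text: str) -> bool:
	# Without the padding, a final "." is followed by nothing, so only the
	# two boundary matches would ever be found.
	padded = text + " trailing word"
	return len(sentenceStopRegex.findall(padded)) >= 3


print(containsCompleteSentence("Hello world."))
print(containsCompleteSentence("Hello world"))
```

Under these assumptions a single complete sentence is still enough to pass the check; a comment in the code explaining the two boundary matches would probably avoid this confusion.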
@@ -253,6 +253,11 @@ def script_caret_nextSentence(self,gesture): | |||
self._caretMoveBySentenceHelper(gesture, 1) | |||
script_caret_nextSentence.resumeSayAllMode = sayAll.CURSOR.CARET | |||
|
|||
def script_speakCurrentSentence(self, gesture): |
please add a type hint
""" | ||
This function either retrieves current sentence when direction = 0, or | ||
moves to previous/next sentence when direction is not 0. | ||
Please note that this function can only retrieve immediate previous/immediate next sentences. |
""" | |
This function either retrieves current sentence when direction = 0, or | |
moves to previous/next sentence when direction is not 0. | |
Please note that this function can only retrieve immediate previous/immediate next sentences. | |
""" | |
This function either retrieves current sentence when direction = 0, or | |
moves to previous/next sentence when direction is not 0. | |
Please note that this function can only retrieve immediate previous/immediate next sentences. |
) | ||
|
||
def expandCurrentSentence(self, direction: int = 0) -> FindSentenceResult: | ||
""" |
please de indent this docstring
@@ -2645,9 +2647,60 @@ def makeSettings(self, settingsSizer: wx.BoxSizer) -> None: | |||
) | |||
self.bindHelpEvent("ParagraphStyle", self.paragraphStyleCombo) | |||
|
|||
# Translators: This is a label for sentence reconstruction in the document navigation dialog | |||
label = _("&Sentencew Sentence reconstruction:") |
label = _("&Sentencew Sentence reconstruction:") | |
label = _("&Sentence reconstruction:") |
@@ -0,0 +1,126 @@ | |||
# tests/unit/test_sentenceNavigation.py |
# tests/unit/test_sentenceNavigation.py |
OK. It is working now.
Would this be the opportunity to fix this? Or do you think it's better in a separate PR? In the latter case, it would be worth opening an issue describing your finding. |
I would advise fixing this as part of this PR, otherwise users will be confused because sentence navigation will be promoted as a global feature. Also @michaelDCurran, @seanbudd I really hope you will review this carefully before merging. Also, when reviewing, the following should be taken into account:
|
Link to issue number:
Closes #8518.
Summary of the issue:
Sentence navigation.
Description of user facing changes
`Alt+Down/UpArrow` now navigates by sentence in editables and in browse mode.
Description of development approach
- `documentNavigation\sentenceNavigation.py`. It is generic enough and works with different types of textInfos. I left some comments throughout to explain inner workings, but LMK if more clarification/explanation is needed. I ended up rewriting the algorithm (compared to SentenceNav add-on) from scratch - now the code should be cleaner and more elegant.
- `editableText.py`: I added a `speakCurrentSentence` script - not assigned any default keystroke. Hope that's ok - since SentenceNav users requested that in the past.
- `OffsetsTextInfo`: updated `_getUnitOffsets()` function to deal with `UNIT_SENTENCE`.
- `UIATextInfo`: updated `expand()` and `move()` methods to work with `UNIT_SENTENCE`.
- `browseMode.py`: updated `script_collapseOrExpandControl`: if current item is not collapsible, then treat alt+Up/Down as sentence navigation commands.
Testing strategy:
Known issues with pull request:
Ready for review.
This is a draft, but fully working. Feel free to test or even start reviewing. Things I plan to address in the coming few days:
- Figure out how to deal with language-specific non-breaking prefixes.
- Add documentation
- Add unit tests
Code Review Checklist: