Match whole word in pronunciation dictionaries #1704

Closed
nvaccessAuto opened this issue Aug 3, 2011 · 14 comments

@nvaccessAuto

Reported by dczajka on 2011-08-03 18:34
There should be a "match whole word only" option when creating entries in the pronunciation dictionaries. This can already be achieved with regular expressions, but it is a common enough requirement, even for people not well versed in regex syntax, that the dialog should provide it directly. As a stopgap, the User Guide could document the regex syntax for this as a brief example of basic regex use.
Blocking #2220, #4450
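
For readers looking for the regex workaround mentioned in the report, here is a minimal sketch using Python's `re` module, where `\b` matches at a word boundary. The sample text and replacement are made up for illustration and are not taken from NVDA.

```python
import re

# Hypothetical sample text and replacement, for illustration only.
text = "the cat sat on the catalog"

# A plain pattern matches anywhere, including inside a longer word.
anywhere = re.sub(r"cat", "feline", text)

# \b anchors the pattern at word boundaries, so only the whole word matches.
whole_word = re.sub(r"\bcat\b", "feline", text)

print(anywhere)
print(whole_word)
```

In a dictionary entry marked as a regular expression, the pattern field would presumably hold something like `\bcat\b`.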

@nvaccessAuto

Comment 4 by jteh on 2014-09-10 22:17
Suggested implementation for anyone who wants to give this a go:

  • Currently, the speech dict format contains two numeric fields at the end. One is a flag indicating case sensitivity and the other is a flag indicating whether the pattern is a regular expression. Rather than adding a new field, I suggest the last field accept an additional value (2) indicating a word match. In the code, you'll obviously need to treat this as an int instead of a bool and there should be int constants for the three types of pattern.
  • When a word pattern is handled, you'll need to build a regular expression which only matches at word boundaries. This should be simple enough.
  • In the GUI, the Regular expression check box should become a radio button to choose from the three types of pattern.
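
The first two points can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the attached patches: the constant names are the ones quoted in the later code review, the values 0 and 1 are assumed to carry over from the existing boolean flag, and `compile_entry` is a hypothetical helper.

```python
import re

# Assumed values: 0 and 1 mirror the old boolean "regular expression" flag,
# and 2 is the additional value proposed for a whole-word match.
ENTRY_TYPE_ANYWHERE = 0
ENTRY_TYPE_REGEXP = 1
ENTRY_TYPE_WORD = 2


def compile_entry(pattern, case_sensitive, entry_type):
    """Hypothetical helper: turn one dictionary entry into a compiled regex."""
    if entry_type == ENTRY_TYPE_REGEXP:
        # The user supplied a regular expression; use it unchanged.
        source = pattern
    elif entry_type == ENTRY_TYPE_WORD:
        # Escape the literal text and require a word boundary on both sides.
        source = r"\b" + re.escape(pattern) + r"\b"
    elif entry_type == ENTRY_TYPE_ANYWHERE:
        # Literal text that may match anywhere, even inside a longer word.
        source = re.escape(pattern)
    else:
        raise ValueError("unknown entry type: %r" % (entry_type,))
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(source, flags)


word_entry = compile_entry("cat", False, ENTRY_TYPE_WORD)
print(word_entry.sub("feline", "Cat catalog"))
```

One caveat with this approach: `\b` only behaves as expected when the pattern starts and ends with a word character, so entries that begin or end with punctuation might need different handling.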

@nvaccessAuto

Comment 5 by blindbhavya on 2014-09-11 10:21
Hi.
It would be great if someone could implement this.
I have felt its need quite a few times. I will CC myself so that I receive updates on any progress made on this ticket.
Thanks.

@nvaccessAuto

Attachment 0001-Added-Whole-Word-option-to-speech-dictionary.-1704.patch added by cannona on 2014-09-12 21:23
Description:
First patch.

@nvaccessAuto

Comment 8 by cannona on 2014-09-12 21:26
Would appreciate any feedback that anyone might have.

@nvaccessAuto

Comment 9 by cannona on 2014-09-12 21:33
Sorry, I just realized that there are still a couple of bugs in this. I will fix them and then upload an amended patch.

@nvaccessAuto

Attachment 0001-Added-Whole-Word-option-to-speech-dictionary.-1704.2.patch added by cannona on 2014-09-12 21:44
Description:
Second patch.

@nvaccessAuto

Comment 10 by cannona on 2014-09-12 21:46
Fixed. Sorry again for the confusion.

@nvaccessAuto

Comment 11 by jteh on 2014-09-15 22:18
Thanks! Very nice work... again! Code review:

+++ b/source/gui/settingsDialogs.py
@@ -1174,16 +1174,36 @@ class DictionaryEntryDialog(wx.Dialog):
...
> +       self.typeLabels = {
  • I'd probably make this a constant on the class, since it'll be used every time the dialog is used. You could call it TYPE_LABELS.
  • It'd be nice if the labels had keyboard accelerators; e.g. "&Anywhere", "Whole &word", "Regular &expression".
> +       self.typeLabelsOrdering = (speechDictHandler.ENTRY_TYPE_ANYWHERE, speechDictHandler.ENTRY_TYPE_WORD, speechDictHandler.ENTRY_TYPE_REGEXP)

This could be a class constant; same as above.

> +   def getType(self):
...
> +       if typeRadioValue == wx.NOT_FOUND:

It never hurts to be safe, but should this ever happen?


@@ -1209,11 +1229,16 @@ class DictionaryDialog(SettingsDialog):

...

> +       self.typeLabels = {

To avoid duplication, this could be a class constant based on DictionaryEntryDialog.TYPE_LABELS. However, you'll need to remove the accelerator bits. You could do something like this just below the class statement for DictionaryDialog:

    TYPE_LABELS = {t: l.replace("&", "") for t, l in DictionaryEntryDialog.TYPE_LABELS.iteritems()}
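
Putting the suggestions in this review together, a rough sketch might look like the following. The classes are plain stand-ins rather than the real wx dialogs, the numeric values of the constants are assumed, `TYPE_LABELS_ORDERING` is a hypothetical name, and `.items()` is used in place of `.iteritems()` so that the sketch runs on Python 3.

```python
# Assumed numeric values; only the constant names appear in the patch excerpt.
ENTRY_TYPE_ANYWHERE = 0
ENTRY_TYPE_REGEXP = 1
ENTRY_TYPE_WORD = 2


class DictionaryEntryDialog(object):
    """Stand-in for the real dialog class; no wx dependency here."""

    # Class constant: built once instead of on every dialog construction.
    # "&" marks the keyboard accelerator for each radio button label.
    TYPE_LABELS = {
        ENTRY_TYPE_ANYWHERE: "&Anywhere",
        ENTRY_TYPE_WORD: "Whole &word",
        ENTRY_TYPE_REGEXP: "Regular &expression",
    }
    # Hypothetical name for the display order of the radio buttons.
    TYPE_LABELS_ORDERING = (ENTRY_TYPE_ANYWHERE, ENTRY_TYPE_WORD, ENTRY_TYPE_REGEXP)


class DictionaryDialog(object):
    """Stand-in for the list dialog, which shows labels without accelerators."""

    TYPE_LABELS = {
        t: l.replace("&", "")
        for t, l in DictionaryEntryDialog.TYPE_LABELS.items()
    }


radio_labels = [
    DictionaryEntryDialog.TYPE_LABELS[t]
    for t in DictionaryEntryDialog.TYPE_LABELS_ORDERING
]
print(radio_labels)
print(DictionaryDialog.TYPE_LABELS[ENTRY_TYPE_WORD])
```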

> +         self.dictList.Append((entry.comment,entry.pattern,entry.replacement,self.offOn[int(entry.caseSensitive)],self.typeLabels[int(entry.type)]))

nit: Unless I'm missing something, you shouldn't need the int() around entry.type, as it should already be an int.

Thanks again!

@nvaccessAuto

Attachment 0001-Added-Whole-Word-option-to-speech-dictionary.-1704.3.patch added by cannona on 2014-10-14 20:03
Description:
Patch 3.

@nvaccessAuto

Comment 12 by cannona (in reply to comment 11) on 2014-10-14 20:07
Replying to jteh:

> +   def getType(self):
> ...
> +       if typeRadioValue == wx.NOT_FOUND:
>
> It never hurts to be safe, but should this ever happen?

I wasn't sure if it was ever possible for a user to deselect all radio buttons somehow, hence this bit of defensive programming.
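
A minimal sketch of that defensive check, assuming the radio control reports its selection as an index and that falling back to the "anywhere" type is acceptable. `NOT_FOUND` mirrors the value of `wx.NOT_FOUND` (-1) so the example runs without wxPython; `get_type` and its parameters are hypothetical, not the patch's actual method.

```python
# Assumed numeric values; only the constant names appear in the patch excerpt.
ENTRY_TYPE_ANYWHERE = 0
ENTRY_TYPE_REGEXP = 1
ENTRY_TYPE_WORD = 2

# Same value as wx.NOT_FOUND, redefined here to avoid importing wx.
NOT_FOUND = -1

# Order in which the radio buttons are assumed to appear in the dialog.
TYPE_ORDERING = (ENTRY_TYPE_ANYWHERE, ENTRY_TYPE_WORD, ENTRY_TYPE_REGEXP)


def get_type(selection_index, default=ENTRY_TYPE_ANYWHERE):
    """Map a radio selection index to an entry type.

    If nothing is selected, return the default rather than indexing
    the tuple with -1, which would silently pick the last entry.
    """
    if selection_index == NOT_FOUND:
        return default
    return TYPE_ORDERING[selection_index]


print(get_type(1))
print(get_type(NOT_FOUND))
```

Without the check, indexing with -1 would not raise an error in Python; it would quietly return the last type in the tuple, which is one reason the guard may be worth keeping.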

@nvaccessAuto

Comment 14 by James Teh <jamie@... on 2014-10-21 04:48
In [159b456]:

In speech dictionaries, it is now possible to specify that a pattern should only match if it is a whole word; i.e. it does not occur as part of a larger word.

Re #1704.

@nvaccessAuto

Comment 15 by James Teh <jamie@... on 2014-10-21 04:49
In [e14c2ff]:

Merge branch 't1704' into next

Incubates #1704.

Changes:
Added labels: incubating

@nvaccessAuto

Comment 16 by James Teh <jamie@... on 2014-11-06 04:46
In [2c5f529]:

In speech dictionaries, it is now possible to specify that a pattern should only match if it is a whole word; i.e. it does not occur as part of a larger word.

Fixes #1704.

Changes:
Removed labels: incubating
State: closed

@nvaccessAuto

Comment 17 by jteh on 2014-11-06 04:47
Changes:
Milestone changed from None to 2014.4
