Set ordering for input gesture identifiers is unreliable #6945

jcsteh · 2017-03-08T12:50:56Z

We want input gesture identifiers with multiple items of indeterminate order (e.g. multiple keys) to be handled the same way regardless of the ordering of the items. For example, control+shift+f1 should be treated the same as shift+control+f1. To do this, we currently normalise using Python set order. My original thinking was that the output order needs to be consistent regardless of the input order, but it doesn't need to make sense to humans. Set order seemed like an ideal solution, since there's no sorting cost. Unfortunately, while this is often true for set ordering, it isn't true when there are hash bucket conflicts. For example:

>>> # Case 1: The output order is consistent regardless of input order.
>>> {1, 2}
set([1, 2])
>>> {2, 1}
set([1, 2])
>>> # Case 2: The output order is different for different input orders.
>>> {8, 256}
set([8, 256])
>>> {256, 8}
set([256, 8])

(See this StackOverflow thread for details.)

The specific example for case 2 is actually seen in the wild; it is the cause of #3157 (comment). The fact that we haven't seen this elsewhere is just luck. For example, you can reproduce it as follows:

Open the Python console.
Enter the following: import globalCommands; globalCommands.commands.bindGesture("kb:control+windows+f1", "dateTime")
Try pressing control+windows+f1 in input help.

Expected: Date time command.
Actual: Nothing.

To fix this, we're going to need to use some other ordering. Sorting for every gesture identifier seems expensive. This is also going to be painful because all the existing implementations output in set order; they don't call a common normalisation function.

CC @michaelDCurran.

The text was updated successfully, but these errors were encountered:

derekriemer · 2017-03-08T13:13:02Z

Hi:
For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort. 16 compares max assuming a bad algorithm, with pythons mixed quick sort/merge sort, it should be max of $O(n \log(n))$. For n= 4, this would be about 5.5 operations per gesture identifier sort. Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

derekriemer · 2017-03-08T13:13:32Z

It could theoretically be a large amount, but I can't think of a situation where you'd make the user press more than say 4 keys.

jcsteh · 2017-03-09T04:19:09Z

We should fix #3678 while we're at it.

jcsteh · 2017-03-09T04:47:05Z

For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort.

I did a bit of quick timing and the difference appears to be negligible for our use case. In fact, it might sometimes be faster to sort a list rather than using a set because we're going to join the list into a string anyway and joining lists is faster than joining other iterables.

Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

inputCore.normalizeGestureIdentifier is already called for all gestures when they're bound, so they're ordered at bind time. For the gestures emitted when input is received (e.g. when a key is pressed), the gesture implementation is currently expected to do the normalisation. This is theoretically a bit faster because the implementation doesn't have to strip the source prefix (e.g. "kb:"), normalise and then put the prefix back on, but it also means each implementation has to know how to normalise. Given this bug and other ordering bugs (both fixed and unfixed), I'm thinking we should just forget normalisation in the implementations and have callers normalise when required. That is less error prone, if a tiny bit less optimal in some cases. It does mean a slight backwards incompatible API change, but I doubt anyone is relying on this.

jcsteh · 2017-03-10T05:14:31Z

We discussed this via voice:

We all agreed that the cost of normalisation relative to other stuff we do (e.g. regexps in speech dicts) is negligible and that it's not really worth putting time into benchmarking.
In fact, we're likely to hit the OS/device rate limit for repeated input long before we hit any performance issues related to normalisation.
So, the plan is to have .identifiers unnormalised and have a .normalizedIdentifiers property which does the normalisation.

jcsteh · 2017-03-10T12:08:00Z

The specific example for case 2 is actually seen in the wild; it is the cause of #3157 (comment).

Actually, this is incorrect, since the numbers are not what we deal with in gesture identifiers. However, it's still a hash bucket conflict:

>>> {"dot4", "space"}
set(['space', 'dot4'])
>>> {"space", "dot4"}
set(['dot4', 'space'])

…tion.

jcsteh added the p3 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority label Mar 8, 2017

jcsteh mentioned this issue Mar 8, 2017

Braillesense u2: a lot of shortcuts don't work #3157

Closed

jcsteh mentioned this issue Mar 10, 2017

Normalisation of input gesture identifiers doesn't handle upper case correctly #3678

Closed

jcsteh mentioned this issue Mar 10, 2017

Fix input gesture identifier normalization. #6962

Merged

jcsteh added a commit that referenced this issue Mar 14, 2017

Incubates #6962 (issue #6945): Fix input gesture identifier normaliza…

6da1646

…tion.

jcsteh closed this as completed in #6962 Mar 29, 2017

nvaccessAuto added this to the 2017.2 milestone Mar 29, 2017

jcsteh added a commit that referenced this issue Mar 29, 2017

Update What's New for PR #6962 (issue #6945).

70d7531

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set ordering for input gesture identifiers is unreliable #6945

Set ordering for input gesture identifiers is unreliable #6945

jcsteh commented Mar 8, 2017

derekriemer commented Mar 8, 2017 •

edited

derekriemer commented Mar 8, 2017

jcsteh commented Mar 9, 2017

jcsteh commented Mar 9, 2017

jcsteh commented Mar 10, 2017 •

edited

jcsteh commented Mar 10, 2017

Set ordering for input gesture identifiers is unreliable #6945

Set ordering for input gesture identifiers is unreliable #6945

Comments

jcsteh commented Mar 8, 2017

derekriemer commented Mar 8, 2017 • edited

derekriemer commented Mar 8, 2017

jcsteh commented Mar 9, 2017

jcsteh commented Mar 9, 2017

jcsteh commented Mar 10, 2017 • edited

jcsteh commented Mar 10, 2017

derekriemer commented Mar 8, 2017 •

edited

jcsteh commented Mar 10, 2017 •

edited