New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set ordering for input gesture identifiers is unreliable #6945

Closed
jcsteh opened this Issue Mar 8, 2017 · 6 comments

Comments

Projects
None yet
3 participants
@jcsteh
Contributor

jcsteh commented Mar 8, 2017

We want input gesture identifiers with multiple items of indeterminate order (e.g. multiple keys) to be handled the same way regardless of the ordering of the items. For example, control+shift+f1 should be treated the same as shift+control+f1. To do this, we currently normalise using Python set order. My original thinking was that the output order needs to be consistent regardless of the input order, but it doesn't need to make sense to humans. Set order seemed like an ideal solution, since there's no sorting cost. Unfortunately, while this is often true for set ordering, it isn't true when there are hash bucket conflicts. For example:

>>> # Case 1: The output order is consistent regardless of input order.
>>> {1, 2}
set([1, 2])
>>> {2, 1}
set([1, 2])
>>> # Case 2: The output order is different for different input orders.
>>> {8, 256}
set([8, 256])
>>> {256, 8}
set([256, 8])

(See this StackOverflow thread for details.)

The specific example for case 2 is actually seen in the wild; it is the cause of #3157 (comment). The fact that we haven't seen this elsewhere is just luck. For example, you can reproduce it as follows:

  1. Open the Python console.
  2. Enter the following: import globalCommands; globalCommands.commands.bindGesture("kb:control+windows+f1", "dateTime")
  3. Try pressing control+windows+f1 in input help.
  • Expected: Date time command.
  • Actual: Nothing.

To fix this, we're going to need to use some other ordering. Sorting for every gesture identifier seems expensive. This is also going to be painful because all the existing implementations output in set order; they don't call a common normalisation function.

CC @michaelDCurran.

@derekriemer

This comment has been minimized.

Show comment
Hide comment
@derekriemer

derekriemer Mar 8, 2017

Collaborator

Hi:
For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort. 16 compares max assuming a bad algorithm, with pythons mixed quick sort/merge sort, it should be max of $O(n \log(n))$. For n= 4, this would be about 5.5 operations per gesture identifier sort. Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

Collaborator

derekriemer commented Mar 8, 2017

Hi:
For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort. 16 compares max assuming a bad algorithm, with pythons mixed quick sort/merge sort, it should be max of $O(n \log(n))$. For n= 4, this would be about 5.5 operations per gesture identifier sort. Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

@derekriemer

This comment has been minimized.

Show comment
Hide comment
@derekriemer

derekriemer Mar 8, 2017

Collaborator

It could theoretically be a large amount, but I can't think of a situation where you'd make the user press more than say 4 keys.

Collaborator

derekriemer commented Mar 8, 2017

It could theoretically be a large amount, but I can't think of a situation where you'd make the user press more than say 4 keys.

@jcsteh

This comment has been minimized.

Show comment
Hide comment
@jcsteh

jcsteh Mar 9, 2017

Contributor

We should fix #3678 while we're at it.

Contributor

jcsteh commented Mar 9, 2017

We should fix #3678 while we're at it.

@jcsteh

This comment has been minimized.

Show comment
Hide comment
@jcsteh

jcsteh Mar 9, 2017

Contributor

For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort.

I did a bit of quick timing and the difference appears to be negligible for our use case. In fact, it might sometimes be faster to sort a list rather than using a set because we're going to join the list into a string anyway and joining lists is faster than joining other iterables.

Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

inputCore.normalizeGestureIdentifier is already called for all gestures when they're bound, so they're ordered at bind time. For the gestures emitted when input is received (e.g. when a key is pressed), the gesture implementation is currently expected to do the normalisation. This is theoretically a bit faster because the implementation doesn't have to strip the source prefix (e.g. "kb:"), normalise and then put the prefix back on, but it also means each implementation has to know how to normalise. Given this bug and other ordering bugs (both fixed and unfixed), I'm thinking we should just forget normalisation in the implementations and have callers normalise when required. That is less error prone, if a tiny bit less optimal in some cases. It does mean a slight backwards incompatible API change, but I doubt anyone is relying on this.

Contributor

jcsteh commented Mar 9, 2017

For sorting a gesture, it'll be max 4 items usually. Therefore, it should be a relatively inexpensive sort.

I did a bit of quick timing and the difference appears to be negligible for our use case. In fact, it might sometimes be faster to sort a list rather than using a set because we're going to join the list into a string anyway and joining lists is faster than joining other iterables.

Could identifiers somehow be sorted before hand, and stored sorted so it never has to be done again?

inputCore.normalizeGestureIdentifier is already called for all gestures when they're bound, so they're ordered at bind time. For the gestures emitted when input is received (e.g. when a key is pressed), the gesture implementation is currently expected to do the normalisation. This is theoretically a bit faster because the implementation doesn't have to strip the source prefix (e.g. "kb:"), normalise and then put the prefix back on, but it also means each implementation has to know how to normalise. Given this bug and other ordering bugs (both fixed and unfixed), I'm thinking we should just forget normalisation in the implementations and have callers normalise when required. That is less error prone, if a tiny bit less optimal in some cases. It does mean a slight backwards incompatible API change, but I doubt anyone is relying on this.

@jcsteh

This comment has been minimized.

Show comment
Hide comment
@jcsteh

jcsteh Mar 10, 2017

Contributor

We discussed this via voice:

  • We all agreed that the cost of normalisation relative to other stuff we do (e.g. regexps in speech dicts) is negligible and that it's not really worth putting time into benchmarking.
  • In fact, we're likely to hit the OS/device rate limit for repeated input long before we hit any performance issues related to normalisation.
  • So, the plan is to have .identifiers unnormalised and have a .normalizedIdentifiers property which does the normalisation.
Contributor

jcsteh commented Mar 10, 2017

We discussed this via voice:

  • We all agreed that the cost of normalisation relative to other stuff we do (e.g. regexps in speech dicts) is negligible and that it's not really worth putting time into benchmarking.
  • In fact, we're likely to hit the OS/device rate limit for repeated input long before we hit any performance issues related to normalisation.
  • So, the plan is to have .identifiers unnormalised and have a .normalizedIdentifiers property which does the normalisation.
@jcsteh

This comment has been minimized.

Show comment
Hide comment
@jcsteh

jcsteh Mar 10, 2017

Contributor

The specific example for case 2 is actually seen in the wild; it is the cause of #3157 (comment).

Actually, this is incorrect, since the numbers are not what we deal with in gesture identifiers. However, it's still a hash bucket conflict:

>>> {"dot4", "space"}
set(['space', 'dot4'])
>>> {"space", "dot4"}
set(['dot4', 'space'])
Contributor

jcsteh commented Mar 10, 2017

The specific example for case 2 is actually seen in the wild; it is the cause of #3157 (comment).

Actually, this is incorrect, since the numbers are not what we deal with in gesture identifiers. However, it's still a hash bucket conflict:

>>> {"dot4", "space"}
set(['space', 'dot4'])
>>> {"space", "dot4"}
set(['dot4', 'space'])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment