Implement wildcards option #114

Mottie · 2017-02-03T11:52:16Z

This PR adds the useWildcard enhancement discussed in #112.

julkue

I'd vote to name this option wildcards instead of useWildcards, as we have option names like diacritics too, not useDiacritics.

I'd also vote to ignore escaped wildcards, e.g. ignore them in the following keyword: lor\?m, even if wildcards is true.

Do you agree?

Mottie · 2017-02-04T18:04:21Z

name this option wildcards instead of useWildcards

Sure, no problem!

ignore escaped wildcards, e.g. ignore them in the following keyword: lor\?m, even if wildcards is true.

There is a problem with that...

If you pass a keyword of lor\?m, the escape is processed and the resulting string becomes lor?m which is indistinguishable from an unescaped lor?m.
If we double escape the keyword (lor\\?m), we would then have to remove the extra backslash during keyword processing, or the search would be performed against lor\?m and not lor?m. I'm not sure if removing the backslash would be a good idea, but it shouldn't harm anything as long as we are only doing it for that specific pattern (\\? or \\*).
Alternatively, maybe we could "escape" it with a forward slash?... this isn't ideal and may be confusing to both the user and developer - lor/?m.

What do you think?

julkue · 2017-02-06T13:05:40Z

Alternatively, maybe we could "escape" it with a forward slash?

I'm against this option, since it's not common.

If we double escape the keyword (lor\?m), we would then have to remove the extra backslash during keyword processing, or the search would be performed against lor?m and not lor?m. I'm not sure if removing the backslash would be a good idea, but it shouldn't harm anything as long as we are only doing it for that specific pattern (\? or \*).

If you look at the escapeStr function you'll notice that it does exactly that; adds two backslashes for sensitive characters like *. If the user passes a keyword like lor\\*m it'll be lor\*m in the plugin and thus be equivalent to the result of escapeStr. The point is, if you pass this to this function it'll still add another \, which is of course unnecessary. But from my point of view, this is all that needs to be done to allow passing escaped wildcards even if the wildcards option is true.

Mottie · 2017-02-06T14:00:32Z

Hmm, well the setupWildcardsRegExp function is called before the escapeStr function to catch the wildcards before they are escaped... I'll look into this more later today.

And normalize case insensitive diacritics array

One of the issues specified in #108 Squashed commit of the following: commit 867dabe Author: Julian Motz <me@julianmotz.com> Date: Mon Feb 6 10:48:58 2017 +0100 Update patch level commit e46fff2 Merge: 0644a91 5b2dadb Author: Julian Motz <me@julianmotz.com> Date: Mon Feb 6 10:46:23 2017 +0100 Merge branch 'master' into issue-108 Resolved merge conflict commit 0644a91 Author: Rob Garrison <wowmotty@gmail.com> Date: Mon Jan 30 15:22:56 2017 -0600 Prevent adding empty string synonyms. See #108

Closes #113

Mottie · 2017-02-10T13:07:46Z

In the last update, the wildcards option now has three settings:

"disable" (default)
"enable" - wildcards only target non-whitespace characters.
"includeWhitespace" - wildcards target all characters

So to get the demo in #119 working, you have two choices (see https://jsfiddle.net/Mottie/6kqr1d88/4/):

Set wildcards to "enable" and use the keyword `"Hello? World" ("?" matches the comma or any punctuation, but not the whitespace).
Or, set wildcards to "includeWhitespace" and use the keyword "Hello*World".

But, setting the option to include whitespace does have some consequences! if for example you use the keyword lorem*ipsum, it will also mark lorem dolor ipsum; but it isn't greedy, so it won't include the second ipsum in lorem dolor ipsum ipsum. See this demo: https://jsfiddle.net/Mottie/6kqr1d88/3/

julkue

LGTM. However, why is it necessary to call setupWildcardsRegexp() in the synonyms method, when the this.createWildcardsRegexp() method is called after this.createSynonymsRegexp()?

julkue · 2017-02-12T16:07:52Z

src/mark.js

@@ -776,6 +828,20 @@ class Mark { // eslint-disable-line no-unused-vars
     *   <li><i>limiters</i>: A custom array of string limiters for accuracy
     *   "exactly" or "complementary"</li>
     * </ul>
+     * @property {"disable"|"enable"|"includeSpaces"}


Please outsource the JSDOC as a @typedef like here. This way it's possible to reference it in other functions.

Mottie · 2017-02-12T19:23:56Z

test/specs/basic/wildcards-synonyms.js

+            "done": function () {
+                new Mark($ctx2[0]).mark("Lor*m", {
+                    "synonyms": {
+                        "Lor*m": "Ips*m"


why is it necessary to call setupWildcardsRegexp() in the synonyms method

This allows you to include wildcards in the synonyms. Does that sound reasonable?

Mottie · 2017-02-12T19:30:20Z

First off, I incorrectly named an option in my last comment... it's not "includeWhitespace", it should be "includeSpaces".... maybe I should shorten this option to "withSpaces"? Or internally only look for "space" in case the wrong term is used?

I'm also starting to think we might need to change the "?" wildcard to match zero or one characters instead of only one character. That way you could use "World?s" to mark World's and Worlds as discussed in https://github.com/julmot/mark.js/issues/119#issuecomment-279233505.

julkue · 2017-02-12T22:22:14Z

I'm for "withSpaces"
I'd vote for naming it "enabled" and "disabled" instead of "enable" and "disable". What do you think?
I already thought that ? is open for either one or no match. If it isn't currently, I'm absolutely for it.

"enable" -> "enabled" "disable" -> "disabled" "includeSpaces" -> "withSpaces"

Previously set to match only one

julkue · 2017-03-11T19:38:48Z

shit mate, I'm very sorry for the long delay. I'm doing some tests in the next hours and will come back with some review.

@typedef

Beautified code and outsourced the accuracy property into a @typedef definition to make it consistent with the wildcards property

They must be included after diacritics and synonyms tests, since they depend on them

julkue

@Mottie Thanks for the great work. It looks very good overall. The only thing I'm missing is a test for the disabled value. Then I think we're safe to publish this as a new minor release!

Mottie · 2017-03-12T02:14:02Z

The disabled value was tested in the wildcards.js file (line 31); is that good enough, or should I add more?

HTML

<div>
    <p>
        Lorenzom ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
        tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
        Lorem ipsum dolor sit amet, consetetur sadipscing elitr. Loriem ipsum
        dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
        invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua.
        Lorm ipsum Lorum ipsum, et Loram ipsum. Lor?m ipsum. Lor*m ipsum.
    </p>
</div>

Script

new Mark($ctx4[0]).mark(["lor?m", "Lor*m"], {
    "separateWordSearch": false,
    "wildcards": "disabled",
    "done": done
});
// ...
it("should find wildcards as plain characters when disabled", function () {
    expect($ctx4.find("mark")).toHaveLength(2);
});

Add useWildcards option

57d7116

julkue self-requested a review February 3, 2017 11:52

julkue added the type: enhancement label Feb 3, 2017

julkue assigned Mottie Feb 3, 2017

julkue requested changes Feb 4, 2017

View reviewed changes

julkue changed the title ~~Add useWildcards option~~ Implement wildcards option Feb 4, 2017

Code formatting

2e66184

Change option to wildcards

3a231b6

Mottie and others added 6 commits February 10, 2017 06:04

Fix case insensitive diacritics mapping (#111)

b2bb6bd

And normalize case insensitive diacritics array

Ignore special characters in accuracy complementary (#116)

c328b97

Closes #113

Code formatting

b0c56da

Change option to wildcards

04e7a8e

Allow escaped wildcards

6fbf607

Merge branch 'master' into wildcards

df3ecfd

julkue reviewed Feb 12, 2017

View reviewed changes

Mottie commented Feb 12, 2017

View reviewed changes

Create @typedef wildcards definition

02321e1

Mottie added 2 commits February 12, 2017 17:15

Rename wildcard options

29112be

"enable" -> "enabled" "disable" -> "disabled" "includeSpaces" -> "withSpaces"

Make "?" wildcard match zero or one character

be886a7

Previously set to match only one

julkue added 2 commits March 11, 2017 21:46

Incremented version

a6d3d5b

Beautified code and outsourced the accuracy property into a @typedef definition to make it consistent with the wildcards property

Include wildcard tests in the right order

ecea6e3

They must be included after diacritics and synonyms tests, since they depend on them

julkue requested changes Mar 11, 2017

View reviewed changes

julkue approved these changes Mar 12, 2017

View reviewed changes

julkue merged commit 0b3fd5b into master Mar 12, 2017

julkue deleted the wildcards branch March 12, 2017 13:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement wildcards option #114

Implement wildcards option #114

Mottie commented Feb 3, 2017

julkue left a comment

Mottie commented Feb 4, 2017

julkue commented Feb 6, 2017

Mottie commented Feb 6, 2017

Mottie commented Feb 10, 2017 •

edited

Loading

julkue left a comment

julkue Feb 12, 2017

Mottie Feb 12, 2017

Mottie Feb 12, 2017

Mottie commented Feb 12, 2017 •

edited

Loading

julkue commented Feb 12, 2017

julkue commented Mar 11, 2017

julkue left a comment

Mottie commented Mar 12, 2017

Implement wildcards option #114

Implement wildcards option #114

Conversation

Mottie commented Feb 3, 2017

julkue left a comment

Choose a reason for hiding this comment

Mottie commented Feb 4, 2017

julkue commented Feb 6, 2017

Mottie commented Feb 6, 2017

Mottie commented Feb 10, 2017 • edited Loading

julkue left a comment

Choose a reason for hiding this comment

julkue Feb 12, 2017

Choose a reason for hiding this comment

Mottie Feb 12, 2017

Choose a reason for hiding this comment

Mottie Feb 12, 2017

Choose a reason for hiding this comment

Mottie commented Feb 12, 2017 • edited Loading

julkue commented Feb 12, 2017

julkue commented Mar 11, 2017

julkue left a comment

Choose a reason for hiding this comment

Mottie commented Mar 12, 2017

Mottie commented Feb 10, 2017 •

edited

Loading

Mottie commented Feb 12, 2017 •

edited

Loading