MongoDB 'matches' invalid pattern match #228

Open
kotedo opened this Issue Mar 26, 2013 · 3 comments

2 participants

@kotedo
kotedo commented Mar 26, 2013

Hi,

The case clause

    {'matches', "*"++Value} ->
        [{Key, {regex, list_to_binary(Value), <<"i">>}}];

in boss_db/src/db_adapters/boss_db_adapter_mongodb.erl
can never match: if Value is "*form", then "*form" is not a match, and if Value is "form", then Value will not match "**form".

@evanmiller

Hmm? The clause will match "*form" and binds "form" to Value.
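
For illustration, a minimal sketch of how a "*" ++ Value pattern behaves on plain Erlang strings. The module name matches_demo and the function classify/1 are hypothetical and not part of boss_db; the sketch covers only the pattern itself, not how the adapter receives the value.

```erlang
-module(matches_demo).
-export([classify/1]).

%% Hypothetical helper (not part of boss_db): mirrors the shape of the
%% "*" ++ Value clause to show what it matches on a plain string.
classify("*" ++ Value) ->
    %% A leading asterisk is stripped; the remainder is bound to Value.
    {asterisk, Value};
classify(Value) when is_list(Value) ->
    %% No leading asterisk: the whole string is bound to Value.
    {no_asterisk, Value}.
```

With this sketch, matches_demo:classify("*form") returns {asterisk, "form"} and matches_demo:classify("form") returns {no_asterisk, "form"}.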

@kotedo
kotedo commented Mar 26, 2013

(stuff@MK324SDKQ4)39> boss_db:find(thread, [{name, 'matches', "charg"}]).
07:36:21.978 [info] In no-asterix match: "charg"

40> boss_db:find(thread, [{name, 'matches', "*charg"}]).
07:36:25.219 [info] In no-asterix match: "*charg"

41> boss_db:find(thread, [{name, 'matches', "*Charg"}]).
07:36:34.770 [info] In no-asterix match: "*Charg"

42> boss_db:find(thread, [{name, 'matches', "Charg"}]).
07:36:38.824 [info] In no-asterix match: "Charg"

[{thread,"thread-5137b4d169e58119fa000009",
"stuff-512be35e69e5817b5d000008",
"owner-512be2e569e5817b5d000004",
<<"Charging and Distance">>,
<<"Whatever you heard about quick charging it, fohget it.">>,
<<"nissan,nissan leaf, charging, driving, charger">>,true,undefined,
{1362,605265,332000},
{1362,759328,256000}}]

This shows that the intended "*"++Value clause is never triggered, so the search is never case-insensitive. As a result, the record is found only if the search term "accidentally" matches the case of the stored value.

I agree that the regex should be case-sensitive by default, with a marker in the search term switching it to case-insensitive. That would at least be acceptable.

Regex search in MongoDB supports four options:

  • i toggles case insensitivity, and allows all letters in the pattern to match upper and lower cases.
  • m toggles multiline regular expressions. Without this option, all regular expressions match within one line. If there are no newline characters (e.g. \n) or no start/end of line construct, the m option has no effect.
  • x toggles an “extended” capability. When set, $regex ignores all white space characters unless escaped or included in a character class. Additionally, it ignores characters between an un-escaped # character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern. The x option does not affect the handling of the VT character (i.e. code 11.) New in version 1.9.0.
  • s allows the dot (e.g. .) character to match all characters including newline characters.
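
To get a feel for the four options, here is a rough sketch that uses the analogous options of Erlang's own re module (caseless, multiline, extended, dotall) as a stand-in. This is an assumption for illustration only: it runs locally in Erlang and does not go through MongoDB or boss_db, and the module name regex_opts_demo is hypothetical.

```erlang
-module(regex_opts_demo).
-export([run/0]).

%% Each entry pairs a MongoDB option letter with the result of the
%% analogous option in Erlang's re module.
run() ->
    Subject = "Charging and Distance",
    [ %% No options: matching is case-sensitive, so "charg" is not found.
      {default, re:run(Subject, "charg")},
      %% i: letters match regardless of case.
      {i, re:run(Subject, "charg", [caseless])},
      %% m: ^ also matches right after a newline.
      {m, re:run("one\ntwo", "^two", [multiline])},
      %% x: unescaped whitespace and # comments in the pattern are ignored.
      {x, re:run("ab", "a b  # comment", [extended])},
      %% s: the dot also matches a newline character.
      {s, re:run("a\nb", "a.b", [dotall])} ].
```

Calling regex_opts_demo:run() gives nomatch for the default entry and a {match, ...} tuple for each of the four options, which mirrors the behaviour seen above with "charg" versus "Charg".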