MongoDB 'matches' invalid pattern match #228

Open
kotedo opened this Issue Mar 26, 2013 · 3 comments

@kotedo
kotedo commented Mar 26, 2013

Hi,

The case clause

    {'matches', "*"++Value} ->
        [{Key, {regex, list_to_binary(Value), <<"i">>}}];

in boss_db/src/db_adapters/boss_db_adapter_mongodb.erl
can never match: if Value is "form", then "*form" is not a match, and if Value is "*form", then Value will not match "**form".

@evanmiller
Collaborator

Hmm? The clause will match "*form" and binds "form" to Value.
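
To make the binding concrete, here is a minimal sketch of how a `"*" ++ Value` pattern behaves on plain Erlang strings. The module name `matches_sketch` and the function `classify/1` are hypothetical names used only for illustration; they are not part of boss_db, and the sketch says nothing about what the adapter does to the value before it reaches the case clause.

```erlang
-module(matches_sketch).
-export([classify/1]).

%% Hypothetical helper, not part of boss_db: it mirrors the shape of the
%% 'matches' clause to show which branch a given search term would take.
classify("*" ++ Value) ->
    %% A leading asterisk is stripped and the rest is bound to Value.
    {case_insensitive, Value};
classify(Value) when is_list(Value) ->
    %% No leading asterisk: the whole string is bound to Value.
    {case_sensitive, Value}.
```

In an Erlang shell, `matches_sketch:classify("*form")` returns `{case_insensitive,"form"}`, while `matches_sketch:classify("form")` returns `{case_sensitive,"form"}`. Note that the pattern only applies to lists, so a binary such as `<<"*form">>` would not match either clause of this sketch.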

@kotedo
kotedo commented Mar 26, 2013

    (stuff@MK324SDKQ4)39> boss_db:find(thread, [{name, 'matches', "charg"}]).
    07:36:21.978 [info] In no-asterix match: "charg"

    40> boss_db:find(thread, [{name, 'matches', "*charg"}]).
    07:36:25.219 [info] In no-asterix match: "*charg"

    41> boss_db:find(thread, [{name, 'matches', "*Charg"}]).
    07:36:34.770 [info] In no-asterix match: "*Charg"

    42> boss_db:find(thread, [{name, 'matches', "Charg"}]).
    07:36:38.824 [info] In no-asterix match: "Charg"

    [{thread,"thread-5137b4d169e58119fa000009",
     "stuff-512be35e69e5817b5d000008",
     "owner-512be2e569e5817b5d000004",
     <<"Charging and Distance">>,
     <<"Whatever you heard about quick charging it, fohget it.">>,
     <<"nissan,nissan leaf, charging, driving, charger">>,true,undefined,
     {1362,605265,332000},
     {1362,759328,256000}}]

This shows that the intended "*"++Value clause is never triggered, so the search is never case-insensitive. As a result, it only finds the record if the search term happens to match the case of the stored value.

I agree that the regex should be case-sensitive by default and should search case-insensitively only when the search term carries a marker. That would at least be acceptable.

Regex search in MongoDB supports four options that change how a pattern is matched:

  • i toggles case insensitivity, and allows all letters in the pattern to match upper and lower cases.
  • m toggles multiline matching. Without this option, all regular expressions match within one line.
    If there are no newline characters (e.g. \n) or no start/end of line construct, the m option has no effect.
  • x toggles an “extended” capability. When set, $regex ignores all white space characters unless escaped or included in a character class.
    Additionally, it ignores characters between an un-escaped # character and the next new line, so that you may include comments in complicated patterns. This only applies to data characters; white space characters may never appear within special character sequences in a pattern.
    The x option does not affect the handling of the VT character (i.e. code 11.)
    New in version 1.9.0.
  • s allows the dot (e.g. .) character to match all characters including newline characters.
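
As a rough illustration of the four options, the sketch below uses Erlang's standard `re` module, whose `caseless`, `multiline`, `extended`, and `dotall` options play the same roles as i, m, x, and s. This is an assumption-laden local analogy: it runs the patterns in the Erlang VM, not in MongoDB, and the module name `regex_options_sketch` is hypothetical.

```erlang
-module(regex_options_sketch).
-export([demo/0]).

%% Illustration only: runs the patterns locally with Erlang's re module,
%% not through MongoDB. {capture, none} makes re:run/3 return the atom
%% 'match' or 'nomatch' instead of capture positions.
demo() ->
    Title = "Charging and Distance",
    Lines = "first\nsecond",
    [
     %% No option: case-sensitive, so lowercase "charg" is not found.
     {default, re:run(Title, "charg", [{capture, none}])},
     %% i: caseless, "charg" now matches "Charging".
     {i, re:run(Title, "charg", [caseless, {capture, none}])},
     %% m: multiline, ^ also matches right after a newline.
     {m, re:run(Lines, "^second", [multiline, {capture, none}])},
     %% x: extended, the unescaped space in the pattern is ignored.
     {x, re:run(Title, "Char g", [extended, {capture, none}])},
     %% s: dotall, the dot also matches the newline character.
     {s, re:run(Lines, "first.second", [dotall, {capture, none}])}
    ].
```

Calling `regex_options_sketch:demo()` returns `[{default,nomatch},{i,match},{m,match},{x,match},{s,match}]`. The first two entries mirror the behaviour reported above: without the i option, "charg" does not find "Charging and Distance".
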
@evanmiller
Collaborator