Conversation
|
Thanks for submitting this! The basic work looks good, but there are several important discrepancies between .NET regex and PostgreSQL regex, and it's important for the EF6 provider to provide .NET behavior. I recommend you take a look at the work I've done on regex in the EFCore provider (see NpgsqlRegexIsMatchTranslator). |
| {"operator_tsquery_contains",Operator.QueryContains}, | ||
| {"operator_tsquery_is_contained",Operator.QueryIsContained} | ||
| {"operator_tsquery_is_contained",Operator.QueryIsContained}, | ||
| {"regex_is_match",Operator.RegexIsMatch}, |
There was a problem hiding this comment.
I think it would be more consistent to change regex_is_match to operator_regex_is_match.
|
Agree. |
|
Regular expressions functions now match .NET behaviour. Like in the EFCore provider. |
|
@PSeON thanks, I promise to review ASAP. In the meantime can you please squash the two commits and push force for a cleaner history? |
|
Done. |
|
@rwasef1830, if you can help review this and provide more comments that would be greatly appreciated! |
| /// otherwise, <see langword="false"/>. | ||
| /// </returns> | ||
| [DbFunction("Npgsql", "regex_is_match")] | ||
| public static bool RegexIsMatch(string input, string pattern) |
There was a problem hiding this comment.
I think that if this (and its overload) was named MatchRegex it would be consistent the "Match" method which generates the @@ operator.
I would also rename the values in the operators and DbFunction attributes ... etc. to make it consistent with the match operator. (eg: operator_regex_is_match changes to operator_regex_match).
Also, I don't think the documentation for each parameter and return value adds any additional value. In order to free ourselves from the burden of keeping it in sync with postgresql docs, the link to the manual in the <summary> is enough. This also makes it consistent with the rest of the methods in this class.
|
@PSeON in addition to my notes on the commit, I find the behavior difference between the overload without RegexOptions and the one with to be slightly misleading. There are several sets of semantics exposed in a confusing way here: Operator " According to postgresql docs:
I think it is redundant to use both operators and in the case of one of them override the meaning of the operator with embedded options. To match the .NET semantics (which is case-sensitive by default unless overridden by RegexOptions), both functions should emit the case-sensitive version of the PostgreSQL operator. In the case of the second one, the meaning of the operator will be overridden by the RegexOptions, and in case of According to MS docs, RegexOptions.None gives the following behavior:
There should be also testcases that confirm this behavior. @roji What do you think about this and my notes ? |
| } | ||
|
|
||
| [Test] | ||
| public void RegexIsMatchOptions() |
There was a problem hiding this comment.
I think this could be clearer / more readable using [TestCase].
|
@rwasef1830 thanks for the comments. I am very grateful to you for your help. I will fix these problems ASAP. |
|
Finished implementation of recommendations. Please review this version. Thanks. |
| args[1].Accept(this)); | ||
| } | ||
|
|
||
| string flags = "(?"; |
There was a problem hiding this comment.
Please use a StringBuilder rather than string concatenation
|
I've taken a quick look (sorry, don't have much spare time at the moment), it's after @PSeON's latest modifications so @rwasef1830, I'm not sure exactly what you were referring to before - although what you say sounds right. The code as it is now seems to correspond pretty well to what I wrote for EF Core. I remember I spent some time understanding exactly how PostgreSQL does regexp and how .NET does them, and I think everything should be OK. @rwasef1830, if you thing any discrepancy still remains please let us know (I'll fix the EF Core implementation too if necessary), otherwise I'm OK for merging (modulu the nit-picking StringBuilder comment). @rwasef1830 I'll wait for your confirmation before doing so. |
| {"operator_tsquery_contains",Operator.QueryContains}, | ||
| {"operator_tsquery_is_contained",Operator.QueryIsContained} | ||
| {"operator_tsquery_is_contained",Operator.QueryIsContained}, | ||
| {"operator_regex_match",Operator.RegexMatch}, |
There was a problem hiding this comment.
Please rename operator_regex_match_case to operator_regex_match (and remove the other one). This implementation should be only using 1 operator and that's the case sensitive one (and the behavior is then overridden with postgresql modifiers according to RegexOptions).
|
@PSeON Thanks for your work! In addition to my latest comments, I think you should add a test for negative matching (maybe a second query in the same tests that selects the mismatching input). @roji Will negating this query execute and be treated the same way as using the "not regex match" operator in PostgreSQL or does this need to be separately implemented as well ? |
|
@rwasef1830, good question - it's worth testing. I'm guessing that without any extra handling, a negative regex match should render something like |
|
@rwasef1830 and thanks for the very valuable and through reviewing! |
|
@PSeON I forgot to mention, please add a unit test that very explicitly makes sure that using |
|
@rwasef1830 This code (line 75) checks that both methods produce same results. Am I wrong? |
|
@roji Checked query rendering. |
|
@PSeON You're right I missed that. Maybe the variable could be clarified, perhaps you could rename the (it is better to have the variables clearly named than to write clarifying comments). If negating the existing operator and the negative operator behave exactly the same then no need to implement it. |
|
Agree on the negation, the only impact is slightly better SQL readability which really isn't that important here. |
|
@rwasef1830, whenever you're OK let me know and I'll merge. |
|
Added proposed improvements. |
| [TestCase("^blog$", "blog", "some \nblog\n name", TestName = "MatchRegex ^ and $ match beginning and end")] | ||
| [TestCase("some .* name", "some blog name", "some \n name", TestName = "MatchRegex . matches all except \\n")] | ||
| [TestCase("some blog name", "some blog name", "someblogname", TestName = "MatchRegex whitespace not ignored in pattern")] | ||
| public void MatchRegex(string pattern, string matchingInput, string mismatchingInput) |
There was a problem hiding this comment.
You can remove MatchRegex from TestName property. In the unit test runner the test case names will appear under the test's main method name in a tree structure so no need to mention it again in the value of the TestName property. (eg: Case-sensitive).
|
Done. |
|
Thanks to both of you for this! |
|
@roji, @rwasef1830 Thanks! |
|
Any chance we can get a new version pushed with these changes? |
|
@danbopes sorry a new version hasn't been released in so long. @rwasef1830, what do you think? As I'm really unlikely to do any work on the EF6 provider anytime soon (have little time for Npgsql in general at the moment), can you take a look at other issues/PRs and tell me if you intend to do any work? If not I can do a release soon. |
|
@roji I've been waiting to get a chance to work on them, but didn't get any time I'm afraid :-( I'll take a quick look at the outstanding pull requests today. |
|
@rwasef1830 so what do you think, should I release or do you want a bit more time? |
|
Great, I'll hold off then. Will probably have a bit more time for a release next Sunday, hopefully you can get everything done by then. |
Hello. This pull request adds two POSIX regular expressions operators ("
" and "*").https://www.postgresql.org/docs/current/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Tests included.
Please let me know what do you think. Thanks.