Backwards search not respecting range #22

Closed
sorbits opened this issue Jul 20, 2013 · 4 comments

@sorbits
Contributor

sorbits commented Jul 20, 2013

Summary

If I call onig_search with a start/range of 3-0 (a backward search from offset 3 down to offset 0), it may still return a match outside this range.

Steps to Reproduce

Build the following source:

#include <string.h>
#include <stdio.h>
#include "oniguruma.h"

int main (int argc, char const* argv[])
{
    char const* ptrn = "\\h+";
    char const* data = "abcdef";

    OnigErrorInfo einfo;
    OnigRegex regex = NULL;
    if(ONIG_NORMAL == onig_new(&regex, (OnigUChar const*)ptrn, (OnigUChar const*)ptrn + strlen(ptrn), ONIG_OPTION_CAPTURE_GROUP, ONIG_ENCODING_UTF8, ONIG_SYNTAX_RUBY, &einfo))
    {
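        OnigRegion* res = onig_region_new();
        /* start = data + 3 and range = data: range < start requests a backward search */
        if(ONIG_MISMATCH != onig_search(regex, (OnigUChar const*)data, (OnigUChar const*)data + strlen(data), (OnigUChar const*)data + 3, (OnigUChar const*)data, res, 0))
            fprintf(stderr, "%td-%td\n", res->beg[0], res->end[0]);
        onig_region_free(res, 1); /* 1 = also free the region struct itself */
        onig_free(regex);
    }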
    return 0;
}

Expected Result

I would expect the output to be 0-3.

Actual Result

The program outputs 3-4.

Notes

This was tested with current master (9cd4fa1), but the problem has existed for as long as I know.
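
For readers who want to check a reported match against the requested range, here is a minimal sketch of the containment test implied by this report. It assumes the reading used above: for a backward search, the whole match must lie between range (lower offset) and start (upper offset). The helper name match_inside_range is hypothetical and not part of Onigmo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not an Onigmo API.
   For a backward search (range <= start), returns true if the match
   [beg, end) lies entirely within the offsets range..start. */
static bool match_inside_range(ptrdiff_t beg, ptrdiff_t end,
                               ptrdiff_t start, ptrdiff_t range)
{
    assert(beg <= end);      /* a match never ends before it begins */
    assert(range <= start);  /* backward search only */
    return range <= beg && end <= start;
}
```

Under this reading, with start 3 and range 0, the reported 3-4 falls outside the range, while 0-3 falls inside it.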

@k-takata
Owner

It seems to be caused by this #ifdef block (Onigmo/regexec.c, line 4176 in 754f936):

#ifdef USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE

I don't know why this block is needed. I'll look into it more closely.

BTW, I think the expected result is 2-3.

@k-takata
Owner

Hmm, the #ifdef block was added with Oniguruma 5.7.0.

Copied from the HISTORY:

2007/04/27: Version 5.7.0

2007/04/20: [spec] add config USE_MATCH_RANGE_IS_COMPLETE_RANGE.
2007/04/20: [impl] refactoring in match_at().

Note: USE_MATCH_RANGE_IS_COMPLETE_RANGE was changed to USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE with Oniguruma 5.9.0.

@sorbits
Contributor Author

sorbits commented Aug 18, 2013

Thanks for looking into this.

Does this mean that defining USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE should return a match of 2-3?

I tried defining it at regexec.c:33, but the result was unchanged.

@k-takata
Owner

> Does this mean that defining USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE should return a match of 2-3?

Yes, I think so. But currently the result is 3-4 as you reported.

After applying the following patch, the result becomes 2-3 when USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE is defined.
(If it is not defined, the result becomes 3-6.)

--- a/regexec.c
+++ b/regexec.c
@@ -4173,11 +4173,6 @@ onig_search_gpos(regex_t* reg, const UChar* str, const UChar* end,
     }
   }
   else {  /* backward search */
-#ifdef USE_MATCH_RANGE_MUST_BE_INSIDE_OF_SPECIFIED_RANGE
-    if (orig_start < end)
-      orig_start += enclen(reg->enc, orig_start); /* is upper range */
-#endif
-
     if (reg->optimize != ONIG_OPTIMIZE_NONE) {
       UChar *low, *high, *adjrange, *sch_start;

I'm concerned about side effects caused by this patch. More tests are needed.
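
The three results mentioned in this thread (3-4, 2-3 and 3-6) can be reproduced with a toy model of a backward search. This is only a sketch and not Onigmo's implementation: it hard-codes the \h+ pattern as "a run of hex digits", and the names match_hex_run, backward_search and the limit parameter are invented for illustration. It assumes, as the comment in the removed lines suggests, that orig_start acts as the upper bound a match may not cross.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy stand-in for matching \h+ at offset pos. The match may not extend
   past limit. Returns the end offset, or -1 if nothing matches at pos. */
static int match_hex_run(const char *s, int pos, int limit)
{
    int end = pos;
    while (end < limit && isxdigit((unsigned char)s[end]))
        end++;
    return end > pos ? end : -1;
}

/* Toy backward search: try offsets start, start-1, ..., range and report
   the first position where the pattern matches. Returns 1 on a match and
   stores its offsets in *beg and *end, otherwise returns 0. */
static int backward_search(const char *s, int start, int range, int limit,
                           int *beg, int *end)
{
    assert(s != NULL && 0 <= range && range <= start);
    for (int pos = start; pos >= range; pos--) {
        int e = match_hex_run(s, pos, limit);
        if (e >= 0) {
            *beg = pos;
            *end = e;
            return 1;
        }
    }
    return 0;
}
```

With the data "abcdef", start 3 and range 0, this model gives 3-4 when limit is 4 (start advanced by one character, as the removed lines did), 2-3 when limit is 3 (the match must stay inside the specified range), and 3-6 when limit is 6 (the end of the string, i.e. no restriction on where the match ends).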

sorbits added a commit to textmate/textmate that referenced this issue Aug 18, 2013
k-takata added a commit that referenced this issue Jun 17, 2014
No.67 should not match after the fix of Issue #22.