Function matchall with overlap=true does not produce correct output #8677

@binnisb

I tested a few cases with Julia Version 0.4.0-dev+1021 (2014-10-08 21:56 UTC):

julia> matchall(r"GCG","GCGCG")
1-element Array{SubString{UTF8String},1}:
 "GCG"

julia> matchall(r"GCG","GCGCG",true)
3-element Array{SubString{UTF8String},1}:
 "GCG"
 "GCG"
 "GCG"

# Note: I removed the first G from the string
julia> matchall(r"GCG","CGCG",true)
2-element Array{SubString{UTF8String},1}:
 "GCG"
 "GCG"

The first case is correct. The second and third cases, which use overlap = true, should have returned arrays of length 2 and 1 respectively, which is what eachmatch finds:

julia> for i in eachmatch(r"GCG","GCGCG")
       println(i)
       end
RegexMatch("GCG")

julia> for i in eachmatch(r"GCG","GCGCG",true)
             println(i)
             end
RegexMatch("GCG")
RegexMatch("GCG")

# Note: I removed the first G from the string
julia> for i in eachmatch(r"GCG","CGCG",true)
             println(i)
             end
RegexMatch("GCG")

The problem seems to be that it counts the last match twice.
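
For reference, the expected overlapping behavior can be sketched by hand. This is only an illustration, not the code in Base: `overlapping_matches` is a hypothetical helper name, it assumes a recent Julia 1.x (where `match(r, s, idx)` returns `nothing` on failure), and it assumes the pattern never matches the empty string. After each hit it restarts the search one character past the start of that hit, so no match can be reported twice.

```julia
# Hypothetical helper (not part of Base): collect overlapping matches by
# restarting the search one character after the start of each match.
# Assumes the regex does not match the empty string.
function overlapping_matches(r::Regex, s::AbstractString)
    found = String[]
    idx = firstindex(s)
    while idx <= lastindex(s)
        m = match(r, s, idx)        # search from idx onwards
        m === nothing && break      # no further match: stop
        push!(found, m.match)
        idx = nextind(s, m.offset)  # step past the start of this match
    end
    return found
end
```

With this sketch, `overlapping_matches(r"GCG", "GCGCG")` gives two matches and `overlapping_matches(r"GCG", "CGCG")` gives one, which agrees with the eachmatch output above.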

I saw that there have been efforts to speed up matchall, which might have introduced this error. I am willing to debug it and submit a pull request. I have compiled Julia and am looking at regex.jl in base and in test. How do I run the tests if I add a test case and a fix for matchall?
