regexp: prefer fewer doesn't work with (.*)+? #32704

Closed
zhuangliangji1992 opened this issue Jun 20, 2019 · 7 comments

@zhuangliangji1992 commented Jun 20, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.6 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/liangji.zlj/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/liangji.zlj/workspace/go_workspace"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.12.6/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.12.6/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/liangji.zlj/workspace/go_workspace/src/hello/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/5t/xqwqx3sd29n17qyy0qlm2rq00000gn/T/go-build122003158=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

package main

import (
	"fmt"
	"regexp"
)

func main() {
	contextTest := "abbbb\ncd\nefg\nhepq"
	reg := regexp.MustCompile(`(?s)(.*)+?e`)
	res := reg.FindStringSubmatch(contextTest)
	fmt.Println(res[0])
}

What did you expect to see?

abbbb
cd
e

What did you see instead?

abbbb
cd
efg
he

@av86743 commented Jun 20, 2019

Perhaps you want (.*?) instead? The trailing ? does not affect the nested *, only the immediately preceding +. So (.*) grows as much as it can, stopping only at the last e.
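To make the difference concrete, here is a minimal sketch that runs both patterns against the input from the report. The helper name matchOf and the constant reportInput are made up for this illustration; only regexp.MustCompile and FindString come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// reportInput is the sample string from the original report.
const reportInput = "abbbb\ncd\nefg\nhepq"

// matchOf (hypothetical helper) returns the overall leftmost match
// of pattern in s, or "" when there is no match.
func matchOf(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// Greedy inner .* under a lazy outer +?: runs to the last e.
	fmt.Printf("%q\n", matchOf(`(?s)(.*)+?e`, reportInput)) // "abbbb\ncd\nefg\nhe"

	// Lazy inner .*?: stops at the first e.
	fmt.Printf("%q\n", matchOf(`(?s)(.*?)e`, reportInput)) // "abbbb\ncd\ne"
}
```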

@zhuangliangji1992 (Author) commented Jun 21, 2019

@av86743 Do you mean that +? can't affect (.*)? But I find that +? does affect preceding ordinary characters, for example ([a-z])+?.
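The observation about ([a-z])+? can be checked with a short sketch. The test string "abc" and the helper name leftmost are assumptions chosen for this illustration; they do not appear in the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmost (hypothetical helper) returns the leftmost match of pattern in s.
func leftmost(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	// Greedy +: repeats the single-character group as often as possible.
	fmt.Println(leftmost(`([a-z])+`, "abc")) // abc

	// Lazy +?: repeats the group as few times as possible (once).
	fmt.Println(leftmost(`([a-z])+?`, "abc")) // a
}
```

Here the group itself contains no repetition, so the outer +? is the only operator that decides how much text is consumed.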

@av86743 commented Jun 21, 2019

Does it affect repetitions of a constant pattern? Yes, indeed. Does it affect the behavior of a repetition modifier nested inside the preceding capture? No, it doesn't.

Should I also point out the inefficiency of your original pattern? In order to figure out the last e, it still has to scan the entire tail that follows that last e.

@zhuangliangji1992 (Author) commented Jun 21, 2019

So the behavior of +? differs between a preceding "repetition of a constant pattern" and a preceding "nested repetition modifier".

Of course, I know my original pattern is inefficient. I want to find the first e, and I'm just wondering why the results of (.*)+?e and .+?e differ in my simple case.

Thank you very much!

@av86743 commented Jun 21, 2019

So the behavior of +? differs between a preceding "repetition of a constant pattern" and a preceding "nested repetition modifier".

I wouldn't say the behavior of +? is different in your two cases. For (.*)+?, you expect that the ? would also force .* to take the shortest match, but it does not. On the contrary, .* takes the longest match, as it should, because it is not written as .*?. The ? only affects the + in the preceding (.*)+ or .+.
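A small comparison may help here. This is only a sketch: the helper overall and the constant sample are hypothetical names, and the pattern (?s)(.*?)+?e is added for contrast rather than taken from the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample is the input string from the original report.
const sample = "abbbb\ncd\nefg\nhepq"

// overall (hypothetical helper) returns the whole match of pattern in s.
func overall(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

func main() {
	patterns := []string{
		`(?s)(.*)+e`,   // greedy inner, greedy outer: "abbbb\ncd\nefg\nhe"
		`(?s)(.*)+?e`,  // greedy inner, lazy outer:   "abbbb\ncd\nefg\nhe"
		`(?s)(.*?)+?e`, // lazy inner, lazy outer:     "abbbb\ncd\ne"
		`(?s).+?e`,     // no nesting, lazy +:         "abbbb\ncd\ne"
	}
	for _, p := range patterns {
		fmt.Printf("%-14s %q\n", p, overall(p, sample))
	}
}
```

The first two lines print the same match, which suggests that making only the outer + lazy does not change how far the inner .* reaches on this input.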

@katiehockman (Member) commented Jun 21, 2019

For future questions about Go, see https://golang.org/wiki/Questions.

@katiehockman changed the title from "Prefer fewer doesn't work with (.*)+?" to "src/regexp: prefer fewer doesn't work with (.*)+?" on Jun 21, 2019
@dmitshur changed the title from "src/regexp: prefer fewer doesn't work with (.*)+?" to "regexp: prefer fewer doesn't work with (.*)+?" on Jul 19, 2019

@andybons (Member) commented Aug 12, 2019

Hi there,
We have decided that our experiment to allow questions on the issue tracker has not had the outcome we desired, so I am closing this issue. I'm sorry that we can't answer your question here.

There are many other methods to get help if you're still curious:

Thanks

@andybons andybons closed this Aug 12, 2019
@golang golang locked and limited conversation to collaborators Aug 11, 2020
5 participants