csearch sometimes fails on character classes containing only the same letter in upper and lowercase #8

Closed
GoogleCodeExporter opened this issue Mar 16, 2015 · 4 comments

Comments

@GoogleCodeExporter

What steps will reproduce the problem?
laptop$ cat userids.txt 
dgryski
laptop$ cindex .
2012/01/24 14:48:22 index /[XXXXXX]/
2012/01/24 14:48:22 flush index
2012/01/24 14:48:22 merge 0 files + mem
2012/01/24 14:48:22 8 data bytes, 237 index bytes
2012/01/24 14:48:22 done
laptop$ csearch '[g]r'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Hg]r'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Gg]r'
laptop$ 
laptop$ csearch 'g[Rr]'
/[XXXXX]/userids.txt:dgryski
laptop$ csearch '[Dd]g'
laptop$ csearch '[ZDd]g'
/[XXXXX]/userids.txt:dgryski
laptop$

What is the expected output? What do you see instead?
I expect 'dgryski' to be printed for every query above, but depending on the
regex, no lines are found (for example, '[Gg]r' and '[Dd]g').
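
As a reference for the expected behavior, here is a minimal sketch that runs the same six patterns through Go's standard regexp package. This is an assumption made for illustration: it is not csearch's own matcher and does not touch the index, and the helper name matchesLine is hypothetical. It only shows what a conventional regex engine reports for the line 'dgryski'.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesLine reports whether pattern matches anywhere in line, using Go's
// standard regexp package as a reference (not csearch's matcher).
func matchesLine(pattern, line string) bool {
	return regexp.MustCompile(pattern).MatchString(line)
}

func main() {
	line := "dgryski"
	// The six patterns from the session above.
	patterns := []string{"[g]r", "[Hg]r", "[Gg]r", "g[Rr]", "[Dd]g", "[ZDd]g"}
	for _, p := range patterns {
		fmt.Printf("%-8s %v\n", p, matchesLine(p, line))
	}
}
```

Every pattern should report a match here, which is what csearch is expected to do as well.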

What version of the product are you using? On what operating system?
070ef10ab799 tip.  Darwin 10.8.0

Please provide any additional information below.

Original issue reported on code.google.com by dgryski on 24 Jan 2012 at 1:59

@GoogleCodeExporter

Actually, it looks like it fails when the two-letter character class is the
first item in the regex. With a literal in front of the class, the match is
found:

laptop$ csearch 'd[gG]r'
/[XXXXX]/userids.txt:dgryski

Original comment by dgryski on 24 Jan 2012 at 2:09

@GoogleCodeExporter

I did a bit of investigation last night, and the problem appears to be in
match.go:stepByte(). If we need to fold case for a particular instruction, we
uppercase the character 'c', but it is not reset to the lowercase version when
we move on to processing the next state. At that point 'c' is still the
modified uppercase version instead of the original lowercase character that is
actually in the string.

I've attached a patch to match.go to fix this, and two test cases to 
regexp_test.go.
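
The attached patch is not reproduced here. The following is only a minimal sketch of the kind of bug described above, under stated assumptions: the inst type and the functions stepBuggy and stepFixed are hypothetical simplifications for illustration and are not the actual code in match.go. The buggy version folds the input byte in place, so every later instruction in the same step sees the uppercased byte; the fixed version folds a per-instruction copy.

```go
package main

import "fmt"

// inst is a hypothetical, simplified byte-range instruction.
// It is not the real instruction type used by match.go.
type inst struct {
	lo, hi byte
	fold   bool // compare using the uppercase form of a lowercase input byte
}

func (i inst) matches(c byte) bool { return i.lo <= c && c <= i.hi }

// stepBuggy overwrites c when folding, so instructions processed later in
// the same step see the uppercased byte instead of the byte from the input.
func stepBuggy(insts []inst, c byte) []bool {
	out := make([]bool, len(insts))
	for k, i := range insts {
		if i.fold && 'a' <= c && c <= 'z' {
			c -= 'a' - 'A' // bug: modifies the shared input byte
		}
		out[k] = i.matches(c)
	}
	return out
}

// stepFixed folds into a local copy, leaving c untouched for the next
// instruction.
func stepFixed(insts []inst, c byte) []bool {
	out := make([]bool, len(insts))
	for k, i := range insts {
		ch := c
		if i.fold && 'a' <= ch && ch <= 'z' {
			ch -= 'a' - 'A'
		}
		out[k] = i.matches(ch)
	}
	return out
}

func main() {
	// Two instructions in one step: a folded range for 'G', then a plain
	// range for 'g'.
	insts := []inst{{'G', 'G', true}, {'g', 'g', false}}
	fmt.Println(stepBuggy(insts, 'g')) // [true false]
	fmt.Println(stepFixed(insts, 'g')) // [true true]
}
```

In the buggy version the second instruction fails only because an earlier folded instruction ran first, which is consistent with the order-dependent failures in the session above.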

Original comment by dgryski on 27 Jan 2012 at 8:51

Attachments:

@GoogleCodeExporter

This issue was closed by revision 5cd8d184e954.

Original comment by dgryski on 2 May 2012 at 8:05

  • Changed state: Fixed

@GoogleCodeExporter

Issue 19 has been merged into this issue.

Original comment by dgryski on 2 May 2012 at 8:54
