Case insensitive literal not working with backreferences #216

mingodad · 2022-06-20T13:23:40Z

See discussion here and the examples tested on cpp-peglib playground.

yhirose · 2022-06-20T13:30:02Z

@mingodad, could you put the smallest possible PEG grammar here, so that I can reproduce it on my machine easily? Thanks!

ChrisHixon · 2022-06-20T13:54:07Z

I'm seeing various corruption in the error message with this grammar on the playground:

ROOT          <- CONTENT !.
CONTENT       <- (ELEMENT / TEXT)*
ELEMENT       <- $(STAG CONTENT ETAG)
STAG          <- '<'  < $tag<TAGNAME> > '>'
ETAG          <- '</' < $tag > '>'
TAGNAME <- 'a' / 'b'i
TEXT          <- (![<] .)+

Input: <a>foo</A>

On Firefox, the error I'm currently seeing with the above grammar/input:

1:9 syntax error, unexpected 'A', expecting 'd tota�� % success fail definition 13 4 '.

It seems more likely to happen if i is added to the literals in TAGNAME, but I've seen corruption in simpler cases. Minor edits to TAGNAME change the corruption, even things like altering the number of spaces. I see corruption in both Chromium and Firefox, even after refreshing, clearing cookies and local data, etc.

The command-line lint always seems to show what I believe is the proper error (across lots of variations of TAGNAME): 1:9: syntax error, unexpected 'A', expecting 'a'.

I'll see if I can narrow it down to simpler grammar any...

ChrisHixon · 2022-06-20T14:37:28Z

This is about as simple as I can get it and still see consistent corruption:

ROOT          <- CONTENT !.
CONTENT       <- (ELEMENT / TEXT)*
ELEMENT       <- $(STAG CONTENT ETAG)
STAG          <- '<'  < $tag<"a"> > '>'
ETAG          <- '</' < $tag > '>'
TEXT          <- (![<] .)+

Input: <a>foo</A>
Most of the time the error is: 1:9 syntax error, unexpected 'A', expecting 'd '.
Occasionally: 1:9 syntax error, unexpected 'A', expecting 's) i'.

yhirose · 2022-06-25T11:32:49Z

@ChrisHixon, thanks for the problem report. I fixed it at 3c2a53c.

yhirose · 2022-06-25T11:51:14Z

@mingodad, I would like to make sure I understand what you are mentioning here.

The current cpp-peglib backreference behavior is an 'exact match' against the captured string, the same as a backreference in a regular expression.

If your suggestion says this example should succeed, I am not sure if it's correct. Could you explain more clearly?
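
For readers following along, the regex analogy can be sketched in Python. This is not cpp-peglib code; the pattern and the helper name tags_match are made up for illustration. The capture group accepts either case, but the backreference only accepts the exact text that was captured:

```python
import re

# The group accepts 'a' or 'A'; \1 must then repeat exactly what was captured.
TAG_PAIR = re.compile(r"<([aA])>foo</\1>")

def tags_match(text: str) -> bool:
    """Return True if the closing tag repeats the opening tag exactly."""
    return TAG_PAIR.fullmatch(text) is not None

print(tags_match("<a>foo</a>"))  # same case on both sides
print(tags_match("<A>foo</A>"))  # same case on both sides
print(tags_match("<a>foo</A>"))  # case differs between open and close
```

The first two calls succeed and the third fails, which mirrors the parse failure on the input <a>foo</A> reported above.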

mingodad · 2022-06-25T11:55:20Z

After you showed it with a regex, I can see your point.
Also, on the same topic, it would be nice to have case-insensitive character classes, e.g. [a-z]i, for grammars where identifiers are case insensitive (SQL, Pascal, ...).

mingodad · 2022-06-25T12:14:20Z

Here is an example on the peggy playground https://peggyjs.org/online.html (also implemented here: https://github.com/mingodad/peg):

start = name_char+ 
name_char =
	 [a-z0-9$_]i* [ \t\n]

Input:

one
Two
One
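
As a rough illustration of what the [...]i suffix is meant to do, here is a Python sketch that approximates the grammar above with a regex. It is only an approximation (regex backtracking differs from PEG semantics in general), the helper name accepts is hypothetical, and it assumes the input ends with a newline, which the example above does not state:

```python
import re

# Approximates: start = name_char+ ; name_char = [a-z0-9$_]i* [ \t\n]
NAME_CHARS = r"(?:[a-z0-9$_]*[ \t\n])+"

def accepts(text: str, ignore_case: bool) -> bool:
    """Check whether the whole input matches, with or without case folding."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(NAME_CHARS, text, flags) is not None

sample = "one\nTwo\nOne\n"  # assumption: trailing newline after the last line

print(accepts(sample, ignore_case=True))   # class behaves like [a-z0-9$_]i
print(accepts(sample, ignore_case=False))  # plain class rejects 'T' and 'O'
```

With case folding the whole sample is accepted; without it the match stops at the first uppercase letter.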

yhirose · 2022-06-25T12:21:01Z

@mingodad, thanks for the response. I'll close this issue. Could you open a separate issue for the [...]i operator?

mingodad mentioned this issue Jun 20, 2022

Case insensitive literals ChrisHixon/chpeg#4

Open

yhirose added the bug label Jun 25, 2022

yhirose removed the bug label Jun 25, 2022

yhirose closed this as completed Jun 25, 2022
