yypushback() behaviour on surrogate characters #215

Closed
hurricup opened this issue Apr 13, 2017 · 4 comments

@hurricup

Are there any plans to address what I'd call an inconsistency: [^] can match a two-char UTF-16 surrogate pair (an emoji, for example), but yypushback(1) pushes back by only one char, leaving the position in the middle of the pair.

Or is this intentional behavior? In that case, it would be nice to mention it in the documentation.
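
To make the mismatch concrete, here is a minimal sketch in plain Java, independent of JFlex. The class name `SurrogateDemo` and the helper `splitsSurrogatePair` are hypothetical names used only for illustration. The helper models "back up n chars from the end of the match" with simple index arithmetic, which is an assumption about what a char-based pushback amounts to, not a copy of the generated scanner code.

```java
public class SurrogateDemo {
    // Hypothetical helper: reports whether backing up `n` chars from index `end`
    // would leave the position between a high and a low surrogate.
    static boolean splitsSurrogatePair(CharSequence text, int end, int n) {
        int pos = end - n;
        return pos > 0 && pos < text.length()
            && Character.isHighSurrogate(text.charAt(pos - 1))
            && Character.isLowSurrogate(text.charAt(pos));
    }

    public static void main(String[] args) {
        // U+1F600 is one code point but two Java chars (a surrogate pair).
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println("chars: " + emoji.length());
        System.out.println("code points: " + emoji.codePointCount(0, emoji.length()));
        // Backing up a single char from the end lands inside the pair.
        System.out.println("split: " + splitsSurrogatePair(emoji, emoji.length(), 1));
    }
}
```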

@lsf37
Member

lsf37 commented Nov 3, 2017

Sorry, somehow the notification mechanism seems to have failed, and I only saw this just now.

Interesting question. I'm not sure what the semantics should be, i.e. whether yypushback should refer to Java chars or full Unicode code points. Java char is certainly easier in the implementation.

Does yypushback(yylength()) do the correct thing for [^]?

@sarowe do you have an opinion on this?

@sarowe
Contributor

sarowe commented Nov 3, 2017

I don't think it's unreasonable to expect Java programmers to know about the char/code point duality, so I think it's reasonable for yypushback(int) to deal strictly with chars. Maybe we could add a yypushback_codepoints(int) (or something like that, but better named)?
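
Until something like that exists, a code-point count can be converted to a char count on the user side. The following is a sketch only: `CodePointPushback` and `charsForCodePoints` are hypothetical names, not part of JFlex. It relies on the standard `Character.offsetByCodePoints`, which throws `IndexOutOfBoundsException` if the text holds fewer code points than requested.

```java
public class CodePointPushback {
    // Hypothetical helper: number of Java chars occupied by the last
    // `codePoints` code points of `matched`.
    static int charsForCodePoints(CharSequence matched, int codePoints) {
        int end = matched.length();
        // A negative offset walks backwards over whole code points.
        int start = Character.offsetByCodePoints(matched, end, -codePoints);
        return end - start;
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600: three chars, two code points.
        String text = "a" + new String(Character.toChars(0x1F600));
        System.out.println(charsForCodePoints(text, 1));
        System.out.println(charsForCodePoints(text, 2));
    }
}
```

Inside a lexer action, the result could presumably be passed straight to the existing method, e.g. `yypushback(charsForCodePoints(yytext(), 1));`, so the pushback never stops inside a surrogate pair.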

@lsf37
Member

lsf37 commented Nov 3, 2017

Ok, I agree. I guess we mainly should document the current behaviour more clearly.

lsf37 self-assigned this Nov 3, 2017
lsf37 added this to the release 1.7.0 milestone Nov 3, 2017
lsf37 added a commit that referenced this issue Nov 3, 2017
@lsf37
Member

lsf37 commented Nov 3, 2017

Have now mentioned this behaviour in the docs, as requested, in 846d20c

lsf37 closed this as completed Nov 5, 2017