CFamilyLexer: support unicode identifiers #1848

Merged
merged 1 commit into from
Oct 15, 2021

@amitkummer amitkummer commented Jun 26, 2021

Fixes #998. The new identifier regex uses \w at the start, so on its own it would also match numbers (12, for example). However, we only use it for Names after everything else (numbers, etc.) has already been matched, so it should be fine.
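
To illustrate the ordering argument, here is a minimal sketch that uses only Python's `re` module rather than Pygments itself. The rule table, the `tokenize` helper, and the simplified patterns are assumptions made for illustration; they are not the actual CFamilyLexer rules.

```python
import re

# Hypothetical, simplified rule table: earlier rules win, as in a
# first-match-wins lexer. These are not the real CFamilyLexer patterns.
RULES = [
    ("Number", re.compile(r"\d+")),
    ("Name", re.compile(r"[\w$]+")),
    ("Whitespace", re.compile(r"\s+")),
]


def tokenize(text):
    """Return (token_type, value) pairs, trying RULES in order at each position."""
    pos = 0
    tokens = []
    while pos < len(text):
        for token_type, pattern in RULES:
            match = pattern.match(text, pos)
            if match:
                tokens.append((token_type, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: emit the single character as an error token.
            tokens.append(("Error", text[pos]))
            pos += 1
    return tokens


print(tokenize("12 größe"))
```

Because the Number rule is tried first, `12` never reaches the Name rule, even though `[\w$]+` alone would match it.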

@amitkummer
Contributor Author

I also wanted to add this change to the CHANGES file, but wasn't sure if I should touch it. For future PRs, is it ok to do so?

@amitkummer
Contributor Author

Can I help with moving this forward in light of #1916?

Right now identifiers must start with a character matching [a-zA-Z_$]; this PR aims to match the entire identifier with \w.
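
A rough comparison of the two approaches, using plain `re`. `OLD_IDENT` and `NEW_IDENT` are illustrative patterns assumed from the description above (the continuation part of the old pattern in particular is a guess), not copies of the lexer source.

```python
import re

# Assumed shapes of the patterns, for illustration only.
OLD_IDENT = re.compile(r"[a-zA-Z_$][\w$]*")  # ASCII-only first character
NEW_IDENT = re.compile(r"[\w$]+")            # any word character or $ throughout

for candidate in ["count", "_tmp", "$var", "über", "переменная"]:
    old_ok = OLD_IDENT.fullmatch(candidate) is not None
    new_ok = NEW_IDENT.fullmatch(candidate) is not None
    print(f"{candidate!r}: old={old_ok} new={new_ok}")
```

Identifiers that begin with a non-ASCII letter, such as `über`, fail the old pattern at the first character and are accepted by the new one.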

@Anteru Anteru self-assigned this Oct 15, 2021
@Anteru Anteru added the A-lexing area: changes to individual lexers label Oct 15, 2021
@Anteru
Collaborator

Anteru commented Oct 15, 2021

Sorry, missed that completely. That's removing _ from the expression; is _ contained in \w? Otherwise this seems fine to me, I'm just surprised it's [\w$] and not [\w_$].

@Anteru
Collaborator

Anteru commented Oct 15, 2021

Ah, yes, indeed \w includes _. Merged, thanks!
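
A quick check with Python's `re` module bears this out, and also suggests why `$` still has to be listed explicitly. `WORD_CHAR` is a name introduced here purely for illustration.

```python
import re

WORD_CHAR = re.compile(r"\w")

# The underscore is a word character, so [\w$] already covers it.
print(WORD_CHAR.fullmatch("_") is not None)

# The dollar sign is not a word character, hence the explicit $ in the class.
print(WORD_CHAR.fullmatch("$") is not None)
```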

@Anteru Anteru linked an issue Oct 15, 2021 that may be closed by this pull request
@Anteru Anteru merged commit 94aba94 into pygments:master Oct 15, 2021
@Anteru
Collaborator

Anteru commented Oct 15, 2021

As to CHANGES: Feel free to edit them, but there's no requirement as I go through the changelog before a release anyways.

@Anteru Anteru added this to the 2.11.0 milestone Oct 15, 2021
@Anteru Anteru added the changelog-update Items which need to get mentioned in the changelog label Oct 15, 2021
@Anteru Anteru removed the changelog-update Items which need to get mentioned in the changelog label Nov 13, 2021
Labels
A-lexing area: changes to individual lexers
Development

Successfully merging this pull request may close these issues.

Accept Unicode identifiers, possibly XID, for C and C++?
C++ Unicode identifiers
2 participants