C++ Lexer ErrorToken: \ # ' #2207
Comments
Looking into this.
It seems to work if you wrap the snippets in a function. For example, this is lexed okay:
int main() {
const QString EMAIL_RE_STR(
// Regex for an e-mail address.
// From colander.__init__.py, in turn from
// https://html.spec.whatwg.org/multipage/input.html#e-mail-state-(type=email)
// Note that C++ raw strings start R"( and end )"
R"(^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9])"
R"((?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9])"
R"((?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$)"
);
}
AFAICS, the problem is that without the enclosing function, we try to parse the statement as a function definition/declaration. Of course, the code given is not valid as a standalone program, but in documentation, it makes sense to give examples without context (e.g. without a
it is tricky to know whether this is a function declaration or a declaration of a variable that is initialized by calling a constructor. Cf. “most vexing parse”. I'm not sure yet what's the best way to fix this. Maybe just this?
diff --git a/pygments/lexers/c_cpp.py b/pygments/lexers/c_cpp.py
index 5d3b9c7d..198d288d 100644
--- a/pygments/lexers/c_cpp.py
+++ b/pygments/lexers/c_cpp.py
@@ -132,9 +132,9 @@ class CFamilyLexer(RegexLexer):
r'(' + _possible_comments + r')' # possible comments
r'(' + _namespaced_ident + r')' # method name
r'(' + _possible_comments + r')' # possible comments
- r'(\([^;]*?\))' # signature
+ r'(\([^;"]*?\))' # signature
r'(' + _possible_comments + r')' # possible comments
- r'([^;{/]*)(\{)',
+ r'([^;{/"]*)(\{)',
bygroups(using(this), using(this, state='whitespace'), Name.Function, using(this, state='whitespace'),
using(this), using(this, state='whitespace'), using(this), Punctuation),
'function'),
@@ -143,9 +143,9 @@ class CFamilyLexer(RegexLexer):
r'(' + _possible_comments + r')' # possible comments
r'(' + _namespaced_ident + r')' # method name
r'(' + _possible_comments + r')' # possible comments
- r'(\([^;]*?\))' # signature
+ r'(\([^;"]*?\))' # signature
r'(' + _possible_comments + r')' # possible comments
- r'([^;/]*)(;)',
+ r'([^;/"]*)(;)',
bygroups(using(this), using(this, state='whitespace'), Name.Function, using(this, state='whitespace'),
using(this), using(this, state='whitespace'), using(this), Punctuation)),
include('types'),
It's heuristic, and given that we're not going to reimplement a full-fledged C++ parser, which is complex like hell, it will always remain heuristic ...
(Actually, that is not possible for us, since parsing C++ requires knowing the contents of the include files.)
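To make the effect of the diff above concrete, here is a minimal sketch using only Python's `re` module. It is not the actual Pygments rule: `IDENT` and `make_pattern` are hypothetical names, the return type is reduced to a single identifier, and the comment groups are dropped. It only illustrates what excluding `"` from the signature and trailing character classes changes.

```python
import re

# Hypothetical, heavily simplified stand-in for the lexer's identifier regex.
IDENT = r'[a-zA-Z_]\w*'


def make_pattern(exclude_quotes, terminator):
    """Build a simplified 'function' pattern.

    terminator is '{' for a definition or ';' for a declaration.
    With exclude_quotes=True, a double quote may appear neither in the
    signature nor between the signature and the terminator.
    """
    quote = '"' if exclude_quotes else ''
    sig_chars = '[^;' + quote + ']'
    # Definitions also forbid '{' in the trailing part, as in the diff.
    brace = '{' if terminator == '{' else ''
    tail_chars = '[^;' + brace + '/' + quote + ']'
    return re.compile(
        r'(' + IDENT + r')\s+'               # return type (simplified)
        r'(' + IDENT + r')\s*'               # function name
        r'(\(' + sig_chars + r'*?\))'        # signature
        r'(' + tail_chars + r'*)'            # anything before the terminator
        r'(' + re.escape(terminator) + r')'
    )


old_def = make_pattern(exclude_quotes=False, terminator='{')
new_def = make_pattern(exclude_quotes=True, terminator='{')
old_decl = make_pattern(exclude_quotes=False, terminator=';')
new_decl = make_pattern(exclude_quotes=True, terminator=';')

# A variable initialized from a string literal that contains ") {".
vexing = 'id id2("){ ... }");'
# A declaration with a string default argument.
default_arg = 'int f(param="default");'

print(bool(old_def.match(vexing)))         # old rule: treated as a function
print(bool(new_def.match(vexing)))         # new rule: rejected
print(bool(old_decl.match(default_arg)))   # old rule: declaration recognized
print(bool(new_decl.match(default_arg)))   # new rule: no longer recognized
print(bool(new_def.match('int main() {'))) # ordinary definitions still match
```

In this sketch the old definition pattern stops the signature at the `)` inside the string literal and then finds `{`, which is the kind of misparse reported here. The stricter character classes reject that input, at the cost of also rejecting signatures that legitimately contain string literals.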
@jean-abou-samra @amitkummer Thanks for looking into this. In case it is of any help, the original code is at https://github.com/ucam-department-of-psychiatry/camcops/tree/master/tablet_qt
…ons and declarations Something like id id2("){ ... }"); is no longer wrongly recognized as a "function" id id2(") { ... } "); As the difference in the tests shows, this has the unfortunate side effect that we no longer highlight something like int f(param="default"); as a function declaration, but it is hard to imagine another way to fix this (cf. “most vexing parse” problem). Fixes pygments#2207
…ons and declarations (#2208) Something like id id2("){ ... }"); is no longer wrongly recognized as a "function" id id2(") { ... } "); As the difference in the tests shows, this has the unfortunate side effect that we no longer highlight something like int f(param="default"); as a function declaration, but it is hard to imagine another way to fix this (cf. “most vexing parse” problem). Fixes #2207
I've attached some files that are failing with the C++ Lexer. Most have been broken since c1a0d82; one has been broken since fc56ab8.
pygments_cpp_tests.tar.gz