Fortran. Character variables ending in backslash cause highlighting issues. #1508
Comments
@ecasglez, isn't this invalid Fortran code? A quick Google search for "fortran escape character" turns up a result on Oracle's documentation for Fortran 77. It states that a
PROGRAM testslash
IMPLICIT NONE
CHARACTER(LEN=:),ALLOCATABLE :: pathname
pathname = 'C:\\users\\user\\testslash\\'
WRITE(*,*) pathname
END PROGRAM testslash
I don't yet see that this is a bug in the Fortran lexer.
I would say this is valid Fortran, or at least it compiles without any warnings using gfortran. I think the default treatment of backslashes is compiler-dependent in Fortran. Having a look at the gfortran documentation here you can see there are two options called '
I understand from the link you provided that the behavior in the Oracle compiler might be the opposite. By default they are used as escape characters and you need to compile with option
Here you can see an issue on Doxygen related to strings ending in backslash too.
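To make the two interpretations concrete, here is a minimal sketch using only Python's standard `re` module. The two patterns and the helper `first_string` are hypothetical and written for illustration; they are not the rules used by the Pygments Fortran lexer. One pattern treats the backslash as an ordinary character (a quote is escaped by doubling it), the other treats the backslash as a C-style escape.

```python
import re

# Hypothetical patterns for illustration only; NOT the Pygments lexer rules.
# Backslash is an ordinary character; a quote is escaped by doubling it.
LITERAL_BACKSLASH = re.compile(r"'(?:''|[^'\n])*'")
# C-style escaping: a backslash escapes whatever character follows it.
ESCAPING_BACKSLASH = re.compile(r"'(?:\\.|[^'\\\n])*'")


def first_string(pattern, line):
    """Return the first string literal found in `line`, or None."""
    match = pattern.search(line)
    return match.group(0) if match else None


line = r"pathname = 'C:\users\user\testslash\'"

# With literal backslashes the whole path is one closed string.
print(first_string(LITERAL_BACKSLASH, line))
# With escaping backslashes the final \' swallows the closing quote,
# so no closed string is found on this line.
print(first_string(ESCAPING_BACKSLASH, line))
```

Under the escaping interpretation the string never closes on that line, which would be consistent with the following lines being highlighted as part of a string. Which interpretation is right depends on the compiler and its flags, so a highlighter cannot know without being told.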
I don't see how we can solve this reliably in Pygments. I'd err on the side of caution and keep the current behavior for now.
I have the same problem. I'm using the Intel Fortran compiler, which distinguishes between standard strings and C-strings based on an optional C-string specifier. C-strings use escaping with the backslash character, while regular strings do not. I discovered an additional issue that seems to be related. The problem lies in
from pygments.lexers import FortranLexer
source = r"""
foobar = 'foo'//'\'//'bar'
!\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
"""
for token in FortranLexer().get_tokens(source):
    print(token)
The run time increases rapidly with each additional backslash added to the comment line. I admit this is an edge case, but I would definitely add a +1 for solving the issue brought up by the OP.
That sounds a lot like catastrophic backtracking in the string regexp. I’ll take a look when I’m back home in a few days. Regarding escaping vs normal backslashes, I don’t see any way to highlight correctly but to add a lexer option for code with escaping backslashes.
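For readers unfamiliar with the term, the following is a small standalone sketch of catastrophic backtracking using only the standard library. The patterns `AMBIGUOUS` and `UNAMBIGUOUS` are made up for this demonstration and are an assumption, not the actual string regexp in `fortran.py`. The point is only that a nested quantifier over backslashes lets a backtracking engine try exponentially many splits before failing.

```python
import re
import time

# Illustrative patterns only; NOT the regexp from the Pygments Fortran lexer.
# Nested quantifiers: a run of backslashes can be split between the inner
# and outer "+" in exponentially many ways.
AMBIGUOUS = re.compile(r"^(?:\\+)+$")
# Accepts the same inputs, but there is only one way to match a run.
UNAMBIGUOUS = re.compile(r"^\\+$")


def time_failed_match(pattern, text):
    """Return (match result, elapsed seconds) for one match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


for n in (10, 14, 18):
    text = "\\" * n + "!"  # the trailing "!" forces the match to fail
    _, slow = time_failed_match(AMBIGUOUS, text)
    _, fast = time_failed_match(UNAMBIGUOUS, text)
    print(f"{n:2d} backslashes: ambiguous {slow:.5f}s, unambiguous {fast:.6f}s")
```

The run lengths are kept small on purpose, because the time for the ambiguous pattern grows roughly exponentially with the number of backslashes. The usual fix is to rewrite the pattern so that its alternatives cannot match the same text in more than one way.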
@mbraakhekke, you've discovered a catastrophic backtracking bug.
pygments/pygments/lexers/fortran.py Lines 159 to 162 in 64e8e05
Would you open a new ticket that includes the sample code to reproduce the problem, and reference this issue as well? I'm interested in addressing this but likely can't jump into it immediately due to existing obligations. Edit: Or @jean-abou-samra may address this before me! 🥳
Sure, I'll open a new ticket. But note that there's a clear link with the backslash problem. If I use a double slash instead of a single in my example, the catastrophic backtracking problem is gone.
I have used the demo section of your website to produce the following examples. It also happens in a local installation with version 2.6.1.
I have some variables containing paths in Fortran. If the variables end in any character other than '\', everything works fine, as in the following figure.
However, if the variables end in '\', the highlighting is wrong, as in the following figure. You can see that line 5, containing WRITE(*,*) pathname, is in red, and in line 6 pathname = is also in red, and they shouldn't be. In addition, in line 6 there are some frames around the backslashes.
If there is only one line with a variable ending in '\', as in the following figure, line 5 (with the WRITE statement) is now OK, but there are the frames around the backslashes.
I am attaching the code used as examples to this issue.
IssueBackslash.zip