Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add apdl and gcode lexers #1714

Merged
merged 18 commits into from
Mar 14, 2021
Merged

Add apdl and gcode lexers #1714

merged 18 commits into from
Mar 14, 2021

Conversation

averter
Copy link
Contributor

@averter averter commented Feb 13, 2021

Hey. This is a pull request to add an APDL and gcode lexers

Copy link
Collaborator

@Anteru Anteru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Thanks, this seems to be missing a few things:

  • Copyright blob
  • Comments for the individual lexers
  • Example files for testing
  • A few changes to the regex rules

'root': [
(r'!.*\n', Comment),
include('strings'),
("(" + "|".join(elafunb) + r')\b', Name.Class),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use the words function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Can you give me more information? I am unsure on how to add/use this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following these instructions I've joined all the keywords lists with words, but I am unsure if I did it the right way. I am still a newbie at Python and I did this quite a long time ago, so I can't remember what I wrote very well... Thanks for any help

@@ -0,0 +1,211 @@
import re
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing copyright header

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Changed



class apdlexer(RegexLexer):
name = 'ANSYS parametric design language'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add documentation for this lexer here as a docstring, including a link to the language (if possible) and the versionadded macro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Changed.

@averter
Copy link
Contributor Author

averter commented Mar 3, 2021

I have also added two example testing files in the main directory. I hope that's where they had to be placed.

@averter
Copy link
Contributor Author

averter commented Mar 3, 2021

Removed the example files and put them in the right place. Now all should be good to go.

@Anteru
Copy link
Collaborator

Anteru commented Mar 5, 2021

Can you please run make mapfiles? This should fix the failing tests.

@averter
Copy link
Contributor Author

averter commented Mar 5, 2021

I managed to use the words function correctly now. I've tested the lexer on the two test files successfully and I have created a new commit with the mapfiles. There are still two tests failing though...

@Anteru
Copy link
Collaborator

Anteru commented Mar 5, 2021

Well, you have an invalid escape sequence: invalid escape sequence \*. Please double-check your rules. This is related to tokens like \*VOPER and I'm not sure what you're trying to match there in the first place.

@Anteru
Copy link
Collaborator

Anteru commented Mar 5, 2021

Please generate golden test outputs (you can see the command line in the test output.)

@averter
Copy link
Contributor Author

averter commented Mar 6, 2021

Please generate golden test outputs (you can see the command line in the test output.)

This is the error message that I am receiving. It seems unrelated to the lexer itself.

ImportError while loading conftest '/home/myusername/pygments/tests/conftest.py'.
tests/conftest.py:23: in <module>
    import pygments.lexers
E     File "/home/myusername/pygments/pygments/lexers/__init__.py", line 237
E       yield from find_plugin_lexers()
E                ^
E   SyntaxError: invalid syntax

@Anteru
Copy link
Collaborator

Anteru commented Mar 7, 2021

Are you using Python3.5 or later? This looks like you're using an outdated version of Python which doesn't know yield from.

@Anteru
Copy link
Collaborator

Anteru commented Mar 7, 2021

Thanks, that's looking better. Can you please do the same for gcode? I.e. add an example file and golden output?

@averter
Copy link
Contributor Author

averter commented Mar 7, 2021

Thank you for your patience. I am not overly happy with the gcode lexer, but I suppose that it can/will be improved by other contributors in the long run.

@Anteru
Copy link
Collaborator

Anteru commented Mar 7, 2021

Thanks for the quick update. Those gcode files are quite massive and highly repetitive, could you please trim them? I suspect 20-30 lines is enough because it seems to be all the same.

@averter
Copy link
Contributor Author

averter commented Mar 8, 2021

Thank you. It is true that gcode files tend to be quite large, and I agree that this is unnecessary for testing purposes. The latest commit should amend this.

@Anteru
Copy link
Collaborator

Anteru commented Mar 8, 2021

Thanks! Looks like some of your regex can be simplified (see the lint) failure, could you please take a look? There are a few with duplicated values in a class. I.e. [66788] = [678], assuming that was intented in the first place :)

@averter
Copy link
Contributor Author

averter commented Mar 14, 2021

Unfortunately I cannot make sense of that output. I am not familiar with regexlint and the output seems unintuitive. What is [66788] = [678] ? If regexlint pointed to line numbers in the apdlexer.py file I could , in principle, figure out what are the duplicated values. I would appreciate if you could give me some pointers. Thanks.

@Anteru
Copy link
Collaborator

Anteru commented Mar 14, 2021

Regexlint tells you which regular expression can be simplified. For instance, you can see this in the output:

/home/runner/work/pygments/pygments/pygments/lexers/apdlexer.py: (apdlexer:root:pat#5) W117: Overlap in character class: ['6', '8']
  ...SS|RUN|S(?:EL|TIF))|NCLUDE|PE(?:1(?:[66788])|2(?:0|8(?:[89]))|59|60)|VCH...
                                         ^ here

There's a class [66788] which will match the numbers 6, 7 and 8. There's no reason to repeat it, and [678] is equivalent (hence you get the error: "Overlap in character class: 6, 8".) Regexlint points you at that to simplify your regular expression (and maybe you made some mistake when you wrote [66788] in the first place?)

You can simply copy/paste the regex from regexlint and find it in the apdlexer.py file, regexlint doesn't do line numbers unfortunately.

@birkenfeld
Copy link
Member

birkenfeld commented Mar 14, 2021

You can simply copy/paste the regex from regexlint and find it in the apdlexer.py file, regexlint doesn't do line numbers unfortunately.

That's not quite true: if the regex is a literal regex in the source, it will point to the line number.

In this case, where the regex is compiled by the words() function, it doesn't find the line number (but it does tell you the index of the regex in the state).

In short, you have one or more duplicate words in your words() list.

@Anteru
Copy link
Collaborator

Anteru commented Mar 14, 2021

Oh, ok. Should regexlint print that out then? Something like "compiled regex" to indicate it's not a literal?

@birkenfeld
Copy link
Member

well it already says "(apdlexer:root:pat#5)" to indicate the position.

@averter
Copy link
Contributor Author

averter commented Mar 14, 2021

well it already says "(apdlexer:root:pat#5)" to indicate the position.

That makes sense. I was just getting confused by the numbers in the regexlint output. Thanks both.

@Anteru Anteru self-assigned this Mar 14, 2021
@Anteru Anteru added this to the 2.9 milestone Mar 14, 2021
@Anteru
Copy link
Collaborator

Anteru commented Mar 14, 2021

Merged, thanks!

@Anteru Anteru merged commit b872095 into pygments:master Mar 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants