Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix warnings and small mistakes in several lexers #1679

Closed
wants to merge 10 commits into from

Conversation

brodock
Copy link

@brodock brodock commented Feb 13, 2021

When running tests for a gem that depends on rouge I've noticed that there are a bunch of relatively easy to fix warnings popping up during the test execution.

I've split into separate commits each warning / fix per file required and added an explanation on each commit message of what was the problem.

When evaluating each regular expression issue, I've used Ruby 2.0.0 Regexp documentation as base.


To reproduce the warnings above, I've run specs with -W2 option enabled:

RUBYOPT='-W2' rake check:specs

Occurrences:

* lib/rouge/lexers/clean.rb:82: warning: character class has duplicated range: /_*[a-z][\w_`]*/
* lib/rouge/lexers/clean.rb:90: warning: character class has duplicated range: /_*[A-Z][\w_`]*/
* lib/rouge/lexers/clean.rb:98: warning: character class has duplicated range: /[^\w_\s`]/
* lib/rouge/lexers/clean.rb:139: warning: character class has duplicated range: /[\w_]+/

Reason:

* `\w` already includes `_` so specifying both is unnecessary
Occurrences:

* lib/rouge/lexers/ecl.rb:117: warning: character class has duplicated range: /[:=|>|<|<>|\/|\\|\+|-|=]/
* lib/rouge/lexers/ecl.rb:118: warning: character class has duplicated range: /[\[\]{}();,\&,\.,\%]/

Reason:

* `[]` used when not desired (also precedence order was incorrect so `<>` would never match)
* `,` used multiple times inside the same `[]`
Occurrences:

* lib/rouge/lexers/gdscript.rb:31: warning: assigned but unused variable - builtins
Occurrences:

* lib/rouge/lexers/ghc_cmm.rb:25: warning: character class has duplicated range: /(?!#[a-zA-Z])[\w#\$%_']+/
* lib/rouge/lexers/ghc_cmm.rb:30: warning: character class has duplicated range

Reason:

* `\w` already includes `_` so specifying both is unnecessary
Occurrences:

* lib/rouge/lexers/sql.rb:118: warning: character class has duplicated range: /\w[\w\d]*/

Reason:

* `\w` already includes `\d` so specifying both is unnecessary
* `\w` initial token was defined incorrectly (should not start with a number)
Occurrences:

* lib/rouge/lexers/ada.rb:158: warning: character class has duplicated range: /\b(\p{Pc}|[[alpha]])\p{Word}*/

Reason:

* `[[alpha]]` was missing semicolons and therefore not being evaluated as a POSIX bracket expression
Occurrences:

* lib/rouge/lexers/kotlin.rb:27: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:29: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:30: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:56: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:87: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:88: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:93: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:94: warning: character class has duplicated range
* lib/rouge/lexers/kotlin.rb:109: warning: character class has duplicated range

Reason:

* `\p{Pc}` already include `_` so specifying both is unnecessary
Occurrences:

* lib/rouge/lexers/objective_c/common.rb:7: warning: assigned but unused variable - id

Reason:

* content on line 7 was duplicated on line 57
Occurrences:

* lib/rouge/lexers/solidity.rb:14: warning: character class has duplicated range: /[a-zA-Z$_][\w$_]*/

Reason:

* `\w` already includes `_` so specifying both is unnecessary
Occurrences:

* rouge/lib/rouge/lexers/yang.rb:13: warning: character class has duplicated range: /[\w_-]+(?=[^\w\-\:])\b/

Reason:

* `\w` already includes `_` so specifying both is unnecessary
Copy link
Collaborator

@dblessing dblessing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @brodock. I only have one comment.

I ran the check command you referenced and all I can say is 😓 There are a few. But thanks for taking care of some of them.

@@ -115,7 +115,7 @@ def self.keywords_type
rule %r/"/, Name::Variable, :double_string
rule %r/`/, Name::Variable, :backtick

rule %r/\w[\w\d]*/ do |m|
rule %r/[[:alpha:]][\w]*/ do |m|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We lose the digit piece here. Did you decide that's unnecessary since keyword_types don't include any digits? Similarly, :alpha: has slightly different meaning that \w - why the change? I actually wonder why we need both pieces anyway. Isn't %r/\w+/ equivalent? I'm slightly hesitant, though, as maybe there's something I'm not considering. Appreciate your insight.

@tancnle tancnle added the author-action The PR has been reviewed but action by the author is needed label Sep 22, 2021
@svoop
Copy link

svoop commented Mar 7, 2024

@brodock @dblessing Ping! Any chance to get this rebased and rolled out? There are so many warnings due to rouge now that it's basically down to either removing rouge (or tty-markdown pulling it in my case, ouch) or running tests without warnings which is not a good idea neither. Thanks a lot for your work on this!!

@tancnle
Copy link
Collaborator

tancnle commented Mar 18, 2024

@brodock @dblessing @svoop This has been resolved by #2030.

Thanks a lot for your effort here @brodock 🙇🏼

@tancnle tancnle closed this Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author-action The PR has been reviewed but action by the author is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants