Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid catastrophic backtracing in regex #1690

Merged
merged 2 commits into from Jun 7, 2021

Conversation

Ravlen
Copy link
Contributor

@Ravlen Ravlen commented Mar 10, 2021

@tancnle This PR attempts to fix some regex that could cause catastrophic backtracing, could you take a look?

For the ghc_core lexer:

  • Screen Shot 2021-03-10 at 15 57 39
  • Screen Shot 2021-03-10 at 15 58 13

For the factor lexer:

  • Screen Shot 2021-03-10 at 16 09 54
  • Screen Shot 2021-03-10 at 16 09 39

@@ -243,7 +243,7 @@ def self.builtins
end

# strings
rule %r/"""\s+.*?\s+"""/, Str
rule %r/"""[^"].*?[^"]"""/, Str
Copy link
Contributor Author

@Ravlen Ravlen Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think this is wrong, let me fix that. Though I'm finding it hard to find what is actually supported in the Factor language. Does it need spaces after the triple quotes to be valid? Can it be split across multiple lines like this?

Test in GitHub:

article "ARTICLES" {
    { "title" "TITLE" { VARCHAR 256 } +not-null+ +user-assigned-id+ }
    { """title""" "TITLE" { VARCHAR 256 } +not-null+ +user-assigned-id+ }
    { ""title"" "TITLE" { VARCHAR 256 } +not-null+ +user-assigned-id+ }
    { """ title """ "TITLE" { VARCHAR 256 } +not-null+ +user-assigned-id+ }
    { """title over two
      lines to test it""" "TITLE" { VARCHAR 256 } +not-null+ +user-assigned-id+ }
} define-persistent

: <article> ( title -- article ) article new swap >>title ;

TUPLE: revision id title author date content description ;

Copy link
Contributor Author

@Ravlen Ravlen Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@tancnle tancnle Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Looks like GitHub is parsing the code correctly'ish 🤔 The highlight for double quotes seems to be out of place but overall it looks good.

I was about to suggest lifting the regexes directly from Pygments. However, it seems Pygments 2.8.1 (latest) - the Python package that Rouge compatible with - does not handle triple quotes well (could be an existing bug).

Screen Shot 2021-03-11 at 8 17 00 pm

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to factor/factor#1474, it seems like triple quotes are not supported.

@tancnle
Copy link
Collaborator

tancnle commented Mar 15, 2021

@Ravlen What do you think about getting the regexes from Pygments?

diff --git lib/rouge/lexers/factor.rb lib/rouge/lexers/factor.rb
index 2a18f715..4a7458a5 100644
--- lib/rouge/lexers/factor.rb
+++ lib/rouge/lexers/factor.rb
@@ -243,8 +243,9 @@ module Rouge
         end
 
         # strings
-        rule %r/"""\s+.*?\s+"""/, Str
-        rule %r/"(\\.|[^\\])*?"/, Str
+        rule %r/"""\s(?:.|\n)*?\s"""/, Str
+        rule %r/"(?:\\\\|\\"|[^"])*"/, Str
+        rule %r/\S+"\s+(?:\\\\|\\"|[^"])*"/, Str
         rule %r/(CHAR:)(\s+)(\\[\\abfnrstv]*|\S)(?=\s)/, Str::Char
 
         # comments

@dblessing
Copy link
Collaborator

@tancnle I like this idea. They're vetted/tested.

Uses the same regex as pygments, and removes the
triple quote regex, which isn't actually supported
in factor
@Ravlen
Copy link
Contributor Author

Ravlen commented Mar 18, 2021

@tancnle @dblessing OK, updated with the regex from Pygments, but I dropped the """ string regex as I'm quite confident it isn't supported by Factor. See https://github.com/factor/factor/blob/master/core/strings/strings-tests.factor, and the comment Tan found at: factor/factor#1474 (comment)

@tancnle
Copy link
Collaborator

tancnle commented Mar 18, 2021

Travis CI seems to be problematic 🤔 @dblessing could you help to check what is going wrong with Travis integration?

@dblessing
Copy link
Collaborator

It looks like we've run out of build minutes due to Travis' new billing - https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing.

We either need to apply for OSS allotment, or potentially use GitLab builds. What do you all think @tancnle @Ravlen. I think GitLab only offers 400 free minutes, but maybe they'd be willing to sponsor a 'plan' for Rouge?

@tancnle
Copy link
Collaborator

tancnle commented Mar 19, 2021

@dblessing Would it be much hassle to apply for an OSS allotment? It provides 10k credits which are build minutes I believe.

I think BuildKite or CircleCI might be a better choice given they have no build minute limit for OSS. I am keen to give you a hand on the CI migration if needed.

@dblessing
Copy link
Collaborator

@tancnle I've applied for credits. But it's not 10,000 automatic - I had to request the amount I think we need. I asked if they had visibility into historical usage. But AFAICT we use about 10 minutes per build - 1 minute per suite and we test 10 Ruby versions. Probably just 500-1000 minutes will suffice.

@tancnle
Copy link
Collaborator

tancnle commented Mar 23, 2021

@dblessing I think moving to BuildKite or CircleCI with unlimited build minute will remove the need to recalculate the allotment as we scale our pipeline. But it might require some work.

we test 10 Ruby versions

That's a lot. Are we thinking about deprecating old Ruby versions (I.e. no ongoing supported for MRI < 2.5)?

We can get those rolling on separate issues in the meanwhile, can't we?

@dblessing
Copy link
Collaborator

dblessing commented Mar 23, 2021 via email

@tancnle
Copy link
Collaborator

tancnle commented Mar 23, 2021

No worries @dblessing. I am cool with staying with TravisCI and get this MR unblocked and merged. FYI I have created the issue to remove support for old Ruby version.

@Ravlen
Copy link
Contributor Author

Ravlen commented Mar 24, 2021

@dblessing @tancnle If we want to switch to GitLab, I'm reasonably confident I could write the pipeline configuration. There are 50k minutes available for open source projects, see https://about.gitlab.com/solutions/open-source/, if we want to look into it (though perhaps you'd need to move the whole project to GitLab to get the full 50k?)

@dblessing dblessing mentioned this pull request Apr 6, 2021
@dblessing
Copy link
Collaborator

I still haven't heard back from Travis so I pinged them again on the support issue.

@Ravlen I think you're right - we'd need to either move the whole project or see if they would grant us a Premium license. I think the best case is Travis comes through for us, but if not we can see about GitLab CI.

@tancnle
Copy link
Collaborator

tancnle commented May 19, 2021

@dblessing Any news from Travis? If not, I think we can start migrating to another CI solution.

@dblessing
Copy link
Collaborator

dblessing commented May 19, 2021 via email

@Ravlen Ravlen closed this May 20, 2021
@Ravlen Ravlen reopened this May 20, 2021
@Ravlen
Copy link
Contributor Author

Ravlen commented May 20, 2021

@tancnle @dblessing Yay, finally the Travis build worked 😁

@dblessing dblessing merged commit 78af25c into rouge-ruby:master Jun 7, 2021
@jneen
Copy link
Member

jneen commented Jun 7, 2021

I for one would be 100% down to move to gitlab, so long as we could preserve the state of the issue tracker.

@jneen
Copy link
Member

jneen commented Jun 7, 2021

(And given that this software is used by gitlab I think it would be fitting :] )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants