Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strange behavior of "greedy" option #1039

Closed
romanvm opened this issue Oct 21, 2016 · 3 comments
Closed

Strange behavior of "greedy" option #1039

romanvm opened this issue Oct 21, 2016 · 3 comments

Comments

@romanvm
Copy link
Contributor

romanvm commented Oct 21, 2016

First, I must admit that I have little experience in JavaScript. But I like Prism and now I'm working on a language definition for Django/Jinja2 template languages. Those are 2 popular template languages in Python world with similar syntax, so I'm making one definition for both. You can see my working code here.

I'd say that it works OK in 95% cases and now I try to improve handling of borderline cases. One approach that I've tried is using the greedy options for matching template tags {{ ... }} and {% ... %}. Indeed, it improved matching, but on some testing samples a strange bug happens: the first string is copy-pasted all over the code snippet.

Here's my testing code snippet:

{% load i18n %}
<!DOCTYPE html>
<html lang="{{ LANGUAGE_CODE }}">
<head>
  <!-- This is the problem string -->
  <script src="{% static "cerulean_skin/js/ie10-viewport-bug-workaround.js" %}"></script>
</head>
<body>
  <script type="text/javascript">
    var DOCUMENTATION_OPTIONS = {
      URL_ROOT:    '{{ url_root }}',
      VERSION:     '{{ release|e }}',
      COLLAPSE_INDEX: false,
      FILE_SUFFIX: '{{ '' if no_search_suffix else file_suffix }}', // Difficult case
      HAS_SOURCE:  {{ has_source|lower }}
    };
  </script>
</body>
</html>

(GitHub has some support for Django/Jinja2 too, BTW)

And here's what I get with my language definition:
image jpg 3350-17

As you can see, it works OK except for the "difficult case". Now here's what I get with greedy option:

var _django_template = {
  'property': {
    pattern: /(?:{{|{%)[\w\W]*?(?:%}|}})/g,
    greedy: true,
inside: {
...

image jpg 3350-18
As you can see, the "difficult case" is now painted correctly, but the first string {% load i18n %} is copy-pasted all over the place.

By elimination I found out that "the problem string" with <script> tag is causing all this. If I remove it, the copy-pasting problem disappears.

I'd like to know: am I doing something wrong or this is a bug in Prism?

zeitgeist87 added a commit that referenced this issue Oct 24, 2016
This bug occurs in the relatively rare case of a pattern matching the
empty string. It was reported in issue #1039. If for example a HTML
page contains an empty script tag `<script></script>` then the script
pattern will match anything inside, which is the empty string.

This empty string is then passed to the constructor of the Token class.
Since `""` is falsy in Javascript the property `matchedStr` is set to
`null`.

But the property `matchedStr` is needed to calculate the current
position for the greedy feature. A `null` value in `matchedStr` results
in a `pos` that is `NaN`. This causes the bug described in issue #1039.

Since the property `matchedStr` is only ever needed to calculate the
length of the Token, it is more efficient to store the length directly
instead of the string. As a side effect this also fixes issue #1039.
@zeitgeist87
Copy link
Collaborator

This was indeed a bug. It should be fixed now. Thanks for reporting it!

@romanvm
Copy link
Contributor Author

romanvm commented Oct 26, 2016

Thanks! It seems to be fixed now. I'll test my language definition more and then submit it to Prism.

@romanvm
Copy link
Contributor Author

romanvm commented Oct 28, 2016

Since this issue is fixed, I'm closing it. Thanks for your help.

@romanvm romanvm closed this as completed Oct 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants