Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Issue with RegularParser #27

Closed
rhukster opened this issue Jan 21, 2016 · 6 comments
Closed

Performance Issue with RegularParser #27

rhukster opened this issue Jan 21, 2016 · 6 comments
Assignees
Labels

Comments

@rhukster
Copy link

As outlined in the original PR comment: #26 (comment), I was testing the Shortcode against some documentation I have for Grav CMS where I have created a Tab shortcode. I was running into a couple of issues, but one of them is related to very slow parsing of this example page with the RegularParser.

After a few back and discussions with @thunderer, he believes it is related to the various non-shortcode [] references contained in the document.

I have put together a simple test scenario (https://github.com/rhukster/shortcode-test) that shows this issue. There are two documents, a small one that is considerably slower than either the RegexParser or the WordpressParser. The RegularParser is not able to even process the full document as it just continues to spin until the PHP process is terminated.

FYI Both Wordpress and Regex parsers are able to parse this document. (note: corruption issues are handled in a separate issue #25 (comment))

@thunderer
Copy link
Owner

@rhukster Just wanted to clarify: this is not what I believe to be related, I know that it's caused by many unclosed shortcodes, because such input causes RegularParser to open many recursive calls to parser rules, that's how it works. It is not your or anyone else's fault (first of all it's my fault as an author :)), just a statement of problem to solve that I'm already trying to implement.

@thunderer thunderer self-assigned this Jan 23, 2016
@rhukster
Copy link
Author

You might want to look at my "Ideas" suggestions that could help with performance here. Cheers!

@thunderer
Copy link
Owner

@rhukster I merged the performance improvements PR to master, but I found some issues with parsing correctness when dealing with complex nested cases so please use RegexParser for now, I'll give you an update once they will be resolved.

@thunderer
Copy link
Owner

@rhukster I think I fixed the issue in 4cda52f , could you please test if current dev-master matches your expectations?

@rhukster
Copy link
Author

I just tested your latest dev-master. The RegularParser is now much improved! As you say it renders in about 500ms on the 'test page'. No more errors and timeouts! I'll probably continue to use Regex or Wordpress parser as they are an order of magnitude faster still (Regex ~40ms, and Wordpress ~20ms).

What would the advantage be to using the RegularParser over the Regex or Wordpress one?

@thunderer
Copy link
Owner

Thanks, could you also help @giansi resolve problems in #29? If you say that this issue is resolved, I only need to know if his issue is something to fix before tagging new release.

As for the advantages of RegularParser - the most important one is parsing correctness impossible to achieve with regular expressions. Say that you have text like [x][x][x][/x][/x][/x] (shortcodes nested multiple levels with the same names). Regex is unable to process it correctly as it'll always see it like [x][x][x][/x] (first closing tag always closes first matching opening tag), even recursive capture groups (?R) is not going to help (regex recursion is more like inner repetition, not true recursion). I could possibly match only opening / closing tag fragments and try to analyze them (I just came up with this idea, maybe worth trying?) but then it's be basically RegexParser with RegularParser's internals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants