
Use tokenized PHP in Customizer settings checks rather than regex. #1

Closed
kovshenin wants to merge 1 commit

Conversation

kovshenin

As noted on the theme reviewers mailing list, the regex in the Customizer check easily chokes when a semicolon is encountered anywhere before the end of the method call (in a comment, a default value, etc.).
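For illustration, a call along these lines (the setting name and default are made up) defeats the regex, because the semicolon inside the default value string is mistaken for the end of the statement:

```php
// The regex stops at the first semicolon it sees, so it never reaches the
// 'sanitize_callback' argument and flags this perfectly valid call.
$wp_customize->add_setting( 'footer_css', array(
	'default'           => 'color: #333; border: none;', // semicolons in a CSS default
	'sanitize_callback' => 'wp_strip_all_tags',
) );
```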

This is an attempt at using tokenized PHP to validate that sanitize_callback or sanitize_js_callback exists and is not empty.
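A minimal sketch of the token-based approach (simplified; the real patch also checks that the callback value itself is non-empty, and the function name here is illustrative):

```php
<?php
// Scan the token stream for add_setting() calls and verify that a
// 'sanitize_callback' or 'sanitize_js_callback' key appears inside the call.
// $source must be full PHP source, including the opening <?php tag.
function customizer_settings_sanitized( $source ) {
	$tokens  = token_get_all( $source );
	$in_call = false;
	$depth   = 0;
	$found   = false;

	foreach ( $tokens as $token ) {
		if ( is_array( $token ) && T_STRING === $token[0] && 'add_setting' === $token[1] ) {
			$in_call = true;
			$depth   = 0;
			$found   = false;
		} elseif ( $in_call && '(' === $token ) {
			$depth++;
		} elseif ( $in_call && ')' === $token && --$depth === 0 ) {
			if ( ! $found ) {
				return false; // add_setting() call without a sanitize callback.
			}
			$in_call = false;
		} elseif ( $in_call && is_array( $token ) && T_CONSTANT_ENCAPSED_STRING === $token[0] ) {
			$key = trim( $token[1], "'\"" );
			if ( 'sanitize_callback' === $key || 'sanitize_js_callback' === $key ) {
				$found = true;
			}
		}
	}

	return true;
}
```

Because the parentheses are matched token by token, semicolons inside strings or comments no longer matter.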

@fklein-lu

Tested the patch, and it works great. Excellent work!

The only thing to add is that the trim() call on line 80 should strip spaces as well. There is an edge case where developers pass ' ' as a value, which is currently not caught.
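Concretely, the character mask would need to include whitespace in addition to the quote characters, something like this (the exact variable names in the patch may differ):

```php
// Strip surrounding quotes *and* whitespace, so both '' and ' ' count as empty.
$callback = trim( $token[1], "\"' \t\n\r" );
if ( '' === $callback ) {
	// Flag the setting: the sanitize callback is effectively empty.
}
```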

@Otto42
Member

Otto42 commented Oct 29, 2014

Tokenizing the whole file just to find one of a couple of arguments to a specific function. Love it. Overkill and a half. :)

My only concern is speed. But I admit that I don't have a basis for comparison at the moment. I'll try and think of something to measure with.

@kovshenin
Author

I ran some profiling tests against both versions of the check; the regex version is much faster than looping through all tokens: ~500 microseconds for the regex tests versus ~50,000 microseconds (50 ms) for tokens. The test ran with 50 calls to $wp_customize->add_setting() in functions.php.

If there's no call to add_setting(), it obviously doesn't even begin to tokenize. The tokenizing itself is pretty fast (1.5 ms); it's all the logic in check() that's slow.
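For what it's worth, a simple microtime()-based harness along these lines is enough to reproduce the comparison (the two check functions are placeholders, not the actual implementations):

```php
// Rough benchmark sketch: time each check over the same functions.php source.
$source = file_get_contents( 'functions.php' );

$start = microtime( true );
regex_based_check( $source );  // hypothetical: the old regex check
$regex_us = ( microtime( true ) - $start ) * 1e6;

$start = microtime( true );
token_based_check( $source );  // hypothetical: the tokenizer check
$token_us = ( microtime( true ) - $start ) * 1e6;

printf( "regex: %.0f us, tokens: %.0f us\n", $regex_us, $token_us );
```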

@Otto42
Member

Otto42 commented Feb 10, 2015

Given how badly the i18n checks (which also use the tokenizer) are performing, I'm now much less inclined to trust it. I'm thinking we need an overall better solution here, or to simplify these checks to be somewhat less precise.

@Otto42
Member

Otto42 commented Aug 20, 2015

The new textdomain code uses the tokenizer and doesn't seem to be breaking things. Perhaps we can integrate these checks in there and kill two birds with one stone.

@Otto42
Member

Otto42 commented Sep 10, 2015

Related: #87. Given that I'm also using the tokenizer in the text domain checks now, a generic approach might be better to avoid duplicated work. Still looking at it. Maybe pass tokenizer data around? Either way, let's discuss in #87.
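As a sketch, "passing tokenizer data around" could mean tokenizing each file once and handing the same token arrays to every check (the interfaces here are hypothetical, not the current theme-check API):

```php
// Tokenize once per theme file, then reuse the tokens across all checks.
$tokens_by_file = array();
foreach ( $theme_files as $path => $source ) {
	$tokens_by_file[ $path ] = token_get_all( $source );
}

foreach ( $checks as $check ) {
	// Each check receives the pre-computed tokens instead of re-tokenizing.
	$check->check( $theme_files, $tokens_by_file );
}
```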

@Otto42
Member

Otto42 commented Jan 1, 2016

Closing this in favor of a future redesign of the whole thing, one that uses the tokenizer more extensively and avoids having to retokenize the files over and over again. That should also reduce memory usage and simplify some of the checks.
