
Use tokenized PHP in Customizer settings checks rather than regex. #1

Closed
kovshenin wants to merge 1 commit

Conversation

kovshenin

As noted on the theme reviewers mailing list, the regex in the Customizer check easily chokes when a semicolon is encountered anywhere before the end of the method call (in a comment, a default value, etc.).
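For illustration, a call along these lines (the setting name and default are made up) defeats the regex, because the semicolon inside the default value string is mistaken for the end of the statement:

```php
// The regex stops at the first semicolon it sees, so it never reaches the
// 'sanitize_callback' argument and flags this perfectly valid call.
$wp_customize->add_setting( 'footer_css', array(
	'default'           => 'color: #333; border: none;', // semicolons in a CSS default
	'sanitize_callback' => 'wp_strip_all_tags',
) );
```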

This is an attempt at using tokenized PHP to validate that sanitize_callback or sanitize_js_callback exists and is not empty.
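A minimal sketch of the token-based approach (simplified; the real patch also checks that the callback value itself is non-empty, and the function name here is illustrative):

```php
<?php
// Scan the token stream for add_setting() calls and verify that a
// 'sanitize_callback' or 'sanitize_js_callback' key appears inside the call.
// $source must be full PHP source, including the opening <?php tag.
function customizer_settings_sanitized( $source ) {
	$tokens  = token_get_all( $source );
	$in_call = false;
	$depth   = 0;
	$found   = false;

	foreach ( $tokens as $token ) {
		if ( is_array( $token ) && T_STRING === $token[0] && 'add_setting' === $token[1] ) {
			$in_call = true;
			$depth   = 0;
			$found   = false;
		} elseif ( $in_call && '(' === $token ) {
			$depth++;
		} elseif ( $in_call && ')' === $token && --$depth === 0 ) {
			if ( ! $found ) {
				return false; // add_setting() call without a sanitize callback.
			}
			$in_call = false;
		} elseif ( $in_call && is_array( $token ) && T_CONSTANT_ENCAPSED_STRING === $token[0] ) {
			$key = trim( $token[1], "'\"" );
			if ( 'sanitize_callback' === $key || 'sanitize_js_callback' === $key ) {
				$found = true;
			}
		}
	}

	return true;
}
```

Because the parentheses are matched token by token, semicolons inside strings or comments no longer matter.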

@fklein-lu

Tested the patch, and it works great. Excellent work!

The only thing to add is that the trim() call on line 80 should strip spaces as well. There is an edge case where developers pass ' ' as a value, which is currently not caught.
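Concretely, the character mask would need to include whitespace in addition to the quote characters, something like this (the exact variable names in the patch may differ):

```php
// Strip surrounding quotes *and* whitespace, so both '' and ' ' count as empty.
$callback = trim( $token[1], "\"' \t\n\r" );
if ( '' === $callback ) {
	// Flag the setting: the sanitize callback is effectively empty.
}
```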

@Otto42
Member

Otto42 commented Oct 29, 2014

Tokenizing the whole file just to find one of a couple of arguments to a specific function. Love it. Overkill and a half. :)

My only concern is speed. But I admit that I don't have a basis for comparison at the moment. I'll try and think of something to measure with.

@kovshenin
Author

I ran some profiling tests against both versions of the check; the regex version is much faster than looping through all tokens: ~500 microseconds for the regex tests versus ~50,000 microseconds (50 ms) for tokens. The test ran with 50 calls to $wp_customize->add_setting() in functions.php.

If there's no call to add_setting(), it obviously doesn't even begin to tokenize. The tokenizing itself is pretty fast (1.5 ms); it's all the logic in check() that's slow.
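For what it's worth, a simple microtime()-based harness along these lines is enough to reproduce the comparison (the two check functions are placeholders, not the actual implementations):

```php
// Rough benchmark sketch: time each check over the same functions.php source.
$source = file_get_contents( 'functions.php' );

$start = microtime( true );
regex_based_check( $source );  // hypothetical: the old regex check
$regex_us = ( microtime( true ) - $start ) * 1e6;

$start = microtime( true );
token_based_check( $source );  // hypothetical: the tokenizer check
$token_us = ( microtime( true ) - $start ) * 1e6;

printf( "regex: %.0f us, tokens: %.0f us\n", $regex_us, $token_us );
```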

@Otto42
Member

Otto42 commented Feb 10, 2015

Given how badly the i18n checks (which also use the tokenizer) are performing, I'm now much less inclined to trust it. I'm thinking we need an overall better solution here, or to simplify these checks to be somewhat less precise.

@Otto42
Member

Otto42 commented Aug 20, 2015

The new textdomain code uses the tokenizer and doesn't seem to be breaking things. Perhaps we can integrate these checks in there and kill two birds with one stone.

@Otto42
Member

Otto42 commented Sep 10, 2015

Related: #87. Given that I'm also using the tokenizer in the text domain checks now, a generic approach might be better to avoid duplicated work. Still looking at it. Maybe pass tokenizer data around? Either way, let's discuss in #87.
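As a sketch, "passing tokenizer data around" could mean tokenizing each file once and handing the same token arrays to every check (the interfaces here are hypothetical, not the current theme-check API):

```php
// Tokenize once per theme file, then reuse the tokens across all checks.
$tokens_by_file = array();
foreach ( $theme_files as $path => $source ) {
	$tokens_by_file[ $path ] = token_get_all( $source );
}

foreach ( $checks as $check ) {
	// Each check receives the pre-computed tokens instead of re-tokenizing.
	$check->check( $theme_files, $tokens_by_file );
}
```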

@Otto42
Member

Otto42 commented Jan 1, 2016

Closing this in favor of a future redesign of the whole thing, one that uses the tokenizer more extensively and avoids having to retokenize the files over and over again. That should also reduce memory usage and simplify some of the checks.
