Accents/special characters in payloads causing problems #39
Comments
You can't unconditionally UTF-8 encode: if the input data is already in UTF-8 (likely), this would lead to a double encode which, again, breaks characters. You can of course detect whether the input is already in UTF-8 (mb_detect_encoding()), but it's not the quickest operation of them all. I would recommend this being handled on the client side for performance reasons, especially because more and more projects are using UTF-8 internally.
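To make the double-encoding risk concrete, here is a minimal sketch. It assumes any non-UTF-8 input is ISO-8859-1, and it swaps in mb_check_encoding() as a validity check instead of mb_detect_encoding()'s guess; the helper name ensure_utf8 is hypothetical.

```php
<?php
// Converting bytes that are already UTF-8 corrupts them ("é" becomes "Ã©").
$utf8   = "\xC3\xA9";                     // "é" encoded as UTF-8
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Hypothetical guard: convert only when the bytes are not valid UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function ensure_utf8(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;                        // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}
```

A validity check like this is still a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 would pass through unconverted.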
@pilif Hey! That is what I ended up doing: in my enqueuer I encode the strings, and I always decode them in my job workers. Anyway, I thought it was useful to post here, as it took me a little while to figure out what was going on with the empty payloads.
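A rough sketch of that encode-on-enqueue, decode-in-worker arrangement follows. The function names are hypothetical, the application's internal encoding is assumed to be ISO-8859-1, and the queue is simulated with json_encode()/json_decode() rather than a real Resque call.

```php
<?php
// Hypothetical enqueue-side helper: convert every string value to UTF-8.
// Assumption: the application works in ISO-8859-1 internally.
function encode_args(array $args): array
{
    array_walk_recursive($args, function (&$value) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
        }
    });
    return $args;
}

// Hypothetical worker-side helper: convert string values back to ISO-8859-1.
function decode_args(array $args): array
{
    array_walk_recursive($args, function (&$value) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');
        }
    });
    return $args;
}

// Simulated round trip through the JSON that the queue would store.
$original = ['title' => "Caf\xE9", 'tags' => ["ni\xF1o"]];
$stored   = json_encode(encode_args($original));
$received = decode_args(json_decode($stored, true));
```

Only values are converted here; array keys containing non-ASCII characters would need the same treatment.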
Even mb_detect_encoding() can only guess. The only piece of code in a position to know for sure is the piece that actually accepts the input. In most PHP use cases, this is the browser. Most of the rest depend on the encoding of whatever other application (MySQL, Redis, web API, etc.) the data is coming from. Some of these will tell you the encoding they've used. Some of them won't. And sadly, sometimes the ones that do mention an encoding are simply wrong. Ultimately, the only sane location to handle encoding conversions is in the Resque client, not the library itself.
Is this still an issue, or can it be closed?
Because json_encode() only accepts UTF-8 strings, accented and some other special characters in any other encoding cause it to return null values.
This poses a problem when passing messages that include these characters, such as text in foreign languages. One workaround is to UTF-8 encode all of the payload data.
I have tried this in development and it works fine. Is there any downside that you guys can point out?
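The failure can be reproduced in a few lines. This sketch assumes the offending string is ISO-8859-1; on PHP 5.5 and later json_encode() returns false for such input, whereas the null values reported above come from older versions.

```php
<?php
$latin1 = "Jos\xE9";                      // "José" in ISO-8859-1, not valid UTF-8

$failed = json_encode(['name' => $latin1]);
$error  = json_last_error();              // captured before the next json_encode() call

// Converting first (assuming the source really is ISO-8859-1) makes encoding succeed.
$fixed = json_encode([
    'name' => mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1'),
]);
```

Checking json_last_error() after encoding makes the problem visible immediately instead of surfacing later as an empty payload.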
The function I am using was taken from the PHP manual pages and I haven't even tweaked it yet, but the idea is to add a recursive UTF-8 encoding helper
and then encode the payload through this function (in order to retain the array structure).
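A minimal sketch of a recursive helper along these lines is shown below. It is not the function from the manual pages; the name utf8_encode_payload is hypothetical, and all input strings are assumed to be ISO-8859-1, so payloads that are already UTF-8 would be double-encoded.

```php
<?php
// Hypothetical helper: recursively convert every string in a payload to UTF-8
// while keeping the array structure intact.
// Assumption: every string is in $from (ISO-8859-1 by default).
function utf8_encode_payload($data, string $from = 'ISO-8859-1')
{
    if (is_string($data)) {
        return mb_convert_encoding($data, 'UTF-8', $from);
    }
    if (is_array($data)) {
        $out = [];
        foreach ($data as $key => $value) {
            // String keys may contain accents too, so convert them as well.
            $newKey = is_string($key) ? mb_convert_encoding($key, 'UTF-8', $from) : $key;
            $out[$newKey] = utf8_encode_payload($value, $from);
        }
        return $out;
    }
    return $data;                         // ints, floats, bools, null pass through
}

$payload = ['name' => "Jos\xE9", 'meta' => ['city' => "M\xE1laga", 'id' => 7]];
$json    = json_encode(utf8_encode_payload($payload));
```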
So far I have identified that this is necessary in resque.php (function push), in job.php (circa line 250) and in worker.php (circa line 500).
Another idea is to UTF-8 encode before sending to Resque (which I will do tomorrow for compatibility's sake, since I use an "enqueue wrapper function" in my app), but I thought I'd contribute this anyhow in case anyone else has issues.