Tracker needs to handle characters outside of US-ASCII #7
Comments
This happened in our puu.sh project, but could really happen just about anywhere else.
This happened in the isoprey project yesterday. Also, it's worth considering that nicknames are often used to construct upload targets and WARC names. Permitting characters outside basic ASCII puts additional requirements on filesystems and transfer protocols: not onerous ones, but ones that people often don't consider.
This fix has been running in production for a couple of months now with no evident ill effects, so I'm submitting it for permanent inclusion in the tracker codebase. Most external data in the tracker comes from its Warrior and Seesaw clients via its Redis instance. Redis is encoding-agnostic, and Warrior/Seesaw processes _usually_ run UTF-8. I don't believe that this is a total fix; I don't think there's anything stopping someone from sending, e.g., a username encoded in UTF-16. But this will solve the common cause of #7, and we can build on top of this work, e.g. by enforcing UTF-8 for all Seesaw data returned to the tracker.
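As a rough illustration of what "enforcing UTF-8" could look like, here is a minimal sketch. The helper name `utf8_or_nil` and the sample nickname are hypothetical and not part of the tracker codebase. It treats the incoming bytes as UTF-8, rejects anything that is not a valid UTF-8 sequence, and also rejects NUL bytes, which is only a heuristic for spotting UTF-16 text that happens to stay in the ASCII range.

```ruby
# Hypothetical helper, not taken from the tracker: returns the input tagged
# as UTF-8 when it looks like UTF-8 text, or nil when it does not.
def utf8_or_nil(bytes)
  candidate = bytes.dup.force_encoding(Encoding::UTF_8)

  # Reject byte sequences that are not well-formed UTF-8.
  return nil unless candidate.valid_encoding?

  # Heuristic (an assumption of this sketch): NUL bytes are valid UTF-8 but
  # are typical of UTF-16/UTF-32 data, so treat them as a sign of the wrong
  # encoding.
  return nil if candidate.include?("\0")

  candidate
end

utf8_bytes  = "J\xC3\xBCrgen".b                      # UTF-8 bytes, binary-tagged
utf16_bytes = "J\u00FCrgen".encode("UTF-16LE").b     # same name in UTF-16LE

accepted = utf8_or_nil(utf8_bytes)
rejected = utf8_or_nil(utf16_bytes)
```

A check like this cannot prove that data is UTF-8; it only filters out input that clearly is not.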
Probably fixed now: I haven't been able to trigger this category of bug since 01911de was added. If we see similar errors, we'll file new issues. (And note to self: that Travis build needs to be fixed.)
Set default external encoding to UTF-8. #7.
It's possible for users to use characters that are not in US-ASCII in their usernames. Currently, the tracker claims page errors out with an invalid byte sequence error when displaying claims for usernames that contain such characters. One example:
http://paste.archivingyoursh.it/xucamucege.vbs
Strings returned by Redis have an unset encoding. redis/redis-rb#48 looks like it fixed up the Ruby client to honor the default external encoding. We might be able to fix this just with `Encoding::default_external = 'utf-8'`.
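To make the failure mode concrete, here is a small sketch of why the same bytes are rejected under US-ASCII but accepted under UTF-8. The nickname is made up, and the string is tagged as binary by hand to stand in for data that arrives without a usable encoding; this is not the tracker's actual code path.

```ruby
# Hypothetical nickname bytes (the UTF-8 encoding of "Jürgen"), tagged as
# binary to mimic a string whose encoding was never set.
raw = "J\xC3\xBCrgen".b

# Interpret the same bytes under two different encodings.
as_ascii = raw.dup.force_encoding(Encoding::US_ASCII)
as_utf8  = raw.dup.force_encoding(Encoding::UTF_8)

# Bytes above 0x7F have no meaning in US-ASCII, so the string is invalid
# there; as UTF-8 the two bytes 0xC3 0xBC form a single character.
puts as_ascii.valid_encoding?
puts as_utf8.valid_encoding?
puts as_utf8.length
```

Setting `Encoding.default_external` changes which encoding Ruby assumes for external data such as IO reads; whether a given client library honors it for its return values depends on that library.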