twisted/web/server.py:190 #3059
Comments
Looking at the source data this should be caused by one of the following fields in the
We'll have to make sure that our unit tests cover these three cases. |
@devos50 I did some research on JSON grammar & standards and I don't think it would be a good idea after all to simply force JSON to handle non-text characters. A custom implementation or even A solution would be to simply encode ALL strings as base64 (or some other format which can be represented as ASCII in JSON):
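A minimal sketch of the base64 idea, assuming Python 3 and a hypothetical helper named `encode_strings` (not part of Tribler). It encodes string and byte values only and leaves dictionary keys untouched, which assumes the keys are already plain text:

```python
import base64
import json


def encode_strings(obj):
    """Recursively replace every str/bytes value with ASCII base64 text.

    Hypothetical helper for illustration; dictionary keys are left as-is.
    """
    if isinstance(obj, bytes):
        return base64.b64encode(obj).decode("ascii")
    if isinstance(obj, str):
        return base64.b64encode(obj.encode("utf-8")).decode("ascii")
    if isinstance(obj, dict):
        return {key: encode_strings(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode_strings(item) for item in obj]
    # Numbers, booleans and None are already JSON-safe.
    return obj


payload = {"name": "ABCDEF", "infohash": b"\xff\xfe"}
encoded = json.dumps(encode_strings(payload), sort_keys=True)
```

Every value in `encoded` is now ASCII, so serialization cannot fail on binary data, but any client has to base64-decode each field before reading it.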
The alternative would be to do this by hand and find all instances where a JSON field contains binary data. (which is what we're doing now):
What it boils down to is: do we want slow and reliable or fast and unstable? Personally I vote for slow and reliable. Your thoughts? |
reliable at all cost! What protocol is this in our architecture? |
@qstokkink encoding all strings as base64 is something I was thinking about a year ago, but my main objection is that it's not friendly for developers anymore. Some people really like the RESTful API, and serving them a base64-encoded response would require them to do an additional step on the command line. So I'm slightly more in favor of the current approach and debugging the |
@synctext This concerns our GUI 🔃 Core communication / our public REST API. @devos50 Alright, since you made the GUI code I'll trust your judgement on this. I'll still put in one level of indirection though, such that the error report shows which of the fields could not be encoded (instead of the generic "oops something went wrong"). |
@qstokkink I'm not talking about the complexity it adds for us (we can easily adapt our code to support this, it's only one or two lines). The problem is that our API is not 'clean' anymore, in the sense that if I want to use the command line to control the Tribler instance (which I do quite often to test new endpoints or get information about existing ones), it's more complicated. |
@devos50 Exactly, I agree completely: JSON was made as a human readable data container for interaction with an application. Pumping binary strings into JSON is actually misuse of JSON. However, since we found ourselves in this situation, we do need to deal with these accidental binary strings somehow. So the only thing I suggest is to make it easier to debug when someone does send a binary string (for which JSON gives a vague error, as seen in the error in the original post). |
@qstokkink that's an ok suggestion. Maybe we should have some generic way to send some key/value info to our servers (only for debugging purposes of course)? Then we can capture the UnicodeDecodeError and be done with it. |
@devos50 You mean forward all failed requests to our servers? Also, I extended the UnicodeDecodeError message, what I have right now is:
Traceback (most recent call last):
  File "/home/quinten/tribler/tribler/Tribler/Core/Utilities/json_util.py", line 47, in <module>
    dumps({'a': ['ABCDEF', '\xFF', '\xFE'], '\xFF': 1, 5: {2: u'\u1234\x00', 3: '\xFC', 4: (1, 2, 3, '\xFF')}})
  File "/home/quinten/tribler/tribler/Tribler/Core/Utilities/json_util.py", line 40, in dumps
    raise UnicodeDecodeError(e.encoding, str(obj), e.start, e.end, fmtmessage)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x7b in position 0: could not dump:
dict[a]->list->str::'\xff'
dict[a]->list->str::'\xfe'
dict[5]->dict[3]->str::'\xfc'
dict[5]->dict[4]->tuple->str::'\xff'
dict::'\xff' |
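A report like the one above could be produced by walking the structure and recording the path to each offending byte string. The following is only a sketch under stated assumptions: it targets Python 3 (so the leaves are labelled `bytes` rather than `str`), and `find_undecodable` and `_is_utf8` are hypothetical names, not the actual contents of `json_util.py`:

```python
def _is_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_undecodable(obj, path=()):
    """Yield one 'path::value' line per byte string that is not valid UTF-8."""
    if isinstance(obj, bytes):
        if not _is_utf8(obj):
            yield "->".join(path + ("bytes",)) + "::" + repr(obj)
    elif isinstance(obj, dict):
        for key, value in obj.items():
            # A bad key is reported against the dict itself.
            if isinstance(key, bytes) and not _is_utf8(key):
                yield "->".join(path + ("dict",)) + "::" + repr(key)
            key_text = key if isinstance(key, str) else repr(key)
            yield from find_undecodable(value, path + ("dict[%s]" % key_text,))
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from find_undecodable(item, path + (type(obj).__name__,))


data = {"a": ["ABCDEF", b"\xff", b"\xfe"], b"\xff": 1,
        5: {2: "\u1234\x00", 3: b"\xfc", 4: (1, 2, 3, b"\xff")}}
problems = list(find_undecodable(data))
```

Joining `problems` with newlines gives a message that names each field that could not be encoded, which could then be attached to the re-raised exception.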
Improving the error details is good.
|
I created a unit test for this and shoved unicode into every possible string. This has probably already been fixed since the release of rc2. I'll create a PR for the unit test itself. |
"UnicodeDecodeError" suggests you would need to have "shoved invalid unicode into every possible string" in order to test it... |
@Dmole As far as Python's json is concerned there are no invalid unicode strings, only invalid strings (0x80-0xFF will error out). Usually this error occurs when an endpoint improperly handles unicode in its fields. It seems that this is improper string formatting though. This makes the error nigh impossible to debug with just this error message (which is why we added #3075). For now our only realistic option would be to wait until this error occurs again. |
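The 0x80-0xFF range can be checked directly. This sketch assumes Python 3 and uses a hypothetical helper name; it decodes each single byte as UTF-8, which is the default encoding Python 2's `json` applies to plain `str` values:

```python
def undecodable_single_bytes():
    """Return every byte value that fails to decode as UTF-8 on its own."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode("utf-8")
        except UnicodeDecodeError:
            bad.append(value)
    return bad


bad = undecodable_single_bytes()
```

Only lone bytes are tested here; bytes in this range are still legal inside a well-formed multi-byte UTF-8 sequence.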
tribler_7rc2.app