Json::encode(): auto-sanitize bad UTF-8 strings #3444
If you want this to be usable throughout our modules, such dangerous magic should not be the default. I expect Json::encode to be strict and to fail for invalid data. A failsafe variant is of course helpful for situations like the one referenced here. But then there should be a dedicated method call allowing one to explicitly opt for the failsafe variant.
Why do you think this is dangerous? Icinga 2 does the same thing before IDO inserts and they're fine (no one complained). And: I've not changed any module not in the repo (e.g. Director), nor is there any similar "peer pressure". Anyway IMO one can expect PS: C'mon, aren't you the master of magic? 😉
Being weak on encoding isn't magic, that's acting carelessly, and it's dangerous. All sane parts of our stack should accept valid UTF8 only. There are legacy parts in our stack where we have to live with historic facts. But letting the JSON library accept every kind of garbage by default is just the wrong way of dealing with this. This "forgiving" encoding should be opt-in in an explicit way by calling a dedicated function. Regarding IDO: I gave my suggestions at the time this was discussed and implemented. My proposal was to not let invalid UTF8 enter the core in any way. Illegal check command output should be sanitized by the related component when reading the result. I don't know how it was then implemented.
I also expect Also, since this is a class in the So, please make the sanitation optional.
Depending on where these functions are used, you'd need sanitation or not. I wouldn't move this into the Json* functions, but leave them outside. This renders the code more visible and transparent, and you don't need to know the function signature (which would be different from the inner
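To make the "strict by default, sanitize explicitly" position above concrete, here is a minimal sketch. The function names `strictEncode()` and `sanitizeUtf8()` are hypothetical and are not the API that was merged; the sketch assumes PHP 7.3 or later (for `JSON_THROW_ON_ERROR`) and the mbstring extension.

```php
<?php
// Hypothetical helpers for illustration only, not the Icinga Web API.

/** Strict encoder: throws JsonException on invalid input such as bad UTF-8. */
function strictEncode($value, int $options = 0, int $depth = 512): string
{
    return json_encode($value, $options | JSON_THROW_ON_ERROR, $depth);
}

/**
 * Opt-in sanitizer: recursively replaces invalid byte sequences in strings
 * and array keys. Objects and other types are returned untouched.
 */
function sanitizeUtf8($value)
{
    if (is_string($value)) {
        // Converting UTF-8 to UTF-8 substitutes invalid sequences.
        return mb_convert_encoding($value, 'UTF-8', 'UTF-8');
    }

    if (is_array($value)) {
        $clean = [];
        foreach ($value as $key => $item) {
            $clean[sanitizeUtf8($key)] = sanitizeUtf8($item);
        }

        return $clean;
    }

    return $value;
}

// The caller decides, visibly, whether garbage is acceptable.
$checkResult = ['output' => "caf\xE9"]; // Latin-1 byte, invalid as UTF-8

try {
    strictEncode($checkResult);
} catch (JsonException $e) {
    // Strict path: the invalid string is rejected.
}

$json = strictEncode(sanitizeUtf8($checkResult)); // explicit opt-in
```

With this split, the sanitizing step shows up at every call site that needs it, and the encoder's signature stays identical to the strict one.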
0bc0ee3 to 201861b
Hi Alex,
Thanks for the PR. I opt for Json::sanitize() instead of a parameter to encode. Else you always have to specify the default parameters.
Best,
Eric
fd5e805 to 46a93d0
Well done. Thanks!
Well, tests are failing because of a fallthrough case in the switch statement. And I added an inline comment for a small change.
@@ -190,7 +222,9 @@ public function outputBody()
$body['data'] = $this->getSuccessData();
break;
}
echo json_encode($body, $this->getEncodingOptions());
echo $this->autoSanitize
Please use getAutoSanitize(). Subclasses may override this method, though we don't have any at the moment.
I've always used $this->whatever, and no one complained. Shall I do member reads like you everywhere in the future?
I stumbled upon this several times while working on the ipl. I'd say that we did not care about subclass overrides before. Yes, please use the getter for reads in the future.
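The difference between reading the member directly and going through the getter only shows up once a subclass overrides the getter. A minimal sketch, assuming hypothetical classes (only the names `autoSanitize` and `getAutoSanitize()` come from the review):

```php
<?php
// Hypothetical classes for illustration; not the classes touched by this PR.
class Response
{
    protected $autoSanitize = false;

    public function getAutoSanitize(): bool
    {
        return $this->autoSanitize;
    }

    /** Reads the member directly and so ignores getter overrides. */
    public function readsProperty(): bool
    {
        return $this->autoSanitize;
    }

    /** Reads via the getter and so respects getter overrides. */
    public function readsGetter(): bool
    {
        return $this->getAutoSanitize();
    }
}

class AlwaysSanitizingResponse extends Response
{
    public function getAutoSanitize(): bool
    {
        return true;
    }
}

$response = new AlwaysSanitizingResponse();
var_dump($response->readsProperty()); // bool(false)
var_dump($response->readsGetter());   // bool(true)
```

The direct read silently bypasses the subclass's decision, which is the situation the reviewer wants to avoid.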
46a93d0 to 906c166
I already added a comment. Didn't our tests respect it?
library/Icinga/Util/Json.php
Outdated
if ($autoSanitize) {
    return static::encode(static::sanitizeUtf8Recursive($value), $options, $depth);
}
I think you need a // Fallthrough comment here.
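For context, an intentional fall-through in a PHP switch is usually marked with a comment at the point where a `break` or `return` would otherwise be expected. The sketch below is an assumption about the shape of the code under review, not a copy of it: `describeJsonError()` and its return strings are hypothetical, and the thread does not say which style check produced the failure or which comment wording it accepts.

```php
<?php
// Hypothetical function for illustration; not the code from Json.php.
function describeJsonError(int $code, bool $autoSanitize): string
{
    switch ($code) {
        case JSON_ERROR_NONE:
            return 'ok';
        case JSON_ERROR_UTF8:
            if ($autoSanitize) {
                return 'retry with sanitized input';
            }
            // Fallthrough
        default:
            return 'fail';
    }
}
```

Without auto-sanitizing, the UTF-8 case deliberately continues into the default branch, and the comment tells both readers and linters that the missing `break` is not an oversight.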
Thanks for the changes 👍
fixes #2635