-
Notifications
You must be signed in to change notification settings - Fork 41
Refactor and complete JSON string escape function #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/environ.c
Outdated
| case '\r': memcpy(p, "\\r", 2); p += 2; break; | ||
| case '\f': memcpy(p, "\\f", 2); p += 2; break; | ||
| case '\n': memcpy(p, "\\n", 2); p += 2; break; | ||
| default: *p = value[i]; p += 1; break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change to:
switch (value[i]) {
case '"':
case '/':
[...]
case '\n':
*p = '\\';
p++;
break;
default:
break;
}
*p = value[i];
p++;
(also keep case aligned to switch, not indented, only the actions are indented).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah also, I am reading RFC 4627 2.5. Strings
As far as I can tell it says to escape all control characters, so perhaps we should just:
if ((value[i] < 0x1F) || (value[i] == '"') || (value[i] == '\\') {
*p = '\\';
p++;
}
*p = value[i];
p++;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh we can't just use '\' for all chars, silly me ... then we have a few options, one is to allocate 6x and escape all unrepresentable chars with \uXXXX notation, or just allocate double as you do now, but check bounds, and if we get to fill the string reallocate doubling the buffer.
If we want to keep using the \X notation for common chars then we'd do something like:
switch (value[i]) {
case '"':
case '/':
[...]
case '\n':
*p = '\\';
p++;
break;
default:
if (value[i] < 0x1F) {
snprintf(p, "\u%04d", (int)value[i]);
p += 6;
continue;
}
break;
}
*p = value[i];
p++;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The escaping in my example is always wrong for \n \b etc, get that right :)
The main point here is to escape all control chars < 1F and style.
using memcpy is fine, but do it on a new line and break on the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many comments :). First, I wouldn't expect having unrepresentable chars as part of the output of the display_value, since they are not displayable. But, to be safe, we better generate valid JSON regardless the input. I don't really like having to reallocate the buffer taking care of the length. I'd rather allocate the worst case (size*6+1). The memory is associated from the request pool, and hence is to be freed soon.
Let me push another commit and let me know what you think.
simo5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please escape all needed chars, see style hints too :)
c0f1767 to
2ef85d8
Compare
frozencemetery
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two nits (first one I don't really care about and wouldn't bring up without the second; second one is important for reducing code churn for backport purposes). Otherwise, I'm happy with this.
src/environ.c
Outdated
| * escaped length will be original length * 6 + 1 in the worst case */ | ||
| p = escaped_value = apr_palloc(req->pool, disp_value.length * 6 + 1); | ||
| for (i = 0, j = 0; i < disp_value.length; i++, j++) { | ||
| if ((value[i] < 0x1F) || (value[i] == '"') || (value[i] == '\\')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parens are not needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and the second one was addressed while I was reviewing, awesome. Ignore this if we have no other comments :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I noticed about the change in '\0' by 0 and fixed with force. Shouldn't do that, I know... :)
src/environ.c
Outdated
| escaped_value[j] = '\0'; | ||
|
|
||
| /* Make the string NULL terminated */ | ||
| escaped_value[j] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three lines shouldn't change from the original.
2ef85d8 to
0e7a724
Compare
src/environ.c
Outdated
| * escaped length will be original length * 6 + 1 in the worst case */ | ||
| p = escaped_value = apr_palloc(req->pool, disp_value.length * 6 + 1); | ||
| for (i = 0, j = 0; i < disp_value.length; i++, j++) { | ||
| if ((value[i] < 0x1F) || (value[i] == '"') || (value[i] == '\\')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the if/else? you are alredy doing a switch/case here ... it can be handled in there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. You need if you want to do the
escaped_value[j] = '\\';
j++;
only once. Otherwise the default case would comprehend the non-escape case and you cannot do it for the common case (well, I didn't find a way to do it). So I found out that doing first the if to differenciate escape/non-escape and the using the switch/case for selecting the replacement characters was better suited.
simo5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use just the switch case not if/else around it, it's redundant
src/environ.c
Outdated
| break; | ||
| default: | ||
| apr_snprintf(&escaped_value[j], 6, "u%04d", (int) value[i]); | ||
| j += 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right, you should add 5 to j, seem like this has not been tested ...
I think I preferred the previous version where you would increment a pointer and write to it, reades better than setting stuff in escaped_value[j] for some reason ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is exactly the opposite... I've tested it in an independent C program and used valgrind and "jq" utility to check memory and JSON formatting. At first I thougth it was as you propose but
note that the "for" loop will increment j for you.
Nonetheless, let me convert this into the pointer-based version (without rebasing) and see if that looks nicer.
|
Now I have a comment. I have a lot of which can be written as: Which one do you prefer? |
|
I do prefer this last version. I think autoincrents in this case read fairly well. |
|
In all honesty I think I prefer the initial version based on memcpy(), with the addition of the \u00XX handling. |
|
Sure, let me do it :). |
|
BTW I used <= 0x1F since it seems value 0x1F should be escaped as well. |
simo5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
please squash all in one commit and I'll push
420028d to
5c235eb
Compare
|
Done |
Cover all the cases mentioned in RFC4627. Reviewed-by: Simo Sorce <simo@redhat.com> Closes gssapi#132
|
Ok, so I made some minor changes and pushed them to your tree.
Here is the interdiff: If you are ok with these changes I'll push. |
|
Yes I am ok with the changes.
Enviado con AquaMail para Android http://www.aqua-mail.com
En 28 de febrero de 2017 22:17:00 Simo Sorce <notifications@github.com>
escribió:
… Ok, so I made some minor changes and pushed them to your tree.
Changes:
- Minor style changes
- Comment was outdated, changed
- Commit message fixed (subject was too long)
Here is the interdiff:
```
diff -u b/src/environ.c b/src/environ.c
--- b/src/environ.c
+++ b/src/environ.c
@@ -101,16 +101,16 @@
gss_buffer_desc disp_value)
{
/* This function returns a copy (in the pool) of the given gss_buffer_t
- * where every occurrence of " has been replaced by \". This string is
+ * where some characters are escaped as required by RFC4627. The string is
* NULL terminated */
- int i = 0;
- char *escaped_value = NULL, *p = NULL;
- char *value = (char*) disp_value.value;
+ char *value = disp_value.value;
+ char *escaped_value = NULL;
+ char *p = NULL;
/* gss_buffer_t are not \0 terminated, but our result will be. Hence,
* escaped length will be original length * 6 + 1 in the worst case */
p = escaped_value = apr_palloc(req->pool, disp_value.length * 6 + 1);
- for (i = 0; i < disp_value.length; i++) {
+ for (size_t i = 0; i < disp_value.length; i++) {
switch (value[i]) {
case '"':
memcpy(p, "\\\"", 2);
@@ -142,7 +142,7 @@
break;
default:
if (value[i] <= 0x1F) {
- apr_snprintf(p, 7, "\\u%04d", (int) value[i]);
+ apr_snprintf(p, 7, "\\u%04d", (int)value[i]);
p += 6;
} else {
*p = value[i];
```
If you are ok with these changes I'll push.
--
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
https://github.com/modauthgssapi/mod_auth_gssapi/pull/132#issuecomment-283162892
|
The old escaping code only escapes '"' characters, but RFC4627 also mentions other characters that must be escaped, such as '\n', '/', '\b', etc.
The new code adds support for escaping these characters, and provides a simplified implementation. In particular, instead of iterating the original string twice (one for counting characters, and the other to replace them), only a single iteration is performed always. Memory is allocated as the worst case scenario (i.e. all the characters are to be escaped).
I have used:
because I thougth it was clearer than:
But we can change that if you prefer.