Since the problem seems to be with values lower than 0, we can just change the check from c < 0x20 to c > 0 && c < 0x20. Would that work?
Yes, that fits my use case. I've sent PR #304.
And I believe we want to escape the null character too, since it's invisible, so the check is c >= 0 && c < 0x20.
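The check proposed in this comment can be sketched as a small predicate. This is only an illustration: needs_escape is a hypothetical helper name, not a function in Crow, and it covers just the control-character range discussed here.

```cpp
// Hypothetical helper, not part of Crow: true when a byte is an ASCII
// control character (0x00..0x1F), including the null character.
inline bool needs_escape(char c)
{
    // The explicit lower bound keeps negative values out of the escaped
    // range. Where char is signed, those are the bytes >= 0x80 that make
    // up multibyte UTF-8 sequences.
    return c >= 0 && c < 0x20;
}
```

With this bound, the result is the same whether char is signed or unsigned on the target platform.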
Hi! I need help serving JSON that contains UTF-8 strings.
As far as I understand, a JSON response is currently serialized by crow::json::dump_internal:
Crow/include/crow/json.h
Line 1736 in f96189f
which in turn calls crow::json::escape for strings:
Crow/include/crow/json.h
Line 1809 in f96189f
Crow/include/crow/json.h
Line 1729 in f96189f
Crow/include/crow/json.h
Line 41 in f96189f
Commit cdd6139 removed 0 <= c && and changed char to unsigned char, so the logic stays the same: escape invisible chars, 0 <= ch < 32.
Commit df41cbe by @lcsdavid changes unsigned char to auto (e.g. char).
char - type for character representation which can be most efficiently processed on the target system (has the same representation and alignment as either signed char or unsigned char, but is always a distinct type).
I'm not sure exactly what problem that change was supposed to solve, but now not only are invisible chars escaped (0 <= ch < 0x20, https://www.asciitable.com/); the bytes of UTF-8 sequences may be escaped too (if auto deduces to char and char is signed on this architecture).
Every byte of a multibyte UTF-8 sequence has its most significant bit set (0x80), so as a signed char it is always negative under two's complement (almost any architecture), and ch < 0x20 is now true for every byte of a non-ASCII character.
https://en.wikipedia.org/wiki/UTF-8#Encoding
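To make the signedness effect described above concrete, here is a minimal sketch. count_flagged is a hypothetical helper written for this illustration, not Crow code; it applies the ch < 0x20 comparison while viewing each byte as either signed char or unsigned char.

```cpp
#include <string>

// Counts the bytes that a `ch < 0x20` check would flag when each byte
// is viewed as CharT (signed char or unsigned char).
template <typename CharT>
int count_flagged(const std::string& s)
{
    int flagged = 0;
    for (char raw : s)
    {
        CharT ch = static_cast<CharT>(raw);
        if (ch < 0x20)
            ++flagged;
    }
    return flagged;
}
```

For "é", which is encoded in UTF-8 as the two bytes 0xC3 0xA9, count_flagged<signed char> flags both bytes on a two's complement platform, while count_flagged<unsigned char> flags neither.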
The original project's solution was to store UTF-8 sequences in std::string:
ipkn/crow#189
But with the commit mentioned above, that solution can no longer be applied.
The middleware just adds UTF-8 headers to text, and I'm not sure the middleware is the right place to undo the escapes mentioned above.
#202
I have almost no grasp of the codebase, but it seems to me that it would be nice to have a customization point for defining the escape function, or to introduce a new JSON value type, raw_string, that would not be escaped later. Maybe default
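One way such a customization point could look is sketched below. Everything here is hypothetical: escape_with and escape_control_only are names invented for this illustration and do not exist in Crow, and the sketch handles only the \u00XX case, leaving out quotes and backslashes for brevity.

```cpp
#include <cstdio>
#include <string>

// Hypothetical default policy: escape only ASCII control characters.
struct escape_control_only
{
    bool operator()(unsigned char c) const { return c < 0x20; }
};

// Hypothetical escape routine with the policy as a customization point.
// Bytes the policy flags are written as \u00XX; all others are copied as-is.
template <typename ShouldEscape = escape_control_only>
std::string escape_with(const std::string& in,
                        ShouldEscape should_escape = ShouldEscape{})
{
    std::string out;
    for (char raw : in)
    {
        // Work on unsigned values so bytes >= 0x80 are never negative.
        unsigned char c = static_cast<unsigned char>(raw);
        if (should_escape(c))
        {
            char buf[7]; // 6 characters of "\u00XX" plus the terminator
            std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
            out += buf;
        }
        else
        {
            out += raw;
        }
    }
    return out;
}
```

With the default policy, UTF-8 bytes pass through untouched while control characters are still escaped; a caller who wants different behavior passes their own predicate.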
I could make a PR, given some high-level guidance about what would be acceptable in this situation.
For now, I've just reverted to unsigned char, and that works fine for me. Could someone kindly explain why that's wrong? Elaborating on how it should be done would be even better. Thanks!