JSON string escaping support #753

noloader · 2022-10-26T03:07:06Z

This commit adds JSON string escaping support according to RFC 8259, Section 7.
See GitHub Issue
closes #754. [added by @kwwall]

src/main/java/org/owasp/esapi/codecs/JSONCodec.java

jeremiahjstacey · 2022-10-26T10:16:02Z

src/main/java/org/owasp/esapi/codecs/JSONCodec.java

+    protected int charToCodepoint( Character ch ) {
+
+        final String s = Character.toString(ch);
+        assert (s.length() == 1) : "Ooops";


Asserts are enabled by the Surefire plugin by default, so if this triggers we will see the output in the test dump. Would you consider adding more context to the assert failure output to help identify this as the cause when inspecting from a terminal?

EG "JSONCodec - Unable to convert specified character to codepoint. Character String conversion must contain more than one character"

I don't feel strongly about this. There is a lot of output in the tests, so the likelihood is that the stacktrace from any assertion failure will be more useful than the message itself. This will get us in the right area, and I think most developers will be able to fill in the rest with the logic being applied.

Resolving this review thread.

In #753 (comment), @jeremiahjstacey wrote:

Asserts are enabled by the Surefire plugin by default, so if this triggers we will see the output in the test dump. Would you consider adding more context to the assert failure output to help identify this as the cause when inspecting from a terminal?

That still doesn't mean that we should rely on 'asserts'. The are not generally enabled in runtime by default and even if we are doing so in our Surefire plugin, I would feel better if anything that can be caused by a user accidentally providing incorrect input or not using the classes as per their Javadoc, should be an actual runtime check and not rely on assert. Rather, there should be an explicit check and if input is amok, then throw an IllegalArgumentException. In other words, do NOT use assert to check for preconditions. Using asserts for postconditions or invariants, I think is okay, as that generally means a program error on our part (assume we've validated all the inputs for sanity), but don't use them for precondition checking.

So, @noloader, if this is a precondition check using assertions, could you change it to an explicit check? Thanks. (I haven't checked, as I'm talking to my mother and trying to listen to her and type at the same time.)

@kwwall per your comment on the assert statement I've unresolved that conversation. Suggest waiting to merge pending updated resolution.

See my previous comment here. If Unicode characters can extend from 0x0 to 0x10ffff (rather than 0xffff), is there any concern that this could result in a string whose length is >1?

I don't think we need to resolve this before merging this PR, but if it is a problem, we should create a GitHub issue to address it later and to note it somewhere in the JSONCodec Javadoc.

src/main/java/org/owasp/esapi/codecs/JSONCodec.java

jeremiahjstacey · 2022-10-26T10:33:55Z

@noloader, I scanned the existing bugs and I don't see a task to associate this PR to. (I may have missed it, sorry)

If something does not exist already, would you please create an issue in the project that encapsulate the intent and value of this change? Once identified or created, please associate it with this PR for project traceability.

kwwall · 2022-10-26T16:26:45Z

I believe there is an old, closed GitHub issue for this that was closed because he thought lost the code when he did a rebase, but he was able to recover it. We should find that old GitHub issue and just reopen it.

…

-kevin

On Wed, Oct 26, 2022, 6:34 AM jeremiahjstacey ***@***.***> wrote: @noloader <https://github.com/noloader>, I scanned the existing bugs and I don't see a task to associate this PR to. (I may have missed it, sorry) If something does not exist already, would you please create an issue in the project that encapsulate the intent and value of this change? Once identified or created, please associate it with this PR for project traceability. — Reply to this email directly, view it on GitHub <#753 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAO6PG7ZS5YYM6HZPNOIGI3WFECJ5ANCNFSM6AAAAAARORONOE> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Logic adjustements to the implementatation of JsonCodec.decodeCharacter to consolidate method return events to four: 1 - First character is null (fast return/no Exception) 2 - First character is not null, Second character is Null (fast return w/Exception) 3 - First & Second not null, but no decode can be performed (Exception) 4 - First & Second not null, decode performed and returned (normal response) Also added the two-character output to the Exception message in case 3 to aid with diagnostics/debugging.

noloader · 2022-10-27T19:02:06Z

@jeremiahjstacey

I scanned the existing bugs and I don't see a task to associate this PR to. (I may have missed it, sorry)

If something does not exist already, would you please create an issue in the project that encapsulate the intent and value of this change? Once identified or created, please associate it with this PR for project traceability.

Its kind of a winding road to get here.

This started with Kevin and I talking offline because we need a JSONEncoder at $dayjob. We did not want to use a JavaScript encoder because the grammars are slightly different. Then, the discussion moved to the mailing list at JSON codec discussion on esapi-github, https://groups.google.com/a/owasp.org/g/esapi-project-users/c/Q0LTNGxbTQc/m/xoyjMNwCAgAJ .

The mailing list message points to the dead PR at #722 . The original PR is dead because I accidentally blew away the code changes on a rebase. I was able to recover the deleted changes and files through Git's reflog, which eventually lead to this [new] PR.

kwwall · 2022-10-27T20:18:42Z

I looked at the issue history and didn't find one. No worries. I will create one for you tonight, and reference this PR so they will get linked together.

…

-kevin

On Thu, Oct 27, 2022, 3:02 PM Jeffrey Walton ***@***.***> wrote: @jeremiahjstacey <https://github.com/jeremiahjstacey> I scanned the existing bugs and I don't see a task to associate this PR to. (I may have missed it, sorry) If something does not exist already, would you please create an issue in the project that encapsulate the intent and value of this change? Once identified or created, please associate it with this PR for project traceability. Its kind of a winding road to get here. This started with Kevin and I talking offline because we need a JSONEncoder at $dayjob. We did not want to use a JavaScript encoder because the grammars are slightly different. Then, the discussion moved to the mailing list at *JSON codec discussion on esapi-github*, https://groups.google.com/a/owasp.org/g/esapi-project-users/c/Q0LTNGxbTQc/m/xoyjMNwCAgAJ . The mailing list message points to the dead PR at #722 <#722> . The original PR is dead because I accidentally blew away the code changes on a rebase. I was able to recover the deleted changes and files through Git's reflog, which eventually lead to this [new] PR. — Reply to this email directly, view it on GitHub <#753 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAO6PG3GHNDHPT5W7LY3OPTWFLGTTANCNFSM6AAAAAARORONOE> . You are receiving this because you commented.Message ID: ***@***.***>

Using the more appropriate %c syntax for the String.format when displaying the first and second characters in the exception message.

Updating comments pertaining to handling of invalid code points in UTF8, per request in PR ESAPI#753 ESAPI#753 (comment)

kwwall · 2022-11-15T23:50:21Z

@noloader, @jeremiahjstacey, & @xeno6696 - Call for final comments and/ commits on this PR. I'd like to get this and @jeremiahjstacey's PR #756 merged soon so we can do another release. I know Dave Wichers is planning on a AntiSamy point release to fix a CVE in a dependency so we should release when AntiSamy does to prevent people from claiming ESAPI is vulnerable.

noloader · 2022-11-16T15:51:23Z

On Wed, Nov 16, 2022 at 6:07 AM jeremiahjstacey ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In src/main/java/org/owasp/esapi/codecs/JSONCodec.java <#753 (comment)> : > + // Do nothing. Fall into throw below. + } + catch (IllegalArgumentException e) + { + // Do nothing. Fall into throw below. + } + + // Catch all. The escaped character sequence was invalid. + input.reset(); + throw new IllegalArgumentException( errorMessage ); + } + + protected int charToCodepoint( Character ch ) { + + final String s = Character.toString(ch); + assert (s.length() == 1) : "Ooops"; @kwwall <https://github.com/kwwall> per your comment on the assert statement I've unresolved that conversation. Suggest waiting to merge pending updated resolution.

The code does not depend on behavior from the assert. The idea behind the assert is to alert the developer when the code is being debugged. That's all I wanted to do there: "Hey! Something looks sideways. Take a look at this!". In C/C++/ObjC apps, an assert should snap the debugger. In release builds the assert should be removed. If my understanding is mistaken or a web app does not alert the developer during a debug session, then would someone correct it, please. Now the behavior when asserts are not in effect may be wrong. The behavior is to return the first character in the pair. I feel like that's not quite correct. I think the docs say an empty string will be returned on encoding and decoding errors. Maybe that code path should return "" instead.

…

Message ID: ***@***.*** com>

kwwall · 2022-11-16T16:12:50Z

In Java, assertions are disabled by default. We enable them via the Surefire plugin for our JUnit tests, but for others development where ESAPI is used, they would be disabled. So, just consider that as a possibility.

…

-kevin On Wed, Nov 16, 2022, 10:51 AM Jeffrey Walton ***@***.***> wrote:

On Wed, Nov 16, 2022 at 6:07 AM jeremiahjstacey ***@***.***> wrote: > ***@***.**** commented on this pull request. > ------------------------------ > > In src/main/java/org/owasp/esapi/codecs/JSONCodec.java > < #753 (comment) > > : > > > + // Do nothing. Fall into throw below. > + } > + catch (IllegalArgumentException e) > + { > + // Do nothing. Fall into throw below. > + } > + > + // Catch all. The escaped character sequence was invalid. > + input.reset(); > + throw new IllegalArgumentException( errorMessage ); > + } > + > + protected int charToCodepoint( Character ch ) { > + > + final String s = Character.toString(ch); > + assert (s.length() == 1) : "Ooops"; > > @kwwall <https://github.com/kwwall> per your comment on the assert > statement I've unresolved that conversation. Suggest waiting to merge > pending updated resolution. > The code does not depend on behavior from the assert. The idea behind the assert is to alert the developer when the code is being debugged. That's all I wanted to do there: "Hey! Something looks sideways. Take a look at this!". In C/C++/ObjC apps, an assert should snap the debugger. In release builds the assert should be removed. If my understanding is mistaken or a web app does not alert the developer during a debug session, then would someone correct it, please. Now the behavior when asserts are not in effect may be wrong. The behavior is to return the first character in the pair. I feel like that's not quite correct. I think the docs say an empty string will be returned on encoding and decoding errors. Maybe that code path should return "" instead. > Message ID: ***@***.*** > com> > — Reply to this email directly, view it on GitHub <#753 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAO6PG2FKDVUTMQZZRJCWR3WIT7ILANCNFSM6AAAAAARORONOE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

xeno6696 · 2022-11-16T17:54:15Z

@kwwall i know I need to review this, I’m on travel for work (again) and I can’t sit down with this until this weekend.

If we need to push faster than that I’ll trust the group judgment here.

xeno6696 · 2022-11-20T21:05:23Z

Given that this was recovered code from #722 I feel like I beat that up pretty well already. @kwwall I'm okay with moving forward. I would normally just merge here but it sounded like you had some changes and I don't want to step on your toes here.

xeno6696 · 2022-11-20T21:10:44Z

I did want to call myself out to building some unit tests that handled the full Unicode range. If we stayed with the AbstractCharacterCodec I have a minor frowny-face just because it was only supposed to be a bridge until all the codecs were converted to Integer-based codecs... however this is my fault as I should have marked that abstract class as deprecated. (It could be that there are some codecs where for some reason, it makes zero sense at all to expand to handle the full UTF-8 range, but I figured we'd handle those case by case.)

kwwall

Would like a response to my comment before I approve merging this PR. Changes may or may not be required. I just want to be sure our assumptions about code points are still legitimate as @xeno6696 did the AbstractCodec work many years ago when we were using Java 7 and that may have been an older version of Unicode.

src/main/java/org/owasp/esapi/codecs/AbstractCodec.java

kwwall · 2022-11-22T03:59:35Z

src/main/java/org/owasp/esapi/codecs/AbstractIntegerCodec.java

@@ -25,6 +25,9 @@
 * points cannot be represented by a {@code char}, this class remedies that by parsing string
 * data as codePoints as opposed to a stream of {@code char}s.
 *
+ * WARNING:  This class will silently discard an invalid code point according to


See my previous comment in AbstractCodec.java. In particular, it seems as though Character.isValidCodepoint( int ) allows a broader range that what was anticipated.

kwwall · 2022-11-22T04:01:59Z

src/main/java/org/owasp/esapi/codecs/JSONCodec.java

+    protected int charToCodepoint( Character ch ) {
+
+        final String s = Character.toString(ch);
+        assert (s.length() == 1) : "Ooops";


See my previous comment here. If Unicode characters can extend from 0x0 to 0x10ffff (rather than 0xffff), is there any concern that this could result in a string whose length is >1?

I don't think we need to resolve this before merging this PR, but if it is a problem, we should create a GitHub issue to address it later and to note it somewhere in the JSONCodec Javadoc.

JSON string escaping support

3fed7c3

jeremiahjstacey reviewed Oct 26, 2022

View reviewed changes

src/main/java/org/owasp/esapi/codecs/JSONCodec.java Show resolved Hide resolved

jeremiahjstacey reviewed Oct 26, 2022

View reviewed changes

src/main/java/org/owasp/esapi/codecs/JSONCodec.java Show resolved Hide resolved

JSONCodec Decode Exception Message Correction

1ceb62a

Using the more appropriate %c syntax for the String.format when displaying the first and second characters in the exception message.

kwwall mentioned this pull request Oct 28, 2022

JSON encoder #754

Closed

Doc Updates - Character Codec UTF8 handling

9f1a2cf

Updating comments pertaining to handling of invalid code points in UTF8, per request in PR ESAPI#753 ESAPI#753 (comment)

kwwall reviewed Nov 22, 2022

View reviewed changes

kwwall merged commit 5f21e78 into ESAPI:develop Nov 23, 2022

noloader deleted the json branch June 18, 2023 16:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JSON string escaping support #753

JSON string escaping support #753

noloader commented Oct 26, 2022 •

edited by jeremiahjstacey

Loading

jeremiahjstacey Oct 26, 2022

jeremiahjstacey Nov 3, 2022

kwwall Nov 15, 2022

jeremiahjstacey Nov 16, 2022

kwwall Nov 22, 2022 •

edited

Loading

jeremiahjstacey commented Oct 26, 2022

kwwall commented Oct 26, 2022 via email

noloader commented Oct 27, 2022

kwwall commented Oct 27, 2022 via email

kwwall commented Nov 15, 2022 •

edited

Loading

noloader commented Nov 16, 2022 via email

kwwall commented Nov 16, 2022 via email

xeno6696 commented Nov 16, 2022

xeno6696 commented Nov 20, 2022

xeno6696 commented Nov 20, 2022

kwwall left a comment

kwwall Nov 22, 2022

kwwall Nov 22, 2022 •

edited

Loading

JSON string escaping support #753

JSON string escaping support #753

Conversation

noloader commented Oct 26, 2022 • edited by jeremiahjstacey Loading

jeremiahjstacey Oct 26, 2022

Choose a reason for hiding this comment

jeremiahjstacey Nov 3, 2022

Choose a reason for hiding this comment

kwwall Nov 15, 2022

Choose a reason for hiding this comment

jeremiahjstacey Nov 16, 2022

Choose a reason for hiding this comment

kwwall Nov 22, 2022 • edited Loading

Choose a reason for hiding this comment

jeremiahjstacey commented Oct 26, 2022

kwwall commented Oct 26, 2022 via email

noloader commented Oct 27, 2022

kwwall commented Oct 27, 2022 via email

kwwall commented Nov 15, 2022 • edited Loading

noloader commented Nov 16, 2022 via email

kwwall commented Nov 16, 2022 via email

xeno6696 commented Nov 16, 2022

xeno6696 commented Nov 20, 2022

xeno6696 commented Nov 20, 2022

kwwall left a comment

Choose a reason for hiding this comment

kwwall Nov 22, 2022

Choose a reason for hiding this comment

kwwall Nov 22, 2022 • edited Loading

Choose a reason for hiding this comment

noloader commented Oct 26, 2022 •

edited by jeremiahjstacey

Loading

kwwall Nov 22, 2022 •

edited

Loading

kwwall commented Nov 15, 2022 •

edited

Loading

kwwall Nov 22, 2022 •

edited

Loading