Fix index out of bounds crash in HttpHeaderHelper.findCharset by emlun · Pull Request #89 · apache/cxf

emlun · 2015-09-21T13:07:59Z

Sending an HTTP request with the header "Content-Type: foo/bar; charset="
would previously make this method throw a StringIndexOutOfBoundsException
that would go uncaught and cause a 500 response:

java.lang.StringIndexOutOfBoundsException: String index out of range: 0
        at java.lang.String.charAt(String.java:658)
        at org.apache.cxf.helpers.HttpHeaderHelper.findCharset(HttpHeaderHelper.java:90)
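
For illustration, here is a minimal sketch of how this kind of crash can arise and how it might be guarded against. This is not the actual CXF code: `CharsetSketch`, `findCharsetUnsafe` and `findCharsetGuarded` are hypothetical names, and the parsing is deliberately simplified (it assumes `charset=` is matched case-sensitively and is the last parameter in the header).

```java
final class CharsetSketch {

    private static final String PARAM = "charset=";

    /** Naive version: assumes at least one character follows "charset=". */
    static String findCharsetUnsafe(String contentType) {
        int idx = contentType.indexOf(PARAM);
        if (idx < 0) {
            return null; // parameter absent
        }
        String value = contentType.substring(idx + PARAM.length());
        // charAt(0) throws StringIndexOutOfBoundsException when value is empty.
        if (value.charAt(0) == '"') {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    /** Guarded version: checks lengths before indexing; returns null for an empty value. */
    static String findCharsetGuarded(String contentType) {
        int idx = contentType.indexOf(PARAM);
        if (idx < 0) {
            return null; // parameter absent
        }
        String value = contentType.substring(idx + PARAM.length()).trim();
        // Only strip quotes when both an opening and a closing quote are present.
        if (value.length() >= 2
                && value.charAt(0) == '"'
                && value.charAt(value.length() - 1) == '"') {
            value = value.substring(1, value.length() - 1);
        }
        return value.isEmpty() ? null : value;
    }

    public static void main(String[] args) {
        String header = "foo/bar; charset=";
        try {
            findCharsetUnsafe(header);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unsafe version threw: " + e.getMessage());
        }
        System.out.println("guarded version returned: " + findCharsetGuarded(header));
    }
}
```

In this sketch the guarded version treats an empty value the same way as a missing parameter, which is the behaviour the patch aims for.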


sberyozkin · 2015-10-07T10:27:12Z

FYI, http://git-wip-us.apache.org/repos/asf/cxf/commit/59b87cad, I did not apply the patch directly, as I renamed the tests. Please close this request yourself. Thanks

emlun · 2015-10-08T20:28:12Z

Cool, thanks! I don't quite understand why you didn't just make your changes as follow-up commits, which would have better preserved my authorship, but alright.

sberyozkin · 2015-10-08T20:34:01Z

Sure, I'll explain. I did not like the actual test name and I thought it was not a very useful test as it was only 'asserting' that the given line runs - hence I added two tests explicitly checking two error situations.
The commit message refers to your alias.
Cheers

emlun · 2015-10-24T14:39:45Z

(Sorry for taking so long to respond)

Sure, I have no problem at all with the changes you made. It's just that, in the same situation, I would have made those changes in additional commits instead of starting a completely new, unrelated (as far as Git is concerned) branch.

elakito · 2015-10-26T10:07:05Z

Isn't this patch encouraging invalid usage of the charset parameter?

SIOOBE was bad, but shouldn't we be throwing some invalidity exception, or at least writing a WARN log, to signal that the content may potentially not be decoded correctly?

emlun · 2015-10-26T10:17:51Z

My thinking when writing the patch was that charset= should probably be handled in the same way as if the charset parameter wasn't present at all. Now that you point it out, this decision is indeed questionable. The parameter being left out vs. being specified, but with an invalid value, are indeed different things. I'm inclined to agree that silently ignoring an empty parameter value probably isn't the right way to go, but I'm not familiar enough with the codebase to suggest a better course of action.

emlun · 2015-10-26T10:18:27Z

Oops, accidentally hit the reopen button while writing.

sberyozkin · 2015-10-26T21:30:08Z

Hi emlun - sure, I'll bear that in mind when merging your next patch. Hi Aki - I think there was code there already defaulting to UTF-8, but if you think something may need to be fixed we can definitely do it :-)

elakito · 2015-10-27T09:35:50Z

@sberyozkin somewhere I remember reading that a missing charset is supposed to be interpreted as charset utf-8 in http. But the current mime RFC [1] as well as w3c's internationalization document [2] both mention that a missing charset means iso-8859-1. So, I don't remember where I read the default utf-8 convention.

But here I was talking not about the default but about the invalid charset syntax. Something went wrong or was programmed wrong, and a client is sending a content-type header with
Content-Type: text/xml; charset=

The above specs say the charset value must be a valid IANA charset value. In this case, we don't know why the client generated this invalid charset entry. Was it trying to set the system default charset and didn't realize the value was null? Or did something else go wrong? Hence, simply ignoring this invalid charset parameter and defaulting to utf-8 will hide this problem from our eyes and potentially lead to incorrect decoding.

[1] https://tools.ietf.org/html/rfc7230
[2] http://www.w3.org/International/O-HTTP-charset#charset
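
To make the distinction concrete, here is a minimal sketch that treats a missing charset parameter and an invalid one differently. It is an assumption-laden illustration, not CXF code: `CharsetPolicySketch` and `resolveCharset` are hypothetical names, the ISO-8859-1 default for a missing parameter follows the reading of the specs given above, and throwing `IllegalArgumentException` is just one possible way to surface the problem (a WARN log would be the softer alternative).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetPolicySketch {

    private static final String PARAM = "charset=";

    /**
     * Resolves the charset of a Content-Type header value.
     * Missing parameter: default charset. Empty or unknown value: exception.
     * Simplification: the parameter name is matched case-sensitively.
     */
    static Charset resolveCharset(String contentType) {
        int idx = contentType.indexOf(PARAM);
        if (idx < 0) {
            // Parameter absent: fall back to the assumed default.
            return StandardCharsets.ISO_8859_1;
        }
        String value = contentType.substring(idx + PARAM.length());
        int end = value.indexOf(';');
        if (end >= 0) {
            value = value.substring(0, end); // drop any following parameters
        }
        value = value.trim();
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        if (value.isEmpty()) {
            // Parameter present but empty: report it instead of silently defaulting.
            throw new IllegalArgumentException(
                "Empty charset parameter in Content-Type: " + contentType);
        }
        try {
            return Charset.forName(value);
        } catch (IllegalArgumentException e) {
            // Covers both IllegalCharsetNameException and UnsupportedCharsetException.
            throw new IllegalArgumentException(
                "Invalid charset '" + value + "' in Content-Type: " + contentType, e);
        }
    }
}
```

With this sketch, `text/xml` resolves to the default, `text/xml; charset=utf-8` resolves to UTF-8, and `text/xml; charset=` is rejected, so a caller could map the exception to a client error rather than decoding with a guessed charset.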

sberyozkin · 2015-10-27T11:07:30Z

Sorry, I meant ISO-8859-1, my fault. As far as the actual defaulting is concerned, I've no strong opinion here; the older code was interpreting the absence of the charset by defaulting to ISO-8859-1. I'm not sure if having 'charset=' is equivalent to omitting a charset or to a bad client request situation...

sberyozkin · 2015-10-27T11:08:43Z

emlun, how did you end up with a 'charset=' being created? Do you use some existing REST client that does it?

emlun · 2015-10-27T11:18:05Z

I think I had charset= as an explicit test input (though perhaps
inadvertently) in an integration test at some point, while testing that a
web application should refuse non-JSON input. It's not in my test suite
anymore, however.


Emil Lundberg added 2 commits September 21, 2015 15:02

Fix another index out of bounds crash in findCharset

7f46dc2

emlun closed this Oct 8, 2015

emlun reopened this Oct 26, 2015

emlun closed this Oct 26, 2015

emlun wants to merge 2 commits into apache:master from emlun:charset-string-out-of-bounds
