specifying charset for body content? #926

garretwilson · 2017-10-18T13:55:33Z

What charset do methods such as RequestSpecification.body(String body) use? Because you have to be using some charset to convert a string to bytes.

Surely you aren't using the default system charset --- this would be a defect, as tests would be neither reliable nor repeatable because you have no idea what charset the system is using.

From the API it seems that there is a big problem here --- you should be allowing the user to indicate the charset to use.

garretwilson · 2017-10-18T14:05:14Z

Unfortunately I'm getting the idea that this is indeed a defect, after reading e.g. #567.

Look at all the Java APIs from String.getBytes(Charset) to InputStreamReader. They all require a Charset to be indicated. Only old APIs rely on the system default charset, because this is not reliable across platforms.
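
To make the point concrete, here is a minimal sketch using only the JDK (no REST Assured involved). The class name CharsetDemo and the helper encode are made up for illustration. It shows that the same string becomes different bytes depending on the charset passed to String.getBytes(Charset), which is why leaving the choice to the platform default makes results depend on the machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: encodes text with an explicitly chosen charset,
    // so the platform default is never consulted.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        // "もしもし" written with Unicode escapes so the encoding of this
        // source file does not affect the result.
        String body = "\u3082\u3057\u3082\u3057";

        byte[] utf8 = encode(body, StandardCharsets.UTF_8);
        byte[] latin1 = encode(body, StandardCharsets.ISO_8859_1);

        // UTF-8 needs three bytes for each of these four characters.
        System.out.println("UTF-8 bytes: " + utf8.length);
        // ISO-8859-1 cannot represent them; getBytes substitutes '?' for each.
        System.out.println("ISO-8859-1 bytes: " + latin1.length);
        // Whatever this prints is what charset-less APIs would silently use.
        System.out.println("Platform default: " + Charset.defaultCharset());
    }
}
```

With ISO-8859-1 the text is silently replaced rather than rejected, so a wrong charset can go unnoticed until the receiving side compares the content.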

Ideally you should have a RequestSpecification.charset(Charset) method, but you should also pick up whether the charset parameter was used in RequestSpecification.contentType(ContentType contentType). Either of these would be used as the charset for RequestSpecification.body(String body); otherwise, it should throw an exception if the charset hasn't been set first.

Having a RequestSpecification.body(String body, Charset charset) would allow the body and charset to be set in one go.

garretwilson · 2017-10-18T19:42:05Z

To better explain the problem, I wonder what charset this would use to encode the bytes?

given()
  .contentType("text/plain; charset=ISO-2022-JP")
  .body("もしもし")
  …

Would the body be encoded in ISO-2022-JP? Or UTF-8? (How would I know? The documentation doesn't seem clear on this point.) My fear is that neither would happen, and that REST Assured might in fact just use the system default charset---which might be Cp1251 for all we know (as @mkotsur might be using in mkotsur/restito#66 ).

johanhaleby · 2017-10-23T04:34:13Z

If you specify a charset explicitly in the contentType, that charset should be used.

If the charset is not explicitly specified, REST Assured will query the config to check which charset to use. You can specify either the default charset (which defaults to ISO_8859_1 for all content types except JSON, which uses UTF-8; the ISO_8859_1 default is the one defined by Apache Http Client in HTTP#DEF_CONTENT_CHARSET) or the charset on a per-content-type basis using the EncoderConfig.
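
The lookup order described above can be sketched in plain Java. This is only an illustration of the described behavior, not REST Assured's actual code: the class CharsetResolution, the map PER_TYPE_DEFAULTS, and the method resolve are all hypothetical names, and the real library is configured through EncoderConfig rather than a map like this.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CharsetResolution {

    // Hypothetical stand-in for per-content-type defaults (JSON as UTF-8, per the thread).
    static final Map<String, Charset> PER_TYPE_DEFAULTS = new HashMap<>();
    static {
        PER_TYPE_DEFAULTS.put("application/json", StandardCharsets.UTF_8);
    }

    // Global fallback named in the thread.
    static final Charset GLOBAL_DEFAULT = StandardCharsets.ISO_8859_1;

    // Resolves the charset for a Content-Type header value such as
    // "text/plain; charset=UTF-16" or "application/json".
    static Charset resolve(String contentType) {
        String[] parts = contentType.split(";");
        String mimeType = parts[0].trim().toLowerCase(Locale.ROOT);

        // 1. An explicit charset parameter wins.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }

        // 2. Otherwise use the per-content-type default, 3. else the global default.
        return PER_TYPE_DEFAULTS.getOrDefault(mimeType, GLOBAL_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/plain; charset=UTF-16")); // explicit parameter
        System.out.println(resolve("application/json"));           // per-type default
        System.out.println(resolve("text/plain"));                 // global default
    }
}
```

Under these assumptions, a plain-text body with no charset parameter would be encoded as ISO-8859-1, so non-Latin text needs either an explicit charset in the content type or a changed default.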

garretwilson mentioned this issue Oct 18, 2017

ContentType.JSON should automatically encode as UTF-8 #567

Closed

garretwilson referenced this issue in mkotsur/restito Oct 18, 2017

(fix) Severe encoding bug in Action.stringContent(String content). See …

c6f269b

…#66
