Wrong encoding for special characters (Swedish language) #10604

Closed
Pooriafd opened this issue Jan 28, 2020 · 5 comments
Labels: bug, in:JAX-RS, release bug, team:Wendigo West

Comments

@Pooriafd

Describe the bug
In the latest release (20.0.0.1), our application started handling Swedish characters incorrectly.

Steps to Reproduce
Return a response containing Swedish text such as äppel or översätt. The special characters come out as question marks instead of the correct characters.

Expected behavior
Swedish characters should be written correctly in responses.

Diagnostic information:

  • OpenLiberty Version: 20.0.0.1
  • Java Version: 11.0.6+10-LTS

Additional context
Setting LANG=en_US.UTF-8 fixed the problem on Linux. However, we still see the same problem on Windows, where setting the LANG environment variable does not help.
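
Since changing LANG altered the behaviour on Linux, the output appears to depend on the JVM's default charset. The following is a minimal diagnostic sketch for checking which default a given JVM picked up; the class name is hypothetical and not part of Open Liberty.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns a one-line summary of the charset settings this JVM started with.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
             + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this on both the Linux and the Windows host would show whether the two JVMs start with different defaults. Windows does not use LANG for this, and the thread does not confirm an equivalent Windows-side setting.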

@Pooriafd Pooriafd added the bug This bug is not present in a released version of Open Liberty label Jan 28, 2020
@chooty

chooty commented Jan 28, 2020

This seems to happen as a result of the change in JsonBProvider in commit 548bcf2 where getBytes is called without a charset specified.
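
To illustrate why a `getBytes` call without a charset can produce question marks, here is a small standalone sketch. It is not the JsonBProvider code; the class and method names are made up for the example. US-ASCII stands in for a platform default that cannot represent the Swedish characters, which is an assumption about what the affected hosts were using.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GetBytesDemo {
    // The Swedish word from the report, written with Unicode escapes
    // so the source file encoding does not matter.
    static final String TEXT = "\u00f6vers\u00e4tt";

    // Encodes TEXT with the given charset, then decodes the bytes as UTF-8,
    // the way a client expecting UTF-8 JSON would read them.
    static String encodeThenReadAsUtf8(Charset serverCharset) {
        byte[] wire = TEXT.getBytes(serverCharset);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The no-arg getBytes() uses Charset.defaultCharset(), so this line varies by machine.
        System.out.println("platform default: "
                + new String(TEXT.getBytes(), StandardCharsets.UTF_8));
        // Unmappable characters are replaced with '?' when encoding to US-ASCII.
        System.out.println("US-ASCII: " + encodeThenReadAsUtf8(StandardCharsets.US_ASCII));
        System.out.println("UTF-8: " + encodeThenReadAsUtf8(StandardCharsets.UTF_8));
    }
}
```

With US-ASCII the two non-ASCII letters are replaced during encoding, so the client sees `?vers?tt`. Passing an explicit charset to `getBytes` removes the dependency on the host configuration.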

@andymc12
Contributor

I'm taking a look - it seems like we should be setting the charset based on the Accept-Charset header instead of just assuming the default charset.

@chooty

chooty commented Jan 30, 2020

According to RFC 8259:
"JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8."

So to me it sounds like always using UTF-8 could be ok, or at least the default if no charset is specified.

@c-koell

c-koell commented Jan 31, 2020

We are facing the same problem after upgrading. German umlauts are not readable.

andymc12 added a commit to andymc12/open-liberty that referenced this issue Jan 31, 2020
andymc12 added a commit to andymc12/open-liberty that referenced this issue Jan 31, 2020
@andymc12
Contributor

andymc12 commented Feb 3, 2020

I've just integrated a change that uses UTF-8 as the default, but allows users to override the default charset for the JSON-B entity provider by setting com.ibm.ws.jaxrs.jsonbprovider.defaultCharset=<differentCharset>. The change will also attempt to respect the Accept-Charset HTTP header from the client.
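
The selection order described above (Accept-Charset first, then the configured default, then UTF-8) could be sketched roughly as follows. This is not the integrated Open Liberty change: the class and method names are hypothetical, reading the property as a JVM system property is an assumption, and q-value ordering in Accept-Charset is ignored for brevity.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetResolver {
    // Property name from the comment above; treating it as a system property is an assumption.
    static final String DEFAULT_CHARSET_PROP = "com.ibm.ws.jaxrs.jsonbprovider.defaultCharset";

    // Picks the first supported charset listed in Accept-Charset,
    // else the configured default, else UTF-8.
    static Charset resolve(String acceptCharsetHeader, String configuredDefault) {
        if (acceptCharsetHeader != null) {
            for (String part : acceptCharsetHeader.split(",")) {
                // Drop any ";q=..." parameter; this sketch does not sort by q-value.
                int semi = part.indexOf(';');
                String name = (semi >= 0 ? part.substring(0, semi) : part).trim();
                Charset cs = lookup(name);
                if (cs != null) {
                    return cs;
                }
            }
        }
        Charset fallback = lookup(configuredDefault);
        return fallback != null ? fallback : StandardCharsets.UTF_8;
    }

    // Returns the named charset, or null if the name is missing, "*", illegal, or unsupported.
    static Charset lookup(String name) {
        if (name == null || name.isEmpty() || name.equals("*")) {
            return null;
        }
        try {
            return Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Covers both IllegalCharsetNameException and UnsupportedCharsetException.
            return null;
        }
    }

    public static void main(String[] args) {
        String configured = System.getProperty(DEFAULT_CHARSET_PROP);
        System.out.println(resolve("iso-8859-1, utf-8;q=0.7", configured));
        System.out.println(resolve(null, configured));
    }
}
```

Falling back to UTF-8 when nothing else applies matches the RFC 8259 recommendation quoted earlier in the thread, while the override keeps a way out for closed environments that need another charset.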

@andymc12 andymc12 added the release bug This bug is present in a released version of Open Liberty label Feb 3, 2020