Chinese characters can't be saved to PostgreSQL (e.g. when POSTing a CodeSystem) or used as a URL search parameter (version 2.0) #454

Closed
YinAqu opened this Issue Sep 21, 2016 · 5 comments


YinAqu commented Sep 21, 2016

It fails with org.postgresql.util.PSQLException: 错误: 无效的 "UTF8" 编码字节顺序: 0x00. In English, the message reads: ERROR: invalid byte sequence for encoding "UTF8": 0x00.
I think this can be resolved in the method BaseHapiFhirDao.normalizeString().

Owner

jamesagnew commented Sep 21, 2016

Hi @YinAqu, would you be able to attach a resource body we could use in a test case? Or some sample code to generate one?

YinAqu commented Sep 22, 2016

Hi @jamesagnew, thanks for your comment. POSTing or PUTting any resource body that includes Chinese characters as the value of certain properties (e.g. name, publisher, description) results in this exception. Below is a sample CodeSystem:

{
  "resourceType": "CodeSystem",
  "url": "http://nestvision.com/fhir/myexample-yh",
  "name": "NestVision Restful Interactions",
  "status": "active",
  "publisher": "杨浩",
  "description": "this is a codesystem example.",
  "caseSensitive": true,
  "concept": [
    {
      "code": "mygod",
      "display": "wodetian",
      "definition": "this is a translate."
    }
  ]
}

Owner

jamesagnew commented Sep 26, 2016

So... I'm very confused about this. So far I haven't been able to reproduce it.

Are you using the charSet parameter on your database connection URL, and do you have PGSQL configured to use UTF-8?

YinAqu commented Oct 8, 2016

Hi @jamesagnew, sorry for the confusion; it was my mistake to mention the database.

In fact this has nothing to do with the database. The POST operation calls the method BaseHapiFhirDao.normalizeString() to normalize the POSTed JSON body, and this method does not leave characters greater than '\u007F' intact.

I don't know why hapi-fhir only lets English (ASCII) characters through this filter, but I would really like to put some Chinese characters into the 'description' property so that Chinese readers can understand what it is used for.

Please have a look at the method's code:

public static String normalizeString(String theString) {
    char[] out = new char[theString.length()];
    theString = Normalizer.normalize(theString, Normalizer.Form.NFD);
    int j = 0;
    for (int i = 0, n = theString.length(); i < n; ++i) {
        char c = theString.charAt(i);
        if (c <= '\u007F') {
            out[j++] = c;
        }
    }
    return new String(out).toUpperCase();
}
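For what it's worth, the 0x00 in the exception seems to come from this same method: the out array is sized to the input length, characters above '\u007F' are skipped, and new String(out) then turns every slot that was never written into a NUL character. Here is a minimal sketch that reproduces this outside of hapi-fhir. The NormalizeRepro class and its main method are hypothetical; only normalizeString is taken from the method quoted above.

```java
import java.text.Normalizer;

// Hypothetical stand-alone class for illustration; not part of hapi-fhir.
class NormalizeRepro {

    // Copied from the method quoted above, behavior unchanged.
    static String normalizeString(String theString) {
        char[] out = new char[theString.length()];
        theString = Normalizer.normalize(theString, Normalizer.Form.NFD);
        int j = 0;
        for (int i = 0, n = theString.length(); i < n; ++i) {
            char c = theString.charAt(i);
            if (c <= '\u007F') {
                out[j++] = c;
            }
        }
        // The whole array is used here, including slots that were never written.
        return new String(out).toUpperCase();
    }

    public static void main(String[] args) {
        // "\u6768\u6d69" is the publisher value from the sample CodeSystem.
        String normalized = normalizeString("\u6768\u6d69");
        // Print each resulting character as a code point.
        for (char c : normalized.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Running it prints U+0000 twice, so the normalized string consists only of NUL characters, which PostgreSQL does not accept in text values.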

Owner

jamesagnew commented Oct 12, 2016

Hi @YinAqu , ahhhhh ok thanks for the explanation. That makes sense, and I was able to reproduce your issue.

I'm checking in a fix now.
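The commit itself isn't shown in this thread. Purely as an illustration of one way the problem could be addressed (an assumption, not necessarily what was checked in): strip only the combining marks that NFD splits off, so that non-Latin text survives, and build the result with a StringBuilder so that no unwritten NUL slots end up in the output. The class and method names below are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical sketch; not the actual hapi-fhir fix.
class NormalizeSketch {

    static String normalizeKeepingNonAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); ) {
            int cp = decomposed.codePointAt(i);
            i += Character.charCount(cp);
            int type = Character.getType(cp);
            // Drop combining marks (e.g. accents split off by NFD), keep everything else.
            boolean isMark = type == Character.NON_SPACING_MARK
                    || type == Character.ENCLOSING_MARK
                    || type == Character.COMBINING_SPACING_MARK;
            if (!isMark) {
                out.appendCodePoint(cp);
            }
        }
        // Only the characters actually appended are returned, so no NULs appear.
        return out.toString().toUpperCase(Locale.ROOT);
    }
}
```

With this sketch, "café" would normalize to "CAFE" and "杨浩" would pass through unchanged. Note that it also removes marks that carry meaning in some scripts, so a real fix may want to be more selective.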
