Languages support #281

Open
earth2378 opened this issue May 12, 2018 · 7 comments
@earth2378

earth2378 commented May 12, 2018

I tried to use this with Thai text, but it failed with:

terminate called after throwing an instance of 'xml::serialization'
what(): xl/sharedStrings.xml: error: invalid UTF-8

@tfussell
Owner

tfussell commented Jun 4, 2018

Thanks for reporting this problem. UTF compatibility is a priority for me. Could you show me the code that caused this problem, or tell me more about what you were trying to do?

@tfussell tfussell added the bug label Jun 4, 2018
@ZM-J

ZM-J commented Jul 12, 2018

@tfussell Chinese text triggers the same error.
Just replace

ws.cell("B2").value("Hello world");

in the sample code with

ws.cell("B2").value("你好,世界");

and you will get the error.
I use VS2017 with xlnt.lib as the linker input. The unmodified sample code produced a correct xlsx file, but it failed as soon as I changed the string to Chinese.

@ZM-J

ZM-J commented Jul 12, 2018

Finally I solved the problem by noticing issue #215 and using u8"你好,世界". I think @earth2378 might encounter the same problem as mine.
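
For context on the error message itself: a plain "..." literal is stored in whatever execution character set the compiler uses, which on some Windows setups may be a legacy code page such as GBK rather than UTF-8. That is an assumption about the reporter's environment, not something the thread confirms. The sketch below uses a hypothetical standalone helper, `is_valid_utf8`, which is not part of xlnt. It only illustrates that the GBK bytes for 你好 (C4 E3 BA C3) are not well-formed UTF-8, while the UTF-8 bytes (E4 BD A0 E5 A5 BD) are.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not an xlnt function): returns true if `s` is
// well-formed UTF-8. Rejects truncated sequences, overlong encodings,
// surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;     // continuation bytes expected after the lead
        unsigned long cp = 0;      // code point being decoded
        unsigned long min_cp = 0;  // smallest code point valid for this length
        if (lead < 0x80)                { extra = 0; cp = lead;        min_cp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;                     // stray continuation or invalid lead byte
        if (extra > n - i - 1) return false;   // sequence runs past the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp > 0x10FFFF) return false;                  // outside Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        i += extra + 1;
    }
    return true;
}
```

Running a check like this on a string before handing it to the library can show whether the bytes in the literal are the problem.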

@Crzyrndm
Collaborator

Crzyrndm commented Jul 12, 2018

Nice find.

One catch to be aware of if you need portability: even with the u8 literal, the compiler still has to decode the source file correctly, so different compilers/editors may still choke and produce incorrect output (relevant Stack Overflow question).

The only way to ensure that literals work as expected everywhere is to use Unicode escape sequences. It's downright ugly though :(

你好，世界
u8"\u4F60\u597D\uFF0C\u4E16\u754C"

Hex format from https://unicodelookup.com/#%E4%BD%A0%E5%A5%BD%EF%BC%8C%E4%B8%96%E7%95%8C/1
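
A minimal sketch of the escape-sequence approach, under two assumptions: the function name `hello_world_utf8` is made up for illustration, and the `reinterpret_cast` is there only because u8 literals became `char8_t` arrays in C++20 (in C++11 through C++17 they are plain `char` arrays and the cast is a no-op). Because the source file then contains only ASCII, the result should not depend on how the editor saved the file.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the UTF-8 bytes for
// U+4F60 U+597D U+FF0C U+4E16 U+754C as a std::string.
// The cast keeps this compiling under both C++17 (char) and C++20 (char8_t).
std::string hello_world_utf8() {
    return std::string(
        reinterpret_cast<const char*>(u8"\u4F60\u597D\uFF0C\u4E16\u754C"));
}
```

The returned string could then be passed to a call such as `ws.cell("B2").value(...)` from the sample code quoted earlier in the thread.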

@Crzyrndm
Collaborator

Crzyrndm commented Jul 28, 2018

It is likely that using the u8 prefix (plus escape sequences if necessary) will resolve the reported issue for @earth2378. Confirmation (or not) of this would be appreciated.

@tfussell
This is a recurring issue. I wonder whether documentation changes (e.g. adding a note about source literals wherever UTF-8 compatibility is mentioned) or source changes/additions could prevent this error.

@li1553770945

> Finally I solved the problem by noticing issue #215 and using u8"你好,世界". I think @earth2378 might encounter the same problem as mine.

I tried your method, but when I read an Excel file, e.g. with ws.cell(1,1).to_string(), and the value of cell (1,1) is Chinese, the output is garbled. Do you know how I can fix it?

@tfussell
Owner

> > Finally I solved the problem by noticing issue #215 and using u8"你好,世界". I think @earth2378 might encounter the same problem as mine.
>
> I tried your method, but when I read an Excel file, e.g. with ws.cell(1,1).to_string(), and the value of cell (1,1) is Chinese, the output is garbled. Do you know how I can fix it?

What are you trying to do with it? It should be a valid UTF-8 string you can use like any other Chinese UTF-8 string from another source.
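
One way to check that claim is to look at the raw bytes of the returned string rather than at how a console or debugger renders it. The sketch below uses a hypothetical helper, `hex_bytes`, which is not part of xlnt. If a cell containing 你好 yields `E4 BD A0 E5 A5 BD`, the string itself is valid UTF-8, and the garbling most likely comes from whatever displays it (for example, a console set to a non-UTF-8 code page, which is an assumption the thread does not confirm).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: formats each byte of `s` as two uppercase hex
// digits, separated by single spaces, e.g. "E4 BD A0".
std::string hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0) out += ' ';      // separator between bytes
        out += digits[b >> 4];       // high nibble
        out += digits[b & 0x0F];     // low nibble
    }
    return out;
}
```

For instance, the result of `ws.cell(1,1).to_string()` from the comment above could be passed to `hex_bytes` and compared against the expected UTF-8 bytes.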


5 participants