correct bstr conversion #6
- In a non-English environment (I am Chinese) we do NOT use CP_UTF8; CP_ACP should be used.
- Cygwin, by contrast, converts filenames to UTF-8 internally by default.
- We have to save our Lua source code in the local ANSI code page; if you prefer UTF-8, you need luaiconv to convert UTF-8 strings to your local encoding (GBK or other).
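The transcoding step mentioned in the last point can be sketched roughly as follows. This is an illustration in Python rather than luaiconv itself; the helper name `transcode` is hypothetical, and Python's `gbk` codec is assumed to be a close enough stand-in for Windows code page 936.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from one encoding to another (hypothetical helper)."""
    # Decode to Unicode text first, then encode into the target encoding.
    return data.decode(src).encode(dst)


utf8_name = "中文".encode("utf-8")            # bytes as stored in a UTF-8 script
gbk_name = transcode(utf8_name, "utf-8", "gbk")  # bytes a GBK-based API expects

print(utf8_name.hex())  # e4b8ade69687
print(gbk_name.hex())   # d6d0cec4
```

The same text has a different byte sequence in each encoding, so the bytes only make sense together with the code page they were written in.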
Can you provide an example of non-working code? As it stands, it seems this change will fix LuaCom for you but will break it for every other user running it with Cygwin.
I did add an "#ifdef _CYGWIN " to keep the existing behavior under Cygwin. Here is an example that does not work:
require "luacom"
wordApp = luacom.CreateObject("Word.Application")
wordApp.Visible = true
wordDoc = wordApp.Documents:Add()
wordApp.Selection:TypeText("中文真的可以吗,我也不知道啊!")
wordDoc:SaveAs2("F:\\中文的文件名哦还挺长的.docx")
wordDoc:Close(0)
wordApp:Quit(0)
Save the above code in ANSI format (that is GBK encoding for me). It will produce a file named "F:\ĵļŶͦ.docx" whose content is "ĿҲ֪". The code was tested on Windows XP (Simplified Chinese Edition).
Well, the thing is that LuaCom expects its input strings to be encoded as UTF-8. So you need to change the encoding of your script, not the codepage LuaCom uses in its conversion routines. With LuaCom as it is, I can work with strings in different languages (Spanish and Portuguese, with accented characters and so on) regardless of which language I have configured in Windows. If LuaCom used CP_ACP, my Spanish scripts would only work if I ran them with Spanish as the current language.
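The mismatch behind the garbled output can be shown at the byte level. This is only a sketch: Python's `gbk` and `utf-8` codecs stand in for code page 936 and CP_UTF8, and the exact garbage that Windows produces for invalid bytes may differ from Python's replacement characters.

```python
text = "中文"

gbk_bytes = text.encode("gbk")      # what a GBK-saved script passes along
utf8_bytes = text.encode("utf-8")   # what a UTF-8-saved script passes along

# Decoding with the matching codec recovers the text in both cases.
assert gbk_bytes.decode("gbk") == text
assert utf8_bytes.decode("utf-8") == text

# Decoding GBK bytes as UTF-8 is the mismatch: the bytes are not valid UTF-8.
try:
    gbk_bytes.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# A lenient decoder substitutes U+FFFD instead of failing, which yields garbage.
garbled = gbk_bytes.decode("utf-8", errors="replace")

print(is_valid_utf8)    # False
print(garbled == text)  # False
```

Either encoding works as long as the converter is told which one the bytes are in; the trouble starts only when the script's encoding and the conversion codepage disagree.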
I do not quite understand Windows codepages, but let me explain my situation. I am in China, and most of us use Windows XP Simplified Chinese Edition, where the codepage is 936 and our filenames are encoded in GB2312. How should we deal with these files? If the filename encoding is changed to UTF-8, it turns into a mess: that kind of filename cannot be managed by Windows Explorer any more.
As I mentioned, if I save the Lua source file as UTF-8, luacom understands the filenames and strings and successfully converts them to wide characters, but other programs do not. For example, MS Word complains "file not exists", because a file with that UTF-8-encoded name does NOT exist (the names are encoded in GB2312). What's worse in my situation, the content of the MS Word files is encoded in GB2312. I have to process these files using luacom, and I find that I need luacom to use CP_ACP.
Any suggestions for handling this with the existing version of luacom? And I wonder, what is your situation on a Spanish version of Windows?
Ah, I see. I think your main problem has to do with filenames. I have heard that's a tricky thing on Windows. What I ended up doing was adding a couple of functions that allow changing the codepage on the fly. I see your situation is completely different from mine. I didn't have to deal with filenames, and I don't fully understand all the tiny details of codepages on Windows.
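The idea of switching the codepage on the fly might look roughly like the sketch below. It is not LuaCom's actual API: the class `StringConverter` and its methods are hypothetical names, and Python codec names stand in for Windows codepage identifiers.

```python
import codecs


class StringConverter:
    """Hypothetical converter whose codepage can be changed at run time."""

    def __init__(self, codepage: str = "utf-8"):
        self.codepage = codepage  # default mirrors a UTF-8-only converter

    def set_codepage(self, codepage: str) -> None:
        codecs.lookup(codepage)   # raises LookupError for unknown names
        self.codepage = codepage

    def to_wide(self, raw: bytes) -> str:
        # Byte string from the script -> Unicode text (the role of a BSTR)
        return raw.decode(self.codepage)

    def from_wide(self, text: str) -> bytes:
        # Unicode text -> byte string in the currently selected codepage
        return text.encode(self.codepage)


conv = StringConverter()
print(conv.to_wide("año".encode("utf-8")))     # UTF-8 script, default codepage

conv.set_codepage("gbk")                       # switch for a GBK-encoded script
print(conv.to_wide(bytes.fromhex("d6d0cec4")))
```

With a default of UTF-8, existing scripts keep working, while a script saved in a local codepage can opt in explicitly before passing strings across.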
Thank you for your reply! Maybe I should close the pull request; it seems this is not a common problem.
Hi davidm,
When I use luacom, I find that I cannot use it with Chinese characters: neither filenames nor string parameters containing Chinese characters work.
I tried saving the Lua source file as UTF-8, but that did not solve the problem for filenames with Chinese characters, because I cannot change the system's encoding.
Later, I found that it works with Cygwin, and then I realized that the newest Cygwin internally converts filenames to UTF-8.
I downloaded the source code today, and I think that in the functions bstr2string()/string2bstr(), CP_UTF8 should be changed to CP_ACP when not building with Cygwin (see the commit changes). The changes work for me, but I do not know whether they will cause other errors.
luojiejun