Non-latin characters in PelEntryWindowsString #16
Comments
It should certainly support other character sets. I think we should just go for UTF8. Do you have a test case showing that it does not support your character set? |
Test Case: http://pastie.org/1423164 Results (given and expected): http://img35.imageshack.us/img35/4917/testresults.png |
Could you post the test image also? |
I tested it with several JPEG images from different sources, and the result is the same. EXIF data produced by the library (see the "subject" field): http://regex.info/exif.cgi?b=3&url=http://img689.imageshack.us/img689/1230/testss.jpg |
I have just added a test case and it seems to pass at my end. Are you using UTF-8 as the encoding for your file? |
Yes, I use utf8. |
I have not put a subject field in the image. I just used your image as a reference. As you can see in the test case, I copy the picture and work on the copy. You can change tearDown() so that it does not unlink the test image, and check it for yourself. Sorry, but I do not have a Windows machine. Let me know what it generates. |
My Windows 7 shows the subject as "Ïðåâåä, ìåäâåä!", which is wrong. Here is the file whose subject Windows displays incorrectly (filled in by the script): http://img10.imageshack.us/img10/1867/wrongne.jpg You can see the difference with the online EXIF viewer here: wrong: http://regex.info/exif.cgi?b=3&url=http://img10.imageshack.us/img10/1867/wrongne.jpg |
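The garbled subject quoted above has the typical shape of Windows-1251 bytes being read one byte per character as Latin-1. The following is a minimal Python sketch (not PEL code, and not taken from the thread) that reverses that assumed mix-up; the variable names are hypothetical, and the diagnosis itself is an assumption that happens to be consistent with the string Windows displayed.

```python
# Assumption: the displayed text is Windows-1251 data that was
# interpreted byte-by-byte as Latin-1 somewhere along the way.
garbled = "Ïðåâåä, ìåäâåä!"  # string reported in the thread

# Step 1: recover the raw bytes by undoing the Latin-1 interpretation.
raw_bytes = garbled.encode("latin-1")

# Step 2: decode those same bytes as Windows-1251 (Python codec "cp1251").
recovered = raw_bytes.decode("cp1251")

print(recovered)
```

If the assumption holds, the recovered text is readable Russian, which suggests the bytes written to the tag were correct Windows-1251 but were stored in a way that Windows treats as one Latin-1 character per byte.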
I am aware of the difference. Have you tried running the test case? Could you uncomment the tearDown() method and read the subject in the tmp file? |
In your test case you write the subject string as UTF-8 and then read it back as UTF-8. But Windows (I think) does not support UTF-8 for PelTag::XP_SUBJECT (or, I think, for the other PelTag::XP_* tags). In my test case I take into account that Windows expects PelTag::XP_SUBJECT as ASCII, and I recode from UTF-8 to Windows-1251 (the Russian Windows encoding): |
Finally, I got it working! The problem is in PelEntryWindowsString::setValue. In your library it works only for Latin-1 and looks like this:
I rewrote it as:
My function expects the $str argument to be UTF-8. |
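The two PHP snippets referred to in this comment did not survive in the thread, so they are not reproduced here. As an illustration only, the Python sketch below contrasts two ways of producing the 16-bit little-endian text that the XP_* tags are generally understood to hold: padding every input byte with a zero byte, which is correct only for Latin-1 input, versus converting the whole string to UTF-16LE. Both function names are hypothetical, and the assumption that the library's behaviour amounted to byte padding is inferred from the "works only for Latin-1" remark, not confirmed by the source.

```python
def widen_bytes(raw: bytes) -> bytes:
    """Hypothetical Latin-1-only approach: append 0x00 to every input byte."""
    return b"".join(bytes([b, 0]) for b in raw)


def encode_utf16le(text: str) -> bytes:
    """Convert the whole string to UTF-16LE, whatever its script."""
    return text.encode("utf-16-le")


subject = "Превед"

# Padding Windows-1251 bytes yields code units U+00CF, U+00F0, ...,
# which a UTF-16LE reader shows as accented Latin letters.
widened = widen_bytes(subject.encode("cp1251"))

# A real conversion yields the Cyrillic code units directly.
proper = encode_utf16le(subject)

print(widened.decode("utf-16-le"))
print(proper.decode("utf-16-le"))
```

For pure ASCII or Latin-1 input the two functions produce identical bytes, which would explain why the problem only shows up with other scripts. The sketch leaves out any terminator or other framing that the tag format may require.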
@mage2pro Do you know whether Windows has started supporting UTF-8? It's been a while, so I think (hope) they may support it now. |
Closing this for now. Please reopen if you still have this problem. |
Why does PelEntryWindowsString support only the Latin-1 character set?
Windows surely supports other character sets in the file properties dialog (in particular Russian: Windows-1251).