fixing the display of gibberish instead of hebrew in file names and ID3 tags #222
Comments
#222 Default filesystem character encoding to UTF-8, except for Windows and PHP<7.1 which defaults to ISO-8859-1 (but the user may want/need to change this to their local system character encoding setting)
It sounds from your description that you have Windows set to Windows-1255 character encoding, which remaps the upper characters (171-254) to Hebrew characters. As I understand it, Windows uses UTF-16 encoding for filenames, but this is mapped into an 8-bit encoding for "non-UTF8-aware" programs, which includes PHP < 7.1.0. That can make it difficult (or impossible) to work with files containing characters that are "foreign" to the system codepage, since they can't be mapped. For example, in my Windows7-PHPv7.0.24 test environment I have a file named
I have made changes in 35d752f (and e08fbde) that set the default filesystem encoding to
The better solution is to upgrade your PHP installation to at least v7.1, which supports UTF-8 representation of filenames and should eliminate all this kind of confusion.
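To illustrate why such characters "can't be mapped", here is a minimal sketch in Python (chosen only for illustration; it is not part of the library). The file name is hypothetical, and Windows-1252 stands in for an arbitrary non-Hebrew system codepage.

```python
# Hypothetical file name, used only for illustration.
name = "אהבה.mp3"

# Windows-1255 has a slot for every Hebrew letter, so the name fits in 8 bits.
as_cp1255 = name.encode("cp1255")

# Windows-1252 (a Western European codepage) has no Hebrew letters at all,
# so a strict conversion fails outright.
try:
    name.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

# With a lossy fallback, each unmappable letter collapses to "?".
lossy = name.encode("cp1252", errors="replace").decode("cp1252")

print(as_cp1255, fits_cp1252, lossy)
```

Once the letters have collapsed to "?", the original name can no longer be recovered, which is consistent with the difficulty described above for 8-bit-only programs.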
Hi James, Thanks for the quick and detailed response.
Just checked it. Your new commit fixed the display of the directory (Files in ......) and the directory listing of file names.
I have a library of Israeli music folders on my Windows PC. Both the file names and the ID3 tags stored in the mp3 files use Hebrew characters. I have used MediaMonkey to manage my music and tag the files, and now I am trying to do it myself with PHP.
Using the getID3 library, I ran the demo.browse.php file on one of the folders and got the following display
Fixing it required the following:
1- replacing 'ISO-8859-1' with 'UTF-8' on lines 84, 116, 217, and 238
Now the file names are displayed correctly.
The next task was replacing the gibberish artist and title with the equivalent Hebrew characters:
äçìåðåú äâáåäéí àäáä øàùåðä
should be displayed as
החלונות הגבוהים אהבה ראשונה
After trying a bunch of conversion routines without success, I finally created a simple translation scheme with 2 arrays.
Passing the string to the preg_replace() function with these arrays emits the correct Hebrew characters.
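The two-array scheme amounts to a character-by-character map from ISO-8859-1 to Windows-1255. As a minimal sketch of the same mapping without a hand-built table, the gibberish can be turned back into bytes and decoded again with the right codepage. This is Python for illustration, not the PHP used in this thread; `fix_mojibake` is a hypothetical name, and the sketch assumes the tags really hold Windows-1255 bytes that were decoded as ISO-8859-1.

```python
def fix_mojibake(text: str) -> str:
    """Undo Windows-1255 text that was wrongly decoded as ISO-8859-1.

    Assumption: every character in `text` is in the ISO-8859-1 range.
    """
    # Step back to the raw bytes, then decode them with the right codepage.
    return text.encode("latin-1").decode("cp1255")


garbled = "äçìåðåú äâáåäéí àäáä øàùåðä"
fixed = fix_mojibake(garbled)
print(fixed)
```

PHP's iconv() may offer an equivalent conversion where the underlying iconv implementation supports CP1255; that is an assumption and was not tested here.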
Well, not quite. The program HTML-encodes the foreign characters, so they are displayed as
&#[0-9]+ entities. Those need to be converted to UTF-8.
I found a piece of code for that on php.net.
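The php.net snippet itself is not reproduced here. As a rough sketch of the same two-step idea (decode the numeric entities, then remap the characters), again in Python for illustration: `entities_to_hebrew` is a hypothetical name, and the input is assumed to contain only entities for characters in the ISO-8859-1 range that stand for Windows-1255 bytes.

```python
import html


def entities_to_hebrew(encoded: str) -> str:
    """Decode numeric HTML entities, then repair the Windows-1255 mojibake."""
    # "&#224;" becomes "à", "&#228;" becomes "ä", and so on.
    chars = html.unescape(encoded)
    # Reinterpret those ISO-8859-1 characters as Windows-1255 bytes.
    return chars.encode("latin-1").decode("cp1255")


encoded = "&#224;&#228;&#225;&#228;"
print(html.unescape(encoded))
print(entities_to_hebrew(encoded))
```

The order matters: the entities have to be decoded first, because the remapping works on the characters they stand for and not on the literal `&#...;` text.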
To make it all work in demo.browse.php, you need to replace lines 279 and 280 with the following lines:
Finally, add the two arrays $gib2 and $heb at the beginning of the file, just below the
$PageEncoding = 'UTF-8';
line. Now the display is correct.
That's it. If someone can do it more efficiently, I would love to hear about it.
P.S. If anyone wonders, 'אהבה ראשונה' means "first love", and you can listen to the song on YouTube.
Eli Argon