Reversed Arabic Numbers/Words #5426
|
...oh - and the shapefile/input data in this case is already in the correct order when it gets to MapServer: طريق 30 |
|
In version 7.0 there was a big overhaul of labelling. RFC98 mentions Arabic - see http://mapserver.org/development/rfc/ms-rfc-98.html#text-rendering-pipeline Related code changes are at #4673 |
|
Thanks for the link. I looked at that before, but it's still unclear to me whether or not what I'm seeing here is expected behavior. If I open the .shp in QGIS or ArcMap and turn on labels, I get the desired/expected behavior: However, in MapServer, I'm still getting the reversed ordering of text/numbers. I'm attaching the subset of roads (a zipped .shp) from the screenshot above that contains the طريق 30 road name. Thanks again. |
|
Can you include a full working sample in your zip (ttf font file, mapfile + layer including ENCODING parameter)? |
|
by the way, if I look at your dbf contents in any text editor, the record appears as '[arabic text], 30'. The same for QGIS and its labels. |
|
maybe it is just your configuration in the mapfile layer. In any case, this discussion would get much more eyes if you posted it to the mapserver-users mailing list first (then, if they decide it is a bug, post here). However, please update your zip here to a full working sample, and I'll take a look. |
|
wait, my words here made me realize what I bet is wrong in your mapfile layer: in version 7 the ENCODING parameter was moved from the LABEL object, to the LAYER object. See examples at http://mapserver.org/mapfile/encoding.html Again, this discussion should be happening on the mapserver-users list, not here. But, anyway, please make that change locally. |
|
It does sound like a bug. The Mapserver 7 example is showing that the characters within a word are rendered right to left, but the words themselves are rendered left to right. The shapefile looks correct in QGIS, and Mapserver is flipping the word order. I wonder if this is because the column contains both Arabic and English labels in a single field? Perhaps Mapserver is applying the word ordering for an entire field based on the first record it encounters? Or perhaps Mapserver 7 is being too smart - instead of rendering the data as it appears in the file, it is detecting the arabic, and reversing the word order on the assumption that it should be right to left, when it already is... |
|
Also, just to make a small correction to the original description:
According to the screen shot, this is what Mapserver 7 is doing: it is putting the '30' before the word 'Road', as read in Arabic (right to left). However that is incorrect because the desired rendering is 'Road 30', which is the correct Arabic rendering shown in the Mapserver 6.x screenshot. |
|
thanks for (not) providing a full working sample 🥇 ha. I spent too much effort downloading arabic fonts from the web, mostly bad, until finally found one that worked. Please see the following for a full working sample, including font, layer, data, etc.: Here is the result with MapServer 7.0.4: If you open up the file 'kuwait_major_roads.cpg' you will notice that the encoding of the shapefile is in UTF-8. As @geographika pointed out correctly earlier, with MapServer 7 if the label text is not UTF-8 then the fribidi library converts to UTF-8. In this case, I believe nothing happens because the data is already in UTF-8 encoding. Download the working sample locally, run the mapfile through shp2img, and provide your feedback here :) thanks all. -jeff |
|
ok some good news: I installed an old MS4W containing MapServer 6.2.1. I modified that same mapfile layer in my sample zip, but moved the ENCODING to inside the LAYER object: Ok, now I've done too much testing ha. Will let others test locally and give feedback. I believe, as I said previously, the changes in MapServer 7 mean that since the source strings (dbf) are already in UTF-8, fribidi is never called to convert. Thanks all, -jeff |
|
I get the same results, Jeff. All I can add is that yesterday when I commented, I was on a Mac, and the data rendered correctly as expected in QGIS. Today, on a Windows machine, QGIS (from OSGeo4W) is not able to render the labels even after I install Jeff's provided font. That is a different problem, though. MapServer 7 via MS4W is not having a problem rendering the labels, but the flipping of the word order occurs regardless of whether or where encoding is specified in the mapfile. The incorrect output looks like Jeff's first example above. |
|
I get exactly the same results as above on Ubuntu with today's master compiled from source, with fribidi & harfbuzz support (Arabic labels appear correct, but as '[arabic text], 30') using that test font and mapfile. |
|
Do you think it still worth posting this issue in the mapserver-users mailing list? |
|
If you still have this issue with MapServer 7.4.1, please do bring the issue to the attention of the mapserver-users mailing list. At least now there is a sample package with mapfile & fonts for everyone to test locally and provide feedback. |
|
Reopening. The problem still exists today on Ubuntu with MapServer 7.4.1, fribidi-1.0.5, harfbuzz-2.4.0 (all compiled from source). |
|
Bringing this issue to the MapServer-users mailing list now, to hopefully get more eyes on it.... |
|
Brought to the mapserver-users list (https://lists.osgeo.org/pipermail/mapserver-users/2019-July/081273.html). |
|
I should also mention that I am using freetype-2.9.1 |
|
The glyphs render in the correct RTL order within words, and the words render in the correct RTL direction within a label. Somehow only the numerals are moved. I tested this by editing the shapefile to include the full multi-word Arabic name of the 4th Ring Road, and I added a fake number at the end (as wanted by the OP): kuwait_major_roads_edited123.zip QGIS renders the multi-word label with the number as it appears in the attribute table of the shapefile (number at the end of the words, RTL). MapServer renders the individual words correctly, and in the correct RTL order, but moves the number from one end of the label to the other end. Result in MapServer (MS4W 4): |
|
Great testing @tchaddad I have also opened the DBF in LibreOffice Calc (don't "try this at home" ha) and confirm that the stored text is [number]-[text] as you found; but for some reason MapServer >=7 reverses this order and places the numbers after. I've now tried this from source on several Ubuntu machines, several Windows machines. Every way I compile MapServer7 I get this problem. |
|
(the direct translation of "طريق 30" is "Route 30") From what I read, it is the Bidi algorithm that handles numbers with text properly. "BIDI is needed for numbers, while arabic text flows from right to left numbers flow from left to right like in latin languages, so BIDI is required even in unilingual texts." Somewhere MapServer7 is not handling it properly. Hmm..... |
That's not actually true. If you look at the DBF content with a hexadecimal editor, the stored content is {arabic_in_utf8_probably_with_right_most_glyph_first} 30.
On Linux, I can also see the same order with ogrinfo, in a terminal that is probably not bidi-aware and so probably displays the Arabic in logical order (thus incorrectly left-to-right), as found in the DBF.
So the behaviour of MapServer is consistent with RFC 98 ( http://mapserver.org/development/rfc/ms-rfc-98.html#text-rendering-pipeline ): the Arabic glyphs are rendered right-to-left, and 30 is displayed left-to-right at the right of them.
But... interestingly, if instead of 30 I put ASCII letters like ab, then it is displayed the same in my non-bidi-aware terminal and here in Firefox (Arabic and then ab).
And even more interestingly, if in my non-bidi-aware terminal I have {arabic}30ab, when pasted here it becomes 30{arabic}ab.
But if in my non-bidi-aware terminal I have {arabic}ab30, when pasted here it becomes {arabic}ab30.
So it seems there's a difference in handling in the Firefox renderer between digit and non-digit ASCII characters that immediately follow Arabic glyphs.
Actually the "immediately" is a bit more subtle than that... If in my non-bidi-aware terminal I have {arabic}{space}{comma}30.12ab, when pasted here it becomes 30.12{comma}{space}{arabic}ab.
I suspect this might be an exception case where numbers should be displayed in left-to-right order, but placed visually to the left of the Arabic glyphs when, in the binary encoding, they come after them. (Aren't the digits we use in western languages Arabic numerals after all...?)
I've not looked at the code to check whether it is a Fribidi issue or a MapServer implementation issue: I'd suspect a MapServer one, since RFC98 mentions shortcuts for Latin glyphs. |
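The digit versus letter difference described above can be inspected without any bidi-aware display. The following is a minimal sketch, not MapServer or Fribidi code: it uses Python's standard `unicodedata` module to print the Unicode bidirectional class of each character in the label, given in logical (storage) order. The helper name `bidi_classes` is made up for this illustration. Arabic letters report class `AL`, ASCII digits `EN` (European number), and Latin letters `L`; the Unicode Bidirectional Algorithm (UAX #9) has separate rules for numbers that follow Arabic letters, which may be what the Firefox observations reflect.

```python
import unicodedata

# "طريق 30" written with escapes so the source itself is unaffected
# by how an editor or terminal reorders bidirectional text.
LABEL = "\u0637\u0631\u064a\u0642 30"


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character, in storage order."""
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    for ch, cls in zip(LABEL, bidi_classes(LABEL)):
        # Print the code point rather than the glyph to avoid terminal reordering.
        print(f"U+{ord(ch):04X} {cls}")
```

Swapping the trailing `30` for `ab` changes the last two classes from `EN` to `L`, which matches the two cases compared in the comment above.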
|
Thanks @rouault that explains my early tests above, using an old simple 'dbf editor' differs from my recent LibreOffice tests, which must be bidi-aware. The magic seems to be happening in textlayout.c. Hmmm... |
|
Thank you @rouault for bringing back a bit of rationality to this thread. Mixing specifically LTR and RTL languages leads to an ambiguous situation, e.g. when rendering text stored as "arabic latin", an LTR-centric renderer will choose to render "cibara latin" while an RTL-centric renderer will choose to render "latin cibara". RFC98 has made some assumptions in that case, but that is not the topic of this issue. |
|
Thanks for the explanation. I did notice the runs that Fribidi is doing inside the file "textlayout.c". The case here is that MapServer<7 handled the raw text "arabic 30" by displaying it as "30 arabic", which is of course very important for street-level maps. This is also how other software such as QGIS displays the label, as "30 arabic". Earlier this morning I tried fribidi at the command line (fribidi-1.0.5) on Linux, and it seems to produce the same result as MapServer7 ("arabic 30"). I am honestly having a difficult time finding a bash shell or tool that does not enable Bidi in the results, or when examining the raw input (ogrinfo on the dbf displays the text correctly as "30 arabic"). |
|
I think maybe my issue at the fribidi commandline is that I am not using the actual raw text as input. (which I am having a difficult time getting access to) |
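One way to look at the raw input without a shell or tool reordering it is to print code points instead of glyphs. This is a minimal sketch under stated assumptions: the bytes are assumed to be UTF-8 (as the .cpg file in the sample indicates), the helper name `logical_code_points` is hypothetical, and the sample bytes are simply the UTF-8 encoding of "طريق 30" rather than bytes read from the actual DBF.

```python
def logical_code_points(raw, encoding="utf-8"):
    """Decode raw bytes and list the code points in stored (logical) order."""
    text = raw.decode(encoding)
    return [f"U+{ord(ch):04X}" for ch in text]


# UTF-8 bytes for the four Arabic letters, a space, then the digits 3 and 0.
SAMPLE = b"\xd8\xb7\xd8\xb1\xd9\x8a\xd9\x82 30"

if __name__ == "__main__":
    # Only ASCII is printed, so a bidi-aware terminal has nothing to reorder.
    print(" ".join(logical_code_points(SAMPLE)))
```

Because the output is plain ASCII, it shows whether the digits are stored before or after the Arabic letters regardless of how the terminal handles bidirectional text.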










Hi there. We recently upgraded from MapServer 6.2.1 to 7.0.4.
One issue we're seeing is that Arabic road labels that contain a number and a word, like طريق 30, are being displayed with the 30 at the end of the phrase. (So Road 30 instead of 30 Road)
Correct display in 6.2.1:

After upgrade to 7.0.4:

The '30' should appear before the word.
We are using fribidi_0.19.6 and harfbuzz_1.4.1 (both in our new and previous versions of MapServer). This is happening on both a Windows and Ubuntu 16.04 environment, and seems to only affect the ordering of labels when a number is present in an Arabic label.
One thing that changed in our installation had to do with a freetype/harfbuzz circular dependency:
I had to compile and install freetype w/o harfbuzz. Then install harfbuzz. And then compile and install freetype w/ harfbuzz on top of existing freetype. I am pretty sure that when I installed freetype the second time it overwrote the freetype libraries from the first installation of freetype.
Here are the options we used when compiling:
-DWITH_GDAL=1 -DWITH_HARFBUZZ=1 -DWITH_THREAD_SAFETY=1 -DWITH_JAVA=1 -DWITH_CAIRO=0 -DWITH_GEOS=0 -DWITH_POSTGIS=0 -DWITH_RSVG=0 -DWITH_CLIENT_WMS=0 -DWITH_CLIENT_WFS=0 -DWITH_WFS=0 -DWITH_LIBXML2=0 -DWITH_KML=0 -DWITH_GIF=0 -DWITH_EXEMPI=0 -DWITH_FCGI=0
The input data for both versions is identical - the only thing I can tell that is different is the version of MapServer.
Is this expected or seen by anyone else?
Any thoughts?