ColumnText + Arabic Reshaping causes arabic characters to no longer appear. Removing column text makes characters appear. #940
Comments
Thank you for reporting this bug. Please submit a pull request with a solution to this problem if you can.
Neither 0x0627 nor 0x0FE8E can be found in BidiLine.java. This seems to be a problem with the commercial font GraphikArabic.
May be related to #938
Hi @vk-github18, I was able to print 0x0627 with GraphikArabic, but I could not print 0x0FE8E. It seems that the library converts characters from form A to form B, based on the surrounding characters, in ArabicLigaturizer.java. We can see that 0x0627 is defined here: https://github.com/LibrePDF/OpenPDF/blob/1.3-java8/openpdf/src/main/java/com/lowagie/text/pdf/ArabicLigaturizer.java#L100 We can also see that and we can see So the question is:
Or even more generally speaking:
An
What is the result if you use as explained in https://github.com/LibrePDF/OpenPDF/wiki/Accents,-DIN-91379,-non-Latin-scripts?
To your question: if the commercial font you used does not support Arabic properly, you should open an issue with the producer of the font.
Describe the bug
Arabic reshaping leads to characters not being rendered in the PDF when using some fonts. If I do not use ColumnText, the characters appear.
To Reproduce
We can use a modified version of the `RightToLeft.java` example to show the issue.
Here it is working:
Output:
If we leave everything exactly the same as above, except that we change `"NotoSansArabic-regular.ttf"` to a different font, such as `"GraphikArabic-Regular.ttf"`, then we get the following output:
The problem can be seen most easily by looking at the lower left section of the main paragraph. In the `NotoSansArabic` font, we can see a word that looks like it spans multiple characters. In the `GraphikArabic` font, the right half of the word is missing and it seems to contain only the last two characters.
A specific character that seems to be rendered by `NotoSansArabic` and not by `GraphikArabic` is `\u0627`.
I thought that `GraphikArabic` was missing the `\u0627` character altogether, but if I use the following code, I can generate it just fine:
screenshot of the output being as expected
`System.out.println(bf.charExists('\u0627'));` also outputs true when using the `GraphikArabic` font. I assume that `BaseFont::charExists(char)` is the way to determine if the given char should show on the PDF.
I believe the issue is that characters like `\u0627` are being reshaped into much different characters in a whole other Unicode block and that the font does not support the reshaped characters. I believe this because when debugging, I can see that some characters such as `0x0627` become `0x0FE8E`. This transformation happens here: https://github.com/LibrePDF/OpenPDF/blob/master/openpdf/src/main/java/com/lowagie/text/pdf/BidiLine.java#L197
Expected behavior
I expect `ColumnText` and adding elements to a document directly to have the same output, OR I expect to be able to skip the "reshaping" process so that I can continue to use a font which supports the 0x0600 to 0x06FF character range.
Screenshots
Screenshots added above.
System (please complete the following information):
Additional context
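To make the reshaping hypothesis above easier to check, here is a minimal sketch using only the Java standard library. It does not call OpenPDF at all. It shows that `0x0627` and `0x0FE8E` (the same value as `0xFE8E`) fall into different Unicode blocks, and that the reshaped form maps back to the base letter under NFKC compatibility normalization. The class and method names (`ArabicFormsSketch`, `isPresentationForm`, `toBaseForms`) are hypothetical and for illustration only. NFKC is used here as a stand-in for "undoing" reshaping; it is not something OpenPDF is known to do.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration; not part of OpenPDF.
class ArabicFormsSketch {

    /** True if the code point lies in Arabic Presentation Forms-A or Forms-B. */
    static boolean isPresentationForm(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B;
    }

    /**
     * Maps presentation forms back to their base letters using NFKC.
     * Assumption: this approximates reversing the reshaping step.
     */
    static String toBaseForms(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        int alef = 0x0627;      // base letter from the report
        int reshaped = 0xFE8E;  // value observed after reshaping (0x0FE8E)

        System.out.println("0x0627 block: " + Character.UnicodeBlock.of(alef));
        System.out.println("0xFE8E block: " + Character.UnicodeBlock.of(reshaped));
        System.out.println("0x0627 is presentation form: " + isPresentationForm(alef));
        System.out.println("0xFE8E is presentation form: " + isPresentationForm(reshaped));

        // Print the code point that the reshaped form normalizes to.
        String base = toBaseForms(new String(Character.toChars(reshaped)));
        System.out.println("NFKC(0xFE8E) = 0x" + Integer.toHexString(base.codePointAt(0)));
    }
}
```

If a font only covers the 0x0600 to 0x06FF range, a check like `isPresentationForm` on the reshaped text would flag exactly the characters that such a font could not be expected to render.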