Hindi text is rendered incorrectly #365
Comments
Unfortunately, this is not a trivial problem to solve, and fpdf is a deliberately simple PDF generation library. What you're seeing is a lack of support for automatic ligatures, more specifically Devanagari conjuncts. There are hundreds of those, many more than normal characters. A supporting font will include a table of character sequences that are supposed to be replaced by a ligature glyph. The most complex example in your text is this (separated with spaces on the left side, so the browser does not combine them):

Unfortunately, fpdf currently operates on a character-by-character basis, first determining the width of each character and later printing a suitable glyph from the selected font. Supporting ligatures would require their substitution to happen as the very first step. We would also need a custom data structure to represent them, because they cannot be represented by a Python unicode character. Technically, all of that is certainly possible, but I wouldn't hold my breath for it right now. Anyone who knows enough about the internal structure of TTF fonts is of course welcome to contribute...

Btw: Ligatures exist in many other writing systems. Another peculiarity that might also be interesting is contextual forms, where a different glyph is used for the same character, depending on whether it appears at the beginning, the middle, or the end of a word, or isolated (common e.g. in Arabic, Hebrew, Mongolian, etc.). |
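To make the "substitution as the very first step" idea concrete, here is a minimal sketch of a greedy longest-match ligature substitution. Everything in it is an assumption for illustration: the `LIGATURES` table and the glyph names are invented, whereas a real font keeps this mapping in its own tables. It is not how fpdf works internally. Note that the result is a list of glyph names rather than a string, which is the "custom data structure" point from the comment above.

```python
from typing import Dict, List, Tuple

# Hypothetical substitution table: code point sequence -> ligature glyph name.
# The glyph names are made up for this sketch; a real font defines its own.
LIGATURES: Dict[Tuple[str, ...], str] = {
    ("\u0915", "\u094D", "\u0937"): "glyph_k_ssa",  # KA + VIRAMA + SSA
    ("\u0924", "\u094D", "\u0930"): "glyph_t_ra",   # TA + VIRAMA + RA
}
MAX_LEN = max(len(seq) for seq in LIGATURES)


def substitute(text: str) -> List[str]:
    """Replace known character sequences by ligature glyph names.

    Characters that are not part of a known sequence are emitted as
    'U+XXXX' placeholders, one per character.
    """
    glyphs: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, down to length 2.
        for n in range(min(MAX_LEN, len(text) - i), 1, -1):
            key = tuple(text[i:i + n])
            if key in LIGATURES:
                glyphs.append(LIGATURES[key])
                i += n
                break
        else:
            # No ligature starts here: fall back to the single character.
            glyphs.append(f"U+{ord(text[i]):04X}")
            i += 1
    return glyphs


# Three code points collapse into one glyph, the trailing vowel sign stays.
print(substitute("\u0915\u094D\u0937\u093E"))
```

Width calculation and line breaking would then have to work on this glyph list instead of on the original characters.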
This issue also affects Tamil text: global-healthy-liveable-cities/global_scorecards#7 |
@gmischler @Lucas-C Unfortunately, the problem is still there. Pillow had the same problem rendering fonts until they added a font layout engine, ImageFont.Layout.RAQM, which solves it. I'm not an expert, but I guess libraqm could help if someone can add it to fpdf2. Useful link: https://github.com/python-pillow/Pillow/blob/main/src/_imagingft.c#L118 |
This is an interesting lead, thank you @MayankFawkes. |
Interesting indeed! Especially since we already have Pillow as a dependency... |
@Lucas-C There are a few ways we could use it. A prebuilt binary of libraqm is available, and using that binary on Linux is really simple:

"First, there is the ctypes module in the standard library. It allows you to load a dynamic-link library (DLL on Windows, shared libraries .so on Linux) and call functions from these libraries, directly from Python. Such libraries are usually written in C." -- source

Dependency problem: Pillow uses libraqm but does not install it along with Pillow, because it is optional. If we want Pillow to lay out fonts properly, we have to install libraqm manually. We could do the same, and if we want to provide it as a dependency, the best way is to make our own builds for the different architectures and put them in the pip wheel file.

Adding the libraqm dependency: there is a library written in C/C++ for decoding QR codes and barcodes called zbar, which also ships binary files. Someone made a wrapper for it called pyzbar, and this is how they build the wheel file to bundle the zbar binary: link (just adding the binaries to the wheel).

I am dropping some more links to add |
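As a rough illustration of the ctypes approach described above, the following sketch only checks whether a shared library can be located and loaded. It assumes the library is installed under the name "raqm" and deliberately calls none of its functions; the helper name `load_shared_library` is invented for this example.

```python
import ctypes
import ctypes.util
from typing import Optional


def load_shared_library(name: str) -> Optional[ctypes.CDLL]:
    """Try to locate and load a shared library, returning None if unavailable."""
    path = ctypes.util.find_library(name)
    if path is None:
        # The platform's library search did not find anything by that name.
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        # Found on disk but not loadable (wrong architecture, missing deps, ...).
        return None


# "raqm" is an assumption about how the library is named on the system.
raqm = load_shared_library("raqm")
print("libraqm available:", raqm is not None)
```

Treating the library as optional like this mirrors what the comment says about Pillow: the feature is enabled only when the binary happens to be present.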
If fpdf2 were Linux-only, using stuff like ctypes would be no problem. This specific issue here is "only" about ligatures, which are primarily necessary for Indic scripts. A Python implementation of the bidi algorithm is available in python-bidi, though it doesn't look particularly complicated, so we could easily roll our own. A general solution to ligatures requires a lookup of the substitutions in the font data. This seems straightforward, but we'll have to see what pitfalls we run into with it. So I suspect we can't just slap on a few more dependencies and let those do the work for us.
Obviously all of this won't happen within a few weeks. Care should also be taken at any step to take the possible requirements of the following steps into account, at least as far as can be predicted at the given time. |
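For anyone curious what "rolling our own" bidi support would start from: the Python standard library already exposes the bidirectional class of each character. This sketch covers only that classification step, not the bidi algorithm itself, and the helper name `bidi_classes` is made up for the example.

```python
import unicodedata
from typing import List, Tuple


def bidi_classes(text: str) -> List[Tuple[str, str]]:
    """Pair each character with its Unicode bidirectional class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Latin letter, Hebrew alef, Arabic alef, European digit.
for ch, cls in bidi_classes("a\u05D0\u06271"):
    print(f"U+{ord(ch):04X} {cls}")
```

A full implementation would then resolve embedding levels and reorder runs based on these classes.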
@andersonhc PR #820 has been merged today. Could you test if that solved your issue @namastevis? You can install
The documentation is there: https://pyfpdf.github.io/fpdf2/TextShaping.html |
@mohindra9211 did you try "set_text_shaping()"? Here is the small test I did:

```python
from fpdf import FPDF

text = "परी कथाएँ काल्पनिक होते हुए भी मन को उड़ान देने वाली और शिक्षाप्रद होती हैं।"

pdf = FPDF()
pdf.add_page()
pdf.add_font(family="Mangal", fname="C:\\Apps\\fpdf2\\test\\text_shaping\\Mangal 400.ttf")
pdf.set_font("Mangal", size=40)

# First paragraph: text shaping disabled, for comparison
pdf.set_text_shaping(False)
pdf.multi_cell(w=pdf.epw, txt=text, new_x="LEFT", new_y="NEXT")
pdf.ln()

# Second paragraph: same text with text shaping enabled
pdf.set_text_shaping(True)
pdf.multi_cell(w=pdf.epw, txt=text, new_x="LEFT", new_y="NEXT")

pdf.output("hindi.pdf")
```

And the results with text shaping enabled look correct. |
Dear AndersonHC, I want to express my sincere gratitude for your assistance. You've helped me resolve a significant issue. However, I've noticed a minor problem in the output, and I suspect it might be related to the font. I'll try using different fonts to see if that resolves the issue. Thank you once again for your valuable help. |
Can you tell me what font and text you used? |
This problem is solved. Thank you once again for your valuable help. |
This comment was marked as resolved.
Tomorrow, I'll provide a list of fonts that correctly support Hindi text. Please incorporate this information into your document. It will be particularly beneficial for FPDF2 users, especially those in India. I appreciate your support and prompt response. Thank you. |
Those fonts that don't work with fpdf2, do they produce correct results with other software? |
I tried the given code with the Mangal_Regular font, but it's not working. It may be a font problem. Please correct it. |
Read "AttributeError" carefully and write the correct path |
fixed, the file path was not correct. Thank you! |
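Since a wrong font path was the culprit here, a small check before registering the font can give a clearer error message. This is just a sketch: `require_font_file` is a hypothetical helper, not part of fpdf2, and the path in the usage note is a placeholder.

```python
from pathlib import Path


def require_font_file(fname: str) -> Path:
    """Return the font path if the file exists, otherwise raise a clear error."""
    path = Path(fname).expanduser()
    if not path.is_file():
        # Show the absolute path so a wrong working directory is easy to spot.
        raise FileNotFoundError(f"Font file not found: {path.resolve()}")
    return path
```

It could then be used as `pdf.add_font(family="Mangal", fname=str(require_font_file("fonts/Mangal.ttf")))`, with the path adjusted to wherever the font actually lives.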
To gain a better understanding of fpdf2, it is advisable to peruse the fpdf2 documentation along with its tutorials. It's worth noting that you can incorporate both Hindi and English text into your documents, depending on your coding proficiency. |
While trying to generate a PDF using fpdf2, the Hindi text is not rendered correctly. I have tried using different fonts (Gargi, Mangal, Arjun-Wide, Mukta, Lohit) but all give a wrong result similar to what is shown below.
Correct Hindi text: इण्टरनेट पर हिन्दी के साधन
What is printed:
It seems the issue happens in the following two scenarios:
1.
When this appears before a character, while printing it moves to the next character.
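When reporting or debugging this kind of issue, it can help to list the code points behind the sample text, because combining signs such as the virama (U+094D) and the dependent vowel signs are easy to miss on screen. A minimal sketch using only the standard library (the helper name `describe` is made up):

```python
import unicodedata
from typing import List, Tuple


def describe(text: str) -> List[Tuple[str, str, str]]:
    """List (code point, general category, Unicode name) per non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        rows.append(
            (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "UNKNOWN"))
        )
    return rows


# Categories starting with "M" are combining marks that attach to a base letter.
for row in describe("इण्टरनेट पर हिन्दी के साधन"):
    print(*row)
```

Rows whose category starts with "M" mark the places where a renderer must combine or reorder glyphs instead of printing characters one by one.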