Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle preceding whitespaces in write() for line break #947

Merged
merged 6 commits into from
Oct 11, 2023

Conversation

SaiHarshaK
Copy link

@SaiHarshaK SaiHarshaK commented Oct 8, 2023

Fixes #902

This PR adds support to handle text starting with whitespaces which cause linebreak. In case of text with preceding whitespace being added any line with text, this would trigger a linebreak.
The issue currently is that on triggering linebreak, it currently throws an exception since the fragment is basically empty which would mean the perform_harfbuzz_shaping() would return None, and we would be unable to compute the shaped_text_width.

To solve this issue, we add a conditional check when perform_harfbuzz_shaping() return None, where we simply return shaped_text_width as (0, 0) (char len, char width)

Using the repro in the issue, have added UT for the same.
Before code changes:

test\text_shaping\test_text_shaping.py:169: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
fpdf\fpdf.py:218: in wrapper
    return fn(self, *args, **kwargs)
fpdf\fpdf.py:3674: in write
    new_page = self._render_styled_text_line(
fpdf\fpdf.py:2831: in _render_styled_text_line
    unscaled_width = frag.get_width(initial_cs=i != 0)
fpdf\line_break.py:179: in get_width
    (char_len, w) = self.font.get_text_width(
fpdf\fonts.py:199: in get_text_width
    return self.shaped_text_width(text, font_size_pt, text_shaping_parms)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = TTFFont(i=2, fontkey=mangal), text = [], font_size_pt = 40
text_shaping_parms = {'direction': None, 'features': {}, 'language': None, 'script': None, ...}

    def shaped_text_width(self, text, font_size_pt, text_shaping_parms):
        """
        When texts are shaped, the length of a string is not always the sum of all individual character widths
        This method will invoke harfbuzz to perform the text shaping and return the sum of "x_advance"
        and "x_offset" for each glyph. This method works for "left to right" or "right to left" texts.
        """
        _, glyph_positions = self.perform_harfbuzz_shaping(
            text, font_size_pt, text_shaping_parms
        )
        text_width = 0
>       for pos in glyph_positions:
E       TypeError: 'NoneType' object is not iterable

fpdf\fonts.py:212: TypeError

After code changes:

test/text_shaping/test_text_shaping.py::test_mixed_text_shaping PASSED                                       [100%]

Checklist:

  • The GitHub pipeline is OK (green),
    meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.

  • A unit test is covering the code added / modified by this PR

  • This PR is ready to be merged

  • [] In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder

  • A mention of the change is present in CHANGELOG.md

Since this is a bugfix and not a feature, not adding anything to docs folder.

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

@SaiHarshaK SaiHarshaK changed the title Handle preceding whitespaces for line break and add corresponding UT Handle preceding whitespaces for line break Oct 8, 2023
@SaiHarshaK SaiHarshaK changed the title Handle preceding whitespaces for line break Handle preceding whitespaces in write() for line break Oct 8, 2023
@gmischler
Copy link
Collaborator

To solve this issue, we add a whitespace so that the fragment is not empty anymore and it able to print whitespace and go ahead with linebreak and continue rendering further output.

I'm really not sure if this is the correct approach. Why add extra text data that is not actually supposed to appear on page? How can you be sure that this never has any unintended side effects?

It would be much simpler and more logical to just add a check in shaped_text_width() and return 0.0 if the fragment is empty.

As a potential source of unwanted interaction: In #897 I am adding functionality to MultiLineBreak() so that it can remove any space characters at the start of each line. This is necessary to render converted HTML text correctly.

fpdf/line_break.py Outdated Show resolved Hide resolved
@andersonhc
Copy link
Collaborator

@SaiHarshaK
I agree with gmischler, changing shaped_text_width() to return 0, 0 when harfbuzz returns None is a much simpler and cleaner solution.

@SaiHarshaK
Copy link
Author

Got it, that makes sense. will do that and update this PR

@SaiHarshaK
Copy link
Author

@andersonhc @gmischler Please check now

@SaiHarshaK
Copy link
Author

@gmischler @andersonhc is there a way to rerun the tests? i think its complaining some existing test took too long than the limit set.

Not sure why it complains only for a specific version of python, since it ran successfully some versions on windows

@SaiHarshaK
Copy link
Author

Thanks, it seems the reruns got these passed :)

@Lucas-C
Copy link
Member

Lucas-C commented Oct 9, 2023

@gmischler @andersonhc is there a way to rerun the tests? i think its complaining some existing test took too long than the limit set.

Not sure why it complains only for a specific version of python, since it ran successfully some versions on windows

This is a reccuring problem that we have, tracked in #923

I re-run the pipeline, and everything is ✅ now.

@SaiHarshaK
Copy link
Author

Can you get this merged @Lucas-C

@Lucas-C
Copy link
Member

Lucas-C commented Oct 9, 2023

Can you get this merged @Lucas-C

I'll defer to @gmischler on this, as he is much more expert than me on the subject,
and I would not want this PR to conflict with his work in #897.

By the way, are you taking part to Hacktoberfest @SaiHarshaK?
Is it important for you that this PR gets merged or validated with the hacktoberfest-accepted label in October, if possible?

@SaiHarshaK
Copy link
Author

Yes i am taking part in it, so it would be great if it gets merged.

Maybe the tag itself is sufficient but im not sure though

@gmischler
Copy link
Collaborator

I'll defer to @gmischler on this, as he is much more expert than me on the subject,
and I would not want this PR to conflict with his work in #897.

Actually, @andersonhc is the real expert here, as I haven't really touched the text shaping stuff.
Neither does #897 touch it, so there won't be any conflict.

@andersonhc
Copy link
Collaborator

@allcontributors please add SaiHarshaK for code

@allcontributors
Copy link

@andersonhc

I've put up a pull request to add @SaiHarshaK! 🎉

@andersonhc andersonhc merged commit e918119 into py-pdf:master Oct 11, 2023
10 checks passed
@SaiHarshaK
Copy link
Author

Thank you!

@Lucas-C
Copy link
Member

Lucas-C commented Oct 13, 2023

Thank YOU for your contribution to fpdf2 @SaiHarshaK 🙂

This has been released in 2.7.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mixed scripts with text shaping failure
4 participants