Handling some edge cases with table() (#768)

py-pdf · Apr 19, 2023 · 8763cad
1 parent 28edd82
commit 8763cad
Showing 27 changed files with 266 additions and 133 deletions.
diff --git a/.github/workflows/continuous-integration-workflow.yml b/.github/workflows/continuous-integration-workflow.yml
@@ -24,11 +24,11 @@ jobs:
           python-version: ${{ matrix.python-version }}
       - name: Install system dependencies ⚙️
         if: matrix.platform == 'ubuntu-latest'
-        run: sudo apt-get install ghostscript libjpeg-dev
+        run: sudo apt-get update && sudo apt-get install ghostscript libjpeg-dev
       - name: Install qpdf ⚙️
         if: matrix.platform == 'ubuntu-latest' && matrix.python-version != '3.9'
         # We run the unit tests WITHOUT qpdf for a single parallel execution / Python version:
-        run: sudo apt-get install qpdf
+        run: sudo apt-get update && sudo apt-get install qpdf
       - name: Install Python dependencies ⚙️
         run: |
           python -m pip install --upgrade pip setuptools wheel

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -19,9 +19,15 @@ This can also be enabled programmatically with `warnings.simplefilter('default',
 ## [2.7.4] - Not released yet
 ### Added
 - documentation on how to embed `graphs` and `charts` generated using `Pygal` lib: [documentation section](https://pyfpdf.github.io/fpdf2/Maths.html#using-pygal) - thanks to @ssavi-ict
-- Documentation on how to use `fpdf2` with [FastAPI](https://fastapi.tiangolo.com/): <https://pyfpdf.github.io/fpdf2/UsageInWebAPI.html#FastAPI> - thanks to @KamarulAdha
+- documentation on how to use `fpdf2` with [FastAPI](https://fastapi.tiangolo.com/): <https://pyfpdf.github.io/fpdf2/UsageInWebAPI.html#FastAPI> - thanks to @KamarulAdha
+- [`FPDF.write_html()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html): `<table>` elements can now be aligned left or right on the page using `align=`
+### Fixed
+- [`FPDF.table()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.table): text overflow in the last cell of the header row is now properly handled
+- [`FPDF.table()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.table): when `align="RIGHT"` is provided, the page right margin is now properly taken into consideration
 ### Changed
 - [`FPDF.write_html()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.write_html) does not render the top row as a header, in bold with a line below, when no `<th>` are used, in order to be more backward-compatible with earlier versions of `fpdf2` - _cf._ [#740](https://github.com/PyFPDF/fpdf2/issues/740)
+### Deprecated
+- the `split_only` optional parameter of [`FPDF.multi_cell()`](https://pyfpdf.github.io/fpdf2/fpdf/fpdf.html#fpdf.fpdf.FPDF.multi_cell), which is replaced by two new distinct optional parameters: `dry_run` & `output`
 
 ## [2.7.3] - 2023-04-03
 ### Fixed

diff --git a/docs/Development.md b/docs/Development.md
@@ -198,7 +198,7 @@ To preview the API documentation, launch a local rendering server with:
 
 ## PDF spec & new features
 The **PDF 1.7 spec** is available on Adobe website:
-[PDF32000_2008.pdf](https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf).
+[PDF32000_2008.pdf](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf).
 
 It may be intimidating at first, but while technical, it is usually quite clear and understandable.
 

diff --git a/docs/HTML.md b/docs/HTML.md
@@ -84,7 +84,7 @@ pdf.output("html.pdf")
 * `<ol>`, `<ul>`, `<li>`: ordered, unordered and list items (can be nested)
 * `<dl>`, `<dt>`, `<dd>`: description list, title, details (can be nested)
 * `<sup>`, `<sub>`: superscript and subscript text
-* `<table>`: (and `border`, `width` attributes)
+* `<table>`: (with `align`, `border`, `width` attributes)
     + `<thead>`: optional tag, wraps the table header row
     + `<tfoot>`: optional tag, wraps the table footer row
     + `<tbody>`: optional tag, wraps the table rows with actual content

diff --git a/docs/LineBreaks.md b/docs/LineBreaks.md
@@ -10,5 +10,3 @@ An automatic break is performed at the location of the nearest space or soft-hyp
 A soft-hyphen will be replaced by a normal hyphen when triggering a line break, and ignored otherwise.
 
 If the `print_sh` parameter of `multi_cell()` or `write()` is set to `True` (it defaults to `False`), the soft-hyphen character is printed to the document (as a normal hyphen with most fonts) instead of being used as a line break opportunity.
-
-When using [multi_cell()](fpdf/fpdf.html#fpdf.fpdf.FPDF.multi_cell), the parameter `split_only=True` will perform word-wrapping only and return the resulting multi-lines as a list of strings. This can be used in conjunction with the cursor position and document height to determine if inserting a [multi_cell()](fpdf/fpdf.html#fpdf.fpdf.FPDF.multi_cell) will result in a page break.
diff --git a/docs/Text.md b/docs/Text.md
@@ -90,8 +90,7 @@ the background painted.
 Using `new_x="RIGHT", new_y="TOP", max_line_height=pdf.font_size` can be
 useful to build tables with multiline text in cells.
 
-In normal operation, returns a boolean indicating if page break was triggered.
-When `split_only == True`, returns `txt` split into lines in an array (with any markdown markup removed).
+In normal operation, returns a boolean indicating if a page break was triggered. The return value can be altered by specifying the `output` parameter.
 
 [Signature and parameters for multi_cell()](fpdf/fpdf.html#fpdf.fpdf.FPDF.multi_cell)
 

diff --git a/fpdf/enums.py b/fpdf/enums.py
@@ -129,6 +129,10 @@ def coerce(cls, value):
             return value
 
         if isinstance(value, str):
+            try:
+                return cls[value.upper()]
+            except KeyError:
+                pass
             try:
                 flags = cls[value[0].upper()]
                 for char in value[1:]:
@@ -198,6 +202,7 @@ def coerce(cls, value):
 class TextEmphasis(CoerciveIntFlag):
     """
     Indicates use of bold / italics / underline.
+
     These enum values can be combined with & and | operators:
         style = B | I
     """
@@ -231,6 +236,24 @@ def coerce(cls, value):
         return super(cls, cls).coerce(value)
 
 
+class MethodReturnValue(CoerciveIntFlag):
+    """
+    Defines the return value(s) of a FPDF content-rendering method.
+
+    These enum values can be combined with & and | operators:
+        PAGE_BREAK | LINES
+    """
+
+    PAGE_BREAK = 1
+    "The method will return a boolean indicating if a page break occurred"
+
+    LINES = 2
+    "The method will return a multi-lines array of strings, after performing word-wrapping"
+
+    HEIGHT = 4
+    "The method will return how much vertical space was used"
+
+
 class TableBordersLayout(CoerciveEnum):
     "Defines how to render table borders"
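
As a reading aid for the `coerce()` and `MethodReturnValue` changes above, here is a minimal standalone sketch that uses only the standard library `enum` module. `ReturnFlags`, `coerce_flags` and `build_return_value` are hypothetical names introduced for illustration, not fpdf2 classes or functions. The sketch only mirrors the logic visible in this diff: a member is looked up by its upper-cased name, and the result is a tuple when several flags are requested.

```python
from enum import IntFlag


class ReturnFlags(IntFlag):
    """Hypothetical stand-in for MethodReturnValue (same member values as in the diff)."""

    PAGE_BREAK = 1
    LINES = 2
    HEIGHT = 4


def coerce_flags(value):
    """Accept a ReturnFlags value or a member name, case-insensitively.

    Simplified on purpose: the coerce() in the diff also falls back to a
    per-character lookup, which is left out here.
    """
    if isinstance(value, ReturnFlags):
        return value
    if isinstance(value, str):
        try:
            return ReturnFlags[value.upper()]
        except KeyError:
            raise ValueError(f"Unknown flag name: {value!r}") from None
    return ReturnFlags(value)


def build_return_value(output, page_break, lines, height):
    """Assemble a result from the requested flags, like the end of multi_cell()."""
    output = coerce_flags(output)
    result = ()
    if output & ReturnFlags.PAGE_BREAK:
        result += (page_break,)
    if output & ReturnFlags.LINES:
        result += (lines,)
    if output & ReturnFlags.HEIGHT:
        result += (height,)
    # A single requested value is returned bare, several as a tuple.
    return result[0] if len(result) == 1 else result


single = build_return_value("lines", False, ["Hello", "world"], 10.0)
combined = build_return_value(
    ReturnFlags.PAGE_BREAK | ReturnFlags.HEIGHT, False, ["Hello", "world"], 10.0
)
```

The values inside a combined result follow the fixed order of the `if` checks (page break, lines, height), not the order in which the flags were joined.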
 

diff --git a/fpdf/fpdf.py b/fpdf/fpdf.py
@@ -60,6 +60,7 @@ class Image:
     EncryptionMethod,
     FontDescriptorFlags,
     FileAttachmentAnnotationName,
+    MethodReturnValue,
     PageLayout,
     PageMode,
     PathPaintRule,
@@ -256,7 +257,7 @@ def check_page(fn):
 
     @wraps(fn)
     def wrapper(self, *args, **kwargs):
-        if not self.page and not kwargs.get("split_only"):
+        if not self.page and not (kwargs.get("dry_run") or kwargs.get("split_only")):
             raise FPDFException("No page open, you need to call add_page() first")
         return fn(self, *args, **kwargs)
 
@@ -3342,6 +3343,19 @@ def _perform_page_break(self):
     def _has_next_page(self):
         return self.pages_count > self.page
 
+    @contextmanager
+    def _disable_writing(self):
+        self._out = lambda *args, **kwargs: None
+        self.add_page = lambda *args, **kwargs: None
+        self._perform_page_break = lambda *args, **kwargs: None
+        prev_x, prev_y = self.x, self.y
+        yield
+        # restore writing functions:
+        del self.add_page
+        del self._out
+        del self._perform_page_break
+        self.set_xy(prev_x, prev_y)  # restore location
+
     @check_page
     def multi_cell(
         self,
@@ -3351,7 +3365,7 @@ def multi_cell(
         border=0,
         align=Align.J,
         fill=False,
-        split_only=False,
+        split_only=False,  # DEPRECATED
         link="",
         ln="DEPRECATED",
         max_line_height=None,
@@ -3360,6 +3374,8 @@ def multi_cell(
         new_x=XPos.RIGHT,
         new_y=YPos.NEXT,
         wrapmode: WrapMode = WrapMode.WORD,
+        dry_run=False,
+        output=MethodReturnValue.PAGE_BREAK,
     ):
         """
         This method allows printing text with line breaks. They can be automatic
@@ -3384,8 +3400,8 @@ def multi_cell(
                 `C`: center; `X`: center around current x; `R`: right align
             fill (bool): Indicates if the cell background must be painted (`True`)
                 or transparent (`False`). Default value: False.
-            split_only (bool): if `True`, does not output anything, only perform
-                word-wrapping and return the resulting multi-lines array of strings.
+            split_only (bool): **DEPRECATED since 2.7.4**:
+                Use `dry_run=True` and `output="LINES"` instead.
             link (str): optional link to add on the cell, internal
                 (identifier returned by `add_link`) or external URL.
             new_x (fpdf.enums.XPos, str): New current position in x after the call. Default: RIGHT
@@ -3398,13 +3414,46 @@ def multi_cell(
                 character, instead of a line breaking opportunity. Default value: False
             wrapmode (fpdf.enums.WrapMode): "WORD" for word based line wrapping (default),
                 "CHAR" for character based line wrapping.
+            dry_run (bool): if `True`, does not output anything in the document.
+                Can be useful when combined with `output`.
+            output (fpdf.enums.MethodReturnValue): defines what this method returns.
+                If several enum values are joined, the result will be a tuple.
 
         Using `new_x=XPos.RIGHT, new_y=YPos.TOP, max_line_height=pdf.font_size` is
         useful to build tables with multiline text in cells.
 
-        Returns: a boolean indicating if page break was triggered,
-            or if `split_only == True`: `txt` splitted into lines in an array
+        Returns: a single value or a tuple, depending on the `output` parameter value
         """
+        if split_only:
+            warnings.warn(
+                (
+                    'The parameter "split_only" is deprecated.'
+                    ' Use instead dry_run=True and output="LINES".'
+                ),
+                DeprecationWarning,
+                stacklevel=3,
+            )
+        if dry_run or split_only:
+            with self._disable_writing():
+                return self.multi_cell(
+                    w=w,
+                    h=h,
+                    txt=txt,
+                    border=border,
+                    align=align,
+                    fill=fill,
+                    link=link,
+                    ln=ln,
+                    max_line_height=max_line_height,
+                    markdown=markdown,
+                    print_sh=print_sh,
+                    new_x=new_x,
+                    new_y=new_y,
+                    wrapmode=wrapmode,
+                    dry_run=False,
+                    split_only=False,
+                    output=MethodReturnValue.LINES if split_only else output,
+                )
         wrapmode = WrapMode.coerce(wrapmode)
         if isinstance(w, str) or isinstance(h, str):
             raise ValueError(
@@ -3443,10 +3492,6 @@ def multi_cell(
         align = Align.coerce(align)
 
         page_break_triggered = False
-        if split_only:
-            self._out = lambda *args, **kwargs: None
-            self.add_page = lambda *args, **kwargs: None
-            self._perform_page_break_if_need_be = lambda *args, **kwargs: None
 
         if h is None:
             h = self.font_size
@@ -3462,6 +3507,7 @@ def multi_cell(
 
         prev_font_style, prev_underline = self.font_style, self.underline
         prev_x, prev_y = self.x, self.y
+        total_height = 0
 
         if not border:
             border = ""
@@ -3490,8 +3536,6 @@ def multi_cell(
                     trailing_nl=False,
                 )
             ]
-        if align == Align.X:
-            prev_x = self.x
         should_render_bottom_blank_cell = False
         for text_line_index, text_line in enumerate(text_lines):
             is_last_line = text_line_index == len(text_lines) - 1
@@ -3527,6 +3571,7 @@ def multi_cell(
                 link=link,
             )
             page_break_triggered = page_break_triggered or new_page
+            total_height += current_cell_height
             if not is_last_line and align == Align.X:
                 # prevent cumulative shift to the left
                 self.x = prev_x
@@ -3566,26 +3611,29 @@ def multi_cell(
         if new_y == YPos.TOP:  # We may have jumped a few lines -> reset
             self.y = prev_y
 
-        if split_only:
-            # restore writing functions
-            del self.add_page
-            del self._out
-            del self._perform_page_break_if_need_be
-            self.set_xy(prev_x, prev_y)  # restore location
-            result = []
-            for text_line in text_lines:
-                characters = []
-                for frag in text_line.fragments:
-                    characters.extend(frag.characters)
-                result.append("".join(characters))
-            return result
         if markdown:
             if self.font_style != prev_font_style:
                 self.font_style = prev_font_style
                 self.current_font = self.fonts[self.font_family + self.font_style]
             self.underline = prev_underline
 
-        return page_break_triggered
+        output = MethodReturnValue.coerce(output)
+        return_value = ()
+        if output & MethodReturnValue.PAGE_BREAK:
+            return_value += (page_break_triggered,)
+        if output & MethodReturnValue.LINES:
+            output_lines = []
+            for text_line in text_lines:
+                characters = []
+                for frag in text_line.fragments:
+                    characters.extend(frag.characters)
+                output_lines.append("".join(characters))
+            return_value += (output_lines,)
+        if output & MethodReturnValue.HEIGHT:
+            return_value += (total_height,)
+        if len(return_value) == 1:
+            return return_value[0]
+        return return_value
 
     @check_page
     def write(
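
The dry-run mechanism introduced in this file replaces output methods with no-op instance attributes and later removes them with `del`, which makes the class methods visible again. The standalone sketch below illustrates that pattern on a toy class. `Recorder` and its members are hypothetical and unrelated to the fpdf2 API. Unlike `_disable_writing()` in the diff, the sketch wraps the `yield` in `try`/`finally`, an assumption added here so that writing is restored even if the body raises.

```python
from contextlib import contextmanager


class Recorder:
    """Hypothetical toy document: write() appends to a buffer and moves the cursor."""

    def __init__(self):
        self.buffer = []
        self.y = 0

    def _out(self, text):
        self.buffer.append(text)

    def write(self, text):
        self._out(text)
        self.y += 1  # the cursor moves even while output is disabled
        return self.y

    @contextmanager
    def disable_writing(self):
        # An instance attribute shadows the class method of the same name.
        self._out = lambda *args, **kwargs: None
        prev_y = self.y
        try:
            yield
        finally:
            # Deleting the instance attribute exposes the class method again.
            del self._out
            self.y = prev_y  # restore the cursor position


doc = Recorder()
with doc.disable_writing():
    measured = doc.write("not rendered")  # computed, but nothing is stored
doc.write("rendered")
```

Inside the `with` block the layout code still runs, so values such as the final cursor position can be measured, while the buffer and cursor are left unchanged once the block exits.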

diff --git a/fpdf/html.py b/fpdf/html.py
@@ -434,8 +434,10 @@ def handle_starttag(self, tag, attrs):
                     if self.table_line_separators
                     else "SINGLE_TOP_LINE"
                 )
+            align = attrs.get("align", "center").upper()
             self.table = Table(
                 self.pdf,
+                align=align,
                 borders_layout=borders_layout,
                 line_height=self.h * 1.30,
                 width=width,