Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indent HTML lists correctly (Issue 1073) #1170

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

lcgeneralprojects
Copy link

@lcgeneralprojects lcgeneralprojects commented May 17, 2024

Fixes #1073
Implements paragraph indentation via adjustments of pdf.x instead of using a natural number of whitespaces.
Breaks up list items into individual paragraphs.

Checklist:

  • The GitHub pipeline is OK (green),
    meaning that both pylint (static code analyzer) and black (code formatter) are happy with the changes of this PR.

  • A unit test is covering the code added / modified by this PR

  • This PR is ready to be merged

  • In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder

  • A mention of the change is present in CHANGELOG.md

Not sure how to actually name the HTML2FPDF.list_pseudo_margin attribute. It is used for determining the height of the \n line created when a <ul> or <ol> starting tag is handled.

Should the re-implementation of paragraph indentation be reflected in a doctstring, even though the tag_indents parameter of write_html() has not been touched? If so, where should it be placed?

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

Some debugging and polish needed.
Some variables need tweaking.
Needs testing.
Code reuse unsatisfactory.
Some variables need tweaking.
Needs testing.
Code reuse unsatisfactory.
Potentially significant issues with tests:
1. test_html_ln_outside_p - IndexError: list index out of range.
2. test_html_ol_ul_line_height - actual distance between lines differs slightly from expected.

Code reuse unsatisfactory.
Need feedback for handling <dd> and <blockquote>.

Potentially significant issues with tests:
1. test_html_ol_ul_line_height - actual distance between lines differs slightly from expected.

Need feedback for whether or not the new indentation that contradicts old tests is satisfactory.

Code reuse unsatisfactory.
Need feedback.
Bug present: bullets are made one per line instead of one per paragraph.
Saving progress before introducing a `Bullet` class.
Feature implemented.
Testing and adjustments of tests needed.
Prevented from `Paragraph.top_margin` being added to `pdf.y` of first lines of paragraphs with bullets.
Prevented from `Paragraph.top_margin` being added to `pdf.y` of first lines of paragraphs with bullets.
`<ul>` and `<ol>` tags now cause a creation of a paragraph with the string `\n` being used to generate a fragment of the height `list_pseudo_margin`.
Adjusted defaults for `li_tag_indent`.
fpdf/fpdf.py Outdated Show resolved Hide resolved
Changed `Paragraph.generate_bullet_frag()` into `generate_bullet_frag_and_tl`, and made it also generate the bullet text line.
Dealing with the issue of inappropriately large distance between `<dt>` and their child `<dd>` elements when `Paragraph.top_margin` is 0.
Changed `Paragraph.generate_bullet_frag()` into `generate_bullet_frag_and_tl`, and made it also generate the bullet text line.
@lcgeneralprojects lcgeneralprojects marked this pull request as ready for review May 20, 2024 07:37
@gmischler gmischler changed the title Issue 1073 Indent HTML lists correctly (Issue 1073) May 20, 2024
Copy link
Collaborator

@gmischler gmischler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work so far, but the devil is in the detail...

As I'm sure you've noticed, the interplay between the HTML parser, text regions, line wrapping, and rendering is non-trivial. I've added some pointers of how to fix the parts that don't quite add up yet.

fpdf/line_break.py Outdated Show resolved Hide resolved
fpdf/html.py Outdated Show resolved Hide resolved
fpdf/html.py Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/text_region.py Outdated Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/text_region.py Show resolved Hide resolved
fpdf/html.py Outdated Show resolved Hide resolved
test/html/test_html.py Show resolved Hide resolved
…`list_vertical_margin`.

Removed the `MultiLineBreak.indent` attribute.
Added a test for long `<ol>` bullets.
…ags_and_tl` method and in the `Bullet` class.
Edited `TextRegion.md` to reflect the introduced changes for `Paragraph`s.
Added tests for `Paragraph` generation in `test_html.py`
@lcgeneralprojects
Copy link
Author

Reformatted html.py using black manually.

@gmischler
Copy link
Collaborator

Now that the code has largely settled, let's look at the results.

  • html_blockquote_indent.pdf - I know this is a pre-existing test, but it would probably be good to add a blockquoted paragraph with several lines, just to show the effect, and also to prevent possible future regressions.

  • html_customize_ul.pdf - Small indents.

  • html_li_prefix_color.pdf - Small indents.

  • html_li_tag_indent.pdf - Small indents.

  • html_ln_outside_p.pdf - Small indents.

  • html_ol_ul_line_height.pdf - Small indents.

  • html_ul_type.pdf - Small indents.

  • html_tag_indent.pdf - Besides the width of the indent, this one would also benefit from a bit more (multi-line) context.

  • html_description.pdf - Has received a vertical margin it didn't have before. Is this now more "correct" (HTML specs / typical browser behaviour)?

  • html_features.pdf - Reflects changes already seen in other files.

  • html_long_list_entries - New. Demonstrates multiline list item indent.

  • html_long_ol_bullets - New. Demonstrates bullet left cut-off.

The indents for lists are now much smaller than before. The previous default indent depended on the font size (5 x width of NBSP). This now seems to have changed to 5 "document units". With the default of mm this is too small. With the document units set to eg. inches, it will be way too large. You will have to pick a reasonable size in mm (eg. 8 or 10), and then make sure that if the document units aren't mm, this value gets converted appropriately before being used. The documentation also must clearly state that tag_indents values have to be given in document units.
Warning: I'm unable to test this right now, but this is my conclusion looking at the code. You'll have to test what actually happens when the document units aren't mm, and obviously those tests need to be added to our permanent collection.
Maybe the simplest solution will be to whenever we assign a value from DEFAULT_TAG_INDENTS to self.tag_indents, to systematically convert it from mm to document units.

Note that top and bottom margins of Paragraph()s are also in document units (which the documentation currently doesn't make sufficiently clear).
You'll have to take the same precautions with those as with the indents. In most cases they are defined in terms of font height, which is fine, but there are some hard-coded "magic numbers" present in the code as well. Most of those will have been there before you started (you can probably blame me for some of them 😉), but this is a good opportunity to fix them.
The same applies to the last hardcoded self._ln(2) call within the <li> tag code. This also incorrectly assumes mm, and needs to be converted to document units.

The docstring for Paragraph() looks good. However, the end user will access this functionality through ParagraphCollectorMixin.paragraph(), so maybe it should rather go there?

@lcgeneralprojects
Copy link
Author

lcgeneralprojects commented May 30, 2024

Regarding tests with small indents, do you want me to pass tag_indents values into pdf.write_html() calls in tests where that argument is not used, in addition to adjusting default tag indentation handling?

Regarding the new margins between elements in html_description.pdf - Firefox and Edge both produce the margin with that HTML code.

I might not have time to confidently deal with the 'magic numbers' tonight, so I will likely be pushing the changes tomorrow, and not today. Should I just do the conversion into appropriate units with them? If so, would you prefer for me to intentionally change them a little in order for them to look nicer, or would you prefer a more exact conversion?

EDIT: Going to note that, currently, due to the HTML2FPDF._ln() call when handling <li> start tags, there will be a gap of the relevant size, even if the list is the first visible content of the document, in case there will be a need to eliminate it in the future.

…margin values in `html.py` to the chosen document unit of measurement.

Adjusted default tag indent values.
Moved the `Paragraph` docstring to the `ParagraphCollectorMixin.paragraph()` method.
Changed the `CustomPDF` class in `test_html_customize_ul` to have non-static attributes `li_tag_indent` and `ul_bullet_char`.
Adjusted tests.
@@ -270,8 +270,8 @@ def __init__(
self,
pdf,
image_map=None,
li_tag_indent=5,
dd_tag_indent=10,
li_tag_indent=30,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are still magic numbers (overly large now for mm), which will vary in their effect depending on the current document units.
pdf.k contains the scale factor between the current document units and typographical points (pica), which the PDF format uses internally. To get the conversion factor between an arbitrary unit and pica, you can use eg, util.get_scale_factor("mm").

You will also need test cases specifically for verifying that any constants defined internally really give the same result even when the user has set different document units.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pdf.k contains the scale factor between the current document units and typographical points (pica), which the PDF format uses internally. To get the conversion factor between an arbitrary unit and pica, you can use eg, util.get_scale_factor("mm").

pdf.k is already defined using util.get_scale_factor, though.

Are there any cases where the needed conversion factor isn't going to be the same as the scale factor?

You will also need test cases specifically for verifying that any constants defined internally really give the same result even when the user has set different document units

I have already introduced test_html_measurement_units. Do I need to have tests for something not covered by it?

Copy link
Collaborator

@gmischler gmischler Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pdf.k is already defined using util.get_scale_factor, though.

Yes of course, when the user sets the document units, then unit util.get_scale_factor() is used to populate self.k with the appropriate conversion factor. But if we define built-in constants in mm, then we need to use it directly, to make sure we're independent of the document units. It's a bit tricky to keep track of what unit any specific value is in, so we need to to be careful and validate as much as possible.

I have already introduced test_html_measurement_units

I didn't see that, will check it out.
edit: Now I see. They look the same, which is good. Actually, you could check against the same file with all units, to verify that automatically.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 tests are failing:

test_html_customize_ul
test_html_li_tag_indent
test_html_measurement_units

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough time to deal with things quite right now, so I will be tackling this stuff in a few hours instead.

Yes of course, when the user sets the document units, then unit util.get_scale_factor() is used to populate self.k with the appropriate conversion factor. But if we define built-in constants in mm, then we need to use it directly, to make sure we're independent of the document units. It's a bit tricky to keep track of what unit any specific value is in, so we need to to be careful and validate as much as possible

Where should we get the information about the units from, in that case? Should I introduce a separate FPDF.unit attribute?

Now I see. They look the same, which is good. Actually, you could check against the same file with all units, to verify that automatically

Any advice on how to do that elegantly?
Changing FPDF.k seems inelegant, but I don't see another way of handling this without introducing changes to FDPF. So, I take it, I should introduce FPDF.unit?

3 tests are failing: ...

My bad. I didn't do the tests this time.

EDIT: I don't seem to understand the task regarding the replacement of self.pdf.k with something else when converting constants within html.py.

Yes of course, when the user sets the document units, then unit util.get_scale_factor() is used to populate self.k with the appropriate conversion factor. But if we define built-in constants in mm, then we need to use it directly, to make sure we're independent of the document units

What do we pass to util.get_scale_factor() if we are supposed to not use document units here? What are we trying to convert them to if we aren't expressing the constants in the chosen document units.

If it is not the document units that we want to convert the constants to, then I'm at a loss as to what we are trying to do.

EDIT 2: Going to just update the test files for now.

Will be introducing the FPDF.unit property in the next commit, so that the measurement unit test files can be consolidated into a single one.

Not sure what is required regarding the replacement of self.pdf.k in html.py if we are not supposed to be converting the constants in accordance with the chosen document units, which is currently represented in FPDF objects with the scale factor FPDF.k. Unless stopped, I will be replacing relevant usages of self.pdf.k with get_scale_factor(self.pdf.unit) individually.
Again, though, I am confused as to why we should be replacing the usage of self.pdf.k with anything else there, so I'd like to receive further input regardless.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where should we get the information about the units from, in that case? Should I introduce a separate FPDF.unit attribute?

That would just be a redundant copy of pdf.k. You can convert between all units if you use points as an intermediate value.

  • From current units to points: x * pdf.k
  • From mm to points: x * util.get_scale_factor("mm")
  • From mm to current units: x * util.get_scale_factor("mm") / pdf.k
  • etc.
    It seems to me that you're actually giving the default values in points (30 pts ~= 12 mm). Is this intentional?

Now I see. They look the same, which is good. Actually, you could check against the same file with all units, to verify that automatically

Any advice on how to do that elegantly?

Just use the same file name for all units. They should result in identical output, after all. (If you're very unlucky though, then rounding errors will make them just a tiny bit different...)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that you're actually giving the default values in points (30 pts ~= 12 mm). Is this intentional?

Yes. This way the conversion is just a simple division by self.pdf.k. I am still unsure why we want to replace that with division by util.get_scale_factor(), and what to pass to the function as a parameter, unless I am to introduce FPDF.unit. I do question the utility of making additional function calls when we already store the relevant information in self.pdf.k.

Just use the same file name for all units

The question is how to best go about it? Without storing information about units in FPDF, the solution seems to be to modify pdf.k, and, as you noted, there might be rounding errors.
Introducing the FPDF. unit property attribute with a setter that also modifies FPDF.k introduces some redundancy. I have also encountered some issues with this approach - got misaligned indents and failure to generate pages that use centimetres or inches.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that you're actually giving the default values in points (30 pts ~= 12 mm). Is this intentional?

Yes. This way the conversion is just a simple division by self.pdf.k

Readability (for humans) is much more important than computational efficiency in most cases, and especially for a conversion that happens at most once for every table (in some cases once per program run). And where converting a constant is really too expensive, it needs a comment explaining what unit it is and what that means in real world terms.
Actually, any dimensional constant needs a comment specifying its unit anyway (unless the name already makes it clear).

I am still unsure why we want to replace that with division by util.get_scale_factor()

I explained that several times already:

  • pdf.k is the conversion factor for the current document units. You don't know what the user will supply for those (could be anything), and you don't want to know.
  • You call util.get_scale_factor(<unit>) directly if you need the conversion factor for a fixed and known unit (for constants).

and what to pass to the function as a parameter

Have a look at the definition of util.get_scale_factor(). That might make things more clear.

Just use the same file name for all units

The question is how to best go about it?

Just don't vary the file name for each iteration. There's no problem at all to have several tests compare against the same file.

as you noted, there might be rounding errors.

Why do you let a hypothetical problem stop you from trying? This is a theoretical possibility, but since we only store a very limited number of decimal digits in the PDF file, differences through rounding errors are actually very rare (which is why I wrote "if you're very unlucky").

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained that several times already:
pdf.k is the conversion factor for the current document units. You don't know what the user will supply for those (could be anything), and you don't want to know.
You call util.get_scale_factor(<unit>) directly if you need the conversion factor for a fixed and known unit (for constants).

Do I understand it correctly, then, that I should introduce the attribute FPDF.unit, potentially as a property, and then convert default values of relevant variables with * get_scale_factor('mm') / get_scale_factor(self.pdf.unit)? This approach seems to currently do the same thing as * get_scale_factor('mm') / self.pdf.k while making this conversion potentially not cohere with the scale factor of the document itself. In addition, it seems to contradict what you said previously regarding the addition of the FPDF.unit attribute:

That would just be a redundant copy of pdf.k


Have a look at the definition of util.get_scale_factor(). That might make things more clear.

I have. We need to pass the chosen unit as a parameter, and currently there is no FPDF.attribute, meaning that to get the relevant unit, we would have to compare FPDF.k to possible return values of utils.get_scale_factor().

For the time being, I will assume that I misunderstood something, return the default values of relevant variables to what they were before, and convert them to document units with * get_scale_factor('mm') / self.pdf.k. If need be, I will implement conversion using get_scale factor() instead of self.pdf.k.

Why do you let a hypothetical problem stop you from trying?

Math brain, being used to preferring perfect precision.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I repeat: You don't need and want to care about what the document units actually are (or how they might be named). There's a reason those are not actually stored anywhere. All you need is the conversion factor stored in FPDF.k. This was set once using get_scale_factor(<user_supplied_unit>), and there's no point in repeating that.

You can use the same function to produce conversion factors for units of your own chosing.
Most size values in fpdf2 are stored and passed around in document units. So if you also store your constants in document units, that will save you many repeated conversions when they are used with other modules.

If you look at FPDF.__init__(), you'll find initializations like follows, converting the values from points, but with comments noting the mm/cm sizes:

    self.line_width = 0.567 / self.k  # line width (0.2 mm)
    ...
    # Page margins (1 cm)
    margin = (7200 / 254) / self.k

You can follow this pattern for simplicity, or you can do somethin like this:

    mm_to_doc_u = get_scale_factor("mm") / self.pdf.k 
    XXX = 123 * mm_to_doc_u  # this constant value
    YYY = 456 * mm_to_doc_u  # another constant value

Why do you let a hypothetical problem stop you from trying?

Math brain, being used to preferring perfect precision.

Alas, the real world is utterly imprecise... 😉
The default resolution of the PDF format (eg. for images) is 72 dpi. We position elements more accurately than that, but any difference below 0.01 mm or so has no practical relevance, because it is imperceptible to the human eye.

Minor changes to HTML contents of `test_html_measurement_units()`.
@@ -84,6 +84,10 @@ For more typographical control, you can use the following arguments. Most of tho
* line_height (float, optional) - factor by which the line spacing will be different from the font height. (default: by region)
* top_margin (float, optional) - how much spacing is added above the paragraph. No spacing will be added at the top of the paragraph if the current y position is at (or above) the top margin of the page. (Default: 0.0)
* bottom_margin (float, optional) - Those two values determine how much spacing is added below the paragraph. No spacing will be added at the bottom if it would result in overstepping the bottom margin of the page. (Default: 0.0)
* indent (float, optional): determines the indentation of the paragraph. (Default: 0.0)
* bullet_rel_x_displacement (float, optional) - determines the relative displacement of the bullet along the x-axis. The distance is between the rightmost point of the bullet to the leftmost point of the paragraph's text. (Default: 2.0)
Copy link
Collaborator

@gmischler gmischler Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "bullet_right_margin" describes more accurately what this does?

@@ -84,6 +84,10 @@ For more typographical control, you can use the following arguments. Most of tho
* line_height (float, optional) - factor by which the line spacing will be different from the font height. (default: by region)
* top_margin (float, optional) - how much spacing is added above the paragraph. No spacing will be added at the top of the paragraph if the current y position is at (or above) the top margin of the page. (Default: 0.0)
* bottom_margin (float, optional) - Those two values determine how much spacing is added below the paragraph. No spacing will be added at the bottom if it would result in overstepping the bottom margin of the page. (Default: 0.0)
* indent (float, optional): determines the indentation of the paragraph. (Default: 0.0)
* bullet_rel_x_displacement (float, optional) - determines the relative displacement of the bullet along the x-axis. The distance is between the rightmost point of the bullet to the leftmost point of the paragraph's text. (Default: 2.0)
* bullet_rel_y_displacement (float, optional) - determines the relative displacement of the bullet along the y-axis. The distance is between the topmost point of the bullet and the topmost point of the paragraph's text. (Default: 0.0)
Copy link
Collaborator

@gmischler gmischler Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And "bullet_top_margin" ?
Edit: I just noticed your earlier suggestion to remove this one entirely. That's a good idea, since it isn't really needed at the moment, and might stand in the way of other possible enhancements later.

…associated class attributes and docstring mentions to `bullet_r_margin`, `bullet_t_margin`, `bullet.r_margin`, `bullet.t_margin`.
@@ -352,7 +359,7 @@ def __init__(
# "inserted" is a special attribute indicating that a cell has be inserted in self.table_row

if not tag_indents:
tag_indents = {}
tag_indents = {k: v / self.pdf.k for k, v in DEFAULT_TAG_INDENTS.items()}
if dd_tag_indent != DEFAULT_TAG_INDENTS["dd"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to compare between different units. The user supplied argument is supposed to be in document units, but your currently store the default values in points. For the assignment below you then convert document units to points, which also doesn't looks correct.

You should probably compare against tag_indents instead and not convert the user input.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the assignment below you then convert document units to points, which also doesn't looks correct

I'm not seeing that anywhere.

Although, it seems that I made some changes for naught tonight. More specifically, the conversion of li_tag_indent and dd_tag_indent. That was an error, as those values are only relevant if they differ from the defaults, meaning that they shouldn't be subject to this conversion. I'm going back on those changes.

As such, this note seems to not be correct:

Those are still magic numbers (overly large now for mm), which will vary in their effect depending on the current document units

as the generated files evidently do not differ with change in document units provided that those default values in question are used.

You should probably compare against tag_indents instead and not convert the user input

Assuming you are talking about the change in the if blocks for non-default li_tag_indent and dd_tag_indent, yeah, that was an error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing that anywhere.

You actually do. Your next paragraph explains exactly what the problem is. 👍

As such, this note seems to not be correct:

At that time it wasn't clear to me that the value of 30 is actually in points (problem of readability!).
That works for DEFAULT_TAG_INDENTS, but for li_tag_indent and dd_tag_indent the default unit is mm, which would result in ridiculously large indents.
Actually, since those two parameters are deprecated and their default value should never get used, it might make sense to default them to None, which will make it simpler to determine whether they have been set by the user.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually do. Your next paragraph explains exactly what the problem is

But there was conversion from points to document units, and not the other way around.
In any case, seems like that matter is resolved now anyway.

That works for DEFAULT_TAG_INDENTS, but for li_tag_indent and dd_tag_indent the default unit is mm, which would result in ridiculously large indents

Considering that we only want to deal with those if the user passes the relevant parameters, and in that case we want to handle them regardless of whether or not they are equal to relevant entries in DEFAULT_TAG_INDENTS, I'm making them have the value of None by default, so that's not going to be an issue going forward anyway.

Actually, since those two parameters are deprecated and their default value should never get used, it might make sense to default them to None

Yeah. And there was another reason to handle them that way anyway.

fpdf/html.py Show resolved Hide resolved
self.ol_type = [] # when inside a <ol> tag, can be "a", "A", "i", "I" or "1"
self.bullet = []
if list_vertical_margin is None:
self.list_vertical_margin = 0.3 / self.pdf.k
Copy link
Collaborator

@gmischler gmischler Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This converts 0.3 points (0.12 mm) to document units.

You may have seen self.font_size being converted like this a few times. The pitfall being that self.font_size is derived from pdf.font_size_pt, which actually is in points. We should probably rename self.font_size in this module also to self.font_size_pt to make this more obvious.

Besides the "magical number" problem in this specific instance, there are more cases where you do a similarly incorrect conversion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this conversion incorrect, and in what sense is it a 'magical number' if it's simply the default value for the vertical margin, considering that the conversion is handled?

EDIT: Do you want the default value of HTML2PDF.list_vertical_margin to depend on something else?

Copy link
Collaborator

@gmischler gmischler Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so the conversion is intentional, but then the amount doesn't look like something you'd want.

I think the default here used to be 2 mm (implemented incorrectly without taking document units into account).

If you wanted to specify the same amount in points and convert it to document units for passing on, then 2 * util.get_scale_factor("mm") would result in ~5.67 points, not 0.3.
Your current value amounts to roughly a tenth of a mm, so you could just as well leave the margin away.

And again, a constant like that should come with a comment specifying its unit and how much that is in mm (most people are not aware of how big a typographical point is).

And no, I don't see this depending on any other size. What did you have in mind?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And again, a constant like that should come with a comment specifying its unit and how much that is in mm (most people are not aware of how big a typographical point is)

Given that there were no such comments prior to my intrusion, I assumed that I shouldn't be changing that. Where else should I introduce such comments, if you don't mind me asking?

And no, I don't see this depending on any other size. What did you have in mind?

I am/was confused about why we should make get_scale_factor() calls instead of just using self.pdf.k. Do I understand it correctly now that you want me to return the relevant default values to how they were prior to my changes, and to convert them with * get_scale_factor('mm') / self.pdf.k into the appropriate units?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there were no such comments prior to my intrusion, I assumed that I shouldn't be changing that. Where else should I introduce such comments, if you don't mind me asking?

This module still contains a lot of legacy code of rather low quality. Any improvement to that is welcome. So far we have just fixed the parts we were touching anyway when adding features. I recommend you do the same, in order to keep the focus on your actual goal. At some later point someone will have to go through every line with a fine toothed comb, but that is a separate and rather tedious task, requiring a good grasp of the whole code base.

I've commented on recommended unit use further above.

@lcgeneralprojects
Copy link
Author

Actually, I see an issue with how the default values for li_tag_indent and dd_tag_indent are handled in my code. I am currently unable to dedicate time to fixing it, so that will have to wait until tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Long HTML list entries are not correctly indented
3 participants