Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that header component HTML tags pass HTML validation #235

Merged
merged 7 commits into from
Mar 14, 2024

Conversation

rich-iannone
Copy link
Member

There were unclosed tags when generating a header. This generally causes problems when the output HTML is consumed by another process (expectation is that the input HTML should be valid; browsers can handle invalid HTML to great extent, other programs cannot really).

This PR fixes this issue and also performs a few HTML formatting cleanups at the same time. The output HTML was verified in the Nu HTML checker tool (at https://validator.w3.org/nu) to have been improved from the changes here.

Fixes: #224

And this should fix a user issue in quarto-dev/quarto-cli#8971.

@rich-iannone rich-iannone marked this pull request as ready for review March 8, 2024 17:35
@github-actions github-actions bot temporarily deployed to pr-235 March 8, 2024 19:55 Destroyed
@machow
Copy link
Collaborator

machow commented Mar 13, 2024

Can you paste in what you saw that indicated something was fixed? Either the output of a tiny reprex, or a screenshot of what you looked at that indicated things were fixed?

@machow
Copy link
Collaborator

machow commented Mar 13, 2024

When reviewing this PR, I see 3 changed snapshots. Without guidance in the PR description, it's not clear which ones I should be looking at (and where in the snapshot I should be looking).

For example, this is what I see for test_formats.ambr. I'm not sure if I'm supposed to scan it manually, or if most of the changes here are just linebreak related):

image

<th colspan="{n_cols_total}" class="gt_heading gt_title gt_font_normal">{title}
</tr>"""
)

if has_subtitle:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I'm adding the closing </th> tags and also improving somewhat the formatting of the output HTML (makes it easier to visually inspect the generated HTML). Snapshots had to be regenerated for this, though it had to happen anyway because of the addition of closing tags.

@@ -21,6 +21,10 @@ def create_heading_component_h(data: GTData) -> StringBuilder:
if not has_title and not has_subtitle:
return result

# Raise an error if there is a subtitle but no title
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an additional check here to raise in the case that subtitle text is available in the absence of title text. While this may be also caught within the .tab_header() method, I think it's good to have a second check here in case there is manual addition/removal of title/subtitle (manipulations done outside the prescribed API).

@@ -154,8 +162,8 @@ def create_columns_component_h(data: GTData) -> str:
)
)

# Join the <th> cells into a string and separate each with a newline
th_cells = "\n".join([str(tag) for tag in table_col_headings])
# Join the <th> cells into a string and begin each with a newline
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An opportunity was taken here to beautify the HTML output of the column labels portion since snapshots had to be regenerated anyway.

@rich-iannone
Copy link
Member Author

There are generally quite a few HTML validation problems with the generated output but this PR does address the closing of <th> values in the title block. Here are before and after screenshots of the changes.

Before PR changes:

html-before

After:

html-after

I was going to attach a screenshot of the HTML checker results but since there are quite a few problems to be resolved (beyond the scope of this PR) the failure of not having closed <th> is masked by preceding validation failures.

<th colspan="4" class="gt_heading gt_title gt_font_normal">Data listing from <strong>exibble</strong>
<thead class="gt_header">
<tr>
<th colspan="4" class="gt_heading gt_title gt_font_normal">Data listing from <strong>exibble</strong></th>
Copy link
Collaborator

@machow machow Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the test that changed, based on this fix. There is a closing </th> element now in this title (and subtitle below).

<th colspan="4" class="gt_heading gt_subtitle gt_font_normal gt_bottom_border"><code>exibble</code> is a <strong>Great Tables</strong> dataset.</th>
</tr>
</thead>
<tr class="gt_col_headings">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the rest of snapshot test changes are just stylistic

@machow
Copy link
Collaborator

machow commented Mar 14, 2024

Thanks, this is super helpful! Based on these notes, I was able to find the snapshot that updated with the added </th>.

Copy link
Collaborator

@machow machow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@machow machow merged commit 420d5f2 into main Mar 14, 2024
7 checks passed
@rich-iannone rich-iannone deleted the header-html-fixes branch March 14, 2024 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Table header (by way of .tab_header()) generates malformed and invalid HTML
2 participants