πŸ‘©β€πŸ’» Improve HTML processing #680

Merged
merged 2 commits into from
Oct 17, 2023

Collaborator

@rowanc1 rowanc1 commented Oct 16, 2023

This improves much of the HTML processing so that HTML is parsed into the AST before any other transformations run. For example, <figure><img src=""><figcaption>The caption</figcaption></figure> is now processed into the correct container. HTML links that point directly to that caption also work and are processed as normal.

The IDs are required to be on the figure, not the image, but that seems reasonable.
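
To make the figure case concrete, here is a minimal sketch of the kind of mapping described above. It is not the code from this PR: the `AstNode` and `ParsedFigure` types, the `figureToContainer` helper, and the exact node names (`container`, `image`, `caption`) are assumptions made for illustration. The only behaviour taken from the description is that the ID is read from the figure, not from the image.

```typescript
// Simplified node shape for this sketch only; not the real MyST AST types.
type AstNode = {
  type: string;
  kind?: string;
  identifier?: string;
  url?: string;
  value?: string;
  children?: AstNode[];
};

// Hypothetical, already-parsed view of a <figure> element.
type ParsedFigure = { id?: string; imgSrc: string; caption: string };

// Hypothetical helper: build a figure container from the parsed element.
// The identifier comes from the <figure> itself, never from the <img>.
function figureToContainer(fig: ParsedFigure): AstNode {
  const container: AstNode = {
    type: 'container',
    kind: 'figure',
    children: [
      { type: 'image', url: fig.imgSrc },
      {
        type: 'caption',
        children: [
          { type: 'paragraph', children: [{ type: 'text', value: fig.caption }] },
        ],
      },
    ],
  };
  if (fig.id) container.identifier = fig.id;
  return container;
}

const figureNode = figureToContainer({
  id: 'my-fig',
  imgSrc: 'plot.png',
  caption: 'The caption',
});
console.log(JSON.stringify(figureNode, null, 2));
```

Keeping the identifier on the container means a link to `#my-fig` can target the whole figure rather than only the image inside it.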

cc @dellaert, who is motivated by having HTML that works in Google Colab but also works with MyST for LaTeX export.

@fwkoch there are probably a number of other changes incoming for the HTML processing along these lines, but I think we can get them in iteratively.

Fixes:

@rowanc1 rowanc1 requested a review from fwkoch October 16, 2023 21:53
Comment on lines +177 to +183
{
  type: 'html',
  value: '<hr>',
},
{
  type: 'html',
  value: '<br>',
Collaborator Author

Self-closing HTML values such as <hr> and <br> are now correctly added.
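
As an illustration of what counts as a self-closing value, the sketch below checks whether an `html` node's `value` is a single void tag. The `isVoidTag` helper and the regular expression are hypothetical and are not taken from this PR; the element list is the set of void elements defined by the HTML standard.

```typescript
// Void elements per the HTML standard: they never take a closing tag.
const VOID_ELEMENTS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
  'link', 'meta', 'source', 'track', 'wbr',
]);

// Hypothetical helper: true when the value is exactly one void tag,
// e.g. '<hr>', '<br/>', or '<img src="a.png">'.
function isVoidTag(value: string): boolean {
  const match = /^<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>$/.exec(value.trim());
  if (match === null) return false;
  return VOID_ELEMENTS.has((match[1] ?? '').toLowerCase());
}

console.log(isVoidTag('<hr>'), isVoidTag('<br/>'), isVoidTag('<div>'));
```

A check like this lets a tag be treated as complete on its own instead of waiting for a closing tag that never arrives.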

Comment on lines +101 to +106
selectAll('paragraph > htmlParsed', tree).forEach((parsed) => {
  const node = parsed as GenericParent;
  if (node?.children?.length === 1 && node.children[0].type === 'paragraph') {
    node.children = node.children[0].children as GenericNode[];
  }
});
Collaborator Author

By default, parsed HTML is wrapped in a paragraph. If the parsed HTML is already contained in a paragraph, this strips the redundant inner one.
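
The effect of that unwrapping can be shown with a dependency-free sketch. It mirrors the logic of the snippet under review but replaces `selectAll` with a small recursive walk; the `TreeNode` type and the `unwrapNestedParagraphs` name are assumptions made for this example, not part of the PR.

```typescript
// Simplified node shape for this sketch only.
type TreeNode = { type: string; value?: string; children?: TreeNode[] };

// For every htmlParsed node directly inside a paragraph: if its only child
// is itself a paragraph, replace that child with the child's own children.
function unwrapNestedParagraphs(node: TreeNode, parent?: TreeNode): void {
  const only = node.children?.length === 1 ? node.children[0] : undefined;
  if (
    node.type === 'htmlParsed' &&
    parent?.type === 'paragraph' &&
    only?.type === 'paragraph'
  ) {
    node.children = only.children ?? [];
  }
  for (const child of node.children ?? []) unwrapNestedParagraphs(child, node);
}

const exampleTree: TreeNode = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        {
          type: 'htmlParsed',
          children: [
            { type: 'paragraph', children: [{ type: 'text', value: 'hi' }] },
          ],
        },
      ],
    },
  ],
};

unwrapNestedParagraphs(exampleTree);
console.log(JSON.stringify(exampleTree, null, 2));
```

After the call, the `htmlParsed` node holds the text node directly, so the output no longer nests a paragraph inside a paragraph.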

Collaborator

@fwkoch fwkoch left a comment

Nice! Any implications of pulling the reconstructHtmlPlugin into the default behaviour? Probably not; it will just make the htmlPlugin better 🚀. Other than that, this PR adds: ✅ more custom handling of HTML, ✅ better handling of self-closing tags, and ✅ a small paragraph-nesting tweak. Looks good!

@rowanc1 rowanc1 merged commit 8bd4ee2 into main Oct 17, 2023
3 checks passed
@rowanc1 rowanc1 deleted the feat/html branch October 17, 2023 02:04