Update guides generation to use Nokogiri's HTML5 parser #48521

Merged

@flavorjones flavorjones commented Jun 19, 2023

Motivation / Background

Use Nokogiri::HTML5 to generate the guides. This doesn't meaningfully change the output, but will help ensure we're emitting valid HTML5 markup. A followup commit could reasonably rip out the W3C validator, I think.

Note that the most frequent changes to the output are:

  • attribute values, most often data-clipboard-text. The HTML5 spec prescribes entity-escaping fewer characters in attribute values than libxml2 did, and wraps attribute values in double-quotes. In particular, > and < are not escaped per the HTML5 spec, which may be surprising to look at if you're used to libxml2's output.
  • line breaks differ for some HTML elements; lists and tables in particular have fewer line breaks between elements.

Detail

This Pull Request replaces calls to Nokogiri::HTML.fragment with a call to a new private method, html_fragment, which uses Nokogiri::HTML5.fragment if HTML5 is available, else Nokogiri::HTML4.fragment.

Additional information

Part of an ongoing effort to update Rails to use HTML5.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

Note that the most frequent changes to the output are:

- attribute values, most often data-clipboard-text. libgumbo
  entity-escapes fewer characters in attribute values than libxml2, and
  wraps them in double-quotes. In particular, `>` and `<` are not
  escaped per the HTML5 spec.
- linebreaks are different for some HTML elements, particularly lists.
@rails-bot rails-bot bot added the docs label Jun 19, 2023
@guilleiguaran guilleiguaran merged commit e05245d into rails:main Jun 20, 2023
8 of 9 checks passed
@flavorjones flavorjones deleted the flavorjones-update-rails-guides-to-html5 branch June 20, 2023 01:42