Fix HTML5 spec violations and flip validator to hard-fail (closes #37)#42

Merged: jmxpearson merged 1 commit into master from claude/issue-37-html5-validator-tightening on May 12, 2026
@jmxpearson (Member)

Closes #37.

Summary

The Validate HTML step in site-health.yml was set to continue-on-error: true because the templates produced a flood of HTML5 spec warnings. This PR triages the actual output of vnu (the Nu HTML Checker, the same tool the action wraps) and fixes the real bugs, so the step now hard-fails on regressions.

What was actually broken

Layouts:

  • default.html, blogpost.html, home.html had no <!DOCTYPE html> (the root cause of many cascading parser errors)
  • All three put {% include footer.html %} outside <body> (invalid)
  • _layouts/home.html had literal commas between img attributes (<img src="...", width="...", style="...">), width="25%" (percentages are not allowed in the width attribute), a missing alt, and bottom: 20 (no unit)
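For reviewers, the corrected layout shape looks roughly like the sketch below. It is illustrative only and is not the actual _layouts/home.html: the lang attribute, the head.html include call, the image path, the alt wording, and the px unit are all assumptions; only the doctype, the footer position inside <body>, and the kinds of img fixes come from the list above.

```html
<!DOCTYPE html>
<!-- Illustrative sketch only; not the real layout file. -->
<html lang="en">
  <head>
    {% include head.html %}
  </head>
  <body>
    {{ content }}

    <!-- Before: <img src="...", width="25%", style="... bottom: 20"> -->
    <!-- After: no commas, alt present, percent width moved into CSS,
         and a unit on bottom (px assumed; other declarations omitted). -->
    <img src="/images/example.png" alt="Example logo"
         style="width: 25%; bottom: 20px;">

    {% include footer.html %}
  </body>
</html>
```

With the doctype present the page renders in standards mode rather than quirks mode, so layout that relied on quirks behavior may shift.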

Includes:

  • head.html was emitting two <title> tags — one via {% seo %} (jekyll-seo-tag) and one hardcoded. Removed the duplicate. Also added a page.extra_head injection point.
  • jumbotron.html and blog_image.html used the obsolete <font> element — replaced with CSS.
  • _includes/person.html was missing alt on the avatar — added alt={{ include.name }}.
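A minimal sketch of how the two include changes might look, under stated assumptions: that page.extra_head carries raw HTML from the page's front matter, and that the avatar path arrives in a parameter here called include.image (a hypothetical name). The quoted, escaped alt value is also an assumption about the final markup, chosen because an unquoted attribute value would be cut off at the first space in a name.

```html
{% comment %} _includes/head.html (sketch; surrounding tags omitted) {% endcomment %}
{% seo %}
{% comment %} Assumption: extra_head holds raw HTML set in the page's front matter. {% endcomment %}
{% if page.extra_head %}{{ page.extra_head }}{% endif %}

{% comment %} _includes/person.html (sketch; include.image is a hypothetical parameter) {% endcomment %}
<img src="{{ include.image }}" alt="{{ include.name | escape }}">
```

Under that assumption, a page would set extra_head in its front matter to something like a <link rel="stylesheet"> tag pointing at its own stylesheet.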

Pages:

  • about.md, location.md (4 photos): added alt text, swapped invalid width="200px" / width="500" for sensible equivalents.
  • people.html, publications.html: moved inline <style> blocks (forbidden as child of <div> per HTML5) to css/people.css and css/publications.css, loaded via the new page.extra_head mechanism.
  • blog.html: was wrapping {{ post.excerpt }} in <p>...</p> even though Jekyll already wraps excerpts in <p>, producing <p><p>...</p></p>. Removed the redundant wrapper.
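The blog.html change can be sketched as follows. The loop and heading markup are assumptions for illustration; only the handling of post.excerpt reflects the fix described above. A second <p> start tag implicitly closes the first paragraph, so the wrapper's closing </p> had no open paragraph left to match, which is what the checker reports.

```html
{% comment %} blog.html (sketch; loop and heading markup are assumptions) {% endcomment %}
{% for post in site.posts %}
  <h2><a href="{{ post.url }}">{{ post.title }}</a></h2>

  {% comment %}
    Before: <p>{{ post.excerpt }}</p>
    The excerpt already arrives wrapped in <p>, so that rendered as <p><p>...</p></p>.
  {% endcomment %}
  {{ post.excerpt }}
{% endfor %}
```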

Blog posts whose images leak into the blog index excerpt (5 posts): added alt text and converted width="X%" to inline style. These had to be fixed even though some are in the legacy archive, because the index aggregates excerpts.

Config

  • .html5validator.yaml: blacklist: [2015, 2016, 2017, 2018] skips the legacy archive (matches any directory with those names; covers both _site/blog/2015/... and _site/2015/... from a few placeholder posts).
  • site-health.yml: dropped continue-on-error: true from Validate HTML.

Test plan

  • bundle exec jekyll build clean
  • vnu run against all non-legacy HTML files → 0 errors (was 53 before fixes, then 1)
  • lychee still passes (89 OK / 0 errors)
  • Pre-commit hooks pass on the diff
  • CI pre-commit and site-health pass on the PR
  • Visual spot-check post-merge: home, people, research, location, publications, blog index — make sure the CSS migration didn't change anything noticeable

Notes

  • The 3 "Sample Blog Post" placeholder posts under _posts/2015-11-02-sample_blogpost.md, _posts/2015-11-03-sample_blogpost.md, _posts/2015-11-03-sample_blogpost2.md are pure scaffolding ("Blog post blog post. Blog post blog post.") that ended up at _site/2015/.../ because their category: is commented out. Excluded via the blacklist for now — could be deleted in a follow-up.
  • HTML5 lets <style> live in <head> only; placing it in <body> flow content is invalid even when not nested in a <div>. The page.extra_head mechanism added here also unlocks per-page meta tags, link rels, etc. if needed in the future.

https://claude.ai/code/session_01S5QXfkxZBNSAf2Y1XAD8H7


Generated by Claude Code

Real HTML5 bugs fixed (per vnu Nu HTML Checker output):

Layouts (_layouts/*.html):
- Add <!DOCTYPE html> to default.html, blogpost.html, and home.html
  (missing doctype was the cascade cause of many parser errors).
- Move {% include footer.html %} inside <body> (was placed outside).
- home.html: fix attribute comma typos (img src=, width=, style= had
  literal commas between attributes), replace width="25%" with inline
  style, add alt text, fix "bottom: 20" missing unit.

Includes (_includes/*.html):
- head.html: remove duplicate <title> tag (jekyll-seo-tag already
  injects one via {% seo %}); add page.extra_head injection point.
- jumbotron.html: replace obsolete <font> element with inline style.
- blog_image.html: replace <font size="1"> and align="bottom" with
  caption-side / font-size CSS.
- person.html: add alt={{ include.name }} to the avatar img.

Pages:
- about.md, location.md (4 photos): add alt text, replace width="200px"
  / width="500" attrs with sane equivalents.
- people.html, publications.html: move inline <style> blocks (which
  HTML5 forbids inside <div>) to dedicated css/people.css and
  css/publications.css, loaded via page.extra_head front-matter.
- blog.html: drop redundant <p>...</p> wrapping around post.excerpt
  (Jekyll already wraps excerpts in <p>, producing invalid <p><p>...).

Blog posts whose <img> tags leak into the blog index excerpt:
- 2015-11-06-announcing_plab.md, 2015-11-13-big-data-nih.md,
  2018-10-30-high-throughput-legal-decisions.md,
  2018-12-5-incubator-award.md, 2019-7-26-huang-poster.md: add alt
  attributes and convert width="X%" to inline style.

Config:
- .html5validator.yaml: blacklist directory names 2015/2016/2017/2018
  to skip the legacy archive (both _site/blog/2015/... and _site/2015/
  exist; blacklist matches dir names anywhere).
- site-health.yml: remove continue-on-error: true from the Validate
  HTML step — now hard-fails on new regressions.

Local: 0 vnu errors across active pages; 0 lychee errors.
jmxpearson merged commit dee61d5 into master on May 12, 2026 (2 checks passed)
jmxpearson pushed a commit that referenced this pull request on May 12, 2026:
The home layout's body had `background-size: 80%` with no height set.
In quirks mode (pre-PR-#42, when no doctype existed) body filled the
viewport by default, so 80% scaled the [λ] logo nicely against the
full window. After #42 added the DOCTYPE, standards mode kicked in:
body shrinks to content height, and 80% scales against a much smaller
body, which pushed the floated DUSOM image into visual overlap with
the (now smaller) central [λ].

Setting min-height: 100vh restores the original visual.
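
A minimal sketch of the fix, assuming the rule targets body and sits in the home layout's <head> or a linked stylesheet (the actual location and the other declarations are not shown in this message):

```html
<style>
  /* Sketch only: selector placement and omitted declarations are assumptions. */
  body {
    background-size: 80%;
    min-height: 100vh; /* body spans at least the viewport height in standards mode */
  }
</style>
```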
Linked issue: Tighten HTML5 validation in site-health CI (#37)