Fetch link titles during build to populate a[title] #63

Open
numist opened this issue May 31, 2023 · 2 comments

@numist
Owner

numist commented May 31, 2023

JavaScript in the browser can't read pages across origins, but Ruby at build time sure can.

Populating a[title] at build-time with the titles of the link targets would significantly help with accessibility. Bonus points if it updates the Markdown (only in dev?) so the results can be committed.

This infrastructure could also be a good foundation for validating external links in our never-ending battle against bit rot.
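A liveness check could start from something like the sketch below. It assumes a HEAD request is enough to tell whether a link is alive (some servers reject HEAD, so a GET fallback may be needed), and it treats any network error as dead. `classify` and `link_status` are hypothetical names, not part of Jekyll.

```ruby
require 'net/http'
require 'uri'

# Map an HTTP status code to a coarse liveness verdict.
def classify(code)
  case code.to_i
  when 200..299 then :alive
  when 300..399 then :redirected
  else :dead
  end
end

# Issue a HEAD request and classify the response.
# Unparseable URLs, non-HTTP schemes, and network errors all count as :dead.
def link_status(url)
  uri = URI.parse(url)
  return :dead unless uri.is_a?(URI::HTTP) && uri.host

  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: 5, read_timeout: 5) do |http|
    http.head(uri.request_uri)
  end
  classify(response.code)
rescue StandardError
  :dead
end
```

Reporting `:redirected` separately from `:alive` leaves room to flag links that have moved, which is often the first sign of rot.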

@numist
Owner Author

numist commented Jul 4, 2023

Doing this on every Jekyll build might be a bit much, especially if it also checks external links for liveness; maybe this is a good candidate for a script? Then it could:

  • Run incrementally against only external links in dirtied files
  • Be invoked in a targeted way by Jekyll when it comes across a link without a title
  • Run on every build to check all internal links for liveness
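The incremental case in the list above might look roughly like this as a standalone script. It is a sketch under two assumptions: the site lives in a git checkout, and "dirtied" means "differs from HEAD". `untitled_external_links` and `dirtied_markdown_files` are hypothetical helpers.

```ruby
# Inline Markdown links with an http(s) target and no title yet.
UNTITLED_LINK = /\[([^\]]+)\]\((https?:[^)"\s]+)\)/

# Return the distinct external URLs in a Markdown string that still lack a title.
def untitled_external_links(markdown)
  markdown.scan(UNTITLED_LINK).map { |_text, url| url }.uniq
end

# Markdown files with uncommitted changes relative to HEAD (requires git).
def dirtied_markdown_files
  `git diff --name-only HEAD`.lines.map(&:chomp).select { |path| path.end_with?('.md') }
end

# Example driver, left commented out so the helpers can be loaded on their own:
# dirtied_markdown_files.each do |path|
#   next unless File.exist?(path)
#   untitled_external_links(File.read(path)).each { |url| puts "#{path}: #{url}" }
# end
```

Links that already carry a title do not match the pattern, so rerunning the script after a commit only picks up new work.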

@numist
Owner Author

numist commented Apr 27, 2024

Restricting the title fetch to Jekyll.env == "development" (which is read from JEKYLL_ENV) is probably good enough.

Possible starting point:

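require 'nokogiri'
require 'open-uri'

# :post_read fires once pages are loaded; at :after_reset site.pages is still empty.
Jekyll::Hooks.register :site, :post_read do |site|
  # Run only in development. Use `next`: `return` inside a hook block raises LocalJumpError.
  next unless Jekyll.env == "development"

  site.pages.each do |page|
    next unless page.path.end_with?(".md")

    filename = File.join(site.source, page.path)
    content = File.read(filename)
    # Match [text](http...) links that do not have a title yet
    updated_content = content.gsub(/\[([^\]]+)\]\((http[^)"\s]+)\)/) do |match|
      text = $1
      url = $2
      title = fetch_title(url)
      title ? "[#{text}](#{url} \"#{title}\")" : match
    end

    # Write changes back to the disk only if there were changes
    File.write(filename, updated_content) if content != updated_content
  end
end

# Fetch the <title> of a page; returns nil on any network or parse failure.
def fetch_title(url)
  doc = Nokogiri::HTML(URI.open(url, open_timeout: 5, read_timeout: 5))
  # Collapse whitespace so multi-line titles stay on one line
  title = doc.at('title')&.text&.gsub(/\s+/, ' ')&.strip
  return nil if title.nil? || title.empty?

  # Escape double quotes so the title cannot terminate the Markdown "..." early
  title.gsub('"') { '\"' }
rescue StandardError
  nil
end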
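
To exercise the rewrite without touching the network, the substitution could be pulled into a function that takes the title lookup as a block. This is only a sketch: `add_link_titles` is a hypothetical helper, and the block stands in for whatever actually fetches the title.

```ruby
# Add titles to untitled http(s) inline links.
# The block receives each URL and returns a title string, or nil to leave the link alone.
def add_link_titles(markdown)
  markdown.gsub(/\[([^\]]+)\]\((http[^)"\s]+)\)/) do |match|
    text = Regexp.last_match(1)
    url = Regexp.last_match(2)
    title = yield(url)
    next match if title.nil? || title.strip.empty?

    # Escape double quotes so the title stays inside its Markdown delimiters
    safe = title.strip.gsub('"') { '\"' }
    "[#{text}](#{url} \"#{safe}\")"
  end
end

# Usage with a canned lookup table instead of real HTTP requests:
# titles = { "https://example.com" => "Example Domain" }
# add_link_titles("[ex](https://example.com)") { |url| titles[url] }
```

Because links that already have a title no longer match the pattern, running the function over its own output leaves those links unchanged.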
