ci: handle <sitemapindex> in check-links workflow #5334

Merged
davidkonigsberg merged 1 commit into main from
devin/1777896104-fix-check-links-sitemapindex
May 4, 2026

@davidkonigsberg
Contributor

Summary

The Check Links workflow has been failing since 2026-05-04 with:

No links were found. This usually indicates a configuration error.

Root cause: https://buildwithfern.com/learn/sitemap.xml recently changed shape from a flat <urlset> to a <sitemapindex> that points to per-language sub-sitemaps:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://buildwithfern.com/learn/sitemap-en.xml</loc></sitemap>
  <sitemap><loc>https://buildwithfern.com/learn/sitemap-zh.xml</loc></sitemap>
</sitemapindex>

The previous step extracted <loc> entries directly, so urls.txt ended up containing only the two child-sitemap URLs (XML files), not the actual page URLs. lychee then fetched those XML files, found 0 HTML links, and bailed out with failIfEmpty.
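
To make the failure mode concrete, here is a minimal sketch in Python. The workflow's actual extraction step is not shown in this description, so `extract_locs` is a hypothetical stand-in that collects every <loc> without checking the root element. Run against the index above, it returns the two child-sitemap URLs rather than any page URLs.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# The sitemap index quoted in this PR description.
SITEMAP_INDEX = """\
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://buildwithfern.com/learn/sitemap-en.xml</loc></sitemap>
  <sitemap><loc>https://buildwithfern.com/learn/sitemap-zh.xml</loc></sitemap>
</sitemapindex>
"""


def extract_locs(xml_text: str) -> list[str]:
    """Hypothetical stand-in for the old step: take every <loc>, ignoring the root type."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]


naive_urls = extract_locs(SITEMAP_INDEX)
for url in naive_urls:
    print(url)  # each entry is a child sitemap (an XML file), not a page
```

Because every entry points at an XML file, a link checker fed this list has no HTML to scan, which matches the "No links were found" error.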

Fix: Recursively expand <sitemapindex> entries into their child sitemaps' <loc> URLs before writing urls.txt. Behaviour is unchanged for the legacy flat-<urlset> shape, so this is forward-compatible whether the platform serves an index or a flat sitemap.

Verified locally — the new logic produces 325 URLs from the current sitemapindex (vs. 2 before), and exits with a clear error if the sitemap is genuinely empty instead of silently producing an empty urls.txt.
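
The expansion logic can be sketched roughly as follows. This is an illustrative Python version, not the code merged in this PR: the names `expand_sitemap` and `collect_page_urls`, the `max_depth` guard, and the `example.com` fixtures are all assumptions made for the example. The `fetch` parameter is injected so the sketch runs without network access; a real caller might pass a thin wrapper around `urllib.request.urlopen`.

```python
import xml.etree.ElementTree as ET
from typing import Callable

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def expand_sitemap(url: str, fetch: Callable[[str], str], max_depth: int = 3) -> list[str]:
    """Return page URLs for `url`, following <sitemapindex> entries recursively."""
    root = ET.fromstring(fetch(url))
    locs = [el.text.strip() for el in root.iter(f"{NS}loc") if el.text and el.text.strip()]
    if root.tag != f"{NS}sitemapindex":
        return locs  # flat <urlset>: the <loc> values are already page URLs
    if max_depth <= 0:
        raise RuntimeError(f"sitemap index nesting too deep at {url}")
    pages: list[str] = []
    for child_url in locs:  # each <loc> in an index is another sitemap
        pages.extend(expand_sitemap(child_url, fetch, max_depth - 1))
    return pages


def collect_page_urls(url: str, fetch: Callable[[str], str]) -> list[str]:
    """Expand the sitemap and fail loudly instead of returning an empty list."""
    urls = expand_sitemap(url, fetch)
    if not urls:
        raise SystemExit(f"No page URLs found in {url}")
    return urls


# Hypothetical fixtures: an index with one empty child and one populated child.
XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
FAKE_SITE = {
    "https://example.com/sitemap.xml": (
        f"<sitemapindex {XMLNS}>"
        "<sitemap><loc>https://example.com/sitemap-en.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/sitemap-zh.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/sitemap-en.xml": f"<urlset {XMLNS}/>",
    "https://example.com/sitemap-zh.xml": (
        f"<urlset {XMLNS}>"
        "<url><loc>https://example.com/zh/welcome</loc></url>"
        "<url><loc>https://example.com/zh/quickstart</loc></url>"
        "</urlset>"
    ),
}

pages = collect_page_urls("https://example.com/sitemap.xml", FAKE_SITE.__getitem__)
print("\n".join(pages))
```

Checking the root tag rather than assuming an index is what keeps the flat-<urlset> case unchanged: a plain sitemap takes the early return and never recurses.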

Review & Testing Checklist for Human

  • Trigger the workflow manually via workflow_dispatch on this branch and confirm it gets past the Check non-GitHub links step (i.e. lychee's "Total" is non-zero).
  • Confirm the run completes without the failIfEmpty error.

Notes

While diagnosing this, I noticed a separate issue worth flagging: https://buildwithfern.com/learn/sitemap-en.xml is currently returning an empty <urlset>, while sitemap-zh.xml correctly contains 325 URLs. With this PR's fix, the workflow will exercise the Chinese pages (which still cover the same docs structure), but English pages won't be exercised until the platform is fixed. Worth investigating in fern-platform separately — likely related to the same change that introduced the sitemapindex.

Link to Devin session: https://app.devin.ai/sessions/f71d9cbb5efc401799576519b960ef05
Requested by: @davidkonigsberg

Co-Authored-By: David Konigsberg <davidakonigsberg@gmail.com>
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions
Contributor

github-actions Bot commented May 4, 2026

@davidkonigsberg davidkonigsberg merged commit 56c0656 into main May 4, 2026
2 of 3 checks passed
@davidkonigsberg davidkonigsberg deleted the devin/1777896104-fix-check-links-sitemapindex branch May 4, 2026 12:06