v2.0.5 — hreflang false positives fixed (logic-lens scan)

@adityaarsharma adityaarsharma released this 05 Jun 04:35
· 18 commits to main since this release

This release fixes two sources of false positives in the hreflang audit, both caught by an end-to-end logic-lens scan on cloudflare.com.

Bug 1 — VALID_HREFLANG_RE rejected lowercase region codes

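BCP 47 tags are case-insensitive; writing the region subtag in uppercase is only a convention. Google accepts either case, and major sites (Cloudflare among them) use lowercase: de-de, fr-fr, zh-cn, pt-br. The old pattern required uppercase regions, which produced 262 false positives on cloudflare.com alone. Fix: compile the pattern with re.IGNORECASE so both cases pass. en-USA and en_US are still correctly rejected as invalid.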
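A minimal sketch of the case-insensitive check, for illustration only. The pattern below is an assumption, not the project's actual VALID_HREFLANG_RE, and the helper name is_valid_hreflang is hypothetical. It accepts x-default or a 2-3 letter language, an optional 4-letter script, and an optional 2-letter region.

```python
import re

# Hypothetical pattern; the real VALID_HREFLANG_RE in the project may differ.
# re.IGNORECASE lets both "de-DE" and "de-de" pass.
VALID_HREFLANG_RE = re.compile(
    r"x-default|[a-z]{2,3}(?:-[a-z]{4})?(?:-[a-z]{2})?",
    re.IGNORECASE,
)


def is_valid_hreflang(code: str) -> bool:
    """Return True if the whole string matches the hreflang pattern."""
    return VALID_HREFLANG_RE.fullmatch(code) is not None


if __name__ == "__main__":
    for code in ["de-de", "pt-BR", "x-default", "en-USA", "en_US"]:
        print(code, is_valid_hreflang(code))
```

Using fullmatch rather than match keeps a three-letter region such as en-USA from slipping through on a partial match.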

Bug 2 — hreflang_conflicts_lang_attr matched x-default as self-entry

On home pages, the canonical-language entry and the x-default entry often point at the same URL. The next() call that selected self_entry sometimes returned the x-default entry, and "x-default".startswith("en") evaluating to False then triggered a bogus conflict. This caused 261 false positives on cloudflare.com alone. Fix: iterate over all self-matching entries, skip x-default, and pick the first real-language entry. When only x-default matches, the conflict check is skipped.
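The selection logic described above can be sketched roughly as follows. This is not the project's code: the function names, the (hreflang, href) tuple shape, and the comparison against the primary language subtag are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

Entry = Tuple[str, str]  # (hreflang code, href); hypothetical data shape


def pick_self_entry(entries: Iterable[Entry], page_url: str) -> Optional[str]:
    """Return the first real-language code that points at page_url."""
    for code, href in entries:
        # Skip x-default even when it shares the page's own URL.
        if href == page_url and code.lower() != "x-default":
            return code
    return None


def conflicts_with_lang_attr(
    entries: Iterable[Entry], page_url: str, html_lang: str
) -> bool:
    """Report a conflict only when a real-language self entry disagrees."""
    self_code = pick_self_entry(entries, page_url)
    if self_code is None:
        # Only x-default (or nothing) matched: skip the check entirely.
        return False
    primary = html_lang.lower().split("-")[0]
    return not self_code.lower().startswith(primary)


if __name__ == "__main__":
    home = "https://example.com/"
    entries = [("x-default", home), ("en", home), ("de-de", home + "de-de/")]
    print(conflicts_with_lang_attr(entries, home, "en"))
```

Because the loop filters x-default before returning, the order of entries in the page no longer affects which one is treated as the self entry.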

Result

cloudflare.com 534-page audit:

  • v2.0.4: 936 findings, 23 check classes (hreflang_invalid_codes: 262, hreflang_conflicts_lang_attr: 261)
  • v2.0.5: ~400 findings, 21 check classes (hreflang_invalid_codes: 0, hreflang_conflicts_lang_attr: 0)

The remaining ~400 findings are all genuine: real sitemap-noindex entries, real JS redirects, real performance issues. The zero-false-positive target is met for hreflang.