v2.0.5 — hreflang false positives fixed (logic-lens scan)
This release fixes two sources of false positives in the hreflang audit, both caught by an end-to-end logic-lens scan of cloudflare.com.
Bug 1 — VALID_HREFLANG_RE rejected lowercase region codes
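BCP 47 recommends an uppercase region subtag (de-DE) as a convention but treats tags as case-insensitive, and Google accepts both cases. Major sites (Cloudflare, etc.) use lowercase: de-de, fr-fr, zh-cn, pt-br. The old pattern flagged all of these, producing 262 false positives on cloudflare.com alone. Fix: compile the pattern with re.IGNORECASE so both cases are accepted. en-USA and en_US are still correctly rejected as invalid.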
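A minimal sketch of the case-insensitive check, for illustration only. The pattern body below is an assumption (the real VALID_HREFLANG_RE is not shown in these notes), and is_valid_hreflang is a hypothetical helper name. It assumes a 2-3 letter language, an optional 4-letter script, an optional 2-letter region, plus x-default.

```python
import re

# Hypothetical pattern: the project's actual VALID_HREFLANG_RE may differ.
# language (2-3 letters), optional script (4 letters), optional region (2 letters),
# or the special value x-default. re.IGNORECASE accepts de-DE and de-de alike.
VALID_HREFLANG_RE = re.compile(
    r"x-default|[a-z]{2,3}(?:-[a-z]{4})?(?:-[a-z]{2})?",
    re.IGNORECASE,
)


def is_valid_hreflang(code: str) -> bool:
    """Return True if the whole string is an acceptable hreflang value."""
    # fullmatch avoids partial matches such as "en" inside "en-USA".
    return VALID_HREFLANG_RE.fullmatch(code) is not None


if __name__ == "__main__":
    for code in ["de-de", "pt-BR", "x-default", "en-USA", "en_US"]:
        print(code, is_valid_hreflang(code))
```

Using fullmatch rather than match keeps the three-letter region (en-USA) and the underscore separator (en_US) rejected even with case-insensitivity enabled.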
Bug 2 — hreflang_conflicts_lang_attr matched x-default as self-entry
On home pages, the canonical-language entry and x-default often point at the same URL. The next() lookup sometimes picked x-default as self_entry, and "x-default".startswith("en") then evaluated to False, which triggered a bogus conflict. This produced 261 false positives on cloudflare.com alone. Fix: iterate over all self-matching entries, skip x-default, and pick the first real-language entry. When x-default is the only match, the conflict check is skipped.
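The selection logic can be sketched roughly as follows. This is not the project's code: the entry shape (dicts with "hreflang" and "href" keys) and the function names pick_self_entry and conflicts_with_lang_attr are assumptions made for the example.

```python
from typing import Optional


def pick_self_entry(entries: list[dict], page_url: str) -> Optional[dict]:
    """Return the first real-language hreflang entry that points at page_url.

    Assumed entry shape (hypothetical): {"hreflang": "en-us", "href": "https://..."}.
    """
    for entry in entries:
        if entry["href"] != page_url:
            continue  # not a self-reference
        if entry["hreflang"].lower() == "x-default":
            continue  # x-default carries no language, so it cannot be compared
        return entry
    return None


def conflicts_with_lang_attr(entries: list[dict], page_url: str, html_lang: str) -> bool:
    """Report a conflict only when a real-language self entry disagrees with <html lang>."""
    self_entry = pick_self_entry(entries, page_url)
    if self_entry is None:
        return False  # only x-default (or nothing) matched: skip the check
    primary = html_lang.split("-")[0].lower()
    return not self_entry["hreflang"].lower().startswith(primary)


if __name__ == "__main__":
    home = "https://example.com/"
    entries = [
        {"hreflang": "x-default", "href": home},
        {"hreflang": "en-us", "href": home},
        {"hreflang": "de-de", "href": "https://example.com/de-de/"},
    ]
    print(conflicts_with_lang_attr(entries, home, "en"))
```

Listing x-default first in the example mirrors the failure case: a plain first-match lookup would return x-default, whereas the loop skips past it to en-us.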
Result
cloudflare.com 534-page audit:
- v2.0.4: 936 findings, 23 check classes (hreflang_invalid_codes: 262, hreflang_conflicts_lang_attr: 261)
- v2.0.5: ~400 findings, 21 check classes (hreflang_invalid_codes: 0, hreflang_conflicts_lang_attr: 0)
The remaining ~400 findings are all genuine: real sitemap-noindex entries, real JS redirects, real performance issues. The zero-false-positive target is met for hreflang.