Investigate domains with high/total failure rates #319

@simonsmallchua

Summary

Several domains show consistently high failure rates across multiple independent job runs. These are distinct from crash-related failures, which affect all domains equally: the domains listed here fail repeatedly even under normal operating conditions.

Suspect domains

Near-total blocks (0 or ~0 completed)

| Domain | Runs | Pattern |
| --- | --- | --- |
| glossier.com | 3 runs | 0 completed each, 14K–54K tasks |
| kmart.com.au | 2 runs | 1 completed each, 127K tasks |
| clerk.com | 1 run | 2,717/2,719 failed |
| openai.com | 1 run | 18/1,088 completed, 60s adaptive delay |

Consistently high failure rates

| Domain | Runs | Pattern |
| --- | --- | --- |
| smashingmagazine.com | 3 runs | 0/7,785; 498/13,825; ~505/36,551 |
| creativebloq.com | 3 runs | Heavy failures each run |
| sitepoint.com | 2 runs | ~30–33% failure rate |
| changelog.com | 1 run | 9,326/22,733 failed (41%) |
| sidebar.io | 1 run | 2,696/7,899 failed (34%) |
| render.com | 1 run | 559/10,649 failed |
| fly.io | 2 runs | 1,919/8,236 and 53/1,990 failed |
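
To compare the rows above on one scale, the failed/total counts can be turned into percentages. This is a minimal sketch that only uses rows where the tables explicitly give a failed count and a total; the names `FAILED_COUNTS`, `failure_rate`, and `rank_by_failure_rate` are illustrative and not part of the crawler.

```python
# Failed/total task counts copied from the tables above. Only rows that
# state both numbers explicitly as "failed" are included.
FAILED_COUNTS = {
    "clerk.com": (2717, 2719),
    "changelog.com": (9326, 22733),
    "sidebar.io": (2696, 7899),
    "render.com": (559, 10649),
    "fly.io (run 1)": (1919, 8236),
    "fly.io (run 2)": (53, 1990),
}


def failure_rate(failed: int, total: int) -> float:
    """Return the share of failed tasks as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


def rank_by_failure_rate(counts):
    """Return (name, rate) pairs sorted from highest to lowest failure rate."""
    rates = {name: failure_rate(f, t) for name, (f, t) in counts.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, rate in rank_by_failure_rate(FAILED_COUNTS):
        print(f"{name:<16} {rate:5.1f}%")
```

Ranking by rate rather than by raw failure count keeps small runs (such as the second fly.io run) comparable with large ones.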

Likely causes to investigate

  • Bot protection / Cloudflare challenge pages (glossier, kmart, smashingmagazine)
  • JS-rendered content the crawler can't parse
  • Aggressive rate limiting causing task timeouts
  • Redirect loops or auth walls on key URL patterns
  • Invalid/unexpected content types being enqueued
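
One way to triage the causes listed above is to bucket each failed fetch by a few cheap signals. The sketch below is a heuristic under stated assumptions, not the crawler's actual logic: `FetchResult`, its fields, the thresholds, and the challenge-page markers are all hypothetical, and the `cf-mitigated: challenge` header check only covers Cloudflare-style challenges.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FetchResult:
    """Hypothetical record of one failed fetch; field names are assumptions."""
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""
    redirect_count: int = 0
    timed_out: bool = False


MAX_REDIRECTS = 10      # assumed threshold for a redirect loop
MIN_VISIBLE_CHARS = 200  # assumed threshold for "page has real text"
# Heuristic strings often seen on challenge pages; not exhaustive.
CHALLENGE_MARKERS = ("just a moment", "attention required", "cf-chl")
HTML_TYPES = ("text/html", "application/xhtml+xml")


def visible_text(body: str) -> str:
    """Strip script blocks and tags to approximate the text a parser would see."""
    without_scripts = re.sub(r"<script.*?</script>", " ", body, flags=re.S)
    return re.sub(r"<[^>]+>", " ", without_scripts).strip()


def classify_failure(result: FetchResult) -> str:
    """Map a failed fetch to one of the suspected causes, or 'unknown'."""
    headers = {k.lower(): str(v).lower() for k, v in result.headers.items()}
    body = result.body.lower()
    content_type = headers.get("content-type", "")

    if result.timed_out:
        return "timeout"
    if result.redirect_count >= MAX_REDIRECTS:
        return "redirect_loop"
    if result.status == 429:
        return "rate_limited"
    looks_like_challenge = result.status in (403, 503) and any(
        marker in body for marker in CHALLENGE_MARKERS
    )
    if headers.get("cf-mitigated") == "challenge" or looks_like_challenge:
        return "bot_challenge"
    if result.status in (401, 403):
        return "auth_wall"
    if content_type and not content_type.startswith(HTML_TYPES):
        return "unexpected_content_type"
    if (result.status == 200 and "<script" in body
            and len(visible_text(body)) < MIN_VISIBLE_CHARS):
        return "js_rendered"
    return "unknown"
```

Counting these labels per domain would show whether a domain's failures are dominated by one cause (for example, mostly `bot_challenge`) or spread across several.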

Out of scope

Domains with 0 completed tasks whose only failure occurred during the 2026-04-10 system incident are excluded; those are crash artifacts, not domain issues.
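
The exclusion rule can be sketched as a filter over per-domain run summaries. This assumes a hypothetical `JobRun` record (the field names are not from the crawler) and reads the rule as: a domain is out of scope when its single 0-completed run falls on the incident date. It only covers 0-completed runs, not partial failure rates.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

INCIDENT_DATE = date(2026, 4, 10)  # incident date stated in this issue


@dataclass(frozen=True)
class JobRun:
    """Hypothetical per-domain run summary; field names are assumptions."""
    domain: str
    run_date: date
    completed: int
    total: int


def in_scope_domains(runs):
    """Return domains with a 0-completed run not explained by the incident alone."""
    zero_runs = defaultdict(list)
    for run in runs:
        if run.total > 0 and run.completed == 0:
            zero_runs[run.domain].append(run)

    in_scope = []
    for domain, failures in zero_runs.items():
        # A single 0-completed run on the incident date is a crash artifact.
        crash_only = len(failures) == 1 and failures[0].run_date == INCIDENT_DATE
        if not crash_only:
            in_scope.append(domain)
    return sorted(in_scope)
```

For example, with made-up data, a domain that hit 0 completed on both 2026-04-10 and a later date stays in scope, while one that hit 0 completed only on 2026-04-10 is dropped.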
