single-page-sample diagnostic should apply a score cap #73

@dacharyc

Summary

When the single-page-sample diagnostic fires, the overall score can still land in the B range even though there is no meaningful page-level signal. The diagnostic has severity: warning, and its only effect is to mark page-level checks as notApplicable, which excludes them from both the numerator and the denominator. The overall score is then driven by a small subset of checks, none of which reflect site-wide agent-friendliness.

Concrete case

Scoring https://docs.readthedocs.com/platform/stable/ produced:

  • Overall: 81 (B)
  • Diagnostics: single-page-sample (warning)
  • Cap: none

The 81 was computed from only 3 checks out of 19:

| Check | Earned/Max |
| --- | --- |
| llms-txt-exists | 10/10 |
| llms-txt-valid | 0/4 |
| llms-txt-size | 7/7 |
| Total | 17/21 = 81% |

Every other check (markdown availability, page size, content structure, URL stability, observability, auth) was excluded as notApplicable. The 81 effectively scores the structural validity of the llms.txt file itself, not the documentation site.
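To make the arithmetic concrete, here is a minimal sketch of how excluding notApplicable checks from both numerator and denominator yields 81. The `CheckResult` shape, the `overallScore` function, and the `page-level-example` entry with its weight are hypothetical illustrations, not the actual afdocs types or check list; only the three llms-txt rows come from the case above.

```typescript
// Hypothetical shapes for illustration; not the actual afdocs types.
type CheckStatus = "scored" | "notApplicable";

interface CheckResult {
  id: string;
  status: CheckStatus;
  earned: number;
  max: number;
}

// Sum earned and max points over scored checks only, so notApplicable
// checks drop out of both the numerator and the denominator.
function overallScore(checks: CheckResult[]): number {
  const scored = checks.filter((c) => c.status === "scored");
  const earned = scored.reduce((sum, c) => sum + c.earned, 0);
  const max = scored.reduce((sum, c) => sum + c.max, 0);
  return max === 0 ? 0 : Math.round((earned / max) * 100);
}

const readTheDocsChecks: CheckResult[] = [
  { id: "llms-txt-exists", status: "scored", earned: 10, max: 10 },
  { id: "llms-txt-valid", status: "scored", earned: 0, max: 4 },
  { id: "llms-txt-size", status: "scored", earned: 7, max: 7 },
  // Stand-in for the 16 excluded page-level checks; id and weight are made up.
  { id: "page-level-example", status: "notApplicable", earned: 0, max: 10 },
];

const readTheDocsScore = overallScore(readTheDocsChecks);
console.log(readTheDocsScore); // 17/21, rounded to 81
```

However many page-level checks are excluded, the result stays at 17/21, which is why the score says nothing about the rest of the site.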

Why this site falls through the existing safety nets

The site has a real llms.txt at /llms.txt that's the right size but uses non-spec-compliant link syntax (- Name: url instead of - [Name](url): description). So llms-txt-exists and llms-txt-size pass, leaving only the low-weight llms-txt-valid (weight 4) to fail. The existing llms-txt-exists critical-check cap doesn't fire because the file exists. llms-txt-valid isn't a critical check, so its failure doesn't trigger a cap. single-page-sample is severity: warning, so it doesn't cap either. Net result: a site where afdocs can reach exactly 1 page lands at 81 (B).
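The difference between the two link syntaxes can be illustrated with a small check. This is only a sketch: the `SPEC_LINK` pattern and `isSpecLink` helper are assumptions made for illustration and are not the validation logic that llms-txt-valid actually uses, and the example.com URLs are placeholders.

```typescript
// Assumed pattern for a spec-style list entry: "- [Name](url)" with an
// optional ": description" suffix. Illustration only, not afdocs code.
const SPEC_LINK = /^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/;

function isSpecLink(line: string): boolean {
  return SPEC_LINK.test(line.trim());
}

const specStyleLine = "- [Getting started](https://example.com/start.md): Intro guide";
const plainStyleLine = "- Getting started: https://example.com/start.md";

console.log(isSpecLink(specStyleLine)); // true
console.log(isSpecLink(plainStyleLine)); // false
```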

By contrast, sites that fire single-page-sample and have no llms.txt at all are already caught: the failing llms-txt-exists check either zeroes the numerator entirely or triggers its critical cap. Sites whose llms.txt exists but is structurally invalid are not covered.

Proposed fix

Have single-page-sample apply a score cap when it fires. A few options:

  1. Cap at F threshold (~59): matches the existing llms-txt-exists cap pattern. Conveys "this score is unreliable" via the letter grade.
  2. Cap at a lower value (e.g., 40): stronger signal that the result shouldn't be trusted.
  3. Elevate severity to critical: requires adding cap-via-diagnostic plumbing if not already present.

Option 1 feels most consistent with how the codebase already handles other "we can't actually evaluate this site" failures.
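A rough sketch of what Option 1 could look like follows. Everything here is hypothetical: the `Diagnostic` shape, the `applySinglePageCap` and `toGrade` helpers, and the 90/80/70/60 grade cutoffs are assumptions for illustration, not the existing afdocs cap plumbing.

```typescript
// Hypothetical shapes and names; not the existing afdocs cap implementation.
interface Diagnostic {
  id: string;
  severity: "info" | "warning" | "critical";
}

interface CappedScore {
  score: number;
  cap: number | null;
}

// Option 1: cap just below the assumed F threshold.
const SINGLE_PAGE_SAMPLE_CAP = 59;

function applySinglePageCap(rawScore: number, diagnostics: Diagnostic[]): CappedScore {
  const fired = diagnostics.some((d) => d.id === "single-page-sample");
  if (!fired) {
    return { score: rawScore, cap: null };
  }
  return { score: Math.min(rawScore, SINGLE_PAGE_SAMPLE_CAP), cap: SINGLE_PAGE_SAMPLE_CAP };
}

// Assumed conventional letter-grade cutoffs, used only to show the effect.
function toGrade(score: number): string {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}

const cappedResult = applySinglePageCap(81, [
  { id: "single-page-sample", severity: "warning" },
]);
console.log(cappedResult.score, toGrade(cappedResult.score)); // 59 F
```

In this sketch the cap is reported whenever the diagnostic fires, even if the raw score is already below it, so the output always shows why the score should not be trusted. Option 2 would only change the constant.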
