Problem
The probe cycle's performance envelope (what constitutes a normal cycle in request count, timing, and error rate; what thresholds indicate degradation; and what operational responses are appropriate) exists only as the sole contributor's institutional knowledge. With a bus factor of one, this knowledge has no durable representation. A future operator inheriting the service would have no quantitative reference for the system's most operationally sensitive component: the HEAD request volume against isocpp.org.
Acceptance Criteria
Create docs/probe-operations.md documenting: normal request count per cycle (1,600-2,000), expected cycle duration, hot/cold split ratio, configurable parameters and their effects, and degradation indicators
Document the HTTP_CONCURRENCY, POLL_INTERVAL_SECONDS, and POLL_OVERRUN_COOLDOWN_SECONDS settings with their operational implications and recommended ranges
Include a "What to do if..." troubleshooting section: cycle takes >X minutes, error rate exceeds Y%, isocpp.org returns 429s
Add structured logging (or Prometheus metrics if applicable) that emits per-cycle summary stats: request count, success/error counts, wall-clock duration
Bugfix bundle: Last-Modified handling in probes (paperscout_bugfix_bundle_27f91caa.plan.md §2)
In ISOProber._probe_one (sources.py): after parsedate_to_datetime, if last_modified.tzinfo is None, normalize to UTC (replace(tzinfo=timezone.utc)) before comparing to datetime.now(timezone.utc) (document in a short comment).
On parse failure or post-parse errors in that block: set last_modified = None, is_recent = True, bucket as hit_no_lm (same as absent header; comment that bad LM is intentionally merged with no-LM to avoid silent drops).
Update ProbeHit docstring in models.py so “recent” explicitly includes missing or unusable Last-Modified.
Invert test_probe_one_bad_last_modified_header in tests/test_sources.py to expect is_recent is True and last_modified is None; add a test for naive-but-parseable LM within the alert window if practical.
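The Last-Modified items above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the helper name classify_last_modified, its signature, the alert_window parameter, and the bucket names hit_recent and hit_stale are all assumptions made for the example. Only hit_no_lm, parsedate_to_datetime, and the UTC normalization come from the criteria themselves, and the real logic is expected to live inline in ISOProber._probe_one.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def classify_last_modified(
    header_value: Optional[str],
    now: datetime,
    alert_window: timedelta,
) -> Tuple[Optional[datetime], bool, str]:
    """Return (last_modified, is_recent, bucket) for one probe response.

    Hypothetical helper: name, signature, and all bucket names except
    "hit_no_lm" are assumptions for illustration. `now` must be
    timezone-aware, e.g. datetime.now(timezone.utc).
    """
    if not header_value:
        # Absent header: treat as recent so the hit is never dropped.
        return None, True, "hit_no_lm"
    try:
        last_modified = parsedate_to_datetime(header_value)
        if last_modified.tzinfo is None:
            # parsedate_to_datetime returns a naive datetime for "-0000" or a
            # missing zone. Naive and aware datetimes cannot be subtracted,
            # so normalize to UTC before comparing.
            last_modified = last_modified.replace(tzinfo=timezone.utc)
        is_recent = (now - last_modified) <= alert_window
    except (TypeError, ValueError, OverflowError):
        # Bad LM is intentionally merged with no-LM to avoid silent drops.
        return None, True, "hit_no_lm"
    bucket = "hit_recent" if is_recent else "hit_stale"  # assumed names
    return last_modified, is_recent, bucket
```

Catching both TypeError and ValueError covers the different ways parsedate_to_datetime has signalled unparseable input across Python versions. The two test cases named in the last criterion map directly onto the "not a date" and "-0000" inputs.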
Implementation Notes
This item complements item 2 (benchmark) but focuses on operational documentation rather than automated testing. The structured logging addition is the highest-value code change: emit a single log line per cycle with {"cycle_requests": N, "cycle_duration_s": X, "errors": M, "hot_probes": H, "cold_probes": C}. This gives operators (and future Grafana/Datadog integrations) a machine-readable performance record without requiring a full observability stack.
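As a rough sketch of the per-cycle log line described above, the following uses only the standard library. The five JSON keys are the ones named in this issue; the CycleStats class, the function names, the logger name, and the "probe_cycle_summary" message prefix are assumptions for illustration and do not reflect existing paperscout code.

```python
import json
import logging
from dataclasses import asdict, dataclass

# Logger name is an assumption; use whatever the service already configures.
logger = logging.getLogger("paperscout.probe")


@dataclass
class CycleStats:
    """Counters for one probe cycle (hypothetical container)."""

    cycle_requests: int = 0
    cycle_duration_s: float = 0.0
    errors: int = 0
    hot_probes: int = 0
    cold_probes: int = 0


def format_cycle_summary(stats: CycleStats) -> str:
    """Serialize the stats as one JSON object with stable key order."""
    return json.dumps(asdict(stats), sort_keys=True)


def log_cycle_summary(stats: CycleStats) -> None:
    """Emit exactly one machine-readable line per completed cycle."""
    logger.info("probe_cycle_summary %s", format_cycle_summary(stats))
```

A caller would typically record time.monotonic() at the start and end of the cycle, store the difference in cycle_duration_s, and call log_cycle_summary once when the cycle finishes. Keeping the payload a single JSON object lets a log shipper parse it without a custom pattern.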
References
src/paperscout/sources.py (ISOProber), src/paperscout/config.py, src/paperscout/monitor.py