Probe Volume Envelope is Undocumented Institutional Knowledge

#### Problem

The probe cycle's performance envelope exists only as the sole contributor's institutional knowledge. That envelope covers what constitutes a normal cycle (request count, timing, error rate), which thresholds indicate degradation, and which operational responses are appropriate. With a bus factor of one, this knowledge has no durable representation. A future operator inheriting the service would have no quantitative reference for the system's most operationally sensitive component: the HEAD request volume against isocpp.org.

#### Acceptance Criteria

- [ ] Create `docs/probe-operations.md` documenting: normal request count per cycle (1,600-2,000), expected cycle duration, hot/cold split ratio, configurable parameters and their effects, and degradation indicators
- [ ] Document the `HTTP_CONCURRENCY`, `POLL_INTERVAL_SECONDS`, and `POLL_OVERRUN_COOLDOWN_SECONDS` settings with their operational implications and recommended ranges
- [ ] Include a "What to do if..." troubleshooting section: cycle takes >X minutes, error rate exceeds Y%, isocpp.org returns 429s
- [ ] Add structured logging (or Prometheus metrics if applicable) that emits per-cycle summary stats: request count, success/error counts, wall-clock duration

**Bugfix bundle — `Last-Modified` handling in probes ([paperscout_bugfix_bundle_27f91caa.plan.md](paperscout_bugfix_bundle_27f91caa.plan.md) §2)**

- [ ] In `ISOProber._probe_one` (`sources.py`): after `parsedate_to_datetime`, if `last_modified.tzinfo is None`, normalize to UTC (`replace(tzinfo=timezone.utc)`) before comparing to `datetime.now(timezone.utc)` (document in a short comment).
- [ ] On parse failure or post-parse errors in that block: set `last_modified = None`, `is_recent = True`, bucket as **`hit_no_lm`** (same as absent header; comment that bad LM is intentionally merged with no-LM to avoid silent drops).
- [ ] Update `ProbeHit` docstring in `models.py` so “recent” explicitly includes missing **or** unusable `Last-Modified`.
- [ ] Invert `tests/test_sources.py` `test_probe_one_bad_last_modified_header` to expect `is_recent is True` and `last_modified is None`; add a test for naive-but-parseable LM within the alert window if practical.
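
The `Last-Modified` rules in the bugfix bundle could look roughly like the sketch below. It is a standalone illustration, not the actual body of `ISOProber._probe_one`: the helper name `classify_last_modified`, its signature, and the bucket names `hit_recent` and `hit_old` are assumptions made for this example. Only `hit_no_lm` and the UTC normalization come from the criteria above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def classify_last_modified(
    header: Optional[str],
    now: datetime,
    window: timedelta,
) -> Tuple[Optional[datetime], bool, str]:
    """Return (last_modified, is_recent, bucket) for one probe response.

    Hypothetical helper; bucket names other than "hit_no_lm" are placeholders.
    """
    if not header:
        # Absent header: treat as recent so the hit is never silently dropped.
        return None, True, "hit_no_lm"
    try:
        last_modified = parsedate_to_datetime(header)
        if last_modified.tzinfo is None:
            # A "-0000" zone parses to a naive datetime; assume UTC so the
            # subtraction against an aware `now` cannot raise TypeError.
            last_modified = last_modified.replace(tzinfo=timezone.utc)
        is_recent = (now - last_modified) <= window
    except (TypeError, ValueError, OverflowError):
        # Bad LM is intentionally merged with no-LM to avoid silent drops.
        return None, True, "hit_no_lm"
    return last_modified, is_recent, "hit_recent" if is_recent else "hit_old"
```

Passing `now` in as a parameter, instead of calling `datetime.now(timezone.utc)` inside the helper, keeps the naive-but-parseable test case deterministic.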

#### Implementation Notes

This item complements item 2 (benchmark) but focuses on operational documentation rather than automated testing. The structured logging addition is the highest-value code change: emit a single log line per cycle with `{"cycle_requests": N, "cycle_duration_s": X, "errors": M, "hot_probes": H, "cold_probes": C}`. This gives operators (and future Grafana/Datadog integrations) a machine-readable performance record without requiring a full observability stack.
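
A minimal sketch of that per-cycle log line, using only the standard library, is shown below. The field names match the JSON shape above; the `CycleSummary` dataclass, the `emit_cycle_summary` function, and the logger name `paperscout.probe` are assumptions for illustration and do not exist in the codebase as far as this issue states.

```python
import json
import logging
from dataclasses import asdict, dataclass

# Hypothetical logger name; use whatever logger the monitor loop already has.
logger = logging.getLogger("paperscout.probe")


@dataclass
class CycleSummary:
    """Counters collected over one probe cycle (hypothetical container)."""

    cycle_requests: int
    cycle_duration_s: float
    errors: int
    hot_probes: int
    cold_probes: int


def emit_cycle_summary(summary: CycleSummary) -> str:
    """Log one machine-readable line per cycle and return it."""
    # sort_keys gives a stable key order, which keeps log diffs readable.
    line = json.dumps(asdict(summary), sort_keys=True)
    logger.info(line)
    return line
```

The caller would measure wall-clock duration around the cycle (for example with `time.monotonic()`) and call `emit_cycle_summary` once at the end, so each cycle produces exactly one line that a log shipper can parse as JSON.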

#### References

- Eval finding: Compound-5 (Unobservable Probe Volume), T33 + T38
- Related files: `src/paperscout/sources.py` (ISOProber), `src/paperscout/config.py`, `src/paperscout/monitor.py`

