Bugfix/config benchmark lockfile types by henry0816191 · Pull Request #40 · cppalliance/paperscout

henry0816191 · 2026-05-18T19:00:46Z

Summary

Issue 34

src/paperscout/storage.py: Typed matches_for_users, inner matched_papers / matched_hits, and public APIs using dict[str, Any] where needed; used cast for JSON/cache returns.
src/paperscout/models.py: Safer Paper.from_index_entry with WARNING on bad/unknown type; dict[str, Any] entry typing.
src/paperscout/health.py: Typed get_disc callback for mypy.
tests/test_models.py: Test for unknown type.
pyproject.toml: Dev extras mypy, pytest-benchmark; [tool.mypy] + import overrides.

Issue 33 (lockfile + Docker + CI)

Dockerfile: Multi-stage build with ghcr.io/astral-sh/uv:0.8.3, uv sync --frozen --no-dev, copy .venv into runtime image.
.github/workflows/ci.yml: lockfile job, uv on lint/test, workflow env for Slack placeholders + _PAPERSCOUT_TESTING=1, benchmark job running pytest benchmarks/ -m benchmark.
.github/workflows/cd.yml: Same uv install + env for the test step.
.github/dependabot.yml: package-ecosystem: uv (replaced pip for root deps).
CONTRIBUTING.md & run: uv workflow and lockfile maintenance.

Issue 32 (benchmarks)

benchmarks/: conftest.py (CLI options, env, fake_pool), test_probe_cycle.py (mock httpx.MockTransport, median-of-3 with fresh FakePool per iteration), benchmarks/baseline.json.
docs/probe-performance.md: Explains metrics, knobs, baseline updates.

Issue 31 (config vs transient failures)

src/paperscout/errors.py: FailureCategory.CONFIGURATION, ConfigurationError.
src/paperscout/config.py: @model_validator requiring Slack tokens unless _PAPERSCOUT_TESTING=1.
tests/conftest.py & benchmarks/conftest.py: Set _PAPERSCOUT_TESTING and default Slack env before importing paperscout.
src/paperscout/main.py: Slack check after logging (mirrors DATABASE_URL).
src/paperscout/monitor.py: except ConfigurationError → log.critical + return (no sleep / no retry).
tests/test_config.py & tests/test_monitor.py: New tests.

Checks run

pytest tests/, pytest benchmarks/ -m benchmark, mypy src/, ruff check/format, uv lock --check, pre-commit run --all-files.

Note

Docker wasn’t available in this environment, so the image wasn’t built here; the Dockerfile follows the standard uv sync --frozen pattern.

Issue Close

close #31
close #32
close #33
close #34

Summary by CodeRabbit

New Features
- Offline probe-cycle performance benchmark with CI regression gate.
- Startup now enforces Slack credentials and fails fast when missing.
Improvements
- Migrate CI, dev tooling, Docker, and local scripts to a reproducible dependency runner.
- Stronger input validation, typing, and defensive parsing across the app.
Documentation
- Added probe-performance guide and updated contributor/testing instructions.
Tests
- Added tests for config validation, model parsing, monitor halting, and benchmarks.
Chores
- Dependabot ecosystem configuration updated.

coderabbitai · 2026-05-18T19:00:58Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54eff38b-f14a-4d0e-945a-cb14815a2921

📥 Commits

Reviewing files that changed from the base of the PR and between 33dec6d and 82a7238.

📒 Files selected for processing (5)

CONTRIBUTING.md
Dockerfile
benchmarks/conftest.py
docs/probe-performance.md
src/paperscout/__main__.py

🚧 Files skipped from review as they are similar to previous changes (4)

docs/probe-performance.md
src/paperscout/main.py
benchmarks/conftest.py
Dockerfile

📝 Walkthrough

Walkthrough

Adds Slack credential validation and ConfigurationError; makes the scheduler halt on configuration faults; adds an offline ISOProber benchmark with baseline/regression gate and docs; migrates tooling to uv/uv.lock across CI/Docker/local; tightens defensive parsing and typing.

Changes

Configuration Validation, Benchmarking, and Toolchain Migration

Layer / File(s)	Summary
Configuration Error Classification and Validation `src/paperscout/errors.py`, `src/paperscout/config.py`, `tests/conftest.py`, `tests/test_config.py`	New `ConfigurationError` and `FailureCategory.CONFIGURATION`; `Settings` has an after-validator that enforces non-empty Slack credentials unless `_PAPERSCOUT_TESTING` is set; test-time env defaults for tests.
Startup Guard and Scheduler Halt on Configuration Error `src/paperscout/__main__.py`, `src/paperscout/monitor.py`, `tests/test_monitor.py`	Startup exits if Slack tokens are missing (unless testing). `Scheduler.run_forever` treats `ConfigurationError` as fatal and returns immediately; tests cover halt behavior.
Probe Cycle Benchmark Suite with Baseline and Regression Gates `benchmarks/conftest.py`, `benchmarks/test_probe_cycle.py`, `benchmarks/baseline.json`, `docs/probe-performance.md`, `pyproject.toml`	Adds an offline benchmark using `httpx.MockTransport` to measure cycle wall time, request count, peak concurrency, and latency percentiles; stores baseline in JSON and fails on regressions beyond threshold; provides pytest marker and CLI knobs; docs added.
Dependency Management: uv Lockfile and Toolchain Migration `.github/dependabot.yml`, `.github/workflows/cd.yml`, `.github/workflows/ci.yml`, `Dockerfile`, `run`, `CONTRIBUTING.md`	Migrates CI, CD, Docker, and local workflows to `uv` with `uv.lock`; Dependabot tracks `uv`; workflows install via `uv sync --frozen` and run commands via `uv run`; Docker builder uses `uv sync --frozen --no-dev`.
Type Safety: Defensive Parsing and Annotated Storage `src/paperscout/models.py`, `src/paperscout/storage.py`, `src/paperscout/health.py`, `tests/test_models.py`	`Paper.from_index_entry` defensive parsing with warning logs for unknown types; `PaperCache.read` enforces JSON dict shape; narrowed return and param annotations in storage; health fallback typed via `cast`; test added for unknown paper type.

Sequence Diagram(s)

sequenceDiagram
  participant TestProbe as test_probe_cycle_regression
  participant Prober as ISOProber
  participant MockHTTP as httpx.MockTransport
  participant Metrics as metrics_accumulator
  participant Baseline as baseline.json
  TestProbe->>Prober: build prober (_build_prober)
  loop 3 cycles
    TestProbe->>Prober: _run_one_cycle(per_request_delay)
    Prober->>MockHTTP: AsyncClient -> MockTransport handler
    MockHTTP->>Metrics: record latency, increment concurrent, count requests
    Prober-->>TestProbe: return cycle metrics
  end
  TestProbe->>Baseline: load thresholds
  TestProbe->>TestProbe: compute median wall_seconds and compare regression_threshold_pct
  TestProbe->>TestProbe: assert min_request_count

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

cppalliance/paperscout#29: Prior work on FailureCategory and scheduler failure logging that this PR extends with CONFIGURATION and fatal-halt behavior.
cppalliance/paperscout#16: Related storage/invalid-JSON behavior fixes and tests intersecting PaperCache.read tightening.
cppalliance/paperscout#5: Overlaps with /health discovered-state plumbing updates.

Suggested labels

bug

Suggested reviewers

wpak-ai

"I'm a rabbit with a tidy lock,
I hop through tests and tick the clock.
I sniff for Slack tokens, loudly cry,
Then benchmark probes that zip on by.
Dependencies pinned — a tidy flock!"

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 27.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Bugfix/config benchmark lockfile types' is vague and does not clearly convey the scope of the changes. It uses generic terms ('Bugfix') and abbreviations ('lockfile types') that don't meaningfully describe the primary work: implementing lockfile mechanism, benchmarking, configuration error handling, and type safety.	Clarify the title to reflect the main objective, e.g., 'Add uv lockfile, benchmarks, config validation, and type safety improvements' or 'Implement lockfile, benchmarks, config errors, and typing fixes for issues `#31`–34'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description check	✅ Passed	The PR description covers all major changes across issues `#31`–34 with clear summaries of file modifications and implementation details. It lists the files changed and links the issues being closed. However, it lacks a structured test plan section with checkboxes as specified in the description template.
Linked Issues check	✅ Passed	All code changes align with the four linked issues: `#31` (config validation and ConfigurationError handling), `#32` (benchmarks with baseline), `#33` (uv lockfile and Docker/CI/Dependabot updates), and `#34` (type safety for matches_for_users and Paper.from_index_entry hardening).
Out of Scope Changes check	✅ Passed	All file modifications are scoped to the linked issues: dependency tooling (uv), configuration/validation, benchmarking, typing, and CI/CD workflows. No unrelated changes to core business logic, unreviewed refactorings, or tangential work are present.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bugfix/config-benchmark-lockfile-types

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/test_probe_cycle.py`:
- Around line 61-70: The handler function used by httpx.AsyncClient is currently
synchronous and calls time.sleep, blocking the event loop; change the handler to
an async function (async def handler(request: httpx.Request) -> httpx.Response)
and replace time.sleep(per_request_delay_sec) with await
asyncio.sleep(per_request_delay_sec); keep all existing metrics updates
(metrics["lock"], metrics["active"], metrics["peak_concurrent"],
metrics["request_count"]) and ensure the active-counter decrement remains in the
same finally/cleanup path so concurrency accounting is preserved when converting
to async; update imports to include asyncio if not already present and verify
this integrates with _run_one_cycle which expects an async-safe handler.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f07b9728-0b10-441a-a13e-e845b47c6caa

📥 Commits

Reviewing files that changed from the base of the PR and between 6f800c6 and 79640e8.

📒 Files selected for processing (22)

.github/dependabot.yml
.github/workflows/cd.yml
.github/workflows/ci.yml
CONTRIBUTING.md
Dockerfile
benchmarks/baseline.json
benchmarks/conftest.py
benchmarks/test_probe_cycle.py
docs/probe-performance.md
pyproject.toml
run
src/paperscout/__main__.py
src/paperscout/config.py
src/paperscout/errors.py
src/paperscout/health.py
src/paperscout/models.py
src/paperscout/monitor.py
src/paperscout/storage.py
tests/conftest.py
tests/test_config.py
tests/test_models.py
tests/test_monitor.py

henry0816191 · 2026-05-18T20:16:34Z

@coderabbitai review resume

coderabbitai · 2026-05-18T20:16:45Z

@henry0816191 Sure! Resuming and triggering a review of PR #40 now.

✅ Actions performed

Reviews resumed.

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

henry0816191 · 2026-05-18T20:22:21Z

@coderabbitai full review

coderabbitai · 2026-05-18T20:22:27Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

benchmarks/test_probe_cycle.py (1)

172-173: ⚡ Quick win

Add a sanity assertion for peak_concurrent bounds.

You already collect peak_concurrent; asserting 1 <= peak_concurrent <= http_conc strengthens regression signal when concurrency behavior drifts.

Suggested fix

     # Sanity: we exercised HEAD/GET traffic.
     assert last_metrics["request_count"] >= int(baseline.get("min_request_count", 10))
+    assert 1 <= last_metrics["peak_concurrent"] <= http_conc

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/test_probe_cycle.py` around lines 172 - 173, Add a sanity
assertion that validates peak_concurrent is within expected bounds: after the
existing request_count assertion, check that last_metrics["peak_concurrent"] is
at least 1 and at most int(baseline.get("http_conc", http_conc)) (or
int(http_conc) if baseline key absent). Locate the assertion block using
last_metrics and baseline in test_probe_cycle.py and insert the check
referencing "peak_concurrent" and "http_conc" to fail when concurrency behavior
drifts.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/conftest.py`:
- Around line 43-47: The benchmark CLI options can be invalid (non-positive
concurrency or poll interval, negative per-request delay) so add validation in
pytest_configure (or a small helper called from it) that reads config.getoption
for "--bench-http-concurrency", "--bench-poll-interval-minutes", and
"--bench-per-request-delay-ms" and enforces ranges: bench-http-concurrency > 0,
bench-poll-interval-minutes > 0, and bench-per-request-delay-ms >= 0; if any
value is out of range raise pytest.UsageError (or pytest.exit) with a clear
message identifying the offending option and required range so tests fail fast
before running benchmarks.

In `@CONTRIBUTING.md`:
- Around line 65-70: Add a non-uv fallback for the pre-commit commands by
appending the plain pre-commit equivalents after the existing "uv run pre-commit
install" and "uv run pre-commit run --all-files" examples; specifically, include
"pre-commit install" and "pre-commit run --all-files" so users using a classic
venv have the equivalent commands.

In `@Dockerfile`:
- Around line 14-16: The Dockerfile runs "uv sync --frozen --no-dev" which
installs the project in editable mode by default; update the RUN command to add
the "--no-editable" flag so uv performs a non‑editable install (preventing
build-stage source paths from being embedded into the runtime .venv); keep ENV
UV_COMPILE_BYTECODE=1 unchanged and only modify the RUN uv sync invocation to
include --no-editable.

In `@docs/probe-performance.md`:
- Line 49: The guidance that `wall_seconds_median` in `baseline.json` should use
an 8–15× multiplier is too permissive; update the docs/probe-performance.md text
to recommend a much tighter slack (e.g. 1.1–1.5× or 10–50% above the raw median)
or recommend using percentile-based or CI-controlled offsets instead of a large
fixed multiplier, and explicitly call out `wall_seconds_median` as the field to
change so readers set a realistic regression gate.

In `@src/paperscout/__main__.py`:
- Around line 96-105: The startup guard in __main__.py currently aborts
unconditionally when Slack credentials are missing; change it to skip the exit
when tests are running by checking the testing bypass flag (e.g.
settings.PAPERSCOUT_TESTING or equivalent testing boolean on settings) before
exiting. In other words, only perform the log.error/sys.exit(1) when credentials
are missing AND the testing flag is falsey; reference the existing
settings.slack_bot_token and settings.slack_signing_secret checks and
short-circuit them when the settings testing flag indicates a test run.

---

Nitpick comments:
In `@benchmarks/test_probe_cycle.py`:
- Around line 172-173: Add a sanity assertion that validates peak_concurrent is
within expected bounds: after the existing request_count assertion, check that
last_metrics["peak_concurrent"] is at least 1 and at most
int(baseline.get("http_conc", http_conc)) (or int(http_conc) if baseline key
absent). Locate the assertion block using last_metrics and baseline in
test_probe_cycle.py and insert the check referencing "peak_concurrent" and
"http_conc" to fail when concurrency behavior drifts.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f3862261-99cc-4ece-83f7-3a992b564f59

📥 Commits

Reviewing files that changed from the base of the PR and between 6f800c6 and 33dec6d.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (22)

.github/dependabot.yml
.github/workflows/cd.yml
.github/workflows/ci.yml
CONTRIBUTING.md
Dockerfile
benchmarks/baseline.json
benchmarks/conftest.py
benchmarks/test_probe_cycle.py
docs/probe-performance.md
pyproject.toml
run
src/paperscout/__main__.py
src/paperscout/config.py
src/paperscout/errors.py
src/paperscout/health.py
src/paperscout/models.py
src/paperscout/monitor.py
src/paperscout/storage.py
tests/conftest.py
tests/test_config.py
tests/test_models.py
tests/test_monitor.py

henry0816191 added 2 commits May 19, 2026 02:50

addressed issue_01-issue_04

a69b220

added probe-performance

79640e8

henry0816191 self-assigned this May 18, 2026

henry0816191 requested a review from wpak-ai as a code owner May 18, 2026 19:00

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread benchmarks/test_probe_cycle.py Outdated

henry0816191 added 3 commits May 19, 2026 03:05

added uv

b279907

fixed lint error

654af7e

addressed ai review result

33dec6d

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread benchmarks/conftest.py

Comment thread CONTRIBUTING.md

Comment thread Dockerfile

Comment thread docs/probe-performance.md Outdated

Comment thread src/paperscout/__main__.py Outdated

henry0816191 added 2 commits May 19, 2026 04:48

addressed ai review results

34002d9

fixed lint errors

82a7238

wpak-ai approved these changes May 18, 2026

View reviewed changes

wpak-ai merged commit 85679e0 into develop May 18, 2026
9 checks passed

wpak-ai deleted the bugfix/config-benchmark-lockfile-types branch May 18, 2026 23:48

Conversation

henry0816191 commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Issue 34

Issue 33 (lockfile + Docker + CI)

Issue 32 (benchmarks)

Issue 31 (config vs transient failures)

Checks run

Note

Issue Close

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

henry0816191 commented May 18, 2026

Uh oh!

coderabbitai Bot commented May 18, 2026

Uh oh!

henry0816191 commented May 18, 2026

Uh oh!

coderabbitai Bot commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

henry0816191 commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading