Skip to content

Bugfix/config benchmark lockfile types#40

Merged
wpak-ai merged 7 commits into
developfrom
bugfix/config-benchmark-lockfile-types
May 18, 2026
Merged

Bugfix/config benchmark lockfile types#40
wpak-ai merged 7 commits into
developfrom
bugfix/config-benchmark-lockfile-types

Conversation

@henry0816191
Copy link
Copy Markdown
Collaborator

@henry0816191 henry0816191 commented May 18, 2026

Summary

Issue 34

  • src/paperscout/storage.py: Typed matches_for_users, inner matched_papers / matched_hits, and public APIs using dict[str, Any] where needed; used cast for JSON/cache returns.
  • src/paperscout/models.py: Safer Paper.from_index_entry with WARNING on bad/unknown type; dict[str, Any] entry typing.
  • src/paperscout/health.py: Typed get_disc callback for mypy.
  • tests/test_models.py: Test for unknown type.
  • pyproject.toml: Dev extras mypy, pytest-benchmark; [tool.mypy] + import overrides.

Issue 33 (lockfile + Docker + CI)

  • Dockerfile: Multi-stage build with ghcr.io/astral-sh/uv:0.8.3, uv sync --frozen --no-dev, copy .venv into runtime image.
  • .github/workflows/ci.yml: lockfile job, uv on lint/test, workflow env for Slack placeholders + _PAPERSCOUT_TESTING=1, benchmark job running pytest benchmarks/ -m benchmark.
  • .github/workflows/cd.yml: Same uv install + env for the test step.
  • .github/dependabot.yml: package-ecosystem: uv (replaced pip for root deps).
  • CONTRIBUTING.md & run: uv workflow and lockfile maintenance.

Issue 32 (benchmarks)

  • benchmarks/: conftest.py (CLI options, env, fake_pool), test_probe_cycle.py (mock httpx.MockTransport, median-of-3 with fresh FakePool per iteration), benchmarks/baseline.json.
  • docs/probe-performance.md: Explains metrics, knobs, baseline updates.

Issue 31 (config vs transient failures)

  • src/paperscout/errors.py: FailureCategory.CONFIGURATION, ConfigurationError.
  • src/paperscout/config.py: @model_validator requiring Slack tokens unless _PAPERSCOUT_TESTING=1.
  • tests/conftest.py & benchmarks/conftest.py: Set _PAPERSCOUT_TESTING and default Slack env before importing paperscout.
  • src/paperscout/main.py: Slack check after logging (mirrors DATABASE_URL).
  • src/paperscout/monitor.py: except ConfigurationError → log.critical + return (no sleep / no retry).
  • tests/test_config.py & tests/test_monitor.py: New tests.

Checks run

pytest tests/, pytest benchmarks/ -m benchmark, mypy src/, ruff check/format, uv lock --check, pre-commit run --all-files.

Note

Docker wasn’t available in this environment, so the image wasn’t built here; the Dockerfile follows the standard uv sync --frozen pattern.

Issue Close

close #31
close #32
close #33
close #34

Summary by CodeRabbit

  • New Features

    • Offline probe-cycle performance benchmark with CI regression gate.
    • Startup now enforces Slack credentials and fails fast when missing.
  • Improvements

    • Migrate CI, dev tooling, Docker, and local scripts to a reproducible dependency runner.
    • Stronger input validation, typing, and defensive parsing across the app.
  • Documentation

    • Added probe-performance guide and updated contributor/testing instructions.
  • Tests

    • Added tests for config validation, model parsing, monitor halting, and benchmarks.
  • Chores

    • Dependabot ecosystem configuration updated.

Review Change Stack

@henry0816191 henry0816191 self-assigned this May 18, 2026
@henry0816191 henry0816191 requested a review from wpak-ai as a code owner May 18, 2026 19:00
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54eff38b-f14a-4d0e-945a-cb14815a2921

📥 Commits

Reviewing files that changed from the base of the PR and between 33dec6d and 82a7238.

📒 Files selected for processing (5)
  • CONTRIBUTING.md
  • Dockerfile
  • benchmarks/conftest.py
  • docs/probe-performance.md
  • src/paperscout/__main__.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/probe-performance.md
  • src/paperscout/main.py
  • benchmarks/conftest.py
  • Dockerfile

📝 Walkthrough

Walkthrough

Adds Slack credential validation and ConfigurationError; makes the scheduler halt on configuration faults; adds an offline ISOProber benchmark with baseline/regression gate and docs; migrates tooling to uv/uv.lock across CI/Docker/local; tightens defensive parsing and typing.

Changes

Configuration Validation, Benchmarking, and Toolchain Migration

Layer / File(s) Summary
Configuration Error Classification and Validation
src/paperscout/errors.py, src/paperscout/config.py, tests/conftest.py, tests/test_config.py
New ConfigurationError and FailureCategory.CONFIGURATION; Settings has an after-validator that enforces non-empty Slack credentials unless _PAPERSCOUT_TESTING is set; test-time env defaults for tests.
Startup Guard and Scheduler Halt on Configuration Error
src/paperscout/__main__.py, src/paperscout/monitor.py, tests/test_monitor.py
Startup exits if Slack tokens are missing (unless testing). Scheduler.run_forever treats ConfigurationError as fatal and returns immediately; tests cover halt behavior.
Probe Cycle Benchmark Suite with Baseline and Regression Gates
benchmarks/conftest.py, benchmarks/test_probe_cycle.py, benchmarks/baseline.json, docs/probe-performance.md, pyproject.toml
Adds an offline benchmark using httpx.MockTransport to measure cycle wall time, request count, peak concurrency, and latency percentiles; stores baseline in JSON and fails on regressions beyond threshold; provides pytest marker and CLI knobs; docs added.
Dependency Management: uv Lockfile and Toolchain Migration
.github/dependabot.yml, .github/workflows/cd.yml, .github/workflows/ci.yml, Dockerfile, run, CONTRIBUTING.md
Migrates CI, CD, Docker, and local workflows to uv with uv.lock; Dependabot tracks uv; workflows install via uv sync --frozen and run commands via uv run; Docker builder uses uv sync --frozen --no-dev.
Type Safety: Defensive Parsing and Annotated Storage
src/paperscout/models.py, src/paperscout/storage.py, src/paperscout/health.py, tests/test_models.py
Paper.from_index_entry defensive parsing with warning logs for unknown types; PaperCache.read enforces JSON dict shape; narrowed return and param annotations in storage; health fallback typed via cast; test added for unknown paper type.

Sequence Diagram(s)

sequenceDiagram
  participant TestProbe as test_probe_cycle_regression
  participant Prober as ISOProber
  participant MockHTTP as httpx.MockTransport
  participant Metrics as metrics_accumulator
  participant Baseline as baseline.json
  TestProbe->>Prober: build prober (_build_prober)
  loop 3 cycles
    TestProbe->>Prober: _run_one_cycle(per_request_delay)
    Prober->>MockHTTP: AsyncClient -> MockTransport handler
    MockHTTP->>Metrics: record latency, increment concurrent, count requests
    Prober-->>TestProbe: return cycle metrics
  end
  TestProbe->>Baseline: load thresholds
  TestProbe->>TestProbe: compute median wall_seconds and compare regression_threshold_pct
  TestProbe->>TestProbe: assert min_request_count
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

bug

Suggested reviewers

  • wpak-ai

"I'm a rabbit with a tidy lock,
I hop through tests and tick the clock.
I sniff for Slack tokens, loudly cry,
Then benchmark probes that zip on by.
Dependencies pinned — a tidy flock!"

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Bugfix/config benchmark lockfile types' is vague and does not clearly convey the scope of the changes. It uses generic terms ('Bugfix') and abbreviations ('lockfile types') that don't meaningfully describe the primary work: implementing lockfile mechanism, benchmarking, configuration error handling, and type safety. Clarify the title to reflect the main objective, e.g., 'Add uv lockfile, benchmarks, config validation, and type safety improvements' or 'Implement lockfile, benchmarks, config errors, and typing fixes for issues #31–34'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The PR description covers all major changes across issues #31–34 with clear summaries of file modifications and implementation details. It lists the files changed and links the issues being closed. However, it lacks a structured test plan section with checkboxes as specified in the description template.
Linked Issues check ✅ Passed All code changes align with the four linked issues: #31 (config validation and ConfigurationError handling), #32 (benchmarks with baseline), #33 (uv lockfile and Docker/CI/Dependabot updates), and #34 (type safety for matches_for_users and Paper.from_index_entry hardening).
Out of Scope Changes check ✅ Passed All file modifications are scoped to the linked issues: dependency tooling (uv), configuration/validation, benchmarking, typing, and CI/CD workflows. No unrelated changes to core business logic, unreviewed refactorings, or tangential work are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bugfix/config-benchmark-lockfile-types

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/test_probe_cycle.py`:
- Around line 61-70: The handler function used by httpx.AsyncClient is currently
synchronous and calls time.sleep, blocking the event loop; change the handler to
an async function (async def handler(request: httpx.Request) -> httpx.Response)
and replace time.sleep(per_request_delay_sec) with await
asyncio.sleep(per_request_delay_sec); keep all existing metrics updates
(metrics["lock"], metrics["active"], metrics["peak_concurrent"],
metrics["request_count"]) and ensure the active-counter decrement remains in the
same finally/cleanup path so concurrency accounting is preserved when converting
to async; update imports to include asyncio if not already present and verify
this integrates with _run_one_cycle which expects an async-safe handler.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f07b9728-0b10-441a-a13e-e845b47c6caa

📥 Commits

Reviewing files that changed from the base of the PR and between 6f800c6 and 79640e8.

📒 Files selected for processing (22)
  • .github/dependabot.yml
  • .github/workflows/cd.yml
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • Dockerfile
  • benchmarks/baseline.json
  • benchmarks/conftest.py
  • benchmarks/test_probe_cycle.py
  • docs/probe-performance.md
  • pyproject.toml
  • run
  • src/paperscout/__main__.py
  • src/paperscout/config.py
  • src/paperscout/errors.py
  • src/paperscout/health.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/storage.py
  • tests/conftest.py
  • tests/test_config.py
  • tests/test_models.py
  • tests/test_monitor.py

Comment thread benchmarks/test_probe_cycle.py Outdated
@henry0816191
Copy link
Copy Markdown
Collaborator Author

@coderabbitai review resume

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

@henry0816191 Sure! Resuming and triggering a review of PR #40 now.

✅ Actions performed

Reviews resumed.


Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@henry0816191
Copy link
Copy Markdown
Collaborator Author

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
benchmarks/test_probe_cycle.py (1)

172-173: ⚡ Quick win

Add a sanity assertion for peak_concurrent bounds.

You already collect peak_concurrent; asserting 1 <= peak_concurrent <= http_conc strengthens regression signal when concurrency behavior drifts.

Suggested fix
     # Sanity: we exercised HEAD/GET traffic.
     assert last_metrics["request_count"] >= int(baseline.get("min_request_count", 10))
+    assert 1 <= last_metrics["peak_concurrent"] <= http_conc
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/test_probe_cycle.py` around lines 172 - 173, Add a sanity
assertion that validates peak_concurrent is within expected bounds: after the
existing request_count assertion, check that last_metrics["peak_concurrent"] is
at least 1 and at most int(baseline.get("http_conc", http_conc)) (or
int(http_conc) if baseline key absent). Locate the assertion block using
last_metrics and baseline in test_probe_cycle.py and insert the check
referencing "peak_concurrent" and "http_conc" to fail when concurrency behavior
drifts.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/conftest.py`:
- Around line 43-47: The benchmark CLI options can be invalid (non-positive
concurrency or poll interval, negative per-request delay) so add validation in
pytest_configure (or a small helper called from it) that reads config.getoption
for "--bench-http-concurrency", "--bench-poll-interval-minutes", and
"--bench-per-request-delay-ms" and enforces ranges: bench-http-concurrency > 0,
bench-poll-interval-minutes > 0, and bench-per-request-delay-ms >= 0; if any
value is out of range raise pytest.UsageError (or pytest.exit) with a clear
message identifying the offending option and required range so tests fail fast
before running benchmarks.

In `@CONTRIBUTING.md`:
- Around line 65-70: Add a non-uv fallback for the pre-commit commands by
appending the plain pre-commit equivalents after the existing "uv run pre-commit
install" and "uv run pre-commit run --all-files" examples; specifically, include
"pre-commit install" and "pre-commit run --all-files" so users using a classic
venv have the equivalent commands.

In `@Dockerfile`:
- Around line 14-16: The Dockerfile runs "uv sync --frozen --no-dev" which
installs the project in editable mode by default; update the RUN command to add
the "--no-editable" flag so uv performs a non‑editable install (preventing
build-stage source paths from being embedded into the runtime .venv); keep ENV
UV_COMPILE_BYTECODE=1 unchanged and only modify the RUN uv sync invocation to
include --no-editable.

In `@docs/probe-performance.md`:
- Line 49: The guidance that `wall_seconds_median` in `baseline.json` should use
an 8–15× multiplier is too permissive; update the docs/probe-performance.md text
to recommend a much tighter slack (e.g. 1.1–1.5× or 10–50% above the raw median)
or recommend using percentile-based or CI-controlled offsets instead of a large
fixed multiplier, and explicitly call out `wall_seconds_median` as the field to
change so readers set a realistic regression gate.

In `@src/paperscout/__main__.py`:
- Around line 96-105: The startup guard in __main__.py currently aborts
unconditionally when Slack credentials are missing; change it to skip the exit
when tests are running by checking the testing bypass flag (e.g.
settings.PAPERSCOUT_TESTING or equivalent testing boolean on settings) before
exiting. In other words, only perform the log.error/sys.exit(1) when credentials
are missing AND the testing flag is falsey; reference the existing
settings.slack_bot_token and settings.slack_signing_secret checks and
short-circuit them when the settings testing flag indicates a test run.

---

Nitpick comments:
In `@benchmarks/test_probe_cycle.py`:
- Around line 172-173: Add a sanity assertion that validates peak_concurrent is
within expected bounds: after the existing request_count assertion, check that
last_metrics["peak_concurrent"] is at least 1 and at most
int(baseline.get("http_conc", http_conc)) (or int(http_conc) if baseline key
absent). Locate the assertion block using last_metrics and baseline in
test_probe_cycle.py and insert the check referencing "peak_concurrent" and
"http_conc" to fail when concurrency behavior drifts.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f3862261-99cc-4ece-83f7-3a992b564f59

📥 Commits

Reviewing files that changed from the base of the PR and between 6f800c6 and 33dec6d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (22)
  • .github/dependabot.yml
  • .github/workflows/cd.yml
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • Dockerfile
  • benchmarks/baseline.json
  • benchmarks/conftest.py
  • benchmarks/test_probe_cycle.py
  • docs/probe-performance.md
  • pyproject.toml
  • run
  • src/paperscout/__main__.py
  • src/paperscout/config.py
  • src/paperscout/errors.py
  • src/paperscout/health.py
  • src/paperscout/models.py
  • src/paperscout/monitor.py
  • src/paperscout/storage.py
  • tests/conftest.py
  • tests/test_config.py
  • tests/test_models.py
  • tests/test_monitor.py

Comment thread benchmarks/conftest.py
Comment thread CONTRIBUTING.md
Comment thread Dockerfile
Comment thread docs/probe-performance.md Outdated
Comment thread src/paperscout/__main__.py Outdated
@wpak-ai wpak-ai merged commit 85679e0 into develop May 18, 2026
9 checks passed
@wpak-ai wpak-ai deleted the bugfix/config-benchmark-lockfile-types branch May 18, 2026 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants