⚠️ Review failed: the pull request was closed or merged during review.

📝 Walkthrough

Adds a performance benchmarking suite: a deterministic data generator and seeder, pytest session fixtures for multiple dataset scales, parameterized pytest-benchmark tests, a bench JSON→CSV converter with a CLI, Makefile targets for running and reporting benchmarks, and dev tooling/config updates.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Runner as Test Runner
    participant Fixture as pytest fixture
    participant Generator as Data Generator
    participant DB as Database
    participant Engine as SlayerQueryEngine
    note over Runner,Engine: benchmark setup and execution
    Runner->>Fixture: request env_<scale>
    Fixture->>Generator: generate_dataset(order_count)
    activate Generator
    Generator-->>Fixture: Dataset
    deactivate Generator
    Fixture->>DB: create/connect DB and seed_database(dataset)
    activate DB
    DB-->>Fixture: DB ready (schema, rows, INDEXES applied)
    deactivate DB
    Fixture->>Engine: init (YAMLStorage + DB)
    Fixture-->>Runner: return (Engine, Dataset)
    Runner->>Engine: execute SlayerQuery (benchmarked)
    activate Engine
    Engine->>DB: run SQL
    DB-->>Engine: rows
    Engine-->>Runner: SlayerResponse (timing)
    deactivate Engine
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/perf/seed.py (2)
408-459: Consider keyword-only parameters and moving the import to the top of the file.

Two minor suggestions:
- Per coding guidelines, functions with more than one parameter should use keyword arguments.
- The `sqlalchemy` import could be moved to the top of the file for consistency.

♻️ Optional refactor

Move the import to the top:

```diff
+import sqlalchemy as sa
 from dataclasses import dataclass
 from datetime import datetime, timedelta
```

Use keyword-only parameters:

```diff
-def seed_database(engine, dataset: Dataset) -> None:
+def seed_database(*, engine, dataset: Dataset) -> None:
```

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter" and "Place imports at the top of files".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/perf/seed.py` around lines 408 - 459, The function seed_database currently takes positional parameters (engine, dataset) and imports sqlalchemy inside the function; change the signature to require keyword-only parameters (def seed_database(*, engine, dataset): or def seed_database(engine, *, dataset):) so callers must pass by name, move the "import sqlalchemy as sa" to the top-level imports, and update any call sites to pass engine=... and dataset=...; ensure the function still references sa in its body (e.g., sa.text) after the import move.
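In isolation, the keyword-only signature behaves as follows (a minimal sketch; the `Dataset` stand-in here is hypothetical, not the real model from `tests/perf/seed.py`):

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Stand-in for the real Dataset model in tests/perf/seed.py."""
    orders: list = field(default_factory=list)


def seed_database(*, engine, dataset: Dataset) -> None:
    """All parameters after the bare `*` must be passed by keyword."""
    # The real body would open a connection on `engine` and run INSERTs;
    # here we only demonstrate the signature.
    _ = (engine, dataset)


# Call sites now have to name every argument:
seed_database(engine=object(), dataset=Dataset())
```

A positional call such as `seed_database(engine_obj, ds)` now raises `TypeError`, which is exactly the point: the caller can no longer swap the arguments silently.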
140-154: Consider using keyword-only arguments for the public API.

Per coding guidelines, functions with more than one parameter should use keyword arguments. The `generate_dataset` function signature could be clearer with keyword-only parameters.

♻️ Optional refactor

```diff
 def generate_dataset(
-    order_count: int,
-    start_date: str = "2023-01-01",
-    end_date: str = "2024-12-31",
-    seed: int = 42,
+    *,
+    order_count: int,
+    start_date: str = "2023-01-01",
+    end_date: str = "2024-12-31",
+    seed: int = 42,
 ) -> Dataset:
```

This would require updating the call site in `conftest.py`:

```python
dataset = generate_dataset(order_count=order_count)
```

As per coding guidelines: "Use keyword arguments for functions with more than 1 parameter".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/perf/seed.py` around lines 140 - 154, The public function generate_dataset currently accepts multiple positional parameters; change its signature to enforce keyword-only arguments (e.g., def generate_dataset(order_count: int, *, start_date: str = "2023-01-01", end_date: str = "2024-12-31", seed: int = 42) -> Dataset) so callers must pass start_date, end_date, and seed by name; update any call sites (such as the test conftest.py that invokes generate_dataset) to use keyword invocation like generate_dataset(order_count=order_count) and leave internal usage of _prng unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/perf/bench_to_csv.py`:
- Around line 68-69: The code reads the output path via
sys.argv[sys.argv.index("-o") + 1] which will raise IndexError if "-o" is the
last arg; update the logic around out_path to first find the index =
sys.argv.index("-o") and confirm index + 1 < len(sys.argv) (or use a try/except)
before accessing the next element, and if missing, print a clear error/usage
message or exit; modify the block that sets out_path to validate presence of the
following value rather than directly indexing into sys.argv.
---
Nitpick comments:
In `@tests/perf/seed.py`:
- Around line 408-459: The function seed_database currently takes positional
parameters (engine, dataset) and imports sqlalchemy inside the function; change
the signature to require keyword-only parameters (def seed_database(*, engine,
dataset): or def seed_database(engine, *, dataset):) so callers must pass by
name, move the "import sqlalchemy as sa" to the top-level imports, and update
any call sites to pass engine=... and dataset=...; ensure the function still
references sa in its body (e.g., sa.text) after the import move.
- Around line 140-154: The public function generate_dataset currently accepts
multiple positional parameters; change its signature to enforce keyword-only
arguments (e.g., def generate_dataset(order_count: int, *, start_date: str =
"2023-01-01", end_date: str = "2024-12-31", seed: int = 42) -> Dataset) so
callers must pass start_date, end_date, and seed by name; update any call sites
(such as the test conftest.py that invokes generate_dataset) to use keyword
invocation like generate_dataset(order_count=order_count) and leave internal
usage of _prng unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bc476819-2dae-4fe8-b2bb-8b063f9b8ec3
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (7)
- .gitignore
- pyproject.toml
- tests/perf/__init__.py
- tests/perf/bench_to_csv.py
- tests/perf/conftest.py
- tests/perf/seed.py
- tests/perf/test_bench.py
Actionable comments posted: 7
🧹 Nitpick comments (2)
Makefile (1)
5-5: Consider adding `all` and `clean` phony targets.

`checkmake` is already flagging this file for missing standard entrypoints. Thin aliases are enough if you want to keep the Makefile minimal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` at line 5, Add the standard phony targets "all" and "clean" to satisfy checkmake: declare them alongside the existing .PHONY line and implement thin aliases so "all" depends on the primary validation targets (e.g., test lint) and "clean" removes build/test artifacts (or delegates to existing cleanup commands) using the Makefile's existing targets and conventions (refer to .PHONY and the test, lint, bench, bench-report, bench-csv target names to locate where to wire these aliases).

tests/perf/seed.py (1)
92-133: Use Pydantic models for the benchmark entities.

`Region`, `Shop`, `Customer`, `Order`, and `Dataset` are the module's canonical models, but they're plain dataclasses. That diverges from the repo standard and drops the validation/serialization layer right where bad seed data is hardest to debug.

As per coding guidelines, "Use Python 3.11+ with Pydantic v2 for all models".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/perf/seed.py` around lines 92 - 133, Replace the plain dataclasses with Pydantic v2 models: convert Region, Shop, Customer, Order, and Dataset to inherit from pydantic.BaseModel (import from pydantic) and use typing annotations (e.g., int, str, datetime, list[int], Optional[datetime]) and pydantic.Field where defaults/metadata are needed; ensure Order fields completed_at and cancelled_at are Optional[datetime], keep avg_cost as int (cents) and avg_frequency/size as int, and make Dataset reference lists of the Pydantic model classes (list[Region], list[Shop], etc.) so validation/serialization are automatic. Also add any lightweight validators (if needed) on Shop.avg_cost or Order.cost to enforce non-negative values using model validators in the corresponding model names (Region, Shop, Customer, Order, Dataset).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Makefile`:
- Around line 1-3: The comment and BENCH_FLAGS disagree: the comment says
minimum 10 rounds but BENCH_FLAGS sets --benchmark-min-rounds=5; pick one and
make them consistent by either updating BENCH_FLAGS to use
--benchmark-min-rounds=10 or changing the comment to state 5 rounds; edit the
Makefile, update the BENCH_FLAGS definition (the variable name BENCH_FLAGS and
its --benchmark-min-rounds flag) or the top-line comment to match the chosen
value so documentation and flags align.
In `@tests/perf/conftest.py`:
- Around line 126-132: The current loop in db_engine.connect() that executes
INDEXES swallows all exceptions and can let benchmarks run on the wrong schema;
update the block around conn.execute(sa.text(idx_sql)) to capture failures
(collect failing idx_sql values and their exception messages), log them via the
test logger or process logger including the dialect (db_engine.dialect.name) and
then fail fast by raising an exception after the loop (or re-raising
immediately) so the test run aborts for that dialect instead of proceeding
without the intended indexes; reference the INDEXES constant, the
conn.execute(...) call, and the surrounding with db_engine.connect() /
conn.commit() logic when making the change.
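The collect-then-fail-fast pattern described above can be sketched like this (the `INDEXES` DDL strings and the bare `execute` callable are hypothetical stand-ins for the SQLAlchemy connection used in conftest.py):

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("bench.seed")

# Hypothetical DDL statements standing in for the real INDEXES constant.
INDEXES = [
    "CREATE INDEX idx_orders_created ON orders (created_at)",
    "CREATE INDEX idx_orders_shop ON orders (shop_id)",
]


def apply_indexes(execute, dialect: str) -> None:
    """Run every DDL statement, collect failures, then abort if any failed."""
    failures: list[tuple[str, str]] = []
    for idx_sql in INDEXES:
        try:
            execute(idx_sql)
        except Exception as exc:  # collect everything, re-raise below
            failures.append((idx_sql, str(exc)))
    if failures:
        for sql, msg in failures:
            log.error("[%s] index creation failed: %s (%s)", dialect, sql, msg)
        raise RuntimeError(
            f"{len(failures)} index(es) failed on dialect {dialect!r}"
        )
```

The key difference from a bare `except: pass` loop is that every failing statement is logged with its dialect, and the run still aborts, so a benchmark can never silently execute against an unindexed schema.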
- Around line 112-117: The current DB_BACKEND=="url" branch calls
seed_database(engine=db_engine, dataset=dataset, clean=True) unconditionally
which can destructively reset any external DB; change this to require an
explicit opt-in before running destructive cleanup: add a check for a dedicated
benchmark marker or a boolean env/param like BENCH_DB_RESET (or require the
DB_URL to reference a dedicated schema/database name such as containing
"bench"), and only call seed_database when that opt-in is present; otherwise
raise a clear ValueError requiring BENCH_DB_RESET or a bench-specific DB_URL.
Update the logic around DB_BACKEND, DB_URL, DB_TYPE, seed_database and
DatasourceConfig so the DatasourceConfig is still created for read-only runs but
destructive seed_database(clean=True) is gated behind the explicit opt-in.
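One minimal sketch of the opt-in gate, assuming the env-variable spelling `BENCH_DB_RESET` and the "bench"-in-the-URL convention suggested above (both names are proposals, not existing code):

```python
import os


def ensure_reset_allowed(db_url: str) -> None:
    """Refuse destructive seeding of an external DB without explicit opt-in."""
    opted_in = os.environ.get("BENCH_DB_RESET") == "1"
    bench_db = "bench" in db_url  # dedicated benchmark database naming
    if not (opted_in or bench_db):
        raise ValueError(
            "refusing to reset external database: set BENCH_DB_RESET=1 "
            "or point DB_URL at a dedicated bench database"
        )
```

The fixture would call this just before `seed_database(..., clean=True)`; read-only runs that only build a `DatasourceConfig` never hit the check.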
- Around line 156-162: The fixtures created inside the loop are registered with
the inner function name "_fixture" so pytest can't find names like "env_1k";
update the decorator call in _make_fixture so the pytest fixture uses the
desired name by passing name=f"env_{_name}" to `@pytest.fixture` (i.e., change the
decorator on _fixture inside _make_fixture to `@pytest.fixture`(scope="session",
name=f"env_{_name}")), ensuring SCALES, _make_fixture and the returned fixtures
match the names requested by request.getfixturevalue(f"env_{scale_name}").
In `@tests/perf/seed.py`:
- Around line 140-145: In generate_dataset, validate inputs up front: ensure
order_count is non-negative (raise ValueError if order_count < 0) and ensure the
date window is valid (parse start_date and end_date and raise ValueError if
start_date > end_date or if parsing fails) so day_cumsum/total_day_weight cannot
be empty/zero; also apply the same precondition checks to the similar generator
at lines 160-162. Use clear ValueError messages referencing the invalid
parameter (order_count or start_date/end_date) so invalid perf params fail fast.
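A sketch of that fail-fast validation as a standalone helper (the real checks would live inside `generate_dataset` itself):

```python
from datetime import datetime


def validate_dataset_params(*, order_count: int, start_date: str, end_date: str) -> None:
    """Reject invalid perf parameters before any data generation starts."""
    if order_count < 0:
        raise ValueError(f"order_count must be non-negative, got {order_count}")
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d")
        end = datetime.strptime(end_date, "%Y-%m-%d")
    except ValueError as exc:
        raise ValueError(
            f"invalid date window: {start_date!r}..{end_date!r}"
        ) from exc
    if start > end:
        raise ValueError(f"start_date {start_date!r} is after end_date {end_date!r}")
```

Raising here keeps `day_cumsum`/`total_day_weight` from ever being computed over an empty or inverted date window.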
- Around line 313-315: The current remap assigns late-opening orders to an
arbitrary early shop, breaking the customer→shop affinity; change the branch
that handles (shop_id > late_shop_threshold and created_at < mid_date) to
instead re-roll the shop_id by sampling from that customer’s eligible early
shops (i.e., filter the customer’s weighted shop list to ids <=
late_shop_threshold and sample using the same weights) and assign that sampled
id to shop_id; if the customer has no eligible early shops, shift created_at
forward to mid_date (or the customer’s first available shop open date) so the
original late shop assignment remains valid. Ensure you reference and use the
same customer shop list and weighting logic used elsewhere in seed.py when
re-rolling to preserve the affinity distribution.
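The affinity-preserving re-roll can be sketched as below (the `(shop_id, weight)` list shape is an assumption about how seed.py stores customer affinities):

```python
import random


def reroll_early_shop(*, rng: random.Random, shop_weights, late_threshold: int):
    """Re-sample a shop id from the customer's early-opening shops.

    `shop_weights` is the customer's (shop_id, weight) affinity list; only
    ids at or below `late_threshold` are eligible. Returns None when the
    customer has no eligible early shop, in which case the caller should
    shift created_at forward instead of remapping the shop.
    """
    early = [(sid, w) for sid, w in shop_weights if sid <= late_threshold]
    if not early:
        return None
    ids, weights = zip(*early)
    return rng.choices(ids, weights=weights, k=1)[0]
```

Because the sample reuses the customer's own weights, the per-customer shop distribution stays intact instead of collapsing onto one arbitrary early shop.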
- Around line 463-475: The insert currently stringifies timestamps with
.isoformat() in the batch comprehension inside conn.execute (the INSERT INTO
orders using sa.text); change the batch payload so the "created_at",
"completed_at", and "cancelled_at" values are the native datetime objects (use
o.created_at, o.completed_at or None, o.cancelled_at or None) instead of calling
.isoformat(), so SQLAlchemy/DBAPI can bind them as TIMESTAMPs with the correct
dialect-specific processing.
---
Nitpick comments:
In `@Makefile`:
- Line 5: Add the standard phony targets "all" and "clean" to satisfy checkmake:
declare them alongside the existing .PHONY line and implement thin aliases so
"all" depends on the primary validation targets (e.g., test lint) and "clean"
removes build/test artifacts (or delegates to existing cleanup commands) using
the Makefile's existing targets and conventions (refer to .PHONY and the test,
lint, bench, bench-report, bench-csv target names to locate where to wire these
aliases).
In `@tests/perf/seed.py`:
- Around line 92-133: Replace the plain dataclasses with Pydantic v2 models:
convert Region, Shop, Customer, Order, and Dataset to inherit from
pydantic.BaseModel (import from pydantic) and use typing annotations (e.g., int,
str, datetime, list[int], Optional[datetime]) and pydantic.Field where
defaults/metadata are needed; ensure Order fields completed_at and cancelled_at
are Optional[datetime], keep avg_cost as int (cents) and avg_frequency/size as
int, and make Dataset reference lists of the Pydantic model classes
(list[Region], list[Shop], etc.) so validation/serialization are automatic. Also
add any lightweight validators (if needed) on Shop.avg_cost or Order.cost to
enforce non-negative values using model validators in the corresponding model
names (Region, Shop, Customer, Order, Dataset).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a69d49c7-bdf6-4112-bcf9-d44f9528cd0d
📒 Files selected for processing (7)
- .gitignore
- Makefile
- tests/perf/bench_to_csv.py
- tests/perf/conftest.py
- tests/perf/params.py
- tests/perf/seed.py
- tests/perf/test_bench.py
✅ Files skipped from review due to trivial changes (2)
- .gitignore
- tests/perf/params.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/perf/bench_to_csv.py
- tests/perf/test_bench.py
Designed and implemented:
Summary by CodeRabbit

- Tests
- Chores