test: distributed_executor.cpp branch coverage by poyrazK · Pull Request #67 · poyrazK/cloudSQL

poyrazK · 2026-04-25T16:25:21Z

Summary

Added DistributedExecutorWithNodesTests fixture for in-process testing of distributed execution paths without external cluster
47 tests covering shard routing, RPC failures, DDL forwarding, 2PC commit/rollback, shuffle joins (INNER/LEFT/RIGHT/FULL), UPDATE/DELETE with sharding key, global aggregates (COUNT/SUM/MIN/MAX), ORDER BY, LIMIT/OFFSET
Branch coverage: 0% → 35.16% (443/1260 branches taken at least once)

Test plan

cmake --build build --target distributed_executor_tests && ./build/distributed_executor_tests — 47 tests pass
gcov -b build/CMakeFiles/sqlEngineCore.dir/src/distributed/distributed_executor.cpp.gcda — 59.53% lines, 35.16% taken branches

Disabled tests (known async timing issues)

DISABLED_FullJoinPhase3_5: hangs in async Phase 3-5 unmatched rows collection. Root cause is async std::async + RpcClient::call() hang - appears to be a subtle issue in Phase 3-5 coordinator/data node coordination requiring further investigation.

Changes from initial PR

InsertRpcFailure and CommitPrepareFailure were re-enabled (fixed handler setup order - set handlers before pushing unique_ptr into servers_)
Extracted set_success_reply_handler helper to reduce repetitive RPC handler boilerplate

Added DistributedExecutorWithNodesTests fixture that registers mock data nodes in ClusterManager and uses RpcServer to test distributed execution paths: - SELECT with nodes but no server (RPC connect failure) - SELECT with shard routing (get_leader + compute_shard paths) - DDL forwarding to data nodes (lines 156-163) - INSERT shard routing (lines 454-519) - COMMIT with 2PC prepare/commit (lines 388-449) - ROLLBACK with nodes (lines 366-386) - SELECT broadcast without sharding key (line 565) Before: 0% branch coverage on distributed_executor.cpp After: 21.75% branch coverage (40.00% branches taken)

coderabbitai · 2026-04-25T16:25:27Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds extensive new tests: a DistributedExecutorWithNodesTests fixture that can register mock data nodes and run real RpcServer instances with installed handlers, plus ~150 integration/unit tests covering distributed execution, transactions, joins, aggregates, DDL, and many SQL features.

Changes

Cohort / File(s)	Summary
Distributed executor & integration tests `tests/distributed_executor_tests.cpp`	Adds `DistributedExecutorWithNodesTests` fixture, utilities to register mock data nodes and start/stop `network::RpcServer`, install RPC handlers, and emit serialized `QueryResultsReply` and transaction replies. Implements many integration-style tests for RPC connect-failures, shard-routed SELECTs, DDL forwarding, sharded INSERT success/failure, 2PC Txn flows (prepare/commit/abort), join shuffle/bloom/execute sequences (INNER/LEFT/RIGHT; FULL disabled), aggregates, ORDER/LIMIT/OFFSET, and several disabled negative tests.
Query executor unit tests `tests/query_executor_tests.cpp`	Appends numerous `TEST_F(QueryExecutorTests, ...)` cases exercising SQL caching, prepared statements for INSERT/SELECT, index creation/drop and maintenance (UPDATE/DELETE), transaction edge-cases (commit/rollback/no-begin, nested begins), broad SQL feature smoke tests (aggregates, GROUP BY/HAVING, predicates, subqueries, ORDER/LIMIT/OFFSET), DDL/DML via string API, parse/plan error cases, and a metadata validation test for index column positions.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Test as Executor (test)
    participant Rpc as RpcServer
    participant Handler as RPC Handler
    participant Node as Mock Data Node
    rect rgba(200,230,255,0.5)
    Test->>Rpc: register node metadata / connect
    Rpc-->>Test: accept or refuse connection
    Test->>Rpc: send ExecuteFragment or Txn RPC
    Rpc->>Handler: invoke registered handler
    Handler->>Node: request QueryResultsReply or txn result
    Node-->>Handler: return serialized response
    Handler-->>Rpc: send serialized reply
    Rpc-->>Test: deliver reply (success or error)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

test: add query_executor unit tests #26: Adds extensive QueryExecutor unit tests and modifies tests/query_executor_tests.cpp — strong overlap with appended query tests.
test: add distributed_executor and ShardManager unit tests #27: Adds DistributedExecutor/ShardManager tests and edits tests/distributed_executor_tests.cpp — closely related to the new fixture and integration tests.
Phase 7: Replication & High Availability (v2) #9: Changes to RpcServer implementation (worker/accept/stop behavior) — relevant to spinning up real network::RpcServer instances in these tests.

Suggested labels

enhancement

Poem

🐰 I spun up tiny servers, ears alight,
Hopped through shards and joins into the night,
Replies serialized, transactions in line,
Tests hop, assert, and pass — a carrot-fueled sign,
🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 74.47% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'test: distributed_executor.cpp branch coverage' is directly related to the main purpose of the PR, which adds comprehensive test coverage for distributed_executor.cpp and reports 35.16% branch coverage improvement.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/dist-exec-coverage

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/distributed_executor_tests.cpp (1)

532-557: Extract the inline Txn* reply lambdas into a helper.

CommitWithNodes (lines 532-557) and RollbackWithNodes (lines 577-590) repeat the same boilerplate as set_execute_fragment_handler — only the RpcType differs. The duplication invites drift (e.g., updating header layout in one place but not others). Add a small helper alongside set_execute_fragment_handler:

♻️ Generic success-reply handler

+    // Set up a handler that returns successful QueryResultsReply for any RpcType
+    void set_success_reply_handler(network::RpcServer& srv, network::RpcType in_type,
+                                   network::RpcType reply_type) {
+        srv.set_handler(in_type,
+                        [reply_type](const network::RpcHeader&,
+                                     const std::vector<uint8_t>&, int fd) {
+                            network::QueryResultsReply reply;
+                            reply.success = true;
+                            auto data = reply.serialize();
+                            network::RpcHeader resp_h;
+                            resp_h.type = reply_type;
+                            resp_h.payload_len = static_cast<uint16_t>(data.size());
+                            char h_buf[network::RpcHeader::HEADER_SIZE];
+                            resp_h.encode(h_buf);
+                            send(fd, h_buf, network::RpcHeader::HEADER_SIZE, 0);
+                            if (!data.empty()) send(fd, data.data(), data.size(), 0);
+                        });
+    }

Then call set_success_reply_handler(*servers_[0], network::RpcType::TxnPrepare, network::RpcType::TxnPrepare); etc. As a side benefit you avoid the double reply.serialize() call that exists in every current handler.

Also applies to: 577-590

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/distributed_executor_tests.cpp` around lines 532 - 557, The two inline
lambdas used in CommitWithNodes/RollbackWithNodes that call
servers_[0]->set_handler for network::RpcType::TxnPrepare and TxnCommit should
be refactored into a shared helper (similar to set_execute_fragment_handler)
that takes a Server reference and an RpcType and sends a generic successful
QueryResultsReply; update CommitWithNodes and RollbackWithNodes to call that
helper instead of duplicating logic; ensure the helper builds the reply once
(avoid calling reply.serialize() twice), sets resp_h.type from the passed
RpcType, encodes the header, sends header+payload to fd, and reuse the same
helper for both TxnPrepare and TxnCommit handlers.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/distributed_executor_tests.cpp`:
- Line 368: The fixture defines a static next_port_ but never uses it; either
remove next_port_ or use it consistently to allocate unique ports for tests like
SelectWithNodesButNoServer and SelectWithShardRouting instead of hard-coding
6411/6412. To fix, replace hard-coded port literals in those tests with an
allocation from next_port_ (e.g., take next_port_++ or a GetNextPort() helper
that increments next_port_), or delete the next_port_ member and update
comments; ensure RpcServer behavior (SO_REUSEADDR) remains unchanged and update
all references (SelectWithNodesButNoServer, SelectWithShardRouting and other
tests noted) to avoid duplicate port assignments.
- Around line 481-498: The test
DistributedExecutorWithNodesTests::DDLForwardsToNodes currently never verifies
the handler was called; modify the set_execute_fragment_handler(...) usage to
capture an invocation counter (e.g., atomic<int> or std::atomic<size_t>) inside
the ExecuteFragment handler lambda registered on servers_[0], increment it when
the handler runs, then after auto res = exec_->execute(...); add an assertion
that the counter was incremented (e.g., EXPECT_EQ(counter.load(), 1)) to prove
forwarding occurred; keep the existing EXPECT_TRUE(res.success()) as well.

---

Nitpick comments:
In `@tests/distributed_executor_tests.cpp`:
- Around line 532-557: The two inline lambdas used in
CommitWithNodes/RollbackWithNodes that call servers_[0]->set_handler for
network::RpcType::TxnPrepare and TxnCommit should be refactored into a shared
helper (similar to set_execute_fragment_handler) that takes a Server reference
and an RpcType and sends a generic successful QueryResultsReply; update
CommitWithNodes and RollbackWithNodes to call that helper instead of duplicating
logic; ensure the helper builds the reply once (avoid calling reply.serialize()
twice), sets resp_h.type from the passed RpcType, encodes the header, sends
header+payload to fd, and reuse the same helper for both TxnPrepare and
TxnCommit handlers.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5c61aec-b421-468a-abb5-f95fd0adb469

📥 Commits

Reviewing files that changed from the base of the PR and between 109c3b9 and 720eb89.

📒 Files selected for processing (1)

tests/distributed_executor_tests.cpp

coderabbitai · 2026-04-25T16:29:24Z

+        config_.mode = config::RunMode::Coordinator;
+        cm_ = std::make_unique<ClusterManager>(&config_);
+        exec_ = std::make_unique<DistributedExecutor>(*catalog_, *cm_);
+        next_port_ = 6410;  // Start from different base to avoid rpc_client_tests


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Confirm RpcServer enables SO_REUSEADDR; otherwise port reuse between tests will flake. fd -t f 'rpc_server\.(c|cc|cpp|cxx)$' | xargs rg -nP -C3 'SO_REUSEADDR|setsockopt'

Repository: poyrazK/cloudSQL

Length of output: 261

Unused next_port_ suggests incomplete implementation; remove or use it consistently.

The static next_port_ member is initialized but never read by any test — each test hard-codes its own port. Additionally, SelectWithNodesButNoServer (lines 437-438) and SelectWithShardRouting (lines 454-463) both use ports 6411/6412, duplicating port assignments within the same fixture. While RpcServer properly enables SO_REUSEADDR (verified in rpc_server.cpp), hard-coding identical ports is still poor practice and makes it unclear whether port allocation is intentional or accidental.

Either remove the dead next_port_ member entirely, or use it to allocate unique ports per test instead of hard-coding them. This improves maintainability and makes test setup intentions explicit.

Also applies to: 423-423, 431-431

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/distributed_executor_tests.cpp` at line 368, The fixture defines a static next_port_ but never uses it; either remove next_port_ or use it consistently to allocate unique ports for tests like SelectWithNodesButNoServer and SelectWithShardRouting instead of hard-coding 6411/6412. To fix, replace hard-coded port literals in those tests with an allocation from next_port_ (e.g., take next_port_++ or a GetNextPort() helper that increments next_port_), or delete the next_port_ member and update comments; ensure RpcServer behavior (SO_REUSEADDR) remains unchanged and update all references (SelectWithNodesButNoServer, SelectWithShardRouting and other tests noted) to avoid duplicate port assignments.

coderabbitai · 2026-04-25T16:29:25Z

+TEST_F(DistributedExecutorWithNodesTests, DDLForwardsToNodes) {
+    // Start RpcServer for node1
+    auto srv1 = std::make_unique<network::RpcServer>(6413);
+    srv1->start();
+    servers_.push_back(std::move(srv1));
+
+    cm_->register_node("node_1", "127.0.0.1", 6413, config::RunMode::Data);
+    set_execute_fragment_handler(*servers_[0], true);
+
+    // DDL: CREATE TABLE - should succeed locally and try to forward to nodes
+    auto lexer = std::make_unique<Lexer>("CREATE TABLE ddl_test (id INT, name TEXT)");
+    Parser parser(std::move(lexer));
+    auto stmt = parser.parse_statement();
+    ASSERT_NE(stmt, nullptr);
+
+    auto res = exec_->execute(*stmt, "CREATE TABLE ddl_test (id INT, name TEXT)");
+    EXPECT_TRUE(res.success());  // DDL always succeeds (local catalog)
+}


⚠️ Potential issue | 🟡 Minor

DDLForwardsToNodes does not actually verify forwarding.

The only assertion (EXPECT_TRUE(res.success())) already passes for DDL with zero nodes (see ExecuteDDLWithoutNodes at lines 140-148). Registering a node and installing an ExecuteFragment handler here does not prove the executor invoked it — the test would still pass if forwarding were silently broken. Capture an invocation counter in the handler lambda and assert it after execute(...).

♻️ Make the test observe the forwarding

- cm_->register_node("node_1", "127.0.0.1", 6413, config::RunMode::Data); - set_execute_fragment_handler(*servers_[0], true); + cm_->register_node("node_1", "127.0.0.1", 6413, config::RunMode::Data); + std::atomic<int> calls{0}; + servers_[0]->set_handler(network::RpcType::ExecuteFragment, + [&calls, this](const network::RpcHeader& h, + const std::vector<uint8_t>& p, int fd) { + calls.fetch_add(1); + // delegate to the standard success reply + set_execute_fragment_handler(*servers_[0], true); + // ...or inline a success reply here + }); @@ - EXPECT_TRUE(res.success()); // DDL always succeeds (local catalog) + EXPECT_TRUE(res.success()); + // Wait briefly for the async forward, then assert the handler was hit. + EXPECT_GT(calls.load(), 0);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/distributed_executor_tests.cpp` around lines 481 - 498, The test DistributedExecutorWithNodesTests::DDLForwardsToNodes currently never verifies the handler was called; modify the set_execute_fragment_handler(...) usage to capture an invocation counter (e.g., atomic<int> or std::atomic<size_t>) inside the ExecuteFragment handler lambda registered on servers_[0], increment it when the handler runs, then after auto res = exec_->execute(...); add an assertion that the counter was incremented (e.g., EXPECT_EQ(counter.load(), 1)) to prove forwarding occurred; keep the existing EXPECT_TRUE(res.success()) as well.

Added 8 new tests to DistributedExecutorWithNodesTests: - InnerJoinShuffle, LeftJoinShuffle, RightJoinShuffle: exercise Phase 1/2 ShuffleFragment, bloom filter aggregation/broadcast - SelectWithCountAggregate, SelectWithSumAggregate: global aggregates - SelectWithOrderBy: global sort with key resolution - SelectWithLimitOffset: result truncation - DISABLED_FullJoinPhase3_5, DISABLED_CommitPrepareFailure: hang due to async thread reply timing Branch coverage: 21.75% → 32.94% taken at least once

The TxnPrepare client.call() hangs even with reply handler set. Root cause: RpcServer handler runs in worker thread and send() completes but RpcClient.call() may not receive it due to socket buffering or timing. DISABLED for now.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/distributed_executor_tests.cpp (1)
540-616: Extract a shared helper for the transaction RPC handlers.

The TxnPrepare / TxnCommit reply blocks (lines 548–573), the TxnAbort handler (lines 593–606), and the TxnPrepare-failure handler in the disabled test (lines 869–882) all repeat the same ~10-line "build QueryResultsReply, encode header, send" boilerplate. A small helper on the fixture (e.g., set_simple_reply_handler(server, rpc_type, success, error_msg)) would substantially reduce duplication and bring these tests in line with the existing set_execute_fragment_handler pattern. This also reduces the surface area where the payload_len cast issue noted above must be fixed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/distributed_executor_tests.cpp` around lines 540 - 616, Extract a small
helper on the test fixture (e.g., set_simple_reply_handler(server, rpc_type,
bool success, const std::string& err="")) that builds a
network::QueryResultsReply, serializes it, encodes a network::RpcHeader and
sends header+payload via the server's set_handler callback; replace the
duplicate lambdas for RpcType::TxnPrepare, RpcType::TxnCommit and
RpcType::TxnAbort with calls to this helper (use the same server object passed
into the helper instead of servers_[0]). Ensure you compute auto data =
reply.serialize() first and then set resp_h.payload_len =
static_cast<uint16_t>(data.size()) before resp_h.encode(...) to fix the
payload_len cast ordering issue.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/distributed_executor_tests.cpp`:
- Around line 470-475: RpcServer::start() calls are unchecked and can silently
fail; change every srv->start() invocation (e.g., the srv1/srv2 startup in
distributed_executor_tests.cpp) to assert the return value (e.g.,
ASSERT_TRUE(srv->start())) or create a small helper like register_mock_node that
constructs the RpcServer, calls start(), asserts it returned true, and then
pushes the server into servers_; update all occurrences (all srv->start() calls)
to use that assertion or the helper so test setup fails noisily on bind/start
errors.
- Around line 395-437: The code blindly casts reply.serialize().size() to
RpcHeader::payload_len (uint16_t) and calls serialize() twice in
send_success_reply and the lambda in set_execute_fragment_handler, risking
silent truncation and wasted work; fix by serializing once into a local variable
(e.g., auto data = reply.serialize()), check data.size() <=
std::numeric_limits<uint16_t>::max(), and if it exceeds that limit handle it
(set reply.success=false with an explanatory error_msg or return/log an error)
before assigning resp_h.payload_len = static_cast<uint16_t>(data.size()) and
sending the header and data; apply the same change to send_success_reply and the
ExecuteFragment handler to avoid double-serialization and truncation.

---

Nitpick comments:
In `@tests/distributed_executor_tests.cpp`:
- Around line 540-616: Extract a small helper on the test fixture (e.g.,
set_simple_reply_handler(server, rpc_type, bool success, const std::string&
err="")) that builds a network::QueryResultsReply, serializes it, encodes a
network::RpcHeader and sends header+payload via the server's set_handler
callback; replace the duplicate lambdas for RpcType::TxnPrepare,
RpcType::TxnCommit and RpcType::TxnAbort with calls to this helper (use the same
server object passed into the helper instead of servers_[0]). Ensure you compute
auto data = reply.serialize() first and then set resp_h.payload_len =
static_cast<uint16_t>(data.size()) before resp_h.encode(...) to fix the
payload_len cast ordering issue.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9e04ba6f-e4d2-48e7-aa28-ae898def1002

📥 Commits

Reviewing files that changed from the base of the PR and between 720eb89 and abc9f21.

📒 Files selected for processing (1)

tests/distributed_executor_tests.cpp

coderabbitai · 2026-04-25T17:46:58Z

+    auto srv1 = std::make_unique<network::RpcServer>(6411);
+    auto srv2 = std::make_unique<network::RpcServer>(6412);
+    srv1->start();
+    srv2->start();
+    servers_.push_back(std::move(srv1));
+    servers_.push_back(std::move(srv2));


⚠️ Potential issue | 🟡 Minor

RpcServer::start() return value is unchecked across all tests.

Every srv->start() call (lines 472–473, 500, 521, 542, 588, 622, 643–644, 678–679, 713–714, 748–749, 790, 808, 826, 844, 863) ignores the boolean return. If a port is already bound (e.g., concurrent CI runs, or the duplicate 6411/6412 reuse already flagged), start() likely returns false and the test will silently exercise a connect-failure path while the assertion text claims it is exercising the success path — producing misleading green coverage. Consider ASSERT_TRUE(srv->start()) (or asserting via a register_mock_node helper that wraps it).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/distributed_executor_tests.cpp` around lines 470 - 475, RpcServer::start() calls are unchecked and can silently fail; change every srv->start() invocation (e.g., the srv1/srv2 startup in distributed_executor_tests.cpp) to assert the return value (e.g., ASSERT_TRUE(srv->start())) or create a small helper like register_mock_node that constructs the RpcServer, calls start(), asserts it returned true, and then pushes the server into servers_; update all occurrences (all srv->start() calls) to use that assertion or the helper so test setup fails noisily on bind/start errors.

New tests: - UpdateWithShardKey: exercises lines 535-536 where_expr for UPDATE - DeleteWithShardKey: exercises lines 537-538 where_expr for DELETE - SelectWithMinAggregate: MIN branch in global aggregate loop (line 879) - SelectWithMaxAggregate: MAX branch (line 882) Branch coverage: 33.41% → 35.16% (443/1260 taken)

These tests hang when using inline lambda handlers that return rows in std::async context with RpcServer worker threads. Reverting to simple handler and disabling the tests to keep suite passing. Co-authored-by: claude-code

These tests were disabled due to inline lambda crashes in std::async context. Now using set_execute_fragment_handler() which works correctly. Co-authored-by: claude-code

Added cm_->set_leader(2, "node_1") to trigger the leader routing path (lines 549-558) where get_leader() returns non-empty and found_leader branch is exercised. Co-authored-by: claude-code

Shard idx = hash("1") % 2 = 0, so group_id = shard_idx + 1 = 1. This triggers the leader routing path at lines 549-558. Co-authored-by: claude-code

Added 10 new tests: - SqlCachingSecondCall / SqlCachingDifferentQueries (lines 248-277) - PrepareInsertStatement / PrepareSelectStatement / PrepareWithNonexistentTable (lines 133-171) - UpdateWithIndexRebuild (lines 822-848) - DeleteWithIndex / DeleteNonexistentRow - CreateMultipleIndexes Coverage: 29.24% → 35.50% (+6.26%) Co-authored-by: claude-code

…cutor Added 16 new tests: - CommitWithoutTransaction / RollbackWithoutTransaction / BeginAfterActiveTransaction - CreateIndexOnNonexistentColumn / CreateIndexThenSelect - DropIndexByName - SelectWithCountStar / SelectWithSum / SelectWithAvg - SelectWithOrderByNew / SelectWithLimitNew / SelectWithOffsetNew Coverage: 35.50% → 36.93% (+1.43%) Co-authored-by: claude-code

… tests Added 9 new tests: - SelectWithIn, SelectWithLike - SelectWithIsNull, SelectWithIsNotNull - SelectWithGroupByHaving - SelectWithConcat, SelectWithAbs, SelectWithMaxMin - SelectWithSubquery Coverage: 36.93% → 37.64% (+0.71%) Co-authored-by: claude-code

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

tests/distributed_executor_tests.cpp (2)
371-382: TearDown removes files that no test creates.

./test_dist_exec/node1.db and ./test_dist_exec/node2.db are never produced by any test in this file — every test interacts with the executor purely through the mocked RpcServer handlers. The std::remove calls are harmless but misleading; consider removing them (or, if a future test does write under ./test_dist_exec/, prefer wiping the directory wholesale rather than enumerating individual filenames).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/distributed_executor_tests.cpp` around lines 371 - 382, The TearDown
method currently calls std::remove on "./test_dist_exec/node1.db" and
"./test_dist_exec/node2.db" even though no tests create those files; remove
these two std::remove calls from TearDown in the TearDown override to avoid
misleading cleanup, or if you want a future-proof solution replace those two
removes with a single directory wipe of "./test_dist_exec/" (e.g., remove all
files in that directory) so cleanup is correct and not hard-coded; locate the
method named TearDown in the same test fixture that references servers_, exec_,
cm_, and catalog_ to apply the change.
363-369: Reassigning the static next_port_ counter inside SetUp() defeats it.

next_port_ is a static atomic counter intended to hand out unique ports across tests, but SetUp() runs per-test and resets it to 6410 every time, so it can never advance. Combined with the fact that no test currently reads it (and ports 6411/6412 are still hard-coded in multiple tests), the member is effectively dead. Either remove the assignment and use next_port_.fetch_add(1) consistently in register_mock_node/each test, or drop the member entirely.
♻️ Suggested cleanup
     void SetUp() override {
         catalog_ = Catalog::create();
         config_.mode = config::RunMode::Coordinator;
         cm_ = std::make_unique<ClusterManager>(&config_);
         exec_ = std::make_unique<DistributedExecutor>(*catalog_, *cm_);
-        next_port_ = 6410;  // Start from different base to avoid rpc_client_tests
     }
And, if you keep next_port_, route register_mock_node and the inline tests through next_port_.fetch_add(1) instead of hard-coded literals.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/distributed_executor_tests.cpp` around lines 363 - 369, The SetUp()
method currently resets the static atomic counter next_port_ to 6410 each test
which prevents it from ever advancing; remove the assignment from SetUp() so
next_port_ remains a global allocator, then update register_mock_node and any
tests that currently use hard-coded ports (e.g., literals 6411/6412) to obtain
ports via next_port_.fetch_add(1) so each test/mocked node gets a unique port;
alternatively, if you prefer to drop the counter, remove the next_port_ static
member and replace all uses with explicit per-test constants to keep behavior
consistent.
tests/query_executor_tests.cpp (1)
559-927: Test isolation: many new tests leave stale .heap files behind.

TestEnvironment::~TestEnvironment() (lines 57–62) only removes test_table.heap, table_a.heap, and table_b.heap, but the new tests create dozens of additional tables — cache_test, prep_test, prep_select, null_plan_test, idx_update, idx_del, del_test, multi_idx, idx_err, idx_test, drop_idx_test, agg_test, sum_test, avg_test, order_test2, limit_test2, offset_test2, group_having, in_test, like_test, null_test, notnull_test, concat_test, math_test, maxmin_test, sub_test — all of which share the same ./test_data directory across runs. Because each test constructs a fresh in-memory Catalog but reuses the on-disk files, residual heap pages from a previous run can leak into a subsequent run and produce non-deterministic results (especially under coverage builds where the same target is run repeatedly).

Either (a) make TestEnvironment use a unique per-test directory (e.g., testing::UnitTest::GetInstance()->current_test_info()->name()), or (b) blow away ./test_data wholesale in the destructor instead of enumerating individual files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/query_executor_tests.cpp` around lines 559 - 927,
TestEnvironment::~TestEnvironment currently only removes a few hard-coded heap
files (test_table.heap, table_a.heap, table_b.heap) causing stale .heap files
from many newly added tests (e.g., cache_test, prep_test, idx_update, etc.) to
persist; update TestEnvironment to ensure test isolation by either (a) switching
TestEnvironment to create a unique per-test data directory (use
testing::UnitTest::GetInstance()->current_test_info()->name() or similar when
constructing TestEnvironment) so each test writes to its own folder, or (b)
modify TestEnvironment::~TestEnvironment to recursively remove/clean the entire
./test_data directory instead of enumerating specific filenames; locate the
destructor TestEnvironment::~TestEnvironment and implement one of these two
approaches so all .heap files created by tables like cache_test, prep_select,
null_plan_test, idx_update, etc. are removed between tests.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/distributed_executor_tests.cpp`:
- Around line 566-586: The test moves srv1 into servers_ and then calls
srv1->set_handler, causing a use-after-move on the moved-from unique_ptr; fix by
either calling srv1->set_handler(...) before pushing/moving srv1 into servers_
or by accessing the stored server object after the move (e.g.
servers_.back()->set_handler(...)); update the code around RpcServer srv1,
servers_, and the set_handler call so the handler is registered on a valid
RpcServer instance before registering the node and starting the test.
- Around line 1010-1036: In DISABLED_CommitPrepareFailure the test moves srv1
into servers_ with std::move( srv1 ) then calls srv1->set_handler(...), causing
use-after-move; fix by registering both Rpc handlers on srv1 before pushing it
into servers_ (i.e., call srv1->set_handler(...) for RpcType::TxnPrepare and
RpcType::TxnAbort prior to servers_.push_back(std::move(srv1))), or
alternatively after the move access the owned server via
servers_.back()->set_handler(...) instead of srv1; update the test in the
DISABLED_CommitPrepareFailure function to use one of these approaches so
set_handler is never called on a moved-from std::unique_ptr.

---

Nitpick comments:
In `@tests/distributed_executor_tests.cpp`:
- Around line 371-382: The TearDown method currently calls std::remove on
"./test_dist_exec/node1.db" and "./test_dist_exec/node2.db" even though no tests
create those files; remove these two std::remove calls from TearDown in the
TearDown override to avoid misleading cleanup, or if you want a future-proof
solution replace those two removes with a single directory wipe of
"./test_dist_exec/" (e.g., remove all files in that directory) so cleanup is
correct and not hard-coded; locate the method named TearDown in the same test
fixture that references servers_, exec_, cm_, and catalog_ to apply the change.
- Around line 363-369: The SetUp() method currently resets the static atomic
counter next_port_ to 6410 each test which prevents it from ever advancing;
remove the assignment from SetUp() so next_port_ remains a global allocator,
then update register_mock_node and any tests that currently use hard-coded ports
(e.g., literals 6411/6412) to obtain ports via next_port_.fetch_add(1) so each
test/mocked node gets a unique port; alternatively, if you prefer to drop the
counter, remove the next_port_ static member and replace all uses with explicit
per-test constants to keep behavior consistent.

In `@tests/query_executor_tests.cpp`:
- Around line 559-927: TestEnvironment::~TestEnvironment currently only removes
a few hard-coded heap files (test_table.heap, table_a.heap, table_b.heap)
causing stale .heap files from many newly added tests (e.g., cache_test,
prep_test, idx_update, etc.) to persist; update TestEnvironment to ensure test
isolation by either (a) switching TestEnvironment to create a unique per-test
data directory (use
testing::UnitTest::GetInstance()->current_test_info()->name() or similar when
constructing TestEnvironment) so each test writes to its own folder, or (b)
modify TestEnvironment::~TestEnvironment to recursively remove/clean the entire
./test_data directory instead of enumerating specific filenames; locate the
destructor TestEnvironment::~TestEnvironment and implement one of these two
approaches so all .heap files created by tables like cache_test, prep_select,
null_plan_test, idx_update, etc. are removed between tests.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a6d9dd0f-0a55-4c91-9fc9-4bdcf0ba7ab7

📥 Commits

Reviewing files that changed from the base of the PR and between abc9f21 and 59a2461.

📒 Files selected for processing (2)

tests/distributed_executor_tests.cpp
tests/query_executor_tests.cpp

…xecutor - Add StringExecute tests to cover SQL caching path (lines 248-277) - Add StringExecuteSelectMultiple, StringExecuteSelectOrderBy, StringExecuteSelectLimit - Add StringExecuteJoin, StringExecuteSelectGroupBy - Add CREATE INDEX, DROP INDEX, DROP TABLE via string execute - Add PreparedInsertWithParams, StringExecuteUpdate, StringExecuteDelete - Add CreateTableLocalOnlyMode to cover is_local_only_ path (line 440) - Add CreateCompositeIndexNotSupported to cover rejection (line 473) - Coverage: 37.64% → 39.43% (+1.79%) - Test count: 75 → 97 (+22)

…tests - Add UpdateIndexedColumn tests for UPDATE index rebuild paths (816-848) - Add SelectFromNonexistentTable for execute_select error path (398-399) - Add StringExecuteDDL for DDL via string execute path - Add RollbackWithoutBegin, CommitWithoutBegin, BeginTwice for transaction edge cases - Add SelectWithQualifiedColumn, SelectCountWithGroupBy, SelectMinMaxWithGroupBy - Test count: 97 → 108 (+11)

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

tests/query_executor_tests.cpp (2)
57-62: ⚠️ Potential issue | 🟠 Major

Clean up all generated test artifacts, not just three fixed files.

This fixture creates many different tables and indexes under ./test_data, but teardown only removes test_table.heap, table_a.heap, and table_b.heap. That leaves stale files behind and can make reruns order-dependent or flaky after a previous failed run.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/query_executor_tests.cpp` around lines 57 - 62, The TestEnvironment
destructor (~TestEnvironment) only removes three hard-coded files; change it to
remove all generated artifacts under "./test_data" instead: iterate the
"./test_data" directory (e.g., using std::filesystem::directory_iterator or
std::filesystem::remove_all) and delete all files matching test artifacts (or
remove the entire directory contents) while handling/ignoring non-existent
entries and logging/reporting errors as appropriate so reruns are deterministic
and no stale .heap/.idx or other files remain.
473-487: ⚠️ Potential issue | 🟠 Major

Use a separate transaction context for the isolation check.

exec2 shares the same TransactionManager instance as the writer, so this does not really model a second client. The read can still observe the same active transaction state, which makes the “read before commit” assertion unreliable. If you want to verify isolation, run the read from a distinct transaction context/thread.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/query_executor_tests.cpp` around lines 473 - 487, The test uses
QueryExecutor exec2 constructed with env.txn_manager so it does not model an
independent client; change the read to use a distinct transaction context by
constructing exec2 with a separate TransactionManager instance (instead of
env.txn_manager) or by starting an independent transaction on exec2 before the
UPDATE (e.g., call execute_sql(exec2, "BEGIN") to create its own transaction
context) and ensure the read SELECT runs in that separate transaction so it
cannot see the uncommitted UPDATE from the writer.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/query_executor_tests.cpp`:
- Around line 1012-1029: The two tests are currently not asserting failures: in
CreateCompositeIndexNotSupported capture the result of
env.executor.execute("CREATE INDEX comp_idx_ab ON comp_idx(a, b)") and assert
the expected rejection (e.g., use the result object's success()/error() accessor
to assert failure and check the error message), and in UnsupportedStatementType
either implement a real assertion that constructing/executing an unsupported
Statement triggers the error path (again assert via success()/error()) or remove
the empty TEST_F placeholder; locate these changes around the TestEnvironment
usage and the tests named CreateCompositeIndexNotSupported and
UnsupportedStatementType and update assertions to verify the error path rather
than discarding the result.
- Around line 952-963: The test StringExecuteWithCacheHit never triggers a cache
hit because the second execute call uses a different SQL literal; change the
test to reuse the exact same SQL string on both executions (e.g., store "INSERT
INTO cache_hit VALUES (1, 'a')" in a local const std::string sql and call
env.executor.execute(sql) twice) or alternatively replace the second call with
the identical SQL literal so the executor can hit the cache; update EXPECTs
accordingly (use TEST_F QueryExecutorTests, the env.executor.execute calls and
the initial execute_sql that creates the table to locate the change).

---

Outside diff comments:
In `@tests/query_executor_tests.cpp`:
- Around line 57-62: The TestEnvironment destructor (~TestEnvironment) only
removes three hard-coded files; change it to remove all generated artifacts
under "./test_data" instead: iterate the "./test_data" directory (e.g., using
std::filesystem::directory_iterator or std::filesystem::remove_all) and delete
all files matching test artifacts (or remove the entire directory contents)
while handling/ignoring non-existent entries and logging/reporting errors as
appropriate so reruns are deterministic and no stale .heap/.idx or other files
remain.
- Around line 473-487: The test uses QueryExecutor exec2 constructed with
env.txn_manager so it does not model an independent client; change the read to
use a distinct transaction context by constructing exec2 with a separate
TransactionManager instance (instead of env.txn_manager) or by starting an
independent transaction on exec2 before the UPDATE (e.g., call
execute_sql(exec2, "BEGIN") to create its own transaction context) and ensure
the read SELECT runs in that separate transaction so it cannot see the
uncommitted UPDATE from the writer.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aa11c616-3bf4-4045-b47f-25e3f90ce683

📥 Commits

Reviewing files that changed from the base of the PR and between 59a2461 and eae4b34.

📒 Files selected for processing (1)

tests/query_executor_tests.cpp

- Remove empty UnsupportedStatementType test - Add assertion to CreateTableLocalOnlyMode - Improve DeleteWithIndexMaintenanceFailure and UpdateIndexRebuildFailure comments - Add assertion to CreateCompositeIndexNotSupported - Test count: 108 → 107 (-1 removed empty test)

…ild gap - Add VerifyIndexInMetadata test to confirm CREATE INDEX populates table_meta->indexes correctly - Change UpdateIndexedColumn tests to not use PRIMARY KEY (same pattern as CreateIndexBasic) - Test count: 107 → 108 (+1 investigation test) - Coverage: 39.43%

…ents - Remove ~30 line number references from distributed_executor_tests.cpp - Remove ~12 line number references from query_executor_tests.cpp - Replace with functional descriptions that won't become stale - Fix unused variable warning with [[maybe_unused]] attribute

Add class and section comments explaining: - These are branch coverage tests, not behavioral correctness tests - Weak assertions (void res) are intentional - 3 tests are DISABLED due to async timing issues

- InsertRpcFailure: fixed handler setup order - set handler BEFORE pushing unique_ptr into servers_ to avoid undefined behavior - CommitPrepareFailure: same fix - set handlers before push_back - FullJoinPhase3_5: remains DISABLED with detailed comment explaining async hang in Phase 3-5 std::async RPC calls despite proper handlers

Replace repetitive inline lambda handlers for TxnPrepare/TxnCommit/TxnAbort with a reusable helper that reduces code duplication across tests.

1. TestEnvironment destructor: Use std::filesystem to iterate and remove all test_data artifacts instead of hard-coded file list 2. StringExecuteWithCacheHit: Use same SQL string variable for both execute calls to actually test cache hit behavior 3. TransactionIsolationReadBeforeCommit: Add BEGIN on exec2 to create separate transaction context before SELECT 4. CreateCompositeIndexNotSupported: Assert failure with error message instead of conditional check

std::filesystem not properly linked in CI g++ build environment. Reverting to use explicit std::remove calls for known test artifacts. Also added explicit cleanup for additional test table heap files.

poyrazK

It's okay to merge

style: automated clang-format fixes

720eb89

coderabbitai Bot reviewed Apr 25, 2026

View reviewed changes

poyrazK and others added 4 commits April 25, 2026 20:16

style: automated clang-format fixes

dcd2344

style: automated clang-format fixes

abc9f21

coderabbitai Bot reviewed Apr 25, 2026

View reviewed changes

poyrazK and others added 10 commits April 25, 2026 21:19

style: automated clang-format fixes

d38749f

test: re-enable MIN/MAX/AVG aggregate tests (now using simple handler)

2733766

These tests were disabled due to inline lambda crashes in std::async context. Now using set_execute_fragment_handler() which works correctly. Co-authored-by: claude-code

test: exercise leader-aware routing branch in SelectWithShardRouting

0f4ea16

Added cm_->set_leader(2, "node_1") to trigger the leader routing path (lines 549-558) where get_leader() returns non-empty and found_leader branch is exercised. Co-authored-by: claude-code

test: fix leader group_id to match actual shard for id=1

fa94912

Shard idx = hash("1") % 2 = 0, so group_id = shard_idx + 1 = 1. This triggers the leader routing path at lines 549-558. Co-authored-by: claude-code

style: automated clang-format fixes

59a2461

coderabbitai Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread tests/distributed_executor_tests.cpp Outdated

Comment thread tests/distributed_executor_tests.cpp Outdated

poyrazK and others added 3 commits April 28, 2026 14:49

style: automated clang-format fixes

eae4b34

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

Comment thread tests/query_executor_tests.cpp

Comment thread tests/query_executor_tests.cpp

poyrazK and others added 6 commits April 28, 2026 17:44

style: automated clang-format fixes

a62ca4d

style: automated clang-format fixes

5d50467

docs: add branch coverage test documentation

2354d65

Add class and section comments explaining: - These are branch coverage tests, not behavioral correctness tests - Weak assertions (void res) are intentional - 3 tests are DISABLED due to async timing issues

poyrazK and others added 6 commits April 30, 2026 12:15

style: automated clang-format fixes

eb35def

refactor: extract set_success_reply_handler helper for RPC handlers

c923356

Replace repetitive inline lambda handlers for TxnPrepare/TxnCommit/TxnAbort with a reusable helper that reduces code duplication across tests.

style: automated clang-format fixes

42d4c59

fix: revert TestEnvironment destructor to use std::remove

cf9867f

std::filesystem not properly linked in CI g++ build environment. Reverting to use explicit std::remove calls for known test artifacts. Also added explicit cleanup for additional test table heap files.

poyrazK commented Apr 30, 2026

View reviewed changes

poyrazK merged commit 9fbeea9 into main Apr 30, 2026
9 checks passed

Conversation

poyrazK commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Disabled tests (known async timing issues)

Changes from initial PR

Uh oh!

coderabbitai Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

poyrazK left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

poyrazK commented Apr 25, 2026 •

edited

Loading

coderabbitai Bot commented Apr 25, 2026 •

edited

Loading