Skip to content

Add engine prefix to kv table name#353

Merged
lzxddz merged 1 commit intomainfrom
change_kv_tablename
Jan 13, 2026
Merged

Add engine prefix to kv table name#353
lzxddz merged 1 commit intomainfrom
change_kv_tablename

Conversation

@lzxddz
Copy link
Collaborator

@lzxddz lzxddz commented Jan 9, 2026

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/tx_service#issue_id
  • Reference the link of RFC if exists
  • Pass ./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit

  • New Features

    • Per-engine scoping: catalog, table and database entries are isolated per engine.
    • Bucket-aware startup: node groups bind and prewarm only their owned buckets.
  • Refactor

    • Unified engine-prefixed keying across discovery, read/write and catalog flows.
    • Discovery callbacks now strip engine prefixes and use stable ranged scans.
  • Chore

    • Public interfaces updated to carry engine context and bucket info; prewarm filter hook added.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 9, 2026

Walkthrough

Engine-specific KV key prefixes were added across catalog and database operations; TableEngine was propagated into multiple data-store APIs; OpenDataStore gained bucket-aware prewarm initialization; callback/closure structs and scan continuations were updated to track and strip engine prefixes.

Changes

Cohort / File(s) Summary
Data Store Client Interface
store_handler/data_store_service_client.h, store_handler/data_store_service_client.cpp
Added txservice::TableEngine table_engine to DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, FetchAllDatabase; methods construct/use engine-prefixed KV keys. Added public static InitBucketsInfo.
Callback Data Structures & Closures
store_handler/data_store_service_client_closure.h, store_handler/data_store_service_client_closure.cpp
Added fields to store engine-prefixed keys and prefix lengths (key_str_, start_key_, end_key_, engine_prefix_len_); updated Reset/Clear and ScanNext pagination to preserve continuation keys and strip engine prefixes from returned names.
Data Store Service & Factory Flow
store_handler/eloq_data_store_service/data_store_service.h, store_handler/eloq_data_store_service/data_store_service.cpp, store_handler/eloq_data_store_service/data_store_factory.h
OpenDataStore signature now accepts std::unordered_set<uint16_t> &&bucket_ids; service calls InitializePrewarmFilter(shard_id, std::move(bucket_ids)) on factory before startup. Added virtual InitializePrewarmFilter to factory.
EloqStore Factory Implementation
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
Implemented InitializePrewarmFilter to set a prewarm_filter lambda mapping table identifiers to bucket IDs (hashing for range partition, direct for hash); made config mutable to allow runtime init.
RocksDB Handler
store_handler/rocksdb_handler.h, store_handler/rocksdb_handler.cpp
Added unused txservice::TableEngine table_engine parameter to DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, FetchAllDatabase signatures (propagated API change).
Store Handler Interface
tx_service/include/store/data_store_handler.h
Added txservice::TableEngine table_engine as first parameter to five DB/catalog-related virtual methods.
Type / Sequences / CC
tx_service/include/type.h, tx_service/include/sequences/sequences.h, tx_service/include/cc/cc_req_misc.h
Added KvTablePrefixOf(TableEngine) helper and sequence_kv_table_name_sv; updated Sequences::kv_table_name_sv_ initializer; removed redis_table_name_sv; added kv_key_ to FetchCatalogCc.

Sequence Diagram(s)

sequenceDiagram
    participant Leader as Leader/NodeGroup
    participant DataSvc as DataStoreService
    participant Factory as DataStoreFactory
    participant KV as KVStore

    Leader->>DataSvc: OnLeaderStart(bucket_ownership info)
    DataSvc->>Factory: InitializePrewarmFilter(shard_id, bucket_ids)
    Factory-->>DataSvc: prewarm_filter configured
    DataSvc->>DataSvc: OpenDataStore(shard_id, bucket_ids) (bind owned buckets)
    DataSvc->>KV: Read/Write engine-prefixed catalog entries
    KV-->>DataSvc: Ack / data
    DataSvc-->>Leader: Startup complete
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • liunyl

Poem

🐇 I nibbled prefixes, one by one,

Buckets warmed beneath the sun,
Callbacks trim the engine sign,
Continuations hold the line,
Catalogs hum — the rabbit's done.

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is incomplete—it contains only the repository's checklist template with all items unchecked, lacking substantive details about changes, rationale, or impact. Provide a detailed description explaining the motivation, implementation approach, files modified, and any breaking changes or API updates. Verify that test, documentation, issue reference, and test suite requirements are fulfilled.
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add engine prefix to kv table name' clearly and concisely summarizes the main change across the PR, matching the primary objective of adding TableEngine prefixes to KV table names.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch change_kv_tablename

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lzxddz lzxddz force-pushed the change_kv_tablename branch from 9b07f9f to 2d88faf Compare January 9, 2026 06:35
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
store_handler/rocksdb_handler.h (1)

425-455: RocksDBHandler signatures are consistent with base class, but BigTableHandler and DynamoHandler are out of sync.

RocksDBHandler correctly implements all five methods with the new txservice::TableEngine table_engine parameter, matching the updated DataStoreHandler base class interface (lines 242–271 in tx_service/include/store/data_store_handler.h).

However, BigTableHandler (store_handler/bigtable_handler.h) and DynamoHandler (store_handler/dynamo_handler.h) still have the old signatures without the table_engine parameter. This breaks the contract with the base class and violates the Liskov Substitution Principle. These implementations must be updated to include the table_engine parameter for all five methods:

  • DiscoverAllTableNames
  • UpsertDatabase
  • DropDatabase
  • FetchDatabase
  • FetchAllDatabase
store_handler/data_store_service_client.cpp (2)

2450-2463: Same potential undefined behavior as in DiscoverAllTableNames.

Lines 2461-2462 have the same pattern where end_key.back()++ is called without verifying the string is non-empty:

callback_data->end_key_ = callback_data->start_key_;
callback_data->end_key_.back()++;

This requires the same verification that KvTablePrefixOf(table_engine) never returns an empty string.


2244-2257: Add a guard against TableEngine::None or document the precondition.

The concern about KvTablePrefixOf returning an empty string is valid. At tx_service/include/type.h:174-175, the function explicitly returns "" for TableEngine::None. Calling .back()++ on an empty string at line 2256 causes undefined behavior.

Either add an assertion to prevent TableEngine::None from being passed to this method, or update the docstring to explicitly state that table_engine must not be TableEngine::None.

store_handler/data_store_service_client_closure.h (1)

2676-2684: Clear FetchTableCallbackData::key_str_ on reuse.

key_str_ is newly added but never cleared in Reset()/Clear(), which is risky for pooled objects.

Proposed change
 struct FetchTableCallbackData : public SyncCallbackData
 {
@@
     void Clear() override
     {
         SyncCallbackData::Clear();
+        key_str_.clear();
         schema_image_ = nullptr;
         found_ = nullptr;
         version_ts_ = nullptr;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
     }
@@
     // Table name with engine prefix.
     std::string key_str_;
🤖 Fix all issues with AI agents
In @store_handler/data_store_service_client_closure.h:
- Around line 2857-2858: FetchAllDatabaseCallbackData::engine_prefix_len_ and
DiscoverAllTableNamesCallbackData::engine_prefix_len_ are not initialized or
reset, risking use of garbage in callers that do key.substr(engine_prefix_len_);
initialize engine_prefix_len_ to 0 in each class's constructor (or in-class
member initializer) and reset it to 0 in their Reset()/Clear() methods so any
subsequent key.substr(engine_prefix_len_) is safe.

In @store_handler/eloq_data_store_service/data_store_service.cpp:
- Around line 2201-2208: The call to
data_store_factory_->InitializePrewarmFilter currently occurs before checking
whether the shard is in the Closed state, causing side effects (and consuming
bucket_ids) even when OpenDataStore returns early; move the
InitializePrewarmFilter call to after the shard-closed guard so it only runs
when the shard is actually being opened. Concretely, in
DataStoreService::OpenDataStore, defer invoking
InitializePrewarmFilter(shard_id, std::move(bucket_ids)) until after the code
path that verifies the shard is not Closed (the early-return guard), and apply
the same reorder for the analogous block around the other occurrence noted in
the diff (the second InitializePrewarmFilter call). Ensure bucket_ids is only
moved when you are certain the open will proceed.
🧹 Nitpick comments (4)
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (1)

141-141: Consider narrowing the mutability (avoid making the whole config writable).

If only eloqstore_configs_.prewarm_filter needs to be updated, prefer making just that field mutable / stored separately rather than making the entire EloqStoreConfig mutable (reduces accidental future mutation surface).

tx_service/include/type.h (1)

170-191: Consider returning std::string_view (and fix fatal log typo) in KvTablePrefixOf.

Returning std::string here is potentially alloc-y for a pure mapping helper; a std::string_view (or const char*) would be cheaper. Also: "Unimplemnt" typo.

Proposed change
-inline std::string KvTablePrefixOf(TableEngine engine)
+inline std::string_view KvTablePrefixOf(TableEngine engine)
 {
     switch (engine)
     {
     case txservice::TableEngine::None:
-        return "";
+        return std::string_view{};
     case txservice::TableEngine::EloqSql:
         return "eloqsql_";
     case txservice::TableEngine::EloqKv:
         return "eloqkv_";
     case txservice::TableEngine::EloqDoc:
         return "eloqdoc_";
     case txservice::TableEngine::InternalRange:
         return "irange_";
     case txservice::TableEngine::InternalHash:
         return "ihash_";
     }
 
-    LOG(FATAL) << "Unimplemnt the KvTablePrefixOf for engine type: "
+    LOG(FATAL) << "Unimplemented KvTablePrefixOf for engine type: "
                << static_cast<int>(engine);
-    return "";
+    return std::string_view{};
 }
store_handler/eloq_data_store_service/data_store_factory.h (1)

112-120: Minor: tighten doc/name to match parameters (ng_id/bucket_ids).

The method name/doc says “Set the prewarm filter function”, but the API is actually “initialize filter from node-group/shard and bucket set”. Consider updating the brief to reflect what callers provide.

tx_service/include/cc/cc_req_misc.h (1)

130-133: Optional: consider narrowing exposure / clarifying naming for FetchCatalogCc::kv_key_.

If this is only for store::DataStoreHandler, a small accessor or a more specific name (e.g., catalog_kv_key_) would make misuse less likely.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b3feed and 9b07f9f.

📒 Files selected for processing (14)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/sequences/sequences.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-16T09:47:09.981Z
Learnt from: githubzilla
Repo: eloqdata/tx_service PR: 288
File: store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h:54-76
Timestamp: 2025-12-16T09:47:09.981Z
Learning: In RocksDBCloudDataStoreFactory::CreateDataStore (store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h), when oss_url is configured, the factory always appends /ds_<shard_id> to the oss_url. Users should not include the shard path in the oss_url themselves.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/data_store_service_client_closure.cpp
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • store_handler/data_store_service_client.h
🧬 Code graph analysis (6)
store_handler/eloq_data_store_service/data_store_factory.h (1)
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (1)
  • InitializePrewarmFilter (109-138)
store_handler/rocksdb_handler.h (2)
tx_service/include/type.h (3)
  • TableEngine (160-211)
  • std (736-746)
  • string (170-191)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
store_handler/data_store_service_client_closure.cpp (1)
tx_service/src/cc/cc_shard.cpp (1)
  • key (1327-1327)
store_handler/rocksdb_handler.cpp (2)
tx_service/include/schema.h (1)
  • table_engine (129-275)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
store_handler/data_store_service_client.h (2)
store_handler/data_store_service_client.cpp (10)
  • InitBucketsInfo (3865-3930)
  • InitBucketsInfo (3865-3869)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
tx_service/include/store/data_store_handler.h (3)
  • txservice (46-447)
  • std (170-174)
  • std (425-432)
store_handler/data_store_service_client.cpp (2)
store_handler/eloq_data_store_service/data_store_service.cpp (4)
  • Read (450-483)
  • Read (450-453)
  • Read (485-528)
  • Read (485-493)
tx_service/include/schema.h (1)
  • table_engine (129-275)
🔇 Additional comments (10)
store_handler/data_store_service_client.h (2)

179-184: InitBucketsInfo as a public static helper is a nice cleanup.


358-388: Fix GetCatalogFactory() to handle TableEngine::None safely.

The method at line 754 uses static_cast<int>(table_engine) - 1 to index catalog_factory_array_, which produces -1 when table_engine = TableEngine::None (value 0). This causes an out-of-bounds access. Either add a guard to reject TableEngine::None or expand the array to accommodate it.

store_handler/rocksdb_handler.cpp (1)

1495-1554: LGTM! Signature updates support multi-engine identification.

The five database-related methods (DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, FetchAllDatabase) have been consistently updated to accept txservice::TableEngine table_engine as the first parameter. While these RocksDBHandler implementations remain stubs (not implemented), the signature changes properly align with the PR's objective to enable table engine identification at the data_store level. Based on the relevant code snippets, the actual implementations in DataStoreServiceClient use KvTablePrefixOf(table_engine) to construct engine-prefixed keys for catalog operations.

tx_service/include/store/data_store_handler.h (1)

242-271: LGTM! Interface updates properly support multi-engine catalog operations.

The DataStoreHandler interface has been consistently updated to add txservice::TableEngine table_engine as the first parameter to five database-related pure virtual methods. The signature changes properly propagate the engine context to all implementations, enabling engine-prefixed key construction for catalog and table lookups. The uniform placement of the parameter across all methods maintains a clean and predictable API.

store_handler/data_store_service_client.cpp (3)

96-107: LGTM! Clear documentation of the engine prefix strategy.

The comment block effectively explains the prefixing approach for both KvTableNames and catalog keys, with concrete examples that will help future maintainers understand the naming scheme.


5076-5133: LGTM! Memory management is safe.

The code correctly handles the lifetime of keys_with_prefix:

  1. Builds the prefixed key strings in keys_with_prefix (lines 5117-5119)
  2. Creates string_view references to these strings (lines 5130-5133)
  3. Calls callback_data->Wait() (line 5152) to ensure BatchWriteRecords completes before keys_with_prefix goes out of scope

This ensures the string views remain valid throughout the async operation.


3708-3715: No actionable feedback. The bucket-aware initialization logic is implemented correctly. The code properly iterates bucket_infos_, filters buckets by ownership, and passes them to OpenDataStore, which invokes the prewarm filter mechanism as intended.

tx_service/include/type.h (1)

597-599: Verify sequence_kv_table_name_sv naming/migration expectations.

sequence_kv_table_name_sv = "ihash_sequence_table" is now part of the public naming contract. Please double-check that this is the intended persisted KV table name (vs deriving it from KvTablePrefixOf(TableEngine::InternalHash) + sequence_table_name_sv) and that any upgrade/backward-compat story is addressed.

tx_service/include/sequences/sequences.h (1)

75-80: LGTM: kv_table_name_sv_ now uses the KV-specific sequence table name.

This aligns Sequences with the new KV naming convention.

store_handler/eloq_data_store_service/data_store_service.h (1)

651-654: All OpenDataStore(...) call sites are properly updated with correct move semantics.

Verification found only one call site in OnLeaderStart() (data_store_service_client.cpp:3717), which correctly passes bucket_ids using std::move(). The parameter is properly populated by iterating through matching buckets (empty bucket_ids is valid when no buckets belong to the node group).

Comment on lines 623 to 630
scan_next_closure->GetItem(i, key, value, ts, ttl);
fetch_data->dbnames_->emplace_back(std::move(key));
fetch_data->dbnames_->emplace_back(
key.substr(fetch_data->engine_prefix_len_));
if (i == items_size - 1)
{
fetch_data->start_key_ = std::move(key);
}
}
Copy link

@coderabbitai coderabbitai bot Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Guard substr(engine_prefix_len_) and fix sequence-table filter under engine-prefixed keys.

Two correctness risks:

  1. std::string::substr(pos) throws std::out_of_range if pos > size(). If engine_prefix_len_ is wrong (or uninitialized), this becomes a crash.
  2. key == txservice::Sequences::table_name_sv_ likely won’t match when key includes an engine prefix; you probably want to compare against the stripped key.
Proposed change
         for (uint32_t i = 0; i < items_size; i++)
         {
             scan_next_closure->GetItem(i, key, value, ts, ttl);
-            fetch_data->dbnames_->emplace_back(
-                key.substr(fetch_data->engine_prefix_len_));
+            const size_t prefix_len = fetch_data->engine_prefix_len_;
+            if (prefix_len > key.size())
+            {
+                fetch_data->Result().set_error_code(
+                    remote::DataStoreError::INTERNAL_ERROR);
+                fetch_data->Result().set_error_msg(
+                    "engine_prefix_len_ exceeds key size");
+                fetch_data->Notify();
+                return;
+            }
+            fetch_data->dbnames_->emplace_back(key.substr(prefix_len));
             if (i == items_size - 1)
             {
                 fetch_data->start_key_ = std::move(key);
             }
         }
@@
         for (uint32_t i = 0; i < items_size; i++)
         {
             scan_next_closure->GetItem(i, key, value, ts, ttl);
             if (i == items_size - 1)
             {
                 fetch_data->start_key_ = key;
             }
-            if (key == txservice::Sequences::table_name_sv_)
+            const size_t prefix_len = fetch_data->engine_prefix_len_;
+            if (prefix_len > key.size())
+            {
+                fetch_data->Result().set_error_code(
+                    remote::DataStoreError::INTERNAL_ERROR);
+                fetch_data->Result().set_error_msg(
+                    "engine_prefix_len_ exceeds key size");
+                fetch_data->Notify();
+                return;
+            }
+            const std::string_view stripped =
+                std::string_view(key).substr(prefix_len);
+            if (stripped == txservice::Sequences::table_name_sv_)
             {
                 continue;
             }
-            fetch_data->table_names_->emplace_back(
-                key.substr(fetch_data->engine_prefix_len_));
+            fetch_data->table_names_->emplace_back(std::string(stripped));
         }

Also applies to: 691-704

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solved by " LOG_IF(FATAL, key.size() <= fetch_data->engine_prefix_len_)"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lzxddz, thanks for adding the LOG_IF(FATAL, ...) check! That effectively prevents the substr crash in FetchAllDatabaseCallback.

Two follow-up questions:

  1. DiscoverAllTableNamesCallback (lines ~691-704): Have you also added the same bounds check there? The original review mentioned "Also applies to: 691-704" and that callback also uses key.substr(fetch_data->engine_prefix_len_).

  2. Sequence table filter: In DiscoverAllTableNamesCallback, the code compares key == txservice::Sequences::table_name_sv_ to filter out the sequence table. Since key now includes the engine prefix, should this comparison strip the prefix first (e.g., key.substr(fetch_data->engine_prefix_len_) == txservice::Sequences::table_name_sv_)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"txservice::Sequences::table_name_sv_" must not be in the results at here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the definition of Sequences::table_name_sv_ to understand if it's base or KV name
rg -nC5 'table_name_sv_\s*=' tx_service/include/sequences/

# Also search for sequence_table_name_sv and sequence_kv_table_name_sv definitions
rg -nC3 'sequence.*table_name_sv\s*=' tx_service/include/

# Check what keys are used in catalog scans
rg -nC5 'table_catalogs' store_handler/data_store_service_client_closure.cpp

Length of output: 1035


🏁 Script executed:

#!/bin/bash
# Search for Sequences class members to see table_name_sv_ vs kv_table_name_sv_
rg -nC10 'class Sequences' tx_service/include/sequences/sequences.h

# Search for sequence_table_name_sv and sequence_kv_table_name_sv definitions in type.h
rg -nC5 'sequence.*table_name' tx_service/include/type.h

# Check how the scan is performed in DiscoverAllTableNames to see what prefix is used
rg -nB20 'DiscoverAllTableNamesCallback' store_handler/data_store_service_client.cpp | rg -A20 'ScanNext.*table_catalogs'

Length of output: 2535


🏁 Script executed:

#!/bin/bash
# Check how catalog keys are constructed - search for CreateKVCatalogInfo or similar
rg -nC10 'CreateKVCatalogInfo|CreateNewKVCatalogInfo' store_handler/data_store_service_client.cpp | head -100

# Search for how sequence table is initialized and stored
rg -nC5 'InitPreBuiltTables|sequence.*kv_table_name' store_handler/data_store_service_client.cpp | head -80

# Check if there's any special handling for sequence table catalog
rg -n 'sequence_table_name|sequence_kv_table_name' store_handler/data_store_service_client.cpp

Length of output: 3687


@lzxddz, you're absolutely right. After this PR, the catalog scan returns engine-prefixed KV table names (e.g., "InternalHash_ihash_sequence_table"), while txservice::Sequences::table_name_sv_ is the base name "__sequence_table".

Since:

  1. The sequence table's KV name is "ihash_sequence_table" (stored as Sequences::kv_table_name_sv_)
  2. Catalog keys now include the engine prefix
  3. The comparison checks against the base name "__sequence_table"

The literal string "__sequence_table" cannot appear in the scan results—the filter condition will never match and is effectively dead code.

Just to confirm: Is the sequence table catalog entry still written to table_catalogs, or has it been excluded from catalog storage entirely? If it's still there, should the filter be removed or updated to check the stripped key against Sequences::kv_table_name_sv_ instead?


🧬 Code Graph Analysis Results

<File: tx_service/include/sequences/sequences.h>
78-171

inline static const std::string_view kv_table_name_sv_{
        txservice::sequence_kv_table_name_sv};
    inline static const uint64_t seq_schema_version_{100U};

    static void InitSequence(TxService *tx_service,
                             store::DataStoreHandler *storage_hd)
    {
        instance_ = std::make_unique<Sequences>(tx_service, storage_hd);
    }
    static bool Initialized()
    {
        return instance_ != nullptr;
    }

    static void Destory()
    {
        instance_ = nullptr;
    }

    // When drop a table with auto increment id and range partition, call this
    // function to remove its sequence from here.
    static bool DeleteSequence(const TableName &table,
                               SequenceType seq_type,
                               bool only_clean_cache = false)
    {
        std::string seq_name = GenSeqName(table, seq_type);
        return DeleteSequenceInternal(seq_name, only_clean_cache);
    }

    // Apply a series of ids from a sequence with name='seq_name', If
    // increment>0, the step between two neighbor ids will be increment, else
    // the step will be rec_step_ in sequence. If it has not enough ids in this
    // range, reserved_vals will be surplus ids, else 'desired_vals' ids.
    // Return the first assigned id if success, else return -1.
    static int64_t ApplyIdOfAutoIncrColumn(
        const TableName &table,
        int64_t increment,
        int64_t desired_vals,
        int64_t &reserved_vals,
        uint64_t key_schema_version,
        std::pair<const std::function<void()> *, const std::function<void()> *>
            coro_functors,
        const std::function<void()> *long_resume_func,
        int16_t thd_group_id);

    static bool ApplyIdOfTableRangePartition(const TableName &table,
                                             int64_t desired_vals,
                                             int64_t &first_reserved_id,
                                             int64_t &reserved_vals,
                                             uint64_t key_schema_version);

    static TxKey GenKey(const std::string &seq_name);
    static std::unique_ptr<TxRecord> GenRecord();

    static int UpdateAutoIncrement(
        std::string content,
        std::string dbName,
        std::pair<const std::function<void()> *, const std::function<void()> *>
            coro_functors,
        int16_t group_id);

    static int32_t InitialRangePartitionIdOf(const TableName &table);
    static bool InitIdOfTableRangePartition(const TableName &table,
                                            int32_t last_range_id);

    /**
     * @param start_val The start id of sequence.
     * @param node_step_val The size of id range that acquired from server
     * every time.
     * @param rec_step_val The step between two neighbor record.
     * @param curr_val The first unassigned value.
     */
    static std::pair<txservice::TxKey, txservice::TxRecord::Uptr>
    GetSequenceKeyAndInitRecord(const txservice::TableName &table_name,
                                SequenceType seq_type,
                                int64_t start_val = 1,
                                int64_t node_step_val = 256,
                                int64_t rec_step_val = 1,
                                int64_t curr_val = 1)
    {
        std::string seq_name = GenSeqName(table_name, seq_type);

        TxKey seq_tx_key = GenKey(seq_name);
        std::unique_ptr<TxRecord> seq_tx_rec = GenRecord();
        std::string encoded_rec =
            EncodeSeqRecord(start_val, node_step_val, rec_step_val, curr_val);
        seq_tx_rec->SetEncodedBlob(
            reinterpret_cast<const unsigned char *>(encoded_rec.data()),
            encoded_rec.size());
        // unpack info is unused in sequence table

        return std::pair<txservice::TxKey, txservice::TxRecord::Uptr>(
            std::move(seq_tx_key), std::move(seq_tx_rec));
    }

<File: store_handler/data_store_service_client.cpp>
5059-5070

bool DataStoreServiceClient::DeserializeTxRecordStr(
    const std::string_view record, bool &is_deleted, size_t &offset)
{
    if (record.size() < (offset + sizeof(bool)))
    {
        return false;
    }

    is_deleted = *reinterpret_cast<const bool *>(record.data() + offset);
    offset += sizeof(bool);
    return true;
}

<File: tx_service/include/cc/catalog_cc_map.h>
2344-2346

const TableName base_table_name{table_name.GetBaseTableNameSV(),
                                        TableType::Primary,
                                        table_name.Engine()}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @store_handler/data_store_service_client_closure.h:
- Around line 2676-2678: Several newly added pooled fields are not being reset
on reuse, causing stale engine-prefixed keys and scan bounds; update the
Poolable reset/clear path to explicitly zero/clear engine_prefix_len_,
start_key_, end_key_, and any FetchTableCallbackData::key_str_ (and the member
key_str_ where present) after use or in the object's reset/clear method so
reused instances do not carry previous engine prefixes or scan bounds; locate
methods constructing/returning pooled objects (e.g., constructors, clear/reset
or release paths tied to Poolable, and callback cleanup for
FetchTableCallbackData) and add assignments like engine_prefix_len_ = 0,
start_key_.clear(), end_key_.clear(), and key_str_.clear() to ensure a clean
state on reuse.

In @store_handler/data_store_service_client.cpp:
- Around line 2244-2245: The DynamoHandler and BigTableHandler classes have
method signatures that no longer match the base class; update the declarations
and definitions of DiscoverAllTableNames, UpsertDatabase, DropDatabase,
FetchDatabase, and FetchAllDatabase in both DynamoHandler and BigTableHandler to
include the new txservice::TableEngine table_engine parameter so they exactly
match the base DataStoreHandler/DataStoreServiceClient signatures, update the
corresponding header and source function signatures and implementations, add or
keep the override specifier, and then fix any internal call sites within those
classes to accept/forward the new table_engine argument.
- Around line 3708-3717: Collected bucket_ids can be empty before invoking
data_store_service_->OpenDataStore(ng_id, std::move(bucket_ids)); which violates
existing invariants; add an assertion to ensure bucket_ids is non-empty (e.g.,
assert(!bucket_ids.empty())) immediately before the OpenDataStore call so the
invariant enforced by other modules (cc_request.h, template_cc_map.h,
local_cc_shards.cpp) is upheld; locate the bucket_ids population loop that
iterates over bucket_infos_ and insert the assert right before calling
OpenDataStore with ng_id and the moved bucket_ids.
🧹 Nitpick comments (2)
store_handler/eloq_data_store_service/data_store_service.h (1)

651-654: Simplify the API by taking bucket_ids by value instead of rvalue reference.

The current rvalue-reference signature OpenDataStore(uint32_t, std::unordered_set<uint16_t> &&bucket_ids) forces callers to use std::move explicitly. A by-value signature OpenDataStore(uint32_t, std::unordered_set<uint16_t> bucket_ids) is simpler and more ergonomic—callers can pass lvalues or rvalues naturally, and the compiler optimizes the move efficiently. Since the implementation immediately moves the parameter away (to InitializePrewarmFilter), taking by value is the conventional modern C++ approach.

Proposed change
     void OpenDataStore(uint32_t shard_id,
-                       std::unordered_set<uint16_t> &&bucket_ids);
+                       std::unordered_set<uint16_t> bucket_ids);
store_handler/data_store_service_client.cpp (1)

2527-2531: Consider documenting the magic number 37.

The reservation size engine_prefix.size() + 37 correctly accounts for the UUID string (36 characters) plus the table type prefix (1 character). However, the constant 37 is not immediately obvious.

Consider adding a brief comment explaining the size calculation:

// Reserve: engine_prefix + type_char(1) + uuid(36)
kv_info.kv_table_name_.reserve(engine_prefix.size() + 37);

Or define a named constant:

constexpr size_t UUID_WITH_PREFIX_SIZE = 37;  // type prefix(1) + UUID(36)
kv_info.kv_table_name_.reserve(engine_prefix.size() + UUID_WITH_PREFIX_SIZE);
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b07f9f and 2d88faf.

📒 Files selected for processing (14)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/sequences/sequences.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
🚧 Files skipped from review as they are similar to previous changes (8)
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/rocksdb_handler.h
  • store_handler/data_store_service_client.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_factory.h
  • tx_service/include/type.h
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-12-16T09:47:09.981Z
Learnt from: githubzilla
Repo: eloqdata/tx_service PR: 288
File: store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h:54-76
Timestamp: 2025-12-16T09:47:09.981Z
Learning: In RocksDBCloudDataStoreFactory::CreateDataStore (store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h), when oss_url is configured, the factory always appends /ds_<shard_id> to the oss_url. Users should not include the shard path in the oss_url themselves.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.h
📚 Learning: 2025-11-11T07:10:40.346Z
Learnt from: lzxddz
Repo: eloqdata/tx_service PR: 199
File: include/cc/local_cc_shards.h:233-234
Timestamp: 2025-11-11T07:10:40.346Z
Learning: In the LocalCcShards class in include/cc/local_cc_shards.h, the EnqueueCcRequest methods use `shard_code & 0x3FF` followed by `% cc_shards_.size()` to distribute work across processor cores for load balancing. This is intentional and separate from partition ID calculation. The 0x3FF mask creates a consistent distribution range (0-1023) before modulo by actual core count.

Applied to files:

  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
🧬 Code graph analysis (3)
store_handler/eloq_data_store_service/data_store_service.h (1)
store_handler/eloq_data_store_service/data_store_service.cpp (2)
  • OpenDataStore (2201-2242)
  • OpenDataStore (2201-2202)
store_handler/data_store_service_client.cpp (1)
tx_service/include/schema.h (1)
  • table_engine (129-275)
store_handler/rocksdb_handler.cpp (2)
tx_service/include/schema.h (1)
  • table_engine (129-275)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
🔇 Additional comments (6)
tx_service/include/sequences/sequences.h (1)

75-80: Looks right; ensure all sequence KV ops use kv_table_name_sv_ (not table_name_sv_).
This change is consistent with the PR goal (distinguish KV tables at the datastore level). Just double-check that sequence read/write paths reference Sequences::kv_table_name_sv_ everywhere (and that txservice::sequence_kv_table_name_sv is defined in the intended header).

store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (1)

140-142: Mutating shared eloq_store_configs_: confirm it won’t be overwritten across shards/ngs.
Since prewarm_filter is stored in eloq_store_configs_ (shared by the factory), please confirm initialization is either (a) called once, or (b) safe to overwrite (i.e., bucket_ids is global for the process / node group), otherwise concurrent/multi-shard opens could clobber each other’s filter.

store_handler/data_store_service_client.cpp (3)

96-107: Good documentation of the engine prefix convention.

The comment clearly explains the naming convention and provides concrete examples. This will help future developers understand the design.


5117-5133: LGTM: Safe string_view usage in InitPreBuiltTables.

The code correctly uses keys_with_prefix to own the engine-prefixed key strings, then creates string_views in the keys vector that reference these strings. The lifetime management is correct since both vectors are used together in the same BatchWriteRecords call before going out of scope.


2254-2257: The 0xFF overflow concern is not applicable to this code.

All engine prefixes returned by KvTablePrefixOf are hardcoded ASCII strings ("eloqsql_", "eloqkv_", "eloqdoc_", or empty). The underscore character (ASCII 95) increments correctly to backtick (96), creating a valid exclusive upper bound. Byte value 0xFF does not occur in any of these prefixes.

However, there is a potential undefined behavior edge case: if TableEngine::None is passed to this function, KvTablePrefixOf returns an empty string, and calling .back()++ on an empty string is undefined behavior. Verify whether TableEngine::None can reach this code path and whether a guard or assertion should be added.

store_handler/rocksdb_handler.cpp (1)

1495-1554: LGTM! Signature changes support engine-prefixed KV operations.

The addition of the txservice::TableEngine table_engine parameter to these five database-related methods is consistent with the PR's objective to add engine prefixes to KV table names. The parameter is currently unused in these stub implementations (which correctly assert false if called), but when these methods are implemented, they should follow the pattern established in DataStoreServiceClient by using txservice::KvTablePrefixOf(table_engine) to construct engine-prefixed keys before performing storage operations.

Based on the relevant code snippets from DataStoreServiceClient (e.g., UpsertDatabase at lines 2290-2341), the implementation should:

  1. Construct the prefixed key: key = KvTablePrefixOf(table_engine) + db_name
  2. Use the prefixed key for all RocksDB operations

</review_comment_end>

Comment on lines 2244 to +2245
bool DataStoreServiceClient::DiscoverAllTableNames(
txservice::TableEngine table_engine,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find the header file and examine the function declarations
find . -name "data_store_service_client.h" -o -name "*data_store_service_client*" | grep -E "\.(h|hpp)$"

Repository: eloqdata/tx_service

Length of output: 158


🏁 Script executed:

# Check the current signatures in the .cpp file at the specified lines
sed -n '2244,2250p' store_handler/data_store_service_client.cpp
sed -n '2291,2297p' store_handler/data_store_service_client.cpp
sed -n '2356,2362p' store_handler/data_store_service_client.cpp
sed -n '2423,2429p' store_handler/data_store_service_client.cpp
sed -n '2450,2456p' store_handler/data_store_service_client.cpp

Repository: eloqdata/tx_service

Length of output: 1344


🏁 Script executed:

# Search for callers of these functions across the codebase
rg -n 'DiscoverAllTableNames\(|UpsertDatabase\(|DropDatabase\(|FetchDatabase\(|FetchAllDatabase\(' --type=cpp -g '!data_store_service_client.cpp' | head -50

Repository: eloqdata/tx_service

Length of output: 3515


🏁 Script executed:

# Check the header file declarations
sed -n '358,365p' store_handler/data_store_service_client.h
sed -n '365,372p' store_handler/data_store_service_client.h
sed -n '371,378p' store_handler/data_store_service_client.h
sed -n '376,383p' store_handler/data_store_service_client.h
sed -n '383,390p' store_handler/data_store_service_client.h

Repository: eloqdata/tx_service

Length of output: 1637


🏁 Script executed:

# Sample calls from rocksdb_handler to verify they pass table_engine
rg -n -B2 -A3 'DiscoverAllTableNames\(|UpsertDatabase\(|DropDatabase\(|FetchDatabase\(|FetchAllDatabase\(' store_handler/rocksdb_handler.cpp | head -80

Repository: eloqdata/tx_service

Length of output: 1518


🏁 Script executed:

# Sample calls from dynamo_handler to verify they pass table_engine
rg -n -B2 -A3 'DiscoverAllTableNames\(|UpsertDatabase\(|DropDatabase\(|FetchDatabase\(|FetchAllDatabase\(' store_handler/dynamo_handler.cpp | head -80

Repository: eloqdata/tx_service

Length of output: 1083


🏁 Script executed:

# Check bigtable_handler implementations
rg -n -B2 -A3 'bool EloqDS::BigTableHandler::DiscoverAllTableNames|bool EloqDS::BigTableHandler::UpsertDatabase|bool EloqDS::BigTableHandler::DropDatabase|bool EloqDS::BigTableHandler::FetchDatabase|bool EloqDS::BigTableHandler::FetchAllDatabase' store_handler/bigtable_handler.cpp | head -60

Repository: eloqdata/tx_service

Length of output: 1049


🏁 Script executed:

# Check int_mem_store implementations
rg -n -B2 -A3 'bool.*::DiscoverAllTableNames|bool.*::UpsertDatabase|bool.*::DropDatabase|bool.*::FetchDatabase|bool.*::FetchAllDatabase' tx_service/include/store/int_mem_store.h

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Check if there are compilation indicators or build logs showing errors
find . -name "*.log" -o -name "CMakeError*" 2>/dev/null | head -10

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Check the base class data_store_handler.h declarations
sed -n '242,270p' tx_service/include/store/data_store_handler.h

Repository: eloqdata/tx_service

Length of output: 1302


🏁 Script executed:

# Check if DynamoHandler header also declares table_engine
sed -n '138,220p' store_handler/dynamo_handler.h

Repository: eloqdata/tx_service

Length of output: 3417


🏁 Script executed:

# Check if BigTableHandler header declares table_engine
sed -n '144,175p' store_handler/bigtable_handler.h

Repository: eloqdata/tx_service

Length of output: 1377


Critical: API signature change not propagated to all implementations

The table_engine parameter was added to five database methods in the base class (DataStoreHandler) and correctly implemented in DataStoreServiceClient and RocksDBHandler. However, DynamoHandler and BigTableHandler implementations still lack this parameter in their declarations and implementations:

  • DiscoverAllTableNames
  • UpsertDatabase
  • DropDatabase
  • FetchDatabase
  • FetchAllDatabase

This creates a virtual function signature mismatch that will cause compilation errors. Both derived classes must be updated to match the base class signatures with the table_engine parameter.

🤖 Prompt for AI Agents
In @store_handler/data_store_service_client.cpp around lines 2244 - 2245, The
DynamoHandler and BigTableHandler classes have method signatures that no longer
match the base class; update the declarations and definitions of
DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, and
FetchAllDatabase in both DynamoHandler and BigTableHandler to include the new
txservice::TableEngine table_engine parameter so they exactly match the base
DataStoreHandler/DataStoreServiceClient signatures, update the corresponding
header and source function signatures and implementations, add or keep the
override specifier, and then fix any internal call sites within those classes to
accept/forward the new table_engine argument.

@lzxddz lzxddz force-pushed the change_kv_tablename branch 2 times, most recently from 2bb0166 to 3409f74 Compare January 9, 2026 11:14
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
store_handler/data_store_service_client.cpp (3)

5068-5163: Correctness: ok &&InitTableLastRangePartitionId(...) doesn’t update ok (failure ignored).

That expression computes a value and discards it; if InitTableLastRangePartitionId() fails, ok remains unchanged. Also, you hardcode tablename engine to EloqSql while later deriving the prefix from table_name.Engine()—please ensure these engines are always identical for prebuilt tables.

Proposed fix
@@
         if (!table_name.IsHashPartitioned())
         {
             // init table last range partition id
             bool ok = InitTableRanges(tablename, table_version);
-            ok &&InitTableLastRangePartitionId(tablename);
+            ok = ok && InitTableLastRangePartitionId(tablename);
             if (!ok)
             {
@@
-        txservice::TableName tablename(tbl_sv,
-                                       txservice::TableType::Primary,
-                                       txservice::TableEngine::EloqSql);
+        txservice::TableName tablename(tbl_sv,
+                                       txservice::TableType::Primary,
+                                       table_name.Engine());

2244-2276: Add precondition check for non-None table engine or explicitly handle empty prefix.

The code calls end_key_.back()++ without checking if start_key_ is empty. Since KvTablePrefixOf(TableEngine::None) returns an empty string, this operation causes undefined behavior if TableEngine::None is passed. Although external validation in data_substrate.cpp currently prevents TableEngine::None from reaching this path, the function itself lacks local enforcement of this precondition. Add an assertion that table_engine != TableEngine::None (or that !start_key_.empty()), or explicitly handle the TableEngine::None case to make the invariant clear and catch violations early.


2450-2482: Prefix-range scan has the same end_key_.back()++ footgun as FetchAllTableNames.

Real UB risk when prefix is empty (TableEngine::None case). Add validation that the prefix is non-empty before incrementing the last byte, or handle empty prefixes explicitly. Also verify the last byte is stable and won't overflow if it's 0xFF.

store_handler/rocksdb_handler.cpp (4)

1545-1554: Update FetchAllDatabase in BigTableHandler, DynamoHandler, and IntMemStore to match the base class signature.

The base class virtual method now requires a table_engine parameter, but BigTableHandler, DynamoHandler, and IntMemStore still have the old signature without this parameter. This creates a virtual method override mismatch that violates the Liskov Substitution Principle. All implementations must be updated to include the txservice::TableEngine table_engine parameter as the first argument to maintain interface consistency.


1508-1519: Critical interface violation: Not all derived handlers updated with new table_engine parameter.

The base class DataStoreHandler::UpsertDatabase now requires txservice::TableEngine table_engine as the first parameter, but several derived implementations have not been updated:

  • ✓ RocksDBHandler: correctly includes table_engine
  • ✓ DataStoreServiceClient: correctly includes table_engine
  • ✗ DynamoHandler: missing table_engine parameter
  • ✗ BigTableHandler: missing table_engine parameter
  • ✗ InMemStore: missing table_engine parameter

This violates the Liskov Substitution Principle and breaks the virtual interface contract. All derived classes must match the base class signature. Update DynamoHandler, BigTableHandler, and InMemStore to include the table_engine parameter.


1521-1530: Incomplete signature update – not all implementations have been updated to match the abstract interface.

The abstract base class DataStoreHandler::DropDatabase (lines 255-259 in data_store_handler.h) declares the new signature with txservice::TableEngine table_engine as the first parameter. However, three concrete implementations have not been updated:

  • DynamoHandler::DropDatabase (lines 206-209 in dynamo_handler.h) – missing table_engine
  • BigTableHandler::DropDatabase (lines 155-159 in bigtable_handler.h) – missing table_engine
  • InMemStore::DropDatabase (lines 169-173 in int_mem_store.h) – missing table_engine

While RocksDBHandler and DataStoreServiceClient have been correctly updated with the new parameter, this signature mismatch violates the Liskov Substitution Principle and will cause compilation or runtime errors. All three implementations must be updated to include the table_engine parameter to match the abstract base class.


1495-1505: Update all handler implementations to match the new base class signature with table_engine parameter.

This method's signature was changed in the base class DataStoreHandler to include the table_engine parameter, but several derived classes have not been updated:

  • RocksDBHandler — updated
  • DataStoreServiceClient — updated
  • DynamoHandler — signature still missing table_engine parameter (store_handler/dynamo_handler.h:138, dynamo_handler.cpp:2851)
  • BigTableHandler — signature still missing table_engine parameter (store_handler/bigtable_handler.h:144, bigtable_handler.cpp:1566)
  • InMemStore — signature still missing table_engine parameter (tx_service/include/store/int_mem_store.h:150)

All virtual method overrides must match the base class signature. Update the remaining implementations to include txservice::TableEngine table_engine as the first parameter.

🤖 Fix all issues with AI agents
In @store_handler/data_store_service_client_closure.h:
- Around line 3068-3070: DiscoverAllTableNamesCallbackData has start_key_,
end_key_ and engine_prefix_len_ but Clear() fails to reset engine_prefix_len_;
update the Clear() method of DiscoverAllTableNamesCallbackData to initialize
engine_prefix_len_ (set to 0 or the same default used in
FetchAllDatabaseCallbackData) alongside resetting start_key_ and end_key_,
ensuring state is fully cleared between uses.
- Around line 2856-2857: The new member engine_prefix_len_ is not reset in
Reset() or cleared in Clear(), which can cause reads of an uninitialized value;
update the Reset() method and the Clear() method to explicitly initialize
engine_prefix_len_ (e.g., set to 0) alongside the other members so the field has
a well-defined value until the caller assigns it.

In @store_handler/data_store_service_client.cpp:
- Around line 3706-3718: The code uses std::unordered_set<uint16_t> (variable
bucket_ids) in the block that builds bucket_ids and calls
data_store_service_->OpenDataStore(ng_id, std::move(bucket_ids)) but the header
<unordered_set> is not included; fix by adding #include <unordered_set> to the
top of the compilation unit alongside the other includes so the use of
std::unordered_set<uint16_t> compiles cleanly.

In @store_handler/eloq_data_store_service/eloq_store_data_store_factory.h:
- Around line 109-138: The header is missing the <unordered_set> include used by
the InitializePrewarmFilter signature (std::unordered_set<uint16_t>
&&bucket_ids) in eloq_store_data_store_factory::InitializePrewarmFilter; add
#include <unordered_set> to the header includes so the type is defined, keeping
the include alongside the existing <memory> and <vector> includes.
🧹 Nitpick comments (4)
store_handler/data_store_service_client_closure.h (1)

2676-2678: Added key_str_ to hold engine-prefixed table name.

The new key_str_ member stores the table name with the engine prefix. Note that Clear() at lines 2642-2650 does not reset key_str_. Consider adding key_str_.clear(); in the Clear() override for consistency with other callback data types.

Proposed fix
     void Clear() override
     {
         SyncCallbackData::Clear();
         schema_image_ = nullptr;
         found_ = nullptr;
         version_ts_ = nullptr;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
+        key_str_.clear();
     }
store_handler/data_store_service_client.cpp (3)

96-107: Clarify/encode the prefixing contract (esp. escaping and “None” engine).

The comment describes {EngineType}_ prefixes, but the code relies on prefix-range scans (end_key_.back()++). Please document whether txservice::KvTablePrefixOf(TableEngine::None) is possible/allowed and what it returns (empty vs sentinel), since empty prefix breaks assumptions in multiple call sites.


5346-5398: UpsertCatalog now namespaces keys correctly; consider making key storage lifetime explicit.

Prefixing table_catalogs key is correct. Minor: keys holds string_view pointing at local key_str; it’s probably safe because you synchronously Wait(), but this lifetime coupling is subtle—consider storing the key in callback data (like other paths) or using a std::string container for keys, as done in InitPreBuiltTables().


5400-5447: DeleteCatalog key prefixing matches UpsertCatalog; same lifetime subtlety.

Same note: keys.emplace_back(key_str) relies on Wait() for lifetime safety.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d88faf and 2bb0166.

📒 Files selected for processing (14)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/sequences/sequences.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
🚧 Files skipped from review as they are similar to previous changes (7)
  • store_handler/eloq_data_store_service/data_store_factory.h
  • tx_service/include/type.h
  • tx_service/include/sequences/sequences.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-16T09:47:09.981Z
Learnt from: githubzilla
Repo: eloqdata/tx_service PR: 288
File: store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h:54-76
Timestamp: 2025-12-16T09:47:09.981Z
Learning: In RocksDBCloudDataStoreFactory::CreateDataStore (store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h), when oss_url is configured, the factory always appends /ds_<shard_id> to the oss_url. Users should not include the shard path in the oss_url themselves.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.h
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • store_handler/data_store_service_client.h
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/data_store_service_client.cpp
🧬 Code graph analysis (2)
store_handler/eloq_data_store_service/data_store_service.h (1)
store_handler/eloq_data_store_service/data_store_service.cpp (2)
  • OpenDataStore (2201-2242)
  • OpenDataStore (2201-2202)
store_handler/data_store_service_client.h (1)
store_handler/data_store_service_client.cpp (2)
  • InitBucketsInfo (3865-3930)
  • InitBucketsInfo (3865-3869)
🔇 Additional comments (18)
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (2)

112-137: Prewarm filter logic is well-structured.

The lambda correctly:

  • Captures bucket_ids by move for ownership
  • Distinguishes RangePartition engines (eloqdoc_, eloqsql_, irange_) using MurmurHash3 on partition_id_
  • Uses direct partition_id_ cast for HashPartition engines
  • Returns whether the computed bucket_id exists in the provided set

140-141: Intentional removal of const qualifier enables runtime configuration.

The member eloq_store_configs_ was changed from const EloqStoreConfig to EloqStoreConfig to allow InitializePrewarmFilter to modify the configuration after construction. This is a necessary trade-off for the dynamic prewarm filter assignment.

tx_service/include/store/data_store_handler.h (1)

242-271: Consistent addition of TableEngine parameter to catalog/database operations.

The new table_engine parameter is added as the first argument to all five methods:

  • DiscoverAllTableNames
  • UpsertDatabase
  • DropDatabase
  • FetchDatabase
  • FetchAllDatabase

This consistent placement enables engine-scoped key prefixing for KV catalog operations. Since these are pure virtual methods, all implementations must be updated accordingly.

store_handler/data_store_service_client.h (2)

179-183: New static InitBucketsInfo method for bucket-to-node-group mapping.

This static method initializes bucket ownership information using consistent hashing with 64 virtual nodes per node group. The implementation (from the relevant snippet) uses deterministic random generation seeded by node group ID, ensuring reproducible bucket assignment across restarts.


358-387: Interface implementations correctly propagate TableEngine parameter.

All five methods (DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, FetchAllDatabase) now accept txservice::TableEngine table_engine as the first parameter, matching the updated pure virtual signatures in DataStoreHandler. This enables engine-prefixed key construction via KvTablePrefixOf(engine) in the implementations.

store_handler/data_store_service_client_closure.h (3)

2737-2744: FetchDatabaseCallbackData::Clear() properly resets key_str_.

The key_str_.clear() is correctly placed before SyncCallbackData::Clear() to ensure the engine-prefixed database name is cleaned up when the callback data is reused.


2890-2896: UpsertDatabaseCallbackData::Clear() properly resets key_str_.

The implementation correctly clears the engine-prefixed key string.


2948-2954: DropDatabaseCallbackData::Clear() properly resets key_str_.

The implementation correctly clears the engine-prefixed key string.

store_handler/eloq_data_store_service/data_store_service.h (1)

652-653: Public API change: OpenDataStore now requires bucket ownership information.

The signature change from void OpenDataStore(uint32_t shard_id) to void OpenDataStore(uint32_t shard_id, std::unordered_set<uint16_t> &&bucket_ids) is a breaking change. Callers must now provide bucket IDs and use std::move() due to the rvalue reference parameter, as shown in the current usage in data_store_service_client.cpp.

Likely an incorrect or invalid review comment.

store_handler/data_store_service_client.cpp (8)

622-633: Engine-prefixed table_catalogs lookup looks correct; guard against empty prefix.

fetch_cc->kv_key_ = KvTablePrefixOf(engine) + table_name is the right direction for disambiguation. Please add an assert/guard if an empty prefix is possible, otherwise this becomes a silent “scan/lookup across all engines” behavior.


2204-2227: Same prefix safety concern for synchronous FetchTable().

Good propagation of KvTablePrefixOf(table_name.Engine()). If KvTablePrefixOf() can ever return empty, this turns into a non-namespaced lookup in table_catalogs. Please confirm invariants or handle TableEngine::None.


2291-2342: DB keys correctly namespaced; keep the “empty prefix” invariant explicit.

key_str_ = KvTablePrefixOf(table_engine) + db is consistent with the goal. Please confirm whether db can contain arbitrary bytes (including the separator) and whether TableEngine::None is a valid input here.


2356-2406: DropDatabase prefixing is consistent; verify behavior for TableEngine::None.

Same caveat: if KvTablePrefixOf(None) is empty, this becomes cross-engine deletion for a shared db_catalogs namespace.


2423-2448: FetchDatabase prefixing is consistent; ensure callers pass correct engine.

This API now requires the engine to find a DB record. Please ensure higher layers don’t accidentally pass the wrong engine (would look like “not found”).


2518-2553: Generated KV table/index names: prefixing is good; consider asserting engine consistency.

Using the base table’s engine prefix for both base KV table and index KV table names seems intended—please assert/guarantee that all table_schema->IndexNames() share the same engine as the base table (otherwise kv_index_names_ becomes inconsistent with the index’s TableName::Engine()).


2628-2673: New index KV names are correctly prefixed; verify mixed-engine index support is not expected.

engine_prefix = KvTablePrefixOf(table_name.Engine()) is good. If the system ever supports indexes stored in a different engine than the base table, this needs revisiting; otherwise add an invariant/assert to codify “same engine”.


3698-3723: Verify if OpenDataStore should reject empty bucket_ids or explicitly document prewarming behavior.

When bucket_ids is empty (if ng_id temporarily owns no buckets), the prewarm filter silently disables table prewarming with no error or logging. Related code in scan operations contains assertions expecting bucket_ids to be non-empty. Either add validation to reject empty sets at call time, or explicitly document that empty sets are valid and intended to disable prewarming.

store_handler/rocksdb_handler.cpp (1)

1532-1543: Verify that all callers of FetchDatabase have been updated to pass the table_engine parameter.

The table_engine parameter addition is consistent with the other database methods and supports future engine-aware catalog queries. However, this is a breaking API change that requires all call sites to be updated with the new signature.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
store_handler/data_store_service_client_closure.h (3)

2623-2684: FetchTableCallbackData::key_str_ is Poolable but never cleared → stale prefixed table name on reuse.

Add key_str_.clear() in Clear() (and optionally in Reset() for symmetry).

Proposed fix
 struct FetchTableCallbackData : public SyncCallbackData
 {
@@
     void Reset(std::string &schema_image,
                bool &found,
                uint64_t &version_ts,
                const std::function<void()> *yield_fptr = nullptr,
                const std::function<void()> *resume_fptr = nullptr)
     {
         SyncCallbackData::Reset();
+        key_str_.clear();
         schema_image_ = &schema_image;
@@
     void Clear() override
     {
         SyncCallbackData::Clear();
+        key_str_.clear();
         schema_image_ = nullptr;
         found_ = nullptr;
         version_ts_ = nullptr;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
     }
@@
     // Table name with engine prefix.
     std::string key_str_;

2796-2859: FetchAllDatabaseCallbackData::engine_prefix_len_ is uninitialized/unreset.

This is especially risky if it’s used to strip prefixes during pagination. Default-initialize and reset it alongside the other pagination fields.

Proposed fix
 struct FetchAllDatabaseCallbackData : public SyncCallbackData
 {
@@
     void Reset(std::vector<std::string> &dbnames,
                const std::function<void()> *yield_fptr,
                const std::function<void()> *resume_fptr)
     {
         SyncCallbackData::Reset();
         dbnames_ = &dbnames;
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
         session_id_.clear();
         start_key_.clear();
         end_key_.clear();
+        engine_prefix_len_ = 0;
     }
 
     void Clear() override
     {
         SyncCallbackData::Clear();
         dbnames_ = nullptr;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
         session_id_.clear();
         start_key_.clear();
         end_key_.clear();
+        engine_prefix_len_ = 0;
     }
@@
-    uint32_t engine_prefix_len_;
+    uint32_t engine_prefix_len_{0};
 };

3014-3071: DiscoverAllTableNamesCallbackData doesn’t clear the new range fields.

start_key_, end_key_, and engine_prefix_len_ should be reset/cleared in both Reset() and Clear() to avoid carrying pagination state between pooled uses.

Proposed fix
 struct DiscoverAllTableNamesCallbackData : public SyncCallbackData
 {
@@
     void Reset(std::vector<std::string> &table_names,
                const std::function<void()> *yield_fptr,
                const std::function<void()> *resume_fptr)
     {
         SyncCallbackData::Reset();
         table_names_ = &table_names;
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
         session_id_.clear();
+        start_key_.clear();
+        end_key_.clear();
+        engine_prefix_len_ = 0;
     }
 
     void Clear() override
     {
         SyncCallbackData::Clear();
         table_names_ = nullptr;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
         session_id_.clear();
+        start_key_.clear();
+        end_key_.clear();
+        engine_prefix_len_ = 0;
     }
@@
     std::string session_id_;
     std::string start_key_;
     std::string end_key_;
-    uint32_t engine_prefix_len_;
+    uint32_t engine_prefix_len_{0};
 };
store_handler/eloq_data_store_service/data_store_factory.h (1)

26-120: Avoid -Wunused-parameter/-Werror risk in the default no-op InitializePrewarmFilter.

Since this is an inline definition, compilers can warn on unused ng_id/bucket_ids. Easiest fix: omit parameter names or cast to void.

Proposed fix
-    virtual void InitializePrewarmFilter(
-        uint32_t ng_id, std::unordered_set<uint16_t> &&bucket_ids)
+    virtual void InitializePrewarmFilter(
+        uint32_t /*ng_id*/, std::unordered_set<uint16_t> && /*bucket_ids*/)
     {
     }
store_handler/data_store_service_client.cpp (3)

2244-2276: Guard end_key_.back()++ against empty prefixes to prevent undefined behavior.

KvTablePrefixOf(TableEngine::None) returns an empty string, making the subsequent .back()++ operation undefined behavior. Add defensive checks to both DiscoverAllTableNames() and FetchAllDatabase() before modifying end_key_.

Proposed fix
 bool DataStoreServiceClient::DiscoverAllTableNames(
     txservice::TableEngine table_engine,
     std::vector<std::string> &norm_name_vec,
     const std::function<void()> *yield_fptr,
     const std::function<void()> *resume_fptr)
 {
     DiscoverAllTableNamesCallbackData *callback_data =
         discover_all_tables_callback_data_pool_.NextObject();
     PoolableGuard guard(callback_data);
     callback_data->Reset(norm_name_vec, yield_fptr, resume_fptr);
     callback_data->start_key_ = txservice::KvTablePrefixOf(table_engine);
+    if (callback_data->start_key_.empty())
+    {
+        LOG(ERROR) << "DiscoverAllTableNames: empty engine prefix for engine="
+                   << static_cast<int>(table_engine);
+        return false;
+    }
     callback_data->end_key_ = callback_data->start_key_;
     callback_data->end_key_.back()++;
     callback_data->engine_prefix_len_ = callback_data->start_key_.size();
 bool DataStoreServiceClient::FetchAllDatabase(
     txservice::TableEngine table_engine,
     std::vector<std::string> &dbnames,
     const std::function<void()> *yield_fptr,
     const std::function<void()> *resume_fptr)
 {
     FetchAllDatabaseCallbackData *callback_data =
         fetch_all_dbs_callback_data_pool_.NextObject();
     PoolableGuard guard(callback_data);
     callback_data->Reset(dbnames, yield_fptr, resume_fptr);
     callback_data->start_key_ = txservice::KvTablePrefixOf(table_engine);
+    if (callback_data->start_key_.empty())
+    {
+        LOG(ERROR) << "FetchAllDatabase: empty engine prefix for engine="
+                   << static_cast<int>(table_engine);
+        return false;
+    }
     callback_data->end_key_ = callback_data->start_key_;
     callback_data->end_key_.back()++;
     callback_data->engine_prefix_len_ = callback_data->start_key_.size();

5068-5163: Use the actual table engine instead of hardcoding EloqSql for FetchTable.

Line 5085 hardcodes TableEngine::EloqSql when creating tablename, but the catalog key prefix (line 5118) correctly uses table_name.Engine(). If pre_built_table_names_ contains non-SQL engines (e.g., EloqKv), FetchTable will search with the wrong engine, causing false "not found" results, while the catalog write uses the correct engine. Replace TableEngine::EloqSql with table_name.Engine() on line 5085 to match the prefix.


3698-3723: Consider skipping OpenDataStore call when no buckets are owned, or verify bucket ownership is updated when leadership transitions occur.

The empty bucket case (OpenDataStore(ng_id, {})) is valid—the prewarm filter lambda handles empty sets gracefully by returning false for all lookups. However, bucket_infos_ is initialized once at construction and never refreshed. Since BucketInfo objects support ownership updates via SetBucketOwner() during bucket migration, calling OnLeaderStart after a migration could use stale bucket ownership data. Either refresh bucket_infos_ when ownership changes or verify that OnLeaderStart is synchronized with bucket ownership updates.

🤖 Fix all issues with AI agents
In @store_handler/data_store_service_client.cpp:
- Around line 96-107: The catalog key change adds an engine prefix via
KvTablePrefixOf(), but FetchTableCatalog, FetchTable, and FetchDatabase
currently only try the prefixed key and will miss legacy entries; update those
functions to first try the new prefixed lookup and if not found perform a
fallback read of the legacy non-prefixed key (or scan table_catalogs/db_catalogs
for matching record) and if found either return it and write back the prefixed
key (online migration) or record it for an offline migration process;
alternatively implement a one-time migration utility that rewrites legacy keys
into the prefixed form before rollout — ensure the fallback logic uses the same
key derivation code paths and that any write-back uses KvTablePrefixOf() to
create the canonical key.

In @store_handler/eloq_data_store_service/data_store_service.cpp:
- Around line 2201-2208: The call to
data_store_factory_->InitializePrewarmFilter(...) in OpenDataStore should be
moved so it executes only after verifying the shard is in Closed state (i.e.,
after the "shard is Closed" no-op check and any early returns), preventing
mutation of the shared prewarm_filter on no-op opens and avoiding races between
concurrent opens; update OpenDataStore to first check shard state and return if
not Closed, then (and only then) call InitializePrewarmFilter while still
guarding for data_store_factory_ != nullptr.

In @tx_service/include/store/data_store_handler.h:
- Around line 242-271: The base class API added a txservice::TableEngine
parameter but several implementations still use the old signatures; update all
overrides to match exactly: change the method signatures for
DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, and
FetchAllDatabase to include the first parameter txservice::TableEngine
table_engine in BigTableHandler (store_handler/bigtable_handler.h/.cpp),
DynamoHandler (store_handler/dynamo_handler.h/.cpp), and IntMemStore
(tx_service/include/store/int_mem_store.h), ensuring parameter names and
const/default pointer params match the base declaration so the virtual overrides
resolve correctly and adjust any internal calls/uses to pass through the
table_engine where needed.
🧹 Nitpick comments (3)
tx_service/include/cc/cc_req_misc.h (1)

130-133: Prefer keeping FetchCatalogCc::kv_key_ private (or at least consistently named) to avoid misuse/stale state.

Since this is “only used in DataStoreHandler”, consider (a) making it private with an accessor, or (b) renaming to kv_table_name_ to align with other CC request structs and reduce ambiguity.

Proposed tweak (encapsulation + naming)
 public:
-    // Table name with engine prefix, only used in DataStoreHandler.
-    std::string kv_key_;
+    const std::string &KvTableName() const { return kv_table_name_; }
+    void SetKvTableName(std::string v) { kv_table_name_ = std::move(v); }
 
 private:
+    // Table name with engine prefix, only used in DataStoreHandler.
+    std::string kv_table_name_;
store_handler/data_store_service_client.h (1)

179-184: Public static InitBucketsInfo(...): confirm API intent (or make it private).
If this helper is only used internally (constructor / setup flows), keeping it private (or moving to an internal helper in the .cpp) avoids widening the public surface. If it’s needed by other modules (e.g., prewarm/open flows), consider documenting the contract (e.g., deterministic mapping, requires non-empty node_groups).

store_handler/rocksdb_handler.cpp (1)

1495-1505: Add (void)table_engine; to suppress unused-parameter warnings in stub implementations.

These five methods have the table_engine parameter but don't use it. While the build doesn't treat warnings as errors, adding an explicit cast suppresses compiler warnings and serves as documentation that the parameter is intentionally unused until these stubs are fully implemented.

Proposed patch
 bool RocksDBHandler::DiscoverAllTableNames(
     txservice::TableEngine table_engine,
     std::vector<std::string> &norm_name_vec,
     const std::function<void()> *yield_fptr,
     const std::function<void()> *resume_fptr)
 {
+    (void)table_engine;
     LOG(ERROR) << "RocksDBHandler::DiscoverAllTableNames not implemented";
     // Not implemented
     assert(false);
     return true;
 }

 bool RocksDBHandler::UpsertDatabase(txservice::TableEngine table_engine,
                                     std::string_view db,
                                     std::string_view definition,
                                     const std::function<void()> *yield_fptr,
                                     const std::function<void()> *resume_fptr)
 {
+    (void)table_engine;
     DLOG(ERROR) << "RocksDBHandler::UpsertDatabase not implemented";
     // Not implemented
     assert(false);
     DLOG(INFO) << "RocksDBHandler::UpsertDatabase finished";
     return true;
 }

 bool RocksDBHandler::DropDatabase(txservice::TableEngine table_engine,
                                   std::string_view db,
                                   const std::function<void()> *yield_fptr,
                                   const std::function<void()> *resume_fptr)
 {
+    (void)table_engine;
     LOG(ERROR) << "RocksDBHandler::DropDatabase not implemented";
     // Not implemented
     assert(false);
     return true;
 }

 bool RocksDBHandler::FetchDatabase(txservice::TableEngine table_engine,
                                    std::string_view db,
                                    std::string &definition,
                                    bool &found,
                                    const std::function<void()> *yield_fptr,
                                    const std::function<void()> *resume_fptr)
 {
+    (void)table_engine;
     LOG(ERROR) << "RocksDBHandler::FetchDatabase not implemented";
     // Not implemented
     assert(false);
     return true;
 }

 bool RocksDBHandler::FetchAllDatabase(txservice::TableEngine table_engine,
                                       std::vector<std::string> &dbnames,
                                       const std::function<void()> *yield_fptr,
                                       const std::function<void()> *resume_fptr)
 {
+    (void)table_engine;
     LOG(ERROR) << "RocksDBHandler::FetchAllDatabase not implemented";
     // Not implemented
     assert(false);
     return true;
 }

Also applies to: 1508–1519, 1521–1530, 1532–1543, 1545–1554

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bb0166 and 3409f74.

📒 Files selected for processing (14)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/sequences/sequences.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
🚧 Files skipped from review as they are similar to previous changes (6)
  • store_handler/eloq_data_store_service/data_store_service.h
  • tx_service/include/type.h
  • store_handler/rocksdb_handler.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/data_store_service_client_closure.cpp
  • tx_service/include/sequences/sequences.h
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • store_handler/data_store_service_client.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
📚 Learning: 2025-12-16T09:47:09.981Z
Learnt from: githubzilla
Repo: eloqdata/tx_service PR: 288
File: store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h:54-76
Timestamp: 2025-12-16T09:47:09.981Z
Learning: In RocksDBCloudDataStoreFactory::CreateDataStore (store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h), when oss_url is configured, the factory always appends /ds_<shard_id> to the oss_url. Users should not include the shard path in the oss_url themselves.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.cpp
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/data_store_service_client.cpp
🧬 Code graph analysis (3)
store_handler/data_store_service_client.h (1)
store_handler/data_store_service_client.cpp (10)
  • InitBucketsInfo (3865-3930)
  • InitBucketsInfo (3865-3869)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
store_handler/eloq_data_store_service/data_store_factory.h (1)
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (1)
  • InitializePrewarmFilter (109-138)
store_handler/rocksdb_handler.cpp (2)
tx_service/include/schema.h (1)
  • table_engine (129-275)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2291-2342)
  • UpsertDatabase (2291-2296)
  • DropDatabase (2356-2406)
  • DropDatabase (2356-2360)
  • FetchDatabase (2423-2448)
  • FetchDatabase (2423-2429)
  • FetchAllDatabase (2450-2482)
  • FetchAllDatabase (2450-2454)
🔇 Additional comments (3)
store_handler/data_store_service_client.h (1)

358-388: Signature propagation for engine-scoped operations looks consistent with DataStoreHandler.
Adding txservice::TableEngine table_engine to DiscoverAllTableNames + database CRUD aligns with the updated interface and is the right direction for engine isolation.

store_handler/data_store_service_client.cpp (2)

618-633: Prefixed key lifetime is OK here, but ensure all call paths agree on “key = prefix + baseName”.
These writes/reads now use KvTablePrefixOf(engine) + <db/table string>. Please verify the corresponding callbacks and any code consuming scan results consistently strip engine_prefix_len_ (where applicable) and don’t accidentally treat the prefix as part of the logical name.

Also applies to: 2196-2227, 2291-2342, 2356-2406, 2423-2448, 5346-5398, 5400-5447


2518-2553: KV table / index name prefixing is the right place to enforce engine isolation.
Prefixing generated KV table names (base + indexes) with KvTablePrefixOf(engine) cleanly prevents cross-engine collisions at the KV layer.

Also applies to: 2629-2670

Comment on lines +96 to +107
// Notice:
// In order to distinguish the data tables of different engines at the
// KV layer, we added the prefix "{EngineType}_" when generating the
// KvTableName, such as eloqkv_, eloqsql_, eloqdoc_, ihash_, irange_, etc. So,
// the KvTableName of the system table should not conflict with the names of
// these data tables.
// Also, we add the prefix "{EngineType}_" to BaseTableName
// as the key of schemas records stored in "table_catalogs" table, such as
// "eloqkv_data_table_0", "eloqsql_t1", "eloqdoc_c1", "ihash_sequence...", etc.
// And, add the prefix "{EngineType}_" to the key of db catalog records stored
// in "db_catalogs".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and the specific lines mentioned
wc -l store_handler/data_store_service_client.cpp

Repository: eloqdata/tx_service

Length of output: 111


🏁 Script executed:

# Read the file around lines 96-107 to see the actual code
head -120 store_handler/data_store_service_client.cpp | tail -40

Repository: eloqdata/tx_service

Length of output: 1848


🏁 Script executed:

# Search for migration-related functions or comments
rg -i "migrat" store_handler/ -A 3 -B 1

Repository: eloqdata/tx_service

Length of output: 44333


🏁 Script executed:

# Search for fallback or legacy read patterns
rg -i "fallback|legacy|old.*key|backward.*compat" store_handler/ -A 2

Repository: eloqdata/tx_service

Length of output: 9123


🏁 Script executed:

# Search for dual-read or dual-scan patterns
rg "dual.*read|dual.*scan|read.*both|check.*both" store_handler/ -A 2

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Look for table_catalogs and db_catalogs lookup logic
rg "table_catalogs|db_catalogs" store_handler/ -B 2 -A 5

Repository: eloqdata/tx_service

Length of output: 8096


🏁 Script executed:

# Search for key construction patterns in table_catalogs and db_catalogs
rg "table_catalogs|db_catalogs" store_handler/data_store_service_client.cpp -B 3 -A 8 | head -100

Repository: eloqdata/tx_service

Length of output: 3701


🏁 Script executed:

# Look for functions that read from or write to these catalogs
rg -A 10 "FetchTableCatalog|FetchDatabase|UpsertTable" store_handler/data_store_service_client.cpp | head -150

Repository: eloqdata/tx_service

Length of output: 5876


🏁 Script executed:

# Search for any prefix-related code or key transformation logic
rg "engine.*prefix|prefix.*key|engine_type_prefix" store_handler/ -B 2 -A 2

Repository: eloqdata/tx_service

Length of output: 10960


🏁 Script executed:

# Check if there's any code that tries to read both prefixed and non-prefixed keys
rg "\.append\(|engine_type|EngineType" store_handler/data_store_service_client.cpp | head -40

Repository: eloqdata/tx_service

Length of output: 2494


🏁 Script executed:

# Look at the actual implementation of how keys are built for catalog reads
ast-grep --pattern 'fn.*Fetch.*{
  $$$
  append
  $$$
}'

Repository: eloqdata/tx_service

Length of output: 45


🏁 Script executed:

# Search for any version checking or compatibility layer
rg "version|upgrade|compat" store_handler/data_store_service_client.cpp -i | head -20

Repository: eloqdata/tx_service

Length of output: 1316


Engine-prefix key scheme: missing migration or fallback strategy for existing persisted data.

The code at lines 96-107 documents that keys in table_catalogs and db_catalogs now include an engine prefix (e.g., eloqkv_, eloqsql_). All catalog read operations—including FetchTableCatalog, FetchTable, and FetchDatabase—unconditionally construct keys using KvTablePrefixOf() before lookup. If an upgrade occurs against existing persisted data where keys lack the prefix, those entries will not be found. Either implement an offline migration step to rewrite legacy keys with prefixes, or add a temporary dual-read/scan fallback during rollout that checks both prefixed and non-prefixed keyspaces.

🤖 Prompt for AI Agents
In @store_handler/data_store_service_client.cpp around lines 96 - 107, The
catalog key change adds an engine prefix via KvTablePrefixOf(), but
FetchTableCatalog, FetchTable, and FetchDatabase currently only try the prefixed
key and will miss legacy entries; update those functions to first try the new
prefixed lookup and if not found perform a fallback read of the legacy
non-prefixed key (or scan table_catalogs/db_catalogs for matching record) and if
found either return it and write back the prefixed key (online migration) or
record it for an offline migration process; alternatively implement a one-time
migration utility that rewrites legacy keys into the prefixed form before
rollout — ensure the fallback logic uses the same key derivation code paths and
that any write-back uses KvTablePrefixOf() to create the canonical key.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @store_handler/data_store_service_client_closure.h:
- Around line 2676-2678: FetchTableCallbackData adds a std::string key_str_ but
the Clear() method (used for Poolable reuse) doesn't reset it; modify
FetchTableCallbackData::Clear() to reset key_str_ (e.g., call key_str_.clear()
or assign empty string) along with the existing field resets so recycled
instances don't retain stale table name data.

In @store_handler/eloq_data_store_service/eloq_store_data_store_factory.h:
- Line 123: Update the inline comment that currently reads "this table engine
use RangeParition." to correct grammar and typo: change "use" to "uses" and
"RangeParition" to "RangePartition", so the comment becomes "this table engine
uses RangePartition." Ensure only the comment text is edited.
- Line 133: Fix the typo in the comment inside eloq_store_data_store_factory
(the comment referencing HashPartition) by changing "hash_parition_id" to
"hash_partition_id" so the comment reads "use HashPartition, hash_partition_id
equals to bucket_id"; update the single-line comment near the HashPartition
reference accordingly.
🧹 Nitpick comments (1)
store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (1)

125-129: Extract magic numbers to named constants.

The hash seed 9001 and modulo value 1024 should be extracted as named constants (e.g., kPrewarmHashSeed and kMaxBucketCount) to improve maintainability and document their significance in the bucket calculation algorithm.

♻️ Proposed refactor

Add constants at the top of the class or in an appropriate header:

// In class or namespace scope
static constexpr uint32_t kPrewarmHashSeed = 9001;
static constexpr uint16_t kMaxBucketCount = 1024;

Then use them in the method:

                 uint32_t hash_val;
                 butil::MurmurHash3_x86_32(&table_ident.partition_id_,
                                           sizeof(table_ident.partition_id_),
-                                          9001,
+                                          kPrewarmHashSeed,
                                           &hash_val);
-                bucket_id = hash_val % 1024;
+                bucket_id = hash_val % kMaxBucketCount;
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3409f74 and 3b70f6d.

📒 Files selected for processing (4)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • store_handler/data_store_service_client_closure.cpp
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/data_store_service_client.cpp
🔇 Additional comments (23)
store_handler/data_store_service_client_closure.h (5)

2743-2743: LGTM!

The key_str_ field is properly cleared in the Clear() method and well-documented.

Also applies to: 2775-2777


2801-2808: LGTM!

The engine_prefix_len_ field is properly initialized in Reset() and cleared in Clear(). The implementation is consistent with the PR's objective to track engine prefixes.

Also applies to: 2820-2820, 2859-2860


3022-3029: LGTM!

The additions support ranged discovery with engine-prefixed keys. The engine_prefix_len_ field is properly initialized in Reset() and cleared in Clear(). The start_key_ and end_key_ fields are appropriately managed for pagination scenarios.

Also applies to: 3039-3039, 3074-3076


2898-2898: LGTM!

The key_str_ field is properly cleared in the Clear() method and well-documented.

Also applies to: 2927-2929


2956-2956: LGTM!

The key_str_ field is properly cleared in the Clear() method and well-documented.

Also applies to: 2985-2987

store_handler/data_store_service_client.cpp (14)

96-106: Clear documentation of the engine prefix scheme.

The comment block provides good documentation explaining how engine prefixes are used to distinguish data tables of different engines at the KV layer. This helps future maintainers understand the key structure.


624-630: LGTM!

The engine prefix is correctly prepended to the table name for the catalog lookup key.


2209-2215: LGTM!

Consistent use of engine prefix for the table catalog key lookup.


2244-2277: LGTM!

The method now correctly scopes table discovery to a specific engine by using the engine prefix as the key range. The prefix size is passed to the callback for proper name extraction.


2292-2315: LGTM!

UpsertDatabase correctly applies the engine prefix to the database catalog key.


2357-2379: LGTM!

DropDatabase correctly applies the engine prefix to the database catalog key for deletion.


2424-2443: LGTM!

FetchDatabase correctly applies the engine prefix for the database lookup.


2451-2484: LGTM!

FetchAllDatabase correctly uses engine-scoped key ranges, following the same pattern as DiscoverAllTableNames.


2524-2553: LGTM!

The CreateKVCatalogInfo correctly prepends engine prefixes to both the main KV table name and all index KV names. The memory reservation calculation (prefix + 37 for type char + 36-char UUID) is appropriate.


2631-2652: LGTM!

The CreateNewKVCatalogInfo consistently applies engine prefixes to new index KV names, matching the pattern in CreateKVCatalogInfo.


3710-3721: LGTM!

The bucket ownership collection and prewarm initialization logic is correct. The LOG_IF(FATAL, bucket_ids.empty()) appropriately guards against a critical invariant violation where a node group becomes leader without owning any buckets.


5080-5137: LGTM!

The InitPreBuiltTables correctly applies engine prefixes to pre-built table catalog keys. The keys_with_prefix vector owns the strings while keys holds string_views that remain valid until the batch write completes.


5369-5378: LGTM!

UpsertCatalog correctly applies the engine prefix to the catalog key for table schema storage.


5420-5424: LGTM!

DeleteCatalog correctly applies the engine prefix for catalog deletion, consistent with UpsertCatalog.

store_handler/eloq_data_store_service/eloq_store_data_store_factory.h (4)

25-25: LGTM!

The include for <unordered_set> is correctly added to support the new method parameter.


131-135: Verify potential data truncation in partition_id cast.

The cast from partition_id_ to uint16_t at line 134 may truncate data if partition_id_ exceeds 65535. Confirm that for HashPartition tables, partition_id_ values are guaranteed to fit within the uint16_t range, or add appropriate bounds checking.

#!/bin/bash
# Description: Find the type definition of partition_id_ in TableIdent and check for any range constraints.

# Search for TableIdent struct/class definition
ast-grep --pattern $'struct TableIdent {
  $$$
}'

# Also search for the field definition
rg -nP -C3 '\bpartition_id_\b.*(?:uint|int)'

# Look for any documentation or constraints on partition_id values
rg -nP -C3 'partition_id.*range|partition_id.*max|partition.*65535|partition.*uint16'

110-111: Unused ng_id parameter in InitializePrewarmFilter.

The ng_id parameter is accepted but never referenced in the method body. Only bucket_ids is used within the prewarm filter lambda. Either remove the parameter from the virtual interface and all implementations, or incorporate ng_id into the filter logic if it should be used.


142-142: Consider documenting the threading assumptions for InitializePrewarmFilter.

The removal of const allows mutation of eloq_store_configs_ via InitializePrewarmFilter, which is called from OpenDataStore. Based on the codebase analysis, InitializePrewarmFilter is invoked once per shard from OnLeaderStart callbacks, and the subsequent data store creation is protected by an atomic status check in ConnectAndStartDataStore. However, the prewarm filter setup itself lacks explicit synchronization. While current usage appears safe due to callback serialization, consider documenting that InitializePrewarmFilter must be called serially per shard, or add explicit synchronization if this assumption could change.

table_name.compare(0, 8, "eloqsql_") == 0 ||
table_name.compare(0, 7, "irange_") == 0)
{
// this table engine use RangeParition.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix grammatical error and typo in comment.

Comment should read "uses RangePartition" (not "use RangeParition").

📝 Proposed fix
-                // this table engine use RangeParition.
+                // this table engine uses RangePartition.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// this table engine use RangeParition.
// this table engine uses RangePartition.
🤖 Prompt for AI Agents
In @store_handler/eloq_data_store_service/eloq_store_data_store_factory.h at
line 123, Update the inline comment that currently reads "this table engine use
RangeParition." to correct grammar and typo: change "use" to "uses" and
"RangeParition" to "RangePartition", so the comment becomes "this table engine
uses RangePartition." Ensure only the comment text is edited.

…that kv_table belongs to at data_store level.

update rocksdb handler

Initialize prewarm_filter
@lzxddz lzxddz force-pushed the change_kv_tablename branch from 3b70f6d to 40bd124 Compare January 12, 2026 03:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
store_handler/data_store_service_client_closure.h (3)

2727-2780: Clear FetchDatabaseCallbackData::key_str_ in Reset() (pooled object stale-key risk).

Clear() wipes key_str_, but Reset() doesn’t. Since this is Poolable-reused, it’s safer/consistent to clear it in Reset() too (like FetchTableCallbackData).

Proposed change
     void Reset(std::string &definition,
                bool &found,
                const std::function<void()> *yield_fptr,
                const std::function<void()> *resume_fptr)
     {
         SyncCallbackData::Reset();
+        key_str_.clear();
         db_definition_ = &definition;
         found_ = &found;
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
     }

2887-2990: Clear UpsertDatabaseCallbackData::key_str_ / DropDatabaseCallbackData::key_str_ in Reset() (future-proof pooled reuse).

Today call sites appear to always assign key_str_ after Reset(), but leaving it uncleared makes it easy for future call paths to accidentally reuse a stale prefixed key.

Proposed change
 struct UpsertDatabaseCallbackData : public SyncCallbackData
 {
@@
     void Reset(const std::function<void()> *yield_fptr = nullptr,
                const std::function<void()> *resume_fptr = nullptr)
     {
         SyncCallbackData::Reset();
+        key_str_.clear();
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
     }
@@
 };

 struct DropDatabaseCallbackData : public SyncCallbackData
 {
@@
     void Reset(const std::function<void()> *yield_fptr = nullptr,
                const std::function<void()> *resume_fptr = nullptr)
     {
         SyncCallbackData::Reset();
+        key_str_.clear();
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
     }
@@
 };

3024-3079: DiscoverAllTableNamesCallbackData should also clear start_key_/end_key_ in Reset/Clear.

You clear session_id_ but leave start_key_/end_key_ untouched in both Reset() and Clear(). If pagination retries/restarts reuse this object, stale range bounds can leak into the next scan.

Proposed change
     void Reset(uint32_t engine_prefix_len,
                std::vector<std::string> &table_names,
                const std::function<void()> *yield_fptr,
                const std::function<void()> *resume_fptr)
     {
         SyncCallbackData::Reset();
         table_names_ = &table_names;
         engine_prefix_len_ = engine_prefix_len;
         yield_fptr_ = yield_fptr;
         resume_fptr_ = resume_fptr;
         session_id_.clear();
+        start_key_.clear();
+        end_key_.clear();
     }

     void Clear() override
     {
         SyncCallbackData::Clear();
         table_names_ = nullptr;
         engine_prefix_len_ = 0;
         yield_fptr_ = nullptr;
         resume_fptr_ = nullptr;
         session_id_.clear();
+        start_key_.clear();
+        end_key_.clear();
     }
store_handler/rocksdb_handler.h (1)

425-455: Add explicit handling for the unused table_engine parameter in implementations to avoid compiler warnings.

All five method implementations (DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, FetchAllDatabase) accept table_engine but don't use it. Since these are stub implementations with assert(false), add either [[maybe_unused]] attribute in the .cpp definitions or (void)table_engine; statements to suppress warnings and clarify that the parameter is intentionally unused at this stage.

store_handler/data_store_service_client.cpp (3)

2244-2277: Guard against empty prefix before end_key_.back()++ (potential UB).

Both scans build end_key_ by incrementing the last byte; if KvTablePrefixOf(table_engine) can ever be empty (e.g., TableEngine::None or similar), back() is undefined behavior.

Proposed fix
 bool DataStoreServiceClient::DiscoverAllTableNames(
     txservice::TableEngine table_engine,
     std::vector<std::string> &norm_name_vec,
     const std::function<void()> *yield_fptr,
     const std::function<void()> *resume_fptr)
 {
     DiscoverAllTableNamesCallbackData *callback_data =
         discover_all_tables_callback_data_pool_.NextObject();
     PoolableGuard guard(callback_data);
     std::string engine_prefix = txservice::KvTablePrefixOf(table_engine);
+    if (engine_prefix.empty())
+    {
+        LOG(ERROR) << "DiscoverAllTableNames: empty engine prefix, engine="
+                   << static_cast<int>(table_engine);
+        return false;
+    }
     callback_data->Reset(
         engine_prefix.size(), norm_name_vec, yield_fptr, resume_fptr);
     callback_data->start_key_ = std::move(engine_prefix);
     callback_data->end_key_ = callback_data->start_key_;
     callback_data->end_key_.back()++;
 bool DataStoreServiceClient::FetchAllDatabase(
     txservice::TableEngine table_engine,
     std::vector<std::string> &dbnames,
     const std::function<void()> *yield_fptr,
     const std::function<void()> *resume_fptr)
 {
     FetchAllDatabaseCallbackData *callback_data =
         fetch_all_dbs_callback_data_pool_.NextObject();
     PoolableGuard guard(callback_data);
     std::string engine_prefix = txservice::KvTablePrefixOf(table_engine);
+    if (engine_prefix.empty())
+    {
+        LOG(ERROR) << "FetchAllDatabase: empty engine prefix, engine="
+                   << static_cast<int>(table_engine);
+        return false;
+    }
     callback_data->Reset(
         engine_prefix.size(), dbnames, yield_fptr, resume_fptr);
     callback_data->start_key_ = std::move(engine_prefix);
     callback_data->end_key_ = callback_data->start_key_;
     callback_data->end_key_.back()++;

Also applies to: 2451-2484


3700-3727: Avoid hard-crashing on empty bucket set; ensure bucket mapping stays in sync with topology.

LOG_IF(FATAL, bucket_ids.empty()) will take the process down if bucket_infos_ is stale/mismatched (or if this NG legitimately owns none due to config). Prefer returning false (or rebuilding bucket mapping from current shard IDs/config) with a clear error.

Proposed fix
     if (data_store_service_ != nullptr)
     {
         std::unordered_set<uint16_t> bucket_ids;
         for (auto &[bucket_id, bucket_info] : bucket_infos_)
         {
             if (bucket_info->BucketOwner() == ng_id)
             {
                 bucket_ids.insert(bucket_id);
             }
         }
-        LOG_IF(FATAL, bucket_ids.empty())
-            << "bucket_ids is empty, ng_id: " << ng_id;
+        if (bucket_ids.empty())
+        {
+            LOG(ERROR) << "OnLeaderStart: bucket_ids is empty, ng_id: " << ng_id
+                       << " (bucket_infos_ size=" << bucket_infos_.size() << ")";
+            return false;
+        }
         // Binded data store shard with ng.
         data_store_service_->OpenDataStore(ng_id, std::move(bucket_ids));
     }

5080-5138: Prebuilt initialization likely broken for non-EloqSql engines after key prefixing.

InitPreBuiltTables() checks existence via FetchTable(tablename, ...), but tablename is constructed with TableEngine::EloqSql (Line 5088-5090). After this PR, catalog keys are engine-prefixed, so any prebuilt table whose engine != EloqSql will be looked up under the wrong prefix and treated as missing.

Proposed fix
     for (const auto &[table_name, kv_table_name] : pre_built_table_names_)
     {
         auto tbl_sv = table_name.StringView();
         // check if the table is initialized
-        txservice::TableName tablename(tbl_sv,
-                                       txservice::TableType::Primary,
-                                       txservice::TableEngine::EloqSql);
+        txservice::TableName tablename(tbl_sv,
+                                       txservice::TableType::Primary,
+                                       table_name.Engine());
         std::string catalog_image;
         bool found = false;
         uint64_t version_ts = 0;
         if (!FetchTable(tablename, catalog_image, found, version_ts))
         {
             LOG(WARNING) << "InitPreBuiltTables failed on fetching table.";
             return false;
         }
🤖 Fix all issues with AI agents
In @store_handler/eloq_data_store_service/data_store_service.cpp:
- Around line 2201-2208: DataStoreService::OpenDataStore currently calls
data_store_factory_->InitializePrewarmFilter(shard_id, std::move(bucket_ids))
before checking whether the shard is already open, causing unwanted mutation and
consumption of bucket_ids on no-op paths; move the InitializePrewarmFilter call
so it executes only after the shard-status check that determines we will
actually open the shard (i.e., after the early-return guard for already-open
shards), ensuring InitializePrewarmFilter is invoked only on the real open path
and that bucket_ids is not consumed when OpenDataStore exits early.

In @tx_service/include/store/data_store_handler.h:
- Around line 242-271: Update the method signatures in the three implementers so
they match the interface by adding txservice::TableEngine table_engine as the
first parameter to each of the five methods: DiscoverAllTableNames,
UpsertDatabase, DropDatabase, FetchDatabase, and FetchAllDatabase in classes
IntMemoryStore, BigTableHandler, and DynamoHandler; ensure you change both the
declaration and any corresponding definitions/overrides to include the new first
parameter (txservice::TableEngine table_engine) and rebuild to fix the
mismatched signature errors.
🧹 Nitpick comments (2)
store_handler/data_store_service_client.h (1)

179-184: Make the header self-contained + call static method explicitly for clarity.

InitBucketsInfo takes a std::set, but this header doesn’t visibly include <set>; please include it directly to avoid brittle transitive-include builds. Also consider calling it as DataStoreServiceClient::InitBucketsInfo(...) (e.g., at Line 148-150) now that it’s static.

store_handler/data_store_service_client.cpp (1)

3869-3934: InitBucketsInfo is deterministic (good); minor cleanup on redundant Set().

Since ng_bucket_infos.clear() happens up front, try_emplace should always insert; Set() inside if (insert_res.second) is redundant, and if you ever remove the clear(), you probably want to Set() even when the key already existed.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b70f6d and 40bd124.

📒 Files selected for processing (14)
  • store_handler/data_store_service_client.cpp
  • store_handler/data_store_service_client.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/data_store_service_client_closure.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • store_handler/rocksdb_handler.cpp
  • store_handler/rocksdb_handler.h
  • tx_service/include/cc/cc_req_misc.h
  • tx_service/include/sequences/sequences.h
  • tx_service/include/store/data_store_handler.h
  • tx_service/include/type.h
🚧 Files skipped from review as they are similar to previous changes (8)
  • store_handler/eloq_data_store_service/eloq_store_data_store_factory.h
  • tx_service/include/sequences/sequences.h
  • store_handler/eloq_data_store_service/data_store_factory.h
  • store_handler/data_store_service_client_closure.cpp
  • store_handler/eloq_data_store_service/data_store_service.h
  • tx_service/include/type.h
  • store_handler/rocksdb_handler.cpp
  • tx_service/include/cc/cc_req_misc.h
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-12-02T10:43:27.431Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 254
File: tx_service/src/cc/local_cc_shards.cpp:2949-3188
Timestamp: 2025-12-02T10:43:27.431Z
Learning: In tx_service/src/cc/local_cc_shards.cpp, whenever TryPinNodeGroupData is used, only call Sharder::Instance().UnpinNodeGroupData(node_group) if the recorded term is >= 0 (i.e., pin succeeded). Example: LocalCcShards::PostProcessFlushTaskEntries guards the unpin with `if (term >= 0)`.

Applied to files:

  • store_handler/data_store_service_client.h
  • store_handler/eloq_data_store_service/data_store_service.cpp
📚 Learning: 2025-12-16T09:47:09.981Z
Learnt from: githubzilla
Repo: eloqdata/tx_service PR: 288
File: store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h:54-76
Timestamp: 2025-12-16T09:47:09.981Z
Learning: In RocksDBCloudDataStoreFactory::CreateDataStore (store_handler/eloq_data_store_service/rocksdb_cloud_data_store_factory.h), when oss_url is configured, the factory always appends /ds_<shard_id> to the oss_url. Users should not include the shard path in the oss_url themselves.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.cpp
📚 Learning: 2025-10-09T03:56:58.811Z
Learnt from: thweetkomputer
Repo: eloqdata/tx_service PR: 150
File: include/cc/local_cc_shards.h:626-631
Timestamp: 2025-10-09T03:56:58.811Z
Learning: For the LocalCcShards class in include/cc/local_cc_shards.h: Writer locks (unique_lock) should continue using the original meta_data_mux_ (std::shared_mutex) rather than fast_meta_data_mux_ (FastMetaDataMutex) at this stage. Only reader locks may use the FastMetaDataMutex wrapper.

Applied to files:

  • store_handler/eloq_data_store_service/data_store_service.cpp
📚 Learning: 2025-10-21T06:46:53.700Z
Learnt from: lokax
Repo: eloqdata/tx_service PR: 149
File: src/remote/cc_stream_receiver.cpp:1066-1075
Timestamp: 2025-10-21T06:46:53.700Z
Learning: In src/remote/cc_stream_receiver.cpp, for ScanNextRequest handling, BucketIds() on RemoteScanNextBatch should never be empty—this is an expected invariant of the scan protocol.

Applied to files:

  • store_handler/data_store_service_client.cpp
🧬 Code graph analysis (4)
store_handler/data_store_service_client.h (2)
store_handler/data_store_service_client.cpp (6)
  • InitBucketsInfo (3869-3934)
  • InitBucketsInfo (3869-3873)
  • UpsertDatabase (2292-2343)
  • UpsertDatabase (2292-2297)
  • DropDatabase (2357-2407)
  • DropDatabase (2357-2361)
tx_service/include/type.h (2)
  • TableEngine (160-211)
  • string (170-191)
tx_service/include/store/data_store_handler.h (2)
tx_service/include/type.h (3)
  • TableEngine (160-211)
  • std (736-746)
  • string (170-191)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2292-2343)
  • UpsertDatabase (2292-2297)
  • DropDatabase (2357-2407)
  • DropDatabase (2357-2361)
  • FetchDatabase (2424-2449)
  • FetchDatabase (2424-2430)
  • FetchAllDatabase (2451-2484)
  • FetchAllDatabase (2451-2455)
store_handler/rocksdb_handler.h (2)
tx_service/include/type.h (3)
  • TableEngine (160-211)
  • std (736-746)
  • string (170-191)
store_handler/data_store_service_client.cpp (8)
  • UpsertDatabase (2292-2343)
  • UpsertDatabase (2292-2297)
  • DropDatabase (2357-2407)
  • DropDatabase (2357-2361)
  • FetchDatabase (2424-2449)
  • FetchDatabase (2424-2430)
  • FetchAllDatabase (2451-2484)
  • FetchAllDatabase (2451-2455)
store_handler/data_store_service_client_closure.h (1)
tx_service/include/type.h (2)
  • std (736-746)
  • string (170-191)
🔇 Additional comments (6)
store_handler/data_store_service_client_closure.h (2)

2628-2686: FetchTableCallbackData::key_str_ lifecycle looks correct (cleared in Reset/Clear).

Clearing key_str_ in both Reset() and Clear() is the right pattern for pooled callback data that carries an engine-prefixed identifier.


2803-2863: FetchAllDatabaseCallbackData prefix-length + start/end key state reset looks consistent.

Reset/Clear both handle engine_prefix_len_ and the pagination state (session_id_, start_key_, end_key_) cleanly.

store_handler/data_store_service_client.cpp (4)

36-107: Engine prefixing doc + include look good.

The comment clearly explains the new invariant (prefixed KvTableName/BaseTableName/db keys), and <unordered_set> is justified by the new bucket set usage.


622-633: Catalog keys are now consistently engine-prefixed (good), but keep an eye on empty/invalid engine.

FetchTableCatalog/FetchTable and UpsertCatalog/DeleteCatalog now agree on the key format (KvTablePrefixOf(engine) + name), which is the important consistency point for this PR.

Also applies to: 2204-2227, 5369-5402, 5420-5451


2292-2343: DB catalog keys are correctly engine-prefixed.

Upsert/Drop/Fetch now use KvTablePrefixOf(table_engine) + db, which should prevent cross-engine name collisions in db_catalogs.

Also applies to: 2357-2407, 2424-2449


2524-2555: KV table/index name generation with engine prefix looks correct.

Prefixing the generated kv table and index names makes engine ownership visible at the KV layer without changing the UUID-based uniqueness strategy.

Also applies to: 2632-2672

Comment on lines 358 to 387
bool DiscoverAllTableNames(
txservice::TableEngine table_engine,
std::vector<std::string> &norm_name_vec,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) override;

//-- database
bool UpsertDatabase(
txservice::TableEngine table_engine,
std::string_view db,
std::string_view definition,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) override;
bool DropDatabase(
txservice::TableEngine table_engine,
std::string_view db,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) override;
bool FetchDatabase(
txservice::TableEngine table_engine,
std::string_view db,
std::string &definition,
bool &found,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) override;
bool FetchAllDatabase(
txservice::TableEngine table_engine,
std::vector<std::string> &dbnames,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) override;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Call sites (should now pass a TableEngine as first arg)
echo "=== Searching for call sites of modified methods ==="
rg -n --hidden --glob '!*_test*' --glob '!**/third_party/**' \
  '\b(DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)\s*\(' . | head -100

echo ""
echo "=== Searching for InitBucketsInfo references ==="
# Ensure removed non-static method isn't referenced anymore
rg -n --hidden --glob '!**/third_party/**' '\bInitBucketsInfo\s*\(' . || echo "No InitBucketsInfo found (as expected)"

Repository: eloqdata/tx_service

Length of output: 4479


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Checking rocksdb_handler.cpp for actual calls to these methods ==="
rg -n -A 5 'DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase' \
  ./store_handler/rocksdb_handler.cpp | head -80

echo ""
echo "=== Checking for actual invocation patterns with TableEngine ==="
rg -n 'handler->\s*(DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)\s*\(' . --type cpp | head -40

echo ""
echo "=== Check rocksdb_handler.cpp implementation signatures ==="
sed -n '1495,1510p' ./store_handler/rocksdb_handler.cpp

echo ""
echo "=== Check if InitBucketsInfo is supposed to be static or removed ==="
rg -B 2 -A 10 'InitBucketsInfo' ./store_handler/data_store_service_client.h | head -30

Repository: eloqdata/tx_service

Length of output: 2770


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for any calls to database methods (broader search) ==="
rg -n '\.(DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)\s*\(' . | head -50

echo ""
echo "=== Check InitBucketsInfo declaration in header ==="
sed -n '175,185p' ./store_handler/data_store_service_client.h

echo ""
echo "=== Check InitBucketsInfo call context ==="
sed -n '145,155p' ./store_handler/data_store_service_client.h

echo ""
echo "=== Search for test files that might call these methods ==="
fd -e '_test.cpp' -o -e 'test_*.cpp' | xargs grep -l 'DiscoverAllTableNames\|UpsertDatabase' 2>/dev/null | head -5

echo ""
echo "=== Check bigtable_handler.cpp for implementations ==="
rg -A 5 'bool EloqDS::BigTableHandler::DiscoverAllTableNames' ./store_handler/bigtable_handler.cpp | head -20

Repository: eloqdata/tx_service

Length of output: 128


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check InitBucketsInfo full declaration and usage ==="
rg -B 3 -A 15 'static void InitBucketsInfo' ./store_handler/data_store_service_client.h

echo ""
echo "=== Check InitBucketsInfo call at line 149 ==="
sed -n '145,155p' ./store_handler/data_store_service_client.h

echo ""
echo "=== Search for all references to these methods in non-test code ==="
rg -c 'DiscoverAllTableNames\|UpsertDatabase\|DropDatabase\|FetchDatabase\|FetchAllDatabase' \
  --glob '!*_test*' --glob '!**/third_party/**' . 2>/dev/null | grep -v ':0$'

echo ""
echo "=== Check main query handler or service files ==="
fd -type f -name '*query*handler*' -o -name '*service*.cpp' | grep -v test | head -10

echo ""
echo "=== Look in query execution or transaction code ==="
rg -l 'DiscoverAllTableNames\|UpsertDatabase' . --type cpp | grep -v _test | grep -v third_party

Repository: eloqdata/tx_service

Length of output: 1321


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Find all .cpp files and search for invocations ==="
fd -type f -name '*.cpp' | xargs grep -n 'handler.*->.*DiscoverAllTableNames\|handler.*->.*UpsertDatabase' 2>/dev/null | head -20

echo ""
echo "=== Search in transaction or query execution files ==="
find . -type f -name '*.cpp' -path '*/tx_service/*' ! -path '*/third_party/*' | head -20

echo ""
echo "=== Check if these methods are called anywhere in the codebase ==="
rg -U 'DiscoverAllTableNames\s*\(' . --type cpp -A 3 | grep -v 'bool.*DiscoverAllTableNames' | head -30

echo ""
echo "=== Verify the InitBucketsInfo claim - is it really removed? ==="
git log --all --oneline -S 'InitBucketsInfo' -- store_handler/data_store_service_client.h | head -5

Repository: eloqdata/tx_service

Length of output: 307


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for actual invocations of these methods ==="
find . -name '*.cpp' -type f | xargs grep -n '\->DiscoverAllTableNames\|\->UpsertDatabase\|\->DropDatabase\|\->FetchDatabase\|\->FetchAllDatabase' 2>/dev/null | head -30

echo ""
echo "=== Check git history for InitBucketsInfo changes ==="
cd . && git log --oneline -S 'InitBucketsInfo' --all | head -10

echo ""
echo "=== Verify all implementations have table_engine parameter ==="
echo "BigTable DiscoverAllTableNames:"
sed -n '1566,1567p' ./store_handler/bigtable_handler.cpp

echo ""
echo "Dynamo DiscoverAllTableNames:"
sed -n '2851,2852p' ./store_handler/dynamo_handler.cpp

echo ""
echo "=== Check if there are any compilation instructions or CMakeLists ==="
fd -type f -name CMakeLists.txt | head -3

Repository: eloqdata/tx_service

Length of output: 119


🏁 Script executed:

#!/bin/bash

echo "=== Search for actual method calls ==="
grep -r "->DiscoverAllTableNames\|->UpsertDatabase\|->DropDatabase\|->FetchDatabase\|->FetchAllDatabase" . --include="*.cpp" 2>/dev/null | head -30

echo ""
echo "=== Check BigTable implementation signature ==="
head -1575 ./store_handler/bigtable_handler.cpp | tail -15

echo ""
echo "=== Check Dynamo implementation signature ==="
head -2860 ./store_handler/dynamo_handler.cpp | tail -15

echo ""
echo "=== Check if InitBucketsInfo still exists in implementation ==="
grep -n "void DataStoreServiceClient::InitBucketsInfo" ./store_handler/data_store_service_client.cpp

echo ""
echo "=== Check InitBucketsInfo context - static or instance method? ==="
grep -B 5 "InitBucketsInfo" ./store_handler/data_store_service_client.h | head -20

Repository: eloqdata/tx_service

Length of output: 1730


🏁 Script executed:

#!/bin/bash

echo "=== Compare header declaration vs implementation for BigTableHandler ==="
echo "Header:"
sed -n '144,148p' ./store_handler/bigtable_handler.h

echo ""
echo "Implementation:"
sed -n '1566,1570p' ./store_handler/bigtable_handler.cpp

echo ""
echo "=== Compare header declaration vs implementation for DynamoHandler ==="
echo "Header:"
sed -n '138,142p' ./store_handler/dynamo_handler.h

echo ""
echo "Implementation:"
sed -n '2851,2855p' ./store_handler/dynamo_handler.cpp

echo ""
echo "=== Check RocksDBHandler implementation ==="
sed -n '1495,1500p' ./store_handler/rocksdb_handler.cpp

echo ""
echo "=== Check DataStoreServiceClient implementation ==="
sed -n '2244,2248p' ./store_handler/data_store_service_client.cpp

Repository: eloqdata/tx_service

Length of output: 1633


🏁 Script executed:

#!/bin/bash

echo "=== Check all handler headers for inconsistent signatures ==="
echo "=== BigTableHandler header ==="
sed -n '144,148p' ./store_handler/bigtable_handler.h

echo ""
echo "=== DynamoHandler header ==="
sed -n '138,142p' ./store_handler/dynamo_handler.h

echo ""
echo "=== RocksDBHandler header ==="
sed -n '425,430p' ./store_handler/rocksdb_handler.h

echo ""
echo "=== DataStoreServiceClient header ==="
sed -n '358,363p' ./store_handler/data_store_service_client.h

echo ""
echo "=== Base class DataStoreHandler ==="
sed -n '242,248p' ./tx_service/include/store/data_store_handler.h

echo ""
echo "=== Summary: Check if all implementations match the base class signature ==="
echo "Checking which implementations have table_engine parameter..."
grep -A 5 'bool.*DiscoverAllTableNames' ./store_handler/*.h | grep -E '(\.h.*bool|table_engine)'

Repository: eloqdata/tx_service

Length of output: 2097


🏁 Script executed:

#!/bin/bash

echo "=== Check UpsertDatabase signatures across all handlers ==="
echo "Base class:"
sed -n '249,255p' ./tx_service/include/store/data_store_handler.h

echo ""
echo "BigTableHandler header:"
sed -n '150,154p' ./store_handler/bigtable_handler.h

echo ""
echo "DynamoHandler header:"
sed -n '201,205p' ./store_handler/dynamo_handler.h

echo ""
echo "RocksDBHandler header:"
sed -n '432,437p' ./store_handler/rocksdb_handler.h

echo ""
echo "DataStoreServiceClient header:"
sed -n '365,370p' ./store_handler/data_store_service_client.h

echo ""
echo "=== Check all other database methods for same issue ==="
echo "Checking DropDatabase..."
grep -A 4 'bool.*DropDatabase' ./store_handler/bigtable_handler.h | head -5
grep -A 4 'bool.*DropDatabase' ./tx_service/include/store/data_store_handler.h | head -5

echo ""
echo "=== Verify InitBucketsInfo is still present and check if it should be removed ==="
grep -C 3 'InitBucketsInfo' ./store_handler/data_store_service_client.h | head -20

Repository: eloqdata/tx_service

Length of output: 2589


🏁 Script executed:

#!/bin/bash

echo "=== Check all database methods in BigTableHandler to see which are missing table_engine ==="
grep -E 'bool (DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)' ./store_handler/bigtable_handler.h -A 4 | head -50

echo ""
echo "=== Check all database methods in DynamoHandler ==="
grep -E 'bool (DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)' ./store_handler/dynamo_handler.h -A 4 | head -50

echo ""
echo "=== Check InMemStore which might have the old signature ==="
grep -E 'bool (DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)' ./tx_service/include/store/int_mem_store.h -A 4 | head -50

Repository: eloqdata/tx_service

Length of output: 3286


🏁 Script executed:

#!/bin/bash

echo "=== Final verification: Check if code compiles with mismatched signatures ==="
echo "Summary of signature inconsistencies:"
echo ""
echo "Base class (DataStoreHandler) requires:"
echo "  virtual bool DiscoverAllTableNames(txservice::TableEngine table_engine, ...)"
echo ""
echo "But these implementations are missing table_engine parameter:"
echo "  - BigTableHandler"
echo "  - DynamoHandler"
echo "  - InMemStore"
echo ""
echo "These implementations have table_engine (correct):"
echo "  - DataStoreServiceClient"
echo "  - RocksDBHandler"
echo ""
echo "=== Check if InMemStore.h is still used ==="
grep -r 'InMemStore' --include='*.h' --include='*.cpp' | grep -v third_party | grep -E '(new |std::make|include)' | head -10

echo ""
echo "=== Verify InitBucketsInfo claim - is it really removed? ==="
echo "InitBucketsInfo is STILL present as static method in DataStoreServiceClient"
echo "Location: data_store_service_client.h:179 and called at line 149"
echo "It was NOT removed."

Repository: eloqdata/tx_service

Length of output: 780


Signature changes are incomplete: three implementations missing the table_engine parameter.

The base class DataStoreHandler now requires txservice::TableEngine table_engine as the first parameter for all five database methods, but three implementations have not been updated:

  • BigTableHandler (bigtable_handler.h:144-165)
  • DynamoHandler (dynamo_handler.h:138-216)
  • InMemStore (int_mem_store.h:150-187)

Only DataStoreServiceClient and RocksDBHandler have the correct signature. This will cause compilation failures due to virtual function signature mismatch.

Additionally, InitBucketsInfo has not been removed—it remains as a static method in DataStoreServiceClient (declared at line 179, called at line 149, implemented at line 3869 in the .cpp file).

Comment on lines +2201 to +2208
void DataStoreService::OpenDataStore(uint32_t shard_id,
std::unordered_set<uint16_t> &&bucket_ids)
{
if (data_store_factory_ != nullptr)
{
data_store_factory_->InitializePrewarmFilter(shard_id,
std::move(bucket_ids));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the full OpenDataStore function
cd store_handler/eloq_data_store_service && wc -l data_store_service.cpp

Repository: eloqdata/tx_service

Length of output: 90


🏁 Script executed:

# Read the function from lines 2201 to approximately 2242
sed -n '2201,2242p' store_handler/eloq_data_store_service/data_store_service.cpp

Repository: eloqdata/tx_service

Length of output: 1846


🏁 Script executed:

# Let's also check what InitializePrewarmFilter does
rg -n "InitializePrewarmFilter" store_handler/eloq_data_store_service/ -A 10

Repository: eloqdata/tx_service

Length of output: 3231


🏁 Script executed:

# Check if there are other callers of OpenDataStore to understand the context better
rg -n "OpenDataStore" store_handler/eloq_data_store_service/ --type cpp -B 2 -A 2

Repository: eloqdata/tx_service

Length of output: 2738


🏁 Script executed:

# Verify if InitializePrewarmFilter is called elsewhere
rg -n "InitializePrewarmFilter" store_handler/eloq_data_store_service/ --type cpp --type h | head -20

Repository: eloqdata/tx_service

Length of output: 404


Move InitializePrewarmFilter after the shard-status check to avoid side effects on no-op paths.

InitializePrewarmFilter modifies the factory's prewarm_filter configuration and consumes the bucket_ids rvalue reference, but this currently happens even when OpenDataStore returns early because the shard is already open. The side effect should only occur when actually opening the shard.

Proposed change
 void DataStoreService::OpenDataStore(uint32_t shard_id,
                                      std::unordered_set<uint16_t> &&bucket_ids)
 {
-    if (data_store_factory_ != nullptr)
-    {
-        data_store_factory_->InitializePrewarmFilter(shard_id,
-                                                     std::move(bucket_ids));
-    }
-
     auto start_time = std::chrono::steady_clock::now();
     auto &ds_ref = data_shards_.at(shard_id);
     if (ds_ref.shard_status_.load() != DSShardStatus::Closed)
     {
         LOG(INFO) << "OpenDataStore no-op for DSS shard status is not closed, "
                   << shard_id << ", shard_id_: " << ds_ref.shard_id_
                   << ", shard_status_: "
                   << static_cast<int>(ds_ref.shard_status_.load());
         return;
     }
+
+    if (data_store_factory_ != nullptr)
+    {
+        data_store_factory_->InitializePrewarmFilter(shard_id,
+                                                     std::move(bucket_ids));
+    }
🤖 Prompt for AI Agents
In @store_handler/eloq_data_store_service/data_store_service.cpp around lines
2201 - 2208, DataStoreService::OpenDataStore currently calls
data_store_factory_->InitializePrewarmFilter(shard_id, std::move(bucket_ids))
before checking whether the shard is already open, causing unwanted mutation and
consumption of bucket_ids on no-op paths; move the InitializePrewarmFilter call
so it executes only after the shard-status check that determines we will
actually open the shard (i.e., after the early-return guard for already-open
shards), ensuring InitializePrewarmFilter is invoked only on the real open path
and that bucket_ids is not consumed when OpenDataStore exits early.

Comment on lines 242 to 271
virtual bool DiscoverAllTableNames(
txservice::TableEngine table_engine,
std::vector<std::string> &norm_name_vec,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) = 0;

//-- database
virtual bool UpsertDatabase(
txservice::TableEngine table_engine,
std::string_view db,
std::string_view definition,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) = 0;
virtual bool DropDatabase(
txservice::TableEngine table_engine,
std::string_view db,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) = 0;
virtual bool FetchDatabase(
txservice::TableEngine table_engine,
std::string_view db,
std::string &definition,
bool &found,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) = 0;
virtual bool FetchAllDatabase(
txservice::TableEngine table_engine,
std::vector<std::string> &dbnames,
const std::function<void()> *yield_fptr = nullptr,
const std::function<void()> *resume_fptr = nullptr) = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all classes inheriting from DataStoreHandler
echo "=== Classes inheriting from DataStoreHandler ==="
rg -n -C2 --type=cpp --type=cc --type=h --type=hpp \
  ':\s*public\s+(txservice::store::)?DataStoreHandler\b'

echo ""
echo "=== Method overrides for the 5 updated methods ==="
rg -n --type=cpp --type=cc --type=h --type=hpp \
  'bool\s+(DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)\s*\(' | head -50

Repository: eloqdata/tx_service

Length of output: 234


🏁 Script executed:

#!/bin/bash
# Find C++ files
echo "=== C++ file extensions in repo ==="
git ls-files | grep -E '\.(cpp|cc|cxx|c\+\+|h|hpp|hxx|h\+\+)$' | head -20

echo ""
echo "=== Search for DataStoreHandler without file type restrictions ==="
rg -n 'DataStoreHandler' --type-not=json | head -30

echo ""
echo "=== Search for class definitions with DataStoreHandler inheritance ==="
rg -n 'class\s+\w+.*:\s*(public\s+)?(txservice::store::)?DataStoreHandler' | head -20

Repository: eloqdata/tx_service

Length of output: 4586


🏁 Script executed:

#!/bin/bash
# Check each implementer for the 5 updated methods
for file in "store_handler/rocksdb_handler.h" "store_handler/data_store_service_client.h" "tx_service/include/store/int_mem_store.h" "store_handler/bigtable_handler.h" "store_handler/dynamo_handler.h"; do
  echo "=== Checking $file ==="
  if [ -f "$file" ]; then
    # Search for the 5 method signatures
    rg -n -A3 '(DiscoverAllTableNames|UpsertDatabase|DropDatabase|FetchDatabase|FetchAllDatabase)' "$file" | head -50
  else
    echo "File not found"
  fi
  echo ""
done

Repository: eloqdata/tx_service

Length of output: 5397


Three DataStoreHandler implementers were not updated with the new txservice::TableEngine table_engine parameter and will cause build failures.

The following classes must be updated:

  • IntMemoryStore (tx_service/include/store/int_mem_store.h, lines 150–190)
  • BigTableHandler (store_handler/bigtable_handler.h, lines 144–168)
  • DynamoHandler (store_handler/dynamo_handler.h, lines 138–219)

Add txservice::TableEngine table_engine as the first parameter to all five methods: DiscoverAllTableNames, UpsertDatabase, DropDatabase, FetchDatabase, and FetchAllDatabase.

🤖 Prompt for AI Agents
In @tx_service/include/store/data_store_handler.h around lines 242 - 271, Update
the method signatures in the three implementers so they match the interface by
adding txservice::TableEngine table_engine as the first parameter to each of the
five methods: DiscoverAllTableNames, UpsertDatabase, DropDatabase,
FetchDatabase, and FetchAllDatabase in classes IntMemoryStore, BigTableHandler,
and DynamoHandler; ensure you change both the declaration and any corresponding
definitions/overrides to include the new first parameter (txservice::TableEngine
table_engine) and rebuild to fix the mismatched signature errors.

@lzxddz lzxddz requested a review from liunyl January 12, 2026 03:37
@lzxddz lzxddz merged commit 6e15523 into main Jan 13, 2026
4 checks passed
@lzxddz lzxddz deleted the change_kv_tablename branch January 14, 2026 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants