Walkthrough

This PR comprehensively restructures the project from a monolithic …

Changes

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120+ minutes

This PR introduces substantial architectural changes across multiple subsystems with high interconnection complexity.
Areas requiring extra attention during review:
Pre-merge checks: ❌ Failed (2 warnings, 1 inconclusive)
* Enable data_substrate share libs build
* Initialized g_data_substrate at the beginning of DataSubstrate::Initialize
* Install absl and yaml-cpp libs
* Rename txlog_rocksdb_cloud_endpoint_url
Actionable comments posted: 21

Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (36)
store_handler/eloq_data_store_service/data_store_fault_inject.h (lines 57-73)

57-73: Static map in header causes ODR violation and memory waste.

Defining a `static` variable in a header file creates a separate copy in each translation unit that includes this header. This violates the One Definition Rule (ODR) and wastes memory.

Option 1 (C++17): use `inline`:

```diff
-static std::unordered_map<std::string, FaultAction> action_name_to_enum_map{
+inline std::unordered_map<std::string, FaultAction> action_name_to_enum_map{
```

Option 2: move the definition to the `.cpp` file and declare it `extern` in the header:

```cpp
// In header:
extern std::unordered_map<std::string, FaultAction> action_name_to_enum_map;
// In .cpp:
std::unordered_map<std::string, FaultAction> action_name_to_enum_map{...};
```

store_handler/eloq_data_store_service/data_store_fault_inject.h (lines 163-176)
163-176: Race condition: returned pointer can be invalidated by concurrent operations.

`Entry()` returns a raw pointer to a map element after releasing the lock. If another thread calls `InjectFault(..., "remove")` or modifies the map, the pointer may become dangling, leading to use-after-free. The same issue exists in `TriggerAction(std::string fault_name)` at lines 178-195, where the pointer is used after the lock is released.

Consider:

- Returning a copy of `FaultEntry` instead of a pointer
- Using `std::shared_ptr<FaultEntry>` for entries
- Holding the lock for the entire duration of action triggering (may impact concurrency)
store_handler/README.md (lines 1-2)

1-2: Expand documentation for the store_handler module.

This README is minimal (just a title and a one-line description) for what appears to be a core module based on the PR context. Consider adding:
- Architecture overview
- Supported storage backends
- Configuration examples
- API usage examples
- Integration guide
eloq_metrics/src/metrics_app.cc (lines 40-51)

40-51: Memory leak: dynamically allocated `Executor` objects are never deleted.

Raw pointers are stored in `exec_vec` but never freed. After `Join()`, the objects remain allocated. Apply this diff to fix the memory leak:

```diff
-    std::vector<eloq_metrics_app::Executor *> exec_vec;
+    std::vector<std::unique_ptr<eloq_metrics_app::Executor>> exec_vec;
     for (int i = 0; i < 3; ++i)
     {
         std::string task_name = "Task_" + std::to_string(i);
-        exec_vec.push_back(new Executor(task_name,
+        exec_vec.push_back(std::make_unique<Executor>(task_name,
                                         i,
                                         cpu_set,
                                         metrics_registry.get(),
                                         {{"task", task_name}}));
     }
```

Also add `#include <memory>` if not already transitively included.

(Committable suggestion skipped: line range outside the PR's diff.)
eloq_metrics/src/metrics_registry_impl.cpp (lines 71-77)

71-77: Use `find()` instead of `operator[]` for lookup.

`collectors_[key]` inserts a default-constructed (null) entry if the key doesn't exist, silently modifying the map. Using `find()` is safer and more explicit:

```diff
 void MetricsRegistryImpl::Collect(metrics::MetricKey key,
                                   const metrics::Value &val)
 {
-    auto collector = collectors_[key].get();
-    assert(collector);
+    auto it = collectors_.find(key);
+    assert(it != collectors_.end() && it->second);
+    auto collector = it->second.get();
     collector->Collect(val);
 }
```

eloq_metrics/CMakeLists.txt (lines 11-14)
11-14: Undefined variable `ELOQ_METRICS_ROOT_DIR` will cause empty paths.

Line 11 defines `METRICS_ROOT_DIR`, but lines 12-13 reference the undefined `ELOQ_METRICS_ROOT_DIR`. This will result in empty `METRICS_SRC_DIR` and `METRICS_INCLUDE_DIR` variables.

```diff
 set(METRICS_ROOT_DIR ${CMAKE_CURRENT_SOURCE_DIR})
-set(METRICS_SRC_DIR "${ELOQ_METRICS_ROOT_DIR}/src")
-set(METRICS_INCLUDE_DIR "${ELOQ_METRICS_ROOT_DIR}/include")
+set(METRICS_SRC_DIR "${METRICS_ROOT_DIR}/src")
+set(METRICS_INCLUDE_DIR "${METRICS_ROOT_DIR}/include")
```

eloq_metrics/src/dummy_executor.cc (lines 43-54)
43-54: CPU affinity is set after the thread has already started - race condition.

The constructor delegates to `Executor(task_name, metrics_registry, common_labels)`, which starts the thread at lines 63-64. After delegation completes, lines 50-53 attempt to set CPU affinity, but the thread is already running. This is a race condition and the affinity may not take effect as intended.

Consider restructuring to set affinity before starting the thread, or use a two-phase initialization pattern.
eloq_metrics/src/dummy_executor.cc (lines 77-99)

77-99: Busy loop without sleep will consume 100% CPU.

The `while (true)` loop has no sleep or yield, causing the thread to spin at full CPU utilization. This is problematic for a benchmark/demo executor. Consider adding a small sleep interval.

```diff
 while (true)
 {
     metrics::Value val = metrics::Value(max_range(rng));
     meter_->Collect(dummy_executor_hist_metric, val);
     auto tick = boost::posix_time::second_clock::local_time();
     boost::posix_time::time_duration diff = tick - start;
+    boost::this_thread::sleep_for(boost::chrono::milliseconds(1));
     if (diff.total_seconds() > 0 && diff.total_seconds() % 5 == 0)
```

eloq_metrics/src/dummy_executor.cc (lines 106-108)
106-108: Destructor doesn't join the thread, risking undefined behavior.

If the `Executor` is destroyed without calling `Join()`, the thread may still be running and accessing member variables that have been destroyed.

```diff
 Executor::~Executor()
 {
+    if (inner_thread_ && inner_thread_->joinable())
+    {
+        inner_thread_->join();
+    }
 }
```

eloq_metrics/src/dummy_executor.cc (lines 31-41)
31-41: Move constructor and move assignment operator are incomplete.

The move constructor only moves `inner_thread_` but not `task_name_`, `meter_`, or `running_state_`. The move assignment copies `task_name_` but doesn't move `meter_` or `running_state_`.

```diff
 Executor::Executor(eloq_metrics_app::Executor &&executor) noexcept
-    : inner_thread_(std::move(executor.inner_thread_))
+    : inner_thread_(std::move(executor.inner_thread_)),
+      task_name_(std::move(executor.task_name_)),
+      meter_(std::move(executor.meter_)),
+      running_state_(executor.running_state_)
 {
 }

 Executor &Executor::operator=(Executor &&executor_) noexcept
 {
     this->inner_thread_ = std::move(executor_.inner_thread_);
-    this->task_name_ = executor_.task_name_;
+    this->task_name_ = std::move(executor_.task_name_);
+    this->meter_ = std::move(executor_.meter_);
+    this->running_state_ = executor_.running_state_;
     return *this;
 }
```

core/src/log_init.cpp (lines 141-148)
141-148: `std::stoi` can throw on malformed port values.

If the config contains a non-numeric value after the colon, `std::stoi` will throw `std::invalid_argument` or `std::out_of_range`. Consider adding exception handling or validation.

```diff
 size_t c_idx = token.find_first_of(':');
 if (c_idx != std::string::npos)
 {
     txlog_ips.emplace_back(token.substr(0, c_idx));
-    uint16_t pt = std::stoi(token.substr(c_idx + 1));
-    txlog_ports.emplace_back(pt);
+    try
+    {
+        uint16_t pt = std::stoi(token.substr(c_idx + 1));
+        txlog_ports.emplace_back(pt);
+    }
+    catch (const std::exception &e)
+    {
+        LOG(ERROR) << "Invalid port in txlog_service_list: " << token;
+        return false;
+    }
 }
```

core/src/log_init.cpp (lines 387-399)
387-399: Invalid time format logs an error but doesn't fail initialization.

When `std::get_time` fails to parse the log purger schedule, the code logs an error but continues with uninitialized `log_purger_tm` values. This could lead to unexpected purger scheduling behavior.

```diff
 if (iss.fail())
 {
-    LOG(ERROR) << "Invalid time format." << std::endl;
+    LOG(ERROR) << "Invalid time format for txlog_rocksdb_cloud_log_purger_schedule: "
+               << log_purger_schedule;
+    return false;
 }
 else
 {
```

eloq_metrics/benchmark/CMakeLists.txt (lines 29-32)
29-32: Test command references the wrong executable.

The test is named `eloq-metrics-bench`, but the `COMMAND` specifies `mono-bench`, which doesn't match any executable defined in this file. This will cause test execution to fail.

```diff
 add_test(
     NAME eloq-metrics-bench
-    COMMAND mono-bench
+    COMMAND eloq-metrics-bench
 )
```

core/include/glog_error_logging.h (lines 80-87)
80-87: Add error handling for `readlink` failure.

`readlink` returns `-1` on failure, which would result in undefined behavior when constructing the string with a negative length. Additionally, this approach is Linux-specific and won't work on macOS or other platforms.

```diff
 // Get the absolute path of the bin directory
 char bin_path[PATH_MAX];
 ssize_t len = readlink("/proc/self/exe", bin_path, PATH_MAX);
+if (len == -1 || len >= PATH_MAX)
+{
+    // Fallback to current directory if readlink fails
+    FLAGS_log_dir = "./logs";
+}
+else
+{
-std::filesystem::path fullPath(std::string(bin_path, len));
-std::filesystem::path dir_path =
-    fullPath.parent_path().parent_path();
-FLAGS_log_dir = dir_path.string() + "/logs";
+    std::filesystem::path fullPath(std::string(bin_path, len));
+    std::filesystem::path dir_path =
+        fullPath.parent_path().parent_path();
+    FLAGS_log_dir = dir_path.string() + "/logs";
+}
```

core/src/data_substrate.cpp (lines 443-448)
443-448: Logic appears inverted for the `enable_wal` configuration.

The ternary condition checks if the flag is default, but then uses `FLAGS_enable_wal` in both branches instead of reading from `config_reader` when the flag is default. This means the INI config value is never used.

```diff
 core_config_.enable_wal =
     !core_config_.bootstrap &&
-    (CheckCommandLineFlagIsDefault("enable_wal")
+    (!CheckCommandLineFlagIsDefault("enable_wal")
          ? FLAGS_enable_wal
          : config_reader.GetBoolean(
                "local", "enable_wal", FLAGS_enable_wal));
```

core/src/data_substrate.cpp (line 216)
216: Method name "Destory" is misspelled in the Sequences class definition; it should be "Destroy".

The typo exists not only in the call at line 216 but also in the method definition itself (tx_service/include/sequences/sequences.h:92). Both the definition and all call sites must be updated consistently:

```diff
-txservice::Sequences::Destory();
+txservice::Sequences::Destroy();
```

Additionally, update the method definition in tx_service/include/sequences/sequences.h:92 from `Destory()` to `Destroy()`.

build_tx_service.cmake (lines 212-280)
212-280: Remove large commented-out code block.

This 68-line commented block duplicates the active `FORK_HM_PROCESS` logic below it. Keeping dead code increases maintenance burden and causes confusion. Delete lines 212-280 entirely, as the active implementation exists in lines 282-354.
eloq_metrics/include/prometheus_collector.h (lines 42-50)

42-50: `static const` vector in header causes ODR violation and code bloat.

Each translation unit that includes this header will get its own copy of `PROMETHEUS_HISTOGRAM_DEF_BUCKETS`. This violates ODR and wastes memory.

Move to an `inline` variable (C++17):

```diff
-static const std::vector<double> PROMETHEUS_HISTOGRAM_DEF_BUCKETS = {
+inline const std::vector<double> PROMETHEUS_HISTOGRAM_DEF_BUCKETS = {
     1e+1, 2e+1, 4e+1, 6e+1, 8e+1,  // < 100 us
     ...
 };
```

Or declare it `extern` in the header and define it in the `.cc` file:

```cpp
// In header:
extern const std::vector<double> PROMETHEUS_HISTOGRAM_DEF_BUCKETS;
// In .cc:
const std::vector<double> PROMETHEUS_HISTOGRAM_DEF_BUCKETS = {...};
```

store_handler/data_store_service_scanner.cpp (lines 102-118)
102-118: `ScanClose` is not called when the partition has no more data.

When `last_batch_size_ < batch_size`, the partition is exhausted but `ScanClose` is never sent to the server. However, `IncrementInFlightFetchCount()` is called unconditionally. This causes asymmetric increment/decrement when the condition on line 107 is false.

Consider restructuring to either:

- Only increment the counter when actually sending a close request
- Or always send close and let the server handle the no-op case

```diff
 bool SinglePartitionScanner::ScanClose()
 {
     ...
-    scanner_->IncrementInFlightFetchCount();
-    assert(last_batch_size_ <= scanner_->GetBatchSize());
     // If parition scanner is not run out of data,
     // we need to close the scanner
     if (last_batch_size_ == scanner_->GetBatchSize())
     {
+        scanner_->IncrementInFlightFetchCount();
         client->ScanClose(scanner_->GetTableName(),
                           partition_id_,
                           data_shard_id_,
                           session_id_,
                           this,
                           ProcessScanCloseResult);
     }
     return true;
 }
```

build_tx_service.cmake (lines 291-303)
291-303: `BRAFT_INCLUDE_PATH` is used before it's defined.

`HOST_MANAGER_INCLUDE_DIR` references `BRAFT_INCLUDE_PATH` on line 293, but `find_path(BRAFT_INCLUDE_PATH ...)` is called on line 299. This will result in an empty include path.

Move the `find_path` and `find_library` calls before the `HOST_MANAGER_INCLUDE_DIR` assignment:

```diff
 if (FORK_HM_PROCESS)
+    find_path(BRAFT_INCLUDE_PATH NAMES braft/raft.h)
+    find_library(BRAFT_LIB NAMES braft)
+    if ((NOT BRAFT_INCLUDE_PATH) OR (NOT BRAFT_LIB))
+        message (FATAL_ERROR "Fail to find braft")
+    endif()
+
     SET (HOST_MANAGER_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR}/tx_service/raft_host_manager)
     set(HOST_MANAGER_INCLUDE_DIR ${HOST_MANAGER_SOURCE_DIR}/include ... ${BRAFT_INCLUDE_PATH} ...)
-
-    find_path(BRAFT_INCLUDE_PATH NAMES braft/raft.h)
-    find_library(BRAFT_LIB NAMES braft)
-    if ((NOT BRAFT_INCLUDE_PATH) OR (NOT BRAFT_LIB))
-        message (FATAL_ERROR "Fail to find braft")
-    endif()
```

(Committable suggestion skipped: line range outside the PR's diff.)
store_handler/data_store_service_scanner.cpp (lines 76-80)

76-80: Search conditions are built but not used.

The constructor (lines 250-258) builds `search_conditions_` from `pushdown_cond`, but `ScanNext()` is called with `nullptr` instead of the search conditions. This means pushdown predicates are never applied.

```diff
 client->ScanNext(
     scanner_->GetTableName(),
     partition_id_,
     data_shard_id_,
     last_key_,
     scanner_->GetEndKey(),
     session_id_,
     true,
     // only set inclusive_start for the first batch
     first_batch_fetched_ ? false : scanner_->IsInclusiveStart(),
     scanner_->IsInclusiveEnd(),
     scanner_->IsScanForward(),
     scanner_->GetBatchSize(),
-    // scanner_->SearchConditions(),
-    nullptr,
+    scanner_->SearchConditions(),
     this,
     ProcessScanNextResult);
```

Ensure a `SearchConditions()` accessor exists in the `DataStoreServiceScanner` base class.

(Committable suggestion skipped: line range outside the PR's diff.)
store_handler/data_store_service_client_closure.cpp (lines 372-379)

372-379: Same issue: `std::abort()` on deserialization failure.

This is the same pattern as in `FetchRecordCallback`. Consider consistent error handling that returns an error code rather than crashing.

store_handler/data_store_service_client_closure.cpp (lines 1231-1239)

1231-1239: Another `std::abort()` on deserialization failure.

Consistent with earlier findings, this should return an error rather than terminating the process.
eloq_metrics/include/metrics_manager.h (line 121)

121: `std::stoi` may throw on invalid input.

If `ELOQ_METRICS_PORT` contains a non-numeric value, `std::stoi` will throw `std::invalid_argument` or `std::out_of_range`, crashing initialization.

```diff
-auto bind_port = std::stoi(bind_port_str);
+int bind_port;
+try
+{
+    bind_port = std::stoi(bind_port_str);
+}
+catch (const std::exception &e)
+{
+#ifdef WITH_GLOG
+    LOG(ERROR) << "Invalid port value: " << bind_port_str;
+#endif
+    mgr_init_err_ = MetricsMgrErrors::CollectorOpenErr;
+    return;
+}
```

store_handler/data_store_service_client_closure.cpp (lines 196-203)
196-203: Calling `std::abort()` on deserialization failure may be too aggressive.

If `DeserializeTxRecordStr` fails, the process terminates immediately. Consider returning an error code to allow graceful error handling and recovery.

```diff
 if (!DataStoreServiceClient::DeserializeTxRecordStr(
         val, is_deleted, offset))
 {
     LOG(ERROR) << "====fetch record===decode error=="
                << " key: " << read_closure->Key()
                << " status: " << (int) fetch_cc->rec_status_;
-    std::abort();
+    fetch_cc->SetFinish(
+        static_cast<int>(txservice::CcErrorCode::DATA_STORE_ERR));
+    return;
 }
```

store_handler/dynamo_scanner.h (lines 132-144)
132-144: Potential dangling pointer: `start_key` passed by reference but stored as a pointer.

`HashPartitionDynamoScanner` takes `start_key` by const reference, but the base class stores `&start_key` as a pointer. If the caller's `start_key` object goes out of scope before the scanner, this becomes a dangling pointer. Consider storing a copy or documenting the lifetime requirement clearly.

Either store a copy of the key:

```diff
-const EloqKV::EloqKey *start_key_;
+std::unique_ptr<EloqKV::EloqKey> start_key_;
```

Or document the lifetime requirement if the caller guarantees validity:

```cpp
// NOTE: start_key must remain valid for the lifetime of this scanner
HashPartitionDynamoScanner(
    ...
    const EloqKV::EloqKey &start_key,  // must outlive this scanner
    ...
```

store_handler/dynamo_handler_typed.cpp (lines 159-193)
159-193: Multiple unimplemented methods with `assert(false)` stubs.

`NewTxRecord`, `NegInfKey`, `PosInfKey`, and `ColumnsToScanWhenBuildSk` are all stubs. These are likely called at runtime based on the scanner implementations. Either implement them or document that this handler doesn't support these operations by throwing descriptive exceptions.

Would you like me to help identify where these methods are called from to understand the expected behavior?
store_handler/dynamo_handler_typed.cpp (lines 36-53)

36-53: Stub implementations with `assert(false)` will cause runtime failures.

These methods (`SetSequenceTableSchema`, `GetSequenceTableSchema`, `GetIndexes`) unconditionally trigger `assert(false)`. In debug builds, this crashes; in release builds, behavior is undefined. If these are intended to be overridden by derived classes, consider making them pure virtual in the base class. Otherwise, throw a proper exception with a meaningful message.

```diff
 void DynamoHandlerTyped::SetSequenceTableSchema(
     const txservice::TableSchema *tbl_schema)
 {
-    assert(false);
+    throw std::logic_error(
+        "SetSequenceTableSchema not implemented for DynamoHandlerTyped");
 }

 const txservice::TableSchema *DynamoHandlerTyped::GetSequenceTableSchema()
 {
-    assert(false);
+    throw std::logic_error(
+        "GetSequenceTableSchema not implemented for DynamoHandlerTyped");
+    return nullptr;
 }
```

CMakeLists.txt (lines 215-219)
215-219: Proto compilation at configure time may cause stale artifacts.

Using `execute_process()` runs protoc during the CMake configure step, not during the build. If `.proto` files change, the generated files won't update until CMake is re-run.

Consider using `add_custom_command()` for build-time generation:

```cmake
add_custom_command(
    OUTPUT ${DS_PROTO_DIR}/ds_request.pb.cc ${DS_PROTO_DIR}/ds_request.pb.h
    COMMAND protoc --cpp_out=${DS_PROTO_DIR} -I${DS_PROTO_DIR} ${DS_PROTO_DIR}/ds_request.proto
    DEPENDS ${DS_PROTO_DIR}/ds_request.proto
    COMMENT "Generating protobuf files for ds_request"
)
```

store_handler/dynamo_handler.h (lines 584-588)
584-588: Copy constructor doesn't copy the `ttl_set` member.

The copy constructor copies `kv_table_name_` and `kv_index_names_` but omits `ttl_set`, which will default to `false` in copied objects regardless of the source value.

```diff
 DynamoCatalogInfo(const DynamoCatalogInfo &kv_info)
 {
     kv_table_name_ = kv_info.kv_table_name_;
     kv_index_names_ = kv_info.kv_index_names_;
+    ttl_set = kv_info.ttl_set;
 }
```

store_handler/dynamo_handler.h (lines 388-406)
388-406: Member `old_table_schema_` is declared but never initialized in constructors.

`UpsertTableData::old_table_schema_` (line 391) is declared, but none of the three constructors initialize it, leaving it with an indeterminate value.

Initialize the member in all constructors:

```diff
 UpsertTableData(
     DynamoHandler *dynamo_hd,
     const txservice::TableName *table_name,
     const txservice::TableSchema *old_schema,
     const txservice::TableSchema *schema,
     txservice::OperationType op_type,
     uint64_t write_time,
     std::shared_ptr<void> defer_unpin,
     txservice::CcHandlerResult<txservice::Void> *hd_res,
     const txservice::AlterTableInfo *alter_table_info = nullptr)
     : hd_res_(hd_res),
       dynamo_hd_(dynamo_hd),
       table_name_(table_name),
+      old_table_schema_(old_schema),
       table_schema_(schema),
```

And similarly for the copy constructors, copying from `other.old_table_schema_`.

(Committable suggestion skipped: line range outside the PR's diff.)
store_handler/data_store_service_scanner.h (lines 173-177)

173-177: Dangling reference risk: storing a reference to an external string.

`table_name_` is stored as `const std::string &`, creating a lifetime dependency on the caller's string. If the source string is destroyed while the scanner is active, this becomes a dangling reference.

Consider storing by value to avoid lifetime issues:

```diff
 protected:
     // initialized parameters
     DataStoreServiceClient *client_;
-    const std::string &table_name_;
+    std::string table_name_;
     std::string start_key_;
```

And update the constructor initialization:

```diff
-    table_name_(table_name),
+    table_name_(table_name),  // Now copies the string
```

store_handler/bigtable_handler.cpp (lines 1926-1927)
1926-1927: Potential data race in async callback modifying the `futures` vector.

The callback at lines 1926-1927 modifies the `futures` vector (swap and pop_back) while holding `flight_bulks_mux_`. However, the `futures` vector is passed by reference from the caller (`PutAll`, `PutArchivesAll`) and new elements are added at line 1932 after releasing the lock. This creates a race between the callback modifying the vector and the main thread adding to it.

The `idx = futures.size()` capture at line 1892 becomes invalid after the swap/pop_back operations from other callbacks.

Consider using an index-stable container or tracking completed futures separately rather than modifying the vector in-place.
store_handler/bigtable_handler.cpp (lines 1863-1867)

1863-1867: Return value of `PutArchivesAll` is ignored, potentially masking failures.

The variable `ret` is assigned the result of `PutArchivesAll` but is never checked or used. If archiving fails, the function still returns `true`, which could lead to data inconsistency.

Apply this diff to fix the issue:

```diff
-    bool ret = PutArchivesAll(node_group,
-                              table_name,
-                              table_schema->GetKVCatalogInfo(),
-                              archive_vec);
+    if (!PutArchivesAll(node_group,
+                        table_name,
+                        table_schema->GetKVCatalogInfo(),
+                        archive_vec))
+    {
+        return false;
+    }
 }
```

store_handler/bigtable_handler.h (line 35)
35: Avoid `using namespace` in header files.

`using namespace MyEloq;` in a header file pollutes the global namespace for all files that include this header, potentially causing name collisions and making code harder to reason about.

```diff
-using namespace MyEloq;
 namespace EloqDS
 {
+using MyEloq::EloqKey;
+using MyEloq::EloqRecord;
+// Add other specific using declarations as needed
```

store_handler/bigtable_handler.h (lines 350-371)
350-371: `strtok` is not thread-safe and `const_cast` violates const correctness.

The `Deserialize` methods use `const_cast` to remove const from `payload_` and then use `strtok`, which modifies the string in place and uses static internal state, making it not thread-safe.

Consider using a safer alternative:

```diff
 void Deserialize(std::vector<long> &nums) const
 {
-    char *payload = const_cast<char *>(payload_.c_str());
-    char *token = strtok(payload, ",");
-    while (token)
+    std::string_view sv(payload_);
+    size_t pos = 0;
+    while (pos < sv.size())
     {
-        long num = strtol(token, nullptr, 10);
-        assert(errno != ERANGE);
-        nums.push_back(num);
-        token = strtok(nullptr, ",");
+        size_t next = sv.find(',', pos);
+        if (next == std::string_view::npos) next = sv.size();
+        long num = std::strtol(sv.data() + pos, nullptr, 10);
+        nums.push_back(num);
+        pos = next + 1;
    }
 }
```
🟡 Minor comments (35)
store_handler/eloq_data_store_service/RocksDB_Configuration_Flags.md (line 82)

82: Fix typo: "dialy" should be "daily". Line 82 contains a spelling error in the flag name `rocksdb_dialy_offpeak_time_utc`.

```diff
-| `rocksdb_dialy_offpeak_time_utc` | No | Local time zone's 00:00-05:00 converted to UTC | Time range in format "HH:MM-HH:MM" | Daily off-peak time in UTC for scheduling compactions |
+| `rocksdb_daily_offpeak_time_utc` | No | Local time zone's 00:00-05:00 converted to UTC | Time range in format "HH:MM-HH:MM" | Daily off-peak time in UTC for scheduling compactions |
```

store_handler/eloq_data_store_service/data_store_fault_inject.h (line 106)
106: `assert` silently passes in release builds if the parameter format is invalid.

If `paras` contains a malformed key-value pair without `=`, `assert(pos3 != std::string::npos)` will crash in debug builds but continue with undefined behavior (using `npos` as an index) in release builds.

Consider using explicit error handling consistent with lines 94-96:

```diff
 std::string sbs = paras.substr(pos1, pos2 - pos1);
 size_t pos3 = sbs.find('=');
-assert(pos3 != std::string::npos);
+if (pos3 == std::string::npos)
+{
+    LOG(ERROR) << "Error parameters for fault inject: missing '=' in name="
+               << fault_name << ", segment=" << sbs;
+    abort();
+}
```

store_handler/eloq_data_store_service/data_store_fault_inject.cpp (lines 100-109)
100-109: `std::stoi` can throw if `para` contains invalid input.

If the parameter string is not a valid integer (e.g., `"abc"` or an overflowing value), `std::stoi` throws `std::invalid_argument` or `std::out_of_range`, causing an unhandled exception and process termination.

Consider wrapping with try-catch or validating input:

```diff
 case FaultAction::SLEEP:
 {
     int secs = 1;
     if (!para.empty())
     {
-        secs = std::stoi(para);
+        try
+        {
+            secs = std::stoi(para);
+        }
+        catch (const std::exception &e)
+        {
+            LOG(ERROR) << "Invalid sleep parameter: " << para
+                       << ", using default 1s";
+        }
     }
     sleep(secs);
     break;
 }
```

store_handler/.clang-format (line 9)
9: Fix typo in comment: "sepecified" should be "specified".

```diff
-# When an option is not sepecified, it defaults to Google's style (this can be
+# When an option is not specified, it defaults to Google's style (this can be
```

store_handler/.clang-format (line 6)
6: Fix typo in comment: "Advacned" should be "Advanced".

```diff
-# compiled by the Advacned Technology department.
+# compiled by the Advanced Technology department.
```

eloq_metrics/.clang-format (line 6)
6: Fix typo in comment: "Advacned" should be "Advanced".

```diff
-# compiled by the Advacned Technology department.
+# compiled by the Advanced Technology department.
```

eloq_metrics/.clang-format (line 9)
9: Fix typo in comment: "sepecified" should be "specified".

```diff
-# When an option is not sepecified, it defaults to Google's style (this can be
+# When an option is not specified, it defaults to Google's style (this can be
```

store_handler/eloq_data_store_service/.gitignore (lines 58-60)
58-60: Remove the duplicate "bin" entry.

The pattern `bin` appears on both line 58 and line 60.

```diff
 bin
 bld
-bin
```

cmake/eloq_metrics/README.md (lines 94-102)
94-102: Complete the cmake command.

Lines 94-102 show an incomplete cmake command; the initial `cmake -E make_directory build && cmake -S . -B build \` is missing.

```diff
 cd eloq-metrics
+cmake -E make_directory build && \
+cmake -S . -B build \
     -DCMAKE_BUILD_TYPE=Release \
     -DCMAKE_CXX_FLAGS="${CXXFLAGS} -fPIC" \
     -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake \
     -DENABLE_BENCHMARK=ON \
     -DWITH_ABSEIL=ON
 cd build && cmake --build . -j4
```

store_handler/eloq_data_store_service/build_eloq_store.cmake (line 28)
28: Optional zstd is linked unconditionally.

`find_library(ZSTD_LIBRARY zstd)` doesn't use `REQUIRED`, making zstd optional. However, line 110 links against `${ZSTD_LIBRARY}` unconditionally. If zstd isn't found, this will be empty or `NOTFOUND`.

Either make it required or link conditionally:

```diff
 find_library(ZSTD_LIBRARY zstd)
+if(ZSTD_LIBRARY)
+    set(ZSTD_LINK_LIB ${ZSTD_LIBRARY})
+else()
+    set(ZSTD_LINK_LIB "")
+    message(STATUS "zstd not found, compression support disabled")
+endif()
```

Then on line 110, use `${ZSTD_LINK_LIB}` instead of `${ZSTD_LIBRARY}`.

(Committable suggestion skipped: line range outside the PR's diff.)
log_service (line 1)

1: Improve submodule initialization documentation; correct the verification command syntax.

The `.gitmodules` configuration for `log_service` is correctly set up. However:

- The verification command in your comment is syntactically incorrect. `git ls-remote` does not accept commit hashes as arguments; it only works with refs (branches/tags). To verify a commit exists, use `git ls-remote --heads --tags git@github.com:eloqdata/log_service.git` and search the output, or clone the repository and check with `git cat-file -t <hash>`.
- The SSH URL (`git@github.com:...`) may create friction for developers who prefer HTTPS. Consider documenting both protocols or using HTTPS by default.
- Add explicit submodule initialization instructions to `README.md` or a setup guide (e.g., `git clone --recursive` or `git submodule update --init --recursive`). Currently, the README only mentions that transaction logs are stored in the Log Service but lacks setup steps.
111-137: Test logic issue: shared collector state across WHEN blocks.The two
WHENblocks at lines 118 and 127 share the samecollectorinstance. When tests run, the firstWHENopens the collector, and the secondWHEN's behavior depends on whether both blocks execute independently (fresh collector) or sequentially (already opened). Catch2's BDD macros create separate execution paths, but this structure is confusing and may not test the intended scenario.Consider restructuring to make each scenario independent:
-SCENARIO("Metrics collector call Open several times", - "[MCCallOpenSeveralTimes]") +SCENARIO("Metrics collector Open returns true on first call", + "[MCOpenOnce]") { - INFO("Run unit test MCCallOpenSeveralTimes"); - WHEN("init prometheus collector") + GIVEN("A fresh prometheus collector") { metrics::PrometheusCollector collector{"0.0.0.0", 18082}; WHEN("The open method is called") { auto open_one = collector.Open(); THEN("open one will return success") { REQUIRE(open_one == true); } } - - WHEN("The open method is call twice") - { - collector.Open(); - auto open_two = collector.Open(); - THEN("open two will return false") - { - REQUIRE(open_two == false); - } - } } } + +SCENARIO("Metrics collector Open returns false on second call", + "[MCOpenTwice]") +{ + GIVEN("A prometheus collector that has been opened") + { + metrics::PrometheusCollector collector{"0.0.0.0", 18083}; + collector.Open(); + WHEN("Open is called again") + { + auto open_two = collector.Open(); + THEN("it returns false") + { + REQUIRE(open_two == false); + } + } + } +}eloq_metrics/tests/metrics_http_service_test.cc-83-90 (1)
83-90: Reorder assertions to avoid potential UB.

Line 86 constructs a `std::string` from `mgr_result.not_ok_` before asserting it's non-null on line 88. If the test assumption is violated and `not_ok_` is null, this causes undefined behavior.

```diff
 THEN("port conflict MetricsMgr init failure")
 {
     auto mgr_result = metrics::MetricsMgr::GetMetricMgrInstance();
-    std::string err_msg(mgr_result.not_ok_);
-    INFO("MetricsMgrResult ErrMsg: " << err_msg);
     REQUIRE(mgr_result.not_ok_ != nullptr);
+    std::string err_msg(mgr_result.not_ok_);
+    INFO("MetricsMgrResult ErrMsg: " << err_msg);
     REQUIRE(mgr_result.mgr_ == nullptr);
 }
```

eloq_metrics/tests/metrics_manager_test.cc (lines 41-52)
41-52: Add null-checks before dereferencing `mgr_`.

The test asserts `mgr_res_1.not_ok_ == mgr_res_2.not_ok_` but does not verify that the `mgr_` pointers are non-null before dereferencing them on lines 49-50. If both results have `not_ok_ != nullptr`, dereferencing `mgr_` would cause undefined behavior.

```diff
 THEN("Two MetricsMgr instance equal")
 {
     INFO("check MetricsMgrResult");
     REQUIRE(mgr_res_1.not_ok_ == mgr_res_2.not_ok_);
+    REQUIRE(mgr_res_1.mgr_ != nullptr);
+    REQUIRE(mgr_res_2.mgr_ != nullptr);
     metrics::MetricsMgr &mgr_1 = *mgr_res_1.mgr_;
     metrics::MetricsMgr &mgr_2 = *mgr_res_2.mgr_;
     auto mgr_instance_equals = mgr_1 == mgr_2;
     REQUIRE(mgr_instance_equals);
 }
```

eloq_metrics/tests/metrics_http_service_test.cc (lines 46-64)
46-64: Ensure curl cleanup occurs unconditionally.

`curl_easy_cleanup(curl)` is only called inside the `THEN` block. If the `REQUIRE` on line 57 or 61 fails, the curl handle leaks. Also, `curl_global_cleanup()` is never called.

Consider restructuring to ensure cleanup:

```diff
 curl_easy_setopt(curl, CURLOPT_WRITEDATA, &rsp_buffer);
 CURLcode curl_response = curl_easy_perform(curl);
+long http_status = 0;
+curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_status);
+curl_easy_cleanup(curl);
+curl_global_cleanup();
 THEN("access to eloq_metrics http service return 200.")
 {
     REQUIRE(curl_response == CURLE_OK);
-    long http_status;
-    curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_status);
     INFO("/eloq_metrics response \n" << rsp_buffer);
     REQUIRE(http_status == 200);
-    curl_easy_cleanup(curl);
 }
```

eloq_metrics/benchmark/tx_service_metrics_bench.cpp (line 31)
31-31: Typo: "guage" should be "gauge".

```diff
- metrics::Name name{"guage"};
+ metrics::Name name{"gauge"};
```

eloq_metrics/include/metrics_collector.h-24-26 (1)
24-26: Use angle brackets for standard library includes.

Line 24 uses `"memory"`, which searches the local directory first. Standard library headers should use angle brackets.

```diff
-#include "memory"
+#include <memory>
 #include "metrics.h"
 #include "prometheus/client_metric.h"
```

eloq_metrics/benchmark/tx_service_metrics_bench.cpp-19-25 (1)
19-25: Unused variable `time_start` will cause a compiler warning.

The variable `time_start` is assigned but never used. Either use it (e.g., with `benchmark::DoNotOptimize`) or remove the assignment.

```diff
 static void BM_get_time_point(benchmark::State &state)
 {
     while (state.KeepRunning())
     {
-        auto time_start = metrics::Clock::now();
+        benchmark::DoNotOptimize(metrics::Clock::now());
     }
 }
```

eloq_metrics/CMakeLists.txt-43-49 (1)
43-49: Undefined `DEF_INSTALL_CMAKEDIR` and loop variable mismatch.

- Line 43: `DEF_INSTALL_CMAKEDIR` is not defined, so `INSTALL_CMAKEDIR` will be empty.
- Line 45: The loop uses `include` but the variable is named `INSTALL_INCLUDEDIR` (with a `DIR` suffix), causing a mismatch in the `INSTALL_${p}DIR` pattern.

```diff
-set(INSTALL_CMAKEDIR ${DEF_INSTALL_CMAKEDIR} CACHE PATH "Installation directory for CMake files")
+set(DEF_INSTALL_CMAKEDIR "${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}" CACHE PATH "Default CMake install directory")
+set(INSTALL_CMAKEDIR ${DEF_INSTALL_CMAKEDIR} CACHE PATH "Installation directory for CMake files")

-foreach(p LIB BIN include CMAKE)
+foreach(p LIB BIN INCLUDE CMAKE)
```

core/src/metrics_init.cpp-14-14 (1)
14-14: Typo in comment: "TDOO" should be "TODO".

```diff
-#include "mysql_metrics.h" //TDOO(liunyl): move this file to correct location
+#include "mysql_metrics.h" //TODO(liunyl): move this file to correct location
```

eloq_metrics/benchmark/redis_metrics_bench.cpp-325-331 (1)
325-331: Unused variable may be optimized away.

The benchmark measures the time to obtain a time point, but `time_start` is unused. The compiler may optimize away the call entirely, making the benchmark meaningless.

Apply this diff to prevent optimization:

```diff
 static void BM_get_time_point(benchmark::State &state)
 {
     while (state.KeepRunning())
     {
         auto time_start = metrics::Clock::now();
+        benchmark::DoNotOptimize(time_start);
     }
 }
```

eloq_metrics/benchmark/redis_metrics_bench.cpp-19-123 (1)
19-123: Duplicate entries in `command_types` vector.

`"dump"` and `"restore"` appear twice in the list (lines 28-29 and 122-123). This will cause duplicate metric registrations in the benchmark.

Remove the duplicate entries at lines 122-123:

```diff
     "sunionstore",
     "sort",
-    "sort_ro",
-    "dump",
-    "restore"};
+    "sort_ro"};
```

store_handler/eloq_data_store_service/CMakeLists.txt-201-223 (1)
201-223: Duplicate AWS Kinesis discovery blocks.

The `find_path` and `find_library` for AWS Kinesis are duplicated (lines 201-211 and 213-223). Remove the duplicate block.

```diff
 find_library(AWS_KINESIS_LIB aws-cpp-sdk-kinesis)
 if((NOT AWS_KINESIS_LIB))
     message(FATAL_ERROR "Fail to find aws-cpp-sdk-kinesis lib")
 endif()
 message(STATUS "aws-cpp-sdk-kinesis library: ${AWS_KINESIS_LIB}")

-find_path(AWS_KINESIS_INCLUDE_PATH aws/kinesis/KinesisClient.h)
-if((NOT AWS_KINESIS_INCLUDE_PATH))
-    message(FATAL_ERROR "Fail to find aws/kinesis include path")
-endif()
-message(STATUS "aws/kinesis include path: ${AWS_KINESIS_INCLUDE_PATH}")
-
-find_library(AWS_KINESIS_LIB aws-cpp-sdk-kinesis)
-if((NOT AWS_KINESIS_LIB))
-    message(FATAL_ERROR "Fail to find aws-cpp-sdk-kinesis lib")
-endif()
-message(STATUS "aws-cpp-sdk-kinesis library: ${AWS_KINESIS_LIB}")
-
 find_path(AWS_S3_INCLUDE_PATH aws/s3/S3Client.h)
```

store_handler/eloq_data_store_service/data_store.h-106-110 (1)
106-110: Typo in documentation comment.

The parameter name in the comment says "drop_talbe_req" but the actual parameter is `drop_table_req`.

```diff
 /**
  * @brief Drop kv table.
- * @param drop_talbe_req The pointer of the request.
+ * @param drop_table_req The pointer of the request.
  */
 virtual void DropTable(DropTableRequest *drop_table_req) = 0;
```

core/src/data_substrate.cpp-579-584 (1)
579-584: Incorrect error message text.

The error message mentions `enable_io_uring` twice, but should reference `raft_log_async_fsync` as the condition being validated.

```diff
 if (raft_log_async_fsync && !enable_io_uring)
 {
-    LOG(ERROR) << "Invalid config: when set `enable_io_uring`, "
-                  "should also set `enable_io_uring`.";
+    LOG(ERROR) << "Invalid config: when set `raft_log_async_fsync`, "
+                  "should also set `enable_io_uring`.";
     return false;
 }
```

eloq_metrics/src/prometheus_collector.cc-241-245 (1)
241-245: Logging prints pointer address instead of enum value.

`&metric_type` logs the address of the enum variable, not its value.

```diff
 #ifdef WITH_GLOG
-    LOG(ERROR) << "ERR: For now not support metrics_type " << &metric_type;
+    LOG(ERROR) << "ERR: For now not support metrics_type " << static_cast<int>(metric_type);
 #endif
-    ;
+    break;
```

Also, the trailing `;` after `#endif` is unnecessary; replace it with `break;` for consistency.

store_handler/data_store_service_scanner.cpp-181-189 (1)
181-189: Incorrect error message in `ProcessScanCloseResult`.

The error message says "Failed to fetch next batch" but this is the close result callback.

```diff
 if (result.error_code() != remote::DataStoreError::NO_ERROR)
 {
-    LOG(ERROR) << "Failed to fetch next batch: " << result.error_msg();
+    LOG(ERROR) << "Failed to close scan: " << result.error_msg();
     sp_scanner->scanner_->SetError(
```

build_eloq_log_service.cmake-51-51 (1)
51-51: `find_library` does not support a `VERSION` argument.

The `VERSION` parameter is not a valid argument for `find_library()`. This line may silently ignore the version constraint.

```diff
- find_library(GLOG_LIB NAMES glog VERSION ">=0.6.0" REQUIRED)
+ find_library(GLOG_LIB NAMES glog REQUIRED)
```

Consider using `find_package(glog 0.6.0 REQUIRED)` if version checking is needed.

eloq_metrics/include/metrics_manager.h-128-128 (1)
128-128: Unused variable `registry`.

The `prometheus::Registry` is created but never passed to `PrometheusCollector` or used elsewhere.

Either remove the unused variable or pass it to the collector if intended:

```diff
-    auto registry = std::make_shared<prometheus::Registry>();
-    metrics_collector_ =
-        std::make_shared<PrometheusCollector>(bind_addr, bind_port);
+        std::make_shared<PrometheusCollector>(bind_addr, bind_port, registry);
```

Committable suggestion skipped: line range outside the PR's diff.
build_eloq_log_service.cmake-134-157 (1)
134-157: Duplicate AWS Kinesis detection code.

The AWS Kinesis include path and library detection (lines 134-144) is duplicated exactly at lines 147-157. This appears to be a copy-paste error.

Remove the duplicate block:

```diff
 message(STATUS "aws-cpp-sdk-kinesis library: ${AWS_KINESIS_LIB}")
-
-find_path(AWS_KINESIS_INCLUDE_PATH aws/kinesis/KinesisClient.h)
-if((NOT AWS_KINESIS_INCLUDE_PATH))
-    message(FATAL_ERROR "Fail to find aws/kinesis include path")
-endif()
-message(STATUS "aws/kinesis include path: ${AWS_KINESIS_INCLUDE_PATH}")
-
-find_library(AWS_KINESIS_LIB aws-cpp-sdk-kinesis)
-if((NOT AWS_KINESIS_LIB))
-    message(FATAL_ERROR "Fail to find aws-cpp-sdk-kinesis lib")
-endif()
-message(STATUS "aws-cpp-sdk-kinesis library: ${AWS_KINESIS_LIB}")
-
 find_path(AWS_S3_INCLUDE_PATH aws/s3/S3Client.h)
```

eloq_metrics/include/metrics_manager.h-32-34 (1)
32-34: Inconsistent environment variable naming.

`MOMO_METRICS_PUSH_ADDR` uses a different prefix than the other `ELOQ_*` variables. This appears to be a typo or legacy naming.

```diff
-const std::string BIND_ADDR_ENV = "MOMO_METRICS_PUSH_ADDR";
+const std::string BIND_ADDR_ENV = "ELOQ_METRICS_BIND_ADDR";
```

store_handler/dynamo_handler_typed.cpp-113-128 (1)
113-128: Type mismatch assertion could fail at runtime.

Line 124 asserts that `typed_rec->ObjectType() == obj_type`, but if the stored object type doesn't match the provided record's type, this will crash in debug builds. Consider handling this mismatch gracefully with an error return or exception.

```diff
-    assert(typed_rec->ObjectType() == obj_type);
+    if (typed_rec->ObjectType() != obj_type)
+    {
+        LOG(ERROR) << "Object type mismatch: expected "
+                   << static_cast<int>(typed_rec->ObjectType())
+                   << " but found " << static_cast<int>(obj_type);
+        return;  // or throw
+    }
```
store_handler/data_store_service_scanner.h-207-221 (1)
207-221:Reset()should clearscan_cache_before reserving.When reusing a pooled scanner,
207-221: `Reset()` should clear `scan_cache_` before reserving.

When reusing a pooled scanner, `Reset()` only reserves capacity but doesn't clear existing cache entries. If `Clear()` wasn't called before `Reset()`, stale data could remain.

```diff
 void Reset(DataStoreServiceScanner *scanner,
            int32_t partition_id,
            uint32_t data_shard_id,
            const std::string_view start_key)
 {
     scanner_ = scanner;
     partition_id_ = partition_id;
     data_shard_id_ = data_shard_id;
     last_key_ = start_key;
     session_id_ = "";
     last_batch_size_ = scanner_->GetBatchSize();
     first_batch_fetched_ = false;
     head_ = 0;
+    scan_cache_.clear();
     scan_cache_.reserve(scanner_->GetBatchSize());
 }
```

store_handler/data_store_service_scanner.h-117-136 (1)
117-136: Potential data race when reading error message.

While `SetError` correctly uses CAS to ensure only one thread writes `error_msg_`, `GetErrorMsg()` reads `error_msg_` without memory ordering guarantees relative to `error_code_`. A caller might observe the new error code but a stale or partial error message.

Consider adding a memory fence or using `error_code_.load(std::memory_order_acquire)` before reading `error_msg_`:

```diff
 const std::string &GetErrorMsg()
 {
+    // Ensure we see error_msg_ written by SetError
+    (void)error_code_.load(std::memory_order_acquire);
     return error_msg_;
 }
```
GetErrorCode()first (which already uses acquire semantics implicitly via the atomic load).store_handler/bigtable_handler.cpp-2718-2718 (1)
2718-2718: `substr` result is discarded; trailing space is not removed.

The call `index_names.substr(0, index_names.size() - 1)` returns a new string but the result is not assigned back to `index_names`. The trailing space remains in the serialized output.

Apply this diff to fix the issue:

```diff
-    index_names.substr(0, index_names.size() - 1);
+    index_names.pop_back();  // Remove trailing space
```
```cmake
FOREACH(PROTO_FILE ${PROTO_FILES})
  STRING(REGEX REPLACE "[^/]proto" "" PROTO_NAME ${PROTO_FILE})
  LIST(APPEND PROTO_CC_FILES ${PROTO_SRC}/${PROTO_NAME}.pb.cc)
  ADD_CUSTOM_COMMAND(
    OUTPUT "${PROTO_SRC}/${PROTO_NAME}.pb.cc" "${PROTO_SRC}/${PROTO_NAME}.pb.h"
    DEPENDS ${PROTO_SRC}/${PROTO_FILE}
    COMMAND protoc ${PROTO_FILE} --proto_path=./ --cpp_out=./
    WORKING_DIRECTORY ${PROTO_SRC}
  )
ENDFOREACH(PROTO_FILE)
```
Incorrect regex pattern for extracting proto name.
The regex [^/]proto matches any character (except /) followed by proto, which is not the intended behavior. This will likely produce malformed filenames.
To strip the .proto extension correctly:
```diff
 FOREACH(PROTO_FILE ${PROTO_FILES})
-  STRING(REGEX REPLACE "[^/]proto" "" PROTO_NAME ${PROTO_FILE})
+  STRING(REGEX REPLACE "\\.proto$" "" PROTO_NAME ${PROTO_FILE})
   LIST(APPEND PROTO_CC_FILES ${PROTO_SRC}/${PROTO_NAME}.pb.cc)
```
```cmake
  ${HOST_MANAGER_SOURCE_DIR}/src/raft_host_manager.cpp
  ${HOST_MANAGER_SOURCE_DIR}/src/ini.c
  ${HOST_MANAGER_SOURCE_DIR}/src/INIReader.cpp
  ${PROTO_SRC}/${PROTO_NAME}.pb.cc
```
Undefined variable ${PROTO_NAME} in host_manager sources.
PROTO_NAME is defined inside the FOREACH loop (lines 98-107) and is not accessible in this scope. This will result in an empty or undefined path.
Either use ${PROTO_CC_FILES} which contains all compiled proto sources, or explicitly list the required proto source files:
```diff
 SET(RaftHM_SOURCES
     ${HOST_MANAGER_SOURCE_DIR}/src/main.cpp
     ${HOST_MANAGER_SOURCE_DIR}/src/raft_host_manager_service.cpp
     ${HOST_MANAGER_SOURCE_DIR}/src/raft_host_manager.cpp
     ${HOST_MANAGER_SOURCE_DIR}/src/ini.c
     ${HOST_MANAGER_SOURCE_DIR}/src/INIReader.cpp
-    ${PROTO_SRC}/${PROTO_NAME}.pb.cc
+    ${PROTO_CC_FILES}
     ${LOG_PROTO_SRC}/log_agent.cpp
     ${LOG_PROTO_SRC}/${LOG_PROTO_NAME}.pb.cc
 )
```
```cmake
if(NOT ROCKSDB_CLOUD_LIB)
  message(FATAL_ERROR "Fail to find RocksDB Cloud lib path")
endif()
message(STATUS "ROCKSDB_CLOUD_LIB: ${ROCKSDB_CLOUD_LIB}")
set(ROCKSDB_LIBRARIES ${ROCKSDB_LIBRARIES} ${ROCKSDB_CLOUD_LIB})
```
ROCKSDB_CLOUD_LIB check may fail for GCS configuration.
ROCKSDB_CLOUD_LIB is only set for S3 configurations (line 190), but the check at line 202 is common to both S3 and GCS. GCS configurations will hit this FATAL_ERROR.
Add find_library for GCS:
```diff
     find_library(ROCKSDB_CLOUD_LIB NAMES rocksdb-cloud-aws)
   endif()
+
+  if((WITH_DATA_STORE STREQUAL "ROCKSDB_CLOUD_GCS") OR (WITH_DATA_STORE STREQUAL "ELOQDSS_ROCKSDB_CLOUD_GCS"))
+    find_library(ROCKSDB_CLOUD_LIB NAMES rocksdb-cloud-gcs)
+  endif()
   # Common RocksDB Cloud setup
```
```cmake
install(TARGETS logservice
        ARCHIVE DESTINATION lib
        LIBRARY DESTINATION lib)
```
Unconditional install of logservice will fail when WITH_LOG_SERVICE is OFF.
The logservice target is only created when WITH_LOG_SERVICE is ON (lines 251-257), but the install rule is unconditional.
```diff
-install(TARGETS logservice
-        ARCHIVE DESTINATION lib
-        LIBRARY DESTINATION lib)
+if(WITH_LOG_SERVICE)
+  install(TARGETS logservice
+          ARCHIVE DESTINATION lib
+          LIBRARY DESTINATION lib)
+endif()
```
```cpp
// remote request metrics
metrics::enable_remote_request_metrics = config_reader.GetBoolean(
    "metrics", "enable_busy_round_metrics", metrics::enable_tx_metrics);
LOG(INFO) << "enable_remote_request_metrics: "
          << (metrics::enable_remote_request_metrics ? "ON" : "OFF");
```
Wrong config key used for enable_remote_request_metrics.
The code reads "enable_busy_round_metrics" but assigns to enable_remote_request_metrics. This appears to be a copy-paste error that will cause the remote request metrics to be incorrectly configured based on busy round settings.
```diff
 // remote request metrics
 metrics::enable_remote_request_metrics = config_reader.GetBoolean(
-    "metrics", "enable_busy_round_metrics", metrics::enable_tx_metrics);
+    "metrics", "enable_remote_request_metrics", metrics::enable_tx_metrics);
```
```cpp
           uint16_t port,
           uint64_t shard_version)
{
    assert(expired_ts_.load(std::memory_order_acquire) == 0);
```
Assertion in DssNode::Reset will always fail for newly constructed nodes.
expired_ts_ is initialized to 1U (line 811), but Reset() asserts expired_ts_.load() == 0 (line 782). This assertion will always fail for nodes that haven't been explicitly marked as expired.
```diff
-    std::atomic<uint64_t> expired_ts_{1U};
+    std::atomic<uint64_t> expired_ts_{0};
```
void Reset(...)
{
- assert(expired_ts_.load(std::memory_order_acquire) == 0);
+ expired_ts_.store(0, std::memory_order_release);
host_name_ = hostname;Also applies to: 811-811
```cpp
template <typename KeyT>
void DynamoHandlerTyped::BindDynamoReqForKeyPredecessor(
    Aws::DynamoDB::Model::AttributeValue &sk, const KeyT &key)
{
    assert(false);

    std::string_view predecessor_key_str = key.KVSerialize();
    auto data_ptr =
        reinterpret_cast<const unsigned char *>(predecessor_key_str.data());
    predecessor(const_cast<unsigned char *>(data_ptr),
                predecessor_key_str.size());

    sk.SetB(Aws::Utils::ByteBuffer(
        reinterpret_cast<const unsigned char *>(predecessor_key_str.data()),
        predecessor_key_str.size()));
}
```
Critical: Undefined behavior via const_cast on read-only data and dead code after assert(false).
This function has two serious issues:

- Dead code: lines 166-174 execute after `assert(false)`, which will always abort in debug builds. In release builds, where asserts are disabled, the function proceeds to execute potentially dangerous code.
- Undefined behavior: `key.KVSerialize()` returns a `std::string_view` to potentially read-only memory. Using `const_cast` to modify this data (line 169) is undefined behavior.
If this function is intentionally unimplemented, consider:
```diff
 template <typename KeyT>
 void DynamoHandlerTyped::BindDynamoReqForKeyPredecessor(
     Aws::DynamoDB::Model::AttributeValue &sk, const KeyT &key)
 {
-    assert(false);
-
-    std::string_view predecessor_key_str = key.KVSerialize();
-    auto data_ptr =
-        reinterpret_cast<const unsigned char *>(predecessor_key_str.data());
-    predecessor(const_cast<unsigned char *>(data_ptr),
-                predecessor_key_str.size());
-
-    sk.SetB(Aws::Utils::ByteBuffer(
-        reinterpret_cast<const unsigned char *>(predecessor_key_str.data()),
-        predecessor_key_str.size()));
+    // TODO: Implement predecessor key binding
+    (void)sk;
+    (void)key;
+    throw std::logic_error("BindDynamoReqForKeyPredecessor not implemented");
 }
```

If implementation is needed, copy to a mutable buffer first:
```cpp
std::string key_copy(key.KVSerialize());
predecessor(reinterpret_cast<unsigned char*>(key_copy.data()), key_copy.size());
sk.SetB(Aws::Utils::ByteBuffer(
    reinterpret_cast<const unsigned char*>(key_copy.data()), key_copy.size()));
```
In store_handler/dynamo_handler_typed.h around lines 160-175, the function
contains an assert(false) followed by code that uses const_cast on data from a
std::string_view (undefined behavior) and therefore is both dead/unsafe; either
mark the function as unimplemented (e.g., remove the body and throw or return)
or properly implement it by making a mutable copy of the serialized key
(std::string key_copy(key.KVSerialize())), call predecessor using
key_copy.data() and key_copy.size(), and pass a const pointer to
Aws::Utils::ByteBuffer created from key_copy.data() and key_copy.size(),
avoiding any const_cast on read-only memory.
```cmake
include_directories(
    ${PROJECT_SOURCE_DIR}/
if(WITH_DATA_STORE STREQUAL "ELOQDSS_ELOQSTORE")
    ${PROJECT_SOURCE_DIR}/eloqstore/
endif ()
)
```
Invalid CMake syntax: if() inside include_directories().
CMake does not support conditional statements inside include_directories(). This will cause a parse error.
Apply this diff to fix the syntax:
```diff
-include_directories(
-    ${PROJECT_SOURCE_DIR}/
-if(WITH_DATA_STORE STREQUAL "ELOQDSS_ELOQSTORE")
-    ${PROJECT_SOURCE_DIR}/eloqstore/
-endif ()
-)
+include_directories(${PROJECT_SOURCE_DIR}/)
+if(WITH_DATA_STORE STREQUAL "ELOQDSS_ELOQSTORE")
+  include_directories(${PROJECT_SOURCE_DIR}/eloqstore/)
+endif()
```
```cpp
#include "data_store.h"

namespace EloqDS
{

class DataStoreFactory
{
public:
    virtual ~DataStoreFactory() = default;

    virtual std::unique_ptr<DataStore> CreateDataStore(
        bool create_if_missing,
        uint32_t shard_id,
        DataStoreService *data_store_service,
        bool start_db = true) = 0;
```
Missing forward declaration for DataStoreService.
DataStoreService* is used as a parameter in CreateDataStore() but is not forward-declared or included. This will cause a compilation error.
```diff
 #include "data_store.h"

 namespace EloqDS
 {
+class DataStoreService;
+
 class DataStoreFactory
 {
```
```cpp
if (entry->start_strike_ >= 0 && entry->end_strike_ >= 0)
{
    entry->count_strike_++;
    if (entry->start_strike_ < entry->count_strike_ ||
        entry->end_strike_ > entry->count_strike_)
    {
        return;
    }
}
```
Strike-based gating condition is inverted — actions will never trigger.
The condition on lines 39-40 returns early (skips execution) when it should proceed. With the current logic:
- If `start_strike_ = 2`, `end_strike_ = 5`, `count_strike_ = 3`: `2 < 3` is true, so the function returns and the action is skipped.
- This means actions are skipped when they should be triggered.
The intended behavior is to skip when count_strike_ is outside the [start_strike_, end_strike_] range.
Apply this diff to fix the condition:
```diff
 if (entry->start_strike_ >= 0 && entry->end_strike_ >= 0)
 {
     entry->count_strike_++;
-    if (entry->start_strike_ < entry->count_strike_ ||
-        entry->end_strike_ > entry->count_strike_)
+    if (entry->count_strike_ < entry->start_strike_ ||
+        entry->count_strike_ > entry->end_strike_)
     {
         return;
     }
 }
```
Here are some reminders before you submit the pull request
fixes eloqdb/tx_service#issue_id

./mtr --suite=mono_main,mono_multi,mono_basic

Summary by CodeRabbit
New Features
Refactor