enhance: optimize the loading index performance (#29894) #30018

Merged
merged 1 commit into from
Jan 22, 2024
36 changes: 15 additions & 21 deletions internal/core/src/index/VectorMemIndex.cpp
@@ -145,8 +145,6 @@
             .empty()) {  // load with the slice meta info, then we can load batch by batch
         std::string index_file_prefix = slice_meta_filepath.substr(
             0, slice_meta_filepath.find_last_of('/') + 1);
-        std::vector<std::string> batch{};
-        batch.reserve(parallel_degree);

         auto result = file_manager_->LoadIndexToMemory({slice_meta_filepath});
         auto raw_slice_meta = result[INDEX_FILE_SLICE_META];
@@ -161,30 +159,26 @@

         auto new_field_data =
             milvus::storage::CreateFieldData(DataType::INT8, 1, total_len);
-        auto HandleBatch = [&](int index) {
-            auto batch_data = file_manager_->LoadIndexToMemory(batch);
-            for (int j = index - batch.size() + 1; j <= index; j++) {
-                std::string file_name = GenSlicedFileName(prefix, j);
-                AssertInfo(batch_data.find(file_name) != batch_data.end(),
-                           "lost index slice data");
-                auto data = batch_data[file_name];
-                new_field_data->FillFieldData(data->Data(), data->Size());
-            }
-            for (auto& file : batch) {
-                pending_index_files.erase(file);
-            }
-            batch.clear();
-        };

+        std::vector<std::string> batch;
+        batch.reserve(slice_num);
+
         for (auto i = 0; i < slice_num; ++i) {
             std::string file_name = GenSlicedFileName(prefix, i);
             batch.push_back(index_file_prefix + file_name);
-            if (batch.size() >= parallel_degree) {
-                HandleBatch(i);
-            }
         }
-        if (batch.size() > 0) {
-            HandleBatch(slice_num - 1);
+        auto batch_data = file_manager_->LoadIndexToMemory(batch);
+        for (const auto& file_path : batch) {
+            const std::string file_name =
+                file_path.substr(file_path.find_last_of('/') + 1);
+            AssertInfo(batch_data.find(file_name) != batch_data.end(),
+                       "lost index slice data: {}",
+                       file_name);
+            auto data = batch_data[file_name];
+            new_field_data->FillFieldData(data->Data(), data->Size());
+        }
+        for (auto& file : batch) {
+            pending_index_files.erase(file);
         }

         AssertInfo(
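For readers skimming the diff, the change collects every slice path up front, issues a single load call, and then appends the slices in index order. The following is a minimal standalone sketch of that pattern, not the Milvus code itself. `SliceName`, `LoadAll`, and `AssembleSlices` are hypothetical stand-ins (for `GenSlicedFileName`, `file_manager_->LoadIndexToMemory`, and the loop in the diff respectively), and the in-memory `storage` map is an assumption made so the example is self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;
using SliceMap = std::map<std::string, Bytes>;

// Hypothetical naming helper; the real GenSlicedFileName may format names differently.
std::string SliceName(const std::string& prefix, int i) {
    return prefix + "_" + std::to_string(i);
}

// Hypothetical loader: takes all paths in one call and returns the data
// keyed by bare file name (the part after the last '/').
SliceMap LoadAll(const SliceMap& storage, const std::vector<std::string>& paths) {
    SliceMap out;
    for (const auto& path : paths) {
        auto it = storage.find(path);
        if (it != storage.end()) {
            out[path.substr(path.find_last_of('/') + 1)] = it->second;
        }
    }
    return out;
}

// Build the full path list, load everything at once, then concatenate the
// slices in the same order the paths were generated.
Bytes AssembleSlices(const SliceMap& storage,
                     const std::string& dir,
                     const std::string& prefix,
                     int slice_num) {
    std::vector<std::string> batch;
    batch.reserve(slice_num);
    for (int i = 0; i < slice_num; ++i) {
        batch.push_back(dir + SliceName(prefix, i));
    }

    SliceMap loaded = LoadAll(storage, batch);

    Bytes result;
    for (const auto& path : batch) {
        const std::string name = path.substr(path.find_last_of('/') + 1);
        auto it = loaded.find(name);
        if (it == loaded.end()) {
            throw std::runtime_error("lost index slice data: " + name);
        }
        result.insert(result.end(), it->second.begin(), it->second.end());
    }
    return result;
}
```

Iterating over `batch` rather than over the returned map keeps the slices in generation order regardless of how the loader orders its results, and naming the missing slice in the error mirrors the more specific assertion message added in the diff.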