Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

export mulit_get interface #59

Closed
wants to merge 15 commits into from
Closed

export mulit_get interface #59

wants to merge 15 commits into from

Conversation

BusyJon
Copy link

@BusyJon BusyJon commented May 23, 2017

@siddontang
Copy link

Please add test.

Copy link

@andelf andelf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubt about error handing, return types, and Vec<&[u8]> as parameter type.

num_keys: size_t,
keys_list: *const *const u8,
keys_list_sizes: *const size_t,
values_list: *mut *mut u8,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be *mut *const u8 ?

Copy link
Author

@BusyJon BusyJon May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DBVector type use *mut u8.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be *mut *mut u8. Both the values_list and its content needs to be dropped explicitly.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. value_list is created by crocksdb, malloc().
so we need to own it in DBVector and drop() it by free() via ffi.

src/rocksdb.rs Outdated
cf: &CFHandle,
keys_vec: Vec<&[u8]>,
readopts: &ReadOptions)
-> Result<Option<Vec<DBVector>>, String> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be Result<Vec<Option<DBVector>>, _>
or more precisely
Vec<Result<Option<DBVector>, String>>

since we use Option<> to denote Status::NotFound

src/rocksdb.rs Outdated
@@ -513,6 +513,81 @@ impl DB {
self.get_cf_opt(cf, key, &ReadOptions::new())
}

pub fn multi_get_cf_opt(&self,
cf: &CFHandle,
keys_vec: Vec<&[u8]>,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be &[&[u8]] ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, i'm not sure, maybe &[Vec].

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&[Vec] is kind of strange. we've already use &[u8] to denote Slice.
I think &[&[u8]] is ok.
@siddontang

src/rocksdb.rs Outdated
let keys_list_sizes = keys_list_sizes.as_ptr();
let mut value_list = Vec::<*mut u8>::with_capacity(num_keys);
let mut value_list_sizes = Vec::<size_t>::with_capacity(num_keys);
ffi_try!(crocksdb_multi_get_cf(self.inner,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong.
since errs is c array of error string, not a pointer to a c string

src/rocksdb.rs Outdated
pub fn multi_get_opt(&self,
keys_vec: Vec<&[u8]>,
readopts: &ReadOptions)
-> Result<Option<Vec<DBVector>>, String> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you export DB::DefaultColumnFamily()
you can impl all the other fn by multi_get_cf_opt

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use cf_handle("default") to get the default CF handle.

keys_list: *const *const u8,
keys_list_sizes: *const size_t,
values_list: *mut *mut u8,
values_list_sizes: *const size_t,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be *mut.

num_keys: size_t,
keys_list: *const *const u8,
keys_list_sizes: *const size_t,
values_list: *mut *mut u8,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be *mut *mut u8. Both the values_list and its content needs to be dropped explicitly.

src/rocksdb.rs Outdated
}
let keys_list = keys_list.as_ptr();
let keys_list_sizes = keys_list_sizes.as_ptr();
let mut value_list = Vec::<*mut u8>::with_capacity(num_keys);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type annotation is redundant.

@BusyJon
Copy link
Author

BusyJon commented May 24, 2017

all comments considered, PTAL again. @zhangjinpeng1987 @siddontang @andelf @BusyJay

src/rocksdb.rs Outdated
@@ -513,6 +513,71 @@ impl DB {
self.get_cf_opt(cf, key, &ReadOptions::new())
}

pub fn multi_get_cf_opt(&self,
cf: &CFHandle,
keys_vec: &[Vec<u8>],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/keys/keys_vec/g

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&[&[u8]] may be better?

/cc @BusyJay

lishihai9017 added 2 commits May 24, 2017 14:32
@@ -513,6 +513,71 @@ impl DB {
self.get_cf_opt(cf, key, &ReadOptions::new())
}

pub fn multi_get_cf_opt(&self,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for easy use, we can declare the return type as Result<Vec<Option<DBVector>>, String>, If we meet any error in multi_get, we will return error directly.

/cc @BusyJay @andelf @zhangjinpeng1987

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case get will return error?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to �name a few, db corruption, unsupported compression, io error, db busy?

src/rocksdb.rs Outdated
@@ -513,6 +513,71 @@ impl DB {
self.get_cf_opt(cf, key, &ReadOptions::new())
}

pub fn multi_get_cf_opt(&self,
cf: &CFHandle,
keys: &[Vec<u8>],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be &[&[u8]], so we don't need to move the key when query.

src/rocksdb.rs Outdated
let mut value_vec = Vec::with_capacity(num_keys);
for i in 0..num_keys {
if !err_list[i].is_null() {
return Err(crocksdb_ffi::error_message(err_list[i]));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory leaks?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strdup without free

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still memory leaks. But the content in value_list and err_list should be dropped.

Copy link
Member

@BusyJay BusyJay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to changed the return type from Vec to Result, the implementation in crocksdb.c can be changed directly.

src/rocksdb.rs Outdated
let mut value_vec = Vec::with_capacity(num_keys);
for i in 0..num_keys {
if !err_list[i].is_null() {
return Err(crocksdb_ffi::error_message(err_list[i]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still memory leaks. But the content in value_list and err_list should be dropped.

lishihai9017 added 2 commits May 25, 2017 10:24
@@ -854,6 +854,7 @@ void crocksdb_multi_get_cf(
values_list_sizes[i] = 0;
if (!statuses[i].IsNotFound()) {
errs[i] = strdup(statuses[i].ToString().c_str());
break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update multi_get too.

@siddontang
Copy link

Any test?

src/rocksdb.rs Outdated
@@ -513,6 +513,71 @@ impl DB {
self.get_cf_opt(cf, key, &ReadOptions::new())
}

pub fn multi_get_cf_opt(&self,
cf: &CFHandle,
keys: &[&[u8]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like using generic type here is more suitable. like f<F: AsRef<[u8]>>(keys: &[F])

@@ -854,6 +854,7 @@ void crocksdb_multi_get_cf(
values_list_sizes[i] = 0;
if (!statuses[i].IsNotFound()) {
errs[i] = strdup(statuses[i].ToString().c_str());
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errs should not be accessed as an array any more. Because only the first error is cared now. And copied values should be dropped.

src/rocksdb.rs Outdated
value_sizes.as_mut_ptr(),
err);
if !err.is_null() {
return Err(crocksdb_ffi::error_message(err));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may still meet memory leak if value list has values.

errs[i] = strdup(statuses[i].ToString().c_str());
} else {
errs[i] = nullptr;
err = strdup(statuses[i].ToString().c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. err is a pointer which will be passed by value. Use char** err and SaveError instead. And should drop all the content in values_list too.

@siddontang
Copy link

any benchmark to show multi get is better than seek + next if we want to get data in a range.

@BusyJon
Copy link
Author

BusyJon commented May 25, 2017

benched on my own laptop

fetch one kv

test multi_get::bench_get_serially_in_memtable ... bench:         679 ns/iter (+/- 311)
test multi_get::bench_iterator_get_in_memtable ... bench:         961 ns/iter (+/- 372)
test multi_get::bench_multi_get_in_memtable    ... bench:       1,559 ns/iter (+/- 976)

fetch two kv

test multi_get::bench_get_serially_in_memtable ... bench:       1,451 ns/iter (+/- 644)
test multi_get::bench_iterator_get_in_memtable ... bench:       1,080 ns/iter (+/- 397)
test multi_get::bench_multi_get_in_memtable    ... bench:       2,321 ns/iter (+/- 1,370)

it seems using get to fetch one key is quick than seek, but when fetching more than one key, the cheap next make the whole process getting more faster.

@siddontang
Copy link

How about get from SST file?
For the result, I guess we can't use it in raft CF. @zhangjinpeng1987

@zhangjinpeng87
Copy link
Member

If there is only one memtable, seek is as fast as get. Please add bench for the case there are some sst files, since in the most cases there are some sst files.

@BusyJon
Copy link
Author

BusyJon commented May 26, 2017

benchmark result

with about 10 sst file and some keys in memtable, fetch 8 kvs which is in memtable
use get is faster than seek+next when fetching no more than 8 kv.

test multi_get::bench_get_serially ... bench:       2,270 ns/iter (+/- 39)
test multi_get::bench_iterator     ... bench:       3,425 ns/iter (+/- 240)
test multi_get::bench_multi_get    ... bench:       2,525 ns/iter (+/- 40)

by the way, the differ between different count fetching in the same condition indicates seek is expensive.

fetch 1 key     2,815 ns/iter (+/- 77)
fetch 2 keys    2,923 ns/iter (+/- 443) 
fetch 4 keys    3,161 ns/iter (+/- 78)
fetch 8 keys    3,349 ns/iter (+/- 35)

@andelf andelf mentioned this pull request May 31, 2017
@andelf
Copy link

andelf commented Jun 1, 2017

according to #62, this pr should be closed. 😿 @lishihai9017

@siddontang siddontang closed this Jun 1, 2017
@siddontang siddontang mentioned this pull request Nov 28, 2017
@zhangjinpeng87 zhangjinpeng87 deleted the lishihai/add-mulit-get branch December 4, 2017 01:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants