add benchmark which compares radix to std::map#1038
add benchmark which compares radix to std::map#1038lukaszstolarczuk merged 1 commit intopmem:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1038 +/- ##
==========================================
+ Coverage 94.12% 94.15% +0.03%
==========================================
Files 48 48
Lines 4712 4685 -27
==========================================
- Hits 4435 4411 -24
+ Misses 277 274 -3
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
igchor
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r1.
Reviewable status: 1 of 2 files reviewed, 4 unresolved discussions (waiting on @JanDorniak99)
benchmarks/radix/radix_tree.cpp, line 96 at r1 (raw file):
template <size_t batch_size> void insert_elements_kv(std::function<void(size_t)> insert_f)
A slighly faster approach would be to just pass a generic function type (F&& f) - the same way measure works. This allows for better optimizations.
benchmarks/radix/radix_tree.cpp, line 109 at r1 (raw file):
measure<std::chrono::nanoseconds>([&] { insert_f(j); }); insert_std_map_time += measure<std::chrono::nanoseconds>([&] {
Hm, instead of inserting to those maps in a single loop maybe we could just removeing this code for mymap from here and just run insert_elements_kv for the second time with different insert_f function? (the one which would insert to mymap?)
benchmarks/radix/radix_tree.cpp, line 139 at r1 (raw file):
r->kv->find(key); }); std::cout << "Average access time (persistent radix tree): "
You could probably say it's in nanoseconds.
benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):
auto radix_time = measure<std::chrono::nanoseconds>([&] { for (auto it = r->kv->begin(); it != r->kv->end();) { auto tmp_it = it++;
There's no need for tmp_it. You can just do it = r->kv->erase(it); without any increment operation.
8c807fb to
dabf23b
Compare
JanDorniak99
left a comment
There was a problem hiding this comment.
Reviewable status: 1 of 2 files reviewed, 4 unresolved discussions (waiting on @igchor)
benchmarks/radix/radix_tree.cpp, line 96 at r1 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
A slighly faster approach would be to just pass a generic function type (F&& f) - the same way measure works. This allows for better optimizations.
Done.
benchmarks/radix/radix_tree.cpp, line 109 at r1 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
Hm, instead of inserting to those maps in a single loop maybe we could just removeing this code for mymap from here and just run insert_elements_kv for the second time with different insert_f function? (the one which would insert to mymap?)
Done.
benchmarks/radix/radix_tree.cpp, line 139 at r1 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
You could probably say it's in nanoseconds.
Done.
benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
There's no need for tmp_it. You can just do it = r->kv->erase(it); without any increment operation.
Done.
f73efc0 to
a0e9868
Compare
igchor
left a comment
There was a problem hiding this comment.
Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @JanDorniak99)
benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):
Previously, JanDorniak99 (Jan Dorniak) wrote…
Done.
Actually this code might not be correct, the erased iterator is possibly invalidated so you should not increment it. I think it's safer to do as I suggest.
a0e9868 to
42be426
Compare
KFilipek
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @igchor and @JanDorniak99)
benchmarks/radix/radix_tree.cpp, line 22 at r2 (raw file):
struct data { unsigned long index; unsigned long data_1, data_2;
fields data_1 and data_2 are not used
benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):
* be only even keys) */ void gen_ne_keys(void)
generate_nonexisting_keys() and change all naming to be more human readabl, "ne" means nothing.
benchmarks/radix/radix_tree.cpp, line 61 at r2 (raw file):
for (size_t i = 0; i < (count / sample_size); ++i) { key = static_cast<unsigned long>(rand());
every unsigned long/unsigned should be changed to size_t
benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):
} while ((key1 & 0x1) == 0);
} while (key1 % 2 == 1);
Also you can change your line:
key1 = static_cast(rand());
into something like following and remove do-while loop:
key1 = static_cast(rand() | 1 << 0);
or
key1 = static_cast(rand());
key1 |= 1 << 0;
benchmarks/radix/radix_tree.cpp, line 66 at r2 (raw file):
key1 = static_cast<unsigned>(rand()); } while ((key1 & 0x1) == 0); key = key | key1;
key |= key1;
benchmarks/radix/radix_tree.cpp, line 83 at r2 (raw file):
do { key1 = static_cast<unsigned>(rand()); } while ((key1 & 0x1));
same as above:
key1 |= 1 << 0;
or
rand() | 1 << 0
benchmarks/radix/radix_tree.cpp, line 84 at r2 (raw file):
key1 = static_cast<unsigned>(rand()); } while ((key1 & 0x1)); key = key | key1;
key |= key1;
benchmarks/radix/radix_tree.cpp, line 103 at r2 (raw file):
std::cout << "Inserting " << count << " elements..." << std::endl; for (size_t j = 0; j < count; j += batch_size)
few times above was 'i' instead of 'j' in loops.
benchmarks/radix/radix_tree.cpp, line 198 at r2 (raw file):
Quoted 4 lines of code…
std::cout << "Average remove time (map): " << static_cast<unsigned long>(std_map_time) / keys_to_insert.size() << "ns" << std::endl;
Code multiplication, change it to:
print_time_per_element("Average remove time (map)", std_map_time, keys_to_insert.size());
Function can look like this:
void print_time_per_element(string msg, unsigned long total_time, size_t num_of_elements) {
std::cout << msg + ": "
<< static_cast<unsigned long>(std_map_time) / keys_to_insert.size()
<< "ns" << std::endl;
}
benchmarks/radix/radix_tree.cpp, line 235 at r2 (raw file):
Quoted 4 lines of code…
std::cerr << e.what() << std::endl; std::cerr << "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool" << std::endl;
std::cerr << e.what() << std::endl
<< "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool"
<< std::endl;
benchmarks/radix/radix_tree.cpp, line 252 at r2 (raw file):
for (size_t i = start; i < start + batch_size; ++i) { value_type value;
Use this if you want to initialize fields.
value_type value = { i };
benchmarks/radix/radix_tree.cpp, line 276 at r2 (raw file):
lookup_elements_kv(pop);
remove empty line
benchmarks/radix/radix_tree.cpp, line 278 at r2 (raw file):
lookup_ne_elements_kv(pop);
remove empty line
benchmarks/radix/radix_tree.cpp, line 284 at r2 (raw file):
} catch (const std::logic_error &e) { std::cerr << e.what() << std::endl; }
Is the complete list of exceptions?
catch (...) {
std::cerr << "Unexpected exception" << std::endl;
}
igchor
left a comment
There was a problem hiding this comment.
Let's wait with merging this for now
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @igchor and @JanDorniak99)
karczex
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor and @JanDorniak99)
benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):
}; static std::vector<unsigned long> keys_to_insert;
It would be better to not hold those vectors in global scope, as it's hard to do major changes on such code later.
benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):
* be only even keys) */ void gen_ne_keys(void)
This function may retrun vecor<size_t>, and get count and size as parameters.
benchmarks/radix/radix_tree.cpp, line 72 at r2 (raw file):
void gen_keys()
this function may just return vector keys_to_insert, and than you may iterate over returned vector with sample_rate to construct keys_to_lookup.
Or just iterate over keys_to_insert with sample_rate in lookup_elements functions.
karczex
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor, @JanDorniak99, and @KFilipek)
benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
} while ((key1 & 0x1) == 0);} while (key1 % 2 == 1);
Also you can change your line:
key1 = static_cast(rand());
into something like following and remove do-while loop:
key1 = static_cast(rand() | 1 << 0);
or
key1 = static_cast(rand());
key1 |= 1 << 0;
key1 |= 0x1
would be cleaner
KFilipek
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor, @JanDorniak99, and @karczex)
benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):
Previously, karczex (Paweł Karczewski) wrote…
It would be better to not hold those vectors in global scope, as it's hard to do major changes on such code later.
Maybe class describing this workload, then we can also add extra features like getNextKey and received key with specified distribution.
benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):
Previously, karczex (Paweł Karczewski) wrote…
key1 |= 0x1would be cleaner
+1
42be426 to
1b89bb5
Compare
JanDorniak99
left a comment
There was a problem hiding this comment.
Reviewable status: 1 of 2 files reviewed, 18 unresolved discussions (waiting on @igchor, @karczex, and @KFilipek)
benchmarks/radix/radix_tree.cpp, line 22 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
fields data_1 and data_2 are not used
Done.
benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
Maybe class describing this workload, then we can also add extra features like getNextKey and received key with specified distribution.
Done.
benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
generate_nonexisting_keys() and change all naming to be more human readabl, "ne" means nothing.
Done.
benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):
Previously, karczex (Paweł Karczewski) wrote…
This function may retrun vecor<size_t>, and get count and size as parameters.
Done.
benchmarks/radix/radix_tree.cpp, line 61 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
every unsigned long/unsigned should be changed to size_t
Done.
benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
+1
Done.
benchmarks/radix/radix_tree.cpp, line 66 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
key |= key1;
Done.
benchmarks/radix/radix_tree.cpp, line 72 at r2 (raw file):
Previously, karczex (Paweł Karczewski) wrote…
this function may just return vector keys_to_insert, and than you may iterate over returned vector with sample_rate to construct keys_to_lookup.
Or just iterate over keys_to_insert with sample_rate in lookup_elements functions.
Done.
benchmarks/radix/radix_tree.cpp, line 83 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
same as above:
key1 |= 1 << 0;
or
rand() | 1 << 0
Done.
benchmarks/radix/radix_tree.cpp, line 84 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
key |= key1;
Done.
benchmarks/radix/radix_tree.cpp, line 103 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
few times above was 'i' instead of 'j' in loops.
Done.
benchmarks/radix/radix_tree.cpp, line 198 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
std::cout << "Average remove time (map): " << static_cast<unsigned long>(std_map_time) / keys_to_insert.size() << "ns" << std::endl;Code multiplication, change it to:
print_time_per_element("Average remove time (map)", std_map_time, keys_to_insert.size());Function can look like this:
void print_time_per_element(string msg, unsigned long total_time, size_t num_of_elements) { std::cout << msg + ": " << static_cast<unsigned long>(std_map_time) / keys_to_insert.size() << "ns" << std::endl; }
Done.
benchmarks/radix/radix_tree.cpp, line 235 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
std::cerr << e.what() << std::endl; std::cerr << "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool" << std::endl;std::cerr << e.what() << std::endl << "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool" << std::endl;
Done.
benchmarks/radix/radix_tree.cpp, line 252 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
Use this if you want to initialize fields.
value_type value = { i };
Done.
benchmarks/radix/radix_tree.cpp, line 276 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
remove empty line
Done.
benchmarks/radix/radix_tree.cpp, line 278 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
remove empty line
Done.
benchmarks/radix/radix_tree.cpp, line 284 at r2 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
Is the complete list of exceptions?
catch (...) { std::cerr << "Unexpected exception" << std::endl; }
Done.
KFilipek
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @igchor, @JanDorniak99, @karczex, and @KFilipek)
benchmarks/radix/radix_tree.cpp, line 28 at r3 (raw file):
Quoted 5 lines of code…
struct benchmark_params { size_t count = 10000; size_t sample_size = 100; size_t batch_size = 1000; };
This struct is used in this file only for containing and grouping this fields/variables.
No offence, it looks good, but IMO better will be (you can now delete further instantiate, params variable is created here:
struct benchmark_params {
size_t count = 10000;
size_t sample_size = 100;
size_t batch_size = 1000;
} params;Also: it's now global visible so you don't have to pass it to every function. Anonymous struct like below is the way for one-file applications with global variables.
struct {
size_t count = 10000;
size_t sample_size = 100;
size_t batch_size = 1000;
} benchmark_params;benchmarks/radix/radix_tree.cpp, line 50 at r3 (raw file):
Quoted 8 lines of code…
void print_time_per_element(std::string msg, std::chrono::nanoseconds::rep total_time, size_t n_elements) { std::cout << msg << static_cast<size_t>(total_time) / n_elements << "ns" << std::endl; }
Great!
benchmarks/radix/radix_tree.cpp, line 100 at r3 (raw file):
template <typename F> void insert_elements_kv(F &&insert_f, benchmark_params ¶ms,
here you will have only 2 args with approach I've proposed on the top
1b89bb5 to
1f0dbdd
Compare
JanDorniak99
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @igchor, @karczex, and @KFilipek)
benchmarks/radix/radix_tree.cpp, line 28 at r3 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
struct benchmark_params { size_t count = 10000; size_t sample_size = 100; size_t batch_size = 1000; };This struct is used in this file only for containing and grouping this fields/variables.
No offence, it looks good, but IMO better will be (you can now delete further instantiate, params variable is created here:struct benchmark_params { size_t count = 10000; size_t sample_size = 100; size_t batch_size = 1000; } params;Also: it's now global visible so you don't have to pass it to every function. Anonymous struct like below is the way for one-file applications with global variables.
struct { size_t count = 10000; size_t sample_size = 100; size_t batch_size = 1000; } benchmark_params;
Done.
benchmarks/radix/radix_tree.cpp, line 100 at r3 (raw file):
Previously, KFilipek (Krzysztof Filipek) wrote…
here you will have only 2 args with approach I've proposed on the top
Done.
KFilipek
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @igchor and @karczex)
igchor
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @karczex)
lukaszstolarczuk
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r1, 1 of 1 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @karczex)
This change is