Skip to content
This repository was archived by the owner on Mar 22, 2023. It is now read-only.

add benchmark which compares radix to std::map#1038

Merged
lukaszstolarczuk merged 1 commit intopmem:masterfrom
JanDorniak99:add_radix_benchmark
Apr 28, 2021
Merged

add benchmark which compares radix to std::map#1038
lukaszstolarczuk merged 1 commit intopmem:masterfrom
JanDorniak99:add_radix_benchmark

Conversation

@JanDorniak99
Copy link
Copy Markdown
Contributor

@JanDorniak99 JanDorniak99 commented Feb 15, 2021

This change is Reviewable

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 15, 2021

Codecov Report

Merging #1038 (1f0dbdd) into master (59bdcc3) will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1038      +/-   ##
==========================================
+ Coverage   94.12%   94.15%   +0.03%     
==========================================
  Files          48       48              
  Lines        4712     4685      -27     
==========================================
- Hits         4435     4411      -24     
+ Misses        277      274       -3     
Flag Coverage Δ
tests_clang_debug_cpp17 93.66% <ø> (+0.02%) ⬆️
tests_gcc_debug 92.04% <ø> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
include/libpmemobj++/pool.hpp 85.89% <0.00%> (-1.99%) ⬇️
...libpmemobj++/detail/enumerable_thread_specific.hpp 98.91% <0.00%> (-1.09%) ⬇️
...ude/libpmemobj++/container/concurrent_hash_map.hpp 93.88% <0.00%> (-0.48%) ⬇️
include/libpmemobj++/transaction.hpp 82.56% <0.00%> (+0.91%) ⬆️
include/libpmemobj++/pexceptions.hpp 44.61% <0.00%> (+1.75%) ⬆️
include/libpmemobj++/detail/life.hpp 100.00% <0.00%> (+2.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59bdcc3...1f0dbdd. Read the comment docs.

Copy link
Copy Markdown
Contributor

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1.
Reviewable status: 1 of 2 files reviewed, 4 unresolved discussions (waiting on @JanDorniak99)


benchmarks/radix/radix_tree.cpp, line 96 at r1 (raw file):

template <size_t batch_size>
void
insert_elements_kv(std::function<void(size_t)> insert_f)

A slighly faster approach would be to just pass a generic function type (F&& f) - the same way measure works. This allows for better optimizations.


benchmarks/radix/radix_tree.cpp, line 109 at r1 (raw file):

			measure<std::chrono::nanoseconds>([&] { insert_f(j); });

		insert_std_map_time += measure<std::chrono::nanoseconds>([&] {

Hm, instead of inserting to those maps in a single loop maybe we could just removeing this code for mymap from here and just run insert_elements_kv for the second time with different insert_f function? (the one which would insert to mymap?)


benchmarks/radix/radix_tree.cpp, line 139 at r1 (raw file):

			r->kv->find(key);
	});
	std::cout << "Average access time (persistent radix tree): "

You could probably say it's in nanoseconds.


benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):

	auto radix_time = measure<std::chrono::nanoseconds>([&] {
		for (auto it = r->kv->begin(); it != r->kv->end();) {
			auto tmp_it = it++;

There's no need for tmp_it. You can just do it = r->kv->erase(it); without any increment operation.

Copy link
Copy Markdown
Contributor Author

@JanDorniak99 JanDorniak99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 4 unresolved discussions (waiting on @igchor)


benchmarks/radix/radix_tree.cpp, line 96 at r1 (raw file):

Previously, igchor (Igor Chorążewicz) wrote…

A slighly faster approach would be to just pass a generic function type (F&& f) - the same way measure works. This allows for better optimizations.

Done.


benchmarks/radix/radix_tree.cpp, line 109 at r1 (raw file):

Previously, igchor (Igor Chorążewicz) wrote…

Hm, instead of inserting to those maps in a single loop maybe we could just removeing this code for mymap from here and just run insert_elements_kv for the second time with different insert_f function? (the one which would insert to mymap?)

Done.


benchmarks/radix/radix_tree.cpp, line 139 at r1 (raw file):

Previously, igchor (Igor Chorążewicz) wrote…

You could probably say it's in nanoseconds.

Done.


benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):

Previously, igchor (Igor Chorążewicz) wrote…

There's no need for tmp_it. You can just do it = r->kv->erase(it); without any increment operation.

Done.

@JanDorniak99 JanDorniak99 force-pushed the add_radix_benchmark branch 2 times, most recently from f73efc0 to a0e9868 Compare February 17, 2021 10:12
Copy link
Copy Markdown
Contributor

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 1 unresolved discussion (waiting on @JanDorniak99)


benchmarks/radix/radix_tree.cpp, line 191 at r1 (raw file):

Previously, JanDorniak99 (Jan Dorniak) wrote…

Done.

Actually this code might not be correct, the erased iterator is possibly invalidated so you should not increment it. I think it's safer to do as I suggest.

Copy link
Copy Markdown
Contributor

@KFilipek KFilipek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @igchor and @JanDorniak99)


benchmarks/radix/radix_tree.cpp, line 22 at r2 (raw file):

struct data {
	unsigned long index;
	unsigned long data_1, data_2;

fields data_1 and data_2 are not used


benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):

 * be only even keys) */
void
gen_ne_keys(void)

generate_nonexisting_keys() and change all naming to be more human readabl, "ne" means nothing.


benchmarks/radix/radix_tree.cpp, line 61 at r2 (raw file):

	for (size_t i = 0; i < (count / sample_size); ++i) {
		key = static_cast<unsigned long>(rand());

every unsigned long/unsigned should be changed to size_t


benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):

} while ((key1 & 0x1) == 0);

} while (key1 % 2 == 1);

Also you can change your line:
key1 = static_cast(rand());
into something like following and remove do-while loop:
key1 = static_cast(rand() | 1 << 0);
or
key1 = static_cast(rand());
key1 |= 1 << 0;


benchmarks/radix/radix_tree.cpp, line 66 at r2 (raw file):

			key1 = static_cast<unsigned>(rand());
		} while ((key1 & 0x1) == 0);
		key = key | key1;

key |= key1;


benchmarks/radix/radix_tree.cpp, line 83 at r2 (raw file):

		do {
			key1 = static_cast<unsigned>(rand());
		} while ((key1 & 0x1));

same as above:
key1 |= 1 << 0;
or
rand() | 1 << 0


benchmarks/radix/radix_tree.cpp, line 84 at r2 (raw file):

			key1 = static_cast<unsigned>(rand());
		} while ((key1 & 0x1));
		key = key | key1;

key |= key1;


benchmarks/radix/radix_tree.cpp, line 103 at r2 (raw file):

	std::cout << "Inserting " << count << " elements..." << std::endl;

	for (size_t j = 0; j < count; j += batch_size)

few times above was 'i' instead of 'j' in loops.


benchmarks/radix/radix_tree.cpp, line 198 at r2 (raw file):

Quoted 4 lines of code…
	std::cout << "Average remove time (map): "
		  << static_cast<unsigned long>(std_map_time) /
			keys_to_insert.size()
		  << "ns" << std::endl;

Code multiplication, change it to:

print_time_per_element("Average remove time (map)", std_map_time, keys_to_insert.size());

Function can look like this:

void print_time_per_element(string msg, unsigned long total_time, size_t num_of_elements) {
    std::cout << msg + ": " 
        << static_cast<unsigned long>(std_map_time) / keys_to_insert.size()
        << "ns" << std::endl;
}

benchmarks/radix/radix_tree.cpp, line 235 at r2 (raw file):

Quoted 4 lines of code…
		std::cerr << e.what() << std::endl;
		std::cerr
			<< "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool"
			<< std::endl;
std::cerr << e.what() << std::endl
    << "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool"
    << std::endl;

benchmarks/radix/radix_tree.cpp, line 252 at r2 (raw file):

					for (size_t i = start;
					     i < start + batch_size; ++i) {
						value_type value;

Use this if you want to initialize fields.

value_type value = { i };

benchmarks/radix/radix_tree.cpp, line 276 at r2 (raw file):

		lookup_elements_kv(pop);

remove empty line


benchmarks/radix/radix_tree.cpp, line 278 at r2 (raw file):

		lookup_ne_elements_kv(pop);

remove empty line


benchmarks/radix/radix_tree.cpp, line 284 at r2 (raw file):

	} catch (const std::logic_error &e) {
		std::cerr << e.what() << std::endl;
	}

Is the complete list of exceptions?

catch (...) {
   std::cerr << "Unexpected exception" << std::endl;
}

Copy link
Copy Markdown
Contributor

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wait with merging this for now

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @igchor and @JanDorniak99)

Copy link
Copy Markdown

@karczex karczex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor and @JanDorniak99)


benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):

};

static std::vector<unsigned long> keys_to_insert;

It would be better to not hold those vectors in global scope, as it's hard to do major changes on such code later.


benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):

 * be only even keys) */
void
gen_ne_keys(void)

This function may retrun vecor<size_t>, and get count and size as parameters.


benchmarks/radix/radix_tree.cpp, line 72 at r2 (raw file):

void
gen_keys()

this function may just return vector keys_to_insert, and than you may iterate over returned vector with sample_rate to construct keys_to_lookup.
Or just iterate over keys_to_insert with sample_rate in lookup_elements functions.

Copy link
Copy Markdown

@karczex karczex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor, @JanDorniak99, and @KFilipek)


benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…
} while ((key1 & 0x1) == 0);

} while (key1 % 2 == 1);

Also you can change your line:
key1 = static_cast(rand());
into something like following and remove do-while loop:
key1 = static_cast(rand() | 1 << 0);
or
key1 = static_cast(rand());
key1 |= 1 << 0;

key1 |= 0x1 

would be cleaner

Copy link
Copy Markdown
Contributor

@KFilipek KFilipek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @igchor, @JanDorniak99, and @karczex)


benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):

Previously, karczex (Paweł Karczewski) wrote…

It would be better to not hold those vectors in global scope, as it's hard to do major changes on such code later.

Maybe class describing this workload, then we can also add extra features like getNextKey and received key with specified distribution.


benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):

Previously, karczex (Paweł Karczewski) wrote…
key1 |= 0x1 

would be cleaner

+1

Copy link
Copy Markdown
Contributor Author

@JanDorniak99 JanDorniak99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 2 files reviewed, 18 unresolved discussions (waiting on @igchor, @karczex, and @KFilipek)


benchmarks/radix/radix_tree.cpp, line 22 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

fields data_1 and data_2 are not used

Done.


benchmarks/radix/radix_tree.cpp, line 33 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

Maybe class describing this workload, then we can also add extra features like getNextKey and received key with specified distribution.

Done.


benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

generate_nonexisting_keys() and change all naming to be more human readabl, "ne" means nothing.

Done.


benchmarks/radix/radix_tree.cpp, line 55 at r2 (raw file):

Previously, karczex (Paweł Karczewski) wrote…

This function may retrun vecor<size_t>, and get count and size as parameters.

Done.


benchmarks/radix/radix_tree.cpp, line 61 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

every unsigned long/unsigned should be changed to size_t

Done.


benchmarks/radix/radix_tree.cpp, line 65 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

+1

Done.


benchmarks/radix/radix_tree.cpp, line 66 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

key |= key1;

Done.


benchmarks/radix/radix_tree.cpp, line 72 at r2 (raw file):

Previously, karczex (Paweł Karczewski) wrote…

this function may just return vector keys_to_insert, and than you may iterate over returned vector with sample_rate to construct keys_to_lookup.
Or just iterate over keys_to_insert with sample_rate in lookup_elements functions.

Done.


benchmarks/radix/radix_tree.cpp, line 83 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

same as above:
key1 |= 1 << 0;
or
rand() | 1 << 0

Done.


benchmarks/radix/radix_tree.cpp, line 84 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

key |= key1;

Done.


benchmarks/radix/radix_tree.cpp, line 103 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

few times above was 'i' instead of 'j' in loops.

Done.


benchmarks/radix/radix_tree.cpp, line 198 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…
	std::cout << "Average remove time (map): "
		  << static_cast<unsigned long>(std_map_time) /
			keys_to_insert.size()
		  << "ns" << std::endl;

Code multiplication, change it to:

print_time_per_element("Average remove time (map)", std_map_time, keys_to_insert.size());

Function can look like this:

void print_time_per_element(string msg, unsigned long total_time, size_t num_of_elements) {
    std::cout << msg + ": " 
        << static_cast<unsigned long>(std_map_time) / keys_to_insert.size()
        << "ns" << std::endl;
}

Done.


benchmarks/radix/radix_tree.cpp, line 235 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…
		std::cerr << e.what() << std::endl;
		std::cerr
			<< "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool"
			<< std::endl;
std::cerr << e.what() << std::endl
    << "To create pool run: pmempool create obj --layout=radix -s 1G path_to_pool"
    << std::endl;

Done.


benchmarks/radix/radix_tree.cpp, line 252 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

Use this if you want to initialize fields.

value_type value = { i };

Done.


benchmarks/radix/radix_tree.cpp, line 276 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

remove empty line

Done.


benchmarks/radix/radix_tree.cpp, line 278 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

remove empty line

Done.


benchmarks/radix/radix_tree.cpp, line 284 at r2 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

Is the complete list of exceptions?

catch (...) {
   std::cerr << "Unexpected exception" << std::endl;
}

Done.

Copy link
Copy Markdown
Contributor

@KFilipek KFilipek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @igchor, @JanDorniak99, @karczex, and @KFilipek)


benchmarks/radix/radix_tree.cpp, line 28 at r3 (raw file):

Quoted 5 lines of code…
struct benchmark_params {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
};

This struct is used in this file only for containing and grouping this fields/variables.
No offence, it looks good, but IMO better will be (you can now delete further instantiate, params variable is created here:

struct benchmark_params {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
} params;

Also: it's now global visible so you don't have to pass it to every function. Anonymous struct like below is the way for one-file applications with global variables.

struct {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
} benchmark_params;

benchmarks/radix/radix_tree.cpp, line 50 at r3 (raw file):

Quoted 8 lines of code…
void
print_time_per_element(std::string msg,
		       std::chrono::nanoseconds::rep total_time,
		       size_t n_elements)
{
	std::cout << msg << static_cast<size_t>(total_time) / n_elements << "ns"
		  << std::endl;
}

Great!


benchmarks/radix/radix_tree.cpp, line 100 at r3 (raw file):

template <typename F>
void
insert_elements_kv(F &&insert_f, benchmark_params &params,

here you will have only 2 args with approach I've proposed on the top

@JanDorniak99 JanDorniak99 force-pushed the add_radix_benchmark branch from 1b89bb5 to 1f0dbdd Compare March 1, 2021 08:37
Copy link
Copy Markdown
Contributor Author

@JanDorniak99 JanDorniak99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @igchor, @karczex, and @KFilipek)


benchmarks/radix/radix_tree.cpp, line 28 at r3 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…
struct benchmark_params {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
};

This struct is used in this file only for containing and grouping this fields/variables.
No offence, it looks good, but IMO better will be (you can now delete further instantiate, params variable is created here:

struct benchmark_params {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
} params;

Also: it's now global visible so you don't have to pass it to every function. Anonymous struct like below is the way for one-file applications with global variables.

struct {
	size_t count = 10000;
	size_t sample_size = 100;
	size_t batch_size = 1000;
} benchmark_params;

Done.


benchmarks/radix/radix_tree.cpp, line 100 at r3 (raw file):

Previously, KFilipek (Krzysztof Filipek) wrote…

here you will have only 2 args with approach I've proposed on the top

Done.

Copy link
Copy Markdown
Contributor

@KFilipek KFilipek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job: :lgtm:

Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @igchor and @karczex)

Copy link
Copy Markdown
Contributor

@igchor igchor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @karczex)

Copy link
Copy Markdown
Member

@lukaszstolarczuk lukaszstolarczuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r1, 1 of 1 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @karczex)

@lukaszstolarczuk lukaszstolarczuk merged commit a2cb574 into pmem:master Apr 28, 2021
@JanDorniak99 JanDorniak99 deleted the add_radix_benchmark branch April 28, 2021 12:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants