osd: erasure code benchmark tool #933

Merged: 7 commits merged into ceph:master on Dec 20, 2013

Conversation

@ghost commented Dec 12, 2013

Implement the ceph_erasure_code_benchmark utility to:

  • load an erasure code plugin
  • loop over the encode function using the parameters from the command
    line
  • print the number of bytes encoded and the time to process

For instance:

$ ceph_erasure_code_benchmark \
   --plugin jerasure \
   --parameter erasure-code-directory=.libs \
   --parameter erasure-code-technique=reed_sol_van \
   --parameter erasure-code-k=2 \
   --parameter erasure-code-m=2 \
   --iterations 1000
0.964759  1048576000

shows 1GB is encoded in one second.

Signed-off-by: Loic Dachary <loic@dachary.org>
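
As a point of reference, a minimal sketch of what such an encode loop can look like. It is an illustration only, not the source of the tool: ErasureCodeInterfaceRef, bufferlist and the encode() signature are assumed to match the Ceph erasure code interface of this period, and the timing uses std::chrono for brevity.

    // Illustrative sketch of the benchmark inner loop (not the tool's actual code).
    // Assumes the erasure code interface of this period:
    //   int encode(const set<int> &want_to_encode, const bufferlist &in,
    //              map<int, bufferlist> *encoded);
    #include <chrono>
    #include <iostream>
    #include <map>
    #include <set>

    void bench_encode(ErasureCodeInterfaceRef erasure_code,
                      const bufferlist &in, int iterations) {
      std::set<int> want;                      // ask for all k+m chunks
      for (unsigned i = 0; i < erasure_code->get_chunk_count(); ++i)
        want.insert(i);

      long long bytes = 0;
      auto start = std::chrono::steady_clock::now();
      for (int i = 0; i < iterations; ++i) {
        std::map<int, bufferlist> encoded;     // chunk index -> chunk content
        erasure_code->encode(want, in, &encoded);
        bytes += in.length();
      }
      std::chrono::duration<double> seconds =
          std::chrono::steady_clock::now() - start;
      std::cout << seconds.count() << "\t" << bytes << std::endl;  // seconds, bytes
    }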

@ghost (author) commented Dec 12, 2013

@apeters1971 I would very much appreciate your review when you get time

@apeters1971 (contributor) commented:

This already looks very good. I would also add the option to measure decoding performance when data stripes are unavailable, and perhaps beautify the output a little bit, e.g. print key-value pairs with the input and output parameters, and also compute something human readable like MB/s.

I would consider the option to have the bufferlist allocated in individual chunks and see the impact on performance. I don't know if, in the final usage in Ceph, there will be one big buffer that is allocated and segmented, or if each piece will have its own allocation ... in the SnapRAID sources there is this remark:

  • When allocating a sequence of blocks with a size of power of 2,
  • there is the risk that the start of each block is mapped into the same cache line,
  • resulting in cache collisions if you access all the blocks in parallel,
  • from the start to the end.

I don't know your exact plans, but it would also be very useful to wrap the command with some scripting language that prints the platform you run on and the available memory, and maybe run each command with some standard test parameters, like this:

perf stat /usr/bin/time -v ceph_erasure_code_benchmark ....
cat /proc/cpuinfo
cat /proc/meminfo
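
To make the cache-line concern concrete, here is a minimal sketch (not from this pull request) of per-chunk allocation with a staggered start address, so that power-of-two sized chunks accessed in parallel do not all map to the same cache lines; the cache line size, the stagger scheme and the names are illustrative assumptions.

    // Illustrative sketch only: shift each chunk's start address by one extra
    // cache line so parallel access to power-of-two sized chunks does not hit
    // the same cache lines. Names and constants are assumptions, not Ceph code.
    #include <cstddef>
    #include <memory>
    #include <vector>

    struct Chunk {
      std::unique_ptr<char[]> storage;  // owns the raw allocation
      char *data;                       // staggered start of the usable chunk
    };

    std::vector<Chunk> allocate_staggered(size_t chunk_size, unsigned chunk_count) {
      const size_t cache_line = 64;     // assumed cache line size
      std::vector<Chunk> chunks;
      for (unsigned i = 0; i < chunk_count; ++i) {
        size_t offset = (i * cache_line) % 4096;  // different offset per chunk
        Chunk c;
        c.storage.reset(new char[chunk_size + offset]);
        c.data = c.storage.get() + offset;
        chunks.push_back(std::move(c));
      }
      return chunks;
    }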

@ghost (author) commented Dec 12, 2013

Thanks for the quick feedback. I'll work on it today :-)

@ghost (author) commented Dec 12, 2013

@apeters1971 I've added decode plus erasure injection as you suggested.
The plan is to use this within another tool rather than making the output human readable. A script added to ceph/qa/workunits would call ceph_erasure_code_benchmark with various relevant parameters and compile the results into a CSV file, to be consumed either by a teuthology suite designed to detect regressions between versions or by something that produces a nice human readable display.
Regarding the impact of chunk allocation, I would rather do that later: as you point out, we don't know exactly how it will be done within Ceph.

@ghost (author) commented Dec 12, 2013

@apeters1971 be8cfc9 is the implementation of the benchmark workunit based on ceph_erasure_code_benchmark

@ghost (author) commented Dec 13, 2013

@apeters1971 I was able to find performance problems in the example (I know it does not really matter, but... ;-). I was also able to confirm that the jerasure performance reported by ceph_erasure_code_benchmark is the same as what you get from the encoder/decoder examples found in Jerasure itself. This is a good sign that the tool works and that the plugin implementation does not introduce any significant performance problem.

Although I could work on a teuthology job using bench.sh, I would like to use it to measure the benefit of your implementation of BPC first.

@ghost (author) commented Dec 14, 2013

@apeters1971 I ran bench.sh and interpreted the results as explained in the draft post here: http://dachary.org/loic/ecbench/ . The numbers look good and will be even better after your code is merged :-) It would be great if you could run it independently and confirm the numbers.

@apeters1971 (contributor) commented:

@loic yes, will do. This looks familiar... just that you have a 40% faster CPU than me. Which GCC version did you use?

@ghost (author) commented Dec 14, 2013

I use

 gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)

and

 make CFLAGS='-O3' 

so that jerasure compiles with -O3 instead of -O2.

@ghost (author) commented Dec 19, 2013

Rebased against master and verified that make check is happy. ceph_erasure_code_benchmark was wrongly listed as a unit test; it is a debug program. Also added packaging information. Foucault de Bonneval verified that it runs and provides consistent results on a Dell R620.

Now checking to see if gitbuilder would be happy about the change.

Loic Dachary added 7 commits December 20, 2013 11:28
When profiling, tools such as valgrind --tool=callgrind require that the
dynamically loaded libraries are not dlclosed so they can collect usage
information.

The public ErasureCodePluginRegistry::disable_dlclose boolean is introduced
for this purpose.

Signed-off-by: Loic Dachary <loic@dachary.org>
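
A minimal sketch of the idea (not the actual registry code): when the flag is set, the registry simply skips dlclose() at shutdown, so the plugin shared objects stay mapped and the profiler can still resolve their symbols.

    // Illustrative sketch of the disable_dlclose idea, not the Ceph implementation.
    #include <dlfcn.h>
    #include <map>
    #include <string>

    struct PluginRegistrySketch {
      bool disable_dlclose = false;          // set when profiling, e.g. with callgrind
      std::map<std::string, void*> handles;  // dlopen() handle per loaded plugin

      ~PluginRegistrySketch() {
        if (disable_dlclose)
          return;                            // keep the libraries mapped for the profiler
        for (auto &p : handles)
          dlclose(p.second);                 // normal path: unload each plugin
      }
    };
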
The XOR based example is ten times slower than it could be because it uses the buffer::ptr[] operator. Use a temporary char * instead. It performs as well as jerasure Reed-Solomon when decoding with a single erasure:

$ ceph_erasure_code_benchmark \
   --plugin example  --parameter erasure-code-directory=.libs \
   --parameter erasure-code-technique=example \
   --parameter erasure-code-k=2 --parameter erasure-code-m=1 \
   --erasure 1 --workload decode --iterations 5000
8.095007	5GB

$ ceph_erasure_code_benchmark \
   --plugin jerasure  --parameter erasure-code-directory=.libs \
   --parameter erasure-code-technique=reed_sol_van \
   --parameter erasure-code-k=10 --parameter erasure-code-m=6 \
   --erasure 1 --workload decode --iterations 5000
7.870990	5GB

Signed-off-by: Loic Dachary <loic@dachary.org>
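
The gist of the change, as a sketch assuming the usual buffer::ptr API (operator[] and c_str()); this is not the literal diff:

    // Illustrative sketch of the optimisation described above.
    #include "include/buffer.h"   // path within the Ceph tree (approximate)

    // Slow: every byte goes through bufferptr::operator[].
    void xor_chunks_slow(const bufferptr &a, const bufferptr &b,
                         bufferptr &out, unsigned len) {
      for (unsigned i = 0; i < len; ++i)
        out[i] = a[i] ^ b[i];
    }

    // Fast: fetch the raw char * once, then XOR over plain pointers.
    void xor_chunks_fast(const bufferptr &a, const bufferptr &b,
                         bufferptr &out, unsigned len) {
      const char *pa = a.c_str();
      const char *pb = b.c_str();
      char *po = out.c_str();
      for (unsigned i = 0; i < len; ++i)
        po[i] = pa[i] ^ pb[i];
    }
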
As shown in
https://www.usenix.org/legacy/events/fast09/tech/full_papers/plank/plank_html/
under "Impact of the Packet Size", the optimal packet size is on the order of 1K
rather than the current default of 8. Benchmarks are required to find
the actual optimum.

Signed-off-by: Loic Dachary <loic@dachary.org>
Implement the ceph_erasure_code_benchmark utility to:

* load an erasure code plugin

* loop over the encode/decode function using the parameters from the
  command line

* print the number of bytes encoded/decoded and the time to process

When decoding, random chunks (as set with --erasures) are lost on each run.

For instance:

    $ ceph_erasure_code_benchmark \
       --plugin jerasure \
       --parameter erasure-code-directory=.libs \
       --parameter erasure-code-technique=reed_sol_van \
       --parameter erasure-code-k=2 \
       --parameter erasure-code-m=2 \
       --workload decode \
       --erasures 2 \
       --iterations 1000
    0.964759	1048576

shows 1GB is decoded in one second.

It is intended to be used by other scripts to present a human readable
output or detect performance regressions.

Signed-off-by: Loic Dachary <loic@dachary.org>
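
And a similarly rough sketch of the decode workload: encode once, then on each iteration drop random chunks and decode from the remaining ones. The interface calls are assumptions about the erasure code interface of this period, not the tool's actual code.

    // Illustrative sketch of the decode workload, not the tool's source.
    #include <chrono>
    #include <cstdlib>
    #include <map>
    #include <set>

    double bench_decode(ErasureCodeInterfaceRef erasure_code,
                        const bufferlist &in, int iterations, int erasures) {
      std::set<int> all;                        // read back every chunk
      for (unsigned i = 0; i < erasure_code->get_chunk_count(); ++i)
        all.insert(i);
      std::map<int, bufferlist> encoded;
      erasure_code->encode(all, in, &encoded);  // encode once up front

      auto start = std::chrono::steady_clock::now();
      for (int i = 0; i < iterations; ++i) {
        std::map<int, bufferlist> available = encoded;
        for (int e = 0; e < erasures; ++e)      // lose random chunks (repeats possible)
          available.erase(rand() % erasure_code->get_chunk_count());
        std::map<int, bufferlist> decoded;
        erasure_code->decode(all, available, &decoded);
      }
      return std::chrono::duration<double>(
          std::chrono::steady_clock::now() - start).count();
    }
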
Display benchmark results for the default erasure code plugins in a
tab-separated CSV file. The first two columns contain the time in seconds
and the amount of KB that was encoded or decoded, for a given combination
of the parameters displayed in the following fields.

seconds	KB	plugin	k	m	work.	iter.	size	eras.
1.2	10	example	2	1	encode	10	1024	0
0.5	10	example	2	1	decode	10	1024	1

It can be used as input for a human readable report. It is also intended
to be used to show whether a given version of an erasure code plugin performs
better than another.

The last column (not shown above for brevity) is the exact command
that was run to produce the result, so it can be copied and pasted to
reproduce it or to profile.

Only the jerasure techniques mentioned in
https://www.usenix.org/legacy/events/fast09/tech/full_papers/plank/plank_html/
are benchmarked; the others are assumed to be less interesting.

Signed-off-by: Loic Dachary <loic@dachary.org>
Add the benchmark tool to the packaging for RPMs and DEBs.

Signed-off-by: Loic Dachary <loic@dachary.org>
Signed-off-by: Loic Dachary <loic@dachary.org>
@ghost (author) commented Dec 20, 2013

@apeters1971 implemented -h as you suggested, and made it so that running with no arguments produces a sensible error message instead of a stack trace.

ghost pushed a commit that referenced this pull request Dec 20, 2013
osd: erasure code benchmark tool

Reviewed-by: Andreas Peters <andreas.joachim.peters@cern.ch>
Reviewed-by: Christophe Courtaut <christophe.courtaut@gmail.com>
@ghost merged commit e04d7b8 into ceph:master on Dec 20, 2013