mgr/cli lightweight get method api call #41150

Waadkh7 · 2021-05-04T16:18:06Z

Adds a new module that calls the mgr get method from multiple threads, with a configurable number of calls.

Fixes: https://tracker.ceph.com/issues/50311
Signed-off-by: Waad AlKhoury <walkhour@redhat.com>

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

Waadkh7 · 2021-05-11T08:41:39Z

jenkins retest this please

tchaikov · 2021-05-17T09:33:45Z

is this module supposed to be used by end users or just by developers? and why not use cProfile?

sebastian-philipp · 2021-05-17T09:54:20Z

> is this module supposed to be used by end users or just by developers?

.. or Teuthology?

pereman2 · 2021-05-17T10:01:41Z

@tchaikov For now it's meant to be a developer tool; we need it to stress-test the mgr get call with the help of the injector (#41260). The end goal is to test the addition of caching.
cProfile looks like overkill for a simple function, but let me know if you think we could benefit from it.

sebastian-philipp · 2021-05-17T10:11:43Z

why not add this to mgr/selftest for now? Injecting data and then verifying the performance in a teuthology test sounds like a sane idea to me

tchaikov · 2021-05-30T04:41:15Z

> @tchaikov For now it's meant to be a developer tool; we need it to stress-test the mgr get call with the help of the injector (#41260). The end goal is to test the addition of caching.
> cProfile looks like overkill for a simple function, but let me know if you think we could benefit from it.

@Waadkh7 i think we should leverage existing tools if they can address our needs. cProfile is just one of them; if you are after something simpler, you should probably take a look at timeit. i just created #41591, which allows us to run code in the selftest module. in other words, it lets us run existing tools on the mgr in a more straightforward way.
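
For reference, a minimal sketch of the timeit route mentioned above. `fake_get` is a hypothetical stand-in for the mgr get call (it is not a Ceph API), and `measure` is an illustrative helper, so this only shows the shape of the measurement:

```python
import timeit


def fake_get(data_name):
    # Hypothetical stand-in for the mgr get call; not a real Ceph API.
    return {"name": data_name, "epoch": 1}


def measure(func, arg, calls=1000, repeats=5):
    """Return the best per-call latency in seconds over several repeats."""
    totals = timeit.repeat(lambda: func(arg), number=calls, repeat=repeats)
    # Each total covers `calls` invocations; the minimum is the least noisy.
    return min(totals) / calls


if __name__ == "__main__":
    print(f"best per-call latency: {measure(fake_get, 'osd_map'):.3e}s")
```

This measures single-threaded latency only; it says nothing about behaviour under concurrent callers.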

epuertat · 2021-06-10T14:25:59Z

@tchaikov this PR is a helper for the mgr API caching activities, but also for developers and users to debug and interact with the mgr API (basically another iteration of what I started with #34840).

We want a tool that gives us the following:

1. List all ceph-mgr API methods and their syntax and documentation, same as with any other Ceph CLI command.
2. Invoke a single API method from the CLI (mostly for devels, users and operators to functionally debug the mgr API).
3. Perform threaded benchmarking against the mgr API methods.

The REPL approach is OK, but it doesn't cover 1. and 3. (I guess 2. could be achieved by feeding a Python script into the interpreter), and it's more devel-friendly than user-friendly. cProfile is IMHO overkill if you just want to measure the latency of a high-level method, and it doesn't solve the concurrency benchmarking. The same goes for timeit. Yes, all those tools can solve part of the issue, but not all of it.
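
To make the threaded benchmarking item concrete, here is a rough sketch of what such a run could look like. It is not the code from this PR: `fake_get`, `timed_call` and `benchmark` are hypothetical names, and `fake_get` merely sleeps in place of a real mgr API call:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def fake_get(data_name):
    # Hypothetical stand-in for a mgr API method; not a real Ceph call.
    time.sleep(0.001)
    return {"name": data_name}


def timed_call(func, arg):
    """Run func(arg) once and return its latency in seconds."""
    start = time.perf_counter()
    func(arg)
    return time.perf_counter() - start


def benchmark(func, arg, calls, threads):
    """Issue `calls` invocations across `threads` workers and summarize latency."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map yields exactly one latency per submitted call.
        latencies = list(pool.map(lambda _: timed_call(func, arg), range(calls)))
    return {
        "data": latencies,
        "min": min(latencies),
        "max": max(latencies),
        "avg": mean(latencies),
    }


if __name__ == "__main__":
    result = benchmark(fake_get, "osd_map", calls=100, threads=10)
    print(len(result["data"]), result["min"], result["max"], result["avg"])
```

Collecting the latencies from the return values of the workers, rather than writing into a shared pre-sized list, means the number of samples always matches the number of calls.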

epuertat

I know there's a follow up PR for this, but I promised to review it and that's what I'm doing ;-)

I just tried a few times and this is what I get:

>>> ceph mgr api get --help
# No help!

>>> ceph mgr api get
null
1.0967254638671875e-05

BTW this shouldn't happen. @tchaikov I have the feeling that the CLICommand parser is not properly distinguishing mandatory and optional args (there might be an issue with the default index).

>>> echo $? 
0
# a wrong command returns 0 (ok) instead of an error code and an error message?

>>> ceph mgr api get invented_map
null
2.1219253540039062e-05
>>> echo $? 
0
# same here
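
A sketch of how a missing or unknown argument could be turned into an error instead. It assumes the usual mgr command convention of returning a (retval, stdout, stderr) tuple; `KNOWN_MAPS` and `handle_get` are hypothetical names for illustration, not code from this PR:

```python
import errno
import json

# Hypothetical subset of valid names, for illustration only.
KNOWN_MAPS = {"osd_map", "mon_map", "fs_map"}


def handle_get(data_name=None):
    """Return (retval, stdout, stderr), rejecting missing or unknown names."""
    if not data_name:
        return -errno.EINVAL, "", "missing required argument: data_name"
    if data_name not in KNOWN_MAPS:
        return -errno.ENOENT, "", f"unknown data name: {data_name}"
    # A real handler would fetch the data here; this sketch returns a stub.
    return 0, json.dumps({"name": data_name}), ""


if __name__ == "__main__":
    print(handle_get())
    print(handle_get("invented_map"))
    print(handle_get("osd_map"))
```

With a negative retval and a message on stderr, the CLI would be expected to exit non-zero for both cases shown above.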

And lastly, and more importantly, have you verified that the list you return is correct? I decided to dump the shared_list and this is what I get:

# ceph mgr api benchmark get osd_map 100 10
2021-06-10T17:36:46.250+0000 7fed48b96700 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-10T17:36:46.263+0000 7fed4359e700 -1 WARNING: all dangerous and experimental features are enabled.
{"data": [0.00048470497131347656, 0.0005557537078857422, 0.00048732757568359375, 0.0005564689636230469, 0.0005829334259033203, 0.00038504600524902344, 0.0009906291961669922, 0.0005061626434326172, 0.00049591064453125, 0.0003733634948730469, 0.0003802776336669922, 0.0005617141723632812, 0.0006070137023925781, 0.0005869865417480469, 0.0006308555603027344, 0.0005338191986083984, 0.0007674694061279297, 0.0005700588226318359, 0.000408172607421875, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "min": 0, "max": 0.0009906291961669922, "avg": 0.00010464668273925782}

I see lots of zeros there that look suspicious...
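
Note that "min" is 0 and the reported "avg" is consistent with the sum of the non-zero samples divided by 100 (the requested number of calls), so the zeros appear to be counted as real measurements. A small sketch of a check that separates filled from unfilled slots; `summarize` is a hypothetical helper, and the sample is shortened from the output above:

```python
from statistics import mean


def summarize(samples):
    """Split latency samples into filled and unfilled (zero) slots."""
    filled = [s for s in samples if s > 0]
    return {
        "total": len(samples),
        "unfilled": len(samples) - len(filled),
        # Average over every slot, zeros included.
        "avg_all": mean(samples) if samples else None,
        # Average over slots that actually recorded a latency.
        "avg_filled": mean(filled) if filled else None,
    }


if __name__ == "__main__":
    # First three values from the output above, padded with two zeros.
    sample = [0.00048470497131347656, 0.0005557537078857422,
              0.00048732757568359375, 0, 0]
    print(summarize(sample))
```

A non-zero "unfilled" count for a run where every call was supposed to complete would point at the way results are written into the shared list.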