Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/cli lightweight get method api call #41150

Merged
merged 1 commit into from Jul 9, 2021

Conversation

Waadkh7
Copy link

@Waadkh7 Waadkh7 commented May 4, 2021

Adding new module to call get method with threads and different number of calls.

Fixes: https://tracker.ceph.com/issues/50311
Signed-off-by: Waad AlKhoury walkhour@redhat.com

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@github-actions github-actions bot added the pybind label May 4, 2021
@Waadkh7 Waadkh7 added this to In progress in Dashboard via automation May 7, 2021
@Waadkh7
Copy link
Author

Waadkh7 commented May 11, 2021

jenkins retest this please

@tchaikov
Copy link
Contributor

is this module supposed to be used by end user or just be developers? and why not use cProfile ?

@sebastian-philipp
Copy link
Contributor

is this module supposed to be used by end user or just be developers?

.. or Teuthology?

@pereman2
Copy link
Contributor

pereman2 commented May 17, 2021

@tchaikov For now it's supposed to be a developer tool, we need it to stress test the mgr get with help of the injector (#41260). The reason for this at the end is to test the addition of caching.
Well, cProfile looks like a bit overkill for a simple function, let me know if you think we could benefit from it.

@sebastian-philipp
Copy link
Contributor

why not add this to mgr/selftest for now? Injecting data and then verifying the performace in a teuthology test sounds like a sane idea to me

@Waadkh7 Waadkh7 force-pushed the fix-50311-master branch 2 times, most recently from 38c705e to 81035da Compare May 19, 2021 10:48
@tchaikov
Copy link
Contributor

tchaikov commented May 30, 2021

@tchaikov For now it's supposed to be a developer tool, we need it to stress test the mgr get with help of the injector (#41260). The reason for this at the end is to test the addition of caching.
Well, cProfile looks like a bit overkill for a simple function, let me know if you think we could benefit from it.

@Waadkh7 i think we should leverage existing tools if they can address our needs. cProfile is just one of them, if you are after something simpler, probably you should take a look at timeit. i just created #41591 which allows us to run code in the selftest module. in other words, we are able to run existing tools using it on mgr in a more straightforward way.

@Waadkh7 Waadkh7 marked this pull request as ready for review May 31, 2021 07:36
@epuertat
Copy link
Member

@tchaikov this PR is a helper for the mgr API Caching activities, but also for developers and users to debug and interact with the mgr aPI (basically another iteration of what I started with #34840).

We want a tool that gives us the following:

  1. List all ceph-mgr API methods and their syntax and documentation, same as with any other Ceph CLI command.
  2. Invoke a single API method from the CLI (mostly for devels, users and operators to functionally debug the mgr API).
  3. Perform threaded benchmarking against the mgr API methods.

The repl approach is ok but it doesn't cover 1. and 3. (I guess 2. could be achieved by feeding a Python script into the interpreter), and it's more devel-friendly that user-friendly. cProfile is IMHO an overkill if you just want to measure the latency of a high-level method, and doesn't solve the concurrency benchmarking. And same for timeit. Yes, all those tools can solve part of the issue, but not all.

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there's a follow up PR for this, but I promised to review it and that's what I'm doing ;-)

I just tried a few times and this is what I get:

>>> ceph mgr api get --help
# No help!
>>> ceph mgr api get
null
1.0967254638671875e-05

BTW this shouldn't happen. @tchaikov I have the feeling that the CLICommand parser is not properly distinguishing mandatory and optional args (there might be an issue with the default index).

>>> echo $? 
0
# a wrong command returns 0 (ok) instead of an error code and an error message?

>>> ceph mgr api get invented_map
null
2.1219253540039062e-05
>>> echo $? 
0
# same here 

And lastly, and more importantly, have you verified that the list you return is correct? I decided to dump the shared_list and this is what I get:

# ceph mgr api benchmark get osd_map 100 10
2021-06-10T17:36:46.250+0000 7fed48b96700 -1 WARNING: all dangerous and experimental features are enabled.
2021-06-10T17:36:46.263+0000 7fed4359e700 -1 WARNING: all dangerous and experimental features are enabled.
{"data": [0.00048470497131347656, 0.0005557537078857422, 0.00048732757568359375, 0.0005564689636230469, 0.0005829334259033203, 0.00038504600524902344, 0.0009906291961669922, 0.0005061626434326172, 0.00049591064453125, 0.0003733634948730469, 0.0003802776336669922, 0.0005617141723632812, 0.0006070137023925781, 0.0005869865417480469, 0.0006308555603027344, 0.0005338191986083984, 0.0007674694061279297, 0.0005700588226318359, 0.000408172607421875, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "min": 0, "max": 0.0009906291961669922, "avg": 0.00010464668273925782}

I see lots of zeros there that look suspicious...

src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
Comment on lines 67 to 69
class ThreadedBenchmarkRunner:
def __init__(self,number_of_total_calls,number_of_parellel_calls):
self.lock = threading.RLock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, another suggestion here, the Thread class can also be used to derive a worker/runner class, so why not taking benefit of it:

class BenchmarkRunner(threading.Thread):
  def __init__(self, num_iter, thread_number, func, results_list, *args, **kwargs):
    super().__init__(*args, **kwars)
    self.num_iter = num_iter
    self.thread_number = ...

  def run():
    # You just need to implement this method, so you put the `timer` code here
    for i in range():
      ...

runners = [BenchmarkRunner(num_iter, i, func, result_list) for i in range()]

for runner in runners:
  runner.start() # this internally calls .run()
  

src/pybind/mgr/cli_api/tests/test_cliapi.py Outdated Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Show resolved Hide resolved
src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
Dashboard automation moved this from In progress to Review in progress Jun 17, 2021
src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
@Waadkh7 Waadkh7 force-pushed the fix-50311-master branch 2 times, most recently from ee96016 to 0a00ad5 Compare June 22, 2021 08:42
Dashboard automation moved this from Review in progress to Reviewer approved Jun 22, 2021
@Waadkh7 Waadkh7 changed the base branch from master to feature-48388-cache June 29, 2021 13:21
Copy link
Contributor

@pereman2 pereman2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. I left a minor suggestion

src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please change the snake_case comments from @pereman2 and myself. Take the suggestion of adding pylint as purely optional (but it'd help you catch these issues). Thank you. You're almost there, @Waadkh7 !

src/pybind/mgr/cli_api/module.py Outdated Show resolved Hide resolved
@epuertat epuertat moved this from Reviewer approved to Ready-to-merge in Dashboard Jul 6, 2021
@Waadkh7 Waadkh7 force-pushed the fix-50311-master branch 2 times, most recently from 92b6f8a to 41a31b1 Compare July 6, 2021 13:50
Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the pylint changes, you need to do clean-up on the newly added code.

src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
Dashboard automation moved this from Ready-to-merge to Review in progress Jul 6, 2021
Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT @Waadkh7 ! But please remove the rst reference in the tox ini.

src/pybind/mgr/tox.ini Outdated Show resolved Hide resolved
Dashboard automation moved this from Review in progress to Reviewer approved Jul 8, 2021
@epuertat epuertat moved this from Reviewer approved to Ready-to-merge in Dashboard Jul 9, 2021
Fixes: https://tracker.ceph.com/issues/50311
Signed-off-by: Waad AlKhoury <walkhour@redhat.com>
@epuertat epuertat merged commit 2141590 into ceph:feature-48388-cache Jul 9, 2021
Dashboard automation moved this from Ready-to-merge to Done Jul 9, 2021
@epuertat epuertat deleted the fix-50311-master branch July 9, 2021 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Dashboard
  
Done
6 participants