cli: optimize reana-client #359

Merged
merged 2 commits into from Jan 22, 2020

Member

@mvidalgarcia commented Jan 20, 2020

closes #354

Optimizations:

  • ping: ~2.2s → ~0.8s
  • --help: ~0.6s → ~0.3s
  • version: ~0.6s → ~0.3s
  • create: ~2s → ~1.2s
  • start: ~2.4s → ~1.5s
  • status: ~1.6s → ~1.2s
  • run: ~7s → ~3.9s (reana-demo-worldpopulation)
  • list: ~1.6s → ~1.2s
  • diff: ~1.6s → ~1s
  • logs: ~1.6s → ~1s
  • stop: ~1s → ~0.4s
  • upload: ~2.3s → ~1.2s
  • download: ~1.8s → ~1.1s
  • du: ~1.5s → ~0.9s
  • ls: ~2s → ~1.6s
  • mv: ~2.9s → ~2.3s
  • rm: ~1.6s → ~1s
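
The PR does not say how these timings were taken. As a rough sketch of one way to reproduce wall-clock numbers like these (the helper name `time_command` is hypothetical, and this is not necessarily the method used above):

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Return the median wall-clock time in seconds of running argv."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard output so terminal rendering does not skew the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Baseline: a bare interpreter start, useful for judging how much of a
# command's time is import overhead rather than interpreter startup.
baseline = time_command([sys.executable, "-c", "pass"], repeats=3)
print(f"bare interpreter: {baseline:.2f}s")
```

With reana-client installed and on `PATH`, `time_command(["reana-client", "--help"])` would give a figure comparable to the `--help` entry in the list. Commands that talk to a server (`ping`, `list`, ...) also include network latency, so they are noisier.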

@@ -77,6 +76,7 @@ def get_files(ctx, workflow, _filter,
Examples: \n
\t $ reana-client ls --workflow myanalysis.42
"""
import tablib
Member

BTW, another possible improvement, similar to deferring the tablib import, concerns YAML parsing, which is very slow. Do we need to use FullLoader? Can't we use CLoader? It would be much faster:

In [19]: %timeit with open('./reana_client/schemas/reana_analysis_schema.json', 'r') as afile: res_f = yaml.load(afile, Loader=yaml.FullLoader)
24 ms ± 376 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [20]: %timeit with open('./reana_client/schemas/reana_analysis_schema.json', 'r') as afile: res_c = yaml.load(afile, Loader=yaml.CLoader)
1.87 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
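
A minimal sketch of the suggestion, assuming PyYAML is installed. The helper name `load_yaml` is hypothetical and not part of reana-client. The C-accelerated loaders exist only when PyYAML was built against libyaml, so the sketch falls back to a pure-Python loader. It picks `CSafeLoader` rather than `CLoader`, on the assumption that the input uses only standard YAML tags:

```python
def load_yaml(stream):
    """Parse YAML, preferring the libyaml-backed loader when available."""
    # Deferred import, in the same spirit as the tablib change above.
    import yaml

    # yaml.CSafeLoader is only defined if PyYAML was built with libyaml;
    # fall back to the pure-Python SafeLoader otherwise.
    loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)
    return yaml.load(stream, Loader=loader)
```

`load_yaml` accepts either a string or an open file object, as `yaml.load` does. Whether a safe loader is sufficient for every file the client reads is not established in this thread and would need checking.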

Member Author

I tested it and didn't notice a significant difference in performance. Perhaps reana.yaml needs to be bigger for the difference to show? Is there an example I could use to test it better?
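
One way to probe the size question is to time both loaders on a generated document of adjustable size. This is only a sketch, assuming PyYAML is installed; the helper names are hypothetical and the generated document is synthetic, not a real reana.yaml:

```python
import timeit


def make_document(n_steps):
    """Build a synthetic YAML document with n_steps list entries."""
    lines = ["steps:"]
    for i in range(n_steps):
        lines.append(f"  - name: step{i}")
        lines.append(f"    command: echo {i}")
    return "\n".join(lines) + "\n"


def compare_loaders(text, number=20):
    """Return seconds per call for each available loader on text."""
    import yaml  # requires PyYAML

    results = {}
    for name in ("FullLoader", "CLoader"):
        loader = getattr(yaml, name, None)
        if loader is None:  # CLoader is absent without libyaml
            continue
        total = timeit.timeit(
            lambda: yaml.load(text, Loader=loader), number=number
        )
        results[name] = total / number
    return results
```

Calling `compare_loaders(make_document(10))` and `compare_loaders(make_document(1000))` side by side would show whether the gap only becomes visible on larger inputs.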

@mvidalgarcia force-pushed the 354-optimization branch 3 times, most recently from f9fe724 to 4fe0805, January 21, 2020 10:31
@mvidalgarcia marked this pull request as ready for review January 21, 2020 10:31
@tiborsimko
Member

Nice improvements! And there is room for more, e.g. compare reana-client version with reana-dev version:

$ python -m cProfile -o rc.prof ~/.virtualenvs/reana/bin/reana-client version
$ echo -e 'sort cumtime\nstats' | python -m pstats rc.prof | less
         453917 function calls (443016 primitive calls) in 1.018 seconds
   ...
        1    0.000    0.000    0.506    0.506 /home/simko/private/project/reana/src/reana-client/reana_client/utils.py:8(<module>)
        1    0.000    0.000    0.500    0.500 /home/simko/.virtualenvs/reana/lib/python3.8/site-packages/yadageschemas/__init__.py:1(<module>)
        1    0.000    0.000    0.270    0.270 /home/simko/.virtualenvs/reana/lib/python3.8/site-packages/pkg_resources/__init__.py:2(<module>)
        1    0.000    0.000    0.236    0.236 /home/simko/.virtualenvs/reana/lib/python3.8/site-packages/jsonschema/__init__.py:1(<module>)
     2151    0.008    0.000    0.235    0.000 /home/simko/.virtualenvs/reana/lib/python3.8/re.py:287(_compile)
      172    0.000    0.000    0.226    0.001 /home/simko/.virtualenvs/reana/lib/python3.8/re.py:248(compile)

with ~0.5 seconds spent importing the yadage schemas and ~0.3 seconds spent on regular expression handling (2K compilations!), while reana-dev is ultra fast here:

$ python -m cProfile -o rc.prof ~/.virtualenvs/reana/bin/reana-dev version
$ echo -e 'sort cumtime\nstats' | python -m pstats rc.prof | less
28090 function calls (27441 primitive calls) in 0.098 seconds

We can think of these further improvements later.
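
The same kind of cumulative-time report can also be produced from Python for a single import, which helps isolate one heavy module. A minimal sketch using only the standard library; the helper name `profile_import` is hypothetical, and `wave` is just a stand-in for a heavy module such as the ones in the listing above:

```python
import cProfile
import importlib
import io
import pstats


def profile_import(module_name, top=10):
    """Profile importing module_name; return a cumtime-sorted report."""
    profiler = cProfile.Profile()
    profiler.enable()
    importlib.import_module(module_name)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumtime").print_stats(top)
    return buffer.getvalue()


# Stand-in module from the standard library. Modules that are already
# imported are served from sys.modules and would show almost no cost.
print(profile_import("wave"))
```

Because of that caching, the measurement is only meaningful in a fresh interpreter, before anything else has pulled the module in.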

@diegodelemos
Member

All commands tested, everything working except reana-client delete (see here). Very smooth 🏄‍♂️

* Client operations are expensive; moving the client
  imports inside the functions that use them reduces
  loading time, as each import is performed only when
  it is really needed
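
The pattern described in this commit message can be sketched as follows. This is an illustration only: the function name `format_table` is hypothetical, and the standard-library `csv` module stands in for a heavier dependency such as tablib:

```python
def format_table(rows):
    """Render rows as CSV text; csv is imported only when called."""
    # Deferred import: commands that never call this function do not
    # pay the import cost at start-up.
    import csv
    import io

    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


print(format_table([["name", "size"], ["a.txt", "3"]]))
```

After the first call the module is cached in `sys.modules`, so repeated calls only pay for a dictionary lookup. The trade-off is that an import error surfaces when the command runs rather than at start-up.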
@diegodelemos merged commit 0b425c0 into reanahub:master Jan 22, 2020
@mvidalgarcia deleted the 354-optimization branch October 19, 2020 10:26
Development

Successfully merging this pull request may close these issues.

cli: improve start up time
3 participants