
benchmark for CASTf90 analogs process #48

Closed
nilshempelmann opened this issue Jun 20, 2016 · 12 comments
@nilshempelmann
Member

No description provided.

@sradanov
Member

Computer: 4 CPUs, Fedora.

Case 1: 2-year simulation against a 40-year archive, counting only the castf90 execution part, not data preparation.

- castf90 home edition called from the terminal: 10 s (22 s with 1 CPU)
- castf90 conda edition called from the terminal: 12 s
- castf90 home edition called from the Python command line with os.system: 12 s
- castf90 home edition called from the Python command line with subprocess: 15 s
- castf90 called in the analogs process via birdy: ~47 s
- castf90 called in the analogs process via the Phoenix GUI: 47 s

In all cases (except the 1-CPU case) the system monitor shows 100% CPU activity on all 4 CPUs during the indicated time.

Case 2: 10-year simulation against a 40-year archive:

- castf90 home edition called from the terminal: 16 s (42 s with 1 CPU)
- castf90 conda edition called from the terminal: 14 s
- castf90 home edition called from the Python command line with os.system: 15 s
- castf90 home edition called from the Python command line with subprocess: 14 s
- castf90 called in the analogs process via birdy: at least ~3:10 min, probably longer
- castf90 called in the analogs process via the Phoenix GUI: 3:41 min
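The os.system vs. subprocess timings above can be reproduced with a small harness along these lines (a minimal sketch; the command is a placeholder, not the real castf90 invocation, which would be the binary plus its config file):

```python
import os
import subprocess
import time

# Placeholder command; in the benchmark this would be the castf90
# binary plus its config file.
CMD = ["echo", "done"]

def time_os_system(cmd):
    # os.system takes a single shell string and blocks until the child exits
    start = time.time()
    os.system(" ".join(cmd))
    return time.time() - start

def time_subprocess(cmd):
    # subprocess.call runs the argument list directly (no shell) and blocks
    start = time.time()
    subprocess.call(cmd)
    return time.time() - start

if __name__ == "__main__":
    print("os.system:  %.3f s" % time_os_system(CMD))
    print("subprocess: %.3f s" % time_subprocess(CMD))
```

For a call that runs for 10+ seconds, the few-second spread between the two launch mechanisms reported above is mostly noise; the interesting gap is between any direct call and the WPS path.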

@sradanov
Member

Case 1:
commented out unused processes in `flyingpigeon/flyingpigeon/processes/__init__.py`
castf90 called in the analogs process via the Phoenix GUI: 49 s
--> so this one doesn't have much influence.

@cehbrecht
Member

You can use OWSLib to call the process directly (without the Phoenix GUI or the birdy command line). There is an example as an IPython notebook:

https://github.com/bird-house/flyingpigeon/blob/master/notebooks/WPS_analogs.ipynb

You can call the process asynchronously (the default) or synchronously. In the notebook example, async took 27 s and sync 18 s.

I have added some measuring in the analogs process:

https://github.com/bird-house/flyingpigeon/blob/master/flyingpigeon/processes/wps_analogs.py

In the log you can see the duration of each step. Make sure you have the debug log level enabled:

  • edit custom.cfg

    [settings]
    log-level = DEBUG

  • make update

  • make restart

See the log with the performance:

$ cd ~/.conda/envs/birdhouse/
$ tail -f var/log/pywps/flyingpigeon.log | grep took

In my case it looks like this:

DEBUG: init took 0.00246214866638 seconds.
DEBUG: get_input_data took 14.1966841221 seconds.
DEBUG: data preperation took 0.612922906876 seconds.
DEBUG: write_config took 0.000335931777954 seconds.
DEBUG: castf90 took 2.40370893478 seconds.
DEBUG: total execution took 17.2252738476 seconds.

So ... the time goes into get_input_data preparation.
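The per-step measurements above can be produced with a small helper along these lines (a sketch, not the actual flyingpigeon code; the logger name and step labels are illustrative):

```python
import logging
import time

logger = logging.getLogger("flyingpigeon")

def timed(label, func, *args, **kwargs):
    """Run func and log '<label> took <n> seconds.' at DEBUG level."""
    start = time.time()
    result = func(*args, **kwargs)
    logger.debug("%s took %s seconds.", label, time.time() - start)
    return result

# e.g. wrapping one step of the process (names are hypothetical):
# nc_files = timed("get_input_data", fetch_input_data, request)
```

Wrapping each step this way is what makes `grep took` on the log give the per-step breakdown shown above.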

@sradanov
Member

Well, castf90 already writes the time when it starts (after data preparation) and the time when it ends into its log file, so the times I indicated are WITHOUT data preparation, which is a separate issue. Thanks for the notebook!

@cehbrecht
Member

> Case 2: 10-year simulation against a 40-year archive, ...
> castf90 conda edition called from the terminal: 14 s

So 14 s is the time we should expect here. If you only measure the execution time of castf90 inside the WPS process, it should be the same. I don't understand how it can be minutes ...

Do we need to set some environment variables?

One could check only this code part in the process ...

If you run the WPS service, then use OWSLib in sync mode (birdy and Phoenix only use async mode).

Can I reproduce this use case on my laptop? Do you have the input parameters?

@sradanov
Member

Execution time: well, yes, I agree it should be, but it isn't. That's the issue.

Environment variables: since castf90 already occupies all 4 of my CPUs, it was not necessary for me to set environment variables. If it doesn't for you, one may set OMP_NUM_THREADS=4 (or whatever number of threads to use).

I tried sync mode using
`execute = wps.execute(identifier="analogs", inputs=[("dateSt",'2013-01-01'),("refEn",'1995-12-31')], async=False)`
(is that how it works?)
It did not change anything substantially: 42 s with async=True, 42 s with async=False.

Laptop: of course you can. Use `inputs=[("dateSt",'2013-01-01'),("refEn",'1995-12-31')]` with the defaults for the other parameters for case 1, and `inputs=[("dateSt",'2005-01-01'),("refEn",'1995-12-31')]` for case 2.

What exactly do you mean by checking only that code part in the process?
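If castf90 does not pick up all CPUs on its own, the thread count can be set per child process by passing an adjusted environment (a sketch; it assumes, as suggested above, that castf90 is OpenMP-parallel and honours OMP_NUM_THREADS — the command here is a placeholder that just echoes the variable back, not the real binary):

```python
import os
import subprocess
import sys

def run_with_threads(cmd, n_threads):
    # Copy the current environment and set OMP_NUM_THREADS for the child only
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(n_threads)
    return subprocess.check_output(cmd, env=env).decode()

# Placeholder command; the real call would be the castf90 binary
# with its config file as argument.
out = run_with_threads(
    [sys.executable, "-c", "import os; print(os.environ['OMP_NUM_THREADS'])"],
    4,
)
print(out)  # prints 4
```

This keeps the setting local to the castf90 child instead of mutating the environment of the WPS service itself.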

@sradanov
Member

PS: the numbers above are for case 1.

@cehbrecht
Member

I have run use case 1, and the WPS call took 45 s for the castf90 part.

Then I took the produced config file and the same preprocessed input data and ran the conda script on the command line ... and it took 42 s.

So ... is there something wrong with the config file or the preprocessed input data?

@nilshempelmann the archive and simulation input files are written to the etc/pywps folder ... I could not see why ... please open a ticket for it.

@cehbrecht
Member

I've updated the notebook with both use cases:

https://github.com/bird-house/flyingpigeon/blob/master/notebooks/WPS_analogs.ipynb

async mode is better in this long-running case ;)

@sradanov
Member

OK, apparently I messed up something with the direct conda package call. I now get 42 s for case 1 as well, which means everything is OK with the WPS, but the conda package is slow compared to the classic command-line castf90.

@cehbrecht
Member

So we need to update the cast conda package (#37) and find out how to improve its performance.

As a workaround one can use the manual installation of castf90 ...

@nilshempelmann
Member Author

CASTf90 itself seems to be very fast.
The performance bottleneck is now the sub-setting during data preparation (#80).

@nilshempelmann nilshempelmann modified the milestone: v1.0 Sep 8, 2016