anuga parallel reducing Geospatial data reading timing #120
Comments
Hi Tejul,
Please send a copy of your Python script so I can see what you are doing. But
my guess is that each of the 4 processes is producing its own serial domain
with the anuga.create_domain_from_regions command.
The call to anuga.create_domain_from_regions should be inside an if
statement:
    if anuga.myid == 0:
        domain = anuga.create_domain_from_regions( )
    else:
        domain = None
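A minimal sketch of the pattern above. The helper name build_parallel_domain and its callable arguments are hypothetical, introduced only so the logic can be run without ANUGA or MPI installed. In ANUGA's parallel examples the rank-0 domain is then handed to anuga.distribute; that step is not mentioned in this thread, so treat it as an assumption.

```python
def build_parallel_domain(myid, create_sequential_domain, distribute):
    """Build the full domain on rank 0 only, then hand it to distribute().

    myid                     -- rank of this process (anuga.myid in a real run)
    create_sequential_domain -- zero-argument callable that builds the domain
    distribute               -- callable that partitions the rank-0 domain
                                (assumed to be anuga.distribute in a real run)
    """
    if myid == 0:
        # Only one process pays for mesh generation and data reading.
        domain = create_sequential_domain()
    else:
        # All other ranks wait to receive their share of the domain.
        domain = None
    return distribute(domain)
```

With this shape, the expensive create step runs once no matter how many processes mpirun starts, instead of once per process.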
Cheers
Steve
…On 02/02/17 16:09, TejuJ wrote:
Hello
I am using anuga in my project for model simulation and have successfully
run my data in serial and in parallel (4 cores). But when I compare the
timings in the output, using a DEM and creating the domain with
anuga.create_domain_from_regions, the "Geospatial_data: Reading block" step
takes 24 mins in serial and 36 mins in parallel (4 cores).
Why is there that much difference, and what is the actual process while
reading geospatial data?
I checked the code in geospatial_data.py in the anuga directory.
Can anyone help me reduce the data reading time in parallel?
Regards,
Tejaswi
--
======================================
Stephen Roberts
Undergraduate Convenor
Mathematical Sciences Institute
Room 2005 P.A.P. Moran Building #26B
The Australian National University
Canberra, ACT 2601
AUSTRALIA
Ph: +61 2 61254445
CRICOS: 00120C
|
Hi Tejaswi,
It looks like you are doing the right thing. I am surprised that the
initial setup of the sequential domain is taking so much longer when run
in parallel. You might want to check the running processes (I would use
htop on Ubuntu) to see if there is another process running.
How many cores does your computer have? I have observed that you only
get speedups up to the number of physical cores.
By the way, as you have a dem file you should be able to use that
(instead of the pts file) to set up the quantity, i.e. something like
this should work
    domain.set_quantity('elevation',
                        filename=name_stem + '.dem',
                        use_cache=cache,
                        verbose=verbose)
instead of
    domain.set_quantity('elevation',
                        filename=name_stem + '.pts',
                        use_cache=cache,
                        verbose=verbose,
                        alpha=0.1)
You do have to be careful that the dem file covers your computational
domain (otherwise NaN can be applied to the quantity).
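One rough way to check the coverage before running is to compare the grid extent with the bounding polygon. This is only a sketch: grid_extent and uncovered_points are hypothetical helpers, and the header values (ncols, nrows, xllcorner, yllcorner, cellsize) are assumed to come from the ASCII grid the dem file was derived from. It checks the rectangular extent only, so NODATA cells inside the grid would still go unnoticed.

```python
def grid_extent(ncols, nrows, xllcorner, yllcorner, cellsize):
    """Return (xmin, ymin, xmax, ymax) of a regular grid from its header values."""
    xmax = xllcorner + ncols * cellsize
    ymax = yllcorner + nrows * cellsize
    return (xllcorner, yllcorner, xmax, ymax)


def uncovered_points(polygon, extent):
    """Return the polygon vertices that fall outside the grid extent."""
    xmin, ymin, xmax, ymax = extent
    return [(x, y) for (x, y) in polygon
            if not (xmin <= x <= xmax and ymin <= y <= ymax)]
```

If uncovered_points returns an empty list for the bounding polygon, the whole computational domain lies inside the rectangle covered by the grid.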
Cheers
Steve
…On 02/02/17 17:05, TejuJ wrote:
here is attachment
parallel.txt
<https://github.com/GeoscienceAustralia/anuga_core/files/746933/parallel.txt>
and I created domain like that only
thank you for reply
Tejaswi.
|
Hi Steve Thanks Tejaswi |
Hi Tejaswi,
If we have a dem file, the data is defined on a uniform grid, and so it is quick to interpolate the elevation data onto the centroids or vertices of the unstructured triangulation.
If we have a pts file then we don't assume any structure to the data set. So we have the problem of interpolating one set of unstructured data onto an unstructured triangulation. The most time-consuming part of pts data fitting is finding which data points lie within which triangles (this uses a quad tree algorithm). It may happen that many triangles have no elevation data in or close to them. The alpha parameter controls the smoothness of the fitted function that fills in the regions with holes in the data.
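To give a feel for why the point location step dominates, here is a brute-force version of it. This is not ANUGA's implementation (which uses the quad tree mentioned above to avoid testing every triangle); point_in_triangle and locate_points are hypothetical names used for illustration only.

```python
def point_in_triangle(p, a, b, c):
    """True if point p lies inside or on the edge of triangle (a, b, c)."""
    def cross(o, u, v):
        # z-component of (u - o) x (v - o): its sign says which side v is on
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def locate_points(points, triangles):
    """Map each triangle index to the data points it contains (brute force)."""
    hits = dict((i, []) for i in range(len(triangles)))
    for p in points:
        for i, (a, b, c) in enumerate(triangles):
            if point_in_triangle(p, a, b, c):
                hits[i].append(p)
                break  # assign each point to the first triangle found
    return hits
```

The nested loop costs roughly (number of points) x (number of triangles) tests, which is what a spatial index cuts down. Triangles that end up with an empty list are the "holes" that the alpha smoothing has to fill in.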
Cheers
Steve
…________________________________
From: TejuJ <notifications@github.com>
Sent: Friday, 3 February 2017 10:10:44 PM
To: GeoscienceAustralia/anuga_core
Cc: Stephen Roberts; Comment
Subject: Re: [GeoscienceAustralia/anuga_core] anuga parallel reducing Geospatial data reading timing (#120)
Hi Steve
Giving the .dem file as input worked for me. But why does the process take extra time when we give a pts file as input? I am using Bash on Windows 10 and analysing with Task Manager.
One more question: I now understand that alpha is for surface smoothness. What does passing alpha=0.1 mean, and how do we decide what value to pass there?
Thanks
Tejaswi
|
Hi Steve Regards, |
Hi TejuJ,
I have installed anuga on a CentOS system.
I first had to ensure all of the dependencies were installed (which you can deduce from the install_ubuntu.sh script).
This is essentially python2 with numpy, scipy, gdal, and openmpi or mpich(2). You need to set up your default compiler to be compatible with these packages.
Then you should be able to set up anuga.
I set up anuga to run "inplace", as I didn't have admin rights on the CentOS system I was using.
In this case you need to set PYTHONPATH to point to the location of your anuga_core directory, and run
    python setup.py build_ext --inplace
to compile the appropriate anuga C programs.
You should then be able to test as normal via
    python runtests.py
Cheers
Steve
…________________________________
From: TejuJ <notifications@github.com>
Sent: Thursday, 23 February 2017 4:11:33 PM
To: GeoscienceAustralia/anuga_core
Cc: Stephen Roberts; Comment
Subject: Re: [GeoscienceAustralia/anuga_core] anuga parallel reducing Geospatial data reading timing (#120)
Hi Steve
Thanks, I understand now why serial takes less time than parallel. One more question: can I install anuga on CentOS, and if so, what changes do I have to make?
Regards,
Tejaswi
|
Thanks Steve. I'll try it and post back if there are any other errors. |
Hello Steve What changes I have to do for parallel installation? Regards, |
Hi Tejaswi,
The environment variable ANUGA_PARALLEL is only used by the installation script tools/install_ubuntu.sh.
What you need to do is set up openmpi first and then install anuga.
Look in the script install_ubuntu.sh.
First it checks whether MPI is already installed. Then it uses apt-get to install the preferred MPI (openmpi in your case). So you need to install openmpi manually.
Then you need to install pypar. Follow the code in the shell script.
And then install anuga.
If you want to install inplace then you will need to run
    python setup.py build_ext --inplace
But you will also have to set up your PYTHONPATH to point to your anuga_core directory:
    export PYTHONPATH=/path/to/anuga_core:$PYTHONPATH
(you should put that in your .bashrc file)
Running
    python setup.py install
will rebuild and place anuga in the system site-packages directory (or perhaps your local site-packages).
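A quick way to see which pieces are still missing after these steps is to try importing them. This is just a sketch: missing_modules is a hypothetical helper, and the list only contains the packages named in this thread ("osgeo" is the import name of the GDAL Python bindings).

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Packages mentioned in this thread; adjust the list to your own setup.
REQUIRED = ["numpy", "scipy", "osgeo", "pypar", "anuga"]

print("Missing modules: %s" % missing_modules(REQUIRED))
```

If pypar shows up as missing, the parallel part of the installation is not in place yet, whatever ANUGA_PARALLEL was set to.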
Let me know if you have problems.
Cheers
Steve
==============================
Stephen Roberts
Undergraduate Convenor
Mathematical Sciences Institute
Room 1176 John Dedman Building #27
The Australian National University
Canberra, ACT 2601 AUSTRALIA
Ph: +61 2 61254445
CRICOS: 00120C
…________________________________
From: TejuJ <notifications@github.com>
Sent: Monday, 12 June 2017 2:58:19 PM
To: GeoscienceAustralia/anuga_core
Cc: Stephen Roberts; Comment
Subject: Re: [GeoscienceAustralia/anuga_core] anuga parallel reducing Geospatial data reading timing (#120)
Hello Steve
I installed anuga on CentOS using the following commands
    export ANUGA_PARALLEL="openmpi"
    python setup.py build_ext --inplace
    python setup.py install
    python runtests.py
But when I run a parallel program, it executes serially.
What changes do I have to make for a parallel installation?
Regards,
Tejaswi
|
Thank you. I installed successfully but running parallel program getting following error -bash-4.1$ mpirun -np 4 python run_parallel_sw_merimbula.py |
Hi Tejaswi,
I'm not sure what the problem is.
You should check that mpirun is working. Try something simple like
    mpirun -np 4 pwd
Then you should check whether pypar is working.
Go to your pypar/tests directory and try
    mpirun -np 2 python test_pypar.py
I actually get an error running this, something to do with status. But you should be able to see whether pypar has been installed correctly.
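If the pypar test suite itself is giving trouble, an even smaller check is to print the rank and size each process sees. This is a sketch under assumptions: it uses pypar.rank(), pypar.size() and pypar.finalize(), falls back to serial values if pypar cannot be imported, and the script name check_pypar.py used below is made up.

```python
try:
    import pypar
except ImportError:
    pypar = None  # pypar missing: fall back to serial behaviour


def mpi_rank_and_size():
    """Return (rank, size) from pypar, or (0, 1) if pypar is unavailable."""
    if pypar is None:
        return 0, 1
    return pypar.rank(), pypar.size()


rank, size = mpi_rank_and_size()
print("process %d of %d" % (rank, size))

if pypar is not None:
    pypar.finalize()  # shut MPI down cleanly before exiting
```

Saved as check_pypar.py and started with mpirun -np 4 python check_pypar.py, a working setup should report four different ranks. If every line says "process 0 of 1", that would suggest pypar is missing or not linked against the MPI that mpirun belongs to, which would match a parallel script running serially.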
What results do you get?
Cheers
Steve
…________________________________
From: TejuJ <notifications@github.com>
Sent: Tuesday, 13 June 2017 3:08:02 PM
To: GeoscienceAustralia/anuga_core
Cc: Stephen Roberts; Comment
Subject: Re: [GeoscienceAustralia/anuga_core] anuga parallel reducing Geospatial data reading timing (#120)
Thank you. The installation succeeded, but when running a parallel program I get the following error
    -bash-4.1$ mpirun -np 4 python run_parallel_sw_merimbula.py
    Attempting to use an MPI routine before initializing MPI
|
Hi Steve |
Hi Tejaswi,
You might find it useful to join the mailing list at
https://lists.sourceforge.net/lists/listinfo/anuga-user
to get help from more people.
As for your particular error: do you also get it when not running in parallel?
The error message, though cryptic, seems to indicate that the input boundary data has a duplicate entry. I suggest you look at your boundary definition. If I get time I might look at the procedure to see if I can provide more info about the error.
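One thing that could be checked by hand is whether the bounding polygon repeats a point or an edge. The sketch below is an assumption about a possible cause, not a diagnosis of this particular script, and duplicate_segments is a hypothetical helper rather than part of anuga.

```python
def duplicate_segments(polygon):
    """Return segments that occur more than once in a closed polygon.

    polygon -- list of (x, y) vertices; the last vertex is joined to the first.
    Segments are compared regardless of direction, and zero-length segments
    (a vertex repeated back to back) are reported as duplicates too.
    """
    seen = set()
    duplicates = []
    n = len(polygon)
    for i in range(n):
        a = tuple(polygon[i])
        b = tuple(polygon[(i + 1) % n])
        key = (min(a, b), max(a, b))  # direction-independent segment key
        if a == b or key in seen:
            duplicates.append((a, b))
        seen.add(key)
    return duplicates
```

For example, a polygon whose last point repeats the first one produces a zero-length closing segment, which this check reports.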
Cheers
Steve
…________________________________
From: TejuJ <notifications@github.com>
Sent: Wednesday, 14 June 2017 10:25:00 PM
To: GeoscienceAustralia/anuga_core
Cc: Stephen Roberts; Comment
Subject: Re: [GeoscienceAustralia/anuga_core] anuga parallel reducing Geospatial data reading timing (#120)
Hi Steve
I got that error because of the pypar installation. Now that it is installed, the sample program runs properly, but when running my own program I get the following error
Domain: Initialising
Pmesh_to_Domain: Initialising
Traceback (most recent call last):
  File "sikkim_whole_parallel.py", line 34, in <module>
    verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/extras.py", line 162, in create_domain_from_regions
    args, kwargs)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/extras.py", line 205, in _create_domain_from_regions
    domain = Domain(mesh_filename, use_cache=False, verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/shallow_water/shallow_water_domain.py", line 218, in __init__
    ghost_layer_width=ghost_layer_width)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/generic_domain.py", line 97, in __init__
    verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 101, in pmesh_to_domain
    result = apply(_pmesh_to_domain, (file_name, mesh_instance))
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 136, in _pmesh_to_domain
    tag_dict = pmesh_dict_to_tag_dict(mesh_dict)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 215, in pmesh_dict_to_tag_dict
    tag_dict = build_boundary_dictionary(triangles, segments, segment_tags, tag_dict)
RuntimeError: pmesh2domain.c: build_boundary_dictionary Duplicate segments
Thanks.
|