anuga parallel reducing Geospatial data reading timing #120

TejaswiRJadhav · 2017-02-02T05:09:38Z

Hello,
I am using anuga for model simulation in my project. My data runs successfully in both serial and parallel (4 cores), but the timings in the output differ when I compare them.
I am using a DEM and creating the domain with anuga.create_domain_from_regions.

Geospatial_data: Reading block takes 24 mins in serial and 36 mins in parallel (4 cores).

Why is there such a large difference, and what actually happens while the Geospatial data is being read?
I have looked at geospatial_data.py in the anuga directory.

Can anyone help me reduce the data reading time in parallel?

Regards,
Tejaswi

stoiver · 2017-02-02T05:28:52Z

Hi TejuJ,

Please send a copy of your python script so I can see what you are doing. My guess is that you are actually producing 4 serial domains with the anuga.create_domain_from_regions command. That command should be inside an if statement:

    if anuga.myid == 0:
        domain = anuga.create_domain_from_regions( )
    else:
        domain = None

Cheers
Steve
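The effect of that guard can be illustrated without MPI. The following is only a minimal sketch: `create_domain_on_root` and `expensive_build` are hypothetical names, the dictionary is a stand-in for a real domain object, and the four ranks are simulated in a single process. In a real script `myid` would be `anuga.myid` and the builder would wrap `anuga.create_domain_from_regions`. ANUGA's parallel example scripts then hand the sequential domain to a call such as `anuga.distribute(domain)`, but treat that as an assumption here, since the thread does not show it.

```python
def create_domain_on_root(myid, build_domain):
    """Call build_domain() on rank 0 only; every other rank gets None."""
    if myid == 0:
        return build_domain()
    return None


# Record every time the (expensive) sequential build actually runs.
calls = []


def expensive_build():
    # Hypothetical stand-in for anuga.create_domain_from_regions(...)
    calls.append("built")
    return {"name": "sequential domain"}


# Simulate 4 MPI ranks inside one process.
domains = [create_domain_on_root(rank, expensive_build) for rank in range(4)]

print("number of builds:", len(calls))
print("ranks holding a domain:", [d is not None for d in domains])
```

Without the guard, each of the 4 processes would run the full sequential setup, including the geospatial data read, which is consistent with the parallel run being no faster than the serial one.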


--
Stephen Roberts
Undergraduate Convenor, Mathematical Sciences Institute
Room 2005, P.A.P. Moran Building #26B
The Australian National University
Canberra, ACT 2601, AUSTRALIA
Ph: +61 2 61254445
CRICOS: 00120C

stoiver · 2017-02-02T23:09:06Z

Hi Tejaswi,

It looks like you are doing the right thing. I am surprised that the initial setup of the sequential domain is taking so much longer when run in parallel. You might want to check the processes running (I would use htop on ubuntu) to see if there is another process running. How many cores does your computer have? I have observed that you only get speedups up to the physical number of cores.

By the way, as you have a dem file you should be able to use that (instead of the pts file) to set up the quantity, i.e. something like this should work

    domain.set_quantity('elevation', filename=name_stem + '.dem', use_cache=cache, verbose=verbose)

instead of

    domain.set_quantity('elevation', filename=name_stem + '.pts', use_cache=cache, verbose=verbose, alpha=0.1)

You do have to be careful that the dem file covers your computational domain (otherwise NaN can be applied to the quantity).

Cheers
Steve

On 02/02/17 17:05, TejuJ wrote:
Here is the attachment, parallel.txt <https://github.com/GeoscienceAustralia/anuga_core/files/746933/parallel.txt>. I created the domain exactly like that. Thank you for the reply.
Tejaswi

TejaswiRJadhav · 2017-02-03T11:10:43Z

Hi Steve,
Giving the .dem as input worked for me. But why does the process take extra time when we give a pts file as input? I am using bash on Windows 10 and analysing with Task Manager.
One more question: I now understand that alpha is for surface smoothness. What does the value we pass (alpha=0.1) mean, and how do we decide what to pass there?

Thanks

Tejaswi

stoiver · 2017-02-03T14:30:38Z

Hi Tejaswi,

If we have a dem file, the data is defined on a uniform grid, and so it is quick to interpolate the elevation data onto the centroids or vertices of the unstructured triangulation.

If we have a pts file then we don't assume any structure to the data set. So we have the problem of interpolating one set of unstructured data onto an unstructured triangulation. The most time consuming part of pts data fitting is the process of finding which data points are within which triangles (this uses a quad tree algorithm).

It may happen that there are many triangles that have no elevation data in or close to them. The alpha is used to control the smoothness of the function filling in the regions with holes in the data.

Cheers
Steve
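The cost difference Steve describes can be sketched in a few lines. This is an illustration only, not ANUGA's code: all function names are hypothetical, and a plain linear scan stands in for the quad tree search mentioned above. On a uniform grid the containing cell follows from arithmetic; for unstructured data, each point has to be tested against candidate triangles.

```python
import math


def grid_cell(x, y, x0, y0, cellsize):
    """Uniform grid (dem-like): the containing cell is found by arithmetic alone."""
    col = int(math.floor((x - x0) / cellsize))
    row = int(math.floor((y - y0) / cellsize))
    return col, row


def point_in_triangle(p, a, b, c):
    """True if point p lies inside (or on an edge of) triangle abc."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # Inside when p is on the same side of all three edges.
    return not (has_neg and has_pos)


def find_triangle(p, triangles):
    """Unstructured data (pts-like): test triangles one by one until one contains p."""
    for index, (a, b, c) in enumerate(triangles):
        if point_in_triangle(p, a, b, c):
            return index
    return None


# Two triangles covering the unit square (hypothetical mesh).
triangles = [((0, 0), (1, 0), (0, 1)), ((1, 0), (1, 1), (0, 1))]

cell = grid_cell(2.5, 7.2, x0=0.0, y0=0.0, cellsize=1.0)
tri = find_triangle((0.8, 0.8), triangles)
print("grid cell:", cell)
print("containing triangle:", tri)
```

A spatial index such as a quad tree reduces the number of triangles tested per point, but the search still has to be done for every data point, whereas the grid lookup needs no search at all.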


TejaswiRJadhav · 2017-02-23T05:11:32Z

Hi Steve,
Thanks, I now understand why serial takes less time than parallel. One more question: can I install anuga on CentOS, and if so, what changes do I have to make?

Regards,
Tejaswi

stoiver · 2017-02-25T02:08:38Z

Hi TejuJ,

I have installed anuga on a CentOS system. I first had to ensure all of the dependencies were installed (which you can deduce from the install_ubuntu.sh script). This is essentially python2 with numpy and scipy, gdal, and openmpi or mpich(2). You need to set up your default compiler to be compatible with these packages. Then you should be able to set up anuga.

I set up anuga to run "inplace", as I didn't have admin rights on the CentOS system I was using. In this case you need to set PYTHONPATH to point to the location of your anuga_core directory, and run

    python setup.py build_ext --inplace

to compile the appropriate anuga C programs. You should then be able to test as normal via

    python runtests.py

Cheers
Steve


TejaswiRJadhav · 2017-02-27T06:08:17Z

Thanks Steve. I'll try it and post back if there are any other errors.

TejaswiRJadhav · 2017-06-12T04:58:18Z

Hello Steve,
I installed anuga on CentOS using the following commands:
export ANUGA_PARALLEL="openmpi"
python setup.py build_ext --inplace
python setup.py install
python runtests.py
But when I run a parallel program, it executes serially.

What changes do I have to make for a parallel installation?
I ran test_all.py, skipping the parallel package.

Regards,
Tejaswi

stoiver · 2017-06-12T22:14:33Z

Hi Tejaswi,

The environment variable ANUGA_PARALLEL is only used by the installation script tools/install_ubuntu.sh.

What you need to do is set up openmpi first and then install anuga. Look in the script install_ubuntu.sh. First it checks whether MPI is already installed. Then it uses apt-get to install the preferred MPI (openmpi in your case). So you need to install openmpi manually. Then you need to install pypar; follow the code in the shell script. And then install anuga.

If you want to install inplace then you will need to run

    python setup.py build_ext --inplace

But you will also have to set up your PYTHONPATH to point to your anuga_core directory:

    export PYTHONPATH=/path/to/anuga_core:$PYTHONPATH

(you should stick that in your .bashrc file).

Running

    python setup.py install

will rebuild and place anuga in the system site-packages directory (or perhaps your local site-packages).

Let me know if you have problems.

Cheers
Steve


TejaswiRJadhav · 2017-06-13T05:08:01Z

Thank you. I installed it successfully, but when running a parallel program I get the following error:

-bash-4.1$ mpirun -np 4 python run_parallel_sw_merimbula.py
Attempting to use an MPI routine before initializing MPI

stoiver · 2017-06-13T07:25:01Z

Hi Tejaswi,

I'm not sure what the problem is. You should check that mpirun is working. Try something simple like

    mpirun -np 4 pwd

Then you should see if pypar is working. Go to your pypar/tests directory and try

    mpirun -np 2 test_pypar.py

I actually get an error running this, something to do with status. But you should see whether pypar has been installed correctly. What results do you get?

Cheers
Steve
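The two checks Steve suggests (is the MPI launcher available, is pypar installed) can also be done from Python before launching anything under mpirun. This is a minimal sketch, not part of ANUGA: `check_parallel_setup` is a hypothetical helper, and it assumes Python 3 (`shutil.which` and `importlib.util.find_spec` are not available in the python2 setup discussed in this thread). It only checks presence; it does not confirm that pypar was built against the same MPI as the launcher.

```python
import importlib.util
import shutil


def check_parallel_setup(launcher="mpirun", binding="pypar"):
    """Report whether an MPI launcher is on PATH and a Python MPI binding is importable."""
    return {
        "launcher_on_path": shutil.which(launcher) is not None,
        "binding_importable": importlib.util.find_spec(binding) is not None,
    }


report = check_parallel_setup()
for name, ok in report.items():
    print(name, "OK" if ok else "MISSING")
```

If either entry is reported as MISSING, the parallel script cannot initialise MPI correctly, so that is worth fixing before looking at the script itself.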


TejaswiRJadhav · 2017-06-14T12:24:58Z

Hi Steve
Thanks. I installed anuga successfully on centos.

stoiver · 2017-06-15T23:24:13Z

Hi Tejaswi,

You might find it useful to join the mailing list at https://lists.sourceforge.net/lists/listinfo/anuga-user to get help from more people.

As for your particular error: are you also getting it when not running in parallel? The error message, though cryptic, seems to indicate that the input boundary data has a duplicate entry. I suggest looking at your boundary definition. If I get time I might look at the procedure to see if I can provide more info about the error.

Cheers
Steve

On Wednesday, 14 June 2017 10:25:00 PM, TejuJ wrote:

Hi Steve
That error was caused by the pypar installation. Now that I have installed it, the sample program runs properly, but when running my own program I get the following error:

Domain: Initialising
Pmesh_to_Domain: Initialising
Traceback (most recent call last):
  File "sikkim_whole_parallel.py", line 34, in <module>
    verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/extras.py", line 162, in create_domain_from_regions
    args, kwargs)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/extras.py", line 205, in _create_domain_from_regions
    domain = Domain(mesh_filename, use_cache=False, verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/shallow_water/shallow_water_domain.py", line 218, in __init__
    ghost_layer_width=ghost_layer_width)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/generic_domain.py", line 97, in __init__
    verbose=verbose)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 101, in pmesh_to_domain
    result = apply(_pmesh_to_domain, (file_name, mesh_instance))
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 136, in _pmesh_to_domain
    tag_dict = pmesh_dict_to_tag_dict(mesh_dict)
  File "/home/internal/geomatics/sivakumar/anuga_installation/python27/lib/python2.7/site-packages/anuga/abstract_2d_finite_volumes/pmesh2domain.py", line 215, in pmesh_dict_to_tag_dict
    tag_dict = build_boundary_dictionary(triangles, segments, segment_tags, tag_dict)
RuntimeError: pmesh2domain.c: build_boundary_dictionary Duplicate segments

Thanks.
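Steve's reading of the "Duplicate segments" error is that the boundary input contains a duplicate entry. One way to look for that before building the domain is sketched below. This is not the check that pmesh2domain.c performs: `find_duplicate_segments` is a hypothetical helper, and it assumes the boundary is available as a list of segments, each a pair of vertex indices, with a segment and its reverse counted as the same edge.

```python
from collections import Counter


def find_duplicate_segments(segments):
    """Return segments that occur more than once, ignoring direction: (a, b) == (b, a)."""
    counts = Counter(tuple(sorted(segment)) for segment in segments)
    return sorted(segment for segment, n in counts.items() if n > 1)


# Hypothetical boundary: a square whose first edge was accidentally listed twice,
# the second time in the opposite direction.
segments = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 0)]

duplicates = find_duplicate_segments(segments)
print("duplicate segments:", duplicates)
```

An empty result would suggest the duplicate is introduced elsewhere, for example during mesh generation, rather than in the boundary definition itself.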

stoiver closed this as completed Sep 14, 2017
