Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r.mfilter: implement parallelization with OpenMP #1708

Merged
merged 15 commits into from
Jan 12, 2022

Conversation

aaronsms
Copy link
Contributor

@aaronsms aaronsms commented Jul 4, 2021

Checklists before merging:

  • code review
  • CI passes
  • performance section in documentation
  • confirm values in test are from the old version (run new tests with old code)
  • run tests without openmp
  • visual check of results with custom data ("looks good" with non NC SPM dataset)
  • check that it works with really large data (16B cells)
  • run multi-core benchmark (no degraded performance with many threads)
  • run one core benchmark on many resolutions or many cell
  • rebase to main

@aaronsms
Copy link
Contributor Author

aaronsms commented Jul 5, 2021

The previous workflow run has all passsing checks, except for OSGeo4W, where the download link is no longer available.

@wenzeslaus wenzeslaus added this to In progress in Raster Parallelization GSoC 2021 via automation Jul 6, 2021
@wenzeslaus wenzeslaus added enhancement New feature or request gsoc Reserved for Google Summer of Code student(s) labels Jul 6, 2021
Copy link
Contributor

@marisn marisn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks OK. I only could not run some tests as I don't have "slope" map – would be nice to replace with one present in the basic version of NC location.

filter = self.create_filter(self.filter_options["sequential"])
self.assertModule(
"r.mfilter",
input="slope",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My version of nc_basic_spm_grass7 location does not have a slope map. Can it be replaced with something else to allow testing on a basic location?

Copy link
Member

@HuidaeCho HuidaeCho Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronsms Slope is missing in the basic location. It can be a good idea to use the basic location for writing test scripts.

@marisn Any specific reasons why you don't (cannot?) use the full nc_spm_08_grass7?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will replace 'slope' with 'lakes'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not elevation?

@petrasovaa
Copy link
Contributor

I tested it with bigger data and it crashes with Error reading temporary file. I run it on a slope map (derived from US 30m DEM) and it shows up with region with
16.832.104.560 cells and nprocs=1 and with 900.000.000 cells and nprocs=2. I ran a benchmark with this data successfully for 1.000.000 cells.
I also ran it with the original version of r.mfilter and that works fine.

@HuidaeCho
Copy link
Member

16.832.104.560 cells and nprocs=1

@aaronsms Please check nprocs=1 code again. In this case, I think we need to revert back to the original behavior for nprocs=1. I thought we're not creating more tempfiles for r.mfilter, but are they bigger now even for nrpcos=1?

@petrasovaa
Copy link
Contributor

Also, I ran it with valgrind and it gave me one error, and although it didn't look too related as far as I remember, it's worth checking that.

@aaronsms
Copy link
Contributor Author

aaronsms commented Jul 14, 2021

I thought we're not creating more tempfiles for r.mfilter, but are they bigger now even for npcos=1?

For r.mfilter, we are not changing the default behavior. The original one uses a temporary file as buffer as well.

it shows up with region with
16.832.104.560 cells and nprocs=1 and with 900.000.000 cells and nprocs=2

@petrasovaa, can I clarify that whether you meant, it works with the mentioned two cases or, you achieve different results with (nprocs = 1 vs = 2)? Also, may I also ask at what cell count did it crash with the Error reading temporary files? Thanks

@aaronsms
Copy link
Contributor Author

Currently, I have found some issues when grass is compiled without OpenMP, will follow up soon. But I haven't found issues with the one compiled with OpenMP.

@petrasovaa
Copy link
Contributor

@petrasovaa, can I clarify that whether you meant, it works with the mentioned two cases or, you achieve different results with (nprocs = 1 vs = 2)? Also, may I also ask at what cell count did it crash with the Error reading temporary files? Thanks

16.832.104.560 cells and nprocs=1
and
900.000.000 cells and nprocs=2
failed with the Error reading temporary files

With just 1.000.000 cells everything worked, although I haven't looked at the resulting raster closely.

@aaronsms
Copy link
Contributor Author

@petrasovaa I have solved the overflow issues, so it should now work for larger raster map as per previous behavior.

@petrasovaa
Copy link
Contributor

Here is my benchmark on a slope map of 16.000.000.000 cells and 5x5 filter, only 3 repeats, on a machine with 28 cores:
results

@petrasovaa
Copy link
Contributor

petrasovaa commented Jul 24, 2021

Another benchmark, this time with larger filter (15x15), the rest is the same (slope map of 16.000.000.000 cells, 3 repeats, on a machine with 28 cores):
results15_2

@aaronsms
Copy link
Contributor Author

Another benchmark, this time with larger filter (15x15), the rest is the same (slope map of 16.000.000.000 cells, 3 repeats, on a machine with 28 cores):
results15_2

Hi @petrasovaa, I was wondering if I can make use of this benchmark result for the twitter post to show the result on large map? Thanks.

@HuidaeCho HuidaeCho added the raster Related to raster data processing label Aug 1, 2021
@aaronsms aaronsms changed the title r.mfilter: Add parallel support and test cases r.mfilter: implement parallelization with OpenMP Aug 4, 2021
@wenzeslaus wenzeslaus added this to the 8.2.0 milestone Aug 24, 2021
@petrasovaa
Copy link
Contributor

Question, I can see that in PR #1724 (r.neighbors) you use memory option but not here. Also I can see huge temporary file being created when I run this on large data. @aaronsms are there any improvements in the r.neighbors PR that would be relevant here?

@aaronsms
Copy link
Contributor Author

aaronsms commented Jan 7, 2022

Question, I can see that in PR #1724 (r.neighbors) you use memory option but not here. Also I can see huge temporary file being created when I run this on large data. @aaronsms are there any improvements in the r.neighbors PR that would be relevant here?

For r.mfilter, the original behavior is we use at most 2 temp files for processes requiring 2 or more sequential filters. This PR maintains that behavior of using temp files.

I think there can be improvements from r.neighbor that would be relevant here (similar approach but will change the current behavior), e.g., if the users want to use memory instead of disk. For example, if a chunk of memory is used, then we can propagate from start all the way to finish, before moving on to the next chunk. Could be worth exploring, then it would remove the need for temp files altogether.

@petrasovaa
Copy link
Contributor

Thanks for replying, now I remember I ran into this before. Since this was the original behavior, I plan to merge this as it is now.

@petrasovaa
Copy link
Contributor

Tested again, with 16.000.000.000 cells (US digital elevation model), 2 different window sizes, 3 repeats:
rmfilter_DEM

This is result of the benchmark from this PR:

rmfilter_benchmark

@petrasovaa
Copy link
Contributor

I updated manual page with performance section, not sure if there is something else that needs to go in there.
It would be useful to have a keyword for parallel modules, I think this was even discussed, not sure if there was any conclusion. I suggest parallel, but I am open to other options.

@neteler
Copy link
Member

neteler commented Jan 10, 2022

It would be useful to have a keyword for parallel modules, I think this was even discussed, not sure if there was any conclusion. I suggest parallel, but I am open to other options.

Very good idea! Another keyword might be multicore but parallel sounds better to me. Others?

@nilason
Copy link
Contributor

nilason commented Jan 10, 2022

parallel

👍

@petrasovaa petrasovaa merged commit 829768d into OSGeo:main Jan 12, 2022
Raster Parallelization GSoC 2021 automation moved this from In progress to Done Jan 12, 2022
ninsbl pushed a commit to ninsbl/grass that referenced this pull request Oct 26, 2022
Co-authored-by: Anna Petrasova <kratochanna@gmail.com>
ninsbl pushed a commit to ninsbl/grass that referenced this pull request Feb 17, 2023
Co-authored-by: Anna Petrasova <kratochanna@gmail.com>
neteler pushed a commit to nilason/grass that referenced this pull request Nov 7, 2023
Co-authored-by: Anna Petrasova <kratochanna@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request gsoc Reserved for Google Summer of Code student(s) raster Related to raster data processing
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

7 participants