-
-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow larger regions in r.terraflow #31
Conversation
…ovided by dnewcomb
This is exactly the change where a test of computation correctness based on previous results would be appropriate and more than welcome. Would you or somebody else be willing to add one? https://grass.osgeo.org/grass76/manuals/libpython/gunittest_testing.html |
Very reasonable. ToDo:
A test of computational correctness would of course only check if the change affects behavior within current limits. |
(At least) two options for a synthetic DEM:
|
raster/r.terraflow/3scan.h
Outdated
@@ -123,7 +123,7 @@ scan3line(FUN &funobj, | |||
a[2]=*tmp; | |||
} else { | |||
a[2] = nodata; | |||
assert(ae == AMI_ERROR_END_OF_STREAM); | |||
assert((off_t)ae == AMI_ERROR_END_OF_STREAM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why (off_t)ae? ae and AMI_ERROR_END_OF_STREAM are of the same type AMI_err. Applies also to L140
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I just picked it up / copied it from https://trac.osgeo.org/grass/ticket/1421
cause I am no C/C++ programmer.
What should be done?
I can test if this change is not necessary....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing currently on a bigger box with 65kx50k pixels. Without (off_t) I get:
r.terraflow: 3scan.h:140: void scan3line(FUN&, AMI_STREAM<T>*, AMI_STREAM<T>*, AMI_STREAM<T>*, BASETYPE, dimension_type) [with T = float; BASETYPE = float; FUN = detectEdgeNodata; dimension_type = int]: Assertion
ae == AMI_ERROR_END_OF_STREAM' failed.`
Changing back to with (off_t) did not help, unfortunately. So, I also changed back from int to long (original patch in 1421). But now I get:
Header of raster map <DTM_1M_full> compatible with region header
Elevation stored as FLOAT (4B)
Region size is 68385 x 56176
STREAM temporary files in </home/shared/tmp>. THESE INTERMEDIATE STREAMS
WILL NOT BE DELETED IN CASE OF ABNORMAL TERMINATION OF THE PROGRAM. TO SAVE
SPACE PLEASE DELETE THESE FILES MANUALLY!
Memory manager registering memory in MM_WARN_ON_MEMORY_EXCEEDED mode.
Reading data from <DTM_1M_full> to stream </home/shared/tmp/STREAM_gaaaaX>
Reading input data...
100%
EMPQUEUEADAPTIVE: starting in-memory pqueue
EMPQUEUEADAPTIVE: available memory: 100000MB
EMPQUEUEADAPTIVE: desired memory: 100000MB
sz_stream: 270400 buf_arity: 200 mm_overhead: 8666496 mm_avail: 104857600000.
EMPQUEUEADAPTIVE: memory overhead set to 8.26501MB
EMPQUEUEADAPTIVE: pqsize set to 13106116688
EMPQUEUEADAPTIVE: starting in-memory pqueue
EMPQUEUEADAPTIVE: available memory: 100000MB
EMPQUEUEADAPTIVE: desired memory: 100000MB
sz_stream: 270400 buf_arity: 200 mm_overhead: 8666496 mm_avail: 104857600000.
EMPQUEUEADAPTIVE: memory overhead set to 8.26501MB
EMPQUEUEADAPTIVE: pqsize set to 13106116688
EMPQUEUEADAPTIVE: starting in-memory pqueue
EMPQUEUEADAPTIVE: available memory: 100000MB
EMPQUEUEADAPTIVE: desired memory: 100000MB
sz_stream: 270400 buf_arity: 200 mm_overhead: 8666496 mm_avail: 104857600000.
EMPQUEUEADAPTIVE: memory overhead set to 8.26501MB
EMPQUEUEADAPTIVE: pqsize set to 13106116688
EMPQUEUEADAPTIVE: starting in-memory pqueue
EMPQUEUEADAPTIVE: available memory: 100000MB
EMPQUEUEADAPTIVE: desired memory: 100000MB
sz_stream: 270424 buf_arity: 200 mm_overhead: 8705664 mm_avail: 104857600000.
EMPQUEUEADAPTIVE: memory overhead set to 8.30237MB
EMPQUEUEADAPTIVE: pqsize set to 3276527948
terminate called after throwing an instance of 'std::bad_array_new_length'
what(): std::bad_array_new_length
Any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Little update:
Now I double checked that changes are exactly as in the original ticket, ran make distclean
and recompiled with debugging flags, requested debugging output (g.gisenv set="DEBUG=3"
) and ran
r.terraflow --overwrite --verbose elevation=DTM_1M_full@p_CulvertFragmentation_stefan.blumentrath filled=DEM_1M_flooded memory=30000 directory=/home/shared/tmp/ stats=/home/shared/tmp/terra_stats.txt &> /home/shared/tmp/dbg.txt
with less RAM assigned to the process (30GB instead of 100GB).
For the time being it seems to succeed beyond where it crashed in the earlier try...
Once I have a successful run, we can try to reintroduce the requested changes in this PR and see if it still works (meaning if the issue is the amount of memory and not long vs. int).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, the command above succeeded. Next step will be to check if the issue was caused by memory...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(off_t) in 3scan.h does not seem to be needed. However, int in types.h does not work on LINUX while long does. Would it be a solution to define dimaension_type platform dependent as long on LINUX and long long (?) on Windows?
Error message with INT in types.h
Reading input data...
100%
D3/3: nrows=60012 ncols=45010 stream_len()=2701140120
D1/3: total elements=2701140120, nodata elements=90019
D1/3: largest temporary files:
D1/3: FILL: 150.94G (162068407200) [2701140120 elements, 60B each]
D1/3: FLOW: -133889044380 [2701050101 elements, 84B each]
D1/3: Will need at least 301.88G (324136814400) space available in /data/grassdata/ETRS_33N/culverts/.tmp/localhost/93228.0
r.terraflow: 3scan.h:140: void scan3line(FUN&, AMI_STREAM<T>*, AMI_STREAM<T>*, AMI_STREAM<T>*, BASETYPE, dimension_type) [with T = float; BASETYPE = float; FUN = detectEdgeNodata; dimension_type = int]: Assertion `ae == AMI_ERROR_END_OF_STREAM' failed.
Aborted (core dumped)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reason why int as dimension_type fails is integer overflow because of missing/incorrect type casts: at various places the number of cells is computed as nrows * ncols, but in this case nrows * ncols = 2701140120, larger than INT_MAX = 2147483647. E.g.
D1/3: FLOW: -133889044380 [2701050101 elements, 84B each]
is a negative number because of integer overflow. The fix would be in main.cpp
- long long flowmaxsize = (long long)(nrows*ncols - nodata_count)*sizeof(sweepItem);
+ long long flowmaxsize = ((long long)nrows*ncols - nodata_count)*sizeof(sweepItem);
long as dimension_type avoids these integer overflows, but GRASS uses int as type for the number of rows and columns. The real fix would thus be to fix all occurrences where nrows * ncols
or equivalent is not properly converted to a 64 bit integer type. r.terraflow (and the iostream lib) currently use long as 64 bit integer type which is not guaranteed to be 64 bit, sometimes it is 32 bit and thus the code is not portable.
In contrast to other GRASS modules, r.terraflow was not designed to work with more than 2 billion grid cells.
I suggest to use int as dimension_type and fix any resultant integer overflows.
Regarding tests, a one-time test with a large region where the number of rows and columns exceed SHORT_MAX should be ok. |
Tests added based on original version of r.terraflow (without the changes in this PR). Those tests succeed also with the latest changes modified based on the review (@metzm , thanks for taking the time!) |
Still changes requested? |
replaced by #265 |
see https://trac.osgeo.org/grass/ticket/1421
provided by dnewcomb
Tested with 40k x 50k region (though not fully on my limited box).
See e.g. http://osgeo-org.1560.x6.nabble.com/r-terraflow-on-massive-DEM-td5181062.html