r.fillnulls: avoid locale-dependent stderr parsing#6930
r.fillnulls: avoid locale-dependent stderr parsing#6930Abhi-d-gr8 wants to merge 12 commits intoOSGeo:mainfrom
Conversation
bfd66bd to
0881726
Compare
|
Please add a test for it. |
|
Thanks for the clarification @marisn , and apologies for missing the test initially |
|
Added a regression test (test_no_nulls) covering the no-NULL input case to ensure r.fillnulls exits successfully without relying on stderr parsing. CI is running. |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
wenzeslaus
left a comment
There was a problem hiding this comment.
I'm wondering if the right solution to the issue is modifying the underlying C tool rather than changing the Python code.
scripts/r.fillnulls/r.fillnulls.py
Outdated
|
|
||
| def raster_has_nulls(raster): | ||
| """Return True if raster has any NULL cells.""" | ||
| stats = gs.parse_command("r.univar", flags="g", map=raster) |
There was a problem hiding this comment.
Indeed. As it requires a full scan (and probably resampling too!) of all computational region, it adds a nasty speed penalty. Then better to keep existing behavior as it works and is reasonably fast. I used r.fillnulls in a pipeline of over 11TB of data – every second counts.
|
Hi ! @wenzeslaus I agree though that fixing this at the source in the underlying C module could be a cleaner long-term solution and benefit other callers as well. I also see your point about the extra I’m still getting familiar with the codebase, so I’m very open to guidance here. If you think adjusting the C tool is the better direction, I’m happy to explore that or rework this PR accordingly. Just let me know what you’d prefer. |
A possible solution might be to change the exit code here: grass/raster/r.resamp.bspline/main.c Line 459 in b6eba06 Exit code for The new exit code can then be checked for in the test. |
Better still, set global |
|
Hi ! @nilason and @wenzeslaus Update: I reworked the fix based on the review feedback and pushed new commits.
Could you please take another look when you get a chance? ... As well as thank you for the insights and happy to work if there are any follow ups or any other issues you think I can work upon |
With |
Wrong topic. Discussion should be: "is it an error to call |
You might be right, but that’s for another PR. |
|
Thanks @nilason and @marisn, this discussion helped me understand the underlying concern much better. I see your point now: the real question isn’t how to detect the no-NULL case, but whether calling r.resamp.bspline -n on a raster without NULLs should be treated as an error at all. From a user point of view it feels like a no-op, but I also get that this touches older design decisions and expected behaviour. For this PR, I don’t want to reopen or change that broader behaviour unexpectedly. So I’m totally fine with keeping G_fatal_error() and not changing the success/failure contract of r.resamp.bspline. If I’m understanding correctly, the preferred approach here would be:
Or alternatively would you like me to go @wenzeslaus :
Please let me know if that sounds right ... I’ll adjust the PR accordingly and if the “no-NULL should just copy and warn” idea is better handled as a separate follow-up PR, I’m happy to leave it out of this one. Thanks again for the guidance. |
Yes, that what I'd recommend for now. Instead of: grass/scripts/r.fillnulls/r.fillnulls.py Line 593 in 9a994f4 check for errno.EINVAL. (Current code actually does the copy in |
|
Hi! @nilason , |
|
This can't work, you have to set errno in raster/r.resamp.bspline/main.c (and revert the changes there too). |
|
My bad ... I had some extra changes in main.c earlier which made this unclear. I’ve now reverted the extra bits and kept only the minimal errno = EINVAL before G_fatal_error(). |
No worries, but make sure to test locally before push to save precious CI (and maintainer) time. |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
|
Looks good to me now, thanks! |
|
The test is failing: |
|
I see the test failure ... it’s coming from assertRasterFitsUnivar() being called without a precision in test_no_nulls, which leads to float(None) inside values_equal(). I’ll push a small fix adding an explicit precision for the null_cells check (doing precision=0) and rerun. |
@petrasovaa Even I tried to trace back where the precision gets defined/used, and couldn’t even answer myself if 0 made sense here. from https://grass.osgeo.org/grass-devel/manuals/libpython/gunittest.html#gunittest.case.TestCase.assertRasterFitsUnivar, https://grass.osgeo.org/grass-devel/manuals/libpython/_modules/gunittest/case.html#TestCase.assertRasterFitsUnivar, That is passed down to keyvalue_equals https://grass.osgeo.org/grass-devel/manuals/libpython/_modules/gunittest/checkers.html#keyvalue_equals that uses values_equal (probably in an unsafe way the way the default is in the signature if it could be mutable, I’m not sure for functions, but lists and dicts are immutable), https://grass.osgeo.org/grass-devel/manuals/libpython/_modules/gunittest/checkers.html#values_equal In there, I’m not sure if the default value is still 0.000001 when it gets passed grass/python/grass/gunittest/testsuite/test_checkers.py Lines 30 to 70 in f258881 |
|
Thanks @echoix for digging into this and confirming the root cause ... I agree about the explanation of the precision you are mentioning . Though from my side, I don’t think there’s anything left to adjust in this PR now. Happy to adjust anything further if you or @nilason or @petrasovaa think something should be handled differently. |
| self.assertRasterFitsUnivar( | ||
| raster=self.mapComplete, | ||
| reference={"null_cells": float(0)}, | ||
| precision=0, |
There was a problem hiding this comment.
I don't think precision 0 is appropriate. It would mean a full unit of a number off would still pass. Is it that you had in mind?
There was a problem hiding this comment.
Thank you for the detailed investigation @echoix ! I understand your concern about precision=0, but I believe there might be a misunderstanding about what it means.
Looking at the values_equal function in checkers.py (lines 262-324), when precision=0:
For floats (lines 279-291), the code checks: if abs(value_a - value_b) > precision: Since both null_cells values are float(0), we get abs(0 - 0) > 0 which is False, so the values are considered equal.
For integers (lines 305-313), precision is only applied if int(precision) > 0. With precision=0, this condition is False, so it falls through to line 322 which does strict equality: elif value_a != value_b: return False
So precision=0 actually means "exact match required" with no tolerance, not "allow a difference of 1 unit".
Looking at the existing test suite (test_checkers.py), there are examples like:
Lines 290-295: precision=0 is used for exact integer comparisons
Line 42: values_equal(5, 8, precision=3) returns True (allows difference of 3)
In our case, we're checking that null_cells is exactly 0 (no tolerance), which is the correct behaviour for this test. Does this make sense, or am I missing something?
There was a problem hiding this comment.
Hi @nilason !
Are you suggesting I should use precision=0.0 instead of precision=0 to make it clearer that it's a float comparison? Or are you referring to the null_cells: float(0) value itself? Happy to change either for clarity!
If it's the first case then I saw codebase using 0 everywhere and I think python does auto converts it accordingly...
This PR fixes a locale-dependent bug in r.fillnulls when using bilinear/bicubic interpolation.
Instead of parsing English error messages from r.resamp.bspline stderr, the script now checks for NULL cells using
r.univar -gand exits early when none are present.This avoids failures in non-English locales and simplifies the control flow.
Fixes #1495.