Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r.in.pdal: fix integer overflow #2844

Merged
merged 3 commits into from
Feb 21, 2023
Merged

Conversation

nilason
Copy link
Contributor

@nilason nilason commented Feb 21, 2023

Fixes #2840

The segmentation fault reported with #2840 is caused by a call to Rast_set_null_value() where the parameter numVals is defined as integer. Overflow occurs if the col * row value given to numVals exceeds INT_MAX.

I put this up initially as a draft as this brings up some questions:

Firstly:
The array is allocated as rows * (cols + 1):

point_binning->index_array =
G_calloc((size_t)rows * (cols + 1), Rast_cell_size(CELL_TYPE));

but zeroed as rows * cols:

Rast_set_null_value(array, nrows * ncols, map_type);

Is this really by design?

Secondly:

The question is whether Rast_set_null_value() family of functions should all be changed accordingly and to what type. I have now changed numVals to unsigned int, but this could be size_t, unsigned long ...
Or do we re-implement this function locally within r.in.pdal and leave lib/raster as is?

include/grass/defs/raster.h Outdated Show resolved Hide resolved
@@ -57,7 +57,7 @@ int blank_array(void *array, int nrows, int ncols, RASTER_MAP_TYPE map_type,
case -1:
/* fill with NULL */
/* alloc for col+1, do we come up (nrows) short? no. */
Rast_set_null_value(array, nrows * ncols, map_type);
Rast_set_null_value(array, (size_t)nrows * ncols, map_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t on 64 bit systems is larger than unsigned int, this will not work for larger regions.

I would rather loop over the rows and set each row to NULL. A pointer can be used to go over the rows, increased with ptr = G_incr_void_ptr(ptr, ncols * map_type_size):

        for (row = 0; row < nrows; row++) {
            Rast_set_null_value(ptr, ncols, map_type);
            ptr = G_incr_void_ptr(ptr, ncols * Rast_cell_size(map_type));
        }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While "set whole raster map to null" seems like a good function to have, I'm afraid that the correct way of using Rast_set_null_value is to set only up a row of value as @metzm says.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the change @metzm suggested, I guess there is no need to change (at this time) Rast_set_null_value()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes, that's what I meant. "Set whole raster map to null" would be a separate function wrapping that code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Set whole raster map to null" would be a rather special function because typically raster data are accessed by rows and not as a whole. Standard procedure would be e.g. to set the next output row to NULL, do some calculations with input data, set data in the output row where applicable, write output row, continue with next row.

In order to save memory consumption, raster operations should wherever possible never load a whole raster to memory but instead proceed row by row or a small chunk of rows (e.g. r.neighbors). This is IMHO a big strength of GRASS, also for cloud and grid computing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, perhaps too specific to the binning procedure here. This reminds me that "whole map" here is actually a chunk of rows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed the suggested changes. Please review!

@wenzeslaus
Copy link
Member

...The array is allocated as rows * (cols + 1)...but zeroed as rows * cols... Is this really by design?

The history of the code is that I took the point binning code from the original r.in.lidar which likely took it from r.in.xyz. While I was trying to improve the API when working on r.in.lidar, I think I assumed the code to be correct and extra allocated space would pass any automated checks. So, from this point of view, it is likely not intentional. By quick look into the code I don't see anything either, but perhaps a second look would be good.

Both "null whole raster map" and point/value binning would be good to have in the library, but that's a different story.

nilason and others added 2 commits February 21, 2023 16:17
@nilason nilason marked this pull request as ready for review February 21, 2023 15:40
@nilason nilason merged commit 75cb8f6 into OSGeo:main Feb 21, 2023
@nilason
Copy link
Contributor Author

nilason commented Feb 21, 2023

Thank you both for your input and in particular for the brilliant solution @metzm !

Shall this be backported?

@nilason nilason deleted the fix_r-in-pdal-2840 branch February 21, 2023 18:45
@nilason nilason added bug Something isn't working C Related code is in C labels Feb 21, 2023
@nilason nilason added this to the 8.3.0 milestone Feb 21, 2023
@neteler
Copy link
Member

neteler commented Feb 21, 2023

Shall this be backported?

Yes please.

@petrasovaa
Copy link
Contributor

There is identical code in r.in.xyz as Vashek points out:

https://github.com/OSGeo/grass/blob/main/raster/r.in.xyz/support.c#L48

@nilason
Copy link
Contributor Author

nilason commented Feb 21, 2023

There is identical code in r.in.xyz as Vashek points out:

https://github.com/OSGeo/grass/blob/main/raster/r.in.xyz/support.c#L48

Thanks! I'll get to it.

@nilason
Copy link
Contributor Author

nilason commented Feb 21, 2023

There is identical code in r.in.xyz as Vashek points out:
https://github.com/OSGeo/grass/blob/main/raster/r.in.xyz/support.c#L48

Thanks! I'll get to it.

Done with #2847

@metzm
Copy link
Contributor

metzm commented Feb 21, 2023

There are a number of modules with shared code base such as r.in.xyz and r.in.pdal. In order to ease maintenance, it would be great to add some comment in the code for developers at a prominent place to keep shared code in sync. I tried to do this e.g. for the GDAL/OGR import modules r.in.gdal, r.external, v.in.ogr, v.external.

@nilason
Copy link
Contributor Author

nilason commented Feb 21, 2023

There are a number of modules with shared code base such as r.in.xyz and r.in.pdal. In order to ease maintenance, it would be great to add some comment in the code for developers at a prominent place to keep shared code in sync. I tried to do this e.g. for the GDAL/OGR import modules r.in.gdal, r.external, v.in.ogr, v.external.

Good point!

@wenzeslaus
Copy link
Member

In case of r.in.xyz and r.in.pdal, an actual library might be more appropriate. However, as a simple fix, we now have also enforcement of duplicate files to be the same: 8b6cd94

@nilason
Copy link
Contributor Author

nilason commented Feb 21, 2023

In case of r.in.xyz and r.in.pdal, an actual library might be more appropriate. However, as a simple fix, we now have also enforcement of duplicate files to be the same: 8b6cd94

They are not duplicate files, r.in.pdal is, as far as I can tell, heavily refactored too. Library functions for repetitive code sounds better, if practical.

nilason added a commit to nilason/grass that referenced this pull request Feb 22, 2023
Address integer overflow in call to Rast_set_null_value()
by instead looping over rows and set rows to NULL.

Solution by:
Co-authored-by: Markus Metz <markus.metz.giswork@gmail.com>
landam pushed a commit that referenced this pull request Jun 6, 2023
* r.in.pdal: fix integer overflow (manual backport of #2844)

Address integer overflow in call to Rast_set_null_value()
by instead looping over rows and set rows to NULL.

Solution by:
Co-authored-by: Markus Metz <markus.metz.giswork@gmail.com>

* r.in.xyz: fix integer overflow (manual backport of #2847)

Address integer overflow in call to Rast_set_null_value()
by instead looping over rows and set rows to NULL.

Solution by:
Co-authored-by: Markus Metz <markus.metz.giswork@gmail.com>
@landam landam removed this from the 8.3.0 milestone Jun 6, 2023
@landam landam added this to the 8.2.2 milestone Jun 6, 2023
neteler pushed a commit to nilason/grass that referenced this pull request Nov 7, 2023
Address integer overflow in call to Rast_set_null_value()
by instead looping over rows and set rows to NULL.

Solution by:
Co-authored-by: Markus Metz <markus.metz.giswork@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working C Related code is in C
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] r.in.pdal: apparent integer overflow
6 participants