Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add range checks #423

Closed

Conversation

alexey-milovidov
Copy link
Contributor

Implement #392

Not sure if everything is correct, I did not dig into the algorithms.
But at least it fixes ClickHouse/ClickHouse#19219

There should not be significant performance degradation for two reasons:

  • the branch is easy predictable and it can be executed in parallel due to instruction level parallelism;
  • the algorithms are already heavy enough, the difference in a few clock ticks should be neglible.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.7%) to 98.516% when pulling af8068a on ClickHouse-Extras:clickhouse-range-checks into dd0b1e5 on uber:master.

@alexey-milovidov
Copy link
Contributor Author

I see there are some issues with formatting. I don't think I'm ready to finish this PR, feel free to apply any edits or close and resubmit another PR...

@@ -840,6 +842,8 @@ bool _isBaseCellPolarPentagon(int baseCell) {
* Valid ijk+ lookup coordinates are from (0, 0, 0) to (2, 2, 2).
*/
int _faceIjkToBaseCell(const FaceIJK* h) {
if (!h || h->face < 0 || h->face >= NUM_ICOSA_FACES)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isaacbrodsky - Do you think we need guards on face here? Or is this value only calculated within the library? If we can't trust face, we should probably add guards for coord.i, coord.j, and coord.k as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little curious what case this is guarding against. While I agree with preventing undefined behavior, I am not sure offhand whether this is a needed check.

@nrabinowitz
Copy link
Collaborator

Thanks for submitting this! You should be able to fix the test failures with make format. The bigger issue here is probably test coverage - we'll need to add tests for the new branches introduced here.

@isaacbrodsky - thoughts on bringing this into 3.x? This is potentially a security issue.

@sahrk
Copy link

sahrk commented Jan 22, 2021

I haven't been following the issues daily so sorry for being late to the conversation again...

The lack of range checks on arrays throughout H3 core (including the recent baseCells array merge) was intentional (as it is in the C language itself). Personally I think that performance degradation in H3 core should be avoided at all costs. The hit might be small for each change, but such things add up. Rather than changing the core functions, I would strongly encourage you to add "safe" wrapper functions that can be used in situations where the user doesn't want the onus of checking their own array indexes when necessary.

Copy link
Collaborator

@isaacbrodsky isaacbrodsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @sahrk' point, another suggestion posted in the issues is to expand the base cells array and ensure we never calculate an index into that array outside of 0-127 inclusive.

I would like to ensure that H3 is resilient to dereferencing out of bounds. Beyond the security concerns, this type of crash is often very difficult to debug, especially when it happens in a library loaded into a manageed framework. I would be willing to create a new 3.x release to incorporate this kind of change.

If possible, could you post test cases demonstrating some of these? Even if not coded in olur test framework it would be very helpful for coding the tests.

@@ -290,6 +290,9 @@ H3Index h3NeighborRotations(H3Index origin, Direction dir, int* rotations) {

int newRotations = 0;
int oldBaseCell = H3_GET_BASE_CELL(out);
if (oldBaseCell < 0 || oldBaseCell >= NUM_BASE_CELLS)
return 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: return H3_NULL

@@ -840,6 +842,8 @@ bool _isBaseCellPolarPentagon(int baseCell) {
* Valid ijk+ lookup coordinates are from (0, 0, 0) to (2, 2, 2).
*/
int _faceIjkToBaseCell(const FaceIJK* h) {
if (!h || h->face < 0 || h->face >= NUM_ICOSA_FACES)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little curious what case this is guarding against. While I agree with preventing undefined behavior, I am not sure offhand whether this is a needed check.

@sahrk
Copy link

sahrk commented Jan 22, 2021

To @sahrk' point, another suggestion posted in the issues is to expand the base cells array and ensure we never calculate an index into that array outside of 0-127 inclusive.

I should point-out that the baseCells and the faces in faceIjkToBaseCell are not the only arrays that are accessed without bounds checking in the core. There are a whole bunch of arrays in the core, many with multiple dimensions. If I ever did bounds checking on any of them it was by mistake :).

The core was designed to be as tight as possible, which is why so much of it is table lookups, which are as cheap as it gets in C. In the case of the baseCells we're talking about a function that's just a wrapper for an array access. However you slice it, if you add bounds checking you're at least doubling the cost of the function.

I would like to ensure that H3 is resilient to dereferencing out of bounds. Beyond the security concerns, this type of crash is often very difficult to debug, especially when it happens in a library loaded into a manageed framework. I would be willing to create a new 3.x release to incorporate this kind of change.

That is all reasonable of course. Earlier I suggested making safe versions of table lookup functions that are exposed to external users (like baseCells), but what if you made the API functions safe and created internal functions without bounds checking for use just in the core?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Buffer overflow is possible in "h3" library
5 participants