Check that matrix sizes are in the supported range before passing them to R #1066

szhorvat · 2024-01-02T12:39:14Z

Fixes #889

This fix is theoretical at the moment: I can't seem to test it on macOS, because the test would use a very large amount of memory.

> g<-make_empty_graph(.Machine$integer.max + 2)
> distances(g, v=1)
Error: vector memory exhausted (limit reached?)

@krlmlr @Antonov548 Can either of you try to test this on a large-memory Linux machine? I don't have access to one at the moment. Otherwise, do we merge as-is without a thorough test?

aviator-app · 2024-01-02T12:39:17Z

Current Aviator status

This PR was merged using Aviator.

krlmlr · 2024-01-02T15:28:50Z

Instead of a one-off test, can we instead implement a function that checks matrix dimensions, test the behavior of that function (in an automated test), and use it where needed?

If we don't want to expose this function to R, we can use the Catch testing framework in testthat: https://testthat.r-lib.org/reference/use_catch.html

szhorvat · 2024-01-02T15:41:14Z

Each matrix type needs a different implementation, so this can't be reduced to calling a single function three times.

krlmlr · 2024-01-02T15:42:08Z

Just the part where ncol and nrow are checked, perhaps?

szhorvat · 2024-01-02T19:53:48Z

> test the behavior of that function (in an automated test),

How do you intend to test a C function from R? R doesn't even support the integer type that the function would be operating with. What would be the point of testing a function that just compares two integers and reports an error? I am not sure how to proceed here.

I managed to test manually that the functionality works after setting R_MAX_VSIZE=100Gb, using the command line instead of RStudio, and using a barebones function like layout_in_circle(). The R process used over 120 GB of memory.

> g<-make_empty_graph(n=.Machine$integer.max+1)
> l<-layout_in_circle(g)
Error in layout_in_circle(g) : 
  At rinterface_extra.c:2588 : igraph returned a matrix of size 2147483648 by 2. R does not support matrices with more than 2147483647 rows or columns. Failed

krlmlr · 2024-01-02T21:07:00Z

src/rinterface_extra.c

+  /* Assuming that this function is called in a context where
+   * igraph_error() does not return. */
+  if (nrow > INT_MAX || ncol > INT_MAX) {
+    igraph_errorf("igraph returned a matrix of size %" IGRAPH_PRId " by %" IGRAPH_PRId ". "
+                  "R does not support matrices with more than %d rows or columns.",
+                  __FILE__, __LINE__, IGRAPH_FAILURE,
+                  nrow, ncol, INT_MAX);
+  }


This could be extracted into a function, say, R_igraph_check_matrix_dims(igraph_integer_t nrow, igraph_integer_t ncol) . This function becomes testable, e.g., through an R wrapper that we export just for testing. (Overkill, perhaps.)
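A minimal sketch of what such an extracted check could look like. It is an illustration only: `sketch_integer_t` is a hypothetical stand-in for `igraph_integer_t` (assumed to be 64-bit here), `sketch_check_matrix_dims` is a hypothetical name, and the function reports failure through a return value and a message buffer rather than calling `igraph_errorf()`, so that it compiles without the igraph headers.

```c
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for igraph_integer_t, assumed to be 64-bit. */
typedef int64_t sketch_integer_t;

/* Hypothetical helper: returns true if both dimensions fit into an R matrix.
 * On failure it writes a message into msg (if provided) and returns false.
 * The real code would call igraph_errorf() at this point instead. */
bool sketch_check_matrix_dims(sketch_integer_t nrow, sketch_integer_t ncol,
                              char *msg, size_t msg_size) {
    if (nrow > INT_MAX || ncol > INT_MAX) {
        if (msg != NULL && msg_size > 0) {
            snprintf(msg, msg_size,
                     "igraph returned a matrix of size %" PRId64 " by %" PRId64 ". "
                     "R does not support matrices with more than %d rows or columns.",
                     nrow, ncol, INT_MAX);
        }
        return false;
    }
    return true;
}
```

Calling `sketch_check_matrix_dims((sketch_integer_t) INT_MAX + 1, 2, buf, sizeof buf)` exercises the failure path without allocating a matrix.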

We could also go one step further and make this function sensitive to a global switch (one that's perhaps only active in debug mode). The global switch would change the definition of INT_MAX for this function so that we can even test that (some) functions that return a matrix actually go through this code path, without allocating tons of memory. (Overkill, perhaps.)

But these two steps combined allow creating a one-off version of igraph that doesn't require allocating hundreds of gigabytes to check the functionality.
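The global-switch idea could be sketched roughly as follows. All names here are hypothetical, `sketch_integer_t` again stands in for `igraph_integer_t`, and the switch is a plain runtime variable; a real build might compile it in only in debug mode.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for igraph_integer_t, assumed to be 64-bit. */
typedef int64_t sketch_integer_t;

/* Hypothetical test-only override; 0 means "use the real limit, INT_MAX". */
static int sketch_dim_limit_override = 0;

/* Lets a test lower the limit so the error path can be reached cheaply. */
void sketch_set_dim_limit_for_testing(int limit) {
    sketch_dim_limit_override = limit;
}

/* Returns the limit currently in effect. */
static int sketch_dim_limit(void) {
    return sketch_dim_limit_override > 0 ? sketch_dim_limit_override : INT_MAX;
}

/* True if a matrix of the given size is within the limit in effect. */
bool sketch_dims_supported(sketch_integer_t nrow, sketch_integer_t ncol) {
    const int limit = sketch_dim_limit();
    return nrow <= limit && ncol <= limit;
}
```

With the limit lowered to, say, 100, a graph with 101 vertices would be enough to check that a matrix-returning function goes through this code path.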

No action needed here, but for the future, should we strive to take testability into account when implementing?

> should we strive to take testability into account when implementing

This PR fixes a bug that manifests itself in difficult-to-test scenarios. Nevertheless, it is a bug that should be fixed. I did verify the fix manually.

> This function becomes testable, e.g., through an R wrapper that we export just for testing.

The function would take arguments of type igraph_integer_t. This type is not supported by R. Using double makes things complicated and increases the chance of mistakes (not every igraph_integer_t value is representable as a double).

So I'm not sure how to do this testing in a productive way.
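The representability concern can be illustrated with a small sketch, assuming a 64-bit integer type and IEEE 754 doubles (53-bit significand). The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v survives a round trip through double unchanged.
 * Assumes IEEE 754 double precision. */
bool sketch_roundtrips_through_double(int64_t v) {
    double d = (double) v;
    /* Converting a double outside the int64_t range back to int64_t is
     * undefined behaviour, so such values are rejected up front. */
    if (d >= 9223372036854775808.0 || d < -9223372036854775808.0) {
        return false;
    }
    return (int64_t) d == v;
}
```

Values up to 2^53 in magnitude pass, so the sizes discussed in this PR (around 2^31) are representable; 2^53 + 1 is the first positive integer that is not.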

It's in the merge queue, thanks for working on this.

Happy to elaborate if this situation occurs on an issue from https://github.com/igraph/rigraph/milestone/14.

krlmlr · 2024-01-02T21:08:14Z

Thanks!

szhorvat force-pushed the fix/matrix-checks branch from ed7a851 to f849c7e Compare January 2, 2024 12:50

szhorvat marked this pull request as ready for review January 2, 2024 13:22

krlmlr reviewed Jan 2, 2024

krlmlr added the mergequeue label Jan 2, 2024

szhorvat added 2 commits January 2, 2024 21:08

fix: complete set of igraph_0ormatrix_to_SEXP function, match prototypes to definitions

dcd7cc8

fix: check that matrix sizes are within the supported range before passing them to R

9425930

aviator-app bot force-pushed the fix/matrix-checks branch from f849c7e to 9425930 Compare January 2, 2024 21:08

aviator-app bot merged commit f1df4c2 into main Jan 2, 2024
25 checks passed

aviator-app bot deleted the fix/matrix-checks branch January 2, 2024 21:38
