Speed up IntegerMatrix #26237
Comments
comment:1
Works for me:
|
comment:2
Hmm...strange. Let me try from a fresh session to see if I somehow unintentionally corrupted things. |
comment:3
Okay, I had one little seemingly innocuous change: I added an explicit return type |
comment:4
So what was going wrong is that there is a If it is not, I might also recycle this ticket with some Cython tweaks to |
comment:5
Well, there is a bug somewhere (on vanilla 8.4.beta4):
|
comment:6
Replying to @tscrim:
I'm not following here... what do you mean with "Cython could somehow not automatically convert that to a boolean check"? |
comment:7
Yes, that is correct (well, an |
comment:8
At least, once I changed that to an explicit check != 0, it fixed the problem. |
comment:9
Replying to @tscrim:
Most (all?) LinearMatroid methods assume that we are allowed to pivot in the matrix, and do so without leaving the ring. Maybe that's what went wrong? |
comment:10
Replying to @tscrim:
It's still not clear which check you mean and what the problem really is... |
comment:11
The method GenericMatrix.is_nonzero() seems to be called only by LeanMatrix.gauss_jordan_reduce() and LeanMatrix.nonzero_positions_in_row(). LeanMatrix.pivot() does not even use is_nonzero(), but 's=self.get_unsafe(i,y)' followed by 'if s and ..' (line 303 of lean_matrix.pyx). Apparently the conversion to a bool implicit in 'if s' does work in that context. It may be more efficient too, since, depending on the base ring, evaluating s!=0 may involve casting 0 as a ring element. I'm quite sure that for finite fields 'if s' is about 5 times faster than 'if (s!=0)'. So another solution may be to rewrite LeanMatrix.gauss_jordan_reduce() and LeanMatrix.nonzero_positions_in_row() in the more efficient way used in pivot(), completely avoiding the use of is_nonzero(). |
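To illustrate the two styles of zero test compared in this comment, here is a small pure-Python sketch. It is not Sage or Cython code, and the helper names `nonzero_positions_truthy` and `nonzero_positions_compare` are hypothetical. It only shows that both tests select the same entries for plain integers; the relative speed depends on the base ring and would have to be measured, for instance with `timeit`.

```python
def nonzero_positions_truthy(row):
    """Indices of nonzero entries, using the implicit bool conversion `if s`."""
    return [i for i, s in enumerate(row) if s]


def nonzero_positions_compare(row):
    """Indices of nonzero entries, using the explicit comparison `s != 0`."""
    return [i for i, s in enumerate(row) if s != 0]


row = [0, 1, -1, 0, 2]
print(nonzero_positions_truthy(row))
print(nonzero_positions_compare(row))
```

For ring elements other than Python integers, the truth test goes through the element's own boolean conversion, so the two variants agree only if that conversion is implemented consistently with comparison to zero.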
comment:12
Replying to @jdemeyer:
Sorry, I was trying to answer quickly while I was out. So when I explicitly tell Cython that
cdef bint is_nonzero(self, long r, long c) except -2: # Not a Sage matrix operation
- return self.get(r, c)
+ return self.get(r, c) != 0
otherwise I get the original error message in the ticket description. |
comment:13
Replying to @sagetrac-Stefan:
But in the latter case, the matrix is over |
comment:14
Replying to @sagetrac-Rudi:
Yes, as you surmised, the
What I was hoping to do was take advantage of the known return types in the integer matrix to improve the Cython code. With the explicit casting, my computation goes from 13.6s to 3s.
diff --git a/src/sage/matroids/lean_matrix.pxd b/src/sage/matroids/lean_matrix.pxd
index c284f6a..122dab1 100644
--- a/src/sage/matroids/lean_matrix.pxd
+++ b/src/sage/matroids/lean_matrix.pxd
@@ -102,7 +102,7 @@ cdef class QuaternaryMatrix(LeanMatrix):
cdef class IntegerMatrix(LeanMatrix):
cdef int* _entries
- cdef inline get(self, long r, long c) # Not a Sage matrix operation
+ cdef inline int get(self, long r, long c) # Not a Sage matrix operation
cdef inline void set(self, long r, long c, int x) # Not a Sage matrix operation
cdef inline long row_len(self, long i) except -1 # Not a Sage matrix operation
diff --git a/src/sage/matroids/lean_matrix.pyx b/src/sage/matroids/lean_matrix.pyx
index 93a32bc..d6022c2 100644
--- a/src/sage/matroids/lean_matrix.pyx
+++ b/src/sage/matroids/lean_matrix.pyx
@@ -2844,7 +2844,7 @@ cdef class IntegerMatrix(LeanMatrix):
"""
return "IntegerMatrix instance with " + str(self._nrows) + " rows and " + str(self._ncols) + " columns"
- cdef inline get(self, long r, long c): # Not a Sage matrix operation
+ cdef inline int get(self, long r, long c): # Not a Sage matrix operation
return self._entries[r * self._ncols + c]
cdef inline void set(self, long r, long c, int x): # Not a Sage matrix operation
@@ -2877,7 +2877,7 @@ cdef class IntegerMatrix(LeanMatrix):
return 0
cdef bint is_nonzero(self, long r, long c) except -2: # Not a Sage matrix operation
- return self.get(r, c)
+ return self.get(r, c) != 0
cdef LeanMatrix copy(self): # Deprecated Sage matrix operation
cdef IntegerMatrix M = IntegerMatrix(self._nrows, self._ncols)
@@ -2982,12 +2982,14 @@ cdef class IntegerMatrix(LeanMatrix):
ignored.
"""
cdef long i
+ cdef int sval
if s is None:
for i from 0 <= i < self._ncols:
self.set(x, i, self.get(x, i) + self.get(y, i))
else:
+ sval = int(s)
for i from 0 <= i < self._ncols:
- self.set(x, i, self.get(x, i) + s * self.get(y, i))
+ self.set(x, i, self.get(x, i) + sval * self.get(y, i))
return 0
cdef int swap_rows_c(self, long x, long y) except -1:
@@ -3010,8 +3012,9 @@ cdef class IntegerMatrix(LeanMatrix):
compatibility, and is ignored.
"""
cdef long i
+ cdef int sval = int(s)
for i from 0 <= i < self._ncols:
- self.set(x, i, s * self.get(x, i))
+ self.set(x, i, sval * self.get(x, i))
return 0
cdef int rescale_column_c(self, long y, s, bint start_row) except -1:
@@ -3020,8 +3023,9 @@ cdef class IntegerMatrix(LeanMatrix):
compatibility, and is ignored.
"""
cdef long j
+ cdef int sval = int(s)
for j from 0 <= j < self._nrows:
- self.set(j, y, self.get(j, y) * s)
+ self.set(j, y, self.get(j, y) * sval)
return 0
cdef int pivot(self, long x, long y) except -1: # Not a Sage matrix operation |
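The row-operation part of the diff above converts the scalar once (`sval = int(s)`) before the loop instead of multiplying by a generic object in every iteration. A minimal pure-Python model of that pattern follows. The class `FlatIntMatrix` is a hypothetical stand-in with row-major storage, not Sage's `IntegerMatrix`, and plain Python will not show the speedup that typed Cython code gets; the sketch only shows the shape of the change.

```python
class FlatIntMatrix:
    """Toy row-major integer matrix; a stand-in, not Sage's IntegerMatrix."""

    def __init__(self, nrows, ncols, entries):
        if len(entries) != nrows * ncols:
            raise ValueError("wrong number of entries")
        self._nrows = nrows
        self._ncols = ncols
        self._entries = list(entries)

    def get(self, r, c):
        # Same index arithmetic as in the diff: r * ncols + c.
        return self._entries[r * self._ncols + c]

    def set(self, r, c, x):
        self._entries[r * self._ncols + c] = x

    def add_multiple_of_row(self, x, y, s=None):
        """Add s times row y to row x; s=None is treated as s=1."""
        # Convert the scalar once, outside the loop, like `sval` in the patch.
        sval = 1 if s is None else int(s)
        for i in range(self._ncols):
            self.set(x, i, self.get(x, i) + sval * self.get(y, i))


M = FlatIntMatrix(2, 3, [1, 0, -1, 0, 1, 1])
M.add_multiple_of_row(0, 1, -1)
print(M._entries)
```

In the Cython version, the point of the hoisted conversion is that the loop body then contains only C `int` arithmetic.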
comment:15
Replying to @tscrim:
That looks like a good solution. Great! Could |
comment:16
Replying to @sagetrac-Rudi:
You cannot explicitly cast to something you don't know. Plus the arithmetic operations being done for |
comment:17
I guess something you could do is also special case |
comment:18
Replying to @tscrim:
First of all, that's a bad idea: a C int has limited range (typically 32 bits) and you cannot guarantee that all entries fit in that.
I don't consider that a bug. The reason is that the conversion of a Python object to a |
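The range limitation mentioned in this comment can be inspected from plain Python with the standard `ctypes` module. This is only a sketch: the helper `fits_in_c_int` is hypothetical, and `ctypes` silently wraps out-of-range values, which is not necessarily how Cython's own object-to-`int` conversion behaves.

```python
import ctypes

# Width of a C int on this platform (typically 32 bits).
bits = 8 * ctypes.sizeof(ctypes.c_int)
int_max = 2 ** (bits - 1) - 1


def fits_in_c_int(n):
    """Return True if the Python integer n is representable as a C int."""
    return -int_max - 1 <= n <= int_max


print(bits, int_max)
# ctypes performs no overflow checking, so this value wraps around.
print(ctypes.c_int(int_max + 1).value)
```

A check like `fits_in_c_int` is the kind of guard one would need before storing arbitrary entries in an `int*` array.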
comment:19
So the solution for really converting an |
comment:20
Should I close this ticket or do you want to recycle it? |
comment:21
I will recycle this, at least for the speedups to However, the fact that I am guessing instead of |
comment:23
Replying to @tscrim:
Suppose that an entry in the matrix equals the Python integer -2. Currently (without applying any changes): when calling |
comment:24
Replying to @tscrim:
I just looked at the Cython sources for However... this just shows that using
|
comment:25
Thing is, the LeanMatrix classes are internal datatypes. Regular matroids have matrices with entries equal to -1, 0, 1, and are such that pivoting preserves that property. For a potentially regular matroid on which you want to run is_valid(), we check this condition through pivoting, which means we get to a "bad subdeterminant" of size at most 2x2, hardly enough to cause overflows if the entries are 0,1, or -1. We took some effort to keep the LeanMatrix away from the end user (LinearMatroid.Representation returns a Sage matrix, for instance, and the datatypes don't get imported into the Sage namespace). Speed is of the essence here, since we create and destroy tons of these (Sage matrices have a lot of overhead at creation time), and we do tons of row operations on them. If you dig deep enough to use IntegerMatrix at this spot in the code, you'll have some idea of what you're getting yourself into. Again, since you want to use IntegerMatrix in the LinearMatroid subclasses, and since you need a matrix where pivoting is safe, really the only use case is Regular Matroids where overflows are never an issue. |
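The argument in this comment, that pivoting on a matrix with entries in {-1, 0, 1} either keeps the entries in that set or exposes a bad 2x2 subdeterminant, can be tried out on small examples. The sketch below is a simplified pure-Python model; the functions `pivot` and `entries_stay_small` are hypothetical and are not the `LeanMatrix` implementation.

```python
def pivot(rows, x, y):
    """Row-reduce so that column y has a 1 in row x and zeros elsewhere.

    Assumes rows[x][y] is 1 or -1, so all arithmetic stays in the integers.
    """
    p = rows[x][y]
    if p not in (1, -1):
        raise ValueError("pivot entry must be 1 or -1")
    # Dividing by p is the same as multiplying by p when p is 1 or -1.
    rows[x] = [p * a for a in rows[x]]
    for i in range(len(rows)):
        if i != x and rows[i][y]:
            s = rows[i][y]
            rows[i] = [a - s * b for a, b in zip(rows[i], rows[x])]


def entries_stay_small(rows):
    """Check that every entry is -1, 0 or 1."""
    return all(a in (-1, 0, 1) for row in rows for a in row)


good = [[1, 1, 0], [0, 1, 1]]   # all subdeterminants are in {-1, 0, 1}
pivot(good, 0, 1)
print(good, entries_stay_small(good))

bad = [[1, 1], [-1, 1]]         # determinant 2
pivot(bad, 0, 0)
print(bad, entries_stay_small(bad))
```

In the second example a single pivot produces the entry 2, so the failure is detected long before any entry could approach the limits of a C `int`.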
comment:33
Done. I didn't know that. Thanks. |
comment:34
You will see that no function ever returns an
|
comment:35
Yea, I noticed that and so I scrapped the
where |
comment:36
Can you move "Implement The commits that you added here to improve |
comment:37
Replying to @tscrim:
+1 to |
comment:38
Replying to @jdemeyer:
Yep, I can do that. I will do that when I get into my office tomorrow. |
comment:39
#26269 for the |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
|
comment:41
I've split the ticket into the two parts. This part is now ready for review. |
comment:42
(Essentially) Green patchbot. |
comment:43
So how about renaming to |
Branch pushed to git repo; I updated commit sha1. New commits:
|
comment:45
I did the refactoring. Stefan, Rudi, any objections? |
comment:46
I am happy with the new name. Don’t have time for a long review. |
comment:47
I've looked through the code and I am happy with it, so LGTM. Whoever contributed to this ticket (mainly review), please insert your name in the reviewer field. |
Reviewer: Daniel Krenn |
comment:48
Stefan, Rudi, I added you as reviewers based on our conversations above. Jeroen is obviously a reviewer for looking at the Cython code (as well as his additional useful insights). |
Changed reviewer from Daniel Krenn to Daniel Krenn, Jeroen Demeyer, Stefan Van Zwam, Rudi Pendavingh |
Changed branch from public/matroids/speedup_integer_matrix-26237 to |
We explicitly declare things to be int type so Cython can generate better C code.
CC: @sagetrac-Stefan @sagetrac-Rudi @sagetrac-yomcat @sagetrac-msaaltink
Component: matroid theory
Author: Travis Scrimshaw
Branch/Commit: 3c0195b
Reviewer: Daniel Krenn, Jeroen Demeyer, Stefan Van Zwam, Rudi Pendavingh
Issue created by migration from https://trac.sagemath.org/ticket/26237