
Make hecke operators not blow up the memory #21303

Closed
sagetrac-kartikv mannequin opened this issue Aug 22, 2016 · 44 comments

Comments

sagetrac-kartikv mannequin commented Aug 22, 2016

Constructing Hecke operators acting on modular symbols can blow up memory usage, though it doesn't have to. The eclib code has sparse and dense options for generating Hecke operators, but the Sage caller only uses the dense options. The simple solution is to add a flag to the Sage caller that selects the sparse options; perhaps a better solution would be to intuit whether the resulting matrix is likely to be sparse and do something intelligent.
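As an illustrative sketch of the trade-off at issue (a toy dict-of-keys representation, not eclib's actual data structure): an n x n Hecke matrix with k << n^2 nonzero entries costs O(k) memory stored sparsely, versus O(n^2) stored densely.

```python
# Toy dict-of-keys ("DOK") sparse matrix: store only nonzero entries as
# {(i, j): value}. Illustrative only; eclib's smat uses C++ arrays.
def sparse_matvec(entries, v, nrows):
    """Multiply a DOK sparse matrix by a vector, touching only nonzeros."""
    out = [0] * nrows
    for (i, j), a in entries.items():
        out[i] += a * v[j]
    return out

# A 4x4 matrix with 3 nonzero entries stores 3 dict items,
# versus 16 entries for its dense counterpart.
entries = {(0, 0): -1, (1, 2): 2, (3, 3): -3}
print(sparse_matvec(entries, [1, 1, 1, 1], 4))  # -> [-1, 2, 0, -3]
```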

Upstream: None of the above - read trac for reasoning.

CC: @JohnCremona @kedlaya

Component: modular forms

Keywords: eclib, modular symbols, hecke operators

Author: Kiran Kedlaya

Branch/Commit: 6c3bac5

Reviewer: David Roe

Issue created by migration from https://trac.sagemath.org/ticket/21303

@sagetrac-kartikv sagetrac-kartikv mannequin added this to the sage-7.4 milestone Aug 22, 2016


kedlaya commented Aug 22, 2016

comment:2

The sage_matrix_over_ZZ method of internal Matrix class in the eclib wrapper does have a flag sparse (defaults to False), but at this point it is too late; the Hecke matrix was already created as an eclib dense matrix (i.e., a C++ array of long's). If I understand correctly, what is really needed is a flag in the hecke_matrix method that gets passed to eclib, so that the underlying C++ code switches correctly.

@JohnCremona

comment:3

You will also need to write a cython function to convert an eclib sparse matrix to a Sage one. William and I did this for dense matrices in about 2008.
The eclib sparse matrix class is written at rather a low level for efficiency, with arrays of pointers etc. Ask me if it is not clear how to extract the information from one.

The output function at line 729 of https://github.com/JohnCremona/eclib/blob/master/libsrc/smat.cc might help; also the associated header file. I suggest that you loop through all the rows and for each one fetch all the (column, value) pairs. One thing to watch out for: if a row contains n nonzero entries, then the 0'th entry in the associated col array stores n, so that array has size n+1 while the values array has size n. Row and column numbers start from 1, not 0. Good luck.
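A pure-Python sketch of decoding one row in the layout just described (the real data lives in C++ arrays; this is only a stand-in to make the indexing conventions concrete). For a row with n nonzeros, col[0] == n, col[1..n] hold 1-based column numbers, and val[0..n-1] hold the values.

```python
def decode_smat_row(col, val):
    """Return 0-based (column, value) pairs for one smat-style row.

    col[0] is the nonzero count n; col[1..n] are 1-based column indices;
    val[0..n-1] are the corresponding entries.
    """
    n = col[0]
    return [(col[k + 1] - 1, val[k]) for k in range(n)]

# Row with -1 in column 1 and 2 in column 3 (1-based):
print(decode_smat_row([2, 1, 3], [-1, 2]))  # -> [(0, -1), (2, 2)]
```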


kedlaya commented Aug 25, 2016

comment:4

I got a rather crude version of this up and running, so here's a patch. The crude part is that I ask eclib for dense vectors one at a time, rather than asking for a whole sparse matrix; but this was easier to wrap (I didn't need to tamper with the include files to expose any functionality) and already avoids an obvious quadratic memory step.

I tried this out on my usual compute server:

sage: time C = CremonaModularSymbols(400001, sign=-1)
CPU times: user 26.6 s, sys: 119 ms, total: 26.7 s
Wall time: 26.7 s
sage: time T2a = C.hecke_matrix(2)
CPU times: user 17.7 s, sys: 4.9 s, total: 22.6 s
Wall time: 22.6 s
sage: time T2b = T2a.sage_matrix_over_ZZ(sparse=True)
CPU times: user 26.2 s, sys: 83 ms, total: 26.2 s
Wall time: 26.3 s
sage: del T2a
sage: time T2c = C.sparse_hecke_matrix(2)
CPU times: user 1min 10s, sys: 548 ms, total: 1min 11s
Wall time: 1min 11s
sage: time T2b == T2c # True but not instant for sparse matrices
CPU times: user 11 s, sys: 650 ms, total: 11.7 s
Wall time: 11.7 s
True

The regression (48.9s to 71.7s) is annoying; the bottleneck is the line

        ans = M(entries=d)

where M is the matrix space of sparse matrices and d is the dict mapping coordinate pairs to matrix entries. I had the thought that the matrix constructor must be copying the dict (as is necessary generically) even though in this case it's not required, but I don't think that's the issue:

sage: time dd = copy(d) #too short
CPU times: user 960 ms, sys: 73 ms, total: 1.03 s
Wall time: 1.03 s
sage: time dd = deepcopy(d) #too long
CPU times: user 2min 21s, sys: 1.7 s, total: 2min 23s
Wall time: 2min 23s

On the memory side (crudely measured by watching top), the computation of T2b via T2a peaks at 5.9g whereas the computation of T2c peaks at 1.6g, so at least that went according to plan.

And in my intended use case I work modulo some prime, and this is a big win:

sage: time T2d = C.sparse_hecke_matrix(2, base_ring=GF(2))
CPU times: user 15.3 s, sys: 317 ms, total: 15.7 s
Wall time: 15.7 s
sage: time T2e = C.sparse_hecke_matrix(2, base_ring=GF(next_prime(2^30)))
CPU times: user 24.3 s, sys: 607 ms, total: 25 s
Wall time: 25 s

so I'm taken care of.


New commits:

8075a0c  Generate sparse Hecke matrix without dense intermediary
0bcc834  Add base_ring option to sparse_hecke_matrix, use sparse constructor
bd5a8c5  Docstring corrections
bcd410a  More minor edits


kedlaya commented Aug 25, 2016

Commit: bcd410a


kedlaya commented Aug 26, 2016

comment:5

I should add that I have been running this code since I posted it and am observing a memory leak somewhere. Specifically, I've been running this code snippet (for various values of i):

def zero_eigenvalue_multiplicity(m): #taken from trac #20788
    n = 0
    m1 = m
    r = m.dimensions()[0]
    while True:
        m2 = m1.extended_echelon_form(subdivide=True)
        t = r - m2.subdivisions()[0][0]
        n += t
        if t == 0 or t == r: return n
        m3 = m2.subdivision(0,0)
        m4 = m2.submatrix(0,r,r,r)
        m5 = m3 * m4.inverse()
        m1 = m5.submatrix(0,0,r-t,r-t)
        r -= t

for n in range((i-5)*1000+1, i*1000, 2):
    C = CremonaModularSymbols(n, cuspidal=True, sign=-1)
    T2 = C.sparse_hecke_matrix(2, base_ring=GF(2)).dense_matrix()
    t = (n, zero_eigenvalue_multiplicity(T2), zero_eigenvalue_multiplicity(T2+1))

and observing (via top) memory usage growing from 200-300mb for the initial stages to over 10gb later in the loop. My guess is that there is no serious memory leak in eclib, and I wouldn't suspect one in m4ri either; but I'm having trouble coming up with other options, and I'm not up to speed with valgrind enough to try checking that way.

@JohnCremona

comment:6

When eclib was first put into Sage back in about 2007 it was subjected to rigorous valgrinding by Michael Abshoff, which revealed several issues which were then fixed (with some effort!). But there has been considerable change since then, so I should do that again.

I have made this an Issue for eclib: JohnCremona/eclib#18


kedlaya commented Aug 29, 2016

comment:7

Update: I can reproduce the memory leak behavior by simply repeating the code from #20788 (as reported therein). That is not to say that running valgrind again on eclib wouldn't be advisable, but I am not specifically claiming a memory leak in the computation of Hecke matrices (though if you do find one, it would certainly help me for it to be fixed).


kedlaya commented Aug 30, 2016

comment:8

Oh, now this is bad news.

sage: C = CremonaModularSymbols(45, cuspidal=True,sign=-1)
sage: T2a = C.hecke_matrix(2).sage_matrix_over_ZZ()
sage: print T2a
[-1 -1 -1]
[ 0  1  2]
[ 0  0 -1] 
sage: T2b = C.sparse_hecke_matrix(2)
sage: print T2b
[-1 -1  0]
[ 0  1  2]
[ 0  0 -3]

This is only an issue when cuspidal=True: the eclib function heckeop takes the projection onto the cuspidal subspace into account but heckeop_col does not.

It looks like s_heckeop does account for the cuspidal projection, so switching to that should fix this problem.

@JohnCremona

comment:9

Apologies for the inconsistencies. There must be quite a few methods in this homspace class which are no longer used in the main programs, and whose implementations have therefore not been kept consistent or tested. Unlike Sage, where every single method has its own test, I just have high-level test programs, so there must be a lot of code which is no longer tested: code I think of as obsolete but have not actually deleted. The number of people who have looked at this code in any detail is very small!

sagetrac-git mannequin commented Aug 30, 2016

Changed commit from bcd410a to 6fba176


sagetrac-git mannequin commented Aug 30, 2016

Branch pushed to git repo; I updated commit sha1. New commits:

6fba176  Use sparse Hecke matrix from eclib, correct results for cuspidal subspace


kedlaya commented Aug 30, 2016

comment:11

I went ahead and switched over to s_heckeop and this indeed resolves the issue. I am still doing the silly thing of converting a sparse vector into a dense vector before transferring it into Python; that's because I'm not familiar enough with the C++-Python interaction to know how to correctly wrap the functions that provide direct access to the underlying dict-like structure of a sparse vector.
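The "silly" intermediate step described above, sketched in pure Python for clarity (the real code scans an eclib dense vec in Cython; names here are illustrative): each dense row is scanned and only its nonzero entries are kept.

```python
def dense_row_to_sparse(row, i):
    """Return {(i, j): value} for the nonzero entries of dense row i.

    Illustrative stand-in for scanning an eclib dense vector: this still
    visits every entry of the row, which is the inefficiency noted below.
    """
    return {(i, j): a for j, a in enumerate(row) if a}

print(dense_row_to_sparse([-1, 0, 2], 0))  # -> {(0, 0): -1, (0, 2): 2}
```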

Timings are similar, but now on a completely unloaded machine, so more indicative than before:

sage: time C = CremonaModularSymbols(400001, sign=-1)
CPU times: user 25.4 s, sys: 95 ms, total: 25.5 s
Wall time: 25.5 s
sage: time T2a = C.hecke_matrix(2)
CPU times: user 13.7 s, sys: 4.08 s, total: 17.8 s
Wall time: 17.8 s
sage: time T2b = T2a.sage_matrix_over_ZZ(sparse=True)
CPU times: user 25 s, sys: 59 ms, total: 25.1 s
Wall time: 25.1 s
sage: time T2c = C.sparse_hecke_matrix(2)
CPU times: user 1min 2s, sys: 476 ms, total: 1min 3s
Wall time: 1min 3s
sage: time T2b == T2c
CPU times: user 9.48 s, sys: 633 ms, total: 10.1 s
Wall time: 10.1 s
True
sage: time T2c = C.sparse_hecke_matrix(2, base_ring=GF(2))
CPU times: user 17.4 s, sys: 328 ms, total: 17.7 s
Wall time: 17.7 s


kedlaya commented Aug 22, 2017

Author: Kiran Kedlaya


kedlaya commented Oct 2, 2017

comment:13

Note to myself: check to see whether the resolution of #22164 has had any effect on the memory leak reported above.


sagetrac-git mannequin commented Oct 4, 2017

Changed commit from 6fba176 to 05e52f4


sagetrac-git mannequin commented Oct 4, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

0c0025a  Merge branch 'u/kedlaya/make_hecke_operators_not_blow_up_the_memory' of git://trac.sagemath.org/sage into 21303
05e52f4  Fix doctests for py3 print


jdemeyer commented Oct 4, 2017

comment:15

This syntax is deprecated and shouldn't be used anymore:

for j from 1 <= j <= n:

Cython will generate efficient code for range() if the variables are of some C type (which is the case here since you declared cdef long j).
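A hedged sketch of the replacement (the declarations mirror those mentioned above; the loop body is a stand-in):

```cython
# Deprecated form:
#     for j from 1 <= j <= n:
#         ...
# Modern equivalent; with cdef long j, Cython emits the same plain C loop:
cdef long j, n = 10, total = 0
for j in range(1, n + 1):
    total += j
```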


jdemeyer commented Oct 4, 2017

comment:16

You should not put Python code inside sig_on()/sig_off() blocks.

There are two (not mutually exclusive) cases:

(A) If there is any particular non-Python statement that takes a long time or might generate signals, put sig_on()/sig_off() around that particular statement.

(B) If you just want to interrupt the loop, put sig_check() inside that loop.
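A minimal sketch of pattern (B), assuming a conversion loop like the one in this patch; n comes from the surrounding function and process_row is a hypothetical stand-in for the per-row work:

```cython
from cysignals.signals cimport sig_check

cdef long i
for i in range(n):
    sig_check()        # lets Ctrl-C interrupt between iterations
    process_row(i)     # Python-level work stays outside sig_on()/sig_off()
```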


jdemeyer commented Oct 4, 2017

comment:17

In the eclib.pxd file, you have a duplicate

ctypedef int scalar

It should be declared just once.


jdemeyer commented Oct 4, 2017

comment:18

(NOTE: I just checked the Cython programming issues, I didn't actually try to understand the code; I assume that John can do that)

@JohnCremona

comment:19

I looked at the code. Congratulations for being the first person in history other than me and Luis Figueiredo (a student of Taylor in the 1990s) to read my sparse vector / matrix code!

What you do is go through the rows, extract the i'th row as a sparse vector, convert it to a dense vector, and then go through its entries, storing the nonzero ones in the new Sage sparse matrix. This is inefficient, as you look at all m*n entries (if the matrix is m x n). It would be better to extract just the nonzero entries in the first place. I can try to write that, or help someone else, but before I do I need to know whether Cython will have access to the "protected" (in C++ terms), i.e. semi-private, data components of the C++ sparse matrix. If so I can write the necessary lines.

@JohnCremona
Copy link
Member

comment:21

That makes it harder. The thing is an array of "rows" (one for each actual row, even if the row is zero). We know how many from M.nrows(). For the i'th row we store the number of nonzero entries, the list of columns they are in, and the list of the entries. The only difficulty is in getting the 0/1 rebasing correct, since C, like Python, counts from 0 but my user interface counts from 1...

I can see that providing the interface needed will require adding to eclib. Sorry. Meanwhile it would still be simpler to extract just the nonzero entries from the sparse vector which we already extract for each row. But even that requires access to protected members of the svec class (which are essentially dicts containing (index, entry) pairs). So that leaves this:

        for i in range(n):
            for j in range(n):
                Mij = M.elem(i+1,j+1)
                if Mij:
                    d[(i, j)] = Mij

I don't expect that to be any better than the current version though, since for each i,j the elem() method will do a linear search along the i'th row to see if the j'th entry is present.

Sorry not to have a low-level interface!

@JohnCremona

comment:22

A possible low-level interface could work like a Python iterator, delivering triples (i,j,M[i,j]) with some signal at the end, such as (0,0,0). Without doing a lot of work, that could easily be provided as a single 3xN matrix, where N is one more than the number of nonzero entries. Then the user can loop over that. But such a change to eclib is for the future, and should not delay this.
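A hedged sketch of how a caller might consume such a stream of triples (names are hypothetical; eclib does not currently provide this interface). Since eclib indices are 1-based, (0,0,0) can safely serve as an end marker, and the consumer rebases to 0-based Sage indexing.

```python
def triples_to_dict(triples):
    """Collect (i, j, M[i,j]) triples into a 0-based sparse dict.

    Stops at the (0, 0, 0) sentinel; rebases 1-based eclib indices
    to the 0-based convention Sage's sparse constructor expects.
    """
    d = {}
    for i, j, a in triples:
        if (i, j, a) == (0, 0, 0):  # end-of-data marker
            break
        d[(i - 1, j - 1)] = a
    return d

triples = [(1, 1, -1), (1, 2, -1), (2, 3, 2), (0, 0, 0)]
print(triples_to_dict(triples))  # -> {(0, 0): -1, (0, 1): -1, (1, 2): 2}
```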


sagetrac-git mannequin commented Oct 5, 2017

Changed commit from 05e52f4 to 68ba41c


sagetrac-git mannequin commented Oct 5, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

68ba41c  Corrected some Cython issues


kedlaya commented Oct 5, 2017

comment:24

I fixed the Cython issues (I think).

That said, svec.h does appear to include some access functions:

  // functions to enable iteration over the entries without direct
  // access to the entries map:
  map<int,scalar>::const_iterator begin() const {return entries.begin();}
  map<int,scalar>::const_iterator end() const {return entries.end();}
  map<int,scalar>::iterator begin() {return entries.begin();}
  map<int,scalar>::iterator end() {return entries.end();}
  void erase(int i);  // erases v[i]; error if not set
  int first_index() const {return entries.upper_bound(0)->first;}
  std::set<int> support() const;

that should avoid creating a dense vector; but I need to learn a bit more about Cython to figure out how to exploit those.


kedlaya commented Oct 5, 2017

Changed upstream from Not yet reported upstream; Will do shortly. to None of the above - read trac for reasoning.


jdemeyer commented Oct 5, 2017

comment:25

Replying to @kedlaya:

I need to learn a bit more about Cython to figure out how to exploit those.

Cython does support that (I'm assuming that map means std::map):

from libcpp.map cimport map

cdef extern from "...":
    cdef cppclass ...:
        map[int, scalar].iterator begin()
        map[int, scalar].iterator end()


sagetrac-git mannequin commented Oct 5, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

981edb0  Attempt to directly access svec


sagetrac-git mannequin commented Oct 5, 2017

Changed commit from 68ba41c to 981edb0


kedlaya commented Oct 5, 2017

comment:27

I tried to use the iterator following a model in the Cython documentation, but...:

[sagelib-8.1.beta6] ...
[sagelib-8.1.beta6]         cdef smat M = self.H.s_heckeop(p, dual, verbose)
[sagelib-8.1.beta6]         sig_off()
[sagelib-8.1.beta6]         for i in range(n):
[sagelib-8.1.beta6]             sv = M.row(i+1)
[sagelib-8.1.beta6]             iter = sv.begin()
[sagelib-8.1.beta6]             while iter != sv.end():
[sagelib-8.1.beta6]                       ^
[sagelib-8.1.beta6] ------------------------------------------------------------
[sagelib-8.1.beta6]
[sagelib-8.1.beta6] sage/libs/eclib/homspace.pyx:284:23: Invalid types for '!=' (<error>, iterator)


jdemeyer commented Oct 6, 2017

comment:28

Next time, please quote the complete error message:

[sagelib-8.1.beta7] Error compiling Cython file:
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] ...
[sagelib-8.1.beta7]         cdef long n = self.dimension()
[sagelib-8.1.beta7]         cdef long i=0
[sagelib-8.1.beta7]         cdef long j=0
[sagelib-8.1.beta7]         cdef vec v
[sagelib-8.1.beta7]         cdef svec sv
[sagelib-8.1.beta7]         cdef map[int, scalar].iterator iter
[sagelib-8.1.beta7]             ^
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] sage/libs/eclib/homspace.pyx:276:13: 'map' is not a type identifier
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] Error compiling Cython file:
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] ...
[sagelib-8.1.beta7]         cdef smat M = self.H.s_heckeop(p, dual, verbose)
[sagelib-8.1.beta7]         sig_off()
[sagelib-8.1.beta7]         for i in range(n):
[sagelib-8.1.beta7]             sv = M.row(i+1)
[sagelib-8.1.beta7]             iter = sv.begin()
[sagelib-8.1.beta7]             while iter != sv.end():
[sagelib-8.1.beta7]                       ^
[sagelib-8.1.beta7] ------------------------------------------------------------
[sagelib-8.1.beta7] 
[sagelib-8.1.beta7] sage/libs/eclib/homspace.pyx:284:23: Invalid types for '!=' (<error>, iterator)

You forgot the line: from libcpp.map cimport map


jdemeyer commented Oct 6, 2017

comment:29

In fact, Cython does type inference, so you could also just remove the line

cdef map[int, scalar].iterator iter


sagetrac-git mannequin commented Oct 6, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

35c2ee5  Merge branch 'develop' of git://trac.sagemath.org/sage into t/21303/make_hecke_operators_not_blow_up_the_memory
d257eab  Implement map iteration, correctly this time


sagetrac-git mannequin commented Oct 6, 2017

Changed commit from 981edb0 to d257eab


kedlaya commented Oct 6, 2017

comment:31

OK, I think I now have a working interface that avoids instantiating any dense vectors.


roed314 commented Oct 27, 2017

comment:33

I made some trivial fixes. Looks good to me.


New commits:

6c3bac5  Trivial fixes


roed314 commented Oct 27, 2017

Reviewer: David Roe


roed314 commented Oct 27, 2017

Changed commit from d257eab to 6c3bac5


vbraun commented Oct 30, 2017

Changed branch from u/roed/make_hecke_operators_not_blow_up_the_memory to 6c3bac5
