
Provide Hilbert series implementation beyond Singular's limitations #26243

Closed
simon-king-jena opened this issue Sep 11, 2018 · 94 comments

@simon-king-jena
Member

In my work on group cohomology, I was bitten by the fact that Singular limits the coefficients of Hilbert series to 32-bit integers. As a consequence, examples such as the following from #20145 fail:

        sage: n=4;m=11;P = PolynomialRing(QQ,n*m,"x"); x = P.gens(); M = Matrix(n,x)
        sage: I = P.ideal(M.minors(2))
        sage: J = P*[m.lm() for m in I.groebner_basis()]
        sage: J.hilbert_numerator()
        Traceback (most recent call last):
        ...
        RuntimeError: error in Singular function call 'hilb':
        int overflow in hilb 1

It seems that upstream does not want to change Singular to use 64-bit integers by default and arbitrary-precision integers when needed (which is what I would propose). Therefore I am providing an implementation of Hilbert series computation that does not suffer from this limitation.

        sage: from sage.rings.polynomial.hilbert import hilbert_poincare_series
        sage: hilbert_poincare_series(J).numerator()
        120*t^3 + 135*t^2 + 30*t + 1
        sage: hilbert_poincare_series(J).denominator().factor()
        (t - 1)^14

The above is the correct result according to #20145.

My implementation is a bit slower than libsingular's, which is why I do not propose it as a replacement for libsingular's hilb function. However, for big applications it provides a way out.

CC: @w-bruns

Component: commutative algebra

Keywords: Hilbert series

Author: Simon King

Branch/Commit: 5637e07

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/26243

@simon-king-jena
Member Author

@simon-king-jena
Member Author

Commit: 827f579

@simon-king-jena
Member Author

New commits:

1a6c6dc Provide an implementation of Hilbert series without Singular's limitations
827f579 Provide documentation for Hilbert series computation

@simon-king-jena

This comment has been minimized.

@w-bruns
Mannequin

w-bruns mannequin commented Sep 11, 2018

comment:5

The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula. See MR1213858

@simon-king-jena
Member Author

A real-world example for the computation of Hilbert series

@simon-king-jena
Member Author

comment:6

Attachment: hilbert_example.sobj.gz

Replying to @w-bruns:

The residue class ring of the polynomial ring in 11 x 4 variables modulo the 2x2 minors is a normal monoid ring. You can use Normaliz, but there is also an explicit formula. See MR1213858

On #25091, you said that normaliz can only compute the Hilbert series of the integral closure of a monomial ideal. That may not be what is required here.

Anyway: clearly I am not interested in that particular example for its own sake. It is just that: an example, borrowed from #20145. It is suitable as a doctest, as it can be set up quickly; and it is instructive as a documentation example, as it exposes limitations of Singular that make Singular unusable in "real world" examples for which there are no explicit formulae in the literature.

Or would you know a formula for the Hilbert series of the mod-2 cohomology ring of group number 299 of order 256 (numbering according to the small groups library)? That cohomology ring is the smallest potential counterexample to a conjecture of Dave Benson. A small part of the attempt to verify that conjecture is the computation of the Hilbert series of several quotients of that ring. Unfortunately, that "small part" is too big for Singular or Frobby.

The leading ideal of the relation ideal of the above-mentioned cohomology ring is in the attachment hilbert_example.sobj. The computation of its Hilbert series actually shouldn't take very long:

sage: from sage.rings.polynomial.hilbert import hilbert_poincare_series
sage: I,grading = load("/home/king/Projekte/coho/hilbert_example.sobj")
sage: %time hilbert_poincare_series(I,grading)
CPU times: user 270 ms, sys: 2.85 ms, total: 273 ms
Wall time: 324 ms
(-8*t^26 + 24*t^25 - 16*t^24 - 16*t^23 + 24*t^22 - 8*t^21 - t^7 - t^6 + t^5 - t^4 - t^3 - t^2 + t - 1)/(t^13 - 3*t^12 + 2*t^11 + 2*t^10 - 3*t^9 + t^8 - t^5 + 3*t^4 - 2*t^3 - 2*t^2 + 3*t - 1)

However, Singular immediately reports that the example is too big, and Frobby needs to be killed after several minutes. There is no interface to CoCoA, and I don't know how to use the interface to normaliz. I also don't know how to install Macaulay2, so I can't test that either.

@tscrim
Collaborator

tscrim commented Sep 11, 2018

comment:7

All of these comments are about sum_from_list:

Something that should give you better speed would be having it take another argument for the starting point of the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it gives a decent speedup.
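For illustration, here is a minimal sketch of what this suggestion amounts to, assuming the signature sum_from_list(list L, size_t s, size_t l) mentioned further down and that the entries of L are ETuples; this is not the ticket's actual code:

    from sage.rings.polynomial.polydict cimport ETuple

    cdef ETuple sum_from_list(list L, size_t s, size_t l):
        # Add the ETuples in L[s : s + l] (l >= 1 assumed) by divide and
        # conquer, without ever creating a sliced copy of L.
        if l == 1:
            return <ETuple> L[s]
        if l == 2:
            return (<ETuple> L[s]).eadd(<ETuple> L[s + 1])
        return sum_from_list(L, s, l // 2).eadd(sum_from_list(L, s + l // 2, l - l // 2))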

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Some other thoughts:

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting. It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

It is faster to do if not Id: than if len(Id)==0:, even in Cython.

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

Should those ETuple functions be a part of the ETuple class?

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

You can avoid a few additions here:

-    for i from 0 <= i < 2*m._nonzero by 2:
-        degree += m._data[i+1]
+    for i in range(1, 2*m._nonzero, 2):
+        degree += m._data[i]

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.
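For illustration, a minimal sketch of such a split for degree, using the ETuple layout visible in the diff above (index/exponent pairs in _data); the names and the weighting convention are assumptions, not the ticket's actual code:

    from sage.rings.polynomial.polydict cimport ETuple

    cdef size_t degree(ETuple m):
        # total degree: the exponents sit at the odd positions of _data
        cdef size_t i, d = 0
        for i in range(1, 2 * m._nonzero, 2):
            d += m._data[i]
        return d

    cdef size_t degree_weighted(ETuple m, tuple w):
        # weighted degree: each exponent is multiplied by the weight of its variable
        cdef size_t i, d = 0
        for i in range(0, 2 * m._nonzero, 2):
            d += m._data[i + 1] * <size_t> w[m._data[i]]
        return d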

quotient(by_var) knows its return type is a list since interred returns a list.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).
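For illustration, a rough sketch of the struct idea with hypothetical field names (not the ticket's code); every field access compiles to a fixed offset instead of a string-keyed dict lookup, and the node is passed by pointer as in plain C. Note that Python objects cannot be stored in a C struct, which is one reason a cdef class comes up further down:

    cdef struct hilbert_node:
        hilbert_node* back
        hilbert_node* left
        hilbert_node* right
        # the polynomial payload would live here, e.g. as flint data

    cdef size_t count_nodes(hilbert_node* n):
        # passed by pointer; no hashing or string comparison involved
        if n == NULL:
            return 0
        return 1 + count_nodes(n.left) + count_nodes(n.right)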

@simon-king-jena
Member Author

comment:8

Dear Travis,

first of all: Thank you for your thoughtful comments!

Replying to @tscrim:

Something that should give you better speed would be having it take another argument for the starting point of the sum. That way you would not have to create all of the intermediate lists (e.g., L[:l2] and L[l2:]). It will be slightly trickier to code, but not by much, and I would imagine it gives a decent speedup.

Right, I should try that.

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it.
Anyway, here is the resulting C code:

    /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268
 *     if l==1:
 *         m1 = L[0]
 *         return L[0]             # <<<<<<<<<<<<<<
 *     if l==2:
 *         m1,m2=L
 */
    __Pyx_XDECREF(((PyObject *)__pyx_r));
    if (unlikely(__pyx_v_L == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      __PYX_ERR(0, 268, __pyx_L1_error)
    }
    __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_2);
    if (!(likely(((__pyx_t_2) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_2, __pyx_ptype_4sage_5rings_10polynomial_8polydict_ETuple))))) __PYX_ERR(0, 268, __pyx_L1_error)
    __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2);
    __pyx_t_2 = 0;
    goto __pyx_L0;

versus

    /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":268
 *     if l==1:
 *         m1 = L[0]
 *         return <ETuple>L[0]             # <<<<<<<<<<<<<<
 *     if l==2:
 *         m1,m2=L
 */
    __Pyx_XDECREF(((PyObject *)__pyx_r));
    if (unlikely(__pyx_v_L == Py_None)) {
      PyErr_SetString(PyExc_TypeError, "'NoneType' object is not subscriptable");
      __PYX_ERR(0, 268, __pyx_L1_error)
    }
    __pyx_t_2 = __Pyx_GetItemInt_List(__pyx_v_L, 0, long, 1, __Pyx_PyInt_From_long, 1, 0, 1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 268, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_2);
    __Pyx_INCREF(((PyObject *)((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2)));
    __pyx_r = ((struct __pyx_obj_4sage_5rings_10polynomial_8polydict_ETuple *)__pyx_t_2);
    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
    goto __pyx_L0;

Is the second one better?

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting.

Can you give me a pointer on how to use arrays of Cython types?

It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

I also use append and pop. How to do these with arrays?

It is faster to do if not Id: than if len(Id)==0:, even in Cython.

Cool! I thought that, again, they were equivalent. But the code becomes a lot shorter:

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292
 *     cdef int e
 *     # First, the easiest cases:
 *     if len(Id)==0:             # <<<<<<<<<<<<<<
 *         return PR(1)
 *     cdef ETuple m = Id[-1]
 */
  if (unlikely(__pyx_v_Id == Py_None)) {
    PyErr_SetString(PyExc_TypeError, "object of type 'NoneType' has no len()");
    __PYX_ERR(0, 292, __pyx_L1_error)
  }
  __pyx_t_2 = PyList_GET_SIZE(__pyx_v_Id); if (unlikely(__pyx_t_2 == ((Py_ssize_t)-1))) __PYX_ERR(0, 292, __pyx_L1_error)
  __pyx_t_3 = ((__pyx_t_2 == 0) != 0);
  if (__pyx_t_3) {

versus

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":292
 *     cdef int e
 *     # First, the easiest cases:
 *     if not Id:             # <<<<<<<<<<<<<<
 *         return PR(1)
 *     cdef ETuple m = Id[-1]
 */
  __pyx_t_2 = (__pyx_v_Id != Py_None)&&(PyList_GET_SIZE(__pyx_v_Id) != 0);
  __pyx_t_3 = ((!__pyx_t_2) != 0);
  if (__pyx_t_3) {

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?

You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related with another point below.

Should those ETuple functions be a part of the ETuple class?

Probably. Since ETuple is designed for monomials anyway, it makes sense to have monomial divisibility checks there and not here.

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

Almost. I get

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110
 *     cdef int exp1
 *     cdef ETuple result
 *     for i from 0 <= i < 2*m1._nonzero by 2:             # <<<<<<<<<<<<<<
 *         if m1._data[i] == index:
 *             result = <ETuple>m1._new()
 */
  __pyx_t_1 = (2 * __pyx_v_m1->_nonzero);
  for (__pyx_v_i = 0; __pyx_v_i < __pyx_t_1; __pyx_v_i+=2) {
    ...
    }

versus

  /* "_home_king_Sage_git_sage_src_sage_rings_polynomial_hilbert_pyx_0.pyx":110
 *     cdef int exp1
 *     cdef ETuple result
 *     for i in range(0,2*m1._nonzero,2):             # <<<<<<<<<<<<<<
 *         if m1._data[i] == index:
 *             result = <ETuple>m1._new()
 */
  __pyx_t_1 = (2 * __pyx_v_m1->_nonzero);
  __pyx_t_2 = __pyx_t_1;
  for (__pyx_t_3 = 0; __pyx_t_3 < __pyx_t_2; __pyx_t_3+=2) {
    __pyx_v_i = __pyx_t_3;
    ...
    }

So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.

You can avoid a few additions here:

-    for i from 0 <= i < 2*m._nonzero by 2:
-        degree += m._data[i+1]
+    for i in range(1, 2*m._nonzero, 2):
+        degree += m._data[i]

Right, good point!

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.

One thing, though: at some point, one needs to check whether w is None or not. With massive code duplication, I could ensure that this check is done only once, namely in first_hilbert_series. Currently, it is checked frequently, and the frequency wouldn't be reduced by your suggestion of a small code duplication.

quotient(by_var) knows its return type is a list since interred returns a list.

Right.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).

This is related with the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.

But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such data structure is used and from where I can get inspiration?

@simon-king-jena
Member Author

comment:9

Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdefined, a cpdef method wouldn't be less efficient than a cdef method, right?

@w-bruns
Mannequin

w-bruns mannequin commented Sep 12, 2018

comment:10
  1. One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there is the Hilbert series of R/I, where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.

  2. I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.

@simon-king-jena
Member Author

comment:11

Replying to @w-bruns:

  1. One of the bad aspects of Singular is that it is the opposite of being careful in dealing with classes of objects. What is called the Hilbert series of an ideal I there, is the Hilbert series of R/I where R is the ambient polynomial ring. But I is a graded module and has its own Hilbert series. If one wants the Hilbert series of R/I, it should NOT be called the Hilbert series of I.

Right. In most cases, I actually use the wording "Hilbert series of I" and "Poincaré series of R/I", which I guess isn't standard, but at least it makes the difference clear. But I think in my comments above I haven't been sufficiently careful with these wordings.

EDIT: Note that Bigatti uses a notation that coincides with the Hilbert-Poincaré series of R/I. But that's just a notation, not a name. It seems to me that in practical applications, one most commonly has an ideal I and wants to compute the Hilbert-Poincaré series of R/I, not of "I as a module". Hence, from a practical point of view, it actually makes sense to use a function name such as "hilb" or "hilbert_numerator" and so on for a function that tells something about R/I but takes I as an input.

  2. I don't think one can relate the Hilbert series of a (monomial) ideal and the Hilbert series of its integral closure in a useful way.

That's unfortunate. I really need the Hilbert/Poincaré series of R/I.

@tscrim
Collaborator

tscrim commented Sep 12, 2018

comment:12

Replying to @simon-king-jena:

first of all: Thank you for your thoughtful comments!

No problem. Sorry I am not always able to find the time to review your code.

Replying to @tscrim:

I think this might also give you better C code (and hence, be faster):

    if l == 1:
        return <ETuple> L[0]
    if l == 2:
        return (<ETuple> L[0]).eadd(<ETuple> L[1])

Right, actually the first "if" is nonsense in my code: I assign to m1, but do not return it.
Anyway, here is the resulting C code:
[snip]
Is the second one better?

Yes, you can see that it does not do the type check (which could result in a segfault if the result is not the correct type).

Does L need to be a list (i.e., not an array of ETuple's)? This would mean less type-casting.

Can you give me a pointer on how to use arrays of Cython types?

Right, you can't have an array of cdef classes. Forgot about that. There is a way around that by having an array of pointers to the objects:

https://stackoverflow.com/questions/33851333/cython-how-do-you-create-an-array-of-cdef-class

It is just really annoying to have to do all of the type-casting.

It also seems like you are not using list manipulations other than one sort (which you can most likely just use C qsort on).

I also use append and pop. How to do these with arrays?

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

If you replace Id[-1] with Id[len(Id)-1], you can then remove the bounds check and wrap around from HilbertBaseCase. You probably should remove those checks across the file to get some extra speed.

I don't understand that. What bounds check are you talking about? Are you saying that Id[-1] does a bounds check, whereas Id[len(Id)-1] doesn't?

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

You are probably going to be better off in terms of speed doing the low-level polynomial manipulation directly via flint; that way you have fewer intermediate (Sage) objects and less indirection.

So, I should allocate the underlying flint C types and manipulate these, and only in the very end create a Sage polynomial? I guess that's related with another point below.

Yes, if you want to squeeze out as much speed as possible. It is more work and makes the code a bit more obscure, but it should give you some more speed.
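For illustration, a rough sketch of the kind of low-level flint manipulation meant here; the cimport locations are an assumption (they may differ between Sage versions), and this is not the ticket's actual code:

    from sage.libs.flint.types cimport fmpz_poly_t
    from sage.libs.flint.fmpz_poly cimport (fmpz_poly_init, fmpz_poly_clear,
                                            fmpz_poly_set_coeff_si, fmpz_poly_mul)

    cdef void mul_by_one_minus_t_power(fmpz_poly_t poly, long d):
        # multiply poly in place by (1 - t^d), staying at the C level;
        # a Sage polynomial would be created only at the very end
        cdef fmpz_poly_t factor
        fmpz_poly_init(factor)
        fmpz_poly_set_coeff_si(factor, 0, 1)
        fmpz_poly_set_coeff_si(factor, d, -1)
        fmpz_poly_mul(poly, poly, factor)
        fmpz_poly_clear(factor)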

The for i from 0 <= i < k by m is discouraged (if not semi-deprecated) in favor of the more standard Python for i in range(0, k, m), which gets turned into the same C code AFAIK.

Almost. I get
[snip]
So, in the non-deprecated version there is an additional variable assignment. Actually this is why I changed my original code (which did use in range(...)) into from ... by.

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

You should have two functions quotient_degree and quotient_degree_weighted, where the former does not take w as an input. There is almost no code shared between the two and it makes it easier to understand what is going on where they are called. Same for degree.

Really? I thought it was clearer the other way around: In all cases, we have a computation that involves degree weights, and it is the job of the function to find out if the weights are trivial or not.

It seems to me like they want to be separate functions. Plus you could then require w to be a list in the signature.

One thing, though: At some point, one needs to check whether w is None or not. By having a massive code duplication, I could achieve that this check is only done once, namely in first_hilbert_series. Currently, it is checked frequently, and the frequency wouldn't be reduced with your suggestion of a small code duplication.

I see; I didn't think it was going to be that bad overall. However, it's not a big issue for me the way things are now.

I don't see why AN should be a dict (and hence the first arguments D to a number of your functions). It seems more like you are mimicking a struct with a dict. Using a struct will be much faster because it doesn't have to do string hashing and equality checking; it would be a direct C lookup. For those things you are manipulating in subfunction calls, you can pass it by pointer like you would normally in C (I forget if Cython passes everything by copy or reference by default for C objects).

This is related with the point above: In a struct, I could also more easily use the underlying C data type of flint polynomials.

Yes, that is correct, but the struct could hold onto a Sage polynomial just as well. The reason for doing this is both cleaner and faster code by avoiding the dict.

But just to be clear: I would need to do the memory management myself, wouldn't I? Basically, it is a binary search tree; is there code in Sage (I am sure there is!) where such data structure is used and from where I can get inspiration?

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

@tscrim
Collaborator

tscrim commented Sep 12, 2018

comment:13

Replying to @simon-king-jena:

Just to be sure: If I turn the ETuple functions into methods, they should probably be cpdef, not cdef, right? As long as all instances of ETuple in my code are cdefined, a cpdef method wouldn't be less efficient than a cdef method, right?

No, a cpdef called from a Cython file is just a C function call IIRC. So it should not be any less efficient from within Cython code (although I think it can introduce a bit of extra overhead when doing a Python function call compared to a plain def).

sage: %%cython
....: def foo():
....:     pass
....: cpdef cpfoo():
....:     pass
....: cdef cfoo():
....:     pass
....: def test():
....:     foo()
....: def ctest():
....:     cfoo()
....: def cptest():
....:     cpfoo()
....: cpdef cptestcp():
....:     cpfoo()
....:
sage: %timeit test()
10000000 loops, best of 3: 34.9 ns per loop
sage: %timeit ctest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptest()
10000000 loops, best of 3: 22.9 ns per loop
sage: %timeit cptestcp()
10000000 loops, best of 3: 23.3 ns per loop

(These timings are relatively consistent across multiple runs for me.)

@simon-king-jena
Member Author

comment:14

Just some questions to make sure I understand your suggestions:

Replying to @tscrim:

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bound check is not needed?

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

In that case I should change it.

It seems like they want to separate functions to me. Plus you could then require w to be a list for the signature.

I do require tuple. Would list be better?

Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

Maybe take an easy middle way, and have something like

cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult   # or some flint boilerplate instead of object
    cdef object LeftFHS # or some flint boilerplate instead of object

The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?

@simon-king-jena
Member Author

comment:15

PS, concerning "pool":

I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?

It may actually be an idea to equip ETuple with a pool/kill list. In that way, MPolynomial_polydict would probably suck slightly less, and the code here might also be a little faster.

@simon-king-jena
Member Author

comment:16

Replying to @simon-king-jena:

PS, concerning "pool":

I recall that when I was young, there used to be a utility that automatically equipped a given cdef class with a pool/kill list, similar to what is hard coded for Integer. But I cannot find it now. Has it gone?

Found it: sage.misc.allocator

@simon-king-jena
Member Author

comment:17

Replying to @simon-king-jena:

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

Since I am sure about bounds being correct, I guess I can use <ETuple>PyList_GET_ITEM(...).
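For illustration, a minimal sketch of that idiom (a hypothetical helper; the caller must guarantee both the bounds and that the list only contains ETuples):

    from cpython.list cimport PyList_GET_ITEM
    from sage.rings.polynomial.polydict cimport ETuple

    cdef inline ETuple get_element(list L, size_t i):
        # no bounds check, no wrap-around handling, no type check
        return <ETuple> PyList_GET_ITEM(L, i)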

@tscrim
Collaborator

tscrim commented Sep 12, 2018

comment:18

Replying to @simon-king-jena:

Just some questions to make sure I understand your suggestions:

Replying to @tscrim:

I missed that when going through the code. So then it is just better to use lists. Add lots of <ETuple>'s to avoid as much extra type checking as possible.

So, <ETuple> silences the type check, which is faster but can theoretically result in a segfault, right?

Yes, that is correct. If you want to do an explicit check in there that can raise an error, you can do <ETuple?>. (I believe this is supposed to be different from letting Cython raise an error when not given the correct type, but I'm not sure about that, Jeroen?)
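For illustration, a tiny example of the two casts (hypothetical functions):

    from sage.rings.polynomial.polydict cimport ETuple

    def checked_first(list L):
        # <ETuple?> performs a type test and raises a TypeError on mismatch
        return <ETuple?> L[0]

    def unchecked_first(list L):
        # <ETuple> trusts the caller; a wrong type may crash later
        return <ETuple> L[0]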

Id[-1] is a wrap-around: normal C would just do pointer offset. You can tell Cython to assume that indices passed in behave like C arrays and do not check if the result is negative (wrap-around check) or outside the list (bounds check).

So, when I do L[len(L)-1], Cython will automatically know that a bound check is not needed?

You have to also tell Cython not to do those checks. You can do this at the level of the file by adding # cython: boundscheck=False, wraparound=False as the first line(s) of the file (see, e.g., matrix/args.pyx) or a decorator @cython.boundscheck(False) and @cython.wraparound(False) on the appropriate functions (see, e.g., combinat/combinat_cython.pyx).
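For illustration, a minimal sketch of the decorator form (a hypothetical function; with the checks disabled, the caller must guarantee that Id is non-empty):

    cimport cython
    from sage.rings.polynomial.polydict cimport ETuple

    @cython.boundscheck(False)
    @cython.wraparound(False)
    cdef ETuple last_element(list Id):
        # with both checks off, this indexing is a plain pointer offset
        return <ETuple> Id[len(Id) - 1]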

How to avoid a bound check in sum_from_list(list L, size_t s, size_t l), where I access L[0], L[1], L[s], L[s+l//2] etc.?

See above.

Hmm...curious. Although the compiler will almost certainly optimize those extra variables out anyways.

In that case I should change it.

I bet that will make Jeroen happy. :)

It seems like they want to separate functions to me. Plus you could then require w to be a list for the signature.

I do require tuple. Would list be better?

I couldn't remember and was guessing. tuple is perfectly fine.

Anyway, do you think that I should duplicate the loop in first_hilbert_series, once for the case that w is None and once for the case that it isn't?

I don't see why that one has to be split. It is only degree and quotient_degree that should be split IMO. For example, in HilbertBaseCase you would have

-        for m2 in Id:
-            if m is not m2:
-                Factor *= (1-t**quotient_degree(m2,m,w))
+        if w is None:
+            for m2 in Id:
+                if m is not m2:
+                    Factor *= (1-t**quotient_degree(m2,m))
+        else:
+            for m2 in Id:
+                if m is not m2:
+                    Factor *= (1-t**quotient_degree_graded(m2,m,w))

In some cases, this will result in fewer if checks (unless the compiler is optimizing those out). In others, this will be no worse than before.

Yes, you would have to do some memory management. I'm not sure exactly how much from my relatively quick look, but it should not be too bad.

Maybe go an easy middle way, and have some

cdef class Node:
    cdef Node Back, Left, Right
    cdef object LMult   # or some flint boilerplate instead of object
    cdef object LeftFHS # or some flint boilerplate instead of object

The overhead would be that Python objects will be allocated (unless I use a pool, similar to what Sage does with Integers...), but I guess the allocation is not more than the allocation of a dict, and moreover I wouldn't need to care about memory management. And accessing the cdef attributes of the node will be faster than looking up dictionary items, wouldn't it?

Yes, it should be much faster (at that scale) and require less memory than a dict (hash tables are generally at most 80% full IIRC, but you have to store both the key (which would be a string) and the value). I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

Yes, you can use PyList_GET_ITEM, which I think is even a bit faster.

@simon-king-jena
Member Author

comment:19

Replying to @tscrim:

I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.

@tscrim
Collaborator

tscrim commented Sep 12, 2018

comment:20

Replying to @simon-king-jena:

Replying to @tscrim:

I think a pool is overkill for this as you are building out a tree rather than frequently creating and removing nodes, right? The memory management really only comes into play when you start dealing with the flint objects directly.

I wasn't talking about a pool for Nodes (which just build a tree), but about a pool for ETuple, which I guess will be created and killed (during interreduction) more frequently.

Ah, I see. Yea, that might be good to do. Although the use of the pool could end up having a fair bit of overhead when not having a lot of ETuple's that would be GC'ed. I haven't actually looked though.

@simon-king-jena
Member Author

comment:21

It seems to me that for i in range(s,e,b) really is a bit slower than for i from s<=i<e by b. My main example becomes about 2% slower. See also this little benchmark:

sage: cython("""
....: def test1(size_t m):
....:     cdef size_t i,j
....:     j = 0
....:     for i in range(0,m,2):
....:         j += i
....:     return j
....: def test2(size_t m):
....:     cdef size_t i,j
....:     j = 0
....:     for i from 0<=i<m by 2:
....:         j += 1
....:     return j
....: """)
sage: %timeit test1(10000)
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 3.18 µs per loop
sage: %timeit test1(10000)
100000 loops, best of 3: 3.1 µs per loop
sage: %timeit test2(10000)
The slowest run took 4.26 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.85 µs per loop
sage: %timeit test2(10000)
1000000 loops, best of 3: 1.86 µs per loop

So, if the loop itself is timed, one sees that the for ... from ... by is double the speed of for ... in range(...).

I guess I should use the deprecated syntax and post the above benchmark on cython-users, shouldn't I?

@simon-king-jena
Member Author

comment:22

Ouch. I had a typo in my benchmark (j+=i versus j+=1). Without the typo, the timing of the test functions is basically equal. So, probably you are right that the additional assignments are optimized out by the compiler.

I don't know if the 2% difference in the real world example is significant.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 12, 2018

Changed commit from 827f579 to 7698bb6

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 12, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

7698bb6 Some code optimizations in Hilbert series computation

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 15, 2018

Changed commit from 2d29b9b to 07ff2e1

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 15, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

07ff2e1 Reverting use of get_exp for __getitem__.

@tscrim
Collaborator

tscrim commented Sep 15, 2018

comment:67

Okay, I just reverted using get_exp in the __getitem__. So there is some code duplication, but using get_exp will be faster when you know your input is an int. Thus, I think it is useful to actually have, but I wanted to match the data structure (which is why I changed it from a size_t input). I also saw a few doc/PEP8 things I missed on my previous pass that I added in too. The failing tests from the patchbot now pass for me.

@simon-king-jena
Member Author

comment:69

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials? This, the comment on the failing test, and the "ETuple-free code" in comment:63 indicate that negative exponents are a wrong usage.

So, what could we do about it?

An obvious change (on a new ticket, of course) would be to change ETuple._data from int* to size_t*.

@simon-king-jena
Member Author

comment:70

Replying to @simon-king-jena:

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?

Too bad. ETuple is used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are non-negative, though. The assumption is explicitly documented in some places.

@tscrim
Collaborator

tscrim commented Sep 15, 2018

comment:71

Replying to @simon-king-jena:

Replying to @simon-king-jena:

Am I correct in thinking that ETuple is designed for implementing polynomials, not Laurent polynomials?

Too bad. ETuple is used for Laurent polynomials. In the Hilbert series code, I do assume that the exponents are non-negative, though. The assumption is explicitly documented in some places.

I think the assumption is documented enough where it is used, so it is fine. If someone needs the negatives, then they can change it to, e.g., Py_ssize_t.

@simon-king-jena
Member Author

comment:72

Are the tests passing? Can this be set back to positive review?

@tscrim
Collaborator

tscrim commented Sep 16, 2018

comment:73

Tests are passing for me. From #20145, don't you need another commit?

@tscrim
Collaborator

tscrim commented Sep 16, 2018

comment:74

Replying to @tscrim:

Tests are passing for me. From #20145, don't you need another commit?

Let me rephrase, do you want the last commit on #20145 here?

@simon-king-jena
Member Author

comment:75

Replying to @tscrim:

Replying to @tscrim:

Tests are passing for me. From #20145, don't you need another commit?

Let me rephrase, do you want the last commit on #20145 here?

No, why? I was told occasionally that one should keep tickets small. In this case, there is one ticket for a new implementation of Hilbert series, and one ticket (#20145) for using the new implementation (also adding an implementation of Hilbert polynomial) in existing methods.

Additionally, regarding your question whether I need the two commits from #20145: for my own applications, I just need the commits from here.

@tscrim
Collaborator

tscrim commented Sep 16, 2018

comment:76

Okay, I wasn't sure whether the last commit from #20145 addressed a bug coming from this part or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.

@simon-king-jena
Member Author

comment:77

Replying to @tscrim:

Okay, I wasn't sure if the last commit from #20145 would be considered a bug from this part or not. Thank you for clarifying. So yes, all tests pass for me, and this is a positive review if my last change is good.

No, it wasn't a bug --- at least not a bug here. I thought it was needed to amend the sign in #20145, which I did in the first commit, but in fact it is not needed, and thus I reverted the sign amendment in the second commit.

@tscrim
Collaborator

tscrim commented Sep 17, 2018

comment:78

I see. Thanks for explaining. This is ready to be set back to positive review.

@simon-king-jena
Member Author

comment:79

Thanks!

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 18, 2018

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

7b2ed50 Merge branch 'u/tscrim/hilbert_functions-26243' of git://trac.sagemath.org/sage into u/tscrim/hilbert_functions-26243
5637e07 Fixing INPUT:: to INPUT: in hilbert.pyx.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Sep 18, 2018

Changed commit from 07ff2e1 to 5637e07

@tscrim
Collaborator

tscrim commented Sep 18, 2018

comment:81

A little stupid trivial fix.

@vbraun
Member

vbraun commented Sep 20, 2018

Changed branch from u/tscrim/hilbert_functions-26243 to 5637e07
