Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infrastructure for conversions of Sage number types for external Cython/Python packages (fpylll, pypolymake, ...) #21725

Closed
mkoeppe opened this issue Oct 19, 2016 · 42 comments

Comments

@mkoeppe
Copy link
Member

mkoeppe commented Oct 19, 2016

There is a current trend to develop Cython libraries as separate packages, rather than as part of Sage.

Many of these libraries take numbers of various kinds as input or produce them as output.

It would be nice if these Cython libraries could do conversions from/to Sage number types. Otherwise, on the Sage side we would have to provide a separate wrapper around the Cython wrapper.

Currently, this leads to

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

CC: @malb @jdemeyer @videlec @infinity0 @mezzarobba @sagetrac-tmonteil @dimpase @slel @defeo @w-bruns

Component: interfaces

Reviewer: Jeroen Demeyer

Issue created by migration from https://trac.sagemath.org/ticket/21725

@mkoeppe mkoeppe added this to the sage-7.5 milestone Oct 19, 2016
@jdemeyer
Copy link

Replying to @mkoeppe:

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

That's not going to happen.

@jdemeyer jdemeyer removed this from the sage-7.5 milestone Oct 19, 2016
@mezzarobba
Copy link
Member

comment:3

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location? This would at least make it possible to exchange multiple-precision numbers with sage without build-depending on it.

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 19, 2016

comment:4

Replying to @jdemeyer:

Replying to @mkoeppe:

We need to develop shared infrastructure (a new lightweight library, which does NOT depend on sagelib) for this.

That's not going to happen.

How do you know?

@jdemeyer
Copy link

comment:6

I mean, how could it ever be possible to convert something to Sage types (I assume you mean with an efficient C interface) without depending on Sage?

@jdemeyer
Copy link

comment:7

Replying to @mezzarobba:

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location?

Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a PyCapsule?

To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?

@jdemeyer
Copy link

Replying to @mkoeppe:

  • duplication of effort (for example, PyPolymake and its GMP wrappers)

Can you elaborate on this?

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 19, 2016

comment:9

For example, it would not seem so hard to provide a C library with a minimal interface such as

int mpz_from_python_integer(mpz_t target, PyObject *object)

that fills target with a GMP integer converted from object
and returns an error code if it can't.

By default (in plain Python), it would only deal with Python integers.

But libraries that provide new integer types such as sagelib can hook into this, for example by defining a method or any other of Python's dynamic mechanisms.

For the opposite conversion, for converting to the Python/Sage world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 19, 2016

comment:10

Replying to @jdemeyer:

Replying to @mkoeppe:

  • duplication of effort (for example, PyPolymake and its GMP wrappers)

Can you elaborate on this?

Various libraries that I am currently interested in (polymake, normaliz) need Cython wrappers, which should be usable with or without Sage. The interfaces are using integers (and integer vectors, matrices, and more complicated structures) - and one should be able to request Python integers or Sage integers from the Cython wrapper.
However, it should definitely not build-depend on sagelib (because we will need the opposite dependency -- sagelib will have to build-depend on pyPolymake, pyNormaliz etc.)

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 19, 2016

comment:11

Replying to @mkoeppe:

For the opposite conversion, for converting to the Python/Sage world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

For example C library prototype:

PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)

@videlec
Copy link
Contributor

videlec commented Oct 19, 2016

comment:12

Replying to @mkoeppe:

Replying to @mkoeppe:

For the opposite conversion, for converting to the Python/Sage world, one would pass an extra argument to the wrapped function that indicates the type of output (Python integer? Sage integer? ...)

For example C library prototype:

PyObject *mpz_to_python_integer(mpz_t source, PyObject *output_type)

This is exactly the kind of thing I am fighting with in pypolymake. Actually pypolymake could easily provide a function to set the interface

int set_pypolymake_mpz_wrapper((PyObject *) (mpz_t, PyObject *))

I am not sure what is the best approach here but +1 for having a standardized and Sage agnostic way of doing it! I am available to test as many approaches as suggested within pypolymake.

Note that we might want to do that for many basic types (integer, rational, floating point real and complexes, Zmodn, matrices, polynomials, ...). And this explosion might become a nightmare.

@jdemeyer
Copy link

comment:13

Replying to @mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

One thing we really need to consider is efficiency. We already have plenty of slow ways to do what you want (e.g. for integers we can go via Python ints, for other types we could use strings). So I would like a solution which is implementable 100% in Cython (in particular, there should be no Python methods/attributes involved).

@jdemeyer
Copy link

comment:15

There is this thing: https://pypi.python.org/pypi/gmpy2 which provides an interface to GMP and MPFR. It implements wrappers around the GMP/MPFR types, so it could probably be used to pass around mpz_t pointers.

I don't think it was created with this use case in mind, but I still think that this could be a good starting point. We obviously would need to add a Cython interface.

@mezzarobba
Copy link
Member

comment:16

Replying to @jdemeyer:

Replying to @mezzarobba:

If a Python call + a copy per number conversion it not too high a price to pay, then perhaps sage objects backed by a “well-known” C library could provide an interface that would make them read/write an mpz_t (resp. mpq_t, mpfr_t, mpfi_t...) from/to a specified location?

Sage types already have such an interface using Cython. So you want to add something like add using a Python interface? I presume using a PyCapsule?

Yes, something like that. Or perhaps the buffer protocol, if that's more convenient for some reason? Or even by asking the caller to pass an instance of Python extension class of a certain fixed form...

To be honest, I don't really understand the use case for this ticket. Why would you want to convert something to Sage without depending on Sage?

More or less for the same reason as the Debian packagers: to make it possible for Python interfaces to software using GMP, MPFR & co to be used in conjunction with Sage without necessarily being compiled with Sage support. For example, I am involved in a research project for which we developed a Python interface to Sollya. In most cases, this interface is used to call Sollya from pure Python programs. In this mode, all computations with multiple-precision types (Sollya is based on MPFI) are done by the Sollya library and stay there. However, occasionally, it is also very useful to write code that uses both Sollya and Sage, and passes MPFI intervals back and forth. Currently, in order to do that, we need either to somehow encode these intervals as Python objects, or to build the interface in a special mode with Sage support. Now, the Python-Sollya interface is semi-private code with about four users, so that's not much of an issue for us. But if we were to clean it up and distribute it, supporting both modes would cause all kinds of headaches for no good reason.

I don't think it is reasonable to move the Sage number types to an external package, if only because the Sage library is full of circular dependencies between modules. (And I don't see how to have conversions that don't go through Python at all without doing something like that...) However, I don't see the problem with trying to standardize a way to exchange, say, GMP, MPFR, MPFI, and perhaps FLINT/Arb objects between Python libraries (as originally suggested), or at least having an ad hoc solution to import/export these kinds of objects to/from Sage (as suggested in my comment).

@jdemeyer
Copy link

comment:18

I am going to add a gmpy2 optional package to check whether we can do something with it: #21727

@jdemeyer
Copy link

comment:19

I have been looking at gmpy2 for the last hour or so. I think it could work to use the gmpy2 types as intermediates for the proposed conversions.

The big issue is that gmpy2 currently does not expose a C or Cython interface. So either we need to work with upstream to implement that or we need to fork gmpy2.

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 19, 2016

comment:21

Replying to @jdemeyer:

Replying to @mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.

Of course, we should be making a Cython library with corresponding Cython prototypes.

@jdemeyer
Copy link

comment:22

Replying to @mkoeppe:

Replying to @jdemeyer:

Replying to @mkoeppe:

For example, it would not seem so hard to provide a C library

I hope that you mean a Python module and not a C library. Libraries are a pain in Python.

The C prototypes that I wrote are an illustration of what I would do if we were making the Python wrappers by hand instead of using Cython.

Of course, we should be making a Cython library with corresponding Cython prototypes.

And does it look like a good idea to use gmpy2 for that? It already implements thin wrappers around the GMP types, which is precisely what we need.

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 20, 2016

comment:23

Replying to @jdemeyer:

Replying to @mkoeppe:

Of course, we should be making a Cython library with corresponding Cython prototypes.

And does it look like a good idea to use gmpy2 for that? It already implements thin wrappers around the GMP types, which is precisely what we need.

I'll take a look at gmpy2 soon to find out.

@mkoeppe

This comment has been minimized.

@jdemeyer
Copy link

comment:25

My idea would be:

  • Add an __mpz__ method to various types. This will be used to convert an "arbitrary" object to a gmpy2 integer (which is just a thin wrapper around an mpz_t). This conversion will not be very fast, but probably fast enough for most use cases.

  • For the cases where we have complete control over the types, we could use Cython code to construct a gmpy2 integer using the C interface.

  • For conversion from gmpy2 to a type t, we just need to add some isinstance(x, gmpy2.mpz) branches in t.__init__.

As I said before, this would require non-trivial (but backwards compatible) changes to gmpy2.

@jdemeyer
Copy link

comment:26

Replying to @mkoeppe:

For example, it would not seem so hard to provide a C library with a minimal interface such as

int mpz_from_python_integer(mpz_t target, PyObject *object)

that fills target with a GMP integer converted from object
and returns an error code if it can't.

With my idea, this function would be just the constructor of the gmpy2.mpz type.

@videlec
Copy link
Contributor

videlec commented Oct 21, 2016

comment:28

Replying to @jdemeyer:

My idea would be:

  • Add an __mpz__ method to various types. This will be used to convert an "arbitrary" object to a gmpy2 integer (which is just a thin wrapper around an mpz_t). This conversion will not be very fast, but probably fast enough for most use cases.

  • For the cases where we have complete control over the types, we could use Cython code to construct a gmpy2 integer using the C interface.

  • For conversion from gmpy2 to a type t, we just need to add some isinstance(x, gmpy2.mpz) branches in t.__init__.

As I said before, this would require non-trivial (but backwards compatible) changes to gmpy2.

Why do you think it would be better to go through Python wrappers rather than direct C function calls? Note that such situation would create a dependency to gmpy2 for many projects. On the plus side, flint and arb have python-flint.

That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.

@jdemeyer
Copy link

comment:29

Replying to @videlec:

Why do you think it would be better to go through Python wrappers rather than direct C function calls?

I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around mpz_t pointers but you cannot literally do that in Python: you need to wrap the mpz_t pointer in some Python object. And that's precisely what gmpy2 does.

Let me also point out that gmpy2 does object caching (like Sage Integer), such that object creation/destruction can be very fast.

Note that such situation would create a dependency to gmpy2 for many projects.

You will need a dependency on something anyway. This feature needs to be implemented somewhere (not in thin air) and you will need to depend on that.

That being said, what would be the integration of these wrappers in the Sage ecosystem? For example with respect to coercion, would it make sense to have Sage types inheriting from these wrappers, etc.

I see those gmpy2 types purely as a way of passing around data. There should be a conversion from/to those types, but nothing more than that. Inheriting from those types wouldn't work because that's incompatible with the Sage coercion model.

We could still support those types in the Sage coercion model, like we currently do for the standard Python types like int. But that's not a priority.

@jdemeyer
Copy link

comment:30

Replying to @videlec:

On the plus side, flint and arb have python-flint.

I don't really understand what you mean with that comment. Are you saying that we could use this instead of gmpy2? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is that gmpy2 looks more optimized for speed than python-flint.

@videlec
Copy link
Contributor

videlec commented Oct 23, 2016

comment:31

Replying to @jdemeyer:

Replying to @videlec:

Why do you think it would be better to go through Python wrappers rather than direct C function calls?

I just don't see how one could implement that if you don't want build-time dependencies and you don't want Python objects to be involved. You need to pass around mpz_t pointers but you cannot literally do that in Python: you need to wrap the mpz_t pointer in some Python object. And that's precisely what gmpy2 does.

Let me also point out that gmpy2 does object caching (like Sage Integer), such that object creation/destruction can be very fast.

This is not exactly what I need for pypolymake. This library is just a Python interface to polymake (that is currently a mix of C++ and Perl). When the answer to some method is an integer I want the possibility to return a Sage integer (if it is used from Sage), a gmpy2 integer (if it is used from Python with gmpy2 available) or a Python integer.

This is the kind of template I have

cpdef my_integer_question():
    cdef mpz_t ans
    my_polymake_function(ans)
    return mpz_wrapper(ans)

And I want this mpz_wrapper parametrizable (on module loading or by the user). This is currently working but not very clean.

With your proposal, this function will have to go through gmpy2 and then call a (possibly pure Python) conversion gmpy2 -> whatever.

cpdef my_integer_question():
    cdef mpz_t ans
    my_polymake_function(ans)
    my_gmpy2 = mpz_wrapper(ans)
    return gmpy2_converter(my_gmpy2)

I am not sure I prefer the gmpy2 version.

@videlec
Copy link
Contributor

videlec commented Oct 23, 2016

comment:32

Replying to @jdemeyer:

Replying to @videlec:

On the plus side, flint and arb have python-flint.

I don't really understand what you mean with that comment. Are you saying that we could use this instead of gmpy2? One issue is the fact that GMP/MPIR types are more "standard" than the FLINT types. A second issue is that gmpy2 looks more optimized for speed than python-flint.

This is a follow-up comment: we also want to support flint and arb conversions and there are already small Python wrappers for that purpose. It is disjoint from the gmp problem but it was worth mentioning to help taking a decision.

@jdemeyer
Copy link

comment:33

Replying to @videlec:

With your proposal, this function will have to go through gmpy2 and then call a (possibly pure Python) conversion gmpy2 -> whatever.

At least it's possible to do a slow-but-working pure Python conversion.

@jdemeyer
Copy link

comment:34

I think we are talking about two different things:

  1. Implementing a way for Python projects to efficiently communicate integers between them, without build-time dependencies (the fpylll problem).

  2. A customizable make_whatever_integer_from_mpz() function (the pypolymake problem)

These are different problems and I consider this ticket to be about 1. (but the ticket description is not clear what it really is about).

@jdemeyer
Copy link

comment:35

And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable). Ideally, this would be added to some existing project because it seems a bit overkill to create a separate Python project for just those 10 lines.

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 27, 2016

comment:37

Replying to @jdemeyer:

And actually, 2. is totally trivial to implement in like 10 lines of Cython code (you just need to store a pointer to a conversion function in a global variable).

I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.

Also fpylll's current solution, namely making Sage integers instead of Python integers whenever it finds sage.all in the system, seems like a poor design decision as well. If another Python library uses fpylll, it may break when fpylll decides to use Sage integers.

@jdemeyer
Copy link

comment:38

Replying to @mkoeppe:

I would not like to see Cython libraries to use global variables for that; it seems a short-sighted design decision. If library A does that, but then libraries B and C use library A, they may need library A to use two different output conversions.

I actually think that a global variable is the whole point of the requested feature. You want all libraries to use a common given output format requested by the user (not other libraries).

@mkoeppe
Copy link
Member Author

mkoeppe commented Oct 27, 2016

comment:39

It's a cleaner design to pass an extra argument around.

In my experience, global variables never solve a problem but only create new ones.

@jdemeyer
Copy link

comment:40

Replying to @mkoeppe:

It's a cleaner design to pass an extra argument around.

Or just return something generic (like a gmpy2 instance) and let the user code convert it if needed.

In my experience, global variables never solve a problem but only create new ones.

Vincent is the one who asked for this feature, so I'll let him discuss it with you.

Anyway, I would still like to know what you think of my idea to use gmpy2 as "de facto" standard to pass around integers.

@jdemeyer
Copy link

jdemeyer commented Mar 1, 2017

comment:44

Given the unclear ticket description and the lack of interest, I suggest to close this as invalid.

I am not saying that there is nothing to be done here, just that this ticket is too vague. It seems that everybody has a different idea about what it means.

If you want some feature in the spirit of this ticket, please open a new ticket, clearly stating what you want.

@mkoeppe
Copy link
Member Author

mkoeppe commented Mar 1, 2017

comment:45

Uh, we are working on it. Greeting from Sage Days.

@jdemeyer
Copy link

jdemeyer commented Mar 1, 2017

comment:47

Replying to @mkoeppe:

Uh, we are working on it.

Good! I am looking forward to your update of the ticket description (I hope you have read [comment:44] and understand why I proposed to close this ticket).

@jdemeyer
Copy link

jdemeyer commented Mar 8, 2017

comment:48

ping

@videlec
Copy link
Contributor

videlec commented Mar 8, 2017

comment:49

These are two weeks Sage days. Why are you so picky about this ticket?

@jdemeyer
Copy link

jdemeyer commented Mar 9, 2017

comment:50

Does it really take 2 weeks to figure out what the ticket is about?

I'm picky about this one because I feel like I might be interested and I could help if only somebody would clarify what this ticket is about.

In reality, I guess that this ticket is really about several things and each of those things should get its own ticket.

@dimpase
Copy link
Member

dimpase commented Feb 6, 2018

comment:51

Another case at hand is #23024...

@jdemeyer
Copy link

jdemeyer commented Feb 6, 2018

comment:52

11 months and still no explanation.

@jdemeyer
Copy link

jdemeyer commented Feb 6, 2018

Reviewer: Jeroen Demeyer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants