Python3 #13

Merged
merged 23 commits into from
Jul 19, 2017

@videlec
Collaborator

videlec commented Apr 29, 2017

This branch is based on top of doctests. I had to use a small hack to make the tests pass, namely passing the option IGNORE_EXCEPTION_DETAIL to doctest. There is something fishy going on here.

@defeo self-requested a review May 12, 2017 14:28
Member

@defeo left a comment

Hello,

I agree with most of the changes. In particular, the Cython macros look ok to me; I have no better idea anyway.

Can you please just address the 4 comments I made?

Examples:

>>> from cypari2.closure import objtoclosure
>>> def pymul(i,j): return i*j
Member

Why this change? I think that was meant explicitly to test lambdas.

Collaborator Author

The behavior of lambda functions differs between Python 2 and Python 3. In Python 3, a lambda function has only one argument.

Member

?? What do you mean? I'm using Python 3.6, and I see no problem with the old test

>>> objtoclosure(lambda i,j: i*j)
(v1,v2,v3,v4,v5)->call_python(v1,v2,v3,v4,v5,140180350752424)
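
For reference, a minimal sketch of the point made in this reply: in Python 3 a lambda accepts several arguments just like a def. What Python 3 did remove (PEP 3113) is tuple unpacking in parameter lists. The name lambda_mul is illustrative only, and objtoclosure is not involved here.

```python
import inspect

def pymul(i, j):
    return i * j

lambda_mul = lambda i, j: i * j

# Both callables expose the same two-parameter signature.
def_params = list(inspect.signature(pymul).parameters)
lambda_params = list(inspect.signature(lambda_mul).parameters)

product = lambda_mul(6, 7)

# Removed in Python 3 (SyntaxError): lambda (i, j): i * j
```

So a two-argument lambda is as valid a test input on Python 3 as the def-based pymul.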

Collaborator Author

Indeed. I do not know what happened in the first place. The lambda test is restored in 418cf40.

5
>>> gen_to_integer(pari("Pol(42)"))
42
>>> gen_to_integer(pari("u"))
Member

So, did you get to the bottom of this? Why was it failing on Travis?

Collaborator Author

No idea. This is still failing on Travis. However, the doctest option IGNORE_EXCEPTION_DETAIL makes this failure disappear.

Member

We may at least want to open an issue for this, or we may forget about it.

Collaborator Author

#14

@@ -589,7 +615,7 @@ cpdef gen_to_python(Gen z):
elif t == t_VEC or t == t_COL:
return [gen_to_python(x) for x in z.python_list()]
elif t == t_VECSMALL:
- return z.python_list_small()
+ return z.python_list_small()
Member

This diff is weird!

Collaborator Author

Indeed! Fixed by d7b7a3e


EXAMPLES::

sage: pari.debugstack() # random
Member

Isn't there a way to leave the example undoctested?

Collaborator Author

Not without a hook into doctest. This is what they did in cypari and what we might want to do... (they also have a hook for 32- vs 64-bit doctests)

Member

All right, we can think about this later.
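
As a side note, the standard library's doctest module has a SKIP directive that leaves a single example unexecuted without any custom hook. Whether it fits the test runner used in this project is not established in this thread. In the sketch below, noisy is a hypothetical stand-in for a function with unpredictable output such as pari.debugstack().

```python
import doctest
import random

def noisy():
    """
    Return a value that changes from run to run.

    >>> noisy()  # doctest: +SKIP
    0.1234
    >>> isinstance(noisy(), float)
    True
    """
    return random.random()

# Build and run the doctest by hand so the result can be inspected.
parser = doctest.DocTestParser()
test = parser.get_doctest(noisy.__doc__, {"noisy": noisy}, "noisy", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

The first example is shown in the documentation but never executed, so its made-up output cannot cause a failure.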

@videlec mentioned this pull request May 16, 2017
@jdemeyer
Collaborator

It's wrong to assume that every string coming from PARI is ASCII. I mean, people should be able to write PARI/GP code like print("Author: Jean-François Mestre").
When in doubt, use sys.getfilesystemencoding() which is what Python uses by default.

I also don't agree with changing <bytes?> to <bytes>: it's an extra safety check.
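
A small sketch of why the ASCII assumption breaks, using the string from the comment above. It assumes the bytes arrive UTF-8 encoded, which is only one possibility; the helper decodes_as is hypothetical and not part of cypari2.

```python
import sys

text = u"Author: Jean-Fran\u00e7ois Mestre"
# Assumption for this sketch: the bytes were produced with UTF-8.
raw = text.encode("utf-8")

def decodes_as(data, encoding):
    """Return True if data can be decoded with the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

ascii_ok = decodes_as(raw, "ascii")
utf8_ok = decodes_as(raw, "utf-8")

# The encoding suggested in the comment as the fallback choice.
fs_encoding = sys.getfilesystemencoding()
```

Here ascii_ok is False because of the single non-ASCII character, while decoding with the encoding that produced the bytes succeeds.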

@jdemeyer
Collaborator

This is also wrong, because you aren't calling str() anymore:

{name} = {name}.encode('ascii') if isinstance({name}, str) else bytes({name})

This should be something like

{name} = str({name})
if PYTHON3:
    {name} = {name}.encode(filesystemencoding)

@videlec
Collaborator Author

videlec commented Jun 7, 2017

Changes

  • removed system.pxi; compile_time_env is now used during cythonization
  • added functions to_bytes and to_string to avoid repetition

Travis looks happy.

@videlec
Collaborator Author

videlec commented Jun 15, 2017

ping!

@jdemeyer
Collaborator

I don't think it is good style to hard-code the Python version number in the .c sources. Better make it a run-time check.

@jdemeyer
Collaborator

Some more docs for to_string would be welcome. What do you mean by "string" in this context? Do you mean the str type (which is bytes in Python 2 and unicode in Python 3)?

If so, it seems more reasonable to have to_bytes and to_unicode functions, where to_string = to_bytes on Python 2 and to_string = to_unicode on Python 3. That would simplify things.

Also: what is acceptable input for to_string? Both bytes and unicode on Python 2 and Python 3?
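
The proposal above could look roughly like the following pure-Python sketch. It is not the code from this PR (which is written in Cython); the function bodies, the encoding default, and the choice to accept both bytes and unicode as input are assumptions made for illustration.

```python
import sys

ENCODING = sys.getfilesystemencoding()
unicode_type = type(u"")  # unicode on Python 2, str on Python 3

def to_bytes(s, encoding=ENCODING):
    """Return s as bytes, encoding unicode input if necessary."""
    if isinstance(s, bytes):
        return s
    if isinstance(s, unicode_type):
        return s.encode(encoding)
    raise TypeError("expected bytes or unicode, got %s" % type(s).__name__)

def to_unicode(s, encoding=ENCODING):
    """Return s as unicode, decoding bytes input if necessary."""
    if isinstance(s, bytes):
        return s.decode(encoding)
    if isinstance(s, unicode_type):
        return s
    raise TypeError("expected bytes or unicode, got %s" % type(s).__name__)

# Run-time choice of the native str type: bytes on Python 2, unicode on 3.
to_string = to_bytes if sys.version_info[0] == 2 else to_unicode
```

With this shape, to_string always returns the native str type, and the Python version is looked up when the module is imported rather than when the sources are generated.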

@videlec
Collaborator Author

videlec commented Jun 16, 2017

@jdemeyer concerning your hard-coded version, what is the point? And what are you referring to exactly? Note that only the major version (2 or 3) is taken into account, and a version compiled for Python 2 will not be compatible with Python 3. Also, this happens during cythonization; it is not hard-coded in the compiled version.

@jdemeyer
Collaborator

concerning your hard-coded version, what is the point?

You make the assumption that the version of Python which is used by Cython equals the version of Python which will use cypari2. There is no need to make that assumption.

Not making this assumption would allow installing cypari2 on systems which do not have Cython installed.

@videlec
Collaborator Author

videlec commented Jun 16, 2017

concerning your hard-coded version, what is the point?

You make the assumption that the version of Python which is used by Cython equals the version of Python which will use cypari2. There is no need to make that assumption.

Agreed. But the target Python version has to be known at compilation time anyway (e.g. __cmp__ should not be there in Python3). How would you pass such a parameter to the setup script?

Not making this assumption would allow installing cypari2 on systems which do not have Cython installed.

Then you would need two versions of the C files. One Python2 compatible and one Python3 compatible.

I don't think this point is relevant to the current PR.

@jdemeyer
Collaborator

e.g. __cmp__ should not be there in Python3

That's for Cython to decide, not for us. If you write def __cmp__ in an extension type, Cython knows to ignore that on Python 3.

Cython tries hard to output .c code that is compatible with both Python 2 and Python 3. Let's not destroy that by hard-coding the Python version number in those conversion functions.

@videlec
Collaborator Author

videlec commented Jun 16, 2017

Funny cython behavior... with

IF PY_MAJOR_VERSION == 2:
    return to_bytes(s)
ELSE:
    return to_unicode(s)

I get the following with Python 2:

Error compiling Cython file:
------------------------------------------------------------
...
    True
    >>> s1 == s2 == s3 == 'hello'
    True
    """
    IF PY_MAJOR_VERSION == 2:
        return to_bytes(s)
                      ^
------------------------------------------------------------    
cypari2/string_utils.pyx:71:23: Cannot convert 'bytes' object to str implicitly. This is not portable to Py3.

and with Python 3:

Error compiling Cython file:
------------------------------------------------------------
...
    True
    """
    IF PY_MAJOR_VERSION == 2:
        return to_bytes(s)
    ELSE:
        return to_unicode(s)
                        ^
------------------------------------------------------------    
cypari2/string_utils.pyx:73:25: Cannot convert Unicode string to 'str' implicitly. This is not portable and requires explicit encoding.
Traceback (most recent call last):

@videlec
Collaborator Author

videlec commented Jun 16, 2017

e.g. __cmp__ should not be there in Python3

That's for Cython to decide, not for us. If you write def __cmp__ in an extension type, Cython knows to ignore that on Python 3.

Cython tries hard to output .c code that is compatible with both Python 2 and Python 3. Let's not destroy that by hard-coding the Python version number in those conversion functions.

Let me repeat that it is not hardcoded anywhere. There is a constant Cython macro that enables/disables some portion of the code... and it is trivial to make it configurable.

@videlec
Collaborator Author

videlec commented Jun 16, 2017

Sadly, the network in Bordeaux is down... so Travis is not happy.

@videlec
Collaborator Author

videlec commented Jun 19, 2017

network back, Travis happy!

Collaborator

@jdemeyer left a comment

I have no more comments about the general approach, but several minor remarks, see review.

>>> gen_to_integer(pari("1 - 2^64")) == -18446744073709551615
True
>>> import sys
>>> if sys.version_info.major == 2:
Collaborator

The same test should be done with int instead of long (and you might as well do it on Python 2 and Python 3).

Collaborator Author

@videlec Jul 10, 2017

see 5b66460

@@ -525,21 +535,21 @@ cpdef gen_to_python(Gen z):
[1, 2, 3]
>>> type(a1)
<... 'list'>
- >>> map(type, a1)
+ >>> list(map(type, a1))
Collaborator

This is ugly. Write [type(x) for x in a1].

Collaborator Author

@videlec Jul 10, 2017

see 3040e65
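
A short sketch of what this review point is about, with a plain list standing in for the result of gen_to_python: on Python 3, map returns a lazy iterator, so the doctest output is no longer a list. A list comprehension gives the same printed result on both Python versions.

```python
a1 = [1, 2, 3]  # stand-in for the value returned by gen_to_python

lazy = map(type, a1)           # a list on Python 2, an iterator on Python 3
types = [type(x) for x in a1]  # always a list

is_list = isinstance(lazy, list)
```

The comprehension avoids both the version-dependent repr and the extra list(...) wrapper.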

[<... 'int'>, <... 'int'>, <... 'int'>]

>>> a2 = gen_to_python(z2); a2
[1, 3.4, [-5, 2], inf]
>>> type(a2)
<... 'list'>
- >>> map(type, a2)
+ >>> list(map(type, a2))
Collaborator

Idem

Collaborator Author

@videlec Jul 10, 2017

also 3040e65

[<... 'int'>, <... 'float'>, <... 'list'>, <... 'float'>]

>>> a3 = gen_to_python(z3); a3
[1, 5.2]
>>> type(a3)
<... 'list'>
- >>> map(type, a3)
+ >>> list(map(type, a3))
Collaborator

Idem

Collaborator Author

@videlec Jul 10, 2017

also 3040e65

@@ -607,6 +617,6 @@ cpdef gen_to_python(Gen z):
else:
return -INFINITY
elif t == t_STR:
- return str(z)
+ return to_string(<bytes> GSTR(g))
Collaborator

Is the <bytes> really needed here? (I haven't checked, maybe it is)

Collaborator Author

It is indeed not needed (see 92f558b)

@@ -38,6 +38,9 @@ AUTHORS:

Collaborator

Given that you rely on explicit conversions in many other places, I think it's better to be explicit everywhere. This means dropping cython: c_string_encoding=default and adding to_string or to_bytes where needed.

Collaborator Author

see b5e5375

>>> long(pari("Mod(2, 7)"))
2L
>>> if sys.version_info.major == 3:
... long = int
Collaborator

+1

cypari2/gen.pyx Outdated
- vstr = kwds.keys() # Variables as Python strings
- t0 = objtogen(kwds.values()) # Replacements
+ vstr = list(kwds.keys()) # Variables as Python strings
+ t0 = objtogen(list(kwds.values())) # Replacements
Collaborator

list() should not be needed here: objtogen can deal with generators.

Collaborator Author

see fa24efb

cdef char s[2]
s[0] = c
s[1] = 0
- sys.stdout.write(s)
+ sys.stdout.write(to_string(<bytes> s))
Collaborator

I don't like this since you are converting bytes to str which sys.stdout immediately needs to convert back to bytes. In Python 3, you can use sys.stdout.buffer for the underlying stream which accepts bytes. So you could do something like:

try:
    stdout_bytes = sys.stdout.buffer
except AttributeError:
    stdout_bytes = sys.stdout
stdout_bytes.write(s)

Collaborator Author

Does not work as is since sys.stdout is sometimes redirected (e.g. doctest). I used

try:
    sys.stdout.buffer.write(s)
except AttributeError:
    sys.stdout.write(to_string(s))

See b4410fc
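
The fallback described in this reply can be reproduced in plain Python. The sketch below is not the Cython code from b4410fc: write_bytes is a hypothetical helper, the UTF-8 default is an assumption, and io.StringIO stands in for the kind of text-only replacement stream that, according to this thread, doctest installs.

```python
import io
import sys

def write_bytes(data, encoding="utf-8"):
    """Write bytes to sys.stdout, decoding only if the stream is text-only."""
    try:
        # A regular Python 3 stdout exposes its binary layer as .buffer.
        sys.stdout.buffer.write(data)
    except AttributeError:
        # Replacement streams such as io.StringIO have no .buffer and
        # accept text only, so decode first.
        sys.stdout.write(data.decode(encoding))

# Simulate a redirected stdout, as a test harness might install.
saved_stdout = sys.stdout
sys.stdout = io.StringIO()
try:
    write_bytes(b"x^2 + 1\n")
    captured = sys.stdout.getvalue()
finally:
    sys.stdout = saved_stdout
```

Because the attribute lookup happens on every call, the function keeps working when sys.stdout is swapped after import, which is the case a cached stdout_bytes would miss.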


print("="*80)
print("Testing {}".format(mod.__name__))
- test = doctest.testmod(mod, optionflags=doctest.ELLIPSIS|doctest.REPORT_NDIFF)
+ test = doctest.testmod(mod, optionflags=doctest.ELLIPSIS|doctest.REPORT_NDIFF|doctest.IGNORE_EXCEPTION_DETAIL)
Collaborator

Is this still needed?

Collaborator Author

It is. In Python 3, PARI errors are printed with the module path:

Traceback (most recent call last):
...
cypari2.handle_error.PariError: MSG

while in Python 2 only the class name is printed:

Traceback (most recent call last):
...
PariError: MSG
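
The effect of the flag can be checked in isolation. In the sketch below, DemoError is a hypothetical stand-in for PariError; its __module__ is set by hand so that Python 3 prints a dotted name, which is an assumption made only to keep the example self-contained.

```python
import doctest

class DemoError(Exception):
    # Pretend the class lives in a package, as PariError does.
    __module__ = "demo_pkg.errors"

def fail():
    raise DemoError("boom")

# Expected output written in the Python 2 style (no module path).
DOCSTRING = """
>>> fail()
Traceback (most recent call last):
...
DemoError: boom
"""

def run(optionflags):
    """Run the doctest above with the given flags and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(DOCSTRING, {"fail": fail}, "demo", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report so the demo stays quiet.
    return runner.run(test, out=lambda text: None)

strict = run(doctest.ELLIPSIS)
lenient = run(doctest.ELLIPSIS | doctest.IGNORE_EXCEPTION_DETAIL)
```

On Python 3 the strict run fails because the actual message starts with demo_pkg.errors.DemoError, while the lenient run passes. Note that the flag also ignores the text after the colon, so it loosens the check on the error message itself.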

Collaborator

@jdemeyer left a comment

A few more comments, all easy to fix.

# avoid string conversion if possible
sys.stdout.buffer.write(s)
except AttributeError:
sys.stdout.write(to_string(s))
Collaborator

You don't need to_string() here. The whole point is that you want to write bytes without conversion.

Collaborator Author

Yes you do. It is perfectly valid to redirect sys.stdout to some other streams that might not support the buffer thing (like the doctest module does). In other words the except is not only intended for Python2. In particular the to_string is mandatory in Python3 (and useless in Python2). Note that to_string does not perform any conversion in Python2.

cypari2/gen.pyx Outdated
@@ -4182,14 +4193,14 @@ cdef class Gen(Gen_auto):
return new_gen(gsubst(self.g, varn(self.g), t0.g))

# Call substvec() using **kwds
- vstr = kwds.keys() # Variables as Python strings
- t0 = objtogen(kwds.values()) # Replacements
+ vstr = iter(kwds.keys()) # Variables as Python strings
Collaborator

Should be kwds.iterkeys()

Collaborator Author

iterkeys does not exist in Python3: kwds.keys() is already an iterator.


import sys
encoding = sys.getfilesystemencoding()
cdef int PY_MAJOR_VERSION = sys.version_info.major
Collaborator

It would be more efficient to make this a compile-time constant with

cdef extern from *:
   int PY_MAJOR_VERSION

Collaborator Author

done in d9efdf8

else:
raise TypeError

cpdef unicode to_unicode(s):
Collaborator

Maybe cpdef inline for efficiency.

encoding = sys.getfilesystemencoding()
cdef int PY_MAJOR_VERSION = sys.version_info.major

cpdef bytes to_bytes(s):
Collaborator

Maybe cpdef inline for efficiency.

Collaborator Author

I thought it was wrong to declare an inline function inside a .pyx that is used in other parts of the code (here in auto_instance.pxi)

@jdemeyer
Collaborator

iterkeys does not exist in Python3:

It does exist in Cython, which is what matters here.

@jdemeyer
Collaborator

It is perfectly valid to redirect sys.stdout to some other streams that might not support the buffer thing (like the doctest module does).

The question is: can it happen that sys.stdout does not support the buffer thing and that it also does not accept bytes?

@jdemeyer
Collaborator

I thought it was wrong to declare an inline function inside a .pyx

It is wrong to declare it as inline in a .pxd file. But it's fine to define an inline function in a .pyx file.

@jdemeyer
Collaborator

inlined to_string

That's not what I meant, but that's actually a good solution too.

@jdemeyer
Collaborator

Not iter(vstr.iterkeys()) but just vstr.iterkeys().

@videlec
Collaborator Author

videlec commented Jul 13, 2017

Not iter(vstr.iterkeys()) but just vstr.iterkeys().

does not work (at least in Python 3)

TypeError: dict_keys object is not an iterator

@videlec
Collaborator Author

videlec commented Jul 13, 2017

It is perfectly valid to redirect sys.stdout to some other streams that might not support the buffer thing (like the doctest module does).

The question is: can it happen that sys.stdout does not support the buffer thing and that it also does not accept bytes?

Removing the to_string, while running the doctests in Python 3 you get hundreds of

Exception ignored in: 'cypari2.pari_instance.python_puts'
TypeError: string argument expected, got 'bytes'

@jdemeyer
Collaborator

Removing the to_string, while running the doctests in Python 3

I see. This is probably a doctest-specific thing.

@jdemeyer
Collaborator

does not work (at least in Python 3)

TypeError: dict_keys object is not an iterator

You're right. I thought that dict.keys() in Python 3 returned the same thing as dict.iterkeys() in Python 2, but apparently that's not the case. You get an iterable but not an iterator.
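
The distinction can be seen in a few lines of plain Python 3. The dictionary contents are made up for the example.

```python
kwds = {"x": 1, "y": 2}
keys = kwds.keys()  # a dict_keys view: iterable, but not an iterator

try:
    next(keys)
    is_iterator = True
except TypeError:
    # next() needs an iterator, and a view does not implement __next__.
    is_iterator = False

# Wrapping the view in iter() yields a real iterator.
first = next(iter(keys))
```

This is why the iter(...) wrapper is needed when the code goes on to call next() on the keys.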

@jdemeyer
Collaborator

OK, this is a mess but I guess it's the best we can do. For me, there are no further changes needed.

@videlec
Collaborator Author

videlec commented Jul 13, 2017

@defeo do you have any comment?

@videlec
Collaborator Author

videlec commented Jul 19, 2017

@defeo Let's say that 6 days of silence means yes. Though it would have been nice to get an answer, either "do without me" or "wait for me" or whatever...

@videlec merged commit dea0639 into master Jul 19, 2017
@videlec deleted the python3 branch July 19, 2017 09:53
@defeo
Member

defeo commented Jul 19, 2017

I'm sorry. If this is urgent, do without me. When I have time to get back to this, I may open another ticket if I see any problem.

@videlec mentioned this pull request Jul 20, 2017
2 tasks
3 participants