Port python bindings to Python 3 #66

Closed
wants to merge 8 commits into
from

Conversation

Projects
None yet

Kentzo commented Oct 27, 2014

Here I'm going to address remaining issues by merging changes from my diverged fork that made Protobuf 2.5.1 compatible with Python 3 by retaining compatibility with all other supported versions.

I kindly ask authorized developers to review and leave comments and I'll try to address them as soon as I can.

A few questions:

@Kentzo Kentzo changed the title from Port Protobuf to work under Python 3 to Port python bindings to Python 3 Oct 27, 2014

Member

xfxyjwf commented Oct 27, 2014

+ @mrovner to help take a look.

+ return _install(tarball, _build_install_args(options))
+
+if __name__ == '__main__':
+ sys.exit(main())
@tseaver

tseaver Nov 18, 2014

Contributor

+1. Replacing the old, Python2-only 'ez_setup.py' with one from the setuptools bitbucket repo is definitely the Right Thing. FTR, the permalink for "latest and greatest" ez_setup.py is: https://bootstrap.pypa.io/ez_setup.py

+ def is_sequence(other):
+ return isinstance(other, collections.Sequence)
+ def copy_reg_pickle(type, function):
+ return copyreg.pickle(type,function)
@tseaver

tseaver Nov 18, 2014

Contributor

+1. This kind of alternate block is typical for code which "straddles" Python2 and Python3.

# Attach the local methods to the message class.
- for key, value in locals().copy().iteritems():
+ for key, value in iteritems(locals().copy()):
@tseaver

tseaver Nov 18, 2014

Contributor

I'm not sure why we need the iteritems at all, given that we are going to iterate the whole thing. Even the copy() is unneded, as the loop body doesn't mutate the function's locals:

for key, value in locals().items(): 
@Kentzo

Kentzo Nov 18, 2014

Could be a workaround for some bug in older Python?

@@ -658,6 +675,6 @@ def FromString(string):
def _AddPropertiesForExtensions(message_descriptor, cls):
"""Adds properties for all fields in this protocol message type."""
extension_dict = message_descriptor.extensions_by_name
- for extension_name, extension_field in extension_dict.iteritems():
+ for extension_name, extension_field in iteritems(extension_dict):
constant_name = extension_name.upper() + '_FIELD_NUMBER'
@tseaver

tseaver Nov 18, 2014

Contributor

Likewise, we can forego the iteritems here:

for extension_name, extension_field in extension_dict.items():
-_NAN = _POS_INF * 0
+_POS_INF = float('inf')
+_NEG_INF = float('-inf')
+_NAN = float('nan')
@tseaver

tseaver Nov 18, 2014

Contributor

+1 for dropping support for Python < 2.6.

@Kentzo

Kentzo Nov 18, 2014

I asked for the min supported Python, but never got a response.

@traff

traff Apr 16, 2015

If it's possible, let's better not drop 2.4 and 2.5. They are still used.

@tseaver

tseaver Apr 16, 2015

Contributor

FWIW, Nick Coghlan (RH employee, core CPython developer) disagrees: he even argues against keeping 2.6 support.

@haberman

haberman Apr 16, 2015

Member

Unfortunately we are in a position where we are tied to supporting 2.5 support for the foreseeable future. AppEngine has committed to supporting its 2.5 runtime for now, and the AppEngine runtime uses this protobuf implementation.

@tseaver

tseaver Apr 16, 2015

Contributor

ISTM to me that any apps conservative to be running on an eight-year old Python version (five years since its last security / bugfix release) should be happy to stick with protobuf 2.6.x: such developers should be running away from latest-and-greatest protobuf 3.x library.

@traff

traff Apr 16, 2015

I'm PyCharm IDE lead and dev and just have checked the usage statistics for the 2014. Python 2.5 and Python 2.4 use 1.41% and 0.16% of PyCharm users respectively. So we probably will abandon supporting 2.4 soon, but 2.5 should be still supported. If protobuf will drop support for 2.5 we'll have to bundle 2 different versions and I doubt that it would be easy to match them with 2 different versions on Java side. So it could be quite a pain.

@Kentzo

Kentzo Apr 17, 2015

Maybe it would be better to split python bindings into different projects: one for python 2.x one for 3.x and one for common parts?

It's ridiculous how much time was spent on such simple issue :) New features are not introduced that often to consider supporting separate projects an impossible task.

@haberman

haberman Apr 17, 2015

Member

@tseaver You make a good point. I'm working with the AppEngine owners to see if they could just freeze their Python 2.5 runtime at the current version of protobuf. This would allow us to drop 2.5 and possibly 2.6 support in this repo.

The downside is that new features (like maps, JSON suport, etc) would not be available for Python 2.5. But I think this might be a necessary tradeoff. Maintaining 2.5 support in this repo has been an increasing burden.

-##!PY25 if ((float_bytes[3:4] in b'\x7F\xFF')
- and (float_bytes[2:3] >= b('\x80'))): ##PY25
-##!PY25 and (float_bytes[2:3] >= b'\x80')):
+ if ((float_bytes[3] in b('\x7F\xFF'))
@tseaver

tseaver Nov 18, 2014

Contributor

Why not use a bytes literal, given that the PR presumes we are dropping 2.5 support?

- and (double_bytes[0:7] != b('\x00\x00\x00\x00\x00\x00\xF0'))): ##PY25
+ if ((double_bytes[7] in b('\x7F\xFF'))
+ and (double_bytes[6] >= 240)
+ and (double_bytes[0:7] != b('\x00\x00\x00\x00\x00\x00\xF0'))):
return (_NAN, new_pos)
@tseaver

tseaver Nov 18, 2014

Contributor

Likewise, bytes literals?

@@ -390,8 +384,7 @@ def _VarintBytes(value):
pieces = []
_EncodeVarint(pieces.append, value)
- return "".encode("latin1").join(pieces) ##PY25
-##!PY25 return b"".join(pieces)
+ return b("").join(pieces)
@tseaver

tseaver Nov 18, 2014

Contributor

Bytes literal?

else:
- raise
+ raise BaseException()
@tseaver

tseaver Nov 18, 2014

Contributor

Would this benefit from using six.reraise() https://pythonhosted.org/six/#six.reraise ? It would require passing in the exc_type, exc_value, and exc_traceback, but would allow preserving the original exception context.

-SFixed64Encoder = _StructPackEncoder(wire_format.WIRETYPE_FIXED64, '<q')
-FloatEncoder = _FloatingPointEncoder(wire_format.WIRETYPE_FIXED32, '<f')
-DoubleEncoder = _FloatingPointEncoder(wire_format.WIRETYPE_FIXED64, '<d')
+Fixed32Encoder = _StructPackEncoder(wire_format.WIRETYPE_FIXED32, b('<I'))
@tseaver

tseaver Nov 18, 2014

Contributor

bytes literals?

-else:
- from io import BytesIO
+
+if sys.version_info >= (3,):
@tseaver

tseaver Nov 18, 2014

Contributor

Use the google.protobuf.internals.util.PY2?

@tseaver

tseaver Nov 18, 2014

Contributor

Also, we should just be able to use io.BytesIO unconditionally in Python >= 2.6.

@@ -237,7 +241,7 @@ def AddDecoder(wiretype, is_packed):
def _AddClassAttributesForNestedExtensions(descriptor, dictionary):
extension_dict = descriptor.extensions_by_name
- for extension_name, extension_field in extension_dict.iteritems():
+ for extension_name, extension_field in iteritems(extension_dict):
@tseaver

tseaver Nov 18, 2014

Contributor

As with cpp_message, we should be able to just use items() instead of dancing around iteritems().

@@ -819,7 +823,7 @@ def _AddSerializePartialToStringMethod(message_descriptor, cls):
"""Helper for _AddMessageMethods()."""
def SerializePartialToString(self):
- out = BytesIO()
+ out = SimIO()
@tseaver

tseaver Nov 18, 2014

Contributor

io.BytesIO().

@@ -845,7 +849,8 @@ def MergeFromString(self, serialized):
except (IndexError, TypeError):
# Now ord(buf[p:p+1]) == ord('') gets TypeError.
raise message_mod.DecodeError('Truncated message.')
- except struct.error, e:
+ except struct.error:
@tseaver

tseaver Nov 18, 2014

Contributor

Why not except struct.error as e:?

@@ -535,7 +539,7 @@ def testDefaultValues(self):
self.assertEqual(0.0, proto.optional_double)
self.assertEqual(False, proto.optional_bool)
self.assertEqual('', proto.optional_string)
- self.assertEqual(b'', proto.optional_bytes)
+ self.assertEqual(b(''), proto.optional_bytes)
@tseaver

tseaver Nov 18, 2014

Contributor

bytes literal should be OK.

+ #The first input has to be a string, and literals are unicode
+ MyProtoClass = reflection.GeneratedProtocolMessageType(str('MyProtoClass'),
+ (message.Message,),
+ {'DESCRIPTOR': mydescriptor})
@tseaver

tseaver Nov 18, 2014

Contributor

Is this change related to the Python3 port, or is it just ensuring better coverage of what the testcase name says it is about?

@@ -1657,7 +1660,7 @@ def testStringUTF8Encoding(self):
# Assignment of a unicode object to a field of type 'bytes' is not allowed.
self.assertRaises(TypeError,
- setattr, proto, 'optional_bytes', u'unicode object')
+ setattr, proto, 'optional_bytes', u('unicode object'))
@tseaver

tseaver Nov 18, 2014

Contributor

Necessary to support Python3.2, which doesn't have unicode literals.

@@ -1726,7 +1726,7 @@ def testStringUTF8Serialization(self):
# The pure Python API always returns objects of type 'unicode' (UTF-8
# encoded), or 'bytes' (in 7 bit ASCII).
badbytes = raw.item[0].message.replace(
- test_utf8_bytes, len(test_utf8_bytes) * b'\xff')
+ test_utf8_bytes, len(test_utf8_bytes) * b('\xff'))
@tseaver

tseaver Nov 18, 2014

Contributor

Bytes literals should be OK.

self.assertEqual(0, bytes_read)
self.assertTrue(proto.HasField('optional_nested_message'))
proto = unittest_pb2.TestAllTypes()
- proto.optional_nested_message.ParseFromString(b'')
+ proto.optional_nested_message.ParseFromString('')
@tseaver

tseaver Nov 18, 2014

Contributor

Shouldn't this be bytes? MergeFromString() and ParseFromString() both take serialized data.

@@ -1957,7 +1957,7 @@ def Test(i, expected_varint_size):
self.assertEqual(expected_varint_size + 1, self.Size())
Test(0, 1)
Test(1, 1)
- for i, num_bytes in zip(range(7, 63, 7), range(1, 10000)):
+ for i, num_bytes in zip(list(range(7, 63, 7)), list(range(1, 10000))):
@tseaver

tseaver Nov 18, 2014

Contributor

list(range(...)) shouldn't be needed inside zip(): it will iterate the result of range() directly.

try:
callable_obj()
- except exc_class as ex:
+ except exc_class:
@tseaver

tseaver Nov 18, 2014

Contributor

The as form should be preferred.

@@ -2705,7 +2706,7 @@ def testInitKwargs(self):
optional_int32=1,
optional_string='foo',
optional_bool=True,
- optional_bytes=b'bar',
+ optional_bytes=b('bar'),
@tseaver

tseaver Nov 18, 2014

Contributor

Bytes literals OK.

+ try:
+ from cStringIO import StringIO as SimIO
+ except:
+ from StringIO import StringIO as SimIO
@tseaver

tseaver Nov 18, 2014

Contributor

io.BytesIO should be preferred for Python >= 2.6.

+ return "b(\"" + CEscape(field.default_value_string()) + "\")";
+ }
+ case FieldDescriptor::CPPTYPE_MESSAGE:
+ return "None";
@tseaver

tseaver Nov 18, 2014

Contributor

Bytes literals should be OK.

Kentzo commented Nov 18, 2014

@xfxyjwf @mrovner Do we still need to support Python 2.5 in protobuf 2.6+ ?

mrovner commented Nov 18, 2014

Yes, we still have 2.5 clients.

On Tue, Nov 18, 2014 at 8:53 AM, Ilya Kulakov notifications@github.com
wrote:

@xfxyjwf https://github.com/xfxyjwf @mrovner
https://github.com/mrovner Do we still need to support Python 2.6 in
protobuf 2.6+ ?


Reply to this email directly or view it on GitHub
#66 (comment).

Thanks,
--Mike

Kentzo commented Nov 18, 2014

@tseaver Please edit your reviews given that answer :)

Contributor

tseaver commented Nov 18, 2014

@mrovner I thought the question was about keeping 2.6 support: are you saying that we need to ship a protobuf that spans 2.5 through 3.4?

Would bumping a major version to signal dropping 2.5 support while adding 3.x support be reasonable? Users stuck on Python 2.5 would then stay with protobuf 2.6.x, while those who needed Py3k could move to protobuf 2.7.x.

@@ -610,7 +624,10 @@ def FindInitializationErrors(self):
return self._cmsg.FindInitializationErrors()
def __str__(self):
- return str(self._cmsg)
+ if PY2:
@virtuald

virtuald Nov 21, 2014

Might make more sense to move this block up to line 44, instead of having the if statement down here.

@tseaver tseaver referenced this pull request in GoogleCloudPlatform/google-cloud-python Nov 24, 2014

Closed

Ensure Python 3 support #91

Is there any way to support this pull request?

Contributor

tseaver commented Dec 15, 2014

I note that the master is no aiming at a 3.0.0 release: that would seem like an ideal time to drop support for Python < 2.6.

@@ -172,6 +172,7 @@ def run(self):
'google.protobuf.internal.message_listener',
'google.protobuf.internal.python_message',
'google.protobuf.internal.type_checkers',
+ 'google.protobuf.internal.utils',
@tseaver

tseaver Dec 15, 2014

Contributor

Given that we rely on setuptools (boostrapping from ez_setup.py if it isn't present), we should be able to use its find_packages() rather than enumerating modules by hand.

davidB commented Dec 20, 2014

Hi,

Until support of python, I share my workaround, a script to convert to python3:

# more info at http://python3porting.com/bookindex.html
# download protobuf-2.6.1.tar.gz for python
cd modules
rm -Rf google
tar -xzvf ~/Téléchargements/protobuf-2.6.1.tar.gz protobuf-2.6.1/google
2to3-3.4 protobuf-2.6.1/google -o google -W --no-diffs -p -n
rm -Rf protobuf-2.6.1

2to3 is a tool provide by python3.
I used it be able to use protobuf with bendler (blender 2.72 include python 3.4.2)

Kentzo commented Jan 6, 2015

@mrovner I believe this PR can be closed without merging. Please confirm.

@googlebot googlebot added the cla: yes label Jan 6, 2015

tseaver added a commit to tseaver/protobuf that referenced this pull request Jan 13, 2015

Prepare for Python2-Python3 straddle.
- Remove PY25 cruft.

- Selectively apply cleanups from 'python-modernize':

  - New exception syntax.
  - Use 'six' to handle module renames.
  - Use 'six' to handle text / binary stuff.

This PR covers most of the work from #66 which falls inside `python`
(rather than the Python code generation stuff in 'src').
Contributor

tamird commented May 15, 2015

@Kentzo protobuf now officially doesn't need to support python 2.5 and 2.6. Would you like to rebase this such that only 2.7+ is supported?

Kentzo commented May 21, 2015

@tamird I don't have time to review that right now. We don't plan to migrate to 2.6 anytime soon.

Any help is welcome :)

Contributor

tamird commented Aug 22, 2015

@haberman probably time to close this guy :)

Member

haberman commented Feb 16, 2016

Ah yes, this has been addressed in other changes. Closing now.

@haberman haberman closed this Feb 16, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment