Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Status of Python 3 (3.4+) support? #646

Closed
Kentzo opened this issue Jul 28, 2015 · 42 comments
Closed

Status of Python 3 (3.4+) support? #646

Kentzo opened this issue Jul 28, 2015 · 42 comments

Comments

@Kentzo
Copy link

Kentzo commented Jul 28, 2015

There is a number of open PR regarding adding support for Python 3 (including my own). None of them are merged.

What's the current status? Which of PR's has most chances to be merged anytime soon? I would like to invest some time into this, but really don't want to waste doing work noone will merge.

@haberman
Copy link
Member

I am extremely interested in merging PRs to fully support Python 3.

I commented on #166 that I am interested in merging it, but @tseaver said he didn't have time to update it.

I wasn't sure the status of your PR #66. As @tamird mentioned we no longer need to support Python 2.5, and I'm interested in making a cleaner code-base that drops 2.5 support.

Here is what I am especially interested in:

  1. a code-base that works with 2.6, 2.7, and 3.4 without requiring 2to3.
  2. a build process that can test all of these different versions with tox (including both pure-Python and Python/C++ implementations).
  3. a way of running the conformance tests against all versions also.

I have been working on (2) but my progress is slow since I am not an expert in the Python tool ecosystem. I would greatly appreciate any help with this.

@Kentzo
Copy link
Author

Kentzo commented Jul 29, 2015

To run tests under multiple versions one would just set up virtualenv for each version and run tests under them.

What kind of build process are you looking for? How are you going to run it?

@tseaver
Copy link
Contributor

tseaver commented Jul 29, 2015

@Kentzo tox is a tool that automates that process.

@haberman
Copy link
Member

I want a tox setup that works, basically. As @tseaver mentioned, tox is designed to automate the process of testing with multiple virtualenvs. But tox won't work with our current setup. It throws errors given our current setup.py. And we have to figure out what command tox should run to do this actual testing (this is configurable).

@haberman
Copy link
Member

Ideally the tox setup would also install the actual package into the virtualenv, to verify that the packaging/install step is correct.

@mitchellrj
Copy link

At the very least, the documentation should be updated (changelog for v2.6.0) to remove claims that Python 3 is supported. This is misleading at best.

@tamird
Copy link
Contributor

tamird commented Aug 12, 2015

@Kentzo I would be more than happy to assist you with this. If you don't plan on working on this yourself, could you list out the steps we could take to get to 2.7/3.x support? I'm not a python expert, but I've been somewhat successful improving protobuf's python support.

@dano
Copy link
Contributor

dano commented Aug 13, 2015

I've taken the work @tseaver started and merged in the latest code from upstream here: https://github.com/dano/protobuf/tree/py2_py3_straddle. tox runs just fine, but a bunch of new tests that have been added are now failing on Python 3, along with a couple of regressions on Python 2.7. If I get some more free time I'll start trying to fix the issues, but it at least gets us closer...

@tseaver
Copy link
Contributor

tseaver commented Aug 13, 2015

@dano Thanks for rescuing that work!

@dano
Copy link
Contributor

dano commented Aug 14, 2015

Ok, I've fixed a couple of long/int compatibility issues on the Python 2 side. There are two Python 2 tests that still fail, but the reason for the regression is a symptom of a fairly large problem on the Python 3 side.

text_format.py has a MethodToString function that converts a protobuf to text. It's got an as_utf8 parameter, which determines if the output is utf-8 formatted or not. So far, so good. To collect the output from the protobuf, the Python 2-only version uses a cStringIO.cStringIO object, and passes that around the module. The Python 2/3 compatible version of this is io.StringIO - however, those two classes aren't quite the same. The StringIO module has this note in it's documentation:

The StringIO object can accept either Unicode or 8-bit strings, but mixing the two may take some care.

io.StringIO doesn't have this leniency - everything you pass to it must be Unicode. If you want to pass 8-bit strings, you need io.BytesIO. Unfortunately, the current code is badly abusing the ability to pass both Unicode and byte strings to cStringIO.cStringIO. Everywhere it's writing string literals, it's using out.write('string'), which are bytes strings, not unicode. It's also writing the same kind of literal, regardless of the value of the as_utf8 parameter. It should be choosing b'string' if as_utf8 == True and u'' if not. It should also decide whether or not to encode fields from the protobuf itself depending on the as_utf8 value, currently, it just uses the fields returned unconditionally, regardless of the type of string and the value of as_utf8

So, there's a whole bunch of work here to fix every single call to out.write(...) in text_format.py, to make sure it does the right thing depending on the value of as_utf8 and the type of string it's trying to write. The good news is it looks like that's the only thing still causing failures on Python 3, so hopefully this is the last big hurdle (aside from making sure things still work on Python 2.6 - I don't have it installed on my machine right now, so I haven't had a chance to test that yet. It should be a fairly simple exercise to get it working in any case).

@dano
Copy link
Contributor

dano commented Aug 14, 2015

@haberman You mention a desire to support Python 2.6 - it looks like the current master branch doesn't support 2.6 in its current state, lots of tests use 2.7+ only assertions, and proto_builder.py uses collections.OrderedDict, which was only added in Python 2.7. Has 2.6 support actually been dropped, or is the fact that it's no longer working actually a bug?

@haberman
Copy link
Member

@dano Thanks so much for this work! I am very eager to merge it.

I've been working on other things, but I will get to your detailed comments as soon as I can.

Re: Python 2.6, that is a good question. I think we might have to do a cost-benefit weighing. I'm told that some relevant versions of RedHat still support only 2.6. How much uglier does it make the code to support 2.6?

@dano
Copy link
Contributor

dano commented Aug 15, 2015

@haberman The quickest and easiest way to get Python 2.6 working would be to require that the backports of OrderedDict and the Python 2.7 version of unittest - ordereddict and unittest2, respectively - are installed, and just adjust the code to use those if they're available. That assumes there aren't some deeper issues I just haven't noticed yet, but if the backport route makes sense to you, I'll try it and see if that's all it takes.

@haberman
Copy link
Member

@dano Could we require the backports only for Python 2.6? Python 2.7+ users shouldn't need them right?

@dano
Copy link
Contributor

dano commented Aug 15, 2015

@haberman Right, we can make sure setup.py only installs the backports if we detect Python 2.6 is installed, and that we'll always favor the Python 2.7 versions of the modules in the code.

@haberman
Copy link
Member

Sounds great to me!

@dano
Copy link
Contributor

dano commented Aug 15, 2015

@haberman Ok, I've got all tests passing Python 2.6 and 2.7. The string/bytes issues goes even deeper than I originally thought, though; there's some non-standard encoding required in text_encoding.py that's complicating things. Do you want a pull request for the rest of the changes in the meantime? It would bring in tox, Python 2.6 support, and get us most of the way to Python 3 compatibility.

@dano
Copy link
Contributor

dano commented Aug 15, 2015

It also appears that the cpp implementation is broken on Python 3, because of http://bugs.python.org/issue22079. I hit TypeError: type 'google.protobuf.pyext._message.ScalarMapContainer' is not dynamically allocated but its base type 'MutableMapping' is dynamically allocated when trying to run python setup.py test --cpp_implementation. So there's also that to fix...

@haberman
Copy link
Member

That is super unfortunate, that this restriction on inheriting static types from dynamic types was added to Python. Without that, it will be much more difficult (if it's possible at all) to use the abstract base collections to make our maps implement all the methods that Python users expect from dicts.

To work around that we'll probably have to re-implement the built-in MutableMapping class in C++. :(

@dano
Copy link
Contributor

dano commented Aug 15, 2015

Could it be altered to inherit from dict, instead? Inheriting from built-ins should be well supported in C-extensions (and dict is written in C, anyway). It seems some other mapping types implemented in C (defaultdict, the C implementation of OrderedDict) use this approach.

@haberman
Copy link
Member

That wouldn't work -- dict provides its own storage, and it's important that ScalarMapContainer uses the storage of the underlying google::protobuf::Message object.

@dhirschfeld
Copy link

Just ran into this problem myself :(

@haberman
Copy link
Member

Ok, we should get your work merged ASAP. Let's disable the specific tests that fail for now (with some explanatory comments about what is broken). Then we can iterate on fixes for what is broken.

@dano, want to send a PR with your work on this? Thanks!

@haberman
Copy link
Member

Just to expand on my previous comments slightly: we need to make sure that the Travis build continues to pass. So if we have to disable the Python 3 tests for now to keep Travis passing, let's do that, but then iterate on fixes until it's possible re-enable the broken tests.

@Kentzo
Copy link
Author

Kentzo commented Aug 17, 2015

I agree about making PR. At the very least all remaining issues with syntax will be addressed.

@dano
Copy link
Contributor

dano commented Aug 17, 2015

@haberman Sounds good. For now I've got all Python 3 tests disabled via tox.ini, so only Python 2.6/2.7 will run. Running tox alone tests the pure Python implementation, and tox -- --cpp_implementation can be used to test the cpp implementation. We can tweak it so just running tox tests both, if that's preferable.

I'll try to get the pull request submitted tonight; I'm at a conference this week so my time during the day is a bit limited.

@haberman
Copy link
Member

Thanks @dano! I'll look forward to the PR.

By the way, I should mention one other thing I've been trying to get working that is related to your work. We have a set of conformance tests in https://github.com/google/protobuf/tree/master/conformance that are designed to be tested against all implementations. They currently run against C++, Java, Ruby, and C#. I'd really like to get them running against Python also, but ran into trouble getting this to work.

It would probably make the most sense to run these as a part of the tox run, so we get the same version testing matrix (ie. 2.6, 2.7, 3.3, 3.4) as we do for the regular Python tests.

I did wrote the actual code to implement this for Python: master...haberman:python-conformance2

I got this to work for the pure-Python implementation, but was not successful getting it to work with the C++ implementation. I couldn't figure out how to run it in such a way that it would find the .so correctly.

If this would be relatively easy to add to your work, that would be awesome. Or even if you just had tips for how to get this working.

@dano
Copy link
Contributor

dano commented Aug 18, 2015

I've created the pull request here: #722

@haberman I'm having some laptop issues making it hard to test the conformance test stuff, but when I tried running it, the pure Python stuff passed, while a few of the CPP tests failed with errors like this (this is not the complete output):

CONFORMANCE TEST BEGIN ====================================

ERROR, test=ProtobufInput.PrematureEofInPackedField.DOUBLE: Should have failed to parse, but didn't. request=protobuf_payload: "\322\002\001" requested_output_format: PROTOBUF, response=protobuf_payload: ""
ERROR, test=ProtobufInput.PrematureEofInPackedField.FLOAT: Should have failed to parse, but didn't. request=protobuf_payload: "\312\002\001" requested_output_format: PROTOBUF, response=protobuf_payload: ""
ERROR, test=ProtobufInput.PrematureEofInPackedField.INT64: Should have failed to parse, but didn't. request=protobuf_payload: "\202\002\001" requested_output_format: PROTOBUF, response=protobuf_payload: ""

It sounds like this isn't what you were seeing?

@haberman
Copy link
Member

@dano Those results are perfect and as expected! The C++ parser has a few bugs in it, which is why you're seeing conformance problems there that don't show up with pure-Python.

For each language we have a "failure list": the list of tests that are known to be broken. Here is the one for C++ -- I would expect Python/C++ to have the same list: https://github.com/google/protobuf/blob/master/conformance/failure_list_cpp.txt

@haberman
Copy link
Member

I didn't get to see this myself because I wasn't able to get it working with Python/C++. Maybe it could be a follow-up PR, but I would really love to get these tests going for Python!

@dano
Copy link
Contributor

dano commented Aug 18, 2015

@haberman Interesting - I basically took the code you provided, applied it to the py3 compatibility branch, and ran it with almost no modifications. I was expecting to hit whatever failure you were seeing when I ran it.

Anyway, when I get a chance I'll get what I tried checked in and submit a PR (or maybe just update the one I already opened, depending on where that stands).

@haberman
Copy link
Member

@dano I had run into issues with LD_LIBRARY_PATH/PYTHONPATH, especially since the location of the built .so file was hard to predict (as it varies based on the Python version and such). Also prior to your change we were running 2to3, and I was having trouble getting it the output of 2to3 instead of the original source files.

If your PR solves all of these problems then awesome!

Maybe let's have the conformance tests be a follow-up PR, to avoid growing the current one too much.

@caxton
Copy link

caxton commented Aug 21, 2015

Hi Guys,

Is this a known issue for python 3.4 with protobuf 3(I just simply import module compile via protoc in syntax "proto3")?

Traceback (most recent call last):
File "add_motiondata.py", line 14, in
import motiondata_pb2
File "/home/caxton/work_folder/python/protobuf/motiondata/build/gen/motiondata_pb2.py", line 10, in
from google.protobuf import descriptor_pb2
File "", line 2214, in _find_and_load
File "", line 2203, in _find_and_load_unlocked
File "", line 1191, in _load_unlocked
File "", line 1161, in _load_backward_compatible
File "/usr/local/lib/python3.4/dist-packages/protobuf-3.0.0a4.dev0-py3.4.egg/google/protobuf/descriptor_pb2.py", line 243, in
TypeError: decoding str is not supported

% protoc --version
libprotoc 3.0.0

% python -V
Python 3.4.0

@dano
Copy link
Contributor

dano commented Aug 22, 2015

@caxton Yes, there are issues with text handling in Python 3, and they'll still be there, even after my pull request gets merged. It looks like there's a decent amount of work to get the text handling to a place where it does the right thing for all versions of Python.

@dano
Copy link
Contributor

dano commented Aug 22, 2015

Ok, I've now got both python and cpp tests passing on Python 3.3, and the pure python tests passing on 3.4 (with very minimal changes, as it turns out). I have to admit, though, even though all the tests are passing, the way text handling works still feels wrong. The way I got things working is to have text_format.py write everything as bytes (via io.BytesIO) on Python 2.x, and as str (via io.StringIO`) on Python 3. Almost all literature on how to unify text handling between 2/3 tells you to decode everything to unicode immediately, and encode at the latest possible point, but the strange way protobuf treats utf-8 makes it really difficult to understand how that would work.

Once #722 is merged, I'll open a new PR for these changes so other folks can review.

edit: Just opened the PR - #728

@dano
Copy link
Contributor

dano commented Aug 28, 2015

@haberman I've got the Python 3.4 cpp implementation working now, too. As suggested in http://bugs.python.org/issue22079, I use PyType_FromSpecWithBases to dynamically allocate ScalarMapContainer_Type and MessageMapContainer_Type, which allows us to use MutableMapping as a base. Took a little while, since my C/C++ is more than a little rusty, but all builds are passing. See #772.

@haberman
Copy link
Member

Thanks @dano! btw. there seems to be a relatively simple problem with 2.6 in the C++ implementation on our release branch after merging our changes from Google-internal development. If you're on a roll please feel free to take a look. :)

#758

@Kentzo
Copy link
Author

Kentzo commented Dec 15, 2015

Could anyone update what's the current? :)

@dano
Copy link
Contributor

dano commented Dec 15, 2015

@Kentzo It's been quite a while since I looked, but after that last PR from 8/27 was merged, everything should have been working on from Python 2.6-3.4.

@Kentzo
Copy link
Author

Kentzo commented Dec 15, 2015

Cool. Can you think of a tag / raw sha which has stable proto 2.6 and solid python 3?

Best Regards
Ilya Kulakov

On 15 дек. 2015 г., at 21:19, dano notifications@github.com wrote:

@Kentzo It's been quite a while since I looked, but after that last PR from 8/27 was merged, everything should have been working on from Python 2.6-3.4.


Reply to this email directly or view it on GitHub.

@dano
Copy link
Contributor

dano commented Dec 15, 2015

@Kentzo Hmm, if you mean the first point at which everything should have been working, I guess it would be this commit: cee703d, where @haberman merged his fix for #758. Have things broken since then?

@xfxyjwf xfxyjwf closed this as completed Jan 21, 2016
@xfxyjwf
Copy link
Contributor

xfxyjwf commented Jan 21, 2016

Python 3 support should already be available from the last release (3.0.0b2).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants