Add back the namespace_packages arg in setup.py. #1298
Conversation
@craigcitro is the best!
Thanks a lot, Craig! I came to the same conclusion with googleapis-common-protos and was wondering whether re-adding `namespace_packages` would help. If this looks good, are we OK to cut a v3.0.0b2.post2 release for it? We still need to solve the namespace-sharing issue for cloudml/tensorflow (uses protobuf) and dataflow (uses the google namespace).
This seems fine to me. I don't understand the underlying issues fully, but it sounds like you are confident that this will fix the problems that were introduced in post1. |
Add back the namespace_packages arg in setup.py.
In the future, I recommend using test PyPI to test packaging-related stuff like this. As for upgrading a non-namespace package (2.6.1) to a namespace package, I'm really unsure, and I'm also not sure of the impact. In practice, it seems developers using virtualenvs create, destroy, and recreate them rather than upgrade packages. But then again, not all developers use virtualenvs.
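The `setup.py` change under discussion can be sketched as a minimal setuptools configuration. This is an illustrative fragment, not the actual protobuf `setup.py`; the `name` and `version` values are placeholders, and the key point is that only the shared top-level package is declared a namespace:

```python
# Hypothetical minimal setup.py for a distribution that installs into the
# shared "google" namespace (pkg_resources-style namespace packages).
from setuptools import setup, find_packages

setup(
    name="protobuf",
    version="3.0.0b2.post2",
    packages=find_packages(),
    # Declare only the shared top-level package as a namespace. Regular
    # subpackages such as google.protobuf and google.protobuf.pyext must
    # NOT be listed here -- they are ordinary packages.
    namespace_packages=["google"],
)
```

With this declaration, setuptools knows that `google/` may be provided by several distributions at once and avoids clobbering it on install/uninstall.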
In protocolbuffers#713 and protocolbuffers#1296, the `google` package in protobuf sources was found to cause conflicts with other Google projects, because it was not properly configured as a namespace package [1]. The initial fix in 786f80f addressed part of the issue, and protocolbuffers#1298 fixed the rest.

However, 786f80f (the initial fix) also made `google.protobuf` and `google.protobuf.pyext` into namespace packages. This was not correct: they are both regular, non-namespace, subpackages. The follow-up protocolbuffers#1298 did not list them as namespace packages either, so the namespace registration behavior has remained, but without benefit.

This change removes the unnecessary namespace registration, which has substantial overhead, thus reducing startup time substantially when using protobufs.

Because this change affects the import internals, quantifying the overhead requires a full tear-down/start-up of the Python interpreter. So, to capture the full cost on every run, I used `timeit` to measure the time to launch a _fresh_ Python subprocess which actually performs the imports, varying the imports and code under test.

* Reference: normal Python startup (i.e., don't import protobuf at all).

  ```
  % python3 -m timeit -s 'import subprocess' -r 3 -n 10 'subprocess.call(["python3", "-c", "pass"])'
  10 loops, best of 3: 27.1 msec per loop
  ```

* Baseline: cost to import `google.protobuf.descriptor`, with extraneous namespace packages.

  ```
  % python3 -m timeit -s 'import subprocess' -r 3 -n 10 'subprocess.call(["python3", "-c", "import google.protobuf.descriptor"])'
  10 loops, best of 3: 133 msec per loop
  ```

* This change: cost to import `google.protobuf.descriptor`, without extraneous namespace packages.

  ```
  % python3 -m timeit -s 'import subprocess' -r 3 -n 10 'subprocess.call(["python3", "-c", "import google.protobuf.descriptor"])'
  10 loops, best of 3: 43.1 msec per loop
  ```

[1]: https://packaging.python.org/guides/packaging-namespace-packages/
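For readers who prefer plain Python over the `timeit` CLI, the cold-start measurement can be sketched as below. This is a rough equivalent, not the exact harness used above; the helper name `cold_import_time` is mine, and the statement measured is only interpreter startup:

```python
import subprocess
import sys
import time

def cold_import_time(stmt, runs=3):
    """Launch a fresh interpreter per run, so module caches can't help,
    and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Each run pays the full interpreter start-up plus import cost.
        subprocess.check_call([sys.executable, "-c", stmt])
        best = min(best, time.perf_counter() - start)
    return best

# Reference measurement: bare interpreter startup, no imports at all.
baseline = cold_import_time("pass")
print("interpreter startup: %.1f ms" % (baseline * 1e3))
```

Replacing `"pass"` with `"import google.protobuf.descriptor"` reproduces the before/after comparison in the commit message.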
Improves #1296.

The problem: in the previous patch, we tweaked the `__init__.py` files to use namespaces, but no longer declared ourselves as a namespace package. The second half was unwise.

Note that this only comes up when installing `protobuf` alongside another package that also installs into the `google` namespace; as of right now, the only PyPI package that does is `googleapis-common-protos`, though the GAE SDK also uses `google.appengine`. However, this package is in fact a dependency of `googleapis-common-protos`, so that's an extremely likely combination. 😉 With this PR, installing either or both of those alongside this package now works.

The case that still remains is the upgrade path, which is also what worried me in #713. It seems that if protobuf 2.6.1 is installed, there's no way to safely upgrade it to work with a newer protobuf. However, `pip uninstall` && `pip install` does the trick.

PTAL @haberman -- in particular, if we really want this issue to go away forever, we should also make sure to add tests that (1) deal with the upgrade cases and (2) install another package that uses the `google` namespace, e.g. `googleapis-common-protos`.

/cc @jonparrott, @tseaver, and @dhermes for some Python wisdom. @jonparrott, with this change, the cases in your gist now pass (since we've fixed the protobuf vs. googleapis conflict). However, I'm worried that we can't seem to upgrade from 2.6.1; is there any hope there? Have other Python packages dealt with this?

I'd also be curious about an easy way to trigger the gcloud tests with this PR; could I just switch this line to point directly to the commit in my repo and send a PR, thereby letting Travis do its thing?
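The namespace mechanics at the heart of this thread (two independently installed distributions both contributing subpackages to one shared top-level package) can be demonstrated in a self-contained sketch. This uses a hypothetical namespace `acme` instead of `google` to avoid clashing with installed packages, and the stdlib `pkgutil` style rather than the pkg_resources style protobuf used at the time:

```python
import os
import sys
import tempfile

# pkgutil-style namespace declaration: each distribution ships its own
# copy of acme/__init__.py containing exactly this line.
INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"

root = tempfile.mkdtemp()
# Two "distributions", each installing its own subpackage into acme/ --
# analogous to protobuf and googleapis-common-protos sharing google/.
for dist, sub in [("dist_a", "protos"), ("dist_b", "common")]:
    pkg = os.path.join(root, dist, "acme", sub)
    os.makedirs(pkg)
    with open(os.path.join(root, dist, "acme", "__init__.py"), "w") as f:
        f.write(INIT)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("WHO = %r\n" % sub)
    sys.path.insert(0, os.path.join(root, dist))

# Both subpackages import, even though they live under different
# sys.path entries: extend_path() merges every acme/ directory found.
from acme import protos, common
print(protos.WHO, common.WHO)
```

Without the namespace declaration in `__init__.py`, whichever `acme/` comes first on `sys.path` wins and the other distribution's subpackage becomes unimportable -- which is exactly the conflict reported in #713 and #1296.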