process: add support for invoking compiled Python files #638

Closed
wants to merge 1 commit into from

@yegorich yegorich commented Feb 9, 2016

Some embedded Linux distributions can produce a root file system
containing only compiled Python files (*.pyc, opt-1.pyc etc.).

Introduce a routine that also searches for such files instead of
trying only *.py files.
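
For reference, a minimal sketch of how the running interpreter names compiled files under the PEP 3147 layout, using the standard `importlib.util.cache_from_source`. The helper name `compiled_names` and the example path are illustrative and not part of this patch. Note that CPython only imports a `.pyc` without its source when the file sits next to where the `.py` would be (the legacy layout), not inside `__pycache__`.

```python
import importlib.util


def compiled_names(source_path):
    """Return the cache paths this interpreter would use for source_path.

    Illustrative helper (not from the patch). optimization="" asks for the
    plain bytecode name; 1 and 2 correspond to the -O and -OO variants.
    """
    return {
        "default": importlib.util.cache_from_source(source_path, optimization=""),
        "opt-1": importlib.util.cache_from_source(source_path, optimization=1),
        "opt-2": importlib.util.cache_from_source(source_path, optimization=2),
    }


if __name__ == "__main__":
    # The exact tag (e.g. "cpython-3X") depends on the interpreter in use.
    for level, path in compiled_names("pkg/mod.py").items():
        print(level, path)
```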

@oberstet
Contributor

oberstet commented Feb 9, 2016

I think this is a hack that could have bad side effects. Simply create an image that contains a standard installation.

@oberstet oberstet closed this Feb 9, 2016
@yegorich
Author

@oberstet what about changing the search algorithm? One could define a list of possible file extensions, beginning with *.py, and check whether such a file exists. If not, try *.pyc and the other variants. This way the initial behavior is preserved and compiled Python files can still be used.

All other Python packages in Buildroot work without a problem as compiled versions. So it would be great to keep this for crossbar.
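
The fallback described above could look roughly like the following. This is only a sketch under stated assumptions: the function name `find_script`, the suffix list, and the idea of passing a directory plus a file stem are hypothetical and are not taken from the patch or from Crossbar.io.

```python
from pathlib import Path
from typing import Optional, Sequence

# Hypothetical preference order: source first, then bytecode placed next to
# where the source would be (the legacy layout used by pyc-only images).
DEFAULT_SUFFIXES = (".py", ".pyc")


def find_script(directory, stem: str,
                suffixes: Sequence[str] = DEFAULT_SUFFIXES) -> Optional[Path]:
    """Return the first existing '<stem><suffix>' in directory, else None."""
    for suffix in suffixes:
        candidate = Path(directory) / (stem + suffix)
        if candidate.is_file():
            return candidate
    return None
```

Because `.py` comes first in the list, an installation that still ships sources behaves exactly as before; the `.pyc` entry is only consulted when the source file is missing.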

@oberstet
Contributor

The correct "search algorithm" is already implemented: in the Python run-time.

There is no valid reason to try to distribute .pyc/.pyo files. It's a mere run-time artifact of a specific Python implementation (CPython).

You should NOT strip off *.py files and only retain *.pyc. This is wrong. You should distribute *.py files - as everybody does.

I won't merge any fancy hack that tries to "solve" this, so don't waste your (and my) time.

@tpetazzoni

@oberstet Except filesystem size, which is important on many embedded systems. Having all the .py + the generated .pyc files consumes a lot of storage space, which is precious on embedded systems. Moreover, having pre-generated .pyc files is great for embedded systems where the root filesystem is read-only.

So, yes, shipping both .py and .pyc is perfectly fine for desktop/server installations, but is not necessarily appropriate for space-constrained embedded systems.

@oberstet
Contributor

Crossbar.io needs a Linux system, and that means MBs of storage anyway. E.g. the Arduino Yun with 16MB Flash is probably a lower bound. Given the dependencies of Crossbar.io, total pyc size is only a fraction. Also: a read-only root FS does not matter; the pyc files simply won't be written then. Pre-generated pyc files save some CPU cycles on CPython, since the modules won't be compiled anew. This is specific to CPython. On PyPy, there is no pyc/pyo at all. E.g. you cannot run a Python program on PyPy if you only have pyc's!

@oberstet
Contributor

Have been thinking about how to deploy Crossbar.io to small devices a little - in general. I could imagine there are different scenarios.

E.g. one where a firmware image is created that is flashed onto the device at manufacturing time, and where the image is fully cross built from a developer's PC. In this case, Crossbar.io will run from a full Python environment with quite a bunch of dependencies which have to be cross built. Should work, but requires some effort I guess.

Then there is the scenario where the device supports booting from a user supplied storage medium, and the device is running a Linux with writable root, and is capable of self-hosting a native build chain, so that Crossbar.io can be built from sources on the device. Like the Pi. This isn't a problem at all.

So I am wondering about other scenarios ..

@tpetazzoni

@oberstet I did a filesystem size comparison between having both .py and .pyc files and having just .pyc files. As you probably know, both @yegorich and I are working on Buildroot, a tool that builds small Linux root filesystems for embedded systems using cross-compilation.

So I've taken the example of a filesystem built for ARM, with the uClibc C library, that contains Busybox, the Python 3 interpreter, Python Crossbar and all its dependencies.

When built with both .py and .pyc files, the total filesystem size is 74 MB.

When built with only .pyc files, the total filesystem size is 46 MB.

So there is an overhead of 28 MB just to have the .py files present in the filesystem, which is about 38% of the total filesystem size!
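
The arithmetic behind these figures, spelled out as a small check. The variable names are illustrative; the two sizes are the ones reported above.

```python
# Sizes reported in this comment, in MB.
both_mb = 74      # image with .py and .pyc files
pyc_only_mb = 46  # image with .pyc files only

overhead_mb = both_mb - pyc_only_mb   # space taken by the .py files
share = overhead_mb / both_mb         # fraction of the full image

print(f"{overhead_mb} MB overhead, {share:.1%} of the full image")
# prints "28 MB overhead, 37.8% of the full image"
```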

I hope that with those numbers you will reconsider your position :)

In case you want to reproduce, you can compare the build of the following Buildroot configuration (which has only .pyc files)

BR2_arm=y
BR2_TOOLCHAIN_EXTERNAL=y
BR2_TOOLCHAIN_EXTERNAL_CUSTOM=y
BR2_TOOLCHAIN_EXTERNAL_DOWNLOAD=y
BR2_TOOLCHAIN_EXTERNAL_URL="http://autobuild.buildroot.org/toolchains/tarballs/br-arm-full-2016.02-3-g762b7c9.tar.bz2"
BR2_TOOLCHAIN_EXTERNAL_GCC_4_7=y
BR2_TOOLCHAIN_EXTERNAL_HEADERS_3_10=y
BR2_TOOLCHAIN_EXTERNAL_LOCALE=y
# BR2_TOOLCHAIN_EXTERNAL_HAS_THREADS_DEBUG is not set
BR2_TOOLCHAIN_EXTERNAL_INET_RPC=y
BR2_TOOLCHAIN_EXTERNAL_CXX=y
BR2_PACKAGE_PYTHON3=y
BR2_PACKAGE_PYTHON_CROSSBAR=y

Against the build of that one, which has both .py and .pyc files:

BR2_arm=y
BR2_TOOLCHAIN_EXTERNAL=y
BR2_TOOLCHAIN_EXTERNAL_CUSTOM=y
BR2_TOOLCHAIN_EXTERNAL_DOWNLOAD=y
BR2_TOOLCHAIN_EXTERNAL_URL="http://autobuild.buildroot.org/toolchains/tarballs/br-arm-full-2016.02-3-g762b7c9.tar.bz2"
BR2_TOOLCHAIN_EXTERNAL_GCC_4_7=y
BR2_TOOLCHAIN_EXTERNAL_HEADERS_3_10=y
BR2_TOOLCHAIN_EXTERNAL_LOCALE=y
# BR2_TOOLCHAIN_EXTERNAL_HAS_THREADS_DEBUG is not set
BR2_TOOLCHAIN_EXTERNAL_INET_RPC=y
BR2_TOOLCHAIN_EXTERNAL_CXX=y
BR2_PACKAGE_PYTHON3=y
BR2_PACKAGE_PYTHON3_PY_PYC=y
BR2_PACKAGE_PYTHON_CROSSBAR=y

Thanks!

@oberstet
Contributor

oberstet commented May 1, 2016

@tpetazzoni Thanks for providing actual numbers for the overhead - and for putting those numbers into perspective! Almost 40% indeed sucks. I agree. So this isn't acceptable, and we should address it. I will look into this again .. - running nice on embedded is a prio.

My concerns (besides "it's not the usual way", which I can get over): http://grokbase.com/t/python/pypy-dev/1159r328yz/cant-import-pyc-file

Curious: have you looked into PyPy? It works on ARM (at least a Pi sized one).

Slightly OT, but this might be of interest (the Docker image sizes) https://hub.docker.com/r/crossbario/crossbar/tags/ - the "plus" thing actually bundles a complete PyPy (with everything down to OpenSSL. Everything but a C run-time that is).

@oberstet
Contributor

oberstet commented May 1, 2016

@meejah @hawkowl ^

@tpetazzoni

Thanks for reconsidering!

We haven't looked at PyPy so far, but we probably should at some point.

@oberstet
Contributor

oberstet commented May 1, 2016

So, looking again, I think we should be able to make such a change compatible with a regular PyPy (one that isn't built with --objspace-lonepycfiles).

Rgd PyPy: yeah, you should totally look into it;) The performance gains are huge. The JIT is top notch. Plus: it has an incremental GC, which makes network servers like Crossbar.io have consistent low latencies.

@meejah
Contributor

meejah commented May 4, 2016

Hmm, having to re-create Python's search-stuff via regex plus trawling __pycache__ seems brittle and doomed to fail.

What if we invoke the script via -m crossbar.worker.process instead, so that Python's internal "find code to run" is tickled and we don't have to muck about in the filesystem ourselves at all (nor care whether e.g. PyPy uses __pycache__ or not, about stale cached files, etc.)?

I tested the (upcoming) PR by doing this:

  • create a virtualenv
  • install crossbar
  • run an example crossbar setup
  • delete all .py files from the venv
  • run the example again (fails, without my PR)
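
A rough sketch of the `-m` idea, assuming the worker is started as a child process of the current interpreter. The helper names (`module_argv`, `module_is_importable`, `run_module`) are hypothetical and this is not the code from the upcoming PR; the stdlib module `json.tool` stands in for `crossbar.worker.process` so the example runs without Crossbar.io installed.

```python
import importlib.util
import subprocess
import sys


def module_argv(module_name, *args):
    """Build a command line that lets the interpreter locate the module."""
    return [sys.executable, "-m", module_name, *args]


def module_is_importable(module_name):
    """Ask the import system whether it can find the module at all."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


def run_module(module_name, *args, stdin_text=""):
    """Run the module in a child process and capture its output."""
    return subprocess.run(
        module_argv(module_name, *args),
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for: module_argv("crossbar.worker.process")
    result = run_module("json.tool", stdin_text='{"a": 1}')
    print(result.stdout)
```

The point of the design is that no path to a `.py` or `.pyc` file appears anywhere: the import machinery of whichever interpreter is running decides what to load.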

@meejah
Contributor

meejah commented May 4, 2016

@yegorich can you try #777 and confirm if it works for your use-case? Thanks 😃

@meejah
Contributor

meejah commented May 5, 2016

I am closing this as #777 provides the same functionality; if this implementation is preferable for some reason, please re-open (note there are mixed tabs/spaces somewhere in this patch, though, according to Travis builders).

@meejah meejah closed this May 5, 2016