MameToolkit #6197

Closed
M-J-Murray opened this issue Jul 16, 2019 · 7 comments
Comments

@M-J-Murray

Hello,

I would like to increase the size limit of my PyPI project from 60MB to 120MB (https://pypi.org/project/MAMEToolkit/#files).
I have to include a pre-compiled instance of MAME inside my repository for my code to work. I have just recompiled a newer instance of the library, and the size has increased, so the entire code is now 101MB. 120MB would be a safe padding.

Yours faithfully,
Michael Murray

@jamadden
Contributor

Looking at https://pypi.org/project/MAMEToolkit/#files, there seem to be some issues with the metadata.

There are two files uploaded:

  • MAMEToolkit-1.0.2-py3-none-any.whl (60.3 MB)
  • MAMEToolkit-1.0.2.tar.gz (59.9 MB)

Both archives include the (199MB, once unpacked) native executable file:

./emulator/mame/mame: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=8919845a1e38cf7aa89341a1cb57e1c4734b0830, with debug_info, not stripped

The .tar.gz file is a source distribution, meaning it typically isn't supposed to contain compiled binaries. It's supposed to contain the source to allow anyone to build for their platform.

The py3-none-any.whl is a binary distribution, but the tags in its filename say that it runs on any Python 3 (py3), needs no particular interpreter ABI (none), and works on any platform (any). That means that tools like pip will install this wheel on every platform (macOS, Linux, Raspberry Pi, etc.), but it won't run due to the platform-specific binary embedded in it. 😢

There's a solution, though: distribute manylinux wheels. These are wheels whose filename ends in something like cp37-cp37m-manylinux1_x86_64.whl. Tools like pip will only install such a binary wheel on the appropriate platforms. On other platforms they'll use the source distribution, which should contain everything needed to compile a working binary (on a supported platform); if there is no source distribution, they'll give up and tell the user the project isn't available.
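
To make the tag discussion concrete, here is a minimal sketch of how the three compatibility tags can be read out of a wheel filename. It relies on the wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag, separated by dashes). The names `WheelTags`, `parse_wheel_filename`, and `is_platform_specific` are made up for this illustration and are not part of pip or setuptools.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    """Components of a wheel filename (hypothetical helper type)."""
    distribution: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        # No build tag present.
        dist, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return WheelTags(dist, version, build, python, abi, platform)


def is_platform_specific(tags: WheelTags) -> bool:
    """A wheel tagged 'any' claims to work on every platform."""
    return tags.platform != "any"


current = parse_wheel_filename("MAMEToolkit-1.0.2-py3-none-any.whl")
print(current.python, current.abi, current.platform)
print("platform specific:", is_platform_specific(current))
```

For the currently uploaded wheel, this reports the platform tag `any`, which is why installers treat it as installable everywhere even though it embeds a Linux x86-64 executable.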

But there's a small catch. One of the issues with distributing native binaries for Linux is the variety of shared libraries that are installed on individual platforms. The distributed mame executable links to these libraries:

 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libSDL2-2.0.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libutil.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libGL.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libasound.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libQt5Widgets.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libQt5Gui.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libQt5Core.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libX11.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libSDL2_ttf-2.0.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libfontconfig.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

The manylinux1 and manylinux2010 specifications outline a list of shared libraries that are safe to link to and that enable the binary to work on a large variety of Linux systems. Many of the libraries linked to by this binary are not on that list, so this binary is unlikely to work on systems other than exactly the one that compiled it (or very close cousins).
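
As a rough illustration of that check, the sketch below extracts the NEEDED entries from dynamic-section output like the listing above and reports which ones fall outside an allow-list. The allow-list is my transcription of the manylinux1 list in PEP 513 and should be treated as an assumption to verify against the PEP. The function names are hypothetical, and the sample input is a subset of the lines shown earlier.

```python
import re
from typing import Iterable, List

# Assumed manylinux1 allow-list, transcribed from PEP 513.
# Check the PEP itself before relying on this.
MANYLINUX1_ALLOWED = frozenset({
    "libpanelw.so.5", "libncursesw.so.5", "libgcc_s.so.1",
    "libstdc++.so.6", "libm.so.6", "libdl.so.2", "librt.so.1",
    "libcrypt.so.1", "libc.so.6", "libnsl.so.1", "libutil.so.1",
    "libpthread.so.0", "libX11.so.6", "libXext.so.6",
    "libXrender.so.6", "libICE.so.6", "libSM.so.6", "libGL.so.1",
    "libgobject-2.0.so.0", "libgthread-2.0.so.0", "libglib-2.0.so.0",
})

NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]")


def needed_libraries(dynamic_section: str) -> List[str]:
    """Return the shared library names listed as NEEDED, in order."""
    return NEEDED_RE.findall(dynamic_section)


def outside_allow_list(sonames: Iterable[str],
                       allowed: frozenset = MANYLINUX1_ALLOWED) -> List[str]:
    """Return the libraries that are not on the allow-list."""
    return [name for name in sonames if name not in allowed]


# A few of the entries from the listing above.
SAMPLE = """
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libSDL2-2.0.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libGL.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libQt5Core.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
"""

print(outside_allow_list(needed_libraries(SAMPLE)))
```

In practice the manylinux tooling performs this kind of inspection itself; the sketch only shows why libraries such as SDL2 and Qt5 are the ones that would have to be bundled.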

There's a solution to that too: the manylinux project provides tools and examples to build wheels that include the required shared libraries so that the binary can work on most Linux systems.

I think an appropriate tag for the existing wheel would be something like py3-none-linux_x86_64. After going through the manylinux process, a new tag would be py3-none-manylinux1_x86_64. PyPI doesn't allow uploading plain linux_x86_64 wheels, though, because that tag makes no promise that the binary will run on any Linux system other than the one that built it.

Would you be interested in updating your distributions (removing platform-specific compiled binaries from the source distribution, and either making the binary wheel manylinux-compliant or not distributing a binary wheel)?

@M-J-Murray
Author

Thank you Jason, this is extremely helpful. I wasn't aware that I had incorrectly configured the project. I will definitely make the appropriate changes. Once the fixes are made, would you be able to increase my data limit?

@jamadden
Contributor

> Thank you Jason, this is extremely helpful. I wasn't aware that I had incorrectly configured the project. I will definitely make the appropriate changes.

Great! Glad I could help.

> Once the fixes are made, would you be able to increase my data limit?

I'm not aware of any reason not to!

It might be a good idea to close this issue and open a new one when the packaging is ready. We try not to, but sometimes older issues get lost in the backlog.

@M-J-Murray
Author

Hello Again,

I'm having a lot of issues trying to exclude the mame binary from the tar.gz. Using setuptools, I can only seem to exclude files from the whl or from both, but not from the tar.gz alone. How do I state which files go into which archive using setuptools?

M-J-Murray reopened this issue on Jul 16, 2019

@jamadden
Contributor

jamadden commented Jul 17, 2019

Because you're essentially building a native code extension, I think you need to tell setuptools about that so that it knows what is source and what is built code. Ideally, we would use the setuptools functionality to actually build the compiled file and include its sources in our sdist. We could potentially piggyback on the libraries= keyword to setuptools by subclassing either setuptools.command.build_clib.build_clib or just setuptools.Command and passing the keyword argument cmdclass={'build_clib': my_build_clib}, where my_build_clib is that subclass.

If we expect to have the compiled file already built completely outside of the setuptools lifecycle, though, there's an easier, if hacky, way: override the default build_py command.

NOTE: This is from my own experience with setuptools, not an endorsement of this being the "right" or "official" way to do things.

NOTE: This doesn't handle making the built binary manylinux1 compatible. The built binary is still system specific and this wheel won't be accepted for upload by PyPI.

Given this directory layout:

$ ls -R
.:
README.rst  binary  setup.cfg  setup.py  src

./binary:
binary

./src:
package

./src/package:
__init__.py  binary  module.py

./src/package/binary:
__init__.py

We have a built binary file at binary/binary that we would like to end up in the wheel as package/binary/binary, beside package/binary/__init__.py, and that we want to exclude from the sdist.

Given this setup.cfg:

[bdist_wheel]
python-tag = py3
plat-name = linux_x86_64

We can write this minimal setup.py to extend the default build_py and copy that file into place for built distributions:

import shutil
import os

from setuptools import setup
from setuptools import find_packages
from setuptools.command.build_py import build_py as orig_build

version = '1.0.0.dev0'

# A hacky way to build our executable dependency,
# treating it like a C library, by copying a built
# binary.
# Ideally this would actually build the executable.
class build(orig_build):

    def run(self):
        orig_build.run(self)
        binary_file = os.path.join('binary', 'binary')
        if os.path.exists(binary_file):
            # Hardcoding the destination within our package structure.
            dest_dir = os.path.join(self.build_lib, 'package', 'binary')
            if not os.path.isdir(dest_dir):
                os.makedirs(dest_dir)
            shutil.copy(binary_file, dest_dir)

setup(
    name='package',
    version=version,
    url="https://example.com",
    author="You",
    author_email="you@example.com",
    zip_safe=False,
    packages=find_packages('src'),
    package_dir={'': 'src'},
    include_package_data=True,
    cmdclass={
        'build_py': build,
    },
)

Building the wheel gets the binary file:

$ python setup.py bdist_wheel
running bdist_wheel
running build
running build_py
...
removing build/bdist.macosx-10.14-x86_64/wheel
$ unzip -l dist/package-1.0.0.dev0-py3-none-linux_x86_64.whl
Archive:  dist/package-1.0.0.dev0-py3-none-linux_x86_64.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  07-17-2019 10:58   package/__init__.py
        0  07-17-2019 10:58   package/module.py
        0  07-17-2019 10:59   package/binary/__init__.py
        0  07-17-2019 12:14   package/binary/binary
      192  07-17-2019 12:14   package-1.0.0.dev0.dist-info/METADATA
      101  07-17-2019 12:14   package-1.0.0.dev0.dist-info/WHEEL
        8  07-17-2019 12:14   package-1.0.0.dev0.dist-info/top_level.txt
      616  07-17-2019 12:14   package-1.0.0.dev0.dist-info/RECORD
---------                     -------
      917                     8 files

Building the sdist does not:

$ python setup.py sdist
running sdist
...
Creating tar archive
removing 'package-1.0.0.dev0' (and everything under it)
$ tar -tf dist/package-1.0.0.dev0.tar.gz
package-1.0.0.dev0/
package-1.0.0.dev0/PKG-INFO
package-1.0.0.dev0/README.rst
package-1.0.0.dev0/setup.cfg
package-1.0.0.dev0/setup.py
package-1.0.0.dev0/src/
package-1.0.0.dev0/src/package/
package-1.0.0.dev0/src/package/__init__.py
package-1.0.0.dev0/src/package/binary/
package-1.0.0.dev0/src/package/binary/__init__.py
package-1.0.0.dev0/src/package/module.py
package-1.0.0.dev0/src/package.egg-info/
package-1.0.0.dev0/src/package.egg-info/PKG-INFO
package-1.0.0.dev0/src/package.egg-info/SOURCES.txt
package-1.0.0.dev0/src/package.egg-info/dependency_links.txt
package-1.0.0.dev0/src/package.egg-info/not-zip-safe
package-1.0.0.dev0/src/package.egg-info/top_level.txt
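
The two listings above can also be checked programmatically rather than by eye. The following is a small sketch using only the standard library; `archive_members` and `has_member` are hypothetical helper names, and the paths in the usage comment are the ones from this example layout.

```python
import tarfile
import zipfile
from typing import List


def archive_members(path: str) -> List[str]:
    """List member names of a wheel (zip) or an sdist (tar archive)."""
    if path.endswith((".whl", ".zip")):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # tarfile detects gzip/bz2/xz compression automatically in read mode.
    with tarfile.open(path) as tf:
        return tf.getnames()


def has_member(path: str, suffix: str) -> bool:
    """True if any member name in the archive ends with `suffix`."""
    return any(name.endswith(suffix) for name in archive_members(path))


# Example usage after building both distributions:
#   has_member("dist/package-1.0.0.dev0-py3-none-linux_x86_64.whl",
#              "package/binary/binary")   # expected to be True
#   has_member("dist/package-1.0.0.dev0.tar.gz",
#              "package/binary/binary")   # expected to be False
```

A check like this could run as part of a release script so that a compiled binary never slips back into the source distribution.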

@M-J-Murray
Author

Hello Jason,

Thank you for providing such amazing, in-depth advice. It saved me a lot of time and effort; I wasn't aware of any of this. I really appreciate it, you are awesome! Sorry it took so long, my Ubuntu install died after a dist-upgrade.

Could you review the changes and let me know if this is suitable for the size limit change?

@M-J-Murray
Author

Will move to new issue.
