Build steps? #119
Not rocket science to convert your metadata format to mine either. Or a standardized toml metadata based on setup() arguments could be used. Or flit could provide a service to write the dist-info folder and perhaps the manifest, and the build system could do everything else. Or flit could provide useful command line tools like 'publish' and the build step could do almost everything else including writing the wheel.
Also enscons doesn't really require the metadata to be in toml, it's all dicts internally.
Yeah, there could be a flit2enscons tool which would help people upgrade to the more flexible packaging when they need it. :-)
I am actually looking to see an SVG diagram of those "build steps" whatever they are.
I'm not quite sure what you mean. The discussion is about supporting whatever build steps people might want (compiling, cython, JS minification...), not writing any specific build steps.
The cargo build script is a decent example. http://doc.crates.io/build-script.html Cargo runs the script, the script does anything it wants to, and then cargo goes back to its own strength which is building Rust packages. Seems analogous to the place flit wants to occupy. @techtonik here are a couple of examples of non-C-compiler build steps, with ascii art instead of SVG. My pysdl2-cffi wrapper has a build dependency tree excerpted in part here, printed with
ttf.py needs to be generated after sdl/init.py and they both depend on the contents of a special purpose code generator in builder/*.py (since editing the code generator is a big part of doing the project). The build system knows how to run the code generator to transform the inputs to the output by traversing this graph.
The .egg-info directory is built by generating PKG-INFO, entry_points.txt and requires.txt, which depend on the contents of pyproject.toml.
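The make-style rule described above — regenerate an output whenever it is missing or one of its inputs is newer — boils down to a timestamp comparison. A minimal illustrative sketch (not code from enscons or flit):

```python
import os

def needs_rebuild(target, sources):
    """Make-style check: rebuild if target is missing or any source is newer."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

A real build system layers dependency-graph traversal on top of this, so that e.g. ttf.py is regenerated whenever anything in builder/*.py changes.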
Yup, the cargo build script is the sort of thing I'm thinking of.
One option to consider would be to re-use PEP 517 and delegate the entire wheel building step. This would align with Daniel's suggestion of "flit could provide useful command line tools like 'publish' and the build step could do almost everything else including writing the wheel."
One option for the 'build script' model is for flit to allow build steps which don't limit platform compatibility, like minifying Javascript, or generating Python UI code. But you would still need to use another tool if you have build steps like compiling native code, where the result is less portable than the source. I'm still not sure whether this is worth the added complexity, though.
There are plenty of build systems out there. Why not make ... The reason for that is to increase transparency into what ...
I definitely don't want to reinvent a build system; I'm certainly not qualified to tackle that, and as you say, there are plenty of build systems out there. What I wonder about is whether we can (and should) make things easier for people who are going from a pure Python package with no build step to a package that maybe bundles some minified Javascript, or compiles a Cython file. It seems rough to make them throw away their flit packaging and start again with another tool, as we currently do. If you want to integrate building Python packages with Scons, see @dholth's enscons. I have no plans to turn flit into that kind of tool, but I'd be interested to hear if you make any other such integrations with build systems.
I'm warming slightly to the idea of a cargo-style build script: Flit would make the sdist, unpack to a temporary directory, and then run the build script there before packing up a wheel. In theory, this isn't much extra complexity, but... what does theory know? Other complexities that have occurred to me:
For the latter concern, you may want to define a set of environment variables that flit makes available to the build script, and have one of them be ... That could also work if you receive requests for help with managing artifact caching - pass in a ...
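As a sketch of that idea, a build script could read hypothetical variables such as FLIT_SRC_DIR and FLIT_CACHE_DIR — these names are invented for illustration and are not part of any real flit API:

```python
import os
from pathlib import Path

# FLIT_SRC_DIR / FLIT_CACHE_DIR are hypothetical names flit *could* set;
# when they are absent we fall back to the current directory.
src_dir = Path(os.environ.get("FLIT_SRC_DIR", "."))
cache_dir = Path(os.environ.get("FLIT_CACHE_DIR", src_dir / ".build-cache"))

cache_dir.mkdir(parents=True, exist_ok=True)
# Example build-ish work: find JS files that would need minifying.
minified = [p for p in src_dir.glob("**/*.js") if not p.name.endswith(".min.js")]
print(f"{len(minified)} file(s) would be minified; cache at {cache_dir}")
```

The point is that the contract between flit and the script stays tiny: a couple of well-known variables rather than a plugin API.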
One word of caution about copying things to a temporary directory for building... it takes several minutes for projects with large git histories, lots of ...
I'd copy only the current VCS checkout - no history, no files that aren't version controlled - so the speed hopefully wouldn't be an issue. Of course, relying on and working with VCSs brings its own set of problems, but I've already chosen that path for building sdists. It would need a fallback path for when there's no VCS info available, though; either because it's not a VCS checkout, or because the VCS software is not found. In that case, it would probably have to copy all the files and hope.
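The "copy only the current VCS checkout" approach can be sketched with `git ls-files` (assuming git is installed; a real implementation would need the fallback path described above for other VCSs or none):

```python
import shutil
import subprocess
from pathlib import Path

def copy_tracked_files(repo_dir, dest_dir):
    """Copy only git-tracked files from repo_dir into dest_dir.

    Illustrative sketch only - no history, no untracked files.
    """
    out = subprocess.run(
        ["git", "ls-files", "-z"], cwd=repo_dir,
        capture_output=True, check=True,
    )
    for rel in out.stdout.decode().split("\0"):
        if not rel:
            continue
        target = Path(dest_dir) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(repo_dir) / rel, target)
```

Because only tracked files are copied, large histories and build artifacts in the working tree don't slow things down.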
It's probably a tradeoff. Thinking of an example webapp, either you copy ...
Yup, precisely. And a build script is also more predictable, because we can be sure that it was run on these files, not the files as they were two weeks ago. Build tools can also cache data outside the package directory. I don't think ...
Because of read or because of write? If flit could detect if tmp is memory based - can this speed things up? https://superuser.com/questions/45342/when-should-i-use-dev-shm-and-when-should-i-use-tmp
Cargo copies to build?
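For what it's worth, the "is tmp memory based?" question can be answered on Linux by inspecting /proc/mounts. A rough, illustrative sketch (not something flit does):

```python
def is_memory_backed(path="/tmp"):
    """Best-effort, Linux-only check: is `path` on a tmpfs/ramfs mount?"""
    try:
        with open("/proc/mounts") as f:
            entries = [line.split() for line in f]
    except OSError:
        return False  # /proc not available, e.g. not Linux
    fstype = ""
    best_len = -1
    for fields in entries:
        if len(fields) < 3:
            continue
        mount_point = fields[1]
        # The longest mount point that prefixes `path` is the one in effect.
        if path.startswith(mount_point) and len(mount_point) > best_len:
            best_len = len(mount_point)
            fstype = fields[2]
    return fstype in ("tmpfs", "ramfs")
```

On many distributions /dev/shm is tmpfs while /tmp is disk-backed, so a tool could in principle prefer whichever is memory-backed.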
It might be worth exploring 2 builds -- one before sdist, one before wheel. I was thinking about Cython modules. Usually, one would want to compile the Cython code to C code before creating an sdist. And then before creating a wheel, compile the C files to libraries.
I know a lot of people do things like that, but I think it's a leftover from the days before wheels, when we needed to make it easy to install from sdist. Now that we have both wheels and a good way to specify build requirements, I don't think this is necessary: sdists should be pure source, minus generated files. So any build steps would run as part of generating wheels. If I do this, it will probably only handle build steps that produce platform-independent wheels, at least for the first version. So it would work for things like bundling Javascript, but not for building Cython modules.
Sounds fair to me. :)
I have a very complicated compile workflow for one of my packages: the first step is code generation, the second is launching a virtual machine with a different OS to actually compile binaries, then I use setuptools+wheel to pack wheels. I build binaries manually and "attach" them to a package using package_data. I can't ship source dists - they would be absolutely useless - only wheels, so it would be helpful to disable sdists in flit config entirely. In my case it is far better to prevent people from installing fallback source dists and complaining why the package doesn't work for them. Another thing to note is managing tags for several wheels. Currently I have to edit setup.cfg every time I build wheels:

```ini
[bdist_wheel]
python-tag = cp36 ; cp35 cp37 etc
plat-name = win32 ; win64 etc
```

What I expect from flit is just making a set of wheels (for different Pythons and platforms) and uploading them to PyPI, given the set of binaries I want to "attach". It would be helpful if flit could do some validation, e.g. remind me if I didn't (re)build something, to prevent uploading incorrect wheels (a Python hook for flit?).

Variant A:

```toml
[tool.flit.binaries]
disable-sdist = true
python-tag = ["cp35", "cp36", "cp37"]
plat-name = ["win32", "win64"]
exclude = [
    { python-tag = "cp35", plat-name = "win64" },
]
paths = []
check-binary-hook = "project:check_binary"
```

```python
import time

def check_binary(binary_path):
    s = binary_path.stat()
    if not s.st_size:
        return False  # have to rebuild
    if time.time() - s.st_mtime > 300:
        return False
    return True
```

Variant B:

```toml
[tool.flit.binaries]
disable-sdist = true
wheel-list = "project:wheel_list"
file-mapper = "project:file_mapper"
file-checker = "project:file_checker"
```

```python
import time
from pathlib import Path

def wheel_list():
    for python_tag in ('cp35', 'cp36', 'cp37'):
        for plat_name in ('win32', 'win64'):
            if python_tag == 'cp35' and plat_name == 'win64':
                continue
            yield python_tag, plat_name

def file_mapper(python_tag, plat_name):
    folder = Path('{}_{}'.format(python_tag, plat_name))
    yield folder / 'module.pyd'
    yield folder / 'submodule.pyd'

def file_checker(binary_path):
    # probably instead of a checker we could define a content function
    # returning a chunk of bytes,
    # e.g. for fetching binaries as artifacts from CI servers
    s = binary_path.stat()
    if not s.st_size:
        return False  # have to rebuild
    if time.time() - s.st_mtime > 300:
        return False
    return True
```

The goal of PEP 517/PEP 518 is making things more automated for the end user/developer, making fewer assumptions about build-time requirements. But these PEPs work entirely on the Python side; they cannot pull in compilers, Visual Studio, virtual machines with different OSes, external binary packages and third-party applications, e.g. blender, firefox, or even build systems like make/scons/cmake. It is unavoidable for developers to install all these packages manually, following some guide in the project's readme. Before wheels, pip could compile some packages for you from source dists. That requires an installed compiler, python headers, maybe cython, etc. But pip fails if your package depends on something external, like blender headers: you have to install prerequisites first, then ...
Flit is probably not going to be the right tool for you, then. Even if we add build steps, it's focused on the simple use cases, not the complicated ones. You might want to look at a system like enscons instead.
The PEPs don't specify a mechanism for that, but there's no reason that a build backend complying with these PEPs couldn't download and run a VM or a container to do the build. In fact, I'd be surprised if no-one makes a docker build backend for PEP 517.
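A bare-bones sketch of such a backend: the hook name and signature come from PEP 517, while the image name and mount layout are made up for illustration.

```python
import os
import subprocess

def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """PEP 517 hook that delegates the actual build to a container.

    "example/builder-image" is a hypothetical Docker image; a real backend
    would pin a specific, reproducible image.
    """
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{os.getcwd()}:/src",
         "-v", f"{os.path.abspath(wheel_directory)}:/dist",
         "example/builder-image",
         "pip", "wheel", "--no-deps", "-w", "/dist", "/src"],
        check=True,
    )
    # PEP 517 requires returning the basename of the wheel that was written.
    return sorted(os.listdir(wheel_directory))[-1]
```

A frontend like pip would call this hook via the `build-backend` named in pyproject.toml; everything inside the container is invisible to it.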
Probably I wrote an overloaded post, but for me (regarding changes only in flit) it's quite enough to have:
I have no plans to distribute buildable sdists with scripts, makefiles and docker.
@neumond you might actually be part of the small target "market" for enscons. It provides a few tools for building a wheel, which can contain anything, as a result of arbitrarily complex build steps. You don't have to include the sdist target if you don't want one. There is also the pre-enscons approach taken by https://bitbucket.org/dholth/sdl2_lib/src/default/, a library that only packages sdl2 dlls in a wheel and cannot have a reasonable sdist, which just implements a standalone wheel in a 130 line waf script.
@dholth Nice idea of building on top of scons, but for me scons is kinda counter-intuitive, I spend too much time searching APIs for every single function I want to use. Have you considered building on top of Waf? UPD: oh, sorry, just realised your links above point exactly to waf-built wheels.
@neumond In addition to being useful in its own right for folks that like Scons, enscons is considered an illustration of the principle of adapting an existing full-featured build system to the Python ecosystem's build system interface expectations, rather than writing a fresh build system from scratch. The entire purpose of the PEP 517/518 build system abstraction layer is to make that easier to do. However, this aspect of the discussion has now shifted to be entirely off-topic for flit's issue tracker - this issue is about asking whether or not there's a useful middle ground between ...
@ncoghlan Right now I feel the need for an abstract wheel-assembling library, not tied with ...
I started a thread about using Meson as build-system https://discuss.python.org/t/should-python-packaging-aim-for-meson-as-build-system-in-case-of-extension-modules/2579 which would function just like enscons. See e.g. https://github.com/FRidh/mesonpep517examples/blob/master/pyproject.toml and the ...
I've now got a project that fits the use case of "minifying the JS/CSS". Given that I now do have experience with the use case, here's what I can say we'd need for that use case.

I'm imagining a new

```toml
[tool.flit.wheel]
build = "build.py"
include = ["src/projectname/scripts/bundle.min.js", "src/projectname/styles/bundle.css"]
exclude = ["src/projectname/assets/*"]
```

With this, I can then have a

```python
from subprocess import run

run(["npm", "install"], check=True)
run(["gulp", "build"], check=True)
```

I do think there are a few not-yet-decided-on items, so... well, here's my take on them: :P
@takluyver do you think this looks like a reasonable overview of the approach that flit could take for this?
Another option is the

```toml
build = { script = "build.py", requires = ["pylibsass"] }
```

If we make the requires key optional, we could also get away with:

```toml
build.script = "build.py"
```

which reads really nicely IMO. :)
@takluyver gentle nudge. If you express any interest in this, I'm happy to put up a PR for the above. :)
With the coming of PEP 621 - implemented but undocumented in Flit 3.2, hopefully to be documented once people have kicked the tyres a bit - I'm leaning towards saying no to build steps. I originally wrote in the issue description that:
PEP 621 should make the packaging metadata portable between build tools, so it will be easier to switch from Flit to something like enscons or setuptools, assuming they also get support. It probably won't be entirely seamless - changing build processes never is - but it should avoid having to rewrite the same metadata in a slightly different format. And that metadata can be most of what you give Flit. I've also come to appreciate that building software is a massive, complicated topic, and I would rather not be responsible for a build system. We could limit the scope to make it an easier problem, e.g. only allowing build steps which don't affect platform compatibility, like minifying JS. But I think there are relatively few use cases like this between the much more common ones where there's either no build step at all, or we want to build native code for a specific target platform. So I don't think there's much value in adding build steps without support for the complex cases. So I'm planning to close this issue in a week or two, unless someone makes a compelling argument in favour of supporting build steps despite PEP 621. Of course, that won't mean it's written in stone - we can always revisit the question if circumstances change or if there's a brilliant new idea for how to implement it.
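For reference, a minimal PEP 621-style pyproject.toml of the kind Flit 3.2 accepts; the project name and dependency are placeholders:

```toml
[build-system]
requires = ["flit_core >=3.2"]
build-backend = "flit_core.buildapi"

[project]
name = "myproject"
version = "0.1.0"
description = "An example package"
dependencies = ["requests >=2.0"]
```

Because the `[project]` table is standardized, another PEP 621-aware backend could be swapped in by changing only the `[build-system]` section.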
Closing for the reasons described in my previous comment. Maybe we'll revisit this some day in different circumstances, but for now, I think the drawbacks outweigh the advantages, and I don't think it's useful to have questions like this hanging in uncertainty indefinitely.
This is another speculative issue for discussion, not something that's necessarily going to be implemented. The question is whether & how to support build steps to run when producing a wheel.
Should we?
So far, I have always said that flit is a simple tool for simple packages, i.e. those with no build step. I think these are the vast majority of packages we publish, and the simplicity argument is still an important one against adding such features. I also don't know much about compilers, which are one of the most common sorts of build steps.
The best argument I see in favour is that you may start packaging something with flit, then add something that requires a build step. With no support for that, you have to throw away your packaging metadata and workflows to switch to another tool.
I expect that the majority of people looking at this issue will be those who want such features, so I'm not going to pay too much attention to +1s.
How would it work?
Briefly, the wheel build would copy files (the files in the package and... some others?) to a temporary location, and invoke external tools there to do the build.
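A rough sketch of that flow; the function names and the shape of the script invocation are illustrative, not flit's actual code:

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

def build_with_script(pkg_dir, build_script, make_wheel):
    """Copy the package to a temp dir, run the build script there,
    then let the caller pack the result into a wheel."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "build"
        shutil.copytree(pkg_dir, work)
        if (work / build_script).exists():
            subprocess.run([sys.executable, build_script], cwd=work, check=True)
        return make_wheel(work)  # pack whatever the script left behind
```

The original source tree is never modified; generated files exist only in the temporary copy that gets packed into the wheel.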
It would need some way to figure out the compatibility effects of the hooks:
It would also need to elegantly combine multiple build hooks - e.g. if you have a C extension module and a Qt designer .ui file to compile, it should be possible to specify those as two independent build steps, without either needing to be aware of the other. Or maybe this composition should be the responsibility of a proper build system which we invoke.