use setuptools_scm to manage package version and sdist file list #648

Merged (8 commits) on Sep 26, 2016

anthrotype
Member

setuptools_scm, from the Python Packaging Authority (PyPA), is a setuptools extension which uses Git (or Mercurial) to manage Python package versions.

Basically, it uses the revision control metadata to generate a unique version string for setup.py, and optionally a 'version.py' file which can be included in the distribution.

Depending on the number of commits since the last tag, and on the clean/dirty state of the repository, it produces a unique version string. This guarantees that the installed package metadata is kept up to date with the source, without tedious manual work such as checking in a version file that has to be modified at each release cycle.
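As a rough sketch of what this looks like in practice, a setup.py can omit the hard-coded version and ask setuptools_scm to compute it. The package name and the path of the generated version file below are hypothetical placeholders, not taken from this PR.

```python
# setup.py (sketch). The package name and file path are hypothetical.


def build_setup_kwargs():
    """Return keyword arguments for setuptools.setup()."""
    return {
        "name": "example-package",  # hypothetical name
        # No "version" key: setuptools_scm derives it from the Git (or
        # Mercurial) metadata and also writes it to a module that can be
        # shipped with the distribution.
        "use_scm_version": {"write_to": "example_package/version.py"},
        # setuptools_scm has to be available when setup.py runs.
        "setup_requires": ["setuptools_scm"],
    }


# In a real setup.py one would then call:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

Wrapping the arguments in a function is only done here so the sketch can be inspected without running a build.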

I'm really keen on having this implemented here, as well as in the rest of the dependent packages.
Together with the PyPI deployment on tags, this will ensure that maintainers (i.e., us) will be encouraged to cut new releases more frequently, with all the benefits that ensue from that, for both users and developers.

Libraries could require other libraries as "abstract" dependencies in setup.py's install_requires, specifying only a name and a minimum required version, and without needing to resolve the dependency graph manually. Pip-installing one package will automatically pip-install the others.

Applications (e.g. fontmake) that are meant to be deployed in production could use pip's "requirements.txt" to specify concrete dependencies, with "pinned-down" version numbers (e.g. fonttools==3.1.0).
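To make the distinction concrete, here is a minimal sketch that puts abstract minimums (as in install_requires) next to concrete pins (as in requirements.txt) and checks that every pin satisfies its minimum. The ufoLib version number is an invented placeholder, and the parsing only handles the simple `name>=X.Y.Z` and `name==X.Y.Z` forms.

```python
# Abstract dependencies, as a library might declare them in setup.py.
# The ufoLib version is an illustrative placeholder.
INSTALL_REQUIRES = ["fonttools>=3.1.0", "ufoLib>=2.0.0"]

# Concrete dependencies, as an application might pin them in requirements.txt.
REQUIREMENTS_TXT = """\
fonttools==3.1.0
ufoLib==2.0.0
"""


def release_tuple(version, width=3):
    """Turn '3.1' into (3, 1, 0) so that versions compare numerically."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def parse_pins(text):
    """Read 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            name, _, version = line.partition("==")
            pins[name] = version
    return pins


def unmet_minimums(install_requires, pins):
    """Return the abstract requirements that the pins do not satisfy."""
    unmet = []
    for requirement in install_requires:
        name, _, minimum = requirement.partition(">=")
        pinned = pins.get(name)
        if pinned is None or release_tuple(pinned) < release_tuple(minimum):
            unmet.append(requirement)
    return unmet
```

In real use pip does this resolution itself; the sketch only illustrates how the two kinds of declaration relate.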

If we set it up for all ten or so requirements of fontmake, we could finally put an end to the infamous "dependency hell".

@anthrotype
Member Author

anthrotype commented Aug 6, 2016

For example, if you look at the CI log, the version string produced for this PR branch is:

3.1.dev934+nge316f9a

Since the last tag is 3.0, setuptools_scm increases the last digit by one, giving 3.1.

Then, after the dev tag, it appends the number of commits since the last tag; in our case, it's a whopping 934 -- i.e. almost one year of work.

The last part of the version, following +n (I guess it means "number"), is the commit hash. The little g between the n and the hash indicates that we are using git for version control.

If the commit coincided with a git tag (and if the working directory is not "dirty"), then the version string produced would simply be, e.g., 3.1.

If the working directory is dirty, another suffix (starting with d) is appended with the current timestamp, e.g. .d20160806.
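The anatomy described above can be sketched as a small parser. This is only an illustration of the string layout shown in this comment (release, `.devN`, `+ng<hash>`, optional `.dYYYYMMDD`); the regular expression and the function name are my own and are not part of setuptools_scm.

```python
import re

# Layout assumed from the examples in this thread:
#   <release>[.dev<distance>][+ng<hash>][.d<YYYYMMDD>]
VERSION_RE = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"      # e.g. 3.1
    r"(?:\.dev(?P<distance>\d+))?"      # commits since the last tag
    r"(?:\+ng(?P<node>[0-9a-f]+))?"     # 'n', then 'g' for git, then the hash
    r"(?:\.d(?P<date>\d{8}))?$"         # present only for a dirty working dir
)


def parse_scm_version(version):
    """Split a generated version string into its parts."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    distance = match.group("distance")
    return {
        "release": tuple(int(p) for p in match.group("release").split(".")),
        "distance": int(distance) if distance else 0,
        "node": match.group("node"),
        "dirty_date": match.group("date"),
    }
```

For the string from the CI log, this yields release `(3, 1)`, distance `934` and node `e316f9a`, with no dirty date.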

@anthrotype
Member Author

May I suggest we use three MAJOR.MINOR.PATCH digits for the next tag, i.e. 3.1.0, instead of 3.1?
This way setuptools_scm would increase the last digit by one as we add new commits, and we could try to conform to http://semver.org.

@lemzwerg

lemzwerg commented Aug 7, 2016

May I suggest we use three MAJOR.MINOR.PATCH digits for the next
tag, i.e. 3.1.0, instead of 3.1? This way setuptools_scm would
increase the last digit by one as we add new commits, and we could
try to conform to http://semver.org.

Mhmm, I don't like redundant trailing zeroes, which are simply ugly.
I think it is straightforward to add some code to internally set X to
X.0.0 and X.Y to X.Y.0.

We shouldn't sacrifice readability just for dumb programming.

Werner

@anthrotype
Member Author

I agree with Werner that the third "patch" digit could be seen as redundant when it's '0'.
However, I think that we should still keep the zero in the second "minor" digit, e.g. when we reach version 4.0.
I believe that even for humans, not just parsers, having a dot between numbers makes it look like a "version" string, and not simply a number.

I re-read PEP 440, the official Python specification for version identification, which setuptools and pip implement to identify packages and resolve dependencies.

The "final release" segment of a version is defined as N(.N)*, or

one or more non-negative integer values, separated by dots.

When comparing release segments with different numbers of components, the shorter segment is padded out with additional zeros as necessary.

This conforms with what Werner suggested. However, the spec also notes that

While any number of additional components after the first are permitted under this scheme, the most common variants are to use two components ("major.minor") or three components ("major.minor.micro").
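The zero-padding rule quoted from PEP 440 can be illustrated with a few lines of Python. This is a simplified sketch that handles only plain release segments such as 3.1 or 3.1.0 (no dev, pre or local parts); the function names are mine.

```python
from itertools import zip_longest


def release_key(version):
    """Turn a release segment like '3.1.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def compare_releases(left, right):
    """Return -1, 0 or 1, padding the shorter release with zeros."""
    pairs = zip_longest(release_key(left), release_key(right), fillvalue=0)
    for a, b in pairs:
        if a != b:
            return -1 if a < b else 1
    return 0
```

Under this rule 3.1 and 3.1.0 compare as equal, which is why a tag without the trailing zero does not confuse pip.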

setuptools_scm does allow using a custom version scheme instead of the default one. We just need to pass a callable as explained in the docs, and I can look into it.

Anyway, isn't it funny how we haven't tagged a release for a year, and now we find ourselves discussing whether it's going to be 3.1 or 3.1.0...
I'm sure we all agree that any version scheme is better than a never-changing one. :)

@anthrotype
Member Author

OK, I implemented the custom version scheme as discussed above.
Now the third number can be omitted (defaults to 0), and the next development version will have it incremented by one.

Because the current tag is still 3.0, the generated version string for this PR branch is now:

3.0.1.dev940+ngccf3f2b

All future release tags should be in the form MAJOR.MINOR[.MICRO], with the third number optional. The dot and the second number are not optional, even when the second number is 0.
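The difference between the default behaviour and the custom scheme described here can be sketched with two plain functions. These are illustrative stand-ins, not the code from this PR and not setuptools_scm's API; the local part of the version (the `+ng<hash>` suffix) is left out for brevity.

```python
def default_next_dev(tag, distance):
    """Bump the last component that is present in the tag."""
    parts = [int(p) for p in tag.split(".")]
    parts[-1] += 1
    return "{}.dev{}".format(".".join(str(p) for p in parts), distance)


def padded_next_dev(tag, distance, width=3):
    """Pad the tag to MAJOR.MINOR.MICRO first, then bump the micro number."""
    parts = [int(p) for p in tag.split(".")]
    parts += [0] * (width - len(parts))
    parts[-1] += 1
    return "{}.dev{}".format(".".join(str(p) for p in parts), distance)
```

With the 3.0 tag, the first function gives `3.1.dev934` for 934 commits, and the second gives `3.0.1.dev940` for 940 commits, matching the two strings reported in this thread.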

This is because the default git describe command, which setuptools_scm uses to get the latest tag and the distance from it, passes the --match option with a *.* glob pattern to select among the available tags.

I could change the pattern to [0-9]*, but this would require further patching -- and I have already spent too much time on this.
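To see why a tag like 4 would be ignored by the *.* pattern, the matching can be approximated with Python's fnmatch module. This is only an approximation of git's own glob matching, and the tag list is made up for the example.

```python
from fnmatch import fnmatchcase


def matching_tags(tags, pattern):
    """Return the tags that a shell-style glob pattern would select."""
    return [tag for tag in tags if fnmatchcase(tag, pattern)]


# Hypothetical tags, for illustration only.
TAGS = ["3.0", "3.1.0", "4", "nightly"]

dotted = matching_tags(TAGS, "*.*")        # requires a dot somewhere
numeric = matching_tags(TAGS, "[0-9]*")    # requires a leading digit
```

Here `dotted` keeps only 3.0 and 3.1.0, while `numeric` also accepts the bare 4 and still rejects non-version tags such as nightly.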

I think this is ready to be merged now.

@anthrotype
Member Author

hmm, I can see the reaction hasn't been very enthusiastic so far... 😞

@anthrotype anthrotype force-pushed the setuptools_scm branch 3 times, most recently from 393e102 to c00bed9 Compare August 15, 2016 12:12
@adrientetar
Member

this will ensure that maintainers (i.e., us) will be encouraged to cut new releases more frequently

I think we all know a release hasn't happened in a while, we don't need a new more complicated version numbering scheme for that (imho).

@behdad
Member

behdad commented Aug 15, 2016

Can we keep it simple please? :)

@behdad
Member

behdad commented Aug 15, 2016

I'm all for more automation, but adding new, niche tools just to remove the "pain" of changing a version number doesn't look like that much of a gain to me. I understand we want more releases, but the reason we don't have more regular releases is because of multiple half-finished modules in tree; mostly mine! varLib, mtiLib.

Perhaps a facility to mark certain modules as unstable would get us closer to having releases. WDYT?

@anthrotype anthrotype force-pushed the setuptools_scm branch 3 times, most recently from c98abaa to b14dab8 Compare August 19, 2016 13:47
@anthrotype
Member Author

anthrotype commented Aug 19, 2016

There are several interrelated problems at stake here.

We don't tag releases very often -- because, like Behdad said, there's stuff which is not completed, and many can only work in their spare time, etc. The less frequently we release, the more the very action of releasing becomes like.. "oh my god! This is going to be perfect".

As we don't release often, we don't often deploy to the official repository where Python users nowadays expect to find Python packages. Because of that, users will inevitably be led to install fonttools from the git repository -- which is supposed to be the in-development, unstable copy. (Not to mention that git is not pre-installed on all platforms, so it is virtually an install requirement of fonttools!).

Now, the version number that shows in the pip list of installed packages will be 3.0 whether they installed fonttools yesterday or in August 2015. But almost 1000 commits have passed since then, and everybody knows this is no longer the same fonttools. Lots of new features were added, bugs fixed, possibly even APIs broken, but the version always stays the same.
There is no way to know, once the package is installed, what the "real" version is, besides the one in the metadata which is stuck at 3.0 anyway.
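For reference, the metadata version that pip list reports can also be read at runtime. The sketch below uses importlib.metadata from the standard library of recent Python versions (3.8 and later), which is an assumption on my part; at the time of this thread the usual route was pkg_resources. The helper name is mine.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints whatever the installed metadata says, or None if fonttools is absent.
print(installed_version("fonttools"))
```

Whatever string this returns is only as informative as the version recorded at install time, which is the point being made above.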

It's not just users that can't tell what-the-fonttools they are using; other developers as well are put into an uncomfortable situation. They would like to use fonttools (the library) in their own tools, but they can't simply add it to their install_requires like they do with any other python package in the world. This would give them a "version" that is too old. So they have to tell their users that, in order to run their tools, they first need to install something called fonttools, explain to them that they can't get the one from PyPI, but "Behdad's fonttools... The one from Github!". (and they also try to make this look cool).

Alternatively, they maintain monstrous "requirements.txt" files with several git urls of all the dependencies of the dependencies of the dependencies..., which the user is supposed to pip install -r.... Some duly decide to freeze the requirements at (not very human-friendly) specific commit hashes, which they update from time to time. Others simply give up, and point requirements to the url of the master branches. As soon as anything in the dependencies changes, all their tests start to fail, or users complain that stuff is suddenly broken.

This situation is bad, and it can't go on like that forever.

The thing is, fonttools and all its sister projects are being used for production. They need to be deployed to users' machines or to servers to make fonts. These products must be reproducible. There has to be, and in fact there is, a better way to distribute these packages that does not require all this mess.

The proposal to add an automated versioning tool like setuptools_scm, using Git to do what it can do best (i.e. keeping track of revisions, and getting the complete list of files under revision control), is a way to try to solve this problem.

  • it frees us developers from having to make "bump version to 3.1.4" commits. The simple act of creating a git tag becomes the same as bumping the version.
  • if anyone installs from git source, they get a unique version string (which can be very useful when reporting bugs, for example), which actually makes sense, and is derived from the current status of the repository in relation to the last tagged release. It is not complicated! Even someone who hasn't read PEP 440 can tell the difference between 3.0.1.dev3 and 3.0: the former is somewhat more recent but is not deemed stable yet. The rest of the numbers help make that local copy identifiable (a shortened commit hash, a timestamp if installation was done from an unclean repository). We can argue about the necessity of the "micro" number or similar, but there is no doubt this would be so much better than the current confusion where 3.0 is not actually 3.0.
  • the act of tagging also automatically deploys the wheels to PyPI, where pip expects them to be. The option to install from git source was added as an aid in development, not as a means of deploying to users or for production. Once the library is on PyPI, it immediately enters the virtuous circle of automatic dependency tracking. "You want to use woff2? Just pip install fonttools[woff2] and done!". Currently, well.. you first need to know there is such an external requirement ("go figure it out from the import statements!"), then clone that repo, install a C++ compiler, etc...

About marking some sub-packages as "unstable", as Behdad suggested, I believe anything which is merged in master has to be deemed stable enough. Maybe not bug-free-release-ready, but nevertheless, kind of useable. If one doesn't want it to be used yet, I think one should keep it in a separate branch, and rebase that until it gets ready.
We could tell find_packages to exclude some subpackages that match some patterns, but it would be preferable to simply not merge them if not deemed stable -- instead of refraining from releasing the 90% of the library which is, as a matter of fact, used in production and hence stable.

I guess I've exhausted all arguments. I just would like to be able to take care of this, because I know I could solve it if you allow me to.

@benkiel
Collaborator

benkiel commented Aug 19, 2016

@anthrotype +1

The thing is, fonttools and all its sister projects are being used for production. They need to be deployed to users' machines or to servers to make fonts. These products must be reproducible. There has to be, and in fact there is, a better way to distribute these packages that does not require all this mess.

This is a big thing, having a way of being able to troubleshoot a user issue by knowing which version is installed (and not having to explain git to folks) is really important to the larger, not-as-technical-as-you-may-assume crowd.

@anthrotype anthrotype force-pushed the setuptools_scm branch 2 times, most recently from a09ea0c to c529e7d Compare August 20, 2016 10:54
@anthrotype
Member Author

anthrotype commented Aug 22, 2016

For simplicity, I dropped the setuptools_scm_git_archive, and only kept setuptools_scm.

Also for further simplicity, I would like to drop the custom version scheme suggested early on by Werner (2756148), and simply use the default one with three major.minor.patch numbers, e.g. 3.1.0 rather than 3.1 if the patch is zero.
The former is more consistent and makes explicit we follow semantic versioning (where the minor and patch number are not optional).

@anthrotype
Member Author

anthrotype commented Sep 10, 2016

I used graphviz to visualize the dependency hell. If you like, you can show it in your slides in Warsaw ;)

fontmake dot

digraph fontmake_deps {
    # fonttools -> brotli [style=dotted]
    # fonttools -> zopfli [style=dotted]
    # fonttools -> unicodedata2 [style=dotted]
    # fonttools -> xattr [style=dotted]
    # fonttools -> PyQt5 [style=dotted]
    # fonttools -> AppKit [style=dotted]
    # fonttools -> pygtk [style=dotted]
    # fonttools -> reportlab [style=dotted]

    cu2qu -> fonttools
    cu2qu -> ufoLib
    # cu2qu -> defcon [style=dotted]

    ufoLib -> fonttools

    defcon -> fonttools
    defcon -> ufoLib
    # defcon -> compositor [style=dotted]

    ufo2ft -> fonttools
    ufo2ft -> ufoLib
    ufo2ft -> compreffor [style=dotted]
    ufo2ft -> cu2qu [style=dotted]
    # ufo2ft -> defcon [style=dotted]
    # ufo2ft -> unicodedata2 [style=dotted]

    booleanOperations -> fonttools
    booleanOperations -> ufoLib
    # booleanOperations -> defcon [style=dotted]

    compreffor -> fonttools

    fontMath -> fonttools
    fontMath -> ufoLib

    mutatorMath -> defcon
    mutatorMath -> fontMath

    glyphsLib -> fonttools
    glyphsLib -> defcon
    glyphsLib -> mutatorMath

    fontmake -> fonttools
    fontmake -> cu2qu
    fontmake -> defcon
    fontmake -> ufo2ft
    fontmake -> booleanOperations [style=dotted]
    fontmake -> mutatorMath [style=dotted]
    fontmake -> glyphsLib [style=dotted]
}
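As a small illustration of what this graph implies for installation, the solid (non-dotted, non-commented) edges above can be fed to a topological sort to obtain an order in which each package comes after its dependencies. This sketch uses graphlib from the standard library (Python 3.9 and later); the dictionary is transcribed by hand from the graph, and the function name is mine.

```python
from graphlib import TopologicalSorter

# package -> set of packages it depends on (solid edges only).
HARD_DEPS = {
    "fonttools": set(),
    "ufoLib": {"fonttools"},
    "cu2qu": {"fonttools", "ufoLib"},
    "defcon": {"fonttools", "ufoLib"},
    "ufo2ft": {"fonttools", "ufoLib"},
    "booleanOperations": {"fonttools", "ufoLib"},
    "compreffor": {"fonttools"},
    "fontMath": {"fonttools", "ufoLib"},
    "mutatorMath": {"defcon", "fontMath"},
    "glyphsLib": {"fonttools", "defcon", "mutatorMath"},
    "fontmake": {"fonttools", "cu2qu", "defcon", "ufo2ft"},
}


def install_order(deps):
    """Return the packages ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


order = install_order(HARD_DEPS)
```

fonttools always comes first because everything else depends on it, directly or indirectly. With proper releases on PyPI, pip would work out such an order on its own.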

@benkiel
Collaborator

benkiel commented Sep 10, 2016

@anthrotype looking at that graph, I wondered why booleanOperations needed defcon: checking the source, I don't see a dependency on it. One less circle of hell!

@anthrotype
Member Author

It's true, it does not import from it, but I would argue it is a de facto dependency, as it expects something like a Defcon glyph as input. Robofab is no longer an option, but yes, fontparts glyph objects should be compatible with booleanOperations, though I haven't tried myself. I guess I can remove that arrow from the graph.

@moyogo
Collaborator

moyogo commented Sep 10, 2016

@anthrotype maybe you can make the arrow line dotted to indicate a soft dependency.

@anthrotype
Member Author

Thanks for the comments!
I updated the graph to only include the dependencies which are actually imported. I use dotted lines to indicate when the import is optional (i.e. not at top level). I removed the 'soft' dependencies (i.e. no import, duck typing) and those less frequently used to avoid cluttering the graph.

@anthrotype
Member Author

oh... http://furius.ca/snakefood/ (I wish I had found this before!)

Cosimo Lupo added 6 commits September 26, 2016 23:56
setuptools_scm handles managing python package versions using git metadata instead of declaring them as the version argument or in a git-managed file.

https://github.com/pypa/setuptools_scm
Else Travis only clones the last 50 commits, and `git describe` doesn't work.
…_resources

Do not export 'version' from top-level fontTools.__init__ module, as it is
rarely used; importing pkg_resources here would slow down importing fontTools.