Package naming policies #18

Closed
ocefpaf opened this issue Feb 10, 2016 · 46 comments

Comments

@ocefpaf
Member

ocefpaf commented Feb 10, 2016

My 2-cents:

  1. First try the package original name 😁
    A possible conflict would be packages like beautifulsoup4 (pypi) and beautiful-soup (anaconda name). I strongly believe that the majority of the users cannot find the anaconda package in a CLI search and end up installing the PyPI version.
  2. When conflicts arise, like a c-lib and its python bindings with the same name, add a py<c-libname>. For example: cdo and pycdo. (Or maybe python-cdo?)
  3. When adding a new package that has libraries used by other packages avoid naming it lib<package name>.

Note that anaconda names the netcdf package libnetcdf. However, that package is more than just the netcdf libs. Same for gdal and libgdal, but in that case gdal is even more confusing because that is the python bindings only, and libgdal is the rest of the package. To me this behavior is a bad mix of the Linux world, which splits packages into lib, dev, headers, etc., and the Python bias that we have when packaging non-Python packages.

(See the issues raised on #16.)

@ocefpaf
Member Author

ocefpaf commented Feb 10, 2016

Question: both pip and conda are case insensitive, but PyPI allows for mixed case names on the web. Should we do as anaconda does and keep everything lower case?

@ocefpaf
Member Author

ocefpaf commented Feb 10, 2016

Question: How to resolve a package name dispute (to avoid name squatters)?

If a discussion cannot resolve the naming we should stick with rule (1).

@jankatins
Contributor

First try the package original name

Only for python, for r use r-<cran name>

At some point, there will also be subpackage naming (e.g. matplotlib and matplotlib-qt4).

Should we do as anaconda does and keep everything lower case?

My vote: All lower...

@pelson
Member

pelson commented Feb 10, 2016

Thanks guys. Great start! @ocefpaf - is there an oracle somewhere in the Deb/Fed world on this kind of thing? If I'm honest, I'd personally prefer to follow their naming convention than Anaconda's...

Everything you've said so far makes a lot of sense to me. At the risk of pulling in too many chefs for Github's linear issues (at which point we should probably drop out to the conda-forge mailing list), I'd love to hear the opinions of @ericdill, @rmcgibbo and @mwcraig (and anybody else with experience in this realm).

@jankatins
Contributor

debian: https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Source

Package names (both source and binary, see Package, Section 5.6.7) must consist only of lower case letters (a-z), digits (0-9), plus (+) and minus (-) signs, and periods (.). They must be at least two characters long and must start with an alphanumeric character.

There is also the python policy: https://www.debian.org/doc/packaging-manuals/python-policy/ch-module_packages.html#s-package_names

Public modules used by other packages must have their binary package name prefixed with python-. It is recommended to use this prefix for all packages with public modules as they may be used by other packages in the future. Python 3 modules must be in a separate binary package prefixed with python3- to preserve run time separation between python and python3. The binary package for module foo should preferably be named python-foo, if the module name allows, but this is not required if the binary package ships multiple modules. In the latter case the maintainer chooses the name of the module which represents the package the most. For subpackages such as foo.bar, the recommendation is to name the binary packages python-foo.bar and python3-foo.bar.
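
As a rough illustration of the two Debian rules quoted above, the name syntax can be checked with a regular expression and the Python binary package name built by prefixing. This is only a sketch: `is_valid_name` and `debian_python_name` are hypothetical helper names, not part of Debian tooling or conda, and the regex encodes just the quoted sentence.

```python
import re

# Debian Policy rule as quoted above: lower case letters, digits, '+', '-'
# and '.', at least two characters, starting with an alphanumeric character.
DEBIAN_NAME_RE = re.compile(r"[a-z0-9][a-z0-9+.\-]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the quoted naming rule."""
    return DEBIAN_NAME_RE.fullmatch(name) is not None


def debian_python_name(module: str, python3: bool = False) -> str:
    """Prefix a module name the way the quoted Python policy recommends."""
    prefix = "python3-" if python3 else "python-"
    return prefix + module
```

Under this rule a mixed-case name such as `Beautiful-Soup` is rejected while `beautifulsoup4` passes, which is one way to read the "all lower case" votes above.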

@pelson pelson mentioned this issue Feb 10, 2016
13 tasks
@ChrisBarker-NOAA
Contributor

@JanSchulz posted:
"consist only of lower case letters (a-z), digits (0-9), plus (+) and minus (-) signs, and periods (.). They must be at least two characters long and must start with an alphanumeric character."

seems reasonable, let's just adopt it.

Python packages should mirror the name on pypi whenever possible, and the actual python package name if there is no pypi name.

(no idea why continuum did what it did with beautiful soup, but what can we do?)

Python bindings to a c lib should be called py_the_lib_name, unless it already has another name on pypi.

c libs should be called lib[the lib name]

  • continuum is pretty inconsistent on this, though:
    • libpng
    • jpeg
    • freetype
    • libtiff

And Hey! I just noticed that we're maintaining a "beautifulsoup4" package -- making @ocefpaf 's point!

As for GDAL -- that's a mess -- it's great that the GDAL project provided python bindings, but having them built and installed at once has been a pain for ages (and not just with conda). And continuum has now split them off, which is good, but they are stuck with the old naming (bad). So let's try not to do that. There are times when multiple things are in one source repo that really should be multiple packages -- so let's keep it that way when we can.

We should put this in a doc somewhere, that we can all edit, rather than this linear discussion...

In the end, "A foolish consistency is the hobgoblin of little minds" -- so let's establish some default standards, and then not get too uptight about it.

@pelson
Member

pelson commented Feb 10, 2016

In the end, "A foolish consistency is the hobgoblin of little minds" -- so let's establish some default standards, and then not get too uptight about it.

👍 for that. Who doesn't have access to the wiki on this repo that would like it? That would seem an obvious first place for the content.

@jankatins
Contributor

I would be very much for a md document in this repo which gets built into the website. PRs and the possibility to comment and repush are IMO better than wiki style changes...

@ChrisBarker-NOAA
Contributor

@JanSchulz: yeah the end goal is probably user-readable docs, so that's a good idea. However, right now the .io site is static html -- is it possible to mix that with Jekyll (or Sphinx)?

If the longer term goal is a Jekyll site, then starting with .md in the wiki is not so bad....

@jankatins
Contributor

@ChrisBarker-NOAA My (limited but usually bad) experience with discussion around wiki docs tells me to go with a simple md doc in the repo and simply link to the repo from the html files :-) Comments are much better here...

@jankatins
Contributor

"conda name = pypi name" is also great for this:

{% set data = load_setuptools() %}

requirements:
  build:
    - python
    {% for req in data.get('install_requires', []) -%}
    - {{req}}
    {% endfor %}

@jankatins
Contributor

[BTW, it would be really great if conda had a way to submit the pip names and get conda names back...]

@ChrisBarker-NOAA
Contributor

I'm not sure what you mean here -- could you clarify?

On Feb 13, 2016, at 4:21 AM, Jan Schulz notifications@github.com wrote:

[BTW, it would be really great if conda had a way to submit the pip names
and get conda names back...]



@jankatins
Contributor

@ChrisBarker-NOAA I'm not sure how many packages have names that differ from the original pypi package, but there are probably a few. If there were a function which returned the pip package names as conda package names, then specifying the package dependencies could be as simple as

requirements:
  build:
    - python
    {% for req in translate_to_conda(data.get('install_requires', [])) -%}
    - {{req}}
    {% endfor %}

Or even easier:

requirements:
  build:
    - python
    - {{ TRANSLATED_SETUPPY_REQUIREMENTS}}

...but I should probably take this to the conda-build repo...
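
To make the idea concrete, here is a minimal Python sketch of what such a `translate_to_conda` helper might look like. Nothing like this exists in conda or conda-build as far as this thread establishes; the `PYPI_TO_CONDA` table is hypothetical, and only its `beautifulsoup4` entry comes from the discussion above. Extras (`foo[bar]`) and environment markers are deliberately not handled.

```python
import re

# Hypothetical exception table: PyPI name -> conda name. Only the
# beautifulsoup4 entry is taken from this thread; a real table would have
# to be maintained somewhere (a recipe field, the index, a separate tool).
PYPI_TO_CONDA = {
    "beautifulsoup4": "beautiful-soup",
}

# Split "name<specifier>" into the name and whatever follows it.
_REQ_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def translate_to_conda(requirements):
    """Map setup.py-style requirement strings to conda-style ones."""
    translated = []
    for req in requirements:
        match = _REQ_RE.match(req)
        if match is None:
            raise ValueError(f"cannot parse requirement: {req!r}")
        # Lower-case the name, following the "all lower case" proposal.
        name = match.group(1).lower()
        spec = match.group(2).strip()
        conda_name = PYPI_TO_CONDA.get(name, name)
        translated.append(f"{conda_name} {spec}".strip())
    return translated
```

Names without an entry in the table pass through unchanged, so the helper only matters for the (hopefully few) packages whose conda name differs from the pypi one.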

@ChrisBarker-NOAA
Contributor

hmm, handy, yes -- where would that code go? in conda? a totally separate
utility?

I suppose a conda recipe could have a field in it for the original pip
name or something. But I suspect this is simply going to require us to
handle special cases in a few places. :-(

-CHB


Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@noaa.gov

@jankatins
Contributor

It needs changes in conda and conda-build, as this field needs to be included in the index.

@ocefpaf
Member Author

ocefpaf commented Mar 23, 2016

Recently we named the module cobra as python-cobra in conda-forge. I am OK with that, but we should try to avoid confusion. At some point people will start submitting <package>-python, <package>-py, py_<package>, etc.

We should add a rule to always disambiguate package names by adding the prefix python-. It is less damaging and clearer.

@PythonCHB
Contributor

I'm a bit confused. We need a "Python name" for Python packages that wrap c libs, but use the same name on PyPi. But it looks like cobra (also referred to as "cobrapy") requires libglpk, and is called "cobra" on PyPi. So why a new name?

Anyway, for those cases where we do need to disambiguate between c libs and Python wrappers, yes, a consistent naming scheme is good.

I would prefer "py_libname", but python_libname is fine, too. (Though underscores are better than dashes, dashes mean something in package names, to pip anyway)

Unfortunately, plenty of PyPi packages already use their own, inconsistent naming schemes, but we'll just have to live with that.

@jakirkham
Member

One related point. On the lib and Python bindings point, if the library provides Python bindings, I think it is good if we can ensure both the library and Python bindings are installed in the same package. Combining these together so we aren't managing two related pieces and dealing with breaks between them will make maintenance and install more straightforward. Just my thought. If there are other opinions on this, please share.

@ChrisBarker-NOAA
Contributor

@jakirkham: it all depends on whether the lib is likely to be needed/used outside of the python package.

I guess we should follow the original package developers' lead here -- if they bundle the lib on pypi, then we should too. (pyproj is a good example of that -- proj is certainly useful outside of python but the author packages it in with the python bindings)

@pelson
Member

pelson commented Mar 25, 2016

If they bundle the lib on pypi, then we should too. (pyproj is a good example of that -- proj is certainly useful outside of python but the author packages it in with the python bindings)

I don't agree necessarily. By necessity, if you package on PyPI (as a binary wheel) then you must bundle the lib dependencies as you have no other tool at your disposal. We do have a tool at our disposal to install non python dependencies, and should be using it IMO (otherwise we could have called this wheel-forge 😉). I completely accept it isn't always clear cut though - if in doubt, we should see what Debian/other packagers have done.

@ocefpaf
Member Author

ocefpaf commented Mar 25, 2016

I don't agree necessarily. By necessity, if you package on PyPI (as a binary wheel) then you must bundle the lib dependencies as you have no other tool at your disposal. We do have a tool at our disposal to install non python dependencies, and should be using it IMO (otherwise we could have called this wheel-forge 😉). I completely accept it isn't always clear cut though - if in doubt, we should see what Debian/other packagers have done.

Agreed. As I said in #56 (comment) pyproj is a special case.

If they bundle the lib on pypi

Most packagers will read that as: add the lib as a dependency. I can say for Fedora and OpenSUSE that there is nothing similar to the bundling done in wheels.

@jakirkham
Member

We do have a tool at our disposal to install non python dependencies, and should be using it IMO (otherwise we could have called this wheel-forge 😉)

Down with wheel-forge! 😆 All joking aside. 😉

We do have a tool at our disposal to install non python dependencies, and should be using it IMO

Agreed. For instance, take a case like pyzmq, I do think we should be teasing out the libraries it is bundling (am trying to do that at present). It is possible that things like libsodium or libzmq, which pyzmq will bundle for you, will be useful for other things and we want to avoid having the same thing twice. Also, we want to be able to leverage things at all levels of the stack easily as we write new recipes.

@ChrisBarker-NOAA
Contributor

I don't agree necessarily. By necessity, if you package on PyPI (as a
binary wheel) then you must bundle the lib dependencies as you have no
other tool at your disposal.

True -- and folks usually statically link -- see matplotlib, for instance.

So let me rephrase: if the package authors bundle the source to a
dependency, then we should follow suit. I.e. Pyproj.

CHB


@jakirkham
Member

Not familiar with Pyproj, but I think I agree. protobuf is one that came to mind before. They bundle C++, Python, etc. source code together.

@jakirkham
Member

cc @ccordoba12

@ocefpaf
Member Author

ocefpaf commented Jun 24, 2016

Currently we have a few packages that diverge from defaults names. See https://github.com/conda-forge/python-simplegeneric-feedstock/issues/1#issuecomment-227316114

  • simplegeneric
  • drmaa
  • decorator
  • pathlib2

In conda-forge those have the python- prefix.

@ChrisBarker-NOAA
Contributor

I would argue we leave python-drmaa as is.

agreed.


@jankatins
Contributor

I agree. any package that is only (or primarily) a python package doesn't need a python- or py- prefix.

I don't think this is sustainable: what happens if any other ecosystem (mostly the native one, but also lua/R/...) starts a new package with the same name? IMO it would pay off to be consistent and take a bit of pain due to renames (+ transitional packages?)

@ChrisBarker-NOAA
Contributor

On Sun, Jun 26, 2016 at 9:20 AM, Jan Schulz notifications@github.com
wrote:

I agree. any package that is only (or primarily) a python package doesn't
need a python- or py- prefix.

I don't think this is sustainable: what happens if any other ecosystem
(mostly the native one, but also lua/R/...) starts a new package with the
same name? IMO it would pay off to be consistent and take a bit of pain due
to renames (+ transitional packages?)

It would be whoever comes "second"'s job to come up with a new name.

But we have a key problem here -- no one is controlling the namespace. For
PyPi, it's a simple "first come, first served" system -- which has its
problems, but at least is clear and simple.

Is conda-forge accepted enough that we could call it an authority? i.e. if
a name is used in Anaconda or conda-forge, then it is taken.

calling something py-the_name_on_pypi sort of solves the problem for python
packages, but doesn't address it for anything else anyway.

But maybe you're right -- solving part of the problem may be good enough --
though it's a bit late, lots of folks expect the conda package name to
match what you import in python....(or at least the pypi name)

-CHB


@jankatins
Contributor

jankatins commented Jun 27, 2016

Isn't it easier to teach users to use prefixes on everything apart from real user apps (aka libsomething for native packages, python-something for python packages) than to let users find out whether jq is the native package, the python lib or the r lib, depending on who was first? IMO it should be jq for the commandline app, libjq for the library part, and r-jq and python-jq for the r and python packages which wrap libjq.

Edit:

lots of folks expect the conda package name to match what you import in python....(or at least the pypi name)

This is an expectation which has to change and probably will change if users use conda for other ecosystems.
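
The prefix scheme proposed in this comment can be sketched in a few lines of Python. This is purely illustrative: `PREFIXES` and `prefixed_name` are made-up names, the set of package kinds is an assumption based on the jq example, and nothing here reflects an adopted conda-forge rule.

```python
# Prefixes proposed in the comment above; "app" (a real user-facing
# application) keeps the bare upstream name.
PREFIXES = {
    "app": "",
    "native": "lib",
    "python": "python-",
    "r": "r-",
}


def prefixed_name(upstream: str, kind: str) -> str:
    """Return the package name for `upstream` under the prefix scheme."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown package kind: {kind!r}")
    prefix = PREFIXES[kind]
    name = upstream.lower()
    # Avoid doubling a prefix that upstream already carries (e.g. "libjq").
    if prefix and name.startswith(prefix):
        return name
    return prefix + name
```

With this mapping, the four jq packages from the example come out as `jq`, `libjq`, `python-jq` and `r-jq`, regardless of which one was submitted first.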

@mwiebe

mwiebe commented Jun 27, 2016

I like this idea of transitioning to consistent prefixing conventions. There's definitely pain in doing so, but for the long-term I think it's important if the anaconda, conda, conda-forge communities want to be agnostic to particular languages and runtimes. I think it fits in the same general category as the work decoupling MSVC compiler runtimes from Python versions.

@jakirkham
Member

jakirkham commented Jun 30, 2016

It is true that prefixing is nice. I have generally been an advocate, but I'm also a pragmatist. We are hurting the user experience.

For instance, one person had their Anaconda install break completely because of this. If we want to pursue prefixing, we need to do it with a transition period in mind. This could be with metapackages that share the old name, leaving defaults packages with the same name, or something else people would like to contribute.

The most important point IMHO is we should not be spending precious cycles resolving issues that amount to us making a choice that does not accept the current state of affairs with defaults.

@pelson
Member

pelson commented Jun 30, 2016

I unfortunately agree with @jakirkham. The sooner we are tooled to make this call on a merit basis the better, but in the meantime, we do need to maintain consistency with what has already taken place in defaults.

@jjhelmus
Contributor

I agree also. If conda-forge does not play nicely with defaults I think it will drive away users and possible contributors. At this stage I think we need to cater to our user base, not make it difficult for them to use our packages.

Perhaps at a later time we can figure out a way to gracefully transition to using prefixed package names.

@ChrisBarker-NOAA
Contributor

God yes!

I thought this was only about packages that are not currently in default
anyway.

But still -- there is a convention in defaults -- most python packages are
named the same as their pypi names, and, indeed, most of those names are
what gets imported.

And folks really do expect this to be the case!

-CHB


@ocefpaf
Member Author

ocefpaf commented Jun 30, 2016

I thought this was only about packages that are not currently in default anyway.

A long, long time ago it was 😄

@patricksnape

This problem seems to be very similar to the issue of Visual Studio versions/compiler level features to me. Disambiguating how a package is built/what it is built for seems to be an integral part of conda that hasn't been sorted out yet. For example, rather than manually prefixing, would it be easier if packages could be 'tagged' with which languages it provides implementations for? Then the user could be given the choice to select a particular language at install time?
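
A small Python sketch of the tagging idea, to show what "select a particular language at install time" could mean in practice. Everything here is hypothetical: conda has no such `languages` tag according to this thread, and `Candidate` and `select` are invented names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One package sharing a name; `languages` is a hypothetical tag set."""
    name: str
    languages: frozenset


def select(candidates, name, language=None):
    """Return candidates called `name`, optionally narrowed by language tag."""
    matches = [c for c in candidates if c.name == name]
    if language is not None:
        matches = [c for c in matches if language in c.languages]
    return matches
```

If three packages were all called `jq` and tagged `c`, `python` and `r`, a plain lookup would return all three and the installer would have to ask the user, while a lookup with `language="python"` would return exactly one.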

@ocefpaf
Member Author

ocefpaf commented Jul 25, 2016

I believe we can close this issue (not that the discussion is over 😄). Any new "big" changes in the current naming policies should probably be submitted as an enhancement proposal. (Or at least added to the agenda for the meetings.)

@ocefpaf ocefpaf closed this as completed Jul 25, 2016
@mwcraig
Contributor

mwcraig commented Jul 25, 2016

Just in case someone in the future stumbles across this closed issue, see this gist: https://gist.github.com/mcg1969/da5aec380d2ed083b79ddcf151ca16f1

@ocefpaf
Member Author

ocefpaf commented Jul 25, 2016

Thanks @mwcraig!

@ChrisBarker-NOAA
Contributor

Ping!

This just got linked to from another discussion -- looking over this thread, it was proposed that an MD page get created, ultimately to be included (Or linked to) from the conda-forge docs.

@mwcraig started on that with the gist linked to above:

https://gist.github.com/mcg1969/da5aec380d2ed083b79ddcf151ca16f1

looks like the discussion petered out in May...

@mwcraig
Contributor

mwcraig commented Nov 2, 2017

Just to clarify, I didn't write that gist, I just linked to it. I'm not capable of thinking that deeply about package name-spacing 😀


10 participants