Add page with list of trove classifiers #1300

nlhkabu · 2016-06-30T05:42:59Z

For package authors, we will need to add a list of all trove classifiers. I am not sure where this is best placed - maybe in the footer?

ping @dstufft for an opinion on this.

timofurrer · 2016-07-08T02:37:42Z

Is this really a page that belongs on the PyPI site?

PEP 301 calls them distutils trove classifiers. Maybe a better place to give more information about those to the package authors would be the Python Packaging User Guide?

I think a link somewhere close to the classifiers shown in the project details page would be useful.

nlhkabu · 2016-07-08T05:27:14Z

Yes, sure. I am not opposed to moving it out of the site and into the docs. I do think we need to continue to link directly to it though, as this is something that is currently done on PyPI (so users may be accustomed to finding the list via this link).

takluyver · 2017-12-11T12:13:07Z

It's valuable to have the list in a convenient plain form, not embedded in HTML docs, for tools to fetch. Flit does this, for instance. The list changes often enough (e.g. adding new frameworks) that I don't want to bundle a static copy.

At present it seems you can get this from the legacy API at https://pypi.org/pypi?%3Aaction=list_classifiers - can we rely on that URL working for the foreseeable future?
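For tools that want to consume the list this way, a minimal sketch might look like the following. It assumes the endpoint returns one classifier per line; the function names are illustrative and are not taken from Flit or PyPI.

```python
from urllib.request import urlopen

# URL quoted in the comment above.
CLASSIFIERS_URL = "https://pypi.org/pypi?%3Aaction=list_classifiers"


def parse_classifiers(text):
    """Split a plain-text response into a set of classifier strings.

    Assumes one classifier per line; blank lines are ignored.
    """
    return {line.strip() for line in text.splitlines() if line.strip()}


def fetch_classifiers(url=CLASSIFIERS_URL, timeout=10):
    """Download the list and parse it (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_classifiers(response.read().decode("utf-8"))


# Offline example using a small hand-written sample.
sample = "Framework :: Django\nFramework :: Flask\n\nTopic :: Utilities\n"
known = parse_classifiers(sample)
print("Framework :: Flask" in known)
```

Keeping the parsing separate from the download makes it possible to test the tool without network access.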

dstufft · 2017-12-11T12:27:07Z

Yes that will continue to work, although ideally for automated consumption we come up with some newer, better JSON based API.

takluyver · 2017-12-11T13:12:09Z

Great, thanks. There's no pressure from my side for a JSON version of this unless someone makes a classifier with a newline in.

ncoghlan · 2018-01-27T07:04:52Z

I think @timofurrer's question does raise an interesting UX question: would a file in the specifications section of the PyPUG and/or some other PyPA repo that anyone can submit a PR to be a better primary data source for this information than the PyPI database?

Then PyPI would just be a consumer of that file (presenting it via the web URL), rather than the source of the official list.

waldyrious · 2018-03-22T21:53:56Z

> would a file in the specifications section of the PyPUG and/or some other PyPA repo that anyone can submit a PR to be a better primary data source for this information than the PyPI database?

As an additional data point, doing this would be more transparent and would make issues like this easier to address.

theacodes · 2018-03-25T04:21:58Z

I agree with @ncoghlan - we should have some canonical location for this that's easy to modify.

Could we pull these classifiers out of the warehouse database and just store them in a datafile in this repository?

Alternatively, I could see us establishing a new project that holds the canonical list that warehouse and pypug depends on.

Related: #3028

dstufft · 2018-03-25T04:54:50Z

Easy to modify isn't exactly ideal here. There are a few different types of modification you can make:

- Addition: This is easy to cope with. Since it's purely additive, Warehouse can simply add it.
- Deletion: This is less easy to cope with, because there are really two kinds of deletion possible:
  - Deletion where we want to expunge the record from all releases. This is technically easy, but unlikely to actually be what we want (and it would make the PyPI metadata and the package metadata disagree, which is undesirable).
  - Deletion where we simply want to disallow new uploads containing the classifier, but still want to retain it for the historical record.
- Rename: This is hard to deal with. You don't want to suddenly start rejecting previous versions of the classifier, since that would break people's uploads for little reason; instead you want to treat the old and the new name as equivalent.

This also makes a simple text file not really well suited for it, because you can't really differentiate between deletion to expunge, deletion to block, and renames. In addition, internally in Warehouse (and legacy PyPI) the trove classifiers are represented as rows in a database that we foreign key against, so something that we depend on isn't going to be a workable solution unless we do something janky like trying to automatically reconcile our database against that dependency (which then starts to get into all of the problems I listed above with having to figure out what sort of change was made).
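To illustrate the ambiguity, here is a small sketch that diffs two versions of a flat list. The classifier names are made up for the example, and the helper is not part of Warehouse.

```python
def diff_lists(old, new):
    """Return (added, removed) between two flat classifier lists."""
    old_set, new_set = set(old), set(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


before = ["Framework :: Foo", "Topic :: Utilities"]

# Case 1: "Framework :: Foo" is renamed to "Framework :: Foo Bar".
after_rename = ["Framework :: Foo Bar", "Topic :: Utilities"]

# Case 2: "Framework :: Foo" is deleted and an unrelated
# "Framework :: Foo Bar" happens to be added in the same change.
after_delete_and_add = ["Framework :: Foo Bar", "Topic :: Utilities"]

# Both cases produce exactly the same diff, so the intent is lost.
print(diff_lists(before, after_rename))
print(diff_lists(before, after_delete_and_add))
```

A consumer that only sees the two file versions cannot tell whether the old name should keep working as an alias, be blocked for new uploads, or be expunged.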

Beyond all of that though, regardless of whether we call the list in some other location or the list inside the DB the "canonical" location, practically speaking, PyPI is going to be the de facto canonical location in every way that anyone actually cares about (since 99% of the time, what someone cares about when looking at classifiers is whether PyPI will accept them or not).

Ultimately, I think the canonical location being on PyPI makes things easier to maintain and manage and it allows us to provide a better user experience for end users as well. It lets us put structured data in the database, while providing a UI to actually manage it that glosses over the details of actually managing that structured data. It also lets us tailor what the list we give people contains, based on what the context of us giving them that list is. For example, in documentation we would almost certainly exclude any legacy aliases for renamed classifiers or deleted classifiers that we still have the record for but are no longer accepting, but for an API endpoint that something like flit might call, we'd want to include all of the classifiers we are currently accepting (legacy alias or not) but none of the ones that we are not. I imagine there'd even be an API that reports all classifiers past and present and their current status.
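As a rough sketch of what "tailoring the list to the context" could mean, the following models each classifier as a small record and derives the three views described above. The field names and sample data are assumptions made for illustration; this is not Warehouse's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classifier:
    name: str
    accepted: bool = True            # False: blocked for new uploads, kept for history
    alias_of: Optional[str] = None   # set for a legacy name that was renamed


# Hypothetical data for illustration only.
CLASSIFIERS = [
    Classifier("Framework :: New Name"),
    Classifier("Framework :: Old Name", alias_of="Framework :: New Name"),
    Classifier("Topic :: Retired", accepted=False),
]


def for_documentation(classifiers):
    """Current names only: no legacy aliases, nothing that is blocked."""
    return [c.name for c in classifiers if c.accepted and c.alias_of is None]


def for_upload_validation(classifiers):
    """Everything still accepted on upload, legacy aliases included."""
    return [c.name for c in classifiers if c.accepted]


def full_status(classifiers):
    """Every classifier, past and present, with its current status."""
    return {
        c.name: {"accepted": c.accepted, "alias_of": c.alias_of}
        for c in classifiers
    }
```

A tool like flit would want something like `for_upload_validation`, while a documentation page would want `for_documentation`.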

ncoghlan · 2018-03-25T05:06:19Z

Thanks @dstufft, given that extra information, I agree it makes sense to have the database remain the official classifier listing, with various APIs to extract the current state of the list.

theacodes · 2018-03-25T05:35:05Z

So the question now is: how should we bring the legacy API forward to provide this canonical list via both an API and a UI in Warehouse?


brainwane · 2018-04-16T21:03:32Z

Since https://pypi.org/pypi?%3Aaction=list_classifiers works adequately I'm moving this out of the "Shut down legacy PyPI" milestone and into the milestone of work that can happen after April 30th. But @waseem18 and other volunteers should continue to work on this and related trove classifier display issues if they would like to!

waseem18 · 2018-04-17T10:32:07Z

We can change the legacy API to return JSON (or build a new API) and use that to populate the UI. So is the new UI going to be a part of PyPI, or will it go under the Python Packaging User Guide?

takluyver · 2018-04-17T10:40:42Z

Please do build that as a new endpoint (a different URL), and leave the legacy API returning a plain text list - there's code that fetches the list from that legacy API, and changing it to JSON would break that.

(It could be smart and look at the 'Accept' request header to decide which format to return. But I think it's better to build the modern API at a nicer URL anyway.)
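For reference, the 'Accept' header idea could be sketched roughly as below. This is a deliberately simplified negotiation, not how Warehouse handles requests, and the JSON shape (`{"classifiers": [...]}`) is an assumption made for the example.

```python
import json


def render_classifiers(classifiers, accept_header=""):
    """Return (content_type, body) for a classifier list.

    Simplified negotiation: JSON only when the client explicitly asks
    for it; anything else gets the legacy plain-text form. A real
    implementation would parse q-values and media ranges properly.
    """
    requested = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "application/json" in requested:
        body = json.dumps({"classifiers": list(classifiers)})
        return "application/json", body
    return "text/plain", "\n".join(classifiers)
```

Existing clients that send no 'Accept' header, or a wildcard, keep getting the plain-text list, which is the compatibility property asked for above.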

waseem18 · 2018-04-17T10:43:21Z

Yup. I'll go ahead with creating a new API.

di · 2018-04-19T11:47:33Z

There is an existing issue for getting classifiers via the JSON API: #1244

brainwane · 2018-04-20T20:40:03Z

The page is now at http://pypi.org/classifiers . Thanks @di.

waldyrious · 2018-04-21T11:04:05Z

@brainwane, @di and all: is there a place where the suggestion made by @ncoghlan above:

> would a file in the specifications section of the PyPUG and/or some other PyPA repo that anyone can submit a PR to be a better primary data source for this information than the PyPI database?

...could be tracked, e.g. by opening a new issue? Or has that already been decided against?

FWIW, I still think that a public and collaborative ("PR-able") data source would be preferable to a private database table. At least the table definitions for recreating the database could be made available in a repo somewhere, similar to https://noc.wikimedia.org, and for similar reasons.

ncoghlan · 2018-04-21T15:46:11Z

I withdrew the basic suggestion of a flat text file based on Donald's comments at #1300 (comment)

That doesn't rule out the possibility of a "classifier log" format though, that tracks the possible operations as a series of historical events:

- addition of new classifiers
- renaming of classifiers
- prohibition of a classifier in new uploads (rare)
- removal of a classifier from all published metadata records (incredibly rare due to the resulting inconsistency with the artifact's internal metadata)

The way to pursue the idea further would be as a new issue proposing to derive the contents of the classifier table from a source controlled log of classifier changes, and then after discussing a suitable design with the Warehouse devs, working on a PR to actually implement that.
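To make the "classifier log" idea more concrete, here is one possible sketch of replaying such a log into a table. The event format, the operation names, and the record fields are all hypothetical; an actual design would need to be agreed with the Warehouse devs as described above.

```python
def replay(log):
    """Build the classifier table from an ordered list of events.

    Each event is a dict with an "op" key: "add", "rename",
    "prohibit", or "remove". Returns {name: record}.
    """
    table = {}
    for event in log:
        op = event["op"]
        if op == "add":
            table[event["name"]] = {"accepted": True, "alias_of": None}
        elif op == "rename":
            # Keep the old name as an accepted alias of the new one.
            table[event["new"]] = {"accepted": True, "alias_of": None}
            table[event["old"]]["alias_of"] = event["new"]
        elif op == "prohibit":
            # Reject in new uploads, but keep the historical record.
            table[event["name"]]["accepted"] = False
        elif op == "remove":
            # Expunge entirely (expected to be very rare).
            del table[event["name"]]
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return table


# Made-up classifier names, for illustration only.
LOG = [
    {"op": "add", "name": "Framework :: Foo"},
    {"op": "add", "name": "Topic :: Example"},
    {"op": "rename", "old": "Framework :: Foo", "new": "Framework :: Foo Bar"},
    {"op": "prohibit", "name": "Topic :: Example"},
]
table = replay(LOG)
```

Because each event names its operation explicitly, a rename is never confused with a deletion followed by an addition.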

waldyrious · 2018-04-22T07:33:53Z

@ncoghlan thanks for the detailed response. I'll open a new issue, then. Until that process is completed, though, could you clarify what process is currently in place for tackling edits like this proposed change?

nlhkabu added the UX/UI design, user experience, user interface label Jun 30, 2016

nlhkabu added this to the 3) Feature parity with PyPI milestone Jun 30, 2016

takluyver mentioned this issue Dec 11, 2017

More validation of packaging data pypa/flit#154

Merged

brainwane mentioned this issue Jan 25, 2018

Document the license argument to setup.py pypa/packaging.python.org#95

Open

di added this to Milestone 5: Shut Down Legacy PyPI in Warehouse rollout Mar 21, 2018

di mentioned this issue Apr 11, 2018

Add ability to deprecate classifiers #3628

Closed

brainwane removed this from Milestone 5: Shut Down Legacy PyPI in Warehouse rollout Apr 16, 2018

brainwane modified the milestones: 5: Shut Down Legacy PyPI, 6. Post Legacy Shutdown Apr 16, 2018

di mentioned this issue Apr 19, 2018

Deprecated classifiers #3771

Merged

di closed this as completed in #3771 Apr 20, 2018

waldyrious mentioned this issue Apr 22, 2018

Derive list of classifiers from a public, version-controlled source #3786

Closed
