Generate Misc/NEWS from individual files #6

Closed
brettcannon opened this Issue Dec 10, 2016 · 126 comments

@brettcannon
Member

brettcannon commented Dec 10, 2016

As pulled from PEP 512:

Traditionally the Misc/NEWS file [19] has been problematic for changes which spanned Python releases. Oftentimes there will be merge conflicts when committing a change between e.g., 3.5 and 3.6 only in the Misc/NEWS file. It's so common, in fact, that the example instructions in the devguide explicitly mention how to resolve conflicts in the Misc/NEWS file [21] . As part of our tool modernization, working with the Misc/NEWS file will be simplified.

The planned approach is to use an individual file per news entry, containing the text for the entry. In this scenario each feature release would have its own directory for news entries and a separate file would be created in that directory that was either named after the issue it closed or a timestamp value (which prevents collisions). Merges across branches would have no issue as the news entry file would still be uniquely named and in the directory of the latest version that contained the fix. A script would collect all news entry files no matter what directory they reside in and create an appropriate news file (the release directory can be ignored as the mere fact that the file exists is enough to represent that the entry belongs to the release). Classification can either be done by keyword in the news entry file itself or by using subdirectories representing each news entry classification in each release directory (or classification of news entries could be dropped since critical information is captured by the "What's New" documents which are organized).

The benefit of this approach is that it keeps the changes with the code that was actually changed. It also ties the message to being part of the commit which introduced the change. For a commit made through the CLI, a script could be provided to help generate the file. In a bot-driven scenario, the merge bot could have a way to specify a specific news entry and create the file as part of its flattened commit (while most likely also supporting using the first line of the commit message if no specific news entry was specified). If a web-based workflow is used then a status check could be used to verify that a news entry file is in the pull request, acting as a reminder when the file is missing. Code for this approach has been written previously for the Mercurial workflow at http://bugs.python.org/issue18967. There are also tools from the community like https://pypi.python.org/pypi/towncrier, https://github.com/twisted/newsbuilder, and http://docs.openstack.org/developer/reno/.

Discussions at the Sep 2016 Python core-dev sprints led to this decision compared to the rejected approaches outlined in the Rejected Ideas section of this PEP. The separate files approach seems to have the right balance of flexibility and potential tooling out of the various options while solving the motivating problem.
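The status check mentioned in the quoted text could be as small as a scan of the pull request's changed paths. The following is only a sketch under stated assumptions: the `Misc/NEWS.d` location and the `has_news_entry` name are hypothetical, since nothing in this issue fixes a directory layout yet, and the list of changed paths is assumed to come from whatever the CI service provides.

```python
from pathlib import PurePosixPath

# Hypothetical location of the per-entry news files; not decided in this issue.
NEWS_ROOT = PurePosixPath("Misc/NEWS.d")


def has_news_entry(changed_paths, news_root=NEWS_ROOT):
    """Return True if any changed path is a file somewhere below news_root."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # path.parents lists every ancestor directory of the changed file.
        if news_root in path.parents:
            return True
    return False
```

A check built on this would only remind the submitter that an entry is missing; it says nothing about whether the entry is well formed.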

@brettcannon

Member

brettcannon commented Jan 23, 2017

Rejected Ideas

Deriving Misc/NEWS from the commit logs

As part of the discussion surrounding Handling Misc/NEWS, the suggestion has come up of deriving the file from the commit logs themselves. In this scenario, the first line of a commit message would be taken to represent the news entry for the change. Some heuristic to tie in whether a change warranted a news entry would be used, e.g., whether an issue number is listed.

This idea has been rejected due to some core developers preferring to write a news entry separate from the commit message. The argument is that the first line of a commit message and a news entry have different requirements in terms of brevity, what should be said, etc.

Deriving Misc/NEWS from bugs.python.org

A rejected solution to the NEWS file problem was to specify the entry on bugs.python.org [5]. This would mean an issue that is marked as "resolved" could not be closed until a news entry is added in the "news" field in the issue tracker. The benefit of tying the news entry to the issue is it makes sure that all changes worthy of a news entry have an accompanying issue. It also makes classifying a news entry automatic thanks to the Component field of the issue. The Versions field of the issue also ties the news entry to which Python releases were affected. A script would be written to query bugs.python.org for relevant news entries for a release and to produce the output needed to be checked into the code repository. This approach is agnostic to whether a commit was done by CLI or bot. A drawback is that there's a disconnect between the actual commit that made the change and the news entry by having them live in separate places (in this case, GitHub and bugs.python.org). This would mean making a commit would then require remembering to go back to bugs.python.org to add the news entry.

@brettcannon

Member

brettcannon commented Jan 23, 2017

The only other potential solution besides individual files is a bot or script which collects news entries from messages in PR comments. The pro/con list is:

Individual files

Pro

  • Keeps NEWS entry with commit
  • All PR submitters learn about the NEWS file and thus a little bit more about our process

Con

  • Requires the PR submitter to create the NEWS entry which makes contributions slower

Bot

Pros

  • Don't need to lean on PR submitters to create the NEWS entry

Cons

  • Another bot to maintain would be annoying
  • Script would need a way to keep track of what entries have been added to the NEWS file thus far so that there wasn't just a constant growth of the number of PRs that had to be scraped
@dhellmann

Member

dhellmann commented Jan 24, 2017

Managing individual files by hand also has the benefit of letting us communicate separately to reviewers via commit messages and to consumers via the release notes. We've had good luck with that in OpenStack, where the consumer of the software is often interested in very different information from the other contributors who may be reviewing patches.

@dhellmann

Member

dhellmann commented Jan 25, 2017

Hmm, it looks like the sample repo in python/cpython doesn't have any tags indicating when specific versions were released. Reno currently relies on tags for version identification. I could update it to look at something else -- how does one determine the current point release on a given branch? Is it in a file somewhere?

@orsenthil

Member

orsenthil commented Jan 25, 2017

@dhellmann, yeah, that's a problem with the current python/cpython repo. Tags were not replicated and pushed. We will have to repush the repo with tags (perhaps using the tool and/or instructions from https://github.com/orsenthil/cpython-hg-to-git, which has been tried).

@dhellmann

Member

dhellmann commented Jan 25, 2017

@orsenthil oh, good, if the plan is to eventually have the tags in place then that's no problem. I assumed there was some other mechanism in place already.

@dstufft

Member

dstufft commented Feb 11, 2017

This tool should probably make this pretty easy https://pypi.org/project/towncrier/

@vstinner

Member

vstinner commented Feb 22, 2017

I started a thread on python-dev because Misc/NEWS became a blocker point with the new GitHub workflow:
https://mail.python.org/pipermail/python-dev/2017-February/147417.html


FYI I wrote a tool which computes the released Python versions including a change from a list of commits to generate a report on Python vulnerabilities:
https://github.com/haypo/python-security/blob/master/render_doc.py

The core of the feature is "git tag --contains sha1". Then I implemented logic to select which versions should be displayed. Output:
http://python-security.readthedocs.io/vulnerabilities.html

The tool also computes the number of days between the vulnerability disclosure date, commit date and release date. I chose to ignore beta and release candidate versions.

But I guess that there are other existing projects which would fit Misc/NEWS requirements better than my tool! (reno, towncrier, something else?)
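To make the "git tag --contains sha1" idea concrete, here is a minimal sketch. It is not the code from render_doc.py: the function names are made up for illustration, and it assumes release tags of the form `v3.6.1`, with anything carrying a suffix such as `b1` or `rc1` treated as a pre-release and dropped (which also drops alphas).

```python
import re
import subprocess

# Assumption: final releases are tagged like "v3.6.1"; "v3.6.0b1" or
# "v3.6.0rc1" are pre-releases and do not match this pattern.
FINAL_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def tags_containing(sha, repo="."):
    """Run `git tag --contains <sha>` in repo and return the tag names."""
    result = subprocess.run(
        ["git", "tag", "--contains", sha],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


def final_releases(tags):
    """Keep only final-release tags and return them sorted by version."""
    versions = []
    for tag in tags:
        match = FINAL_TAG.match(tag)
        if match:
            versions.append(tuple(int(part) for part in match.groups()))
    return [f"{major}.{minor}.{micro}" for major, minor, micro in sorted(versions)]
```

Calling `final_releases(tags_containing(sha))` inside a clone that has the release tags would list the final releases that include a given commit. Sorting numeric tuples rather than strings keeps 3.10.0 after 3.6.1.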

@westurner

westurner commented Feb 26, 2017

changelog filenames

  • CHANGELOG.rst
  • HISTORY.rst
  • whatsnew.rst
  • Misc/NEWS

Escaping Markup

NOTE: commit logs may contain (executable) markup

Tools

Projects:

@westurner

westurner Feb 26, 2017

Additional requirements/requests for Misc/NEWS'ification

  • SEC: Security
    • indicate that a release-noted change is security-relevant
    • link w/ @haypo render_doc.py
      • reno YAML would be flexible here

@brettcannon

Member

brettcannon commented Feb 26, 2017

Since there seem to be multiple options for pre-existing tools and writing our own, we should start by identifying the requirements of what we want to be supported in the final NEWS file:

Features we want

  • Sectioned by Python release with the release date
  • Release sub-sectioned by topic
  • Issues referenced along with explanation of what changed
  • Simple bullet list
  • Single file

Nice-to-haves would be:

  • Reference changed/affected module(s) for per-module grouping (and a "general" for multi-module changes)

Am I missing anything we need/want the solution to cover?

What if we were writing a tool from scratch?

Now what would a greenfield solution look like (to help set a baseline of what we might want a tool to do)? To me, we would have a directory to contain all news-related details. In that top-level directory would be subdirectories for each subsection of the release (e.g. "Core and Built-ins", "Library", etc.). Each NEWS entry file would then go into the appropriate subdirectory. The filename would be the issue related to the change, e.g. bpo-12345.rst (in all honesty we should have an issue per change in the NEWS file as a requirement since if it's important enough to be listed in the NEWS files then we need a way to track any past and future discussions related to the change, and yes, we can come up with a standard for listing multiple issue numbers). If we wanted to support listing affected module(s) we could have some convention in the files to list that sort of detail (e.g. "modules: importlib").

Then upon release the RM would run a script which would read all the files, generate the appropriate text (which includes line-wrapping, etc.), and write out the file. Now we can either keep a single file with all entries (which gets expensive to load and view online, which also means we should add an appropriate .rst file extension), or a file per feature release (which makes searching the complete history a little harder as you then have to use something like grep to search multiple files quickly). I'm assuming we won't feel the need to drag NEWS entries forward -- e.g. entries made for 3.6.1 won't be pulled into the master branch for 3.7 -- since most changes in bugfix branches will also be fixed in the master branch and thus be redundant. But if people really care then a single PR to copy the updated NEWS file(s) to master upon release could be done.
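A rough sketch of what such a release-time script could look like, assuming the layout described above: one subdirectory per section, each holding files named after the issue (e.g. bpo-12345.rst). The `build_news` name, the 78-column width and the exact output format are illustrative assumptions, not anything agreed on here.

```python
import textwrap
from pathlib import Path


def build_news(news_dir):
    """Render every <section>/<issue>.rst file as one bullet list per section."""
    lines = []
    for section in sorted(p for p in Path(news_dir).iterdir() if p.is_dir()):
        entries = sorted(section.glob("*.rst"))
        if not entries:
            continue  # skip sections with no entries for this release
        lines += [section.name, "-" * len(section.name), ""]
        for entry in entries:
            # Collapse whitespace first so re-wrapping is deterministic.
            text = " ".join(entry.read_text(encoding="utf-8").split())
            bullet = f"- {entry.stem}: {text}"
            lines.append(textwrap.fill(bullet, width=78, subsequent_indent="  "))
            lines.append("")
    return "\n".join(lines)
```

The section names come straight from the directory names, so adding a section only means adding a directory. Entries are ordered by file name, which is a plain string sort and not a numeric sort of issue numbers.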

Next steps

First step is to figure out what our requirements are of any solution. So if I'm missing anything please reply here (I think I covered what @warsaw mentioned on python-dev above). Once we have the requirements known we can look at the available tools to see how close they come to meeting our needs and decide if we like any of them or want to write our own.

@methane referenced this issue in python/cpython Feb 27, 2017

Merged

Reduce conflict on Misc/NEWS #212

@warsaw

Member

warsaw commented Feb 27, 2017

Thanks @brettcannon

I was thinking something similar, but slightly different to your implementation outline. I like naming files with their bug number, e.g. bpo-12345.rst but I was thinking that the top-level organizational directory would contain subdirectories per release, e.g. 3.7, 3.6, 3.5, etc. Symlinks or hardlinks would take care of issues that span multiple versions. If that doesn't work because of platform limitations, then a simple cross reference inside similarly named files in different directories would work just as well.

I was thinking maybe we don't need subcategory subdirectories within there. Everything else can be specified using metadata in the .rst file. Kind of like the way Pelican supports metadata directives for the date, category, tags, and slugs. E.g.

News/3.7/bpo-28598.rst

:section: core
:alsofixes: bpo-12345
:versionadded: 3.7.0a1
:backports: 3.6.1
:authors: Martijn Pieters

Support __rmod__ for subclasses of str being called before str.__mod__.

Generally, you wouldn't have an :alsofixes: directive, and you wouldn't have a :backports: directive for new features. Oh, and +1 for strongly recommending (and maybe requiring) anything in NEWS to have a bug. We could even add a CI gate so that any merge must have an entry in News/X.Y/

I don't know whether those directives are a reST feature or something added by Pelican. We could also use the RFC-822 style headers found in PEPs.

@warsaw

Member

warsaw commented Feb 27, 2017

More about using major release numbers as the organizational structure:

  • It would make grepping for a change's scope easier (I could grep the entire News/ subdirectory, or just News/3.6/)
  • ls News/3.7 would give a really useful high level view of changes in a particular version.
  • The tool to generate NEWS.rst could be version limited by passing the sub-subdirectory, or generate news for all tracked releases by passing in News/
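The last point, limiting the tool by the directory passed in, falls out almost for free from a recursive glob. A small sketch, assuming the News/X.Y/bpo-NNNNN.rst layout from the earlier comment; `entries_by_release` is a made-up name.

```python
from pathlib import Path


def entries_by_release(root):
    """Map each release directory name (e.g. "3.7") to its entry file names.

    root may be the top-level News/ directory or a single release
    directory such as News/3.7.
    """
    found = {}
    for path in sorted(Path(root).rglob("bpo-*.rst")):
        # The immediate parent directory is taken to be the release.
        found.setdefault(path.parent.name, []).append(path.name)
    return found
```

Passing News/ gives every tracked release, while passing News/3.7 gives just that one, with no extra option needed on the tool.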
@dstufft

Member

dstufft commented Feb 27, 2017

Do we want to keep the news fragments around forever? I assumed that once they've been integrated into an actual news file they would be deleted.

(+1 is supporting deletion, -1 is to keep files around forever.)

@warsaw

Member

warsaw commented Feb 27, 2017

I was thinking we'd keep them around forever. They probably don't take much space, and they compress well, so it would be very handy to be able to do the historical greps from the master branch.

@brettcannon

Member

brettcannon commented Feb 27, 2017

The problem I see with keeping them around is simply the number of files the repo will grow by, which might affect checkout speed and maybe upset some file systems. Do we have any idea how many files 3.4 would have if we kept the files around for the life of the feature release? And if we generate a single file per feature release then you can grep that one file plus entry files to get your version search without keeping duplicate data around.

As for the feature version subdirectories, the reason I didn't suggest that is it seemed unnecessary in a cherry picking workflow. If a file is in the e.g. 3.6 branch then you know it's for 3.6 thanks to its mere existence. And so even if we did use feature branch directories it won't matter if a fix is in the 3.7 directory in the 3.6 branch as its existence means it applies to Python 3.6 itself, it just happens to have been committed to 3.7 first. But as I said, if the existence of the file means the change applies then sticking it in a feature branch directory only comes up if you keep the individual files around.

Lastly, adding cross-references between feature branch subdirectories complicates cherry picks as it adds a mandatory extra step. By leaving out cross-references it allows cherry picks that merge cleanly to not require any more work beyond opening the PR while cross-referencing adds a required step for every cherry pick PR.

@terryjreedy

Member

terryjreedy commented Feb 27, 2017

+1 to Barry's variation. I like self-identifying files. For one thing, if a contributor submits a file with the wrong section, I presume it would be easier to edit the section line than to move the file.

What I would really like is auto-generation of backport pull-requests. If this should not always be done, then use a symbol like '@' to trigger the auto-generation.

On dstufft's comment, I am not sure if thumbs-up is for 'keep forever' or 'delete when integrated'. 1000s of files under 500 bytes on file systems using 4000 bytes/file (Windows, last I knew) is a bit wasteful. I am for deleting individual files after they are sorted and concatenated into a 'raw' (unformatted) listing. This would accomplish compression without loss of information. (One would still grep the metadata.) It would allow more than one formatting.

News entries can have non-ascii chars (for names) but must be utf-8 encoded. I suspect that we will occasionally get mis-encoded submissions. Will there be an auto encoding check somewhere?

@brettcannon

Member

brettcannon commented Feb 27, 2017

Moving a file isn't difficult; git will just delete the old one and add it again under the new name (and implicitly pick up that the file was moved).

As for auto-generating cherry picks, see GH-8.

For voting on Donald's comment, I view voting +1 as being for deletion. I have edited the comment to make it clearer (I also just looked at how @warsaw voted and voted the opposite 😉).

And for the encoding check, I'm sure we will implement a status check that will do things like verify formatting, check the text encoding is UTF-8, etc.
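The encoding part of such a status check is only a few lines. A minimal sketch, with `utf8_error` as a hypothetical helper name; it works on the raw bytes of an entry file so that nothing is decoded with a platform default first.

```python
def utf8_error(data):
    """Return None if data (bytes) is valid UTF-8, else a short description."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte.
        return f"invalid UTF-8 at byte {exc.start}: {exc.reason}"
    return None
```

A check could call this on `Path(entry).read_bytes()` for every news entry file in a PR and report the first failure back to the submitter.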

@dhellmann

This comment has been minimized.

Show comment
Hide comment
@dhellmann

dhellmann Feb 27, 2017

Member

@brettcannon At least in the case of reno, the directory structure isn't needed for organizing the files because it pulls the content out of the git history, not from the currently checked out workspace.

If we're worried about the number of note files, then a bunch of individual notes could be collapsed into one big file just before each release. Backports of fixes could still use separate files to allow for clean cherry-picks.

westurner Feb 27, 2017

In order to JOIN this metadata {description, issue numbers} with other metadata, something like YAML (YAML-LD?) with some type of URI would be most helpful.

JSON-LD:

{"@id": "http://.../changelog#entry/uri/<checksum?>",
 "description": "...",
 "issues": [
   {"n": n,
     "description":""}]
}
  • reno adds a unique id to the filenames

brettcannon Feb 27, 2017

Member

One other thing I forgot to mention about why I suggested subdirectories for classification is it does away with having to look up what potential classification options there are. I know when I happen to need to add an entry early on in a branch I always have to scroll back to see what sections we have used previously to figure out which one fits best. If we had directories we never deleted then that guesswork is gone.

@dhellmann so are you saying reno looks at what files changed in the relevant commit to infer what to classify the change as (e.g. if only stuff in Lib/ changed then it would automatically be classified as "Library")?

larryhastings Feb 27, 2017

Contributor

> The problem I see for keeping them around is simply the amount of files the repo will grow by which might affect checkout speed and maybe upset some file systems.

Easily solved. Consider that git itself is storing kajillions of files in its object store. It manages this by employing a fascinating feature--available in all modern operating systems!--called a "subdirectory".

In the case of git's object store, it snips off the first two characters of the object's hexified hash, and that becomes the subdirectory name. So there are potentially 256 subdirectories, and 1/256 on average of the files go in each subdir.

In our case, I'd suggest that

  • the tool globs all files in the entire directory tree automatically, sorting them internally, and
  • the tool creates the text file per checkin in a "yyyy.mm" subdirectory.
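A minimal sketch of those two suggestions, assuming entries are `.txt` files under a `Misc/NEWS.d` root. Both the root location and the helper names are hypothetical.

```python
import datetime
from pathlib import Path

NEWS_ROOT = Path("Misc/NEWS.d")  # assumed location, for illustration only


def entry_path(name, when, root=NEWS_ROOT):
    """Place a new entry file under a 'yyyy.mm' subdirectory of root."""
    return Path(root) / f"{when.year:04d}.{when.month:02d}" / name


def all_entries(root=NEWS_ROOT):
    """Glob every entry in the whole tree, sorted by path."""
    return sorted(p for p in Path(root).rglob("*.txt") if p.is_file())


if __name__ == "__main__":
    # An entry created on Feb 27, 2017 lands in the "2017.02" subdirectory.
    print(entry_path("some-entry.txt", datetime.date(2017, 2, 27)))
```

Because the tool globs the whole tree, the subdirectory names only bound how many files share a directory; they carry no meaning the tool depends on.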

dstufft Feb 27, 2017

Member

I'll say that I don't see a whole lot of value to keeping the files around once they've been "compiled" into the relevant NEWS file.

larryhastings Feb 27, 2017

Contributor

> I'll say that I don't see a whole lot of value to keeping the files around once they've been "compiled" into the relevant NEWS file.

In that case, why have the discrete files in the first place! Just have people add their entries directly to the NEWS file.

... y'see, that's the problem we're trying to solve. When I cut a Python release, I always have a big painful merge I have to do by hand on the NEWS file. If we keep the discrete files, I can just regenerate NEWS and know it's going to be correct.

dstufft Feb 27, 2017

Member

> In that case, why have the discrete files in the first place! Just have people add their entries directly to the NEWS file.

Because many people adding and changing lines to the NEWS files causes merge conflicts. One person periodically "compiling" that NEWS file during a release does not.

larryhastings Feb 27, 2017

Contributor

I don't want to go down that road.

  • TOOWTDI. There should be one canonical place for NEWS entries. You're proposing there be two.
  • If we allow the canonical place to be either, then somebody's going to say "I don't want to deal with this new file format! Why do you change things when they were perfectly fine before!" or "It doesn't integrate well with my workflow!" and then they'll be the special snowflake who edits NEWS on their own, and causes conflicts.

I propose that we design the tool to use the discrete files, and Misc/NEWS is only ever generated from that tool. If, down the road, we decide that it's really okay, we can anoint Misc/NEWS as a second canonical location and delete the discrete files.

What problem does keeping the discrete files cause? Is it just that you don't like it for some aesthetic reason?

dstufft Feb 27, 2017

Member

@larryhastings Err, no, the canonical place is not either. Entries for the next release of Python get a discrete file, period. The tool consumes the previous NEWS file, reads the new news fragments, and outputs a new NEWS file combining the two by prepending the fragments onto the previous NEWS file. If we can't trust core committers not to touch the previously generated NEWS files, we can make a status check for it.
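A minimal sketch of that compile step, assuming fragments are `.txt` files in a single directory and that each becomes one `- ` bullet under a release heading. The function name, the heading layout, and the file extension are all assumptions made for illustration.

```python
from pathlib import Path


def compile_news(news_file, fragments_dir, heading):
    """Prepend all fragments to news_file, then delete the fragments.

    Returns the number of fragments that were folded in.
    """
    news_file = Path(news_file)
    fragments = sorted(Path(fragments_dir).glob("*.txt"))  # assumed extension
    if not fragments:
        return 0  # nothing new; leave the existing NEWS file untouched

    # Assumed output format: a heading, then one "- " bullet per fragment.
    lines = [heading, "=" * len(heading), ""]
    for fragment in fragments:
        lines.append("- " + fragment.read_text(encoding="utf-8").strip())
        lines.append("")
    new_section = "\n".join(lines) + "\n"

    previous = news_file.read_text(encoding="utf-8") if news_file.exists() else ""
    news_file.write_text(new_section + previous, encoding="utf-8")

    # Only delete the fragments once the combined file has been written.
    for fragment in fragments:
        fragment.unlink()
    return len(fragments)
```

Running it a second time with no new fragments changes nothing, so an accidental extra run at release time would be harmless in this sketch.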

Having discrete files that stick around forever makes working with the canonical files harder to do. If you don't namespace them by version, you end up with a huge list of files (inside or outside of subdirectories, it doesn't matter), which makes actually mucking with them harder: for example, when I'm trying to look through what's changed thus far in 3.6.1rc1 even though it hasn't been released.

In addition to that, tracking what version something got released with becomes much harder. In the "always keep a discrete file" method you have to either add it in as part of the contents of the file itself or you need to namespace it by a directory per file. In either case it makes backporting changes much more error prone and adds busy work because beyond just cherry-picking now I need to ensure that I properly update or move the file for each branch I'm backporting it to. By keeping that information out of band, it means that singular file can easily move between branches transparently and will automatically DTRT during backports.

The above problem gets a lot worse when someone creates a PR that is still going through the review process, unmerged, when a release is cut. Now suddenly the release notes in every open PR against that branch are invalidated and wrong, and someone needs to go through every single PR to, say, master and fix it so that the existing PRs all have the new, correct version.

I don't see any problems caused by "compiling" the Misc/NEWS file each release and deleting the old files, except that maybe a core committer will outright refuse to follow the new defined procedures of the project and will go trying to manually create Misc/NEWS entries... in which case, seems like they shouldn't be a committer at all.

ned-deily Feb 27, 2017

Member

I agree with @dstufft here, IIUC. If we had the "combining" tool, we could add a step to the release process so that the release manager would be responsible for doing a final combine and delete as part of tagging a release. I don't see any value in keeping the individual files around and I do see the necessity and value of continuing to have a single combined Misc/NEWS file as part of a release.

warsaw Feb 28, 2017

Member

larryhastings Feb 28, 2017

Contributor

> I don't see any problems caused by "compiling" the Misc/NEWS file each release and deleting the old files, except [problem it causes]

I see more problems, see below.

I wouldn't claim that any workflow here doesn't cause problems. I think anything we do will have some edge case. But I'm dead certain that only having one canonical storage location for Misc/NEWS entries--rather than two, and which one is being used is context-sensitive--is simpler and will lead to fewer problems.

> If we can't trust core committers not to touch the previously generated NEWS files, we can make a status check for it. [...] except that maybe a core committer will outright refuse to follow the new defined procedures of the project and will go trying to manually create Misc/NEWS entries... in which case, seems like they shouldn't be a committer at all.

I guarantee that somebody you don't want to argue with (e.g. Guido) will be the special snowflake editing NEWS by hand. I think it's better to obviate the argument entirely.

> Having discrete files that stick around forever makes working with the canonical files harder to do.

What do you mean by "canonical files" here?

If you mean Misc/NEWS itself, then may I remind you that in my proposal the "discrete files" are the canonical source for that information, and Misc/NEWS is merely an output file designed for convenient human consumption. Nobody will be "working with" Misc/NEWS per se; we release managers would be the only people changing it, and in the case of merge conflicts we'd feel totally emboldened to blow it away, recreate it from scratch, and tell Git "ignore merge conflicts, check in what I've got". (In fact I propose that that be the exact workflow every time.) So this simply isn't an issue.

We could even go so far as to not check in Misc/NEWS. Building it would be part of the release workflow, and if anybody wanted to read it they could rebuild it themselves. This doesn't seem like much of a burden, for a file people rarely examine during development.

> tracking what version something got released with becomes much harder. In the "always keep a discrete file" method you have to either add it in as part of the contents of the file itself or you need to namespace it by a directory per file.

Can we ask Git itself? Would "git log" on the discrete file tell us the branch it was checked in under? (I just tried, and it seemed like the answer was "no", but I'm far from a Git expert so I wouldn't take my findings as authoritative.)
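Git does not record which branch a commit was created on, which is consistent with the "no" above. What it can report is which tags contain a given commit. A minimal sketch of asking that question, assuming release tags exist in the repository; the helper names are hypothetical, while the `git log` and `git tag` options used are standard ones.

```python
import subprocess


def added_in_commit_cmd(path):
    """argv for listing the commit(s) that added `path` (newest first)."""
    return ["git", "log", "--diff-filter=A", "--format=%H", "--", path]


def tags_containing_cmd(commit):
    """argv for listing every tag whose history contains `commit`."""
    return ["git", "tag", "--contains", commit]


def _run(argv, cwd):
    # Run git and return its output split into whitespace-separated tokens.
    result = subprocess.run(argv, cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout.split()


def tags_containing_file(path, cwd="."):
    """Return the tags that contain the commit which first added `path`."""
    commits = _run(added_in_commit_cmd(path), cwd)
    if not commits:
        return []  # the file is unknown to this history
    # git log prints newest first, so the last hash is the original addition.
    return _run(tags_containing_cmd(commits[-1]), cwd)
```

Note that a cherry-picked backport is a different commit with a different hash, so the answer is per branch history rather than per logical change.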

In any case, I don't understand your position here. You state this as if this is a critique of keeping the discrete files forever. But if this is really a problem, then we'll have this problem if we use the discrete files at all. Anybody cherry-picking / forward-porting / back-porting will have to deal with this.

Worse yet, it means that there are two different workflows, depending on whether the cherry-pick is done before or after a release is cut. "Oh, you realized you need to backport ? Well, I just released 3.8b1, so your discrete NEWS entry file got deleted, you'll have to ". I suspect this second workflow would include re-creating the discrete file in the backported branch, either by going back to before it was deleted and rescuing it, or by creating a new file and copying and pasting in from the generated NEWS file. I don't see that as a workflow improvement over "when you backport, make sure you move the Misc/NEWS entry too" in all cases. Again, TOOWTDI.

Right now we have a miserable workflow wrt Misc/NEWS entries, particularly when back-porting / forward-porting / cherry-picking. I think the discrete files approach will make this less-bad. Having the canonical location be in one of two places muddies the waters, making the workflow more complicated and context-sensitive.

On the other hand, keeping the discrete files forever should make it easier to correlate a Misc/NEWS entry back to its checkins. Right now you have to run "blame" on Misc/NEWS and glean the revision from that. And if it's been through a couple of forward-merges and edits you may need to run several versioned "blames" to figure it out. With discrete files, you find the particular file with a (recursive?) grep, then examine the file's revision log. And again, if sometimes it's stored in a discrete file, and sometimes it now lives in Misc/NEWS, that makes the workflow more complicated.

FWIW, in my "MergeNEWS" prototype, the generated filename was

Misc/NEWS.d/<version>/<section>.<timestamp>.<md5hash>.txt

so clearly I thought it best to store the version as part of the path. This worked well (in theory) with forward-merging; if you fixed a bug in 3.5, then forward-merged to 3.6, the Misc/NEWS.d entry would show up in the 3.6 Misc/NEWS file, albeit under 3.5 (which is kind of accurate anyway). This is less pleasant when back-porting, which I gather is the preferred approach with the new Git-based workflow, but this is a design choice we can easily revisit.
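A minimal sketch of regenerating a NEWS text from files named with that pattern. This is not the MergeNEWS prototype itself: the helper names and the output layout are hypothetical, and it assumes timestamps sort correctly as strings and that section names contain no dots.

```python
from collections import defaultdict
from pathlib import Path


def parse_entry_path(path):
    """Split <version>/<section>.<timestamp>.<md5hash>.txt into its parts."""
    path = Path(path)
    section, timestamp, digest, _ext = path.name.rsplit(".", 3)
    return path.parent.name, section, timestamp, digest


def build_news(root):
    """Regenerate the whole NEWS text from scratch from the entry files."""
    grouped = defaultdict(lambda: defaultdict(list))
    for path in Path(root).glob("*/*.txt"):
        version, section, timestamp, _ = parse_entry_path(path)
        text = path.read_text(encoding="utf-8").strip()
        grouped[version][section].append((timestamp, text))

    lines = []
    # Simplification: plain string ordering, so "3.10" would sort before "3.9".
    for version in sorted(grouped, reverse=True):
        lines += [f"Python {version}", ""]
        for section in sorted(grouped[version]):
            lines += [section, "-" * len(section), ""]
            for _, text in sorted(grouped[version][section]):
                lines += [f"- {text}", ""]
    return "\n".join(lines)
```

Since the output depends only on the entry files, it can be rebuilt at any time and the generated file never needs a hand merge.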

Ned sez:

> If we had the "combining" tool, we could add a step to the release process so that the release manager would be responsible for doing a final combine and delete as part of tagging a release.

My prototype generates Misc/NEWS from scratch every time. FWIW, making it parse Misc/NEWS so it can find the proper insertion points for all the new items would make it more complicated. Not an unmitigated disaster of complexity, but I prefer simple where I can get it.

> I do see the necessity and value of continuing to have a single combined Misc/NEWS file as part of a release.

Nobody is suggesting otherwise. Running the build process to produce Misc/NEWS would be part of the release process. Whether the NEWS entries are stored in the discrete files or partially in discrete files and partially in the NEWS file doesn't affect that.

dstufft Feb 28, 2017

Member

> If you delete after compilation, then when does this happen? At release time? Occasionally during the release process?

At release time.

> Either way, you're going to have a massive commit that deletes a ton of those little files and adds/updates NEWS. What if someone does that accidentally, or there's a bug in the script? Once the source files are gone, you will have to play games with git to get back to a clean pre-compilation state. That's why reproducible and idempotent are useful.

Er, you mean if someone accidentally does it, commits it, pushes it to a branch, makes a PR for it, waits for the status checks to pass, and then merges it? Since we can't push directly to branches this seems extremely far fetched. But for the sake of argument let's say someone manages to do all of that accidentally. The solution is pretty easy: git revert, which I really don't consider "playing games with git".

> I'm not at all concerned about tons of little files as @larryhastings says. I can't imagine it will make any dent in cloning speeds. Of course, you don't know until you measure it, but I imagine there's a ton of good compression all up and down the stack. Git is supposed to be good at managing text files, so let it!

I'm not at all concerned about cloning speed. What I am concerned with is the UX of working with a bunch of tiny files. This is manageable (IMO) if we reduce the scope of the tiny files to just the latest . For example, if I notice a typo in the release notes, locating the tiny file is a lot more difficult than being able to edit that file in place in the browser. If we go farther and never commit the resulting output then we even lose the ability to easily search the news files using browser native searching.

brettcannon Feb 28, 2017

Member

To answer the "how many files are we talking about" question, I ran `grep -c "^- " Misc/NEWS` against the master branch and got 2,800 (and that covers all the way back to 3.5.0a1). And if you do `find . -type f | wc -l` you get 3,837; subtract out `find .git -type f | wc -l` and we end up with 3,776 files in a checkout of CPython.
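For anyone without `grep` and `find` at hand, a rough Python equivalent of the two measurements is sketched below. The function names are hypothetical, and the file count is only approximate: it skips every directory named `.git` at any depth and counts whatever `os.walk` reports as a non-directory, which can differ slightly from `find -type f`.

```python
import os


def count_news_entries(news_text):
    """Count lines starting with '- ', like grep -c '^- '."""
    return sum(1 for line in news_text.splitlines() if line.startswith("- "))


def count_files(root, skip_dirs=(".git",)):
    """Count files under root, pruning directories named in skip_dirs."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        total += len(filenames)
    return total
```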

larryhastings Feb 28, 2017

Contributor

One further idea: if we became concerned about zillions of historical files that nobody cares about (e.g. Misc/NEWS entries from 2.6), we could conglomerate those entries into single data files that actually contained multiple entries, and make sure that worked nicely with the tool. Thus the 2.6.0 directory could contain "Misc/NEWS.d/2.6.0/Core & Libraries.0.txt" which contained all items for that section in 2.6.0. Furthermore, we could hide historical versions in a subdirectory ("Misc/NEWS.d/archive/2.6.0/Core & Libraries.0.txt") so people wouldn't have to stare at them forever. We could then definitely automate that process (conglomerate and archive), which could be run at any time--as part of the release process, or later, or whenever.
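A minimal sketch of the conglomerate-and-archive step, assuming entry file names start with the section name followed by a dot and that entries inside a merged file are separated by a blank line. The function name and the separator are assumptions; the `<section>.0.txt` target name follows the example paths above.

```python
from pathlib import Path


def conglomerate(version_dir, archive_root):
    """Merge a version's entry files into one file per section.

    The merged files are written to archive_root/<version>/<section>.0.txt
    and the original per-entry files are deleted afterwards.
    """
    version_dir = Path(version_dir)
    target_dir = Path(archive_root) / version_dir.name
    target_dir.mkdir(parents=True, exist_ok=True)

    # Group entry files by section, i.e. the text before the first dot.
    by_section = {}
    for path in sorted(version_dir.glob("*.txt")):
        section = path.name.split(".", 1)[0]
        by_section.setdefault(section, []).append(path)

    for section, paths in by_section.items():
        # Assumed separator: one blank line between entries.
        merged = "\n".join(
            p.read_text(encoding="utf-8").strip() + "\n" for p in paths
        )
        (target_dir / f"{section}.0.txt").write_text(merged, encoding="utf-8")
        for p in paths:
            p.unlink()
    return sorted(by_section)
```

The generating tool would then need to split a merged file back into entries on the same separator, which is the "make sure that worked nicely with the tool" part.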

@larryhastings

larryhastings Feb 28, 2017

Contributor

To answer the "how many files are we talking about" question, I ran (master) > grep -c "^- " Misc/NEWS against the master branch and got 2,800

I just experimentally ran my "splitnews" against a "current" hg trunk. (Yeah, sorry.) It produced 2809 discrete files and 33 directories under Misc/NEWS.d.

@dstufft

dstufft Feb 28, 2017

Member

I guarantee that somebody you don't want to argue with (e.g. Guido) will be the special snowflake editing NEWS by hand. I think it's better to obviate the argument entirely.

I've never been afraid of arguing with Guido :), but given he's just about the only person who can decide by fiat that they're going to edit Misc/NEWS and that's that and everyone else just has to deal with it, I'd suggest instead of nebulous "but Guido might care!" we should just ask @gvanrossum.

What do you mean by "canonical files" here?

I do not mean Misc/NEWS, I mean all the little tiny files. Searching across file boundaries in the browser is less than great. Yes, grep can do either instance with roughly the same power, but grep isn't the lowest common denominator.

Can we ask Git itself? Would "git log" on the discrete file tell us the branch it was checked in under? (I just tried, and it seemed like the answer was "no", but I'm far from a Git expert so I wouldn't take my findings as authoritative.)

No. git has no idea what branch something was committed under. The only thing git can tell you about a commit is what branches/tags it currently exists in.

In any case, I don't understand your position here. You state this as if this is a critique of keeping the discrete files forever. But if this is really a problem, then we'll have this problem if we use the discrete files at all. Anybody cherry-picking / forward-porting / back-porting will have to deal with this.

You need some mechanism to indicate what version a particular "news snippet" is for.

If you keep each tiny file for all versions forever, then you need to structure either the file or the filename so as to keep that data. If that data is part of the filename or the file then it gets invalidated anytime a PR lives longer than the release it was originally created for.

If you have a singular file that has all of the "historical" news (i.e. not the release we're currently working on) and a directory that holds all of the "current" news snippets (i.e. the ones that are new in the release that will be released next) then you no longer have to track versions because it's either historical (in which case you don't care in the tooling) or it's for the next release (in which case it's in the directory as a snippet).

Worse yet, it means that there are two different workflows, depending on whether the cherry-pick is done before or after a release is cut. "Oh, you realized you need to backport ? Well, I just released 3.8b1, so your discrete NEWS entry file got deleted, you'll have to ". I suspect this second workflow would include re-creating the discrete file in the backported branch, either by going back to before it was deleted and rescuing it, or by creating a new file and copying and pasting in from the generated NEWS file. I don't see that as a workflow improvement over "when you backport, make sure you move the Misc/NEWS entry too" in all cases. Again, TOOWTDI.

No, that's not how git works. If I do this:

$ touch Misc/News/my-cool-change.rst
$ git add Misc/News/my-cool-change.rst
$ git commit -m 'add my cool change'  # pretend this gave us the commit, 1a05a1a
$ compile-news  # This implicitly deletes Misc/News/* and compiles it to NEWS.rst
$ git add News.rst
$ git commit -m 'Release 3.7'
$ git checkout 3.6
$ git cherry-pick 1a05a1a

Then absolutely nothing you did after the 3rd command is going to have any effect on the cherry-picked commit. I can't imagine a VCS working the way you seem to be describing. It's going to pull the entire commit into that branch.

Right now we have a miserable workflow wrt Misc/NEWS entries, particularly when back-porting / forward-porting / cherry-picking. I think the discrete files approach will make this less-bad. Having the canonical location be in one of two places muddies the waters, making the workflow more complicated and context-sensitive.

You basically never need to touch the "compiled" News.rst in this workflow. You can completely ignore it as a contributor and things will work just fine. The only singular time that you would need to directly edit it outside of running the release script is if you're editing a typo or something in a "historical" (i.e. already happened) news entry. In which case you are extremely unlikely to get a merge conflict unless two people edited the exact same line (in which case, no matter what you're getting a merge conflict).

My prototype generates Misc/NEWS from scratch every time. FWIW, making it parse Misc/NEWS so it can find the proper insertion points for all the new items would make it more complicated. Not an unmitigated disaster of complexity, but I prefer simple where I can get it.

I think this is possibly where we're not looking at the same thing with regards to the single file proposal. In this proposal you would not reparse the existing file at all. It's just a big chunk of arbitrary text that you prepend the new data onto. The only "parsing" you need to do is instead of doing a "dumb" prepend ala cat file1 file2, you have to be slightly smart enough to prepend after the

+++++++++++
Python News
+++++++++++

block. Everything else is immaterial to the tool and simply doesn't matter, it is opaque to it. This makes it even easier to convert to this process because we don't have to try and force all of the previous release NEWS into the correct shape. It's just more opaque text.
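A minimal Python sketch of that "slightly smart prepend", assuming the compiled file begins with exactly the header block shown above. The function name and the sample strings are hypothetical; everything after the header is treated as opaque text and never parsed.

```python
HEADER = "+++++++++++\nPython News\n+++++++++++\n"


def prepend_release_notes(existing: str, new_section: str) -> str:
    """Insert new_section directly after the header, leaving the rest untouched."""
    if not existing.startswith(HEADER):
        raise ValueError("NEWS file does not start with the expected header")
    # Everything after the header is opaque historical text.
    rest = existing[len(HEADER):].lstrip("\n")
    return f"{HEADER}\n{new_section.rstrip()}\n\n{rest}"


old_news = HEADER + "\nNotes for an older release\n"
updated = prepend_release_notes(old_news, "Notes for the newest release\n")
print(updated)
```

The only structural assumption is the header itself, so historical notes in any shape survive unchanged.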

@ericvsmith

ericvsmith Feb 28, 2017

Member

Isn't the tool to conglomerate many entries into a single data file called "tar", or maybe "zip"?

@larryhastings

larryhastings Mar 17, 2017

Contributor

@1st1: It wouldn't be hard to make blurb edit automatically fill in the issue number, but would that be a really big feature?

Right now there's no convenient way to re-edit the existing one. I didn't put the issue number in the filename because it didn't seem helpful--after all, it's not a one-to-one mapping, as there are occasionally multiple entries in NEWS for the same issue number, and also some NEWS entries mention multiple issues. On the other hand, if the main use case for this is simply "update the Misc/NEWS entry", something simple like remembering the last 20 Misc/NEWS entries you edited and opening the editor for you on the one of your choice might be sufficient.

As for cherry-picking: as designed, you cherry-pick the NEWS entry when you cherry-pick the change, and it automatically goes in the right place. The person cherry-picking literally doesn't have to do anything. If you read the documentation for blurb, see the section about The "next" directory.

@brettcannon: Sure, that wouldn't be hard. Right now the only metadata I have is the section name. I could add more; for "What's New" I could add a notable flag. I'd store that simply as an extra bit of text in the filename, probably just after the section.

As implemented this wouldn't work well with blurb tidy, but on the other hand my idea was that blurb tidy wouldn't be run until a version was pretty moribund.

Alternately, I could make the data storage format for blurb more sophisticated. Right now the contents of each file is simply the text that goes into Misc/NEWS. I could add a metadata blob to the top. The "tidy" files would probably become even more sophisticated, like a simple archive format.
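As a rough illustration of the "metadata blob at the top" option, here is a Python sketch that reads leading `.. key: value` lines from an entry and returns them along with the entry text. The ReST-comment-style layout, the key names, and the function name are all assumptions made for this sketch; nothing here is an agreed blurb format.

```python
from __future__ import annotations


def parse_entry(text: str) -> tuple[dict[str, str], str]:
    """Split an entry into (metadata, body).

    Metadata lines are assumed to look like ".. key: value" and to sit
    at the very top of the file; the first other line starts the body.
    """
    metadata: dict[str, str] = {}
    lines = text.splitlines()
    index = 0
    while index < len(lines) and lines[index].startswith(".. "):
        key, sep, value = lines[index][3:].partition(":")
        if not sep:  # not a "key: value" line, so treat it as body text
            break
        metadata[key.strip()] = value.strip()
        index += 1
    body = "\n".join(lines[index:]).strip()
    return metadata, body


entry = ".. section: Library\n.. whatsnew: Improved Modules\n\nFix a thing.\n"
print(parse_entry(entry))
```

An entry with no metadata lines parses to an empty dict plus its full text, so files holding only the NEWS text would keep working under this layout.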

@methane

methane Mar 18, 2017

Member

As my current understanding:

| | towncrier | reno | blurb |
|---|---|---|---|
| individual file format | simple rest | yaml | simple rest |
| individual file path | Misc/News/<section>/bpo-NNNN.feature | releasenotes/notes/bpo-NNNN-xxxxxxxx.yaml | Misc/NEWS.d/next/<section>.XXXXX.rst |
| creating an entry | create file | reno new | blurb |
| released version | Merged into Misc/NEWS. Individual files are removed. | ??? | <next> is renamed to the version number (e.g. 3.5.0a1). |

  • NNNN is the issue number, xxxxxxxx is some hex string, and XXXXX is a number unrelated to the issue number.
  • <section> is c-api in towncrier, C API in blurb.
  • How many files are in the worktree if individual entry files are kept? In the blurb example, find Misc/NEWS.d | wc -l is 1432.
  • towncrier doesn't require normal contributors to install it. It's only used for updating NEWS.

@brettcannon

brettcannon Mar 18, 2017

Member

@larryhastings I suspect a notable flag as you suggest would definitely cover the basics. My suspicion, though, is taking the section descriptions from What's New to classify the impact would be more helpful (basically what towncrier's file extension support would get us). I guess that would mean the file's metadata would have the section of where a change landed and the reason a change is notable to even be a NEWS entry (and key reasons get called out in What's New).

@larryhastings

larryhastings Mar 18, 2017

Contributor

@methane: That's close but not 100% accurate for blurb. The "individual file format" for blurb is ReST but doesn't start with "-". The "released version" for all three tools is Misc/NEWS. blurb's aggregation format (the "tidy" files) was merely an experiment, and it may be short-lived.

@brettcannon: Let's say blurb stored a second, optional bit of metadata. Let's call it whatsnew for clarity. It can be empty, or it can be one of the section names from a What's New document ("New Features", "Optimizations", "Improved Modules"). Does that cover all our metadata needs?

Where I'm going with this: computer scientists solve for 0, 1, or infinity. I'm mildly willing to special-case it if our metadata needs are 2 ("section" and "whatsnew"), but if it's getting any more complicated than that then I'd want to go back to the drawing board a little and have blurb handle an arbitrary amount of (simple) metadata.

@methane

methane Mar 18, 2017

Member

@larryhastings sorry, fixed. The 1400-file PR was too huge to find an actual sample in.

The "released version" for all three tools is Misc/NEWS. blurb's aggregation format (the "tidy" files) was merely an experiment, and it may be short-lived.

Do you mean it will be removed after NEWS is generated?
Why is NEWS for already released versions split in your PR?

@ncoghlan

ncoghlan Mar 18, 2017

Contributor

The /next/ addition to blurb addresses my main previous automation related concern with it (which was having version numbers in path names). Thanks Larry.

However, I'm still +1 on the towncrier proposal, and only +0 on the others, as the towncrier proposal is the only one that doesn't require all contributors to learn a new tool.

Because towncrier uses an external namespacing mechanism to inherently avoid naming conflicts, we can just tell people "Name the NEWS file after the BPO issue you're working on and the kind of change it is. If there isn't an issue, go create one, and name it after that".

By contrast, both reno and blurb place new barriers in the way of potential contributors:

  • reno requires that all contributors be able to run reno new (or else core devs have to do it for them)
  • blurb requires that all contributors be able to run blurb add (or else core devs have to do it for them)

That means we'd be putting ourselves on the hook for getting arbitrary people up and running with reno and/or blurb on arbitrary operating systems just so they can generate a non-conflicting filename for their NEWS entry, and I really want to avoid that extra educational and support work.

I think it's also worth noting that I know Amber (the creator of towncrier) well enough that I'm not worried about our ability to get reasonable feature requests reviewed and approved if we need to tweak anything about the exact details of the input or output formats. I also think our particular needs as a project are better aligned with those of Twisted than they are OpenStack's (i.e. primarily volunteer contributors rather than primarily professionally sponsored ones), without being unique enough to justify maintaining our own entirely custom tool (especially one that we expected all contributors and potential contributors to learn to use, not just release managers).

@larryhastings

larryhastings Mar 18, 2017

Contributor

@methane: None of the blurb data files are deleted after NEWS is generated. NEWS is output-only; blurb never reads it. That means that the data for older versions is first converted into the blurb input file format (files containing individual items), like Misc/NEWS.d/3.6* in my sample.

@ncoghlan: If it's deemed to be important, the blurb workflow could easily be modified so that contributors could create the file by hand. All blurb add really does is write the template to disk, point your $EDITOR at it, then rename and stage the result. The template is really just there to guide the user to pick a valid NEWS section name. blurb could be modified to allow towncrier's workflow here--add your Misc/NEWS file to Misc/NEWS.d/new/<section> with a presumably-unique name.

On the other hand, Brett's currently asking for two bits of metadata:

  • The section the news item should go into in Misc/NEWS.

  • If it's notable enough, the suggested section the news item should go into in "What's New".

Other users have suggested other bits of metadata:

  • @westurner asked for a "security-relevant" flag

  • @warsaw suggested a number of metadata fields (backports, authors)

It's not clear to me how towncrier would handle these additional fields of metadata. Decorate the filename with them? Use an elaborate template which you get off a web page?

Also, I feel that towncrier's default file-naming approach is just a tiny bit underspecified. 95% of the time, there's a 1:1 mapping from NEWS item to bpo numbers. But there's that other 5%, where for example a bpo issue can result in several NEWS items. towncrier relies on bpo number providing a unique filename; it also permits adding NEWS entries that don't map to a bpo number, which are awkwardly named hide-<UUID4goeshere>.<ext>. Though perhaps this is easily solved by publishing a "generate a NEWS item filename for me" dynamic web page as part of the Dev Guide. (And, again, whatever we did for towncrier here, we could also easily adopt for blurb too.)

It's a sensible idea to not require contributors to use a new tool, if we can avoid it. I'm not sure installation / maintenance / use is all that big a deal; once the tool has matured, we could add it as a module to 3.x, then the user could use "python -m blurb" to add a news item. But "should not require use of an external tool to create a PR" was not one of the stated requirements, so if this is the deciding factor for which tool we choose--well, that's pretty lame.

Finally, I don't think writing and maintaining our own tool here is a big deal. It's not a complicated problem, and it wouldn't require a lot of code or ongoing maintenance. The blurb prototype is 800 lines, of which 150 are for blurb split which we'd only need at the beginning and could arguably subsequently remove. (Or move to an external maintained / abandoned tool.) I think it's better that we use a tool that does exactly what we need and which is totally under our control.
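To show how little is involved in the "generate a NEWS item filename for me" step, here is a Python sketch that derives a presumably-unique path from the section, a timestamp, and a hash of the entry text. The naming scheme, the `Misc/NEWS.d/next/` prefix as used here, and the function name are illustrative assumptions, not what blurb or towncrier actually emit.

```python
import hashlib
from datetime import datetime


def news_entry_path(section: str, text: str, when: datetime) -> str:
    """Build a path for a news entry that is very unlikely to collide."""
    stamp = when.strftime("%Y-%m-%d-%H-%M-%S")
    # A short content hash keeps two entries written in the same second apart.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
    return f"Misc/NEWS.d/next/{section}.{stamp}.{digest}.rst"


when = datetime(2017, 3, 18, 12, 0, 0)
print(news_entry_path("Library", "Fix a thing.", when))
print(news_entry_path("Library", "Fix another thing.", when))
```

Because the name does not depend on an issue number, several entries for one issue (or an entry with no issue at all) need no special casing; the cost is that the filename is not something a contributor would type by hand.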

@larryhastings

larryhastings Mar 18, 2017

Contributor

For "fun", I wrote a prototype "add server" for blurb. It's a web page that provides a form; you fill out the form, press Submit, and it gives you the filename to use and the text for the Misc/NEWS file. You copy and paste those things to your computer, add that file, and you're all set. No new tool required.

The "add server" is checked in to the blurb repo, and I've also added documentation showing what it (currently) looks like.

@warsaw

warsaw Mar 18, 2017

Member

I don't want to get involved in micro implementation details, but since this will greatly affect all core devs and contributors, once you've settled on the details, could one of you please summarize the workflows, the files we'll have to write or commands we'll have to call, and what we'll see in the repo. python-dev or python-committers please.

@dstufft

dstufft Mar 18, 2017

Member

As with @ncoghlan, the introduction of the next directory solved the number one issue I saw with blurb, although I still think our time would be better spent on something more useful than maintaining a tool that does this, since there are already tools out there that I believe will function for CPython. (Although volunteer time is not fungible, so it's possible @larryhastings would not be spending the additional time he puts into maintaining blurb on working on CPython!)

Towncrier does have the hide-<whatever>.<ext> files here, although that is largely just something I layered on top. Earlier we discussed requiring all changes to have a bpo entry if they are "notable" enough to go into the NEWS file, so that is one possible way of solving that issue without requiring a special bit of file naming. The bit after hide- would not have to be a uuid; it could be any sort of unique string, so it could be (as @ncoghlan suggested) hide-<user>-<per user description>.<ext>.

Towncrier does handle the "one news entry, multiple bugs being mentioned" issue by simply adding a file per bug with the exact same contents (eventually it all gets put into a dictionary like {"news entry content": ["bpo-1234", "bpo-4556"]}). It does not have a mechanism for allowing the same bug report to be specified multiple times within the same section (within different sections is fine though). When I converted 3.7 to towncrier I did not come across any examples like that, except for I think one where both entries were really saying the same thing, just worded differently. Without an example of this happening I can't come up with a good mechanism to handle it generically, though we could do something like bpo-1234.N.<ext> to allow it if we wanted to. I suspect that Amber would not be opposed to a feature like that.
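
To make the "file per bug with the exact same contents" idea concrete, here is a minimal sketch of the grouping step. It is not towncrier's actual code; the function name `group_entries` and the `bpo-NNNN.<section>` file names in the demo are assumptions for illustration.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List
import tempfile


def group_entries(news_dir: Path) -> Dict[str, List[str]]:
    """Map each distinct entry text to the issue IDs whose files contain it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(news_dir.iterdir()):
        if not path.is_file():
            continue
        issue = path.name.split(".", 1)[0]  # "bpo-1234.bugfix" -> "bpo-1234"
        grouped[path.read_text(encoding="utf-8").strip()].append(issue)
    return dict(grouped)


# Demo: two files with identical text collapse into a single entry.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "bpo-1234.bugfix").write_text("Fix the frobnicator.\n", encoding="utf-8")
    (root / "bpo-4556.bugfix").write_text("Fix the frobnicator.\n", encoding="utf-8")
    demo = group_entries(root)

print(demo)
```

Because the entry text is the dictionary key, two files for the same bug in the same section with different text would simply become two entries; supporting that case is then purely a file-naming question.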

Towncrier would not have metadata for things like authors and such; it is just a free-form text area, so instead you would just mention the author in the text or something along those lines.

We can add whatever sections we want, I just went with the default ones, but we could add a bpo-NNNN.security or add a security category or however we would best want to handle additional sections like that.

@westurner

westurner Mar 18, 2017

(this is from a few days ago; I added headings)

#6 (comment)

new issue: codelabels / releaselog categories

python/cpython#552

If a single ticket belongs in multiple categories or types, you can make multiple files with different messages for each category or news type.

It's likely that something is both a bugfix (BUG) and a security (SEC) issue (or a removal and a security issue).

  • a releasenote ("NEWS snippet") MAY have multiple releaselog categories

    • e.g. {bugfix, documentation, feature, removal} AND security
    • "blurb", "releasenote", "NEWS snippet"
  • a releaselog is a document suitable for inclusion in
    the python docs under whatsnew/

    • Misc/NEWS.rst
  • the releaselog contains structured data
    relevant to requirements traceability
    and release management.

  • a commit message MAY have "codelabels":

    • {BUG,DOC,ENH,DEP}, SEC
    • codelabels could also be helpful for CPython contributors
      who read the commit log

.

  • DRY: I don't want / need multiple releasenotes for e.g. a {new security feature} (ENH,SEC)
    • symlinks don't work on all platforms
    • How do I determine which versions are released with the ~same PR?
    • so, either:
      • a. the file is metadata:
        • YAML (reno, )
          • RST-in-YAML
          • YAML-LD yaml.dumps(json.loads(valid_jsonld))
      • OR
      • b. put metadata at the top of each file
        • RST field lists
        • YAML-in-RST
RST :field-lists:
:versions: 2.7, 3.6.1
:tags: bugfix, security
:pr: 123
:issue: 22
:cve: 2011-1015
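
As a rough illustration of option (b), metadata at the top of each file, the leading field-list lines could be split off with a few lines of Python. This is a sketch using a plain regular expression rather than docutils; `split_metadata` is a hypothetical name, and treating every value as a comma-separated list is an assumption.

```python
import re

# One ":name: value" line, as in the field list above.
FIELD_RE = re.compile(r"^:(?P<name>[\w-]+):\s*(?P<value>.*)$")


def split_metadata(text):
    """Split leading ``:name: value`` lines from the body of a news entry."""
    metadata = {}
    lines = text.splitlines()
    index = 0
    while index < len(lines):
        match = FIELD_RE.match(lines[index])
        if match is None:
            break  # first non-field line ends the metadata block
        # Treat every value as a comma-separated list; drop empty items.
        items = [item.strip() for item in match["value"].split(",")]
        metadata[match["name"]] = [item for item in items if item]
        index += 1
    body = "\n".join(lines[index:]).strip()
    return metadata, body


entry = """\
:versions: 2.7, 3.6.1
:tags: bugfix, security
:pr: 123
:issue: 22
:cve: 2011-1015

Fix thing.
"""
metadata, body = split_metadata(entry)
print(metadata)
print(body)
```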

Data

  • Issues (id, name, description, {thread})
    • [codelabels]
  • Pull Requests (id, name, description, {thread})
    • [codelabels]
    • commit message(s)
  • Releaselogs
    • major.minor: 2.7, 3.6
    • composed of release notes
  • Releaselog releasenotes
    • Misc/NEWS.d/releasenote.ext --> Misc/NEWS.rst
    • Body, Text, schema:description _rst , schema:articleBody
    • Metadata
      • Releaselog categories
      • Issue numbers
      • Mentioned Issue numbers
      • Versions
        • commit revid(s) for each cherry-picking
          • [codelabels]
      • Security:
        • CVE number(s)
Releaselog file paths
Releaselog categories

Commit message conventions, releaselog sections:

  • cpython/Misc/NEWS.rst headings (and whatsnew.rst)
    • Feature
    • Removal
    • Bugfix
    • Documentation
    • Misc
    • Security

.

Hashtags?

Are hashtags potentially useful here?
In addition to or instead of (enhanced) docutils field lists
(:versions: 2.7, 3.6.1)?

  • create hashtag patterns (~#QName (#ns:123, #ns:"1 2 3"))
  • include hashtags in the releaselog releasenotes
  • parse and then strip them out when building Misc/NEWS.rst
    • convert to links / attributes / headings
    • generate releaselog
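
The "parse and then strip them out" step might look roughly like this. It is a sketch only: the tag pattern is an assumption (a `#` followed by a letter and then letters, digits, `-`, `:` or `_`), chosen so that plain issue references such as `#21` are left in the text.

```python
import re

# Assumed tag shape; requiring a letter first leaves "#21" alone.
HASHTAG_RE = re.compile(r"(?<!\S)#([A-Za-z][\w:-]*)")


def extract_hashtags(text):
    """Return (text without hashtags, list of tag names in order)."""
    tags = HASHTAG_RE.findall(text)
    stripped = HASHTAG_RE.sub("", text)
    stripped = re.sub(r"\s+", " ", stripped).strip()  # tidy leftover spaces
    return stripped, tags


text, tags = extract_hashtags(
    "Add *argument* to function (closes #21) #feature #security #cve-2011-1015 #pr123 "
)
print(text)
print(tags)
```

The returned tag list is what a build step would then turn into links, attributes, or headings.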

.

URI Patterns
JSON-LD
{"@context": {
    "bpo": "https://bugs.python.org/issue",
    "pr": "https://github.com/python/cpython/pull/",
    "py": "https://schema.python.org/v1#",
    "ver": "https://schema.python.org/v1#releases/",
    "tag": "https://github.com/python/cpython/tag/",
    "cve": "https://cvedetails.com/cve/CVE-",
    "t": "https://schema.python.org/v1#releaselog/tag/"
 },
 "@graph": [{
    "@type": "py:ReleaseLog",
    "name": "Python Misc/NEWS.rst",
    "notes": [{
        "@type": "py:ReleaseNote",
        "name": null,
        "description":
            "Add *argument* to function (closes #21) #feature #security #cve-2011-1015 #pr123 ",
        "mentionedIssue": [ "bpo:21" ],
        "issue": [ "bpo:22" ],
        "cve": [ "cve:2011-1015" ],
        "pr": [ "pr:123" ],
        "versions": [ "ver:2.7", "ver:3.6.1" ],
        "tags": [ "t:feature", "t:security" ]
    },
    {
        "@type": "py:ReleaseNote",
        "name": null,
        "description":
            "Fix thing #22 (closes #22) #bugfix #pr124 ",
        "mentionedIssue": [ "bpo:22" ],
        "issue": [ "bpo:22" ],
        "pr": [ "pr:124" ],
        "versions": [ "ver:2.7", "ver:3.6.1" ],
        "tags": [ "t:bugfix" ]
    }
    ]
 }]
}
  • CVE OWL RDF RDFS Ontology:
    https://github.com/daedafusion/cyber-ontology/blob/master/cve/cve.rdf
  • As JSON, this is obviously too hard.
    • YAML-LD (this JSON-LD as YAML)
    • Or, re: docutils field lists and releaselogs:
      • Either:
        • Transform before concatenation
          (because repeated field list properties are not valid)
        • Extend docutils (to somehow infer an implicit e.g. heading from
          the releaselog filename)
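
For readers unfamiliar with the prefix mechanism in the JSON-LD context above, this is roughly how a compact value such as "bpo:22" turns into a URL. It is a sketch of plain prefix expansion, not a full JSON-LD processor; the `expand` helper is hypothetical, and the two prefixes shown are assumptions based on the tracker and CVE URLs mentioned in this thread.

```python
# A cut-down context: prefix -> URL stem.
CONTEXT = {
    "bpo": "https://bugs.python.org/issue",
    "cve": "https://cvedetails.com/cve/CVE-",
}


def expand(compact, context):
    """Expand 'prefix:suffix' using the context; return other values unchanged."""
    prefix, sep, suffix = compact.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return compact


print(expand("bpo:22", CONTEXT))
print(expand("cve:2011-1015", CONTEXT))
```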

Processes

Create a Pull Request:
  • create a Misc/NEWS.d/ entry ("releasenote")
  • commit message:
    • [include codelabels]
    • include issue #references
      • bugs.python.org/issue(\d+) URL
      • bpo-123 / 123
Release a Pull Request
  • (cherry-pick / merge)
    • Codelabels:
      • MRG (merge)
      • REL (!REF) / RLS (release)
  • For each release branch
    either (a) copy each file, or (b) append to the version metadata
    • a. cp releasenote-xyz.ext ../3.4/releasenote-xyz.ext
    • b. sed -E -i 's/^(:versions:.*)$/\1, 3.4/' releasenote-xyz.rst
  • Regenerate ("recompile") Misc/NEWS.rst:
# Makefile
# ...
.PHONY: build-news commit-news build-news-commit

build-news:
    ./tool.py -i ./Misc/NEWS.d/ -o ./Misc/NEWS.rst

commit-news:
    git add Misc/NEWS.rst
    git commit -m 'DOC: Misc/NEWS.rst: :fast_forward: build w/ tool.py' ./Misc/NEWS.rst

build-news-commit: build-news commit-news
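
The tool.py in the Makefile above does not exist; as a sketch of what its core could be, the function below renders a directory of entry files into one RST document. Everything here is an assumption for illustration: the name `build_news`, the `<issue>.<section>` file naming, and the heading layout.

```python
from collections import defaultdict
from pathlib import Path
import tempfile


def build_news(entries_dir: Path) -> str:
    """Render every ``<issue>.<section>`` file in entries_dir as one RST document."""
    sections = defaultdict(list)
    for path in sorted(entries_dir.iterdir()):
        if not path.is_file() or "." not in path.name:
            continue
        issue, section = path.name.rsplit(".", 1)
        # Collapse the entry text onto one line for the bullet.
        text = " ".join(path.read_text(encoding="utf-8").split())
        sections[section.capitalize()].append(f"- {issue}: {text}")

    blocks = []
    for title in sorted(sections):
        underline = "-" * len(title)
        blocks.append("\n".join([title, underline, "", *sections[title]]))
    return "\n\n".join(blocks) + "\n"


# Demo with two throwaway entries.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "bpo-22.bugfix").write_text("Fix thing.\n", encoding="utf-8")
    (root / "bpo-21.feature").write_text("Add *argument* to function.\n", encoding="utf-8")
    news = build_news(root)

print(news)
```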

@westurner

westurner Mar 18, 2017

(i think codelabels are helpful signal that's relevant to writing releasenotes)

commit messages and codelabels

API: an (incompatible) API change
BLD: change related to building numpy
BUG: bug fix
DEP: deprecate something, or remove a deprecated object
DEV: development tool or utility
DOC: documentation
ENH: enhancement
MAINT: maintenance commit (refactoring, typos, etc.)
REV: revert an earlier commit
STY: style fix (whitespace, PEP8)
TST: addition or modification of tests
REL: related to releasing numpy
ENH: Feature implementation
BUG: Bug fix
STY: Coding style changes (indenting, braces, code cleanup)
DOC: Sphinx documentation, docstring, or comment changes
CMP: Compiled code issues, regenerating C code with Cython, etc.
REL: Release related commit
TST: Change to a test, adding a test. Only used if not directly related to a bug.
REF: Refactoring changes

Codelabels

Codelabels (code labels) are three-letter codes with which commit messages can be prefixed.

CODE Label          color name      background  text
---- -------------- --------------- ----------  -------
BLD  build          light green     #bfe5bf     #2a332a
BUG  bug            red             #fc2929     #ffffff  (github default)
CLN  cleanup        light yellow    #fef2c0     #333026
DOC  documentation  light blue      #c7def8     #282d33
ENH  enhancement    blue            #84b6eb     #1c2733  (github default)
ETC  config
PRF  performance    deep purple     #5319e7     #ffffff
REF  refactor       dark green      #009800     #ffffff
RLS  release        dark blue       #0052cc     #ffffff
SEC  security       orange          #eb6420     #ffffff
TST  test           light purple    #d4c5f9     #2b2833
UBY  usability      light pink      #f7c6c7     #332829

DAT  data
SCH  schema

REQ  requirement
ANN  announcement

# Workflow Labels   (e.g. for waffle.io kanban board columns)
ready               dark sea green  #006b75     #ffffff
in progress         yellow          #fbca04     #332900

# GitHub Labels
duplicate           darker gray     #cccccc     #333333  (github default)
help wanted         green           #159818     #ffffff  (github default)
invalid             light gray      #e6e6e6     #333333  (github default)
question            fuchsia         #cc317c     #ffffff  (github default)
wontfix             white           #ffffff     #333333  (github default)

Note: All of these color codes (except for fuchsia)
are drawn from the default GitHub palette.

Note: There are 23 labels listed here.

Note
For examples with color swatches in alphabetical order, see https://github.com/westurner/dotfiles/labels

... codelabels are worth the effort because:

  • codelabels are great for free-coding
    • because nobody remembers what "added thing to fix it" from n years
      ago was
    • ENH,DOC,UBY: site.css: decrease header navbar padding (#22)
    • BUG,TST: module.py, test_module.py: handle bytestrings (fixes #21)
    • BUG,TST: handle bytestrings in module.py (fixes #21)
    • handle bytestrings in module.py
    • BLD,TST: Makefile: test: #tox -> pytest --pdb (#)
  • codelabels are great for maintainers
    • merging
    • cherry-picking
    • writing releaselog entries from commit logs
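
If commit subjects were prefixed this way, pulling the labels back out when drafting release notes would be straightforward. A sketch, assuming labels are three to five uppercase letters (to allow for MAINT in the longer list above), comma-separated, and followed by a colon; `parse_codelabels` is a hypothetical name.

```python
import re

# Labels, then ':', then the rest of the subject line.
PREFIX_RE = re.compile(
    r"^(?P<labels>[A-Z]{3,5}(?:,[A-Z]{3,5})*):\s*(?P<summary>.+)$"
)


def parse_codelabels(subject):
    """Return (labels, summary); labels is empty when there is no prefix."""
    match = PREFIX_RE.match(subject)
    if match is None:
        return [], subject
    return match["labels"].split(","), match["summary"]


print(parse_codelabels(
    "BUG,TST: module.py, test_module.py: handle bytestrings (fixes #21)"
))
print(parse_codelabels("handle bytestrings in module.py"))
```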

... Releaselogs and Codelabels are part of
a broader need for Change Management and Requirements Traceability:

@brettcannon

brettcannon Mar 20, 2017

Member

@larryhastings yep, I personally don't see a need for anything beyond NEWS section and What's New relevance. To be perfectly honest, I say we drop the NEWS sections and just use the What's New sections, but I would want people like @warsaw and @doko42 who have to care more about the NEWS file to say whether it is useful to them to know a change updated an extension module vs pure Python code.

As for the extra metadata like authors, backports, etc., I don't think that's necessary as that's covered by the git repo data. And I think I have seen security suggested, but I believe that should just be a classification in and of itself and trump all other classifications of the news entry.

@warsaw

warsaw Mar 20, 2017

Member

I've found the Misc/NEWS sections (including Library vs. Core/Builtin vs. Extension Modules) to be helpful when trying to sleuth out whether a downstream reported bug was caused by a change in upstream Python or not.

@ncoghlan

ncoghlan Mar 21, 2017

Contributor

+1 for what @warsaw noted - if something breaks after an upgrade, it's helpful to be able to go:

  • first check the What's New porting guide
  • then check the rest of What's New
  • then check the relevant section of Misc/NEWS
  • then check the whole of Misc/NEWS
  • then check the commit history

Sometimes "find in file" will short-circuit actually relying on the categorisation, but not always.

@westurner

westurner Mar 21, 2017

@ncoghlan

ncoghlan Mar 21, 2017

Contributor

@westurner This is not the thread for you to try to tell operating system developers how to do our jobs (no thread is that thread). Reading docs can be done anywhere (and relatively quickly), while writing and running tests (especially under bisect) is far more time consuming and environmentally constrained (so you don't want to do it unless there's some reason to believe an upstream change may be at fault in the first place).

@westurner

westurner Mar 21, 2017

@rbtcollins

rbtcollins Mar 21, 2017

Member

@westurner they really are not even vaguely related. The issue is that for a consumer of an externally delivered package like Python, their own code, which we can reasonably expect them to bisect, is only a consumer: the thing that has changed did so as a large atomic unit comprising thousands of commits, bundled through at least two levels of delivery: upstream -> distro, distro -> user. To successfully bisect in such a situation (and this is ignoring the complexities of interactions with non-stdlib components that also required changes over the evolution of the change) requires the consumer to learn how to build Python; how to build it using the distro package rules; adjust those rules for older tree revisions where intermediate commits were never packaged...

It's a huge effort vs 'oh look, the release notes say that the traceback package has been reimplemented, and my failure was in traceback, so I should look closely at that bit of code and maybe the upstream commits for it'.

@berkerpeksag

berkerpeksag Mar 22, 2017

Member

FWIW, I prefer a custom tool that lives in the Python organization on GitHub. I don't really want to leave comments like "can you please take a look at this?" every two weeks in order to get a simple fix merged.

Since it will be in the Python organization and all core devs will have commit rights, I don't think maintaining such a tool will be an issue. I can help Larry with maintaining it if you want to see a list of maintainers.

@dstufft

dstufft Mar 22, 2017

Member

I doubt that bugging people to get fixes merged is going to be that big of a problem, and if it ends up being so we can easily fork the thing at that point. Deciding up front that we need our own thing on the off chance we can't get fixes merged seems a bit wasteful when we can, at any time, fork whatever solution we use and start maintaining it if it becomes an issue.

@brettcannon

brettcannon Mar 22, 2017

Member

I agree with @dstufft that worrying about maintenance is a bit premature (e.g. we already rely on Sphinx for building our docs, which has its own dependencies). Now, if Larry makes blurb more attractive because he simply makes it do the exact thing we want, and that happens not to be what towncrier does, then that's a legitimate reason for having our own tool.

@larryhastings

larryhastings Mar 22, 2017

Contributor

That was always my plan! Now if only we knew exactly what we wanted...

@1st1

1st1 Mar 22, 2017

Member

> FWIW, I prefer a custom tool that lives in the Python organization on GitHub. I don't really want to leave comments like "can you please take a look at this?" every two weeks in order to get a simple fix merged.

This. Let's agree on the actual workflow and let Larry implement it. FWIW Argument Clinic's clinic.py is a pleasure to work with and maintaining it is frictionless since it's part of the CPython repo.

@brettcannon

brettcannon Mar 24, 2017

Member

I've rejected reno as an option. Thanks to @dhellmann and @rbtcollins for the time and effort put into proposing it. In the end the fact that no one voted positively for it and the YAML format led to more formatting than necessary for the common case compared to towncrier or blurb makes me think it isn't the best fit for us.

@ncoghlan

ncoghlan Mar 25, 2017

Contributor

A thought in regards to blurb as a CPython-specific tool: with it being just-for-us, there'd be more opportunities to make it aware of other workflow tools like bugs.python.org itself (especially if we follow up on @soltysh's efforts to incorporate the GSoC work that added a Roundup REST API: http://psf.upfronthosting.co.za/roundup/meta/issue581 ).

With a CPython-specific tool, CPython-specific service integrations aren't a problem. With a general purpose tool, we'd either need additional scripting around it (effectively creating our own tool anyway), or else come up with configurable solutions, rather than just handling the specific services we care about.

Given how much more complex the CPython development process is than a more typical single-release-stream Python project, that seems like it could be a recipe for future conflict (to put it in relative terms: when it comes to process complexity, CPython is to most other projects as OpenStack is to CPython)

@soltysh

soltysh Mar 28, 2017

Collaborator

@ncoghlan I can agree on querying bpo about the issues solved in a past release (using that REST endpoint). Is that sufficient for the release notes?

@westurner

westurner Mar 30, 2017

@soltysh @WadeFoster @mikeknoop

(modified) From #6 (comment) ("JSON-LD") :

{"@context": {
    "py": "https://schema.python.org/v1#",
    "bpo": "https://bugs.python.org/issue",
    "cpypr": "https://github.com/python/cpython/pull/",
    "ver": "https://schema.python.org/v1#releases/",
    "t": "https://schema.python.org/v1#releaselog/tag/",
    "pyreltag": "https://schema.python.org/v1#releaselog/tag/",

    "label": "https://github.com/python/cpython/labels/",

    "cved": "https://cvedetails.com/cve/CVE-",

    "name": { "@id": "schema:name" },
    "description": { "@id": "schema:description" },

    "notes": { "@id": "py:notes", "@container": "@list"},
    "issue": { "@id": "py:issue", "@container": "@list"},
    "mentionedIssue": { "@id": "py:issue", "@container": "@list"},
    "versions": { "@id": "py:versions", "@container": "@list"},
    "pullRequests": { "@id": "py:pullRequests", "@container": "@list"},
    "cve": { "@id": "py:cve", "@container": "@list"},
    "tags": { "@id": "py:tags", "@container": "@list"}
 },
 "@graph": [{
    "@type": "py:ReleaseLog",
    "name": "Python Misc/NEWS.rst",
    "notes": [{
        "@type": "py:ReleaseNote",
        "name": null,
        "description":
            "Add *argument* to function (closes #21) #feature #security #cve-2011-1015 #pr123 ",
        "issue": [ "bpo:22" ],
        "mentionedIssue": [ "bpo:21", "bpo:22" ],
        "cve": [ "cved:2011-1015" ],
        "pullRequests": [ "cpypr:123" ],
        "versions": [ "ver:2.7", "ver:3.6.1" ],
        "tags": [ "t:feature", "t:security" ]
    },
    {
        "@type": "py:ReleaseNote",
        "name": null,
        "description":
            "Fix thing #22 (closes #22) #bugfix #pr124 ",
        "issue": [ "bpo:22" ],
        "mentionedIssue": [ "bpo:22" ],
        "pullRequests": [ "cpypr:124" ],
        "versions": [ "ver:2.7", "ver:3.6.1" ],
        "tags": [ "t:bugfix" ]
    }
    ]
 }]
}

@westurner

westurner Mar 30, 2017

... The builds: matrix is obviously more information than is necessary for a release log.

@brettcannon

brettcannon Mar 30, 2017

Member

@westurner please stop dumping your personal notes here. They are not contributing anything useful to this thread (e.g. none of us need a link to GitHub's API just pasted in a list as we all know how to use a search engine and there is no relevancy here for JSON-LD).

You have now been warned twice in this thread about your posting habits. I know you mean well, but please keep your posts concise and on-point or else I will block you from this issue tracker for not being respectful of other people's time.

@westurner

westurner Mar 30, 2017

Or you could store release note links/edges as restructuredtext line blocks:

Entry 1
========
 "Add *argument* to function (closes #21) #feature #security #cve-2011-1015 #pr123 
| issue: bpo-22
| cve: cved:2011-1015
| pullRequests: cpypr:123
| versions: 2.7, 3.6.1
| tags: "feature", "security"
-  "Add *argument* to function (closes #21) #feature #security #cve-2011-1015 #pr123 
  - | issue: bpo-22
  - | cve: cved:2011-1015
  - | pullRequests: cpypr:123
  - | versions: 2.7, 3.6.1
  - | tags: "feature", "security"

@larryhastings

larryhastings Mar 31, 2017

Contributor

I haven't found the time to sit down and write this properly, so here's a quick note on this topic before Brett makes up his mind. Sorry if it's a bit long / messy.

First, I don't have a strong opinion about what the input format to blurb should look like. If there was a consensus about "it should look like X", then I'd make it look like X.

We don't seem to have a consensus about what the input format should look like, because I don't think we've reached consensus about what metadata the tool needs. We need to figure that out first.

Obviously necessary:

  1. the Misc/NEWS text
  2. the Misc/NEWS category

I would also like:

  3. some datestamp/nonce that ensures the news entries remain in some sort of stable order (I prefer chronological ordering; sadly, git doesn't maintain timestamps)

I believe Brett is also asking for:

  4. an optional "please consider for the next What's New document" flag
  5. a suggested category for "What's New"

By the way, towncrier's approach of pre-created directories named for the categories (2.) is a nice idea. That ensures people don't misspell the category name. blurb could easily switch to that. The only downside I know of is that, if I understand correctly, git doesn't track directories as first-class objects, so we'd have to have an extra empty file in each directory.
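A rough sketch of the directory-per-category layout described above, not towncrier's or blurb's actual code. The category names, the `.placeholder` file name, and both function names are assumptions made up for illustration; the empty placeholder file is only there so git keeps the otherwise empty directories.

```python
import tempfile
from pathlib import Path

# Hypothetical category names and placeholder file name (assumptions).
CATEGORIES = ("Core and Builtins", "Library", "Documentation")
PLACEHOLDER = ".placeholder"


def create_layout(root):
    """Pre-create one directory per category, each holding an empty file."""
    for category in CATEGORIES:
        directory = Path(root) / category
        directory.mkdir(parents=True, exist_ok=True)
        (directory / PLACEHOLDER).touch()


def collect_entries(root):
    """Return {category: [entry text, ...]}, rejecting unknown directories."""
    entries = {}
    for directory in sorted(Path(root).iterdir()):
        if not directory.is_dir():
            continue
        if directory.name not in CATEGORIES:
            raise ValueError(f"unknown category directory: {directory.name}")
        entries[directory.name] = [
            path.read_text(encoding="utf-8").strip()
            for path in sorted(directory.iterdir())
            if path.is_file() and path.name != PLACEHOLDER
        ]
    return entries


# Usage: build the layout in a scratch directory and add one entry.
with tempfile.TemporaryDirectory() as root:
    create_layout(root)
    entry = Path(root) / "Library" / "entry-1.rst"
    entry.write_text("Fix a hypothetical bug.\n", encoding="utf-8")
    collected = collect_entries(root)
```

Because the category is taken from the directory name, a misspelled category shows up as an unknown directory rather than as a silently new section.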

blurb currently supports 1-3. It uses the filename for two bits of metadata (stable sorting order and category), and the contents of the file are simply the news entry. But it has kind of reached my comfort level regarding storing metadata in the filename.

If we only want to add 4, a simple "consider for what's new please", then okay, I think we could live with sticking that in the filename. Like we add .wn just before the extension, for example.

If we want to add 5, then my inclination is to add a simple metadata blob to the contents of the file:

  • simple name=value (or name: value) pairs
  • # is a line comment
  • empty line or some explicit marker line ("--") ends the metadata blob

If we do that, then my inclination is further to move all the metadata into that blob:

  • category=Library
  • nonce=20170513062235.ef4c88a1
  • # what's new = Improved Modules

(uncomment "what's new" to use it)
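A minimal sketch of how such a metadata blob could be read, assuming exactly the rules listed above: `name=value` or `name: value` pairs, `#` line comments, and an empty line or `--` ending the blob. The function name `parse_entry` and the sample entry text are hypothetical; this is not blurb's implementation.

```python
import re

# A metadata line is "name=value" or "name: value"; split on the first separator.
PAIR = re.compile(r"^([^=:]+)[=:](.*)$")


def parse_entry(text):
    """Split a news entry into (metadata dict, body text)."""
    metadata = {}
    lines = text.splitlines()
    body_start = len(lines)  # no terminator found means the body is empty
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped in ("", "--"):
            # An empty line or the explicit marker ends the metadata blob.
            body_start = index + 1
            break
        if stripped.startswith("#"):
            continue  # line comment, e.g. a commented-out "what's new"
        match = PAIR.match(stripped)
        if match is None:
            raise ValueError(f"malformed metadata line: {line!r}")
        metadata[match.group(1).strip().lower()] = match.group(2).strip()
    body = "\n".join(lines[body_start:]).strip()
    return metadata, body


EXAMPLE = """\
category=Library
nonce=20170513062235.ef4c88a1
# what's new = Improved Modules

Fix a hypothetical bug in the traceback module.
"""

metadata, body = parse_entry(EXAMPLE)
```

With the "what's new" line still commented out, only `category` and `nonce` end up in the metadata; removing the leading `#` would add a `what's new` key.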

The "blurb" tool would make it easy to add these, but users could also create the file by hand using a web page form that formats the output for them. (Making the entry entirely by hand might be tricky, since the nonce should be in a standardized format. Maybe we could give them a short blob of Python they run to generate one?)
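The "short blob of Python" could look something like the sketch below. It assumes the nonce is a UTC timestamp followed by a dot and eight random hex digits, which matches the shape of the example value above but is a guess about the intended format; `make_nonce` is a hypothetical name.

```python
import secrets
from datetime import datetime, timezone


def make_nonce(now=None):
    """Return a sortable nonce such as '20170513062235.ef4c88a1'."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Fixed-width timestamp, so plain string sorting is chronological.
    stamp = now.strftime("%Y%m%d%H%M%S")
    # Four random bytes give eight hex digits to avoid same-second collisions.
    return f"{stamp}.{secrets.token_hex(4)}"


example = make_nonce(datetime(2017, 5, 13, 6, 22, 35, tzinfo=timezone.utc))
```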

If all the metadata lives inside the file, then we don't care what the filename is, it just needs to be unique.

@ncoghlan

ncoghlan Mar 31, 2017

Contributor

For metadata-in-the-file, I quite like the format that the Nikola static blog generator uses:

.. title: The Python Packaging Ecosystem
.. slug: python-packaging-ecosystem
.. date: 2016-09-17 03:46:31 UTC
.. tags: python
.. category: python
.. link: 
.. description: Overview of the Python Packaging Ecosystem
.. type: text

Post starts here...

As an added bonus, when the file has the .rst extension, my editor automatically grays out the metadata as line comments, and assuming we're planning to use ReST in the snippets for ease of Sphinx integration, that would also apply here.

Using structured metadata like that would also open up future options for acknowledgements that aren't directly tracked in the git metadata - cases where we built on a patch written by someone else, or someone contributed API design ideas that someone else implemented, etc. At the moment we put that in the snippet body ("Initial patch by ..." and so forth), but a metadata field could more easily feed into ideas like auto-generating Misc/ACKS in addition to Misc/NEWS.

As far as a stable sort algorithm for display goes, we could then define that as:

  • a date field in the snippet metadata (e.g. the date string Nikola uses is just datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S UTC"))
  • the filename used for the snippet (since that has to be non-conflicting or git will complain)
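A small sketch of that ordering, assuming snippets carry Nikola-style `.. name: value` header lines and a `date` field in the format quoted above. `parse_snippet`, `sort_key`, and the sample file names and contents are hypothetical.

```python
import re
from datetime import datetime

# Matches header lines such as ".. date: 2016-09-17 03:46:31 UTC".
FIELD = re.compile(r"^\.\. (\w+):\s*(.*)$")
DATE_FORMAT = "%Y-%m-%d %H:%M:%S UTC"


def parse_snippet(text):
    """Read leading '.. name: value' lines, then treat the rest as the body."""
    metadata = {}
    lines = text.splitlines()
    index = 0
    while index < len(lines):
        match = FIELD.match(lines[index])
        if match is None:
            break
        metadata[match.group(1)] = match.group(2).strip()
        index += 1
    body = "\n".join(lines[index:]).strip()
    return metadata, body


def sort_key(filename, metadata):
    """Order by the date field first, then by filename as a tie-breaker."""
    return (datetime.strptime(metadata["date"], DATE_FORMAT), filename)


snippets = {
    "b.rst": ".. date: 2017-03-31 10:00:00 UTC\n.. category: Library\n\nSecond entry.",
    "a.rst": ".. date: 2017-03-31 10:00:00 UTC\n.. category: Library\n\nFirst entry.",
    "c.rst": ".. date: 2017-03-30 09:00:00 UTC\n.. category: Core\n\nOldest entry.",
}
ordered = sorted(
    snippets,
    key=lambda name: sort_key(name, parse_snippet(snippets[name])[0]),
)
```

Since filenames must already be unique within a directory, the pair of date and filename gives a total order even when two entries share a timestamp.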
@brettcannon

brettcannon Mar 31, 2017

Member

I've decided to go with blurb. The fact that we're still discussing what we want to carry forward in the entries suggests we need as much flexibility in the tooling as possible. Thanks @dstufft for putting the work in to put towncrier forward (and once again to @dhellmann and @rbtcollins for reno). I assume we will check it into Tools/ so it is carried with the repo for easy use by everyone and to make updating it easy (if I thought other teams might use it then I might argue for putting it into its own repo, but I don't think anyone will so I'm not going to suggest that).

I've started #66 for discussing how we want to format the entries, since this issue has gotten rather long and slightly unwieldy.

@ncoghlan

ncoghlan Apr 1, 2017

Contributor

I'd suggest putting it in the core-workflow repo or keeping it in its own repo, rather than putting it into Tools.

Tools is OK for things that don't change very often (e.g. reindent.py), and for things where it's OK for new features to only go into new versions (e.g. Argument Clinic), but it's a pain for anything that's still under active development and needs to behave consistently across branches.
