[PROPOSAL] Introduce in automatically generated release notes #2677

ZilongX · 2022-10-27T04:29:19Z

Proposal in one sentence

Enable automated release notes generation for the OpenSearch-Dashboards repo.

What problem are you trying to solve?

Currently, both the Release Notes and the CHANGELOG are updated mostly by hand.

For the Release Notes, each release requires a PR that adds the corresponding markdown notes file to the release-notes folder:
- Folder path: https://github.com/opensearch-project/OpenSearch-Dashboards/tree/main/release-notes
- Example release note PR for v1.3.6: Adds v1.3.6 release notes #2480
- Example release note PR for v2.3.0: Add v2.3.0 release notes #2318

For the CHANGELOG, it is even worse: literally every single PR has to update the CHANGELOG file directly:
- Example PR: [CVE] Bump prismjs to 1.29.0 to fix CVE-2022-23647 #2668
- Example PR: Apply get indice error handling in step index pattern #2652
- Example PR: Change save object type, wizard id and name to visBuilder #2673

These manual updates add an unnecessary burden to each and every release and PR, which does not serve the best interest of the OSD project or the community as a whole.
We need to improve our current process and make sharing release and change information smoother and less painful for both our consumers and our contributors.

Why should we solve it?

The Release Notes and the CHANGELOG are important: we definitely need a clear and concise way of telling people what has changed, when the changes happened, and where to find more details (such as the PR link) when needed. Yet this information should be collected in a much smarter way, in other words automatically. The current manual process should be revised ASAP, for reasons including:

The current manual process is time-consuming
- For the Release Notes, each release manager has to spend significant time preparing the release notes file, either by comparing commits or by retrieving the data from the CHANGELOG. Comparing commits is painful. Getting data from the CHANGELOG may sound straightforward, but in reality we are just shifting the burden from release managers to individual PR contributors (by asking them to update the CHANGELOG in every PR). Judging by previously generated release notes, it can take at least 30 minutes to 1 hour just to put the information together in one PR (including tagging each release manager to get more accurate data).
- For the CHANGELOG, the content update itself is not intensive, but the process takes more effort: each changed item needs a PR number as a reference, and the PR number is usually not known until the PR has been created (the next PR number can be retrieved through a GitHub API call, but that is still extra work). As a result, there are lots of forced commits after PR creation whose only purpose is to update the CHANGELOG file, for example :

The current manual process is prone to human error
- We are all human, and humans make errors. While we stay understanding when human errors happen, it is better to find an automated way to minimize the chance of them happening. I recently learned this story from @ashwin-pc and would like to share it here as an example of how we could build and maintain a better community as a whole:
  - For the PR Ability to start OS Dashboards with a newer and compatible nodejs version #2091, the contributor had all the changes ready, but a CHANGELOG entry was then required. The contributor made the change but did not sign off that commit (we usually ask for all commits to be signed off). We decided this was OK since the squash commit would carry the sign-off.
  - Digging deeper: why can't we let each contributor stay focused on their actual changes and leave the changelog maintenance to us (maintainers, release managers and, uhh, enthusiastic contributors)?

How do you propose to solve it?

One word: automation.

Specifically, enable automated release notes generation, which is a native GitHub offering (reference doc: https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes).

Kudos to @seraphjiang, who first proposed and implemented this feature in the dashboards-anywhere repo: opensearch-project/dashboards-anywhere#90

Here are the proposed improvement steps in detail:

- Create a new file .github/release.yml to enable automated release notes generation
- Define and update the label template in .github/release.yml (we can reuse the format of our current CHANGELOG file)
- Create or update the label sets if needed
- Test the updates (in a fork) by creating test releases and making sure the release notes are generated as expected

Once the automation is tuned as expected, the proposed next steps are:

- Remove the CHANGELOG update requirement from each individual PR
- Update the CHANGELOG based on the automatically generated release notes
- Release Notes preparation remains a mandatory step in the release process, but it would now rely entirely on the automation instead of manual effort.
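
To make the label template idea concrete, here is a minimal Python sketch that approximates how label-based categories and exclusions in a .github/release.yml could sort PRs into release note sections. It is a simulation for discussion, not GitHub's implementation. The category titles, label names, and author handles are assumptions; only the PR titles and numbers come from the examples above.

```python
# Hypothetical stand-in for .github/release.yml, written as a dict.
# Category titles and label names are illustrative assumptions.
RELEASE_CONFIG = {
    "exclude": {"labels": ["ignore-for-release"], "authors": ["octocat"]},
    "categories": [
        {"title": "Security", "labels": ["security"]},
        {"title": "Bug Fixes", "labels": ["bug"]},
        {"title": "Other Changes", "labels": ["*"]},  # catch-all category
    ],
}

# PR titles and numbers are taken from this issue; authors and labels are made up.
SAMPLE_PRS = [
    {"number": 2668, "title": "[CVE] Bump prismjs to 1.29.0 to fix CVE-2022-23647",
     "author": "contributor-a", "labels": ["security"]},
    {"number": 2652, "title": "Apply get indice error handling in step index pattern",
     "author": "contributor-b", "labels": ["bug"]},
    {"number": 2673, "title": "Change save object type, wizard id and name to visBuilder",
     "author": "contributor-c", "labels": []},
    {"number": 2480, "title": "Adds v1.3.6 release notes",
     "author": "contributor-d", "labels": ["ignore-for-release"]},
]


def group_pull_requests(pull_requests, config):
    """Put each PR into the first category whose labels match; '*' matches any PR."""
    excluded_labels = set(config["exclude"]["labels"])
    excluded_authors = set(config["exclude"]["authors"])
    grouped = {category["title"]: [] for category in config["categories"]}
    for pr in pull_requests:
        labels = set(pr["labels"])
        if labels & excluded_labels or pr["author"] in excluded_authors:
            continue  # excluded PRs never reach the notes
        for category in config["categories"]:
            wanted = set(category["labels"])
            if "*" in wanted or labels & wanted:
                grouped[category["title"]].append(pr)
                break
    return grouped


def render_notes(grouped):
    """Render non-empty categories as markdown sections."""
    lines = []
    for title, prs in grouped.items():
        if not prs:
            continue
        lines.append(f"### {title}")
        for pr in prs:
            lines.append(f"* {pr['title']} by @{pr['author']} in #{pr['number']}")
    return "\n".join(lines)


GROUPED = group_pull_requests(SAMPLE_PRS, RELEASE_CONFIG)
NOTES = render_notes(GROUPED)
print(NOTES)
```

The sketch shows why labeling matters: an unlabeled PR falls through to the catch-all section, and a PR carrying the exclusion label disappears from the notes entirely.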


ashwin-pc · 2022-10-27T05:18:42Z

Having seen it used in the other repos, I can attest that this is much better! Thanks @seraphjiang and @ZilongX. But since I've never actually had to generate release notes or a changelog for an OSD release, I'd love to hear from @ananzh and others who have actually gone through the process, to see whether this meets the requirements and, if not, how we can adapt it so that it does.

mihirsoni · 2022-10-27T16:42:24Z

To make this super productive, we need to set a very high bar for commit messages when we merge PRs and make sure we use squash merges to reduce noisy commits. In general, these automated generators take the commit messages.

seraphjiang · 2022-10-27T17:27:55Z

> To make this super productive, we need to set a very high bar for commit messages when we merge PRs and make sure we use squash merges to reduce noisy commits. In general, these automated generators take the commit messages.

Agreed, we need to ensure a high bar for commit messages and make sure we use squash merges.
From what I have observed, the automated generator takes its content from the PR (commit).

Ensuring a high-quality commit message for a PR takes the same effort as ensuring a high-quality changelog message. To save effort, we should focus on a high bar for commit messages anyway.

seraphjiang · 2022-10-27T17:30:06Z

@ZilongX @ashwin-pc @ananzh @AMoo-Miki

I suggest adding release.yml and giving it a try in the 2.4.0 release, then comparing the auto-generated notes with the manual changelog so we know the gap.

zhongnansu · 2022-10-28T00:19:27Z

@ZilongX Great proposal. The changelog is causing lots of backport PRs to fail today, because different PRs touch the same file and, even worse, the same section. Backport automation bots fail a lot. I've had a nightmare working on backport PRs. I like the idea of aiming for automation. The key seems to be labeling PRs correctly and, as mentioned, keeping a high bar for commit messages. Again, we can't rely on humans; we should rely on automation or CI checks to prevent "bad" PRs (poorly labeled, or with poor commit messages). I wonder if you have explored any existing solutions or GitHub workflows/actions.

Also, I think we still have these 2 open issues discussing release automation. They talk about using https://github.com/release-drafter/release-drafter, which seems pretty similar to GitHub's native release notes automation. I wonder if you have thought about using it, and what the pros and cons are.

I just noticed that the plan for the next major version of release-drafter includes onboarding a "Pull Request Label Checker": release-drafter/release-drafter#1173

kristenTian · 2022-10-28T18:10:22Z

Thanks Zilong. Resolving conflicts in the changelog by hand has really been painful, and people also need X rounds of approval depending on how many times they have to rebase.

I would also like to try this new approach, and if it is not sufficient for our use case we can migrate to other tools like the one @zhongnansu mentioned. Since these tools only affect newly created PRs, even such a migration should be backward compatible.

kavilla · 2022-10-31T20:48:37Z

Thanks for creating this issue @ZilongX,

I think there was a discussion in another repo, but I am not able to find it, so I will recap some of the comments I posted there.

Historically, any automation was still really painful for the release owner because the commits were not descriptive enough. As @mihirsoni stated, we should have a high bar for commit messages, and at that point we are still requiring contributors to do things properly, which I don't think is a bad thing.

> in reality we are just shifting the burden from release managers to individual PR contributors

I think this should be the case; in an ideal state there should be no release managers. Release managers are required because releasing is still a very manual process, but the OpenSearch project has a shared governance model. That is not to say this is a reason to keep the CHANGELOG, but I wouldn't consider it a bad thing if we raised the bar for commits (i.e., increased the burden on contributors).

Another thing that has historically made this painful is that our release notes purposefully did not include commits that had already been released. We can revisit this if it is too hard, but how do we plan on addressing it? Will a single person now have to generate the release notes, revisit the previous releases, and ensure that there is no duplication? That seems to put us back to making release notes generation a blocker that costs the person in charge of the release a couple of hours, instead of requiring contributors or maintainers to handle a conflict.

The conflicts seem bad right now, but I think that is because the CHANGELOG was never really intended for feature development. A lot of the conflicts I have seen were due to the increased speed of contributions into the repo, which makes me want to reflect on whether it was a good idea to keep the original feature merge in, even though it had a lot of fast follow-ups and bug fixes prior to an official release. If a required surge of commits then caused multiple CHANGELOG updates daily, reverting the original commit that caused the influx would also have relieved this pain point. There also would not have been any backport PRs required for MD, or multiple reviewers, if we had reverted the initial commit. That is in the past, but I would be hesitant to delete the CHANGELOG because of issues that were probably caused by us not following best practices.

The CHANGELOG verifier action is being updated to be able to skip files. If another feature development effort comes along in the future, another proposal is to skip the changelog check when the feature hasn't even been released. In reality, for the MD project we could have added a single bullet point for MD and then kept adding PRs to that single bullet point.

One other point is that the CHANGELOG stuff is an OpenSearch project thing. I can see the benefit of the repos matching the general schema. If I work in the OpenSearch repo to add a feature and then it requires me to add a CHANGELOG and then want to add the corresponding OpenSearch Dashboards repo feature but don't have to add a CHANGELOG I would be slightly confused at the lack of standard for contributor requirements. So perhaps this discussion is better held within the .github repo and picked up by all teams.

But I think the biggest driving factor why we moved from automated release-notes was the issue of release cadence and when commits get released and avoid duplication in release notes which was a huge blocker for every release as stated above.

seraphjiang · 2022-11-01T03:35:18Z

@kavilla

> One other point is that the CHANGELOG stuff is an OpenSearch project thing.

This is a per-repo thing for now. Only OpenSearch and OpenSearch Dashboards have tried it, and no other repo is adopting it. We have discussed this with @dblock and he agreed to experiment in different repos.

opensearch-project/OpenSearch#4769

> But I think the biggest driving factor why we moved from automated release-notes was the issue of release cadence and when commits get released and avoid duplication in release notes which was a huge blocker for every release as stated above.

GitHub generates its release notes from PRs, not commits, and it supports exclusion labels for skipping PRs that are not needed.

e.g.

changelog:
  exclude:
    labels:
      - ignore-for-release
    authors:
      - octocat

One strong motivation is that the existing CHANGELOG file approach causes backport conflicts almost 100% of the time, even for a simple CVE fix. Maintainers lack the bandwidth to resolve them manually, so contributors have to raise separate backport PRs. I'm sure this won't scale with a limited maintainer group.

There has been a lot of discussion. Why don't we try it in 2.4.0 and compare the result with the manually created changelog to see what the real gap is? What do we lose by trying?

It also supports a REST API and other ways to integrate with existing automation:
https://docs.github.com/en/rest/releases/releases
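
As a rough illustration of that REST integration, the sketch below builds (but does not send) a request to the "generate release notes" endpoint described in the linked docs. The tag names and branch name are assumptions chosen for the example, and the helper function name is hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_generate_notes_request(owner, repo, tag_name,
                                 target_commitish=None,
                                 previous_tag_name=None,
                                 token=None):
    """Build a POST request for /repos/{owner}/{repo}/releases/generate-notes."""
    payload = {"tag_name": tag_name}
    if target_commitish:
        payload["target_commitish"] = target_commitish  # branch or SHA to tag
    if previous_tag_name:
        payload["previous_tag_name"] = previous_tag_name  # baseline for the diff
    headers = {
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}/releases/generate-notes",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Tag and branch names here are assumptions for illustration only.
REQUEST = build_generate_notes_request(
    "opensearch-project",
    "OpenSearch-Dashboards",
    tag_name="2.4.0",
    target_commitish="2.x",
    previous_tag_name="2.3.0",
)
print(REQUEST.get_method(), REQUEST.full_url)

# Sending requires a token with write access to the repository:
#   with urllib.request.urlopen(REQUEST) as response:
#       notes = json.load(response)  # expected to contain "name" and "body"
```

Passing an explicit previous tag is one way to pin the baseline the notes are computed from, which matters when several release branches are active.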

If we do see a gap, we could provide feedback to GitHub:

community/community#5962

This would benefit not only the OpenSearch project but all GitHub projects.

bandinib-amzn · 2022-11-01T20:46:59Z

Let's have discussion in main thread

seraphjiang · 2022-11-02T04:31:54Z

> Let's have discussion in main thread

@bandinib-amzn we have reached agreement that this is a per-repo discussion. That thread is only for OpenSearch. We should focus on driving the best approach for the OpenSearch Dashboards repo.

seraphjiang · 2022-11-02T04:40:53Z

@ZilongX @AMoo-Miki @kavilla @ashwin-pc @dblock

Could we try the release notes automation provided by GitHub?
I love automation, but there is no need to reinvent the wheel.

We already tried it on other repos, and the results are elegant. I really want to know what the concerns are about giving it a try, which could be done in 5 minutes.

https://github.com/opensearch-project/dashboards-anywhere/releases/tag/v0.8.0
https://github.com/opensearch-project/opensearch-dashboards-functional-test/releases/tag/v2.4.0-rc2

ashwin-pc · 2022-11-02T10:44:42Z

> Could we try the release notes automation provided by GitHub?
> I love automation, but there is no need to reinvent the wheel.
>
> We already tried it on other repos, and the results are elegant. I really want to know what the concerns are about giving it a try, which could be done in 5 minutes.

My biggest concern here is how useful the changelog is. (FYI, the one we have now isn't much better, but since it's manual, we can have entries that differ a lot from the commit or PR titles.) I went through the actual changelog entries and, as a user of the FTR repo, many of those changes don't make sense to me (except for the new contributors section, that's just chef's kiss). After taking a closer look at all the conversations around this issue, I'm no longer sure we can wash our hands of this with automation alone.

> @bandinib-amzn we have reached agreement that this is a per-repo discussion. That thread is only for OpenSearch. We should focus on driving the best approach for the OpenSearch Dashboards repo.

@seraphjiang, while that thread is for OpenSearch, we face similar problems. In fact, this seems to be a problem faced by other large open-source repos too (see my link at the end). I think the discussion there is very relevant to our situation as well. While I was previously in favor of this proposal, after reading some of the comments there I think we need to ask ourselves: what do we actually want in the changelog?

For example:

- https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/CHANGELOG.md?plain=1#L32-L34: These and many more entries here refer to the same change.
- https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/CHANGELOG.md?plain=1#L95: All MDS features are going out in 2.4 for the first time. These don't make sense as enhancements since there was nothing in 2.3 related to MDS to enhance.
- https://github.com/opensearch-project/OpenSearch-Dashboards/blob/main/CHANGELOG.md?plain=1#L132: This is a regression from a single change that wasn't in any release.

In each of these cases, what we seem to need is less automation, not more, since changelogs are after all for humans and not machines. As devs, we should easily be able to tell which changes were introduced between versions. However, the fundamental flaw in our approach right now is twofold:

- Every PR needs a changelog entry. We really should not be enforcing this, since many changes fix issues introduced in the same version.
- When two or more PRs edit the same section of the changelog file, we have a conflict. This is the big pain point right now.

An interesting solution I came across was implemented and documented by the GitLab team: https://about.gitlab.com/blog/2018/07/03/solving-gitlabs-changelog-conflict-crisis/

I like this approach because it:

- Keeps the changelog human, where we are intentional about what we want to add to the changelog
- Has room for automation (not as sick as this 5 min autogeneration but, IMO, more useful)
- Gets rid of the changelog conflict hell that we are currently facing
- Is already in use by a mature open-source project that faced a similar issue
- Does not need a changelog entry
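
The core idea in that post, as I understand it, is that each change adds its own small entry file under an "unreleased" folder, and the files are compiled into the changelog at release time, so no two PRs edit the same file. Below is a minimal Python sketch of the compile step. The file layout, field names (`title`, `pr`, `type`), section headings, and the flat `key: value` parser are simplifying assumptions for illustration, not GitLab's actual tooling.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from entry type to changelog section heading.
SECTION_TITLES = {"added": "Added", "fixed": "Fixed", "security": "Security"}


def parse_entry(text):
    """Parse a flat 'key: value' file (a simplification of real YAML)."""
    entry = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            entry[key.strip()] = value.strip()
    return entry


def compile_changelog(unreleased_dir, version):
    """Merge one-file-per-change entries into a single changelog section."""
    paths = sorted(Path(unreleased_dir).glob("*.yml"))
    entries = [parse_entry(path.read_text()) for path in paths]
    lines = [f"## {version}"]
    for entry_type, heading in SECTION_TITLES.items():
        matching = [e for e in entries if e.get("type") == entry_type]
        if not matching:
            continue
        lines += ["", f"### {heading}"]
        lines += [f"- {e['title']} (#{e['pr']})" for e in matching]
    return "\n".join(lines)


# Each PR writes its own file, so two PRs never touch the same lines.
with tempfile.TemporaryDirectory() as tmp:
    unreleased = Path(tmp)
    (unreleased / "2652.yml").write_text(
        "title: Apply get indice error handling in step index pattern\n"
        "pr: 2652\n"
        "type: fixed\n"
    )
    (unreleased / "2668.yml").write_text(
        "title: Bump prismjs to 1.29.0 to fix CVE-2022-23647\n"
        "pr: 2668\n"
        "type: security\n"
    )
    SECTION = compile_changelog(unreleased, "2.4.0")

print(SECTION)
```

Because entries are opt-in files, a PR that fixes something introduced in the same release cycle can simply skip adding one.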

dblock · 2022-11-07T18:51:28Z

With this proposal, how can I find out what has been unreleased on 2.x? This is a major ask and a big part of why the manual CHANGELOG was introduced in the first place. If I cannot tell what changed since the last release before the release, I don't have an opportunity to provide input before something ships, which is a core feature of open-source collaboration.

dblock · 2022-11-07T18:55:47Z

@kotwanikunal I believe you've resolved much of the backport conflict problem in OpenSearch, what do you recommend doing here?

kotwanikunal · 2023-01-09T19:13:54Z

> @kotwanikunal I believe you've resolved much of the backport conflict problem in OpenSearch, what do you recommend doing here?

Sorry for the delayed response. I was away for a while.
We resolved most of the issues in the OpenSearch repo by not requiring a changelog entry for every PR.
Release notes need entries for any major changes made by contributors, and one of the key things @andrross pointed out was that PRs can have an M:1 relationship: multiple PRs can belong to one feature, or to any other category for that matter.
This simplification led to a general approach where maintainers reviewing a PR can label it skip-changelog, which turns the workflow into a no-op.
By default the workflow requires a changelog entry, which maintainers can override. This resolved a ton of conflicts across the issues listed above:

- Reduces the burden on the contributor for every PR
- Reduces conflicts when backporting those PRs
- Generates a changelog that is usable and succinct
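
The decision rule described above can be sketched in a few lines of Python. This is an illustration of the logic only, not the actual GitHub Action used in the OpenSearch repo; the function name and the returned messages are hypothetical, while the skip-changelog label and the CHANGELOG.md file come from this thread.

```python
def changelog_check(labels, changed_files,
                    changelog_path="CHANGELOG.md",
                    skip_label="skip-changelog"):
    """Return (passed, reason) for a PR's changelog requirement."""
    if skip_label in labels:
        # A maintainer override turns the check into a no-op.
        return True, "skipped: maintainer applied the skip label"
    if changelog_path in changed_files:
        return True, "ok: changelog entry included"
    return False, "failed: add a changelog entry or ask a maintainer to skip"


# A follow-up fix that a maintainer has marked as not needing an entry.
SKIPPED = changelog_check(["bug", "skip-changelog"], ["src/plugin.ts"])

# A feature PR that updates the changelog as required by default.
UPDATED = changelog_check(["enhancement"], ["src/plugin.ts", "CHANGELOG.md"])

# A PR that neither updates the changelog nor carries the skip label.
MISSING = changelog_check(["enhancement"], ["src/plugin.ts"])

print(SKIPPED, UPDATED, MISSING, sep="\n")
```

Keeping the default as "entry required" and making the skip an explicit maintainer action is what lets M:1 changes share a single entry without weakening the check.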

The only challenge we face right now is that PRs with a changelog entry require conflict resolution when backported. I had the GHA updated before I left for my vacation, and I will get that going as soon as I get a chance, which should also fix this last hiccup.

More on the process here: https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#changelog

ashwin-pc · 2023-01-10T07:35:27Z

Thanks for the context @kotwanikunal. I generally agree with the approach you have taken. We have the same issue with M:1 relationship changes, and I'd add changes that fix issues created in the same release cycle. I too think that far fewer entries actually need to go into the changelog. Some questions I have about your approach:

- How much of a burden is the changelog process now for maintainers, since it's still quite manual?
- How does it work during releases? Are there usually a lot of changes made to the release notes/changelog to clean them up before a release?
- There still seem to be changelog conflicts with this approach. Are there any proposed plans to resolve that issue?

kotwanikunal · 2023-01-23T17:36:18Z

> Thanks for the context @kotwanikunal. I generally agree with the approach you have taken. We have the same issue with M:1 relationship changes, and I'd add changes that fix issues created in the same release cycle. I too think that far fewer entries actually need to go into the changelog. Some questions I have about your approach:
>
> How much of a burden is the changelog process now for maintainers, since it's still quite manual?
>
> How does it work during releases? Are there usually a lot of changes made to the release notes/changelog to clean them up before a release?
>
> There still seem to be changelog conflicts with this approach. Are there any proposed plans to resolve that issue?

- Yes, the process is still manual, since the ultimate goal of the release notes is to be readable by humans.
- During a release, we essentially copy the changelog file into a release notes file and clear out the changelog for that branch, which is a quick process for the release manager.
- I was trying to find an answer for this one, and it looks like we have a way forward. The main changelog needs to maintain a section for the previous major version (as in here). As long as that section is in sync between main and the previous version's branch, the backport moves the entry over without any conflict (sample PR).

joshuarrrr · 2023-04-20T16:44:17Z

Should be considered for opensearch-project/.github#148

dblock · 2023-04-27T20:10:56Z

> With this proposal, how can I find out what has been unreleased on 2.x? This is a major ask and a big part of why the manual CHANGELOG was introduced in the first place. If I cannot tell what changed since the last release before the release, I don't have an opportunity to provide input before something ships, which is a core feature of open-source collaboration.

I still have this question. I don’t see how the automation understands the baseline.

My 0.02c is that commit messages are a non-starter because they can’t be changed. Cue spelling mistakes. We also squash. PR titles would be better.

joshuarrrr · 2023-05-11T16:38:34Z

@ZilongX Thanks for providing this proposal - after considering some of the tradeoffs, we've decided to move forward with this alternative instead: opensearch-project/.github#156

ZilongX added proposal release labels Oct 27, 2022

zhongnansu added the untriaged label Oct 28, 2022

bandinib-amzn removed the untriaged label Nov 1, 2022

ZilongX mentioned this issue Nov 2, 2022

[Release] Add release.yml for release notes automation #2732

Merged

8 tasks

ruanyl mentioned this issue Dec 19, 2022

doc: add developer guide opensearch-project/ml-commons-dashboards#28

Merged

joshuarrrr mentioned this issue Dec 23, 2022

[DevEx] Improve usefulness of commit messages by standardizing formatting and expectations #3129

Open

joshuarrrr mentioned this issue Apr 20, 2023

Request for Proposals: Changelog and release note process opensearch-project/.github#148

Closed

joshuarrrr closed this as completed May 11, 2023
