Release Cadence and Long Term Support #115

Closed
rbeyer opened this issue Jul 5, 2020 · 5 comments

Comments

@rbeyer
Member

rbeyer commented Jul 5, 2020

At some of our previous meetings (May 2020 & June 2020), we've talked about the ISIS release schedule, and the support model for those releases. The intent of this Issue is to lay these topics out, and really try and sharpen our concerns and needs so we can determine what changes (if any) we think would be good.

Current Release Schedule

One release per quarter.

Current Support Model

Single advancing front: any bug fixes are simply incorporated into the next quarterly release. So far in the post-ISIS 3.6 era, these have mostly been feature-based releases (major or minor version bumps, few patch-only increments).

Concerns

These are some examples, from some different perspectives, please add yours below.

Individual Science User

Psychologically, as an individual user, seeing this rush of releases is like watching the odometer increase in some movie montage. I'm a pretty heavy ISIS user, and even I haven't been keeping up with the release schedule. As an individual user, who has time to mess with the "overhead" (time to upgrade and test) of upgrading ISIS every 3 months? I don't.

Why is this? I use ISIS as a tool for data processing in my science projects. My work on a project--from start to finish--usually takes more than 3 months (sadly, I wish I were faster). Typically, my work begins with slinging data, and heavy ISIS use, then transitions to data analysis, and finally to writing up the manuscript and sometimes revisiting ISIS to perform some additional work to modify the earlier work to produce figures or tweak some things. If I am in the middle of this cycle, I'm not going to update ISIS. If I used one version of ISIS to produce the early slug of data, and I discover that I need to process one more image, if I had changed ISIS versions, then I would need to go back and re-run all of my earlier data. None of this is impossible, and if I've built my system correctly, that should be easy, but depending on the scale of the data, it could be costly in terms of my human schedule, costly in terms of compute resource, or both.

Certainly, the magic of conda allows me to keep multiple versions of ISIS on my system, and I could keep the old version for that project, and update to a new version to play with or start another project, but I don't. And I don't think the typical user would either.

I typically have some version of ISIS as my "current" version for some period of time, and typically only "upgrade" when I have a mostly clean slate and am ready to start a new project. For me, this is more like a 6 to 9 month pattern.

"User version lag" is a phenomenon that ISIS has had for a while, even pre-3.6. For whatever reasons (I suspect largely due to trust issues), users have always been reluctant to "upgrade" ISIS, often sticking with a particular version for years (!). The unintended consequence is that this reluctance to upgrade means that when they finally do upgrade, it is even more painful, because so many changes have happened. In the current era, this "user version lag" is even worse, as users see versions speeding by, like cars on a busy street, and I suspect that the number of users for any given release is far less than it used to be.

Programmatic Users (ISIS as a software dependency)

I don't know how many of these we have for ISIS, but I can speak for my Ames Stereo Pipeline project. Our code has ISIS as a dependency. We manage about one release a year, and it is typically an all-in-one feature-and-bugfix release. In past years when we have managed two releases (last time was 2016), the additional releases are typically of a bug-fix nature. Depending on what changes are made in the 4:1 ratio of ISIS releases to ASP releases, ISIS can break us (and did this last year, around November, in fact, but we couldn't react to it until now).

I suspect that most, if not all, of the software that depends on ISIS is probably going to have a smaller dev team than ISIS, and just won't be able to keep up with the current release cadence, and may have similar issues as we did on ASP. So we need to devise some system or mechanism to support software that uses ISIS as a dependency that might have a very different release cadence.

Mission Users

ISIS has a lot of these, and they have the longest version "memory" of the ISIS users. Missions hang on to an ISIS version far longer than either of the above categories. Typically, this is a side effect of how missions operate: they tend to freeze all kinds of software systems at some point in time. This is typically done with mission-critical systems, and this thinking carries over into other systems like their ground data processing system (which ISIS is typically a part of). The practice of freezing to a particular ISIS version is based on the decade-long experience of system administrators who, before becoming mission system administrators, were typically system administrators for a research group with science users of ISIS. They witnessed the (past) horrors of upgrading ISIS for their users (or they are new system administrators but have received this "wisdom" from their predecessors), and felt that they had no choice but to freeze, as they simply couldn't afford the person-time to keep current with ISIS. This has the same kinds of consequences as it does for individual users: if a mission lives long enough and is boxed into a corner where it needs to update ISIS, that can be a massive undertaking.

The flip side of this is that period before a mission freezes ISIS. They will typically have funded USGS to develop capability, and they will undergo a period of rapid requests and engagement for change, and will want all the latest and greatest until the hammer comes down and someone tells them (or they decide) to freeze.

Thoughts

We need to have a process that supports all of these kinds of usages and users. I would argue that we are currently not doing a great job of this. Technically, I think the software is in a great place, and all of the changes and growing pains were necessary to get us to the point where we can contemplate these kinds of issues, which are a mix of technical and perceptual.

The new, quarterly release cadence is a wonderful achievement, but I fear that it has left many ISIS users in the dust.

Perhaps shifting to a different kind of cadence would serve these things better? I don't think big swings are needed, but I think small adjustments are in order until things are "just right."

For example, what if we switch to a "feature" release every six months, and "bugfix" releases between them?

This would mean that "release candidates" would be open longer. For those few users that need the bleeding edge, that RC version is available to them to use. For individual users, the speed at which the major and minor version numbers tick by appears to slow down, and at the same time we can begin to train them that if they have the most recent "feature" release, that upgrading to the latest "bugfix" version won't break their stuff.
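
To make the "bugfix upgrades won't break their stuff" idea concrete, here is a minimal sketch. It assumes ISIS versions are read as major.minor.patch, with feature releases bumping the major or minor number and bugfix releases bumping only the patch number. The helper names are hypothetical and not part of ISIS.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string such as '4.1.0'."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def is_bugfix_upgrade(installed: str, candidate: str) -> bool:
    """True when candidate is a later patch of the same feature release."""
    old, new = parse_version(installed), parse_version(candidate)
    same_feature_release = (new.major, new.minor) == (old.major, old.minor)
    return same_feature_release and new.patch > old.patch
```

Under this reading, a user on a given feature release could take any later patch of it without re-validating their pipeline, while a change in the major or minor number signals that testing is warranted.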

How we tick the versions and what kinds of releases we provide is one aspect, but is really just a different way of presenting a single, ever-advancing front. Another aspect, upon which I have only touched lightly, is long-term support. Should we commit to bugfixes on some version for some period of time? This question I'm less sure about given our codebase, our users, and our development resources.

@AndrewAnnex
Member

So I agree with the idea of LTS releases for isis3 that get backports of bug fixes made during the development of RCs (I would call them feature branches) for missions. In the end the same amount of development work would be occurring, but it would simplify the diffs between various releases and would give more confidence to users who don't need the latest and greatest new additions to isis3 for a few years. If a lot of major version bumps are required, it indicates that these changes are necessary to get the project to a more stable API, which can be fine if enough communication is occurring between downstream projects and isis3 developers.

@AndrewAnnex
Member

to follow on my own post, numpy has a framework for dealing with releases that could be inspirational https://numpy.org/neps/nep-0029-deprecation_policy.html

@jessemapel
Collaborator

jessemapel commented Jul 13, 2020

How we tick the versions and what kinds of releases we provide is one aspect, but is really just a different way of presenting a single, ever-advancing front. Another aspect, upon which I have only touched lightly, is long-term support. Should we commit to bugfixes on some version for some period of time? This question I'm less sure about given our codebase, our users, and our development resources.

This is the crux of my concerns. Slowing down the release schedule is not hard to do. Maintaining multiple code bases does take work. From my experience doing ISIS 3 and ISIS 4 releases concurrently, it is not a herculean task, but it takes resources and imposes a lot of requirements on contributors. As the code bases for the two supported versions diverge more and more, the effort required to port bug fixes between them grows. Git cherry-pick is a miracle and helps, but when we have efforts like the conversion to gtest that require moving large amounts of code around, it breaks down quickly. If we move to long term support, then we need a few things:

  1. A system for determining what goes into long term support versions and what doesn't. Do all bug fixes go back? How do we indicate if a PR/issue will be patched into a LTS version?
  2. Contributors need to be responsible for patching their changes into LTS versions. Having someone who may or may not have been involved in the contribution do the patching just before the release has a higher chance of causing issues. It also means that the dev environment for the LTS version "hibernates" between releases. We could have dependencies shift under us, or the data/test data could get out of sync during this time. We need to treat the bleeding edge and LTS versions as equals when it comes to maintaining their environments.
  3. Changelogs. We need to make it very clear to users what changes are in what version. See "Does ISIS have a "history" file?" (DOI-USGS/ISIS3#3941) for some discussion of changelogs in ISIS. I would also look at "ISIS entry page" (DOI-USGS/ISIS3#3948) for some discussion of how we serve changelogs to users.
  4. Split documentation. Currently the ISIS documentation is a mix of bleeding edge and last release. The GitHub documentation is bleeding edge and the website is the last feature release. We need to be able to serve docs for the LTS version, the last release, and current dev. The doc system doesn't support that right now.
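
As one possible shape for the first point, here is a minimal sketch of a label-driven rule for deciding what gets patched into an LTS version. The `backport-lts` label name, the `MergedPR` record, and the `select_backports` helper are all assumptions made for illustration. None of them exist in the ISIS repository or workflow.

```python
from dataclasses import dataclass, field


@dataclass
class MergedPR:
    """A merged pull request, reduced to the fields this sketch needs."""
    number: int
    title: str
    labels: set[str] = field(default_factory=set)


# Hypothetical label a contributor or reviewer applies to request a backport.
BACKPORT_LABEL = "backport-lts"


def select_backports(merged: list[MergedPR]) -> list[int]:
    """Return the numbers of PRs flagged for the LTS branch, lowest first."""
    return sorted(pr.number for pr in merged if BACKPORT_LABEL in pr.labels)
```

A rule like this keeps the decision visible on each PR, so the changelog for an LTS release could be assembled from the same list.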

@AndrewAnnex
Member

AndrewAnnex commented Jul 13, 2020

A lot of this depends on the health of the project and development requirements. For healthier projects it should hopefully be less of an issue, because new features are introduced in ways orthogonal to the core API, i.e. you add a feature that only adds new code and does not modify older code. For example, the change to gtest sounds like a big but beneficial change, but hopefully these big changes are not happening that frequently. If they are, ISIS should just be considered in a state of flux for some time, after which a release that could be considered more stable would be made. This is not an easy problem to solve, and there may not be the resources to solve it.

for the other bullet points:

  1. Dependent on the PR. If a change included in a PR could be considered a bug fix, it should be committed separately in a different PR for easier cherry-picking. The goal is to minimize diffs to new features, so yes, I think if you can be confident a code change is just a bug fix, all reasonable effort should be made to ensure it is backported. I think GitHub labels are a way to indicate this, and it is up to the contributor to open the new PR, and the reviewers/maintainers to do the right things.

  2. Dependencies should be pinned to exact releases or given bounds up to the most recent version that still works ("something >=1.0, <=2.0") for when breaking changes occur. I think the data/test data issue has always been problematic. Would it be possible to ensure files do not move/change and to only ever add new data and new independently run tests?

  3. I use a markdown file, I think it is pretty easy to manage even for larger projects but adds some work to the release process, ideally prs would update it as they get merged in.

  4. I recommend fixing that by switching all the docs over to something like Sphinx. That's a big change, but it could potentially start out in an independent repository to keep it away from the main source (though I don't think that is necessary). This is obviously easier said than done...
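
To illustrate the bounded-pin idea from the second point, here is a minimal sketch of checking whether an installed dependency falls inside a ">=1.0, <=2.0" style range. It assumes purely numeric, dot-separated version strings. The `parse` and `within_bounds` helpers are hypothetical and stand in for whatever the package manager actually does.

```python
def parse(text: str, width: int = 3) -> tuple[int, ...]:
    """Turn '1.4' into (1, 4, 0) so short and long versions compare evenly."""
    parts = [int(piece) for piece in text.split(".")]
    # Pad with zeros; a negative pad count yields an empty list, so longer
    # versions pass through unchanged.
    return tuple(parts + [0] * (width - len(parts)))


def within_bounds(version: str, lower: str, upper: str) -> bool:
    """True when lower <= version <= upper, both ends inclusive."""
    return parse(lower) <= parse(version) <= parse(upper)
```

With inclusive bounds, a 2.0.1 release would fall outside a "<=2.0" pin, so a dependency that shifted during an LTS "hibernation" would be caught by the bound rather than at build time.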

@jessemapel
Collaborator

I am going to close this because much of the discussion has moved into the TC meetings and #124
