DM-43586: Add versioning to the fits module #32
readable or not. Any changes to this module, even as trivial as formatting
changes and documentation updates should bump the version number (in this case,
it would bump the patch). This is mandated in GitHub Actions.
"""
I don't think it's obvious how one maps semver to I/O versioning; I think we need to spell out what is and is not supported under major, minor, and patch version bumps (and we might find that those are not sufficiently granular for the guarantees we might want to make, if both forwards- and backwards-compatibility and read/write permutations are in play).
But I also think going to all that effort might be premature, and maybe a single-integer version would be better for now? The most important thing is getting some kind of number into the files so future code doesn't have to guess about what it's dealing with.
I also have to admit that I'm a little skeptical of the value of the GitHub Action and the policy of bumping the patch version for all changes. Is this something you've seen in other codebases?
I'm envisioning the semantic versioning to work in exactly the same way as it does for software. I suppose one assumption here is that the version with which we are reading a file is never older than the version used to write a file, since a write operation must precede a read operation. An example of a major bump would be if we decide to use `ImageHDU`s instead of `BinTableHDU`s and drop the support for the latter. That would be a breaking change rendering the old files unreadable. Examples of a minor bump would include this PR, supporting versions with or without `VERSION` in the header, and substituting a default value if not found for newer items like `day_obs` and `physical_filter` that we just introduced. The changes in all three examples aren't fundamentally different: in one case we decide to drop support, and for the others we don't.
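To make the minor-bump case concrete, here is a minimal sketch of a reader that substitutes defaults for keys older files lack. It uses a plain mapping as a stand-in for a FITS header; `read_metadata`, `DEFAULT_VERSION`, `OPTIONAL_DEFAULTS`, and the default values are hypothetical and not taken from `_fits.py`.

```python
from typing import Any, Mapping

# Hypothetical defaults; the real default values are not stated in this thread.
DEFAULT_VERSION = "0.1"
OPTIONAL_DEFAULTS = {"day_obs": None, "physical_filter": None}


def read_metadata(header: Mapping[str, Any]) -> dict:
    """Collect metadata from a header mapping, filling in defaults
    for keys that files written by older code do not contain."""
    metadata = {"VERSION": header.get("VERSION", DEFAULT_VERSION)}
    for key, default in OPTIONAL_DEFAULTS.items():
        metadata[key] = header.get(key, default)
    return metadata


# A file written before VERSION existed, and one written afterwards
# (the values here are made up for illustration).
old_style = read_metadata({})
new_style = read_metadata(
    {"VERSION": "0.2", "day_obs": 20240101, "physical_filter": "r"}
)
```

Under this sketch both files remain readable; only the amount of recovered metadata differs.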
We still rely on the developer doing the due diligence in deciding the correct version after a change. This is also enforceable but much harder. We could also call `0.x.y` versions unstable in that we don't give any semver kind of guarantee until `1.0.0`. Bumping minor versions for every change would be equivalent to just an integer numbering that you suggested. It seems wasteful to me to discard old data for any missing metadata. So this is definitely something we should support in production, even if not now.
This is ready for another round of review, Jim.
I'm afraid it seems we didn't land on the same page as much as we both thought after talking about it live. I'd still like to at least hope for a bit more forward-compatibility, and I'm still unconvinced about "semantic versioning" and especially "patch versions" being useful concepts here. (Major and minor versions are great, but I think semantic versioning implies a lot more than that.)
python/lsst/cell_coadds/_fits.py
Changes to this module may require a bump to the version number denoted by the
module constant FILE_FORMAT_VERSION. A change to this module without bumping
the version would result in a failure in GitHub Actions to act as a reminder.
We should make this policy explicit; how about:
- when the on-disk format written by this code changes in a way that should not prevent any previous version of the code from reading it (though some new information may be dropped), there should be a minor version bump;
- when the on-disk format written by this code changes in a way that will prevent a previous version of the code from being able to read it, there should be a major version bump.
Having written that I'm still not sure what a patch version bump is good for; maybe "changes that might change the on-disk format slightly but might not?" But that just sounds like an excuse to commit poorly-tested code.
And if we don't need a patch version, I definitely think the phrase "semantic versioning" raises more questions than it answers, though I think that might be the case even if we had a reason for a patch version.
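The two-rule policy proposed above could be expressed as a small check. This is only a sketch under stated assumptions: `FormatVersion`, `parse_version`, and `older_code_can_read` are hypothetical names, and the check covers only older code reading files from newer writers. Whether newer code can read files with an older major version depends on which backward-compatibility code is retained, which this sketch does not model.

```python
from typing import NamedTuple


class FormatVersion(NamedTuple):
    """A two-part file format version (hypothetical helper type)."""

    major: int
    minor: int


def parse_version(text: str) -> FormatVersion:
    """Parse a 'major.minor' string such as '1.3'."""
    major, minor = text.split(".")
    return FormatVersion(int(major), int(minor))


def older_code_can_read(code: FormatVersion, file: FormatVersion) -> bool:
    """Apply the proposed policy for a file written by code at least as
    new as the reader: a minor bump keeps the file readable (some new
    information may be dropped), a major bump does not."""
    return file.major == code.major


same_major = older_code_can_read(parse_version("1.0"), parse_version("1.3"))
new_major = older_code_can_read(parse_version("1.3"), parse_version("2.0"))
```

Here `same_major` is `True` and `new_major` is `False`, matching the two bullets.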
python/lsst/cell_coadds/_fits.py
    return False

if this_version.minor < other_version.minor:
    return False
This feels too pessimistic. It shouldn't be that hard for us to ignore header keys, table columns, and HDUs we don't recognize.
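One way a reader could tolerate a newer minor version is to keep only the entries it recognizes and skip the rest. The sketch below assumes plain mappings stand in for header keys or table columns; `KNOWN_HEADER_KEYS` and `select_known` are hypothetical names, not part of the module under review.

```python
from typing import Any, Mapping

# Hypothetical set of header keys this reader understands.
KNOWN_HEADER_KEYS = {"VERSION", "day_obs", "physical_filter"}


def select_known(items: Mapping[str, Any], known: set) -> tuple:
    """Split a mapping into the entries the reader recognizes and a
    sorted list of the names it is ignoring."""
    kept = {name: value for name, value in items.items() if name in known}
    ignored = sorted(set(items) - known)
    return kept, ignored


# 'NEWKEY' stands for something added by a newer writer.
kept, ignored = select_known(
    {"VERSION": "1.1", "NEWKEY": 5}, KNOWN_HEADER_KEYS
)
```

Returning the ignored names lets the caller log them, so dropped information is at least visible rather than silently discarded.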
My bad. My scant notes of our discussion are partly to blame here. Let's revisit this again next week. I think versioning is important since this is a dataset that is likely to be bulk-downloaded and accessed outside of RSP, so I'd like to get a good versioning scheme going.
I've switched to using just major and minor versions, and removed any mention of semantic versioning. I've expanded on the policy for bumping versions; compatibility across versions is largely put in by hand. I'm hoping to wrap this up, barring any minor implementation changes after today.
I'm still not fully on board with the descriptions of what major and minor version bumps mean, though I do agree with your examples. So I'm hoping we're almost on the same page now.
python/lsst/cell_coadds/_fits.py
module constant FILE_FORMAT_VERSION. The policy for bumping the version is:

1. When the on-disk format written by this module changes such that the
accompanying reader can still read files written by the previous version, then
I think minor bumps are when older readers can still read files written by new writers. As you've described it here, the format is not changing; only the reader code is. If you don't want to try to rigorously support compatibility in that direction (as suggested in the examples), then we could say "old readers may be able to read new files", while major would mean they definitely cannot.
python/lsst/cell_coadds/_fits.py
2. When the on-disk format written by this module changes in a way that will
prevent the reader from reading a previous version of the code, then there
should be a major bump. This holds even if the reader temporarily supports a
previous version with deprecation warnings.
I think this qualifier renders the previous sentence moot. When we switch to writing a new file format but keep backwards-compatibility reader code around, I agree we want a major bump in the version embedded in the file. But it's not because we're declaring at that point that we'll someday remove the backwards-compatibility; it's because we're definitely breaking the ability of old readers to read the new file.
I think it's also true that major version bumps defined this way will tend to have read support deprecated and ultimately removed in future code releases, but there's nothing stopping us from dropping read support for particular minor versions in future code releases as well.
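To illustrate that read support can be deprecated or dropped per version independently of how the version was bumped, here is a rough sketch. The version tuples and the names `DEPRECATED_READ`, `DROPPED_READ`, and `check_read_support` are made up for the example and do not reflect the actual module.

```python
import warnings

# Hypothetical bookkeeping of (major, minor) versions in some future release.
DEPRECATED_READ = {(1, 0), (1, 1)}  # still readable, slated for removal
DROPPED_READ = {(0, 1)}  # read support already removed


def check_read_support(file_version: tuple) -> None:
    """Raise if the file version can no longer be read, and warn if
    read support for it is deprecated."""
    if file_version in DROPPED_READ:
        raise ValueError(f"Reading format version {file_version} is no longer supported.")
    if file_version in DEPRECATED_READ:
        warnings.warn(
            f"Read support for format version {file_version} is deprecated.",
            FutureWarning,
            stacklevel=2,
        )
```

Keying the sets on the full `(major, minor)` pair means a particular minor version can be retired without touching others that share its major version.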
Discussed this live, and we're pretty much on the same page; we'll leave major vs. minor to developer discretion, with intent to possibly deprecate being a strong indication that a major version bump is in order.
Thinking about this more after the call, I agree with your points more strongly than I did during the call. I'll clean up the changes and merge by EOD tomorrow. Thank you for all the discussions, Jim.