What versioning scheme should we use? #335

Closed
krader1961 opened this issue Jan 7, 2018 · 27 comments

Collaborator

@krader1961 krader1961 commented Jan 7, 2018

Now that we are setting up automated builds (see issue #333), it is time to think about the format of the version string. One option is to continue with the current scheme, which does not seem to be documented AFAICT. That is, strings like 93u+ (the version most commonly encountered on a distro) or 93v- (the version at the time the project was open sourced). Presumably the letter was incremented whenever a stable minor release point was reached. I have no idea what the minus or plus symbols represent.

A second option is to switch to semantic versioning. This would use a string of the form major.minor.patch-level; e.g., 93.15.0. When building from any git commit without a version tag the current git commit hash would be appended to the most recent version tag. This is my preferred solution.
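As a rough sketch of the second option, the helper below turns output in the style of `git describe --tags --long` (which has the form `<tag>-<commits since tag>-g<abbreviated hash>`) into a version string. The function name `version_from_describe`, the sample tag and hashes, and the choice of `+` as the separator before the hash are all assumptions made for illustration; the proposal above only says the hash would be appended.

```sh
# Hypothetical helper: derive a version string from describe-style input.
version_from_describe() {
    desc=$1
    hash=${desc##*-g}      # text after the last "-g": the abbreviated hash
    rest=${desc%-g*}       # drop the trailing "-g<hash>"
    count=${rest##*-}      # number of commits since the tag
    tag=${rest%-*}         # the tag itself
    if [ "$count" -eq 0 ]; then
        printf '%s\n' "$tag"              # built exactly at a tagged commit
    else
        printf '%s+%s\n' "$tag" "$hash"   # untagged commit: append the hash
    fi
}

version_from_describe "93.15.0-0-gabc1234"
version_from_describe "93.15.0-12-gdef5678"
```

In a real build the argument would come from running git itself; it is passed in as a string here so the sketch can be tried without a repository.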

There are probably other options as well. Anyone with an opinion please feel free to comment.

Collaborator Author

@krader1961 krader1961 commented Jan 7, 2018

Also, notice the somewhat unusual formatting of the version information:

$ src/cmd/ksh93/ksh --version
  version         sh (AT&T Research) 93v- 2014-12-24
$ echo $KSH_VERSION
Version BIJ 93v- 2014-12-24

This makes using that information to programmatically determine which version is being used, and thus which features are available, more difficult than it needs to be. Which is another reason I favor a simple semantic version string.
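To illustrate the parsing burden, here is a minimal sketch of what a script has to do to pull the release tag and date out of either legacy string. The helper name `legacy_fields` is hypothetical. It takes only the last two whitespace-separated fields because, as a later comment in this thread points out, the feature-letter field (such as `BIJ`) is not always present.

```sh
# Hypothetical helper: print "<release> <date>" from a legacy version string.
# The body runs in a subshell so that `set -f` and `set --` do not leak out.
legacy_fields() (
    set -f          # disable globbing while word-splitting
    set -- $1       # split the string on whitespace
    while [ "$#" -gt 2 ]; do shift; done
    printf '%s %s\n' "$1" "$2"
)

legacy_fields "Version BIJ 93v- 2014-12-24"
legacy_fields "  version         sh (AT&T Research) 93v- 2014-12-24"
```

Both calls print the same two fields, which is about the best a script can do with this format.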

Collaborator

@dannyweldon dannyweldon commented Jan 7, 2018

But if you echo them in an arithmetic context, you get just the plain number:

$ echo $(( KSH_VERSION ))
20141224
$ echo $(( .sh.version ))
20141224

In one way, this makes it easier to check for a particular feature by checking whether this number is >= a particular date stamp. There is no reference available though for when features were added, so I had been making notes of a few from the old email list when I came across them and from the RELEASE file. Perhaps I could start a wiki page for it.
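A portable sketch of that date-stamp check follows. Inside ksh the test would simply be `(( .sh.version >= 20120801 ))`; the helper names `date_stamp` and `released_on_or_after` are made up here so the same idea can be tried in any POSIX shell, with the release date passed in as a string.

```sh
# Turn a release date such as 2014-12-24 into the integer 20141224.
date_stamp() {
    printf '%s\n' "$1" | tr -d '-'
}

# Succeed if the release date ($1) is on or after the required stamp ($2).
released_on_or_after() {
    [ "$(date_stamp "$1")" -ge "$2" ]
}

if released_on_or_after "2014-12-24" 20120801; then
    echo "feature available"
fi
```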

It might be better to keep that as is, both so that existing scripts don't break and because 93 < 2014, and to create a new variable, e.g., .sh.semver. But then should that become an array like $BASH_VERSINFO? Or maybe use $KSH_VERSINFO? (But I don't particularly like the name.)

I don't agree with starting a semver from 93 either. That number should have been changed years ago. There was a mention on the ast-developers list that the internal company politics needed to get it changed were too hard, so they found it easier to just leave it.

I think for major versions:

ksh88 was version 1
ksh93 was version 2
next major release version 3 ?

However, you could argue that there have probably been lots of releases of ksh93 that added major new features, but they were always backward compatible, so you could class them as minor releases. So technically, you could just start the versioning at 2.100.0 (100 being some arbitrarily high number), or perhaps just 3.0.0, to separate it out and because you have made some major changes to the build system, which would affect people building their own builtins.

Contributor

@catull catull commented Jan 8, 2018

ksh88 is related to 1988.
ksh93 itself is linked to 1993.

Is it too far-fetched to say ksh 2018 is in progress ?
Perhaps ksh 2020 is more catchy.

There is also a need to distinguish long and short version info.

$ src/cmd/ksh93/ksh --version
version sh (AT&T Research) 93v- 2014-12-24

Here I would expect something along the lines of

$ src/cmd/ksh93/ksh --version
93v-

$ src/cmd/ksh93/ksh --full-version
ksh (AT&T Research) 93v- 2014-12-24

Argument names can also be: --version-short vs --version-long, leaving --version as is.


@KeithBierman KeithBierman commented Jan 8, 2018

Collaborator Author

@krader1961 krader1961 commented Jan 11, 2018

@KeithBierman, I agree with you, more or less. I say "more or less" because a lot of time has elapsed since the last stable release. And, among other things, this project is now using a new build system. Which itself is a major change. Even if we haven't made any substantive changes to the behavior of ksh the aforementioned changes warrant a change to the major version of the project. If only to signal that there is new management of the project.

@dannyweldon, your comment is interesting inasmuch as I had not explored how the ksh version was exposed within a ksh script. But my point remains that the output of ksh --version is basically unusable.


@hlangeveld hlangeveld commented Jan 11, 2018

Collaborator Author

@krader1961 krader1961 commented Jan 12, 2018

> Throughout all of this time, the name of the product in legal speak remained "ksh93", even after over 20 major revisions.

Yes, that matches my recollection as an employee of Sequent Computer Systems for two decades. And it's one reason why I think we should revert to just "ksh" everywhere other than where we want to make it explicit this work is based on what has long been known as ksh93. And that means dropping the "93" from the version number, but possibly keeping it as a parenthetical in the long form version string.

Also, I still feel our first stable release should be viewed as a major release. Even if we don't add a major new feature. The reason is that we are pruning (i.e., removing) some experimental features like SHOPT_FIXEDARRAY and features of dubious value and correctness like SHOPT_AUDIT. That alone warrants changing the major component of the version number from "93" to something else.

Collaborator Author

@krader1961 krader1961 commented Jan 12, 2018

Also, the fact you can mutate what should be read-only attributes is itself a problem in my opinion:

> Interestingly, I find I can just do stuff like:
>
>     .sh.version.release="2012-08-01"
>     .sh.version.compact="93u+"

Those values should be considered immutable and the implementation should not allow changing them. My opinion is obviously going to be controversial. Consider bash:

bash-4.4$ echo $BASH_VERSION
4.4.12(1)-release
bash-4.4$ BASH_VERSION=wtf
bash-4.4$ echo $BASH_VERSION
wtf

I think allowing mutating such a fundamental attribute is wrong and can potentially lead to correctness, if not security, problems. Under what circumstances would you consider it acceptable to alter the output of bash --version without running a different version of bash?
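For comparison, POSIX shells already have a mechanism for this: a variable marked `readonly` rejects later assignments. The sketch below uses `DEMO_VERSION`, a hypothetical stand-in, since whether the shell's own version variables should be protected this way is exactly the question being raised.

```sh
# DEMO_VERSION is a hypothetical stand-in for a shell's version variable.
DEMO_VERSION='2018.0.0'
readonly DEMO_VERSION

# Try the reassignment in a subshell so a failure cannot abort this script.
if ( DEMO_VERSION=wtf ) 2>/dev/null; then
    echo "reassignment allowed"
else
    echo "reassignment rejected"
fi
echo "$DEMO_VERSION"
```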

Collaborator

@dannyweldon dannyweldon commented Jan 12, 2018

> Those values should be considered immutable and the implementation should not allow changing them.

Agreed. I checked ksh and it has similar issues.

> Hey, I could write a ksh library module that would extract these strings from the legacy version strings

Thanks, Henk. They probably should be implemented in the ksh binary itself though. Unless that is what you meant?


@hlangeveld hlangeveld commented Jan 13, 2018

@krader1961 I think we vehemently agree.
Note that these substrings do not exist currently...
I would not want to change an existing version string during operation.

This was meant more as an example of how we could extend the existing
version string and turn it into a struct of sorts.

And we can do better, and make them readonly:

function _ksh_set_version {
  typeset -a A
  A=(${.sh.version})
  readonly .sh.version.prefix=${A[0]}
  readonly .sh.version.legacy=${A[1]}
  readonly .sh.version.compact=${A[2]}
  readonly .sh.version.release=${A[3]}
}

Think of this extension as akin to the various uname options to extract particular release info, vs. uname -a that prints the entire string.

Creating these subfields does not change the original version string at all.

By adding the readonly we can make sure this only happens once.
Heck, I'm tempted to add this to .sh.version itself...

Unfortunately, there is no global init file for non-interactive shells, but a shell programmer can include this little piece in a standard library* for ksh.

Collaborator

@siteshwar siteshwar commented May 10, 2018

Summarizing my discussion with @kdudka about this topic:

I am fine with changing the versioning scheme to semantic versioning. Existing variables in ksh that show version numbers should be left unchanged to avoid breaking any existing scripts, and we should introduce new variables to match the new scheme. Switching to a new scheme should not break the upgrade paths of distros, as there are ways to enforce package updates, e.g., the epoch attribute in .spec files.

What the major version number of the next release should be is an open question. But there should be a way for users to relate it to the language version (ksh93, that is), so maybe we should start with ksh-93.0.0.

Collaborator Author

@krader1961 krader1961 commented May 21, 2018

In light of the discussion in PR #544 (generate version from git state) it would appear the safest course of action is for the major number of the next release to be the year (e.g., 2018). That is because at present using variable .sh.version in a numeric context returns the date stamp as a simple integer:

$ echo $(( .sh.version ))
20120801

So that comparisons like if (( .sh.version >= 20120801 )) yield the expected result, the simplest solution is for a semver value to use the year as the major component. Thus we would do something like this when ready to announce to the world that a new ksh release is ready for general use:

$ git tag -a 2018.0.0 $commit_id
Contributor

@kdudka kdudka commented May 21, 2018

I think you could introduce upstream versioning in the form year.minor.patch but extract a timestamp out of the tagged commit to get a string compatible/comparable with 20120801 for the output of $((.sh.version)) to keep existing scripts working.

Collaborator

@siteshwar siteshwar commented May 21, 2018

Also, using the year as the major version limits us to one major release per year, but that should be acceptable in practice.

Collaborator Author

@krader1961 krader1961 commented May 22, 2018

I don't think there is any value in having .sh.version evaluated in a numeric context be based on a different value than that shown in a non-numeric context. Quite the opposite. AFAICT the only reason the AST/ksh team did that was to retain the legacy "93" major version number in the string form while having something easier to compare in a numeric context. So that they didn't have to figure out how to convert "93u" and "93v" into numbers that could be compared. Semantic versioning makes that a non-issue.

Too, the major number doesn't strictly speaking have to be the year the release occurred. It only has to be greater than "2014" (the year the 93v beta release occurred). So we could start with "2015.0.0" and increment it whenever a major release occurs. Having said that it clearly will be less confusing if we at least start with "2018" or whatever year the next release based on this code occurs. Too, the probability there would be two major releases in the same calendar year is so close to zero that concern should be a non-issue. In fact, I would consider it a significant issue if there were two major releases in a 12 month period.

Contributor

@kdudka kdudka commented May 22, 2018

Collaborator Author

@krader1961 krader1961 commented May 22, 2018

My proposal does preserve backward compatibility while also simplifying matters. Note that the semantic major number does not have to be the current year. It only has to be greater than the year 93v was released (2014) so that a test like if (( .sh.version >= 20120801 )), used to verify the shell version is at least 93u, still yields the correct result.

Without googling the answer or running ksh what does x need to be in if (( .sh.version >= x )) to verify the shell is version 93t or newer? What are the associated dates for 93s, 93r, etcetera? With my proposal that stops being a source of confusion and possible bugs. There is just one semantic version that can be represented as a string (e.g., 2018.0.0) or an integer (e.g., 20180000). It also means that if you want to know whether you're running the first new major release or one that has had at least one minor or patch release you simply check the minor or patch numbers are non-zero. Something you can't do with the current scheme.
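One way to picture the string-to-integer mapping described above is the sketch below. The helper names `semver_to_int` and `has_point_release` are hypothetical, and the arithmetic assumes the minor and patch numbers each stay below 100 and are written without leading zeros; none of that is settled in this thread.

```sh
# Convert MAJOR.MINOR.PATCH into one integer, e.g. 2018.0.0 becomes 20180000.
# The body runs in a subshell so the IFS change does not leak to the caller.
semver_to_int() (
    IFS=.
    set -f
    set -- $1            # split on dots into $1 $2 $3
    echo $(( $1 * 10000 + $2 * 100 + $3 ))
)

# Succeed if the version has a non-zero minor or patch component.
has_point_release() {
    [ $(( $(semver_to_int "$1") % 10000 )) -ne 0 ]
}

semver_to_int 2018.0.0
[ "$(semver_to_int 2018.0.0)" -ge 20120801 ] && echo "newer than 93u+"
has_point_release 2018.1.0 && echo "2018.1.0 is a point release"
```

Because the year leads, the integer form stays larger than any legacy date stamp, so existing threshold checks keep working.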

Too, what do we do when four more releases have occurred and the short version string is 93z? Do we start doubling up the letters; e.g., 93aa or 93za? We might as well bite the bullet now and break with that versioning convention.

Contributor

@kdudka kdudka commented May 23, 2018

Collaborator

@siteshwar siteshwar commented Jun 18, 2018

@krader1961 If I understand correctly, your proposal is to use 2018 as the major version number in the string, with .sh.version set to 20180000 so that it retains backward compatibility. I don't see any issues with such a change.

Collaborator Author

@krader1961 krader1961 commented Jun 18, 2018

@siteshwar Correct. The main idea is to stop having two distinct versions: the legacy 93x form and the datestamp that is used when .sh.version is used in a numeric context like (( .sh.version >= 20120101 )). A single semantic version whose major number is greater than the year of the previous release retains backward compatibility and simplifies matters. It doesn't even have to be 2018.0.0. Depending on whether or not we consider 93v to be an official release we could use 2013.0.0 (since 93u+ has version 2012.08.01) or 2015.0.0 (since 93v- has a nominal release date of 2014.12.24). But we might as well use the current year when the next stable release occurs.

Collaborator

@siteshwar siteshwar commented Jul 24, 2018

@krader1961 In the last stable version of ksh, the value of KSH_VERSION is set to:

Version AJM 93u+ 2012-08-01

Are you proposing to remove 93u+ from this value? It may be a backward-incompatible change, but I think removing it will avoid any future confusion about versioning schemes.

Collaborator Author

@krader1961 krader1961 commented Jul 24, 2018

@siteshwar Why would replacing 93u+ in the value of KSH_VERSION with something like 2018.0.0 be backward incompatible? If 93v- had become a production release that string would have been something like Version AJM 93v+ 2014-12-24. The only safe way to check if you're using a ksh version that meets some minimum release is to check the numeric form; e.g., (( .sh.version >= 20120101 )) if you want to ensure you're using 93u+ or later. There probably are scripts doing checks like [[ $KSH_VERSION == *93u* ]]. But those would be broken by any new release; even if we kept the original versioning scheme.

Collaborator

@siteshwar siteshwar commented Jul 25, 2018

> Why would replacing 93u+ in the value of KSH_VERSION with something like 2018.0.0 be backward incompatible?

What would $KSH_VERSION look like after your proposed change ? I presume it would change from something like:

Version AJM 93u+ 2012-08-01

to

Version AJM 2012.08.01

so the format of the string has changed. If a script gets the version number of ksh through commands like:

echo $KSH_VERSION | cut -f4 -d' '

then it is going to break.

> The only safe way to check if you're using a ksh version that meets some minimum release is to check the numeric form

You are right. But people do strange things while writing scripts. For example, in @dannyweldon's example KSH_VERSION is evaluated in an arithmetic context to get the release number, so I am concerned that this change may break some scripts.

Collaborator Author

@krader1961 krader1961 commented Jul 25, 2018

If they are using cut in that manner they are already broken by any system that builds ksh with no SHOPT_ controlled features enabled. Because in that case the AJM in your hypothetical example will not be present. See the definition of e_version[].

That the new format of the final token uses periods instead of dashes does not, itself, cause any problems. Consider your example of someone extracting the final token from the version string and doing a comparison to see if the ksh version is a particular version. Using just ksh, no external commands, you can only do equal/not-equal comparisons of the version string. So to see if it is ksh93u+ you do the equivalent of [[ $version_date == '2012-08-01' ]]. That the new semantic version uses periods and non-zero-padded values does not affect the outcome. It only causes a problem if they are using an external command such as expr to perform a string ordering test.
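The string-ordering problem with non-padded values can be seen in a short sketch. The version numbers are invented for illustration, and `semver_lt` is a hypothetical helper showing a component-wise numeric comparison as the alternative.

```sh
# String ordering misjudges non-padded components: "10" sorts before "9".
LC_ALL=C expr "2018.10.0" \< "2018.9.0"    # prints 1 (true), the wrong answer

# Hypothetical helper: succeed if version $1 is numerically older than $2.
# The body runs in a subshell so IFS and the exit calls stay contained.
semver_lt() (
    IFS=.
    set -f
    set -- $1 $2           # six fields: major minor patch of each version
    [ "$1" -lt "$4" ] && exit 0; [ "$1" -gt "$4" ] && exit 1
    [ "$2" -lt "$5" ] && exit 0; [ "$2" -gt "$5" ] && exit 1
    [ "$3" -lt "$6" ]
)

semver_lt 2018.9.0 2018.10.0 && echo "2018.9.0 precedes 2018.10.0"
```

Equality tests are unaffected, which is why only ordering tests done on the raw string are at risk.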

The solution to that scenario, unlikely though it might be, is to embed the semantic version twice. For example:

Version AJM 2018.0.0 2018-00-00

The second value being algorithmically derived from the first so that we only have a single semantic version number we have to worry about assigning to a commit as a tag.

Collaborator

@siteshwar siteshwar commented Jul 25, 2018

> The solution to that scenario, unlikely though it might be, is to embed the semantic version twice. For example:
> Version AJM 2018.0.0 2018-00-00

It seems unreasonable to set the version string this way just for the sake of compatibility. I am fine with setting it to:

Version AJM 2018.0.0
@siteshwar siteshwar self-assigned this Aug 16, 2018
Collaborator

@siteshwar siteshwar commented Aug 24, 2018

We have switched to semantic version numbering upstream. The $KSH_VERSION variable is now set to:

Version A 2017.0.0-devel-1535-g7c33a1cd

and in arithmetic context it's evaluated to:

$ echo $((KSH_VERSION))
20170000

Evaluating version number in arithmetic context maintains backward compatibility, so this should not break any scripts.

siteshwar added a commit to siteshwar/ast that referenced this issue Aug 25, 2018
Related: att#335
siteshwar added a commit to siteshwar/ast that referenced this issue Aug 25, 2018
Version number string should be converted to number only once as it is
not going to change in subsequent calls of `nget_version`.

Related: att#335
siteshwar added a commit that referenced this issue Aug 26, 2018
Related: #335
siteshwar added a commit that referenced this issue Aug 27, 2018
Version number string should be converted to number only once as it is
not going to change in subsequent calls of `nget_version`.

Related: #335
siteshwar added a commit that referenced this issue Aug 27, 2018
Related: #335
Collaborator Author

@krader1961 krader1961 commented Sep 5, 2018

Closing since @siteshwar has implemented the core of my proposal. We're now using a semantic version with the major number (currently) being 2017 and using .sh.version in a numeric context compares correctly with versions from older releases.

@krader1961 krader1961 closed this Sep 5, 2018
siteshwar added a commit to siteshwar/ast that referenced this issue Sep 6, 2018
in contributing doc.

Related: att#335
siteshwar added a commit that referenced this issue Sep 7, 2018
in contributing doc.

Related: #335