Needs canonical examples / reference implementation (including RegExp) #59

Open
coolaj86 opened this Issue · 10 comments

5 participants

@coolaj86

See mojombo/semver#32

  • RegExp in various languages that correctly parse semver
  • Reference implementations

JavaScript RegExp:

/^((\d+)\.(\d+)\.(\d+))(?:-([\dA-Za-z\-]+(?:\.[\dA-Za-z\-]+)*))?(?:\+([\dA-Za-z\-]+(?:\.[\dA-Za-z\-]+)*))?$/
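
A minimal sketch of how such a pattern might be used, assuming the three core groups are plain `\d+` captures for major, minor, and patch. The helper name `parseSemver` and the shape of the returned object are hypothetical and are not taken from semver-utils.

```javascript
// Assumed reading of the posted pattern: group 1 is the whole x.y.z core,
// groups 2-4 are major/minor/patch, group 5 is pre-release, group 6 is build.
const SEMVER_RE = /^((\d+)\.(\d+)\.(\d+))(?:-([\dA-Za-z\-]+(?:\.[\dA-Za-z\-]+)*))?(?:\+([\dA-Za-z\-]+(?:\.[\dA-Za-z\-]+)*))?$/;

// Hypothetical helper: returns the captured parts as strings, or null on no match.
function parseSemver(input) {
  const m = SEMVER_RE.exec(input);
  if (m === null) {
    return null;
  }
  return {
    version: m[1],
    major: m[2],
    minor: m[3],
    patch: m[4],
    prerelease: m[5], // undefined when there is no "-..." part
    build: m[6],      // undefined when there is no "+..." part
  };
}

console.log(parseSemver("1.2.3-alpha.1+build.5"));
console.log(parseSemver("1.2")); // not a full x.y.z string, so no match
```

The parts are left as strings on purpose, so that a caller can decide separately how to treat numbers.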

JavaScript Reference Implementation:

https://github.com/coolaj86/semver-utils

I would suggest having two sections - one for tested Regular Expressions that match and another for modules / reference implementations.

There should be at least one canonical reference implementation of a parser / validator, with an API that would make sense to copy in any language.

@kherge-old

The regular expression needs to be updated to detect the presence of leading zeros.

Unfortunately, I haven't found a way to do that, otherwise I would have provided an updated example.

@coolaj86

Can you give an example of a valid semver string that has leading zeros, and also provide a reference to the documentation that suggests this is allowed (i.e. a phrase or example in the docs)?

If so I'll add your string to the tests in semver-utils and fix it.

@kherge-old

Sorry, I meant to disallow.

The regular expression as used now does not seem to reject version numbers with leading zeros.
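
One way the difference might be illustrated, as a sketch limited to the major.minor.patch core: a `\d+` group accepts leading zeros, while `0|[1-9]\d*` accepts a lone zero or a number that starts with a non-zero digit. The constant names are hypothetical.

```javascript
// Loose core: any run of digits, so "01" is accepted.
const LOOSE_CORE = /^(\d+)\.(\d+)\.(\d+)$/;

// Strict core: either a single "0" or a number that does not start with "0".
const STRICT_CORE = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/;

for (const v of ["1.2.3", "0.0.0", "01.2.3", "1.02.3", "1.2.003"]) {
  console.log(v, "loose:", LOOSE_CORE.test(v), "strict:", STRICT_CORE.test(v));
}
```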

@tbull

We decided only recently that leading zeroes are no longer allowed: mojombo/semver#112
That decision invalidated all regexps developed earlier.

@gvlx gvlx referenced this issue in jeluard/semantic-versioning
Open

Pattern does not follow semver 2.0 #48

@gvlx

Hi,

I have been playing with the regex for version 2.0.0 and came up with this on regex101.

Expanded:

/^
(?'MAJOR'
    0|(?:[1-9]\d*)
)
\.
(?'MINOR'
    0|(?:[1-9]\d*)
)
\.
(?'PATCH'
    0|(?:[1-9]\d*)
)
(?:-(?'prerelease'
    (?:0|(?:[1-9A-Za-z-][0-9A-Za-z-]*))
    (?:\.
        (?:0|(?:[1-9A-Za-z-][0-9A-Za-z-]*))
    )*
))?
(?:\+(?'build'
    (?:0|(?:[1-9A-Za-z-][0-9A-Za-z-]*))
    (?:\.
        (?:0|(?:[1-9A-Za-z-][0-9A-Za-z-]*))
    )*
))?
$/
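
The expanded pattern above uses the `(?'name'...)` group syntax, which JavaScript does not accept. A possible JavaScript rendering, assuming an engine with ES2018 named groups `(?<name>...)`, is sketched below. The constant names are hypothetical. Note that, as written, the no-leading-zero branch applies to every identifier, so an identifier such as `0a` is rejected along with `01`.

```javascript
// Building blocks, kept as raw strings so the backslashes survive.
const NUM = String.raw`0|[1-9]\d*`;
const IDENT = String.raw`(?:0|[1-9A-Za-z-][0-9A-Za-z-]*)`;
const IDENTS = String.raw`${IDENT}(?:\.${IDENT})*`;

// Same structure as the expanded pattern, with JavaScript-style named groups.
const SEMVER_2_0_0 = new RegExp(
  String.raw`^(?<major>${NUM})\.(?<minor>${NUM})\.(?<patch>${NUM})` +
  String.raw`(?:-(?<prerelease>${IDENTS}))?` +
  String.raw`(?:\+(?<build>${IDENTS}))?$`
);

const match = SEMVER_2_0_0.exec("1.2.3-rc.1+build-7");
console.log(match.groups.major, match.groups.prerelease, match.groups.build);

for (const v of ["0.0.0", "01.2.3", "1.2.3-01", "1.0.0-0a"]) {
  console.log(v, SEMVER_2_0_0.test(v));
}
```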

The pre-release and build patterns are very complex because they require the 'no leading zeros' rule.

I can't see any benefit of that over a (very) relaxed pattern such as /[0-9A-Za-z]+([.-][0-9A-Za-z]+)*/, which is just 'dot-or-dash'-separated alphanumeric identifiers (e.g. "00000-aaaaa.bbbbb"). For me, that would be more useful (I usually have to use UUIDs and other mechanical identifiers).

Notice that according to the railroad diagram and the BNF (boy, isn't that hard to read! :confused:) the identifier "0000.0000.0000.0000.------" is valid (leading zeros allowed).

If you can, please supply more edge cases :smile: on the regex101 page (in the first block all cases are valid; in the second, all are invalid).

Happy hacking!

@coolaj86

I'm in the camp to veto the use of leading zeros. In JavaScript (and some other languages) parsers will default to octal, and then your sorting could get all out of whack because suddenly '011' is less than '10' both lexicographically and numerically.

And what about 007 vs 07 vs 7? Numerically they're all the same in base 10 and base 8 so how would you know which version is the "newer" one?
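
The ambiguity described above can be sketched as follows. An explicit radix is passed to `parseInt` so the result does not depend on engine defaults; the helper names `asDecimal` and `asOctal` are hypothetical.

```javascript
// Wrappers keep Array.prototype.map from passing its index as the radix.
const asDecimal = (s) => parseInt(s, 10);
const asOctal = (s) => parseInt(s, 8);

// As strings, "011" sorts before "10" because "0" < "1".
console.log("011" < "10"); // true

// The same text is 11 when read as decimal but 9 when read as octal.
console.log(asDecimal("011"), asOctal("011")); // 11 9

// All three spellings collapse to the same number in either base,
// so numeric comparison cannot tell them apart.
console.log(["007", "07", "7"].map(asDecimal)); // [ 7, 7, 7 ]
console.log(["007", "07", "7"].map(asOctal));   // [ 7, 7, 7 ]
```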

For the love of all that is good on this earth: no leading zeros!!!

@coolaj86

Oh, sorry I missed the part about that being a build number. I'll have to look at the spec, but I don't think the parser I mentioned prohibits this either way.

@gvlx

Hi,

In 2.0.0 the requirement applies to pre-release identifiers but not to build identifiers:

9 A pre-release version (...) Numeric identifiers MUST NOT include leading zeroes. (...)

But on the BNF:

<pre-release identifier> ::= <alphanumeric identifier>
                           | <numeric identifier>

<build identifier> ::= <alphanumeric identifier>
                     | <digits>

<alphanumeric identifier> ::= <non-digit>
                            | <non-digit> <identifier characters>
                            | <identifier characters> <non-digit>
                            | <identifier characters> <non-digit> <identifier characters>

<identifier characters> ::= <identifier character>
                          | <identifier character> <identifier characters>

<identifier character> ::= <digit>
                         | <non-digit>

<non-digit> ::= <letter>
              | "-"

<digit> ::= "0"
          | <positive digit>

The railroad diagram is less clear.

So maybe the text requires some correction.
Added pull request #95

So, version 2.0.1? (patterns allowed on 2.0.0 will still work here).
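
The BNF excerpt quoted above can be sketched as two identifier checks. This is an illustration under stated assumptions rather than a reference implementation: `<numeric identifier>` is not part of the excerpt and is taken here to mean "0" or a digit string that does not start with "0", and all names are hypothetical.

```javascript
// <alphanumeric identifier>: identifier characters with at least one non-digit.
const ALPHANUMERIC = /^[0-9A-Za-z-]*[A-Za-z-][0-9A-Za-z-]*$/;

// <numeric identifier> (assumed definition): "0", or digits with no leading zero.
const NUMERIC = /^(?:0|[1-9][0-9]*)$/;

// <digits>: any non-empty run of digits, leading zeros included.
const DIGITS = /^[0-9]+$/;

function isPrereleaseIdentifier(s) {
  return ALPHANUMERIC.test(s) || NUMERIC.test(s);
}

function isBuildIdentifier(s) {
  return ALPHANUMERIC.test(s) || DIGITS.test(s);
}

for (const id of ["0", "0123", "0a", "------", "alpha"]) {
  console.log(id, isPrereleaseIdentifier(id), isBuildIdentifier(id));
}
```

Under this reading, `0123` is the only sample identifier that differs between the two checks: it is rejected as a pre-release identifier and accepted as a build identifier.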

@gvlx

New regex101 pattern:

^
(?'MAJOR'(?:
    0|(?:[1-9]\d*)
))
\.
(?'MINOR'(?:
    0|(?:[1-9]\d*)
))
\.
(?'PATCH'(?:
    0|(?:[1-9]\d*)
))
(?:-(?'prerelease'
    [0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*
))?
(?:\+(?'build'
    [0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*
))?
$

@FichteFoll

9 A pre-release version (...) Numeric identifiers MUST NOT include leading zeroes. (...)

As I read this, it means that a pre-release identifier like 0123456789 is just not interpreted as a numeric identifier but as an alphanumeric one, and is thus compared lexically instead of numerically.


identifiers consisting of only digits are compared numerically and identifiers with letters or hyphens are compared lexically in ASCII sort order. Numeric identifiers always have lower precedence than non-numeric identifiers.

Never mind, it appears that numeric identifiers with leading zeros are not accepted at all, or at least have no precedence defined, which is pretty much the same.
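
The precedence rule quoted above can be sketched as a comparison function for two pre-release identifiers. The function name is hypothetical. Identifiers with leading zeros fall into the lexical branch here purely as an assumption of this sketch, since the quoted text does not define that case; a stricter implementation might reject them before comparing.

```javascript
// Numeric identifier: "0", or digits with no leading zero.
const NUMERIC = /^(?:0|[1-9][0-9]*)$/;

// Returns -1, 0, or 1, in the style of an Array.prototype.sort comparator.
function compareIdentifiers(a, b) {
  const aNum = NUMERIC.test(a);
  const bNum = NUMERIC.test(b);

  if (aNum && bNum) {
    // BigInt avoids precision loss on very long digit strings.
    const x = BigInt(a);
    const y = BigInt(b);
    return x < y ? -1 : x > y ? 1 : 0;
  }

  // Numeric identifiers always have lower precedence than non-numeric ones.
  if (aNum) return -1;
  if (bNum) return 1;

  // For ASCII input, JavaScript string comparison matches ASCII sort order.
  return a < b ? -1 : a > b ? 1 : 0;
}

console.log(["10", "beta", "2", "alpha"].sort(compareIdentifiers));
```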
