Full Maven artifact versions comparison algo description

apendragon edited this page Sep 28, 2014 · 3 revisions

Maven versions comparison algo synthesis

The official Apache Maven documentation is a good introduction to the Maven version comparison algorithm. Unfortunately, it is not enough: part of what is described there is wrong or incomplete if you trust the code of ComparableVersion.java.

This page describes how it actually works.

definition

A Maven artifact version is set in the pom.xml file, under the project.version node.

<groupId>com</groupId>
<artifactId>foo</artifactId>
<name>bar</name>
<version>1.0.1-SNAPSHOT</version>

lowercase

The first step in version comparison is to lowercase the version.

’1-SNAPSHOT’ becomes ’1-snapshot’.

items

A Maven artifact version consists of items. They are built while the version string is parsed.

There are 4 item types :

  • integeritem
  • stringitem
  • listitem
  • nullitem

Each item has its own comparison value, which depends on its type and its value.

item separators

Items are separated from each other by a dot ’.’ or by a dash ’-’.

listitems splitting

The dash separator “-” creates a new listitem only if it is preceded by a digit and followed by a digit.
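The rule above can be sketched as a small predicate. This is only an illustration of the rule as stated on this page, not the code of ComparableVersion.java; the function names `dash_opens_sublist` and `sublist_dashes` are hypothetical, and digits are assumed to be ASCII.

```python
DIGITS = "0123456789"


def dash_opens_sublist(version, index):
    """Return True if the dash at `index` would start a new listitem.

    Follows the rule stated in the text: the dash must be preceded
    by a digit and followed by a digit.
    """
    if version[index] != "-":
        return False
    preceded = index > 0 and version[index - 1] in DIGITS
    followed = index + 1 < len(version) and version[index + 1] in DIGITS
    return preceded and followed


def sublist_dashes(version):
    """Positions of every dash that opens a sub listitem."""
    return [i for i in range(len(version)) if dash_opens_sublist(version, i)]


print(sublist_dashes("1-1"))         # [1]
print(sublist_dashes("1-snapshot"))  # []
```

In ’1-snapshot’ the dash is still an item separator, it simply does not open a sub listitem.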

zero appending

After listitem splitting, a zero (’0’) is inserted in each listitem wherever a separator char (dot ’.’ or dash ’-’) has nothing in front of it. Therefore a version that begins with a separator is automatically prefixed with zero.

’-1’ is internally converted to ’0-1’.

’1....1’ is internally converted to ’1.0.0.0.1’.
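A minimal sketch of zero appending, working on the raw string for readability. The helper name `fill_blank_separators` is hypothetical, and treating the start of the string as a blank position is how this sketch models the "version that begins with separator" case.

```python
def fill_blank_separators(version):
    """Insert '0' wherever a separator has no item in front of it."""
    out = []
    previous_is_separator = True  # the start of the string counts as blank
    for char in version:
        is_separator = char in ".-"
        if is_separator and previous_is_separator:
            out.append("0")
        out.append(char)
        previous_is_separator = is_separator
    return "".join(out)


print(fill_blank_separators("-1"))      # 0-1
print(fill_blank_separators("1....1"))  # 1.0.0.0.1
```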

listitem building

After they have been split, all listitems (main and sub listitems) are built. A listitem is just a container of items. For each listitem, a new item is parsed, cast and added for each separator-to-separator span.

item casting

  • if the item has only digits, it is an integeritem
  • if the item contains any other char, it is a stringitem

nullitem

  • an integeritem with value 0 is a nullitem
  • a stringitem with value ’’ (empty string) is a nullitem
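The casting and nullitem rules can be illustrated as follows. Representing an item as a `(type, value)` tuple is an assumption of this sketch, and `cast_item` and `is_nullitem` are hypothetical names.

```python
def cast_item(token):
    """Cast a raw token: digits only -> integeritem, anything else -> stringitem."""
    if token != "" and all(char in "0123456789" for char in token):
        return ("integeritem", int(token))
    return ("stringitem", token)


def is_nullitem(item):
    """An integeritem equal to 0 or an empty stringitem is a nullitem."""
    kind, value = item
    if kind == "integeritem":
        return value == 0
    return value == ""


print(cast_item("12"))                # ('integeritem', 12)
print(cast_item("snapshot"))          # ('stringitem', 'snapshot')
print(is_nullitem(cast_item("0")))    # True
print(is_nullitem(cast_item("rc")))   # False
```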

items comparison matrix

| | integeritem | stringitem | listitem | nullitem |
| integeritem | the highest is greater | integeritem is greater | integeritem is greater | equal if integeritem == 0, otherwise integeritem is greater |
| stringitem | integeritem is greater | see ‘stringitems and qualifiers’, ‘stringitem alias’ | listitem is greater | the nullitem is replaced by the empty qualifier and lexically compared to the stringitem |
| listitem | integeritem is greater | listitem is greater | compare item by item until inequality | the first item of the listitem is compared to the nullitem |
| nullitem | equal if integeritem == 0, otherwise integeritem is greater | the nullitem is replaced by the empty qualifier and lexically compared to the stringitem | the first item of the listitem is compared to the nullitem | never happens |

stringitems and qualifiers

There are special stringitems named qualifiers.

During stringitem comparison, every stringitem is replaced by the value of its qualifier. The resulting values are then compared lexically.

qualifier value
‘alpha’ 0
‘beta’ 1
‘milestone’ 2
‘rc’ or ‘cr’ 3
‘snapshot’ 4
‘’ (empty string) 5
‘sp’ 6
something_else 7-something_else

Then when comparing ’alpha’ with ’beta’, ’0’ is lexically compared to ’1’, which results in LT (-1).

When comparing ’sp’ with ’thing’, ’6’ is lexically compared to ’7-thing’, which results in LT.

And when comparing ’thing’ with ’thatthing’, ’7-thing’ is lexically compared to ’7-thatthing’, which results in GT (1) because ’i’ sorts after ’a’.
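The qualifier table can be turned into a short lookup. This is a sketch built from the table on this page; the names `QUALIFIERS`, `qualifier_value` and `compare_stringitems` are chosen for illustration, and ’cr’ is mapped to ’rc’ up front because the table gives both the same value.

```python
# Position in this list is the qualifier value from the table.
QUALIFIERS = ["alpha", "beta", "milestone", "rc", "snapshot", "", "sp"]


def qualifier_value(stringitem):
    """Return the string that is lexically compared for a stringitem."""
    if stringitem == "cr":
        stringitem = "rc"
    if stringitem in QUALIFIERS:
        return str(QUALIFIERS.index(stringitem))
    # Unknown qualifiers sort after every known one: '7-something_else'.
    return "{}-{}".format(len(QUALIFIERS), stringitem)


def compare_stringitems(left, right):
    """Return -1 (LT), 0 (EQ) or 1 (GT)."""
    a, b = qualifier_value(left), qualifier_value(right)
    return (a > b) - (a < b)


print(compare_stringitems("alpha", "beta"))  # -1
print(compare_stringitems("sp", "thing"))    # -1
print(compare_stringitems("cr", "rc"))       # 0
```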

stringitem alias

Some qualifiers have aliases. An alias is just a placeholder that is replaced by another value for comparison.

simple alias

placeholder substitution
‘cr’ ‘rc’
‘final’ ‘’ (empty string)
‘ga’ ‘’ (empty string)

Thus ’1-final’ ends up as ’1’ once the listitem has been built.

special alias

A special alias must be immediately followed by a digit to be treated as an alias.

placeholder substitution
a alpha
b beta
m milestone

Then ’a1’ is replaced by ’alpha.1’ during listitem building.

Therefore the item ’a1’ becomes two items: ’alpha’ (a stringitem) and 1 (an integeritem).

’m12’ is replaced by ’milestone.12’.

’mchar’ is not replaced, because ’m’ is not immediately followed by a digit.

item splitting

A stringitem cannot contain a digit. When an item is ‘hybrid’, i.e. it contains both chars and digits, it is split into stringitems and integeritems.

Then ’xxx12’ becomes ’xxx.12’: ’xxx’ is a stringitem and ’12’ is an integeritem.

Combined with stringitem aliases, ’m1char’ becomes ’milestone.1.char’, i.e. 3 items.
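Alias substitution and hybrid item splitting can be sketched together on a single token. The function `expand_token` and the two dictionaries are hypothetical names; the token is assumed to be already lowercased and free of ’.’ and ’-’. Integeritems are represented as Python `int` and stringitems as `str`.

```python
import re

SIMPLE_ALIASES = {"cr": "rc", "final": "", "ga": ""}
SPECIAL_ALIASES = {"a": "alpha", "b": "beta", "m": "milestone"}


def expand_token(token):
    """Split one separator-free token into stringitems and integeritems."""
    # Runs alternate between digits and non-digits, e.g. 'm1char' -> m, 1, char.
    runs = re.findall(r"[0-9]+|[^0-9]+", token)
    items = []
    for index, run in enumerate(runs):
        if run[0] in "0123456789":
            items.append(int(run))
            continue
        # A non-digit run that is not the last one is followed by a digit run.
        followed_by_digit = index + 1 < len(runs)
        if followed_by_digit and run in SPECIAL_ALIASES:
            run = SPECIAL_ALIASES[run]
        items.append(SIMPLE_ALIASES.get(run, run))
    return items


print(expand_token("a1"))      # ['alpha', 1]
print(expand_token("mchar"))   # ['mchar']
print(expand_token("m1char"))  # ['milestone', 1, 'char']
```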

Normalization

Normalization is a function that reduces version components.

Its aim is to remove useless version components from the artifact version. To put it simply, ’1.0’ must be internally represented by ’1’ during comparison.

But normalization happens only at specific times during artifact version parsing.

It happens:

  • each time a dash ’-’ separator is preceded by a digit
  • at the end of each parsed listitem

Normalization processes the listitem currently being parsed, from the position reached when normalization is called back to the beginning of that listitem.

Each nullitem encountered is removed, until a non-nullitem is encountered or until the beginning of the listitem is reached (when all its items are nullitems). In this last case, the empty listitem is itself removed, unless it is the main one.

This means that:

  • 1.0.alpha.0 becomes (1,0,alpha) # because normalization is called when parsing of the main listitem ends. The last item is 0, the nullitem of integeritem, so it is removed. The next last item is alpha, which is not a nullitem, so normalization stops.
  • 1.0-final-1 becomes (1,,1) # because a dash is encountered during parsing. It is preceded by a digit, so normalization is called; the last item in the current listitem is 0, so it is removed. final is replaced by ’’, but when the next normalization is called, at the end of parsing, the last item (1) is not a nullitem, so normalization stops before reaching ’’.
  • 0.0.ga becomes () # because ‘ga’ is replaced by ’’ and, when the listitem is normalized at the end, all items are nullitems.
  • final-0.1 becomes (,0,1) # because normalization is not called after the first dash, since it is not preceded by a digit.
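Normalization itself is short once items are available. In this sketch a listitem is a Python `list`, an integeritem an `int`, a stringitem a `str`, and an empty sub listitem counts as a nullitem so that it can be removed from its parent; `is_null` and `normalize` are hypothetical names.

```python
def is_null(item):
    """0, the empty string and an empty sub listitem are nullitems."""
    if isinstance(item, list):
        return len(item) == 0
    return item == 0 or item == ""


def normalize(items):
    """Return a copy of the listitem without its trailing nullitems."""
    result = list(items)
    while result and is_null(result[-1]):
        result.pop()
    return result


print(normalize([1, 0, "alpha", 0]))  # [1, 0, 'alpha']
print(normalize([0, 0, ""]))          # []
print(normalize(["", 0, 1]))          # ['', 0, 1]

# 1.0-final-1: normalization runs at the first dash, then at the end.
head = normalize([1, 0])              # [1]
print(normalize(head + ["", 1]))      # [1, '', 1]
```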

version parsing scheduling

To summarize, during version parsing the steps described above are scheduled in the following order:

  • 1: the whole version is lowercased
  • 2: the version is split into listitems
  • 3: each listitem is split into sub-parts to normalize (remember: dash preceded by digit)
  • 4: each sub-part is zero appended on blank separators
  • 5: each alias placeholder is substituted in sub-parts
  • 6: each hybrid item is split in sub-parts
  • 7: each sub-part is normalized
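The steps can be combined into one parser. The sketch below is an assumption-laden illustration, not the code of ComparableVersion.java: it interleaves the steps in a single left-to-right pass instead of running them as separate passes, represents listitems as nested Python lists, and uses hypothetical names (`parse_version`, `make_item`, `normalize`).

```python
SIMPLE_ALIASES = {"cr": "rc", "final": "", "ga": ""}
SPECIAL_ALIASES = {"a": "alpha", "b": "beta", "m": "milestone"}
DIGITS = "0123456789"


def is_null(item):
    if isinstance(item, list):
        return not item
    return item == 0 or item == ""


def normalize(items):
    """Remove trailing nullitems in place."""
    while items and is_null(items[-1]):
        items.pop()


def make_item(is_digit, text, followed_by_digit=False):
    """Cast a token and apply alias substitution."""
    if is_digit:
        return int(text)
    if followed_by_digit and text in SPECIAL_ALIASES:
        text = SPECIAL_ALIASES[text]
    return SIMPLE_ALIASES.get(text, text)


def parse_version(version):
    version = version.lower()                      # step 1
    main = current = []
    stack = [main]                                 # every listitem opened so far
    is_digit = False                               # type of the token being read
    start = 0                                      # where the current token begins
    for i, char in enumerate(version):
        if char in ".-":
            # A blank separator yields a zero (step 4).
            current.append(0 if i == start else make_item(is_digit, version[start:i]))
            start = i + 1
            if char == "-" and is_digit:
                normalize(current)                 # dash preceded by digit
                if i + 1 < len(version) and version[i + 1] in DIGITS:
                    sub = []                       # dash also followed by digit
                    current.append(sub)
                    current = sub
                    stack.append(sub)
        elif char in DIGITS:
            if not is_digit and i > start:         # hybrid item: chars then digits
                current.append(make_item(False, version[start:i], True))
                start = i
            is_digit = True
        else:
            if is_digit and i > start:             # hybrid item: digits then chars
                current.append(make_item(True, version[start:i]))
                start = i
            is_digit = False
    if len(version) > start:
        current.append(make_item(is_digit, version[start:]))
    while stack:                                   # innermost listitems first
        normalize(stack.pop())
    return main


print(parse_version("1.0-final-1"))  # [1, '', 1]
print(parse_version("1-1"))          # [1, [1]]
print(parse_version("m1char"))       # ['milestone', 1, 'char']
```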

versions comparison

Both versions are parsed, resulting in two listitems.

The two listitems are then compared according to the items comparison matrix, until a comparison returns a nonzero value or until the end is reached in case of full equality.
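The comparison matrix can be sketched as one recursive function over already parsed listitems. The representation is an assumption of this sketch: `int` for integeritem, `str` for stringitem, `list` for listitem, and `None` for the nullitem used as padding when one listitem is shorter than the other. All names are hypothetical.

```python
QUALIFIERS = ["alpha", "beta", "milestone", "rc", "snapshot", "", "sp"]


def qualifier_value(value):
    if value in QUALIFIERS:
        return str(QUALIFIERS.index(value))
    return "{}-{}".format(len(QUALIFIERS), value)


def sign(a, b):
    return (a > b) - (a < b)


def compare_items(left, right):
    """Return -1, 0 or 1 following the items comparison matrix."""
    if left is None:
        return 0 if right is None else -compare_items(right, None)
    if isinstance(left, int):
        if right is None:
            return 0 if left == 0 else 1
        # integeritem beats stringitem and listitem
        return sign(left, right) if isinstance(right, int) else 1
    if isinstance(left, str):
        if right is None:
            return sign(qualifier_value(left), qualifier_value(""))
        if isinstance(right, str):
            return sign(qualifier_value(left), qualifier_value(right))
        return -1  # integeritem and listitem are both greater
    # left is a listitem
    if right is None:
        return compare_items(left[0], None) if left else 0
    if isinstance(right, int):
        return -1
    if isinstance(right, str):
        return 1
    for i in range(max(len(left), len(right))):
        l = left[i] if i < len(left) else None
        r = right[i] if i < len(right) else None
        result = compare_items(l, r)
        if result != 0:
            return result
    return 0


print(compare_items([1, "alpha"], [1]))  # -1, 1-alpha is lower than 1
print(compare_items([1, "sp"], [1]))     # 1, 1-sp is greater than 1
print(compare_items([1, [1]], [1, 1]))   # -1, 1-1 is lower than 1.1
```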