Full Maven artifact versions comparison algo description
The official Apache Maven documentation is a good introduction to the Maven version comparison algorithm. Unfortunately, it is not enough, because part of what it describes is wrong or incomplete if you trust the code of ComparableVersion.java.
This page describes how it actually works.
A Maven artifact version is set in the pom.xml file, under the project.version node.
<groupId>com</groupId> <artifactId>foo</artifactId> <name>bar</name> <version>1.0.1-SNAPSHOT</version>
The first step in version comparison is to lowercase the version: ‘1-SNAPSHOT’ becomes ‘1-snapshot’.
A Maven artifact version consists of items. They are built while the version string is parsed.
There are 4 item types:
- integeritem
- stringitem
- listitem
- nullitem
Each item has its own comparison behavior, which depends on its type and its value.
Items are separated from each other by a dot ‘.’ or by a dash ‘-’.
The dash separator ‘-’ creates a new listitem only if it is preceded by a digit and followed by a digit.
After the split into listitems, within each listitem a zero (‘0’) is inserted wherever a separator character (dot ‘.’ or dash ‘-’) has nothing before it.
Therefore a version that begins with a separator is automatically prefixed with zero.
‘-1’ is internally treated as ‘0-1’.
‘1....1’ is internally treated as ‘1.0.0.0.1’.
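The zero insertion described above can be sketched as follows. This is only an illustration written in Python, not the code of ComparableVersion.java; the function name `fill_blank_separators` is hypothetical, and the sketch assumes that a separator at the very end of the version adds nothing.

```python
def fill_blank_separators(version: str) -> str:
    """Insert '0' before every separator that has nothing in front of it."""
    out = []
    previous_was_separator = True  # the start of the string counts as blank
    for ch in version:
        is_separator = ch in ".-"
        if is_separator and previous_was_separator:
            out.append("0")  # blank position: behave as if a zero was written
        out.append(ch)
        previous_was_separator = is_separator
    return "".join(out)


print(fill_blank_separators("-1"))      # 0-1
print(fill_blank_separators("1....1"))  # 1.0.0.0.1
```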
Once the split is done, all listitems (the main one and the sub-listitems) are built. A listitem is just a container of items. For each listitem, the text between two separators is parsed, cast to an item, and added.
- if the item has only digits, it is an integeritem
- if the item has any other character, it is a stringitem
- an integeritem with value 0 is a nullitem
- a stringitem with value ‘’ (empty string) is a nullitem
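The four rules above can be sketched as a small classifier. The function name `classify` and the string labels it returns are hypothetical, chosen only for this illustration; the sketch assumes ASCII digits and a token that has already been lowercased and had its aliases substituted.

```python
def classify(token: str) -> str:
    """Return the item type that a single parsed token would get."""
    if token.isascii() and token.isdigit():
        # Only digits: an integeritem, unless its value is zero.
        return "nullitem" if int(token) == 0 else "integeritem"
    # Anything else is a stringitem, unless it is the empty string.
    return "nullitem" if token == "" else "stringitem"


for token in ["12", "0", "alpha", ""]:
    print(repr(token), "->", classify(token))
```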
item | integeritem | stringitem | listitem | nullitem |
---|---|---|---|---|
integeritem | the higher value is greater | integeritem is greater | integeritem is greater | equal if integeritem == 0, otherwise integeritem is greater |
stringitem | integeritem is greater | see ‘stringitems and qualifiers’, ‘stringitem alias’ | listitem is greater | the nullitem takes the value of the empty qualifier ‘’ and is lexically compared to the stringitem |
listitem | integeritem is greater | listitem is greater | compare item by item until an inequality is found | the first item of the listitem is compared to the nullitem |
nullitem | equal if integeritem == 0, otherwise integeritem is greater | the nullitem takes the value of the empty qualifier ‘’ and is lexically compared to the stringitem | the first item of the listitem is compared to the nullitem | never happens |
There are special stringitems named qualifiers.
During stringitem comparison, every stringitem is replaced by the value of its qualifier, and these values are then compared lexically.
qualifier | value |
---|---|
‘alpha’ | 0 |
‘beta’ | 1 |
‘milestone’ | 2 |
‘rc’ or ‘cr’ | 3 |
‘snapshot’ | 4 |
‘’ (empty string) | 5 |
‘sp’ | 6 |
something_else | 7-something_else |
Then, when comparing ‘alpha’ with ‘beta’, ‘0’ is lexically compared to ‘1’, which results in LT (-1).
When comparing ‘sp’ with ‘thing’, ‘6’ is lexically compared to ‘7-thing’, which results in LT.
And when comparing ‘thatthing’ with ‘thing’, ‘7-thatthing’ is lexically compared to ‘7-thing’, which also results in LT, because at the first differing character ‘a’ sorts before ‘i’. Conversely, comparing ‘thing’ with ‘thatthing’ results in GT (1).
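A minimal sketch of the qualifier substitution and lexical comparison described above. The names `KNOWN_QUALIFIERS`, `qualifier_value` and `compare_qualifiers` are hypothetical; the sketch assumes ASCII input and that ‘cr’ has already been replaced by ‘rc’.

```python
# Position in this list is the qualifier value from the table above.
KNOWN_QUALIFIERS = ["alpha", "beta", "milestone", "rc", "snapshot", "", "sp"]


def qualifier_value(qualifier: str) -> str:
    """Map a stringitem to the string that is compared lexically."""
    if qualifier in KNOWN_QUALIFIERS:
        return str(KNOWN_QUALIFIERS.index(qualifier))
    # Unknown qualifiers sort after all known ones, then by their own text.
    return f"{len(KNOWN_QUALIFIERS)}-{qualifier}"


def compare_qualifiers(left: str, right: str) -> int:
    """Return -1 (LT), 0 (EQ) or 1 (GT)."""
    a, b = qualifier_value(left), qualifier_value(right)
    return (a > b) - (a < b)


print(compare_qualifiers("alpha", "beta"))       # -1
print(compare_qualifiers("sp", "thing"))         # -1
print(compare_qualifiers("thing", "thatthing"))  # 1
```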
Some qualifiers have aliases. An alias is just a placeholder that is replaced by another value for comparison.
placeholder | substitution |
---|---|
‘cr’ | ‘rc’ |
‘final’ | ‘’ (empty string) |
‘ga’ | ‘’ (empty string) |
Then ‘1-final’ is replaced by ‘1’ while the listitem is built.
A special alias must be immediately followed by a digit to be treated as an alias.
placeholder | substitution |
---|---|
a | alpha |
b | beta |
m | milestone |
Then ‘a1’ is replaced by ‘alpha.1’ while the listitem is built.
Therefore the item ‘a1’ becomes 2 items: ‘alpha’ (a stringitem) and ‘1’ (an integeritem).
‘m12’ is replaced by ‘milestone.12’.
‘mchar’ is not substituted, because ‘m’ is not followed by a digit.
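Both alias tables can be sketched with two dictionaries. The names `ALIASES`, `SHORT_ALIASES` and `resolve_alias` are hypothetical, and the caller is assumed to know whether the token is immediately followed by a digit.

```python
ALIASES = {"cr": "rc", "final": "", "ga": ""}
SHORT_ALIASES = {"a": "alpha", "b": "beta", "m": "milestone"}


def resolve_alias(token: str, followed_by_digit: bool) -> str:
    """Return the value used for comparison in place of the token."""
    # One-letter aliases only apply when a digit comes right after them.
    if followed_by_digit and token in SHORT_ALIASES:
        return SHORT_ALIASES[token]
    return ALIASES.get(token, token)


print(resolve_alias("a", True))         # alpha
print(resolve_alias("m", False))        # m
print(repr(resolve_alias("final", False)))  # ''
```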
A stringitem cannot contain a digit. So when an item is ‘hybrid’, i.e. it contains both letters and digits, it is split into stringitems and integeritems.
Then ‘xxx12’ becomes ‘xxx.12’: ‘xxx’ is a stringitem and ‘12’ is an integeritem.
Combined with stringitem aliases, ‘m1char’ becomes ‘milestone.1.char’, i.e. 3 items.
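The splitting of a hybrid item can be sketched with a regular expression that cuts the token at every digit/non-digit boundary. The function name `split_hybrid` is hypothetical, the sketch assumes ASCII digits, and alias substitution (such as ‘m’ to ‘milestone’) is deliberately left out.

```python
import re


def split_hybrid(token: str) -> list:
    """Split a token into alternating runs of digits and non-digits."""
    return re.findall(r"[0-9]+|[^0-9]+", token)


print(split_hybrid("xxx12"))   # ['xxx', '12']
print(split_hybrid("m1char"))  # ['m', '1', 'char']
```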
Normalization is a kind of function that reduces version components.
Its aim is to remove useless version components from the artifact version. To simplify, ‘1.0’ must be internally represented as ‘1’ during comparison.
But normalization happens only at specific times during artifact version parsing.
It happens:
- each time a dash ‘-’ separator is preceded by a digit
- at the end of each parsed listitem
Normalization processes the listitem currently being parsed, from the position reached when normalization is called back to the beginning of that listitem.
Each nullitem encountered is removed until a non-nullitem is encountered, or until the beginning of the listitem is reached if all its items are nullitems.
In this last case, the empty listitem is itself removed, unless it is the main one.
So, for example:
- ‘1.0.alpha.0’ becomes (1,0,alpha), because normalization is called when parsing of the main listitem ends. The last item is 0, the nullitem of integeritem, so it is removed. The next item from the end is alpha, which is not a nullitem, so normalization stops.
- ‘1.0-final-1’ becomes (1,,1), because a dash is encountered during parsing. Normalization is called because the dash is preceded by a digit, and the last item in the current listitem is 0, so it is removed. ‘final’ is replaced by ‘’, but when normalization is next called, at the end of parsing, the last item is not a nullitem, so normalization never reaches ‘’.
- ‘0.0.ga’ becomes (), because ‘ga’ is replaced by ‘’ and, when the listitem is normalized at the end, all items are nullitems.
- ‘final-0.1’ becomes (,0,1), because normalization is not called after the first dash, which is not preceded by a digit.
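The trailing-nullitem removal can be sketched on an already parsed listitem. The names `is_null` and `normalize` are hypothetical; items are modeled as Python `int`, `str` and `list` values, which is an assumption of this sketch. It only shows what one normalization call does, not when the parser calls it.

```python
def is_null(item) -> bool:
    """A nullitem is integer 0, the empty string, or an empty listitem."""
    return item == 0 or item == "" or item == []


def normalize(items: list) -> list:
    """Remove nullitems from the end until a non-nullitem is found."""
    while items and is_null(items[-1]):
        items.pop()
    return items


print(normalize([1, 0, "alpha", 0]))  # [1, 0, 'alpha']
print(normalize([1, "", 1]))          # [1, '', 1]
print(normalize([0, 0, ""]))          # []
```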
To summarize, during version parsing the described steps are scheduled in the following order:
- 1: the whole version is lowercased
- 2: the version is split into listitems
- 3: each listitem is split into sub-parts to normalize (remember: dash preceded by a digit)
- 4: in each sub-part, a zero is inserted on blank separators
- 5: each alias placeholder is substituted in the sub-parts
- 6: each hybrid item is split within the sub-parts
- 7: each sub-part is normalized
Both versions are parsed, resulting in two listitems.
The two listitems are compared according to the item comparison matrix, item by item, until a comparison returns a nonzero value, or until the end in case of full equality.
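The comparison matrix can be sketched as one recursive function over already parsed items. This is an illustration under assumptions, not a port of ComparableVersion.java: items are modeled as Python `int`, `str` and `list` values, `None` stands for a missing item on one side, and the names `QUALIFIERS`, `qualifier_value`, `sign` and `compare` are hypothetical. Parsing itself is not covered.

```python
from itertools import zip_longest

QUALIFIERS = ["alpha", "beta", "milestone", "rc", "snapshot", "", "sp"]


def qualifier_value(q: str) -> str:
    return str(QUALIFIERS.index(q)) if q in QUALIFIERS else f"{len(QUALIFIERS)}-{q}"


def sign(a, b) -> int:
    return (a > b) - (a < b)


def compare(left, right) -> int:
    """Return -1, 0 or 1 following the item comparison matrix."""
    if isinstance(left, int):
        if right is None:
            return 0 if left == 0 else 1
        if isinstance(right, int):
            return sign(left, right)
        return 1  # integeritem beats stringitem and listitem
    if isinstance(left, str):
        if right is None:  # missing item counts as the empty qualifier
            return sign(qualifier_value(left), qualifier_value(""))
        if isinstance(right, str):
            return sign(qualifier_value(left), qualifier_value(right))
        return -1  # integeritem and listitem both beat stringitem
    # From here on, left is a listitem.
    if right is None:
        return compare(left[0], None) if left else 0
    if isinstance(right, int):
        return -1
    if isinstance(right, str):
        return 1
    for l, r in zip_longest(left, right):  # the shorter side is padded with None
        result = -compare(r, l) if l is None else compare(l, r)
        if result != 0:
            return result
    return 0


print(compare([1], [1, "alpha"]))   # 1, i.e. 1 is greater than 1-alpha
print(compare([1, [1]], [1, 1]))    # -1, i.e. 1-1 is lower than 1.1
```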