Enhancement to the version sorting (versorted) output #13
Hi Seth, thanks for the great work on this lib!
>>> a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1']
Let's add the
>>> a = ['1.9.9a', '1.11', '1.9.9b', '1.11.4', '1.10.1', '1.11a']
>>> natsorted(a)
['1.9.9a', '1.9.9b', '1.10.1', '1.11', '1.11.4', '1.11a']
I think the '1.11a' should be sorted before the '1.11.4'.
What output would you expect from the following:
>>> a = ['1.11', '1.11.4', '1.11a', '1.11.0', '1.11a.0', '1.11a.4']
I agree that '1.11a' would go before '1.11.4' in some cases, but I am not sure everyone would agree with this 100% of the time. The problem is that versioning for pre-releases is not very strict, so you can get cases like this where the patterns don't match up.
We can investigate why the ordering is the way it is using the output from the
>>> from natsort import natsorted, natsort_keygen
>>> nsk = natsort_keygen()
>>> [nsk(x) for x in natsorted(a)]
[(u'', 1, '.', 11), (u'', 1, '.', 11, '.', 0), (u'', 1, '.', 11, '.', 4), (u'', 1, '.', 11, 'a'), (u'', 1, '.', 11, 'a.', 0), (u'', 1, '.', 11, 'a.', 4)]
Since 'a' comes after '.' in the ASCII table, the '.' is put first. If you wanted to reverse this, you could replace '.' in your strings with something that comes at the end of the ASCII table (such as '~'):
>>> natsorted(a, key=lambda x: x.replace('.', '~'))
['1.11', '1.11a', '1.11a.0', '1.11a.4', '1.11.0', '1.11.4']
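The effect of the replacement can be checked without natsort at all. The following is a minimal sketch using only Python built-ins; the tuples are hand-copied from the key output above, and the variable names are illustrative only.

```python
# Code points of the three characters that decide the ordering.
code_points = {ch: ord(ch) for ch in ".a~"}
print(code_points)  # {'.': 46, 'a': 97, '~': 126}

# Keys for '1.11.4' and '1.11a', copied from the natsort_keygen output.
dotted = ('', 1, '.', 11, '.', 4)
lettered = ('', 1, '.', 11, 'a')
print(dotted < lettered)  # True: '.' (46) sorts before 'a' (97)

# The same two keys after replacing '.' with '~'.
dotted_tilde = ('', 1, '~', 11, '~', 4)
lettered_tilde = ('', 1, '~', 11, 'a')
print(lettered_tilde < dotted_tilde)  # True: 'a' (97) sorts before '~' (126)
```

Tuples compare element by element, so the first position where two keys differ decides the order. Here that position holds the separator character.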
Would this work for your use case? Is this common enough that you think it should be added to the
Thanks for the detailed explanation.
>>> a = ['1.2', '1.2rc1', '1.2beta2', '1.2beta', '1.2alpha', '1.2.1', '1.1', '1.3']
>>> natsorted(a, key=lambda x: x.replace('.', '~'), reverse=True)
['1.3', '1.2.1', '1.2rc1', '1.2beta2', '1.2beta', '1.2alpha', '1.2', '1.1']
I don't think it needs to be added to the API until other people express a need for it too.
Sorry for jumping in here, but I have a problem related to the one @tdruez described in this issue, and it probably does not make sense to open a new issue for it.
I have the same problem, but with release candidates. But I would like to add
If '1.11' were '1.11.0' instead, this would work as expected (assuming you do the '~' trick I suggested). The sorting algorithm doesn't actually understand the input as version numbers; it only separates out the numbers so that things ascend properly. What is happening is that each of the four versions you suggest has '1.11' at the front, so the one with no trailing characters is placed first. Imagine that we replaced '1.11' with 'and', and you will see what I mean.
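As a rough illustration of the 'and' analogy, here is a sketch that uses the built-in `sorted` as a stand-in for `natsorted`. That substitution is an assumption made to keep the example dependency-free; for these particular inputs the numeric splitting should not change the result, because the strings already differ in their letters before any digit is reached.

```python
# '1.11' replaced by 'and' in each of the four version strings.
names = ['and', 'andrc1', 'anda1', 'andb1']

# A string that is a prefix of the others always sorts first.
ordered = sorted(names)
print(ordered)  # ['and', 'anda1', 'andb1', 'andrc1']
```

The bare 'and' comes first for the same reason the bare '1.11' does: nothing follows the shared prefix, so it compares as smaller than every longer string that starts with it.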
To remedy this, you can try something bold like this:
>>> natsorted(['1.11', '1.11rc1', '1.11a1', '1.11b1'], key=lambda x: x + 'z')
['1.11a1', '1.11b1', '1.11rc1', '1.11']
This will tack on the 'z' character to each version, so that you will be sorting
If for some reason this does not work, let me know why and I can try to suggest other ways.
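The 'z' suffix and the '~' replacement suggested earlier can be combined. The sketch below is dependency-free and makes some assumptions: `sketch_key` is a hypothetical, simplified stand-in for a natural-sort key (it is not natsort's own implementation), and `version_key` is an illustrative name. With natsort installed, passing the same lambda as `key=` to `natsorted` would be the analogous call.

```python
import re


def sketch_key(text):
    """Hypothetical stand-in for a natural-sort key (not natsort's code).

    Splits text into alternating string and integer chunks, e.g.
    '1.11a' -> ('', 1, '.', 11, 'a').
    """
    parts = re.split(r'(\d+)', text)
    # Odd positions hold the captured digit runs; convert those to int.
    key = [int(p) if i % 2 else p for i, p in enumerate(parts)]
    if key and key[-1] == '':
        key.pop()  # drop the empty chunk left after a trailing number
    return tuple(key)


def version_key(version):
    # '~' sorts after every letter, and the trailing 'z' sorts after
    # 'a', 'b' and 'rc' but before '~', so a bare release lands between
    # its pre-releases and its patch releases.
    return sketch_key(version.replace('.', '~') + 'z')


versions = ['1.11.4', '1.11', '1.11rc1', '1.10.1', '1.11a1', '1.11b1']
ordered = sorted(versions, key=version_key)
print(ordered)
# ['1.10.1', '1.11a1', '1.11b1', '1.11rc1', '1.11', '1.11.4']
```

This relies on every pre-release tag comparing below 'z'. A tag such as 'zeta' would sort after the bare release, so the approach is a workaround for loosely formatted versions rather than a general version parser.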