versorted should follow SemVer rules #61
Comments
I'm not particularly interested in implementing this myself. Any takers?
I'm somewhat interested in this. But there's a much easier solution for SemVer specifically using the semver package:
>>> a = ['1.0.0-alpha', '1.0.0-alpha.1', '1.0.0-alpha.beta', '1.0.0-beta', '1.0.0-beta.2', '1.0.0-beta.11', '1.0.0-rc.1', '1.0.0']
>>> natsorted(a, key=semver.parse_version_info)
[
'1.0.0-alpha',
'1.0.0-alpha.1',
'1.0.0-alpha.beta',
'1.0.0-beta',
'1.0.0-beta.2',
'1.0.0-beta.11',
'1.0.0-rc.1',
'1.0.0',
]
Perhaps this should just be documented instead? There are many different versioning systems, so you'd likely end up making the API and/or code ugly trying to specifically support them all (or even just the most popular) from within natsort. It could also be supported by creating an algorithm for SemVer and others if needed. Users would still need to specify the algorithm in the call, but natsort would handle any package imports, etc. Edit: Should have used
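For reference, the precedence rules from semver.org that this comment relies on can be sketched with the standard library alone. This is a hand-rolled illustration, not the semver package's implementation; the name `semver_key` and the simplified regular expression are assumptions made for the example, and it only accepts bare version strings.

```python
import re

# Simplified MAJOR.MINOR.PATCH[-PRERELEASE][+BUILD] pattern (illustrative only).
_SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)"
    r"(?:-([0-9A-Za-z.-]+))?"
    r"(?:\+[0-9A-Za-z.-]+)?$"
)


def semver_key(version):
    """Return a tuple that sorts according to SemVer precedence (hypothetical helper)."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a semantic version: {version!r}")
    major, minor, patch, prerelease = match.groups()
    if prerelease is None:
        # A release outranks every pre-release of the same core version.
        pre_key = (1,)
    else:
        # Numeric identifiers compare as integers and rank below alphanumeric
        # ones; a longer identifier list wins when all earlier ones are equal.
        pre_key = (0, tuple(
            (0, int(part), "") if part.isdigit() else (1, 0, part)
            for part in prerelease.split(".")
        ))
    # Build metadata is ignored, as the spec requires for precedence.
    return (int(major), int(minor), int(patch), pre_key)


versions = ['1.0.0', '1.0.0-rc.1', '1.0.0-beta.11', '1.0.0-beta.2',
            '1.0.0-beta', '1.0.0-alpha.beta', '1.0.0-alpha.1', '1.0.0-alpha']
ordered = sorted(versions, key=semver_key)
```

Passing a key like this to the built-in `sorted` is enough for pure version strings; it does nothing for strings that merely contain a version.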
I like this idea, but to be successful I think it needs to handle input that contains versions (e.g. package names with versions), like the list below:
a = [
"package-1.0.0.tar.gz",
"package-1.0.0-alpha.tar.gz",
"package-1.0.0-rc.gz",
"package-1.0.0-alpha.1.tar.gz",
"package-1.0.0-beta.tar.gz",
]
Can the semver package handle this? (The documentation's pretty light, so it's not immediately obvious to me if it does.) Alternatively, users could just be recommended to remove
The semver package only handles version strings. So, it might be possible to find semantic version strings within package and file names (if an algorithm is specified) to help determine the sorting key in some way. The only tricky part might be separating extensions from the end of the version string in some cases. The semver package has a regular expression that might be modified a bit to capture anything preceding and following the version string as well as the version string itself. It really depends on how generalized you want to get. Should it be limited to only things that look like package and file names? Should it be able to support any string as long as there is a semantic version string in it?
Yeah, I don't think there's going to be a reliable way to separate the file extension from dotted pre-release or build sections short of whitelisting extensions.
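The ambiguity described here can be seen in a small sketch. The pattern below is a simplified stand-in, not the semver package's regular expression, and `KNOWN_EXTENSIONS`, `found_version`, and `strip_extension` are hypothetical names introduced only for illustration.

```python
import re

# Simplified search pattern: core version plus an optional dotted pre-release.
VERSION = re.compile(r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?")

# Hypothetical whitelist; longer extensions come first so ".tar.gz" wins over ".gz".
KNOWN_EXTENSIONS = (".tar.gz", ".zip", ".gz")


def found_version(name):
    """Return the first version-like substring, or None if there is none."""
    match = VERSION.search(name)
    return match.group(0) if match else None


def strip_extension(name):
    """Remove a whitelisted extension from the end of the name, if present."""
    for ext in KNOWN_EXTENSIONS:
        if name.endswith(ext):
            return name[: -len(ext)]
    return name


name = "package-1.0.0-alpha.1.tar.gz"
naive = found_version(name)                        # extension is swallowed
whitelisted = found_version(strip_extension(name)) # extension removed first
```

Without the whitelist, the pre-release part absorbs `.tar.gz`, because dots and letters are valid pre-release characters; nothing in the string itself marks where the version ends.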
Yeah, that's why I had lost interest in implementing 😄 I think this can be done using a factory function given to the user so that they can make a custom key. I'll give it some thought and respond later today with an idea of what I am thinking.
What if natsort provided a key-generation function for semver that optionally accepted a regular expression that matches possible suffixes (like file extensions)? This way, the user defines where the semantic version ends. (Instead of a key-generation function, if this were implemented as part of This is a really hard problem. I think that if it is implemented with known limitations, and those limitations are documented clearly, it will be a win.
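One way the key-generation idea could look is sketched below. This is not part of natsort; `make_semver_key` and its `suffix_pattern` parameter are hypothetical, the version pattern is simplified, and build metadata is not handled.

```python
import re


def _prerelease_key(prerelease):
    """Order pre-release identifiers following the semver.org precedence rules."""
    if prerelease is None:
        return (1,)  # a release sorts after all of its pre-releases
    return (0, tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in prerelease.split(".")
    ))


def make_semver_key(suffix_pattern=""):
    """Build a sort key; suffix_pattern tells the key where the version ends."""
    pattern = re.compile(
        r"^(.*?)"                          # 1: anything before the version
        r"(\d+)\.(\d+)\.(\d+)"             # 2-4: major, minor, patch
        r"(?:-([0-9A-Za-z.-]+?))?"         # 5: pre-release, as short as possible
        r"(?:" + suffix_pattern + r")$"    # user-supplied suffix, e.g. extensions
    )

    def key(text):
        match = pattern.match(text)
        if match is None:
            # No version found: fall back to the text with a placeholder version.
            return (text, -1, -1, -1, (0, ()))
        prefix, major, minor, patch, pre = match.group(1, 2, 3, 4, 5)
        return (prefix, int(major), int(minor), int(patch), _prerelease_key(pre))

    return key


names = [
    "package-1.0.0.tar.gz",
    "package-1.0.0-alpha.tar.gz",
    "package-1.0.0-rc.gz",
    "package-1.0.0-alpha.1.tar.gz",
    "package-1.0.0-beta.tar.gz",
]
ordered = sorted(names, key=make_semver_key(r"\.tar\.gz|\.gz"))
```

The known limitation is the one raised in the thread: the result is only as good as the suffix expression the caller supplies, so an unlisted extension ends up inside the pre-release.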
I think that this is really a bigger change. If
I'm sure I've probably forgotten some of the questions/ideas I came up with last night in bed about your idea. But here are my thoughts on these:
Note: Restrictions in my opinions generally include a 'when a version algorithm or the versorted function is used' caveat. But, if I'm not mistaken, the currently supported version algorithm is on by default in natsorted, correct? So, here are some different questions I have: are there people who actually want file names sorted by versioning rather than as file names are sorted by <insert OS/file manager>? Is this a problem we should be worrying about? My gut feeling says that people are looking for OS/file manager sorting for file names, or at least that the other case would be quite rare. I also think we're conflating many different ideas/features into one here. I think the idea of supporting version-based sorting on anything other than version strings is a separate idea from supporting sorting strings with versions based on a specific versioning scheme. Frankly, the current version scheme support isn't technically sorting by the version scheme anyway, hence the documented workaround. I think supporting sorting of version strings based on a version scheme through the use of algorithms is what should be done right now. Maybe it should still be done by taking
I think you have many good points. There's a lot to sift through - apologies if I miss something you felt is important. I think that many of the points you made can be addressed if I give some history of In retrospect, this was a terrible idea. I had many issues filed where In retrospect, this was a terrible idea. Discoverability of this was low, and it is a lot to type. Again, instead of changing the default behavior to what people want 99% of the time, I decided to make it easier to use that algorithm by providing a function called In retrospect, this was a terrible idea. Now there was a function with a name that implies that it treated version numbers specially in some manner, when in fact it was just using a run-of-the-mill algorithm that just happens to work for most version numbers. So, in I don't really like the presence of Every other function within the
I don't think so, because that only works if what a user is sorting is only the version, and if that is the limitation then
I think this is getting a bit too specific. I really don't like the idea of tailoring the algorithm to assume the input data conforms to a particular "shape". Many of the problems I faced early on with this library were because I made assumptions about how the input data looked. So, rather than supporting packages/file names, the way I want to approach the problem is handling arbitrary input where the definition of the number is a version rather than a signed/unsigned float/int.
I think that an optimal solution to finding versions in an arbitrary string would not need a workaround in order to give the correct results.
This. I think these are the correct types of questions to be asking. Consider that you have a folder of distributions of a package, e.g. "foo-1.0.0.zip", "foo-2.0.0.zip", etc. And you want to present them to a user to indicate the available packages they can use, starting from the latest. In this case the sorting would be on more than just the version. Did Perhaps the whole idea of supporting SemVer natively and completely is me looking for a problem where there isn't one. Your suggestion of just using
Just some quick clarifications and conclusion.
Technically, as shown in the existence of that workaround,
The versions in the workaround example are not valid semantic versions, so it couldn't replace the workaround for non-SemVer version strings. I'm not sure that workaround works properly for all semantic versions (or how many cases it actually does solve). I really do (and always did) think this should be a documented example using
Regarding your first point, I think we are both in agreement. There is no handling within The real issue is that this is not called out explicitly in the documentation. I will make sure to do that. As for your second point, I hadn't given it too much thought. The real problem is (as you pointed out in an earlier comment) that there are simply too many version number conventions to be able to reliably handle all of them. The best case scenario is to show users examples of how to handle various version schemes (like the workaround or To avoid confusion, in the next major release I think
The documentation used to give the impression that natsort comprehended versions in a meaningful way. Hopefully that fantasy has been dispelled. This is in response to the discussion in #61.
Resolution:
@thebigmunch Thanks for the discussion!
Minimum, Complete, Verifiable Example
According to https://semver.org/, 1.0.0-alpha < 1.0.0-alpha.1 < 1.0.0-alpha.beta < 1.0.0-beta < 1.0.0-beta.2 < 1.0.0-beta.11 < 1.0.0-rc.1 < 1.0.0. natsort puts 1.0.0 in the wrong place.
Error message, Traceback, Desired behavior, Suggestion, Request, or Question
There is a useful hack to make this work, but that should not be needed for a function called versorted. It should handle this out-of-the-box.
This would be a breaking change, and might require updating the natsort major version.
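The misplacement reported above can be reproduced with a deliberately simplistic natural-sort key. This sketch is an assumption about the general digit-splitting technique, not natsort's actual implementation, and `natural_key` is a hypothetical name.

```python
import re


def natural_key(text):
    """Split text into alternating string and integer chunks for comparison."""
    # re.split with a capturing group keeps the digit runs; they always land
    # on odd indexes, so strings only meet strings and ints only meet ints.
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", text)]


versions = ['1.0.0-alpha', '1.0.0-alpha.1', '1.0.0-alpha.beta', '1.0.0-beta',
            '1.0.0-beta.2', '1.0.0-beta.11', '1.0.0-rc.1', '1.0.0']
result = sorted(versions, key=natural_key)
```

With this key, '1.0.0' sorts first rather than last: after the shared '1.0.0' prefix it contributes an empty string, which compares lower than '-alpha'. The pre-releases themselves keep their relative order.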