Set which OS to sort by in os_sorted #150

Open
PhillipMaire opened this issue Apr 24, 2022 · 8 comments

@PhillipMaire

Describe the feature or enhancement
Add an optional input to os_sorted that allows sorting based on any operating system, even when you are not running on that operating system. For example, os_sorted(my_list, force_os='windows') would sort the way a Windows machine does even when run on Unix/Mac.

Provide a concrete example of how the feature or enhancement will improve natsort
If people need to replicate results on one operating system using code written on another operating system that uses os_sorted, then this would be useful. I have a project where I work on cloud drives across machines and happen to use os_sorted, and I would love to replace some stuff that already integrates os_sorted. Alternatively, could you show me how to do this using os_sort_keygen, or some other method?

Would you be willing to submit a Pull Request for this feature?
I don't have the experience to do this, so I am sorry, but I can't help here.

Thank you for your help and for this useful package.

@SethMMorton
Owner

This is what I had wanted to do originally, and why I sat on #41 for so many years. Unfortunately, as far as I can tell this is not possible.

In order to do the OS sorting on Windows, natsort literally loads the low-level Windows function that is responsible for deciding how to sort directory components. This function is simply not available on non-Windows operating systems. It also appears that the sort order of Windows Explorer is proprietary, and so I do not have a way to re-implement it.

If you can find anything that refutes either of my findings then I would love to have this implemented.
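For readers following along, the mechanism described above can be sketched roughly as follows. This is a minimal illustration using `ctypes` and `functools.cmp_to_key`, not necessarily natsort's exact code; the function name `load_explorer_sort_key` is hypothetical. It shows why the approach cannot work off Windows: the comparison function lives in a Windows system DLL (`shlwapi`), so on any other platform there is simply nothing to load.

```python
import sys
from functools import cmp_to_key
from typing import Callable, Optional


def load_explorer_sort_key() -> Optional[Callable]:
    """Return a sort key backed by Windows' StrCmpLogicalW, or None elsewhere.

    Hypothetical helper for illustration only.
    """
    if sys.platform != "win32":
        # shlwapi.dll does not exist outside Windows, so nothing can be loaded.
        return None

    import ctypes
    from ctypes import wintypes

    # StrCmpLogicalW takes two wide strings and returns <0, 0, or >0.
    compare = ctypes.windll.shlwapi.StrCmpLogicalW
    compare.argtypes = [wintypes.LPCWSTR, wintypes.LPCWSTR]
    compare.restype = ctypes.c_int

    # Wrap the C comparison function so it can be passed as sorted(key=...).
    return cmp_to_key(compare)
```

On Windows the returned key can be passed as `sorted(names, key=load_explorer_sort_key())`; on other systems the helper returns `None` and the caller has to fall back to something else.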

@PhillipMaire
Author

I see. No, I don't have anything to refute your findings, but have you seen this post? (I assume the answer is yes, but just in case.)

> Explorer uses the API StrCmpLogicalW() for this kind of sorting (called 'natural sort order').
>
> You don't need to write your own comparison function, just use the one that already exists.
>
> A good explanation can be found here.

That being said, would it be possible to reverse engineer? You would have to want it bad enough, haha, but in theory, if you have a Windows system and you generate files, you could use a classifier to make your own algorithm that effectively sorts the same way Windows does. Just a thought, though.

@SethMMorton
Owner

SethMMorton commented Apr 25, 2022

StrCmpLogicalW() is what natsort is using under-the-hood on Windows.

It certainly could be reverse engineered, but unless I am getting paid to do it I am not interested in spending my nights and weekends doing so (especially given that I don't readily have access to a Windows machine).

I would happily accept a PR from someone else who does want to spend their free time doing that.

@PhillipMaire
Author

Totally get that! Did you see this comment on issue 41? It seems like WINE implements a version of StrCmpLogicalW meant for UNIX.

@SethMMorton
Owner

I went down the rabbit hole a bit, and found that there is quite a bit of code behind that collation function.

If you go down the rabbit hole for CompareStringW (which is what we care about) you end up at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/locale.c#L2495 which has a fair bit of logic in it. One of the bits of logic involves using bitshifts to access the collation table at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/collation.c... that's all pretty messy.

Though, the function at https://github.com/wine-mirror/wine/blob/e909986e6ea5ecd49b2b847f321ad89b2ae4f6f1/dlls/kernelbase/locale.c#L2131 looks more promising in terms of being useful in the scope of natsort. Not sure how easily that could be ported. It would still involve that 11K element collation array...
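To give a feel for what "using bitshifts to access the collation table" means in general, here is a generic two-stage lookup table sketch. It is not Wine's actual table layout: the 8-bit block size, the function names, and the weights are all assumptions made up for illustration. The idea is that the high bits of a character code select a block and the low bits select an entry within it, so sparse per-character data can be stored compactly.

```python
from typing import Dict, List, Tuple

BLOCK_BITS = 8                 # assumed block size: 256 entries per block
BLOCK_SIZE = 1 << BLOCK_BITS
BLOCK_MASK = BLOCK_SIZE - 1


def build_table(
    weights: Dict[int, int], default: int = 0
) -> Tuple[List[int], List[List[int]]]:
    """Pack sparse per-code-point weights (BMP only) into (index, blocks)."""
    # Block 0 is a shared "all default" block for ranges with no data.
    blocks: List[List[int]] = [[default] * BLOCK_SIZE]
    index = [0] * (0x10000 >> BLOCK_BITS)
    for code_point, weight in weights.items():
        high = code_point >> BLOCK_BITS          # which block
        if index[high] == 0:
            blocks.append([default] * BLOCK_SIZE)
            index[high] = len(blocks) - 1
        blocks[index[high]][code_point & BLOCK_MASK] = weight  # slot in block
    return index, blocks


def lookup(index: List[int], blocks: List[List[int]], code_point: int) -> int:
    """Fetch a weight using one shift and one mask."""
    return blocks[index[code_point >> BLOCK_BITS]][code_point & BLOCK_MASK]
```

A real collation table stores much richer weights per character, which is where most of the complexity in porting would come from.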

@PhillipMaire
Author

PhillipMaire commented Apr 25, 2022

I see, I see, thanks for the effort! I had never heard of bitshifts before, interesting!

OK, well, I will leave this request open in case you or anyone else finds it and wants to implement it, but I won't expect it, for the reasons you mentioned above. Thanks again for your package; it is appreciated.

@PhillipMaire
Author

Oh, one more thing, just a thought: someone could also implement a Windows-ish and a UNIX-ish sort that gives users the desired results in most cases (excluding some edge cases), with the method hardcoded so that it is reproducible across operating systems.
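As a rough illustration of what such a hardcoded "Windows-ish" ordering might look like, here is a minimal pure-Python sketch. It is an approximation under stated assumptions (case-insensitive comparison, digit runs compared as numbers, digits ordered before text), not Windows Explorer's real algorithm, and the name `windows_ish_key` is hypothetical.

```python
import re
from typing import List, Tuple, Union

_DIGIT_RUNS = re.compile(r"(\d+)")


def windows_ish_key(name: str) -> List[Tuple[int, Union[int, str]]]:
    """Build a platform-independent natural-sort key for a file name."""
    key: List[Tuple[int, Union[int, str]]] = []
    # Splitting on a capturing group leaves digit runs at the odd positions.
    for position, part in enumerate(_DIGIT_RUNS.split(name.casefold())):
        if position % 2:
            key.append((0, int(part)))   # numeric chunk, compared by value
        elif part:
            key.append((1, part))        # text chunk, compared case-insensitively
    return key
```

Because the key depends only on the string itself, `sorted(names, key=windows_ish_key)` gives the same order on every operating system, at the cost of diverging from the real Explorer order on edge cases.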

@SethMMorton
Owner

> Oh, one more thing, just a thought: someone could also implement a Windows-ish and a UNIX-ish sort that gives users the desired results in most cases (excluding some edge cases), with the method hardcoded so that it is reproducible across operating systems.

If one just uses natsorted with alg=ns.PATH, they will get Windows-ish and UNIX-ish results for >90% of the data they would want to sort, and it will be reproducible. Tossing in ns.LOCALE will bring that closer to >95% of the data.
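A minimal sketch of that suggestion, assuming natsort is installed (`pip install natsort`). The wrapper name `sort_paths_portably` and its `use_locale` flag are hypothetical conveniences; `natsorted`, `ns.PATH`, and `ns.LOCALE` are natsort's own names. The import is kept inside the function only so the snippet can be loaded where natsort is absent.

```python
from typing import List


def sort_paths_portably(paths: List[str], use_locale: bool = False) -> List[str]:
    """Sort paths with natsort's PATH algorithm, independent of the host OS."""
    from natsort import natsorted, ns  # third-party package

    # ns flags combine with "|"; LOCALE makes the result depend on the
    # locale configured in the running process.
    alg = (ns.PATH | ns.LOCALE) if use_locale else ns.PATH
    return natsorted(paths, alg=alg)
```

For example, `sort_paths_portably(["folder/file10.txt", "folder/file2.txt"])` places `file2.txt` before `file10.txt` on any platform. With `use_locale=True` the result is only reproducible across machines that share the same locale settings.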
