Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add natural sort functions for wxString #1923

Closed
wants to merge 45 commits into from
Closed

Add natural sort functions for wxString #1923

wants to merge 45 commits into from

Conversation

PBfordev
Copy link
Contributor

@PBfordev PBfordev commented Jul 2, 2020

This PR builds on #780, attempting to fix some possible deficiencies in that PR.

I have not updated the documentation yet, as this PR may be rejected for the reasons stated in my last comment in #780.

The code also needs to account for wxUSE_REGEX == 0 but I do not know how to best do it.

Rocketmagnet and others added 6 commits April 27, 2018 10:50
Add a new string fragment type for whitespace and punctuation which needs
to be assessed separately from letters and symbols.

Use wxUint64 instead of long for storing the value for numeric fragment.

Use collate instead of compare for non-numeric fragments.

Change names for the public comparison functions: wxWidgets provided function
is now named wxCmpGenericNatural() and for common public use is wxCmpNatural()
which calls a native function in wxMSW and wxCmpGenericNatural() elsewhere.

Make some other changes to simplify and possibly speed up the code.
In src/arrstr.cpp add wx/utils.h and wx/wxcrt.h includes.
Copy link
Contributor

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks broadly good to me and the remaining small changes should be very simple to make.

Concerning wxUSE_REGEX, I think it would be fine to just fall back on wxStrcmp() in this case, we just would need to mention this in the documentation (and also disable unit test, i.e. add a warning about skipping it, in this case).

Thanks again for reviving this, it would be nice to finally see it merged!

include/wx/arrstr.h Outdated Show resolved Hide resolved
interface/wx/arrstr.h Outdated Show resolved Hide resolved
interface/wx/arrstr.h Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
@vadz
Copy link
Contributor

vadz commented Jul 4, 2020

I may be missing something here, but why are you so pessimistic about the chance of this being merged? I don't see any really bad problems with these changes. Of course, they're not perfect, but then nothing is and things could be improved later, while for now this already adds a wrapper for a useful (in the context of UI development) MSW function with a reasonable fallback under the other platforms, so I think it should be fine to merge it... Unless, once again, I simply oversaw some critical problem?

@PBfordev
Copy link
Contributor Author

PBfordev commented Jul 4, 2020

why are you so pessimistic about the chance of this being merged?

My innate pessimism aside, the issues I see (none is critical):

  1. I cannot guarantee there is not a bug there, the algorithm seems simple enough, but even the original PR had it wrong.
  2. Sorting like this is very inefficient since we do not provide a sort function but only a comparison function. Hence string transformations must be done during each comparison, i.e., possibly many times for each string. OTOH, the function should probably not be used for large amounts of strings.
  3. The "fragmentization" algorithm does not properly handle control characters. OTOH, this function should be used for strings that should not have such characters; moverover, ASCII tabs and new lines which are probably most common control characters for strings presented to the user are included in :space:. I thought of adding a string fragment type using regex :cntrl: but that would probably involve also Unicode control characters which are a can of worms which I did not dare open and left it to the collation function.
  4. The original PR was in a limbo for two years.

Anyway, I will address the issues pointed in the review, hopefully by Tuesday at the latest.

"$" is not considered a letter/symbol, but a punctuation.
In wxCmpNaturalGeneric() do not use wxStrcmp() which requires
including wxcrt.h; use wxString::compare() instead.

In the wxCmpNaturalGeneric() documentation, fix the typo in wxRegEx
(was wxRegex) and do not refer to undocumented wxStrcmp().
@PBfordev
Copy link
Contributor Author

PBfordev commented Jul 5, 2020

I have addressed the review comments, please let me know if there something more to be done here.

@vadz
Copy link
Contributor

vadz commented Jul 5, 2020

Looks good to me, I'm ready to merge this once the CI builds finish. Thanks again!

@vadz vadz added the ok to merge Was reviewed and ok'd by at least one person label Jul 5, 2020
@PBfordev
Copy link
Contributor Author

PBfordev commented Jul 6, 2020

You probably have a plan for this but I wonder how to merge it? I think, if possible, it would be best to somehow squash it into just two commits: the first containing all the commits by Rocketmagnet, the second with all mine.

EDIT
I forgot to mention that I am not sure about using the simplest string comparison when wxRegEx is unavailable. If, for example, in the future the natural sorting is added to wxFileCtrl and wxDirCtrl, perhaps doing a case-insensitive collation instead would be much better.

src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
src/common/arrstr.cpp Show resolved Hide resolved
src/common/arrstr.cpp Outdated Show resolved Hide resolved
If wxRegEx is not available, do not just make a simple string comparison
but perform a case-insensitive collation.
src/common/arrstr.cpp Outdated Show resolved Hide resolved
@PBfordev
Copy link
Contributor Author

PBfordev commented Jul 7, 2020

I have addressed the comments from the last review.

vadz added a commit that referenced this pull request Jul 7, 2020
Add natural sort functions.

See #1923
@vadz
Copy link
Contributor

vadz commented Jul 7, 2020

Thanks again, I've squash-rebased as suggested and pushed it now.

@vadz vadz closed this Jul 7, 2020
@PBfordev PBfordev deleted the natural-sort branch October 13, 2020 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok to merge Was reviewed and ok'd by at least one person
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants