You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Comparing strings by their ASCII values is often enough for programmer's purposes, but other kinds of string sorting are often needed for user interaction:
Natural sort for numbers, e.g."I have 99 problems" should be "smaller" than "I have 100 problems".
Case-insensitive sort, e.g."Whatever" and "whatever" should compare equal. This comparator should convert the letters at every comparison since the alternative using more memory can simply be realized with schwartz_adapter and a simple upper or lower projection.
Both might be needed at once, so a general string_compare could be provided, with a bitfield-like structure describing which properties should be taken into account. Not sure whether it's better to pass the value as a template parameter at compile-time, or as a parameter to the constructor at runtime. If it is passed at compile-time, it allows to have dedicated case_insensitive_compare and natural_compare specialization type aliases implementing more efficient algorithms. There might be way to force string_compare to accept runtime parameters when no compile-time parameter is explicitly given.
Concerning the names, natural_less may be better than natural_compare; morevoer, it might allow for a corresponding natural_greater if needed. Making them customization points might let the door open for other people to reimplement the functions for custom string types.
This comparator would only handle ASCII, not unicode since handling unicode would require more code than the library has in its current state.
This could be the birth of a new cpp-sort/comparators subdirectory and the linked cpp-sort/comparators.h general include file. It could be interesting to expose some of the comparators in detail such as projection_compare, since that specific one could potentially be used at more places in the library and thus might need some documentation (see issue #53). The *_less and *_greater comparators described in #18 can also be included in this module.
The text was updated successfully, but these errors were encountered:
The default version works with any forward sequence of char, including
std::string, std::vector<char> or char[]. It might work with other types
of characters than char, but it isn't guaranteed since it uses
std::isdigit.
The comparators aren't perfect, but the remaining problems are hardly addressable or are the topic of other issues:
natural_less only recognizes ASCII digits, but making it work with other kinds of digits is probably not doable with the standard library unless the numeration system is positional and the digits are one-character long and d1 < d2 also means that parse(d1) < parse(d2). I don't know enough about character encondings and numeration systems to be sure that it's worse the cost.
case_insensitive_less isn't performant because of the repeated calls to std::use_facet, but this can be addressed by issue Refining comparators and projections with type #83. Additional performance might be obtained for char locales thanks to a technique described in How to do case-insensitive string comparison by Matt Austern.
I decided to separate the comparators rather than combining them for the sake of simplicity. A case-insensitive natural sort could surely be performed by a mix of comparisons and projections if needed:
Comparing strings by their ASCII values is often enough for programmer's purposes, but other kinds of string sorting are often needed for user interaction:
"I have 99 problems"
should be "smaller" than"I have 100 problems"
."Whatever"
and"whatever"
should compare equal. This comparator should convert the letters at every comparison since the alternative using more memory can simply be realized withschwartz_adapter
and a simpleupper
orlower
projection.Both might be needed at once, so a general
string_compare
could be provided, with a bitfield-like structure describing which properties should be taken into account. Not sure whether it's better to pass the value as a template parameter at compile-time, or as a parameter to the constructor at runtime. If it is passed at compile-time, it allows to have dedicatedcase_insensitive_compare
andnatural_compare
specialization type aliases implementing more efficient algorithms. There might be way to forcestring_compare
to accept runtime parameters when no compile-time parameter is explicitly given.Concerning the names,
natural_less
may be better thannatural_compare
; morevoer, it might allow for a correspondingnatural_greater
if needed. Making them customization points might let the door open for other people to reimplement the functions for custom string types.This comparator would only handle ASCII, not unicode since handling unicode would require more code than the library has in its current state.
This could be the birth of a new
cpp-sort/comparators
subdirectory and the linkedcpp-sort/comparators.h
general include file. It could be interesting to expose some of the comparators indetail
such asprojection_compare
, since that specific one could potentially be used at more places in the library and thus might need some documentation (see issue #53). The*_less
and*_greater
comparators described in #18 can also be included in this module.The text was updated successfully, but these errors were encountered: