Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'alphanumeric sort' commands #4289

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

seb-bl
Copy link
Contributor

@seb-bl seb-bl commented Oct 15, 2022

Implements #4153 with the crate mentioned in the discussion of the issue.

This adds 2 new commands, sortn and rsortn, corresponding to the existing ones (sort and rsort), and they sort according to https://docs.rs/alphanumeric-sort/1.4.4/alphanumeric_sort/fn.compare_str.html

@the-mikedavis
Copy link
Member

It's a short (~100L) block of code within the library and the library includes a bunch of other functions I don't think will be used in the future so I think we should vendor this block https://github.com/magiclen/alphanumeric-sort/blob/bea0284409f68407dd25b432fd901b62f6e944e7/src/lib.rs#L95-L199

@kirawi

This comment was marked as resolved.

@kirawi kirawi added A-helix-term Area: Helix term improvements S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 15, 2022
@seb-bl
Copy link
Contributor Author

seb-bl commented Oct 15, 2022

I removed the dependency, and added a function to compare the strings.

When I copied the source, I realized the handling of long numbers was incorrect (it accumulates it in a f64, thus rounds it). I looked at the other crate, and it had a similar problem (accumulating the number in a u64).

So I implemented my own function. It orders numbers with leading zeroes like the crate alphanumeric-sort, but uses the slice containing the digits instead of a f64 for the comparison

@kirawi
Copy link
Member

kirawi commented Oct 15, 2022

Just to note, I think that using a u64 would've been okay. len() returns a usize which is equivalent to u64 on 64bit systems. Assuming that each number is equivalent to a byte, then it could handle 16,384 petabytes of data.

}
}

#[test]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved into a

#[cfg(test)]
mod tests {
}

block at the bottom of the file.

@kirawi
Copy link
Member

kirawi commented Oct 15, 2022

By the way, make sure to run cargo xtask docgen and commit the changes to make CI pass.

@seb-bl
Copy link
Contributor Author

seb-bl commented Oct 15, 2022

Thanks for the hints 🙂

I think I have not explained the problem with the u64 correctly. When encountering a number,lexical-sort parses the number into a u64, digit by digit, it multiplies the number by 10 and add the current digit (at this line). So for example "18446744073709551616" (=2^64) would cause an overflow. Numbers this long are not very common, but may still happen

@kirawi kirawi added S-waiting-on-review Status: Awaiting review from a maintainer. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 16, 2022
@kirawi kirawi self-requested a review February 26, 2023 22:08
Comment on lines 1400 to 1405
fn sort_impl(
cx: &mut compositor::Context,
_args: &[Cow<str>],
reverse: bool,
alphanumeric: bool,
) -> anyhow::Result<()> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could remove both reverse and alphanumeric parameters and pass a single parameter called cmp_fn which is a FnMut(&Tendril, &Tendril) -> Ordering leaving to the caller to say how it want the ordering.

See: #8436

Comment on lines +1416 to 1421
fragments.sort_by(match (reverse, alphanumeric) {
(true, false) => |a: &Tendril, b: &Tendril| b.cmp(a),
(false, false) => |a: &Tendril, b: &Tendril| a.cmp(b),
(true, true) => |a: &Tendril, b: &Tendril| numeric_cmp(b, a),
(false, true) => |a: &Tendril, b: &Tendril| numeric_cmp(a, b),
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @leandrobbraga - this block is a little dense and we can make the call sites for sort_impl a little cleaner by moving these closures to parameters of sort_impl.

So this function would replace the reverse parameter with a compare_fn: impl FnMut(&Tendril, &Tendril) -> std::cmp::Ordering, param and then this block can be just fragments.sort_by(compare_fn);.

We could also use this opportunity to drop the args parameter which is currently unused

@the-mikedavis the-mikedavis added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from a maintainer. labels Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-helix-term Area: Helix term improvements S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants