Improvement to the Natural Order Sort #3276

jdunkerley · 2022-02-14T16:57:09Z

Pull Request Description

Moves away from RegEx-based comparison of two texts to using break iterators that progress through both texts piecemeal.

Also adds an initial Faker class to the test library, allowing construction of mock strings.
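
As a rough illustration of the piecemeal idea (not the code from this PR), here is a minimal Python sketch. It assumes ASCII digits only and steps through code points rather than using break iterators as the PR does. The names `natural_compare`, `_is_digit` and `_digit_run_end`, and the rule that leading zeros are ignored, are hypothetical choices made for the example.

```python
from functools import cmp_to_key


def _is_digit(ch: str) -> bool:
    # ASCII digits only (an assumption made for this sketch).
    return "0" <= ch <= "9"


def _digit_run_end(text: str, start: int) -> int:
    # Index just past the run of digits that begins at `start`.
    end = start
    while end < len(text) and _is_digit(text[end]):
        end += 1
    return end


def natural_compare(a: str, b: str) -> int:
    """Return -1, 0 or 1; digit runs are compared by numeric value."""
    i = j = 0
    while i < len(a) and j < len(b):
        if _is_digit(a[i]) and _is_digit(b[j]):
            # Both sides start a number here: read just these two digit runs.
            i_end, j_end = _digit_run_end(a, i), _digit_run_end(b, j)
            x, y = int(a[i:i_end]), int(b[j:j_end])
            if x != y:
                return -1 if x < y else 1
            i, j = i_end, j_end
        elif a[i] != b[j]:
            # The first differing character decides; nothing after it is read.
            return -1 if a[i] < b[j] else 1
        else:
            i, j = i + 1, j + 1
    # One text ran out: the one with characters left sorts later.
    rest_a, rest_b = len(a) - i, len(b) - j
    return (rest_a > rest_b) - (rest_a < rest_b)


names = ["file10", "file2", "file1"]
print(sorted(names, key=cmp_to_key(natural_compare)))
# prints ['file1', 'file2', 'file10']
```

The comparison stops at the first position where the two texts differ, so the rest of either text is never scanned.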

Important Notes

Significant performance gains.

Old	New
7372.4	409.5

Test results for sorting 10,000 values in a vector.
Averaged over 10 runs.
Run on an r5a.xlarge EC2 instance.
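
A measurement of this shape (sort a fixed set of mock strings several times and average the timings) could be sketched in Python as follows. This is not the benchmark script from the PR: `mock_strings`, `average_sort_time_ms` and `codepoint_compare` are hypothetical names, the string shapes are invented, and reporting milliseconds is an assumption, since the table above does not state its units.

```python
import random
import string
import time
from functools import cmp_to_key
from typing import Callable, List


def mock_strings(count: int, seed: int = 0) -> List[str]:
    # Seeded generator so every run sorts exactly the same data.
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(1, 12)))
        for _ in range(count)
    ]


def average_sort_time_ms(
    values: List[str], compare: Callable[[str, str], int], runs: int = 10
) -> float:
    # Sort a fresh copy on each run and average the wall-clock time.
    total = 0.0
    for _ in range(runs):
        data = list(values)
        start = time.perf_counter()
        data.sort(key=cmp_to_key(compare))
        total += time.perf_counter() - start
    return total / runs * 1000.0


def codepoint_compare(a: str, b: str) -> int:
    # Placeholder comparator; swap in the comparator under test.
    return (a > b) - (a < b)


values = mock_strings(10_000)
print(f"{average_sort_time_ms(values, codepoint_compare):.1f} ms")
```

Copying the input before each run keeps later runs from sorting already-sorted data, which would otherwise flatter the average.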

Checklist

Please include the following checklist in your PR:

The documentation has been updated if necessary.
All code conforms to the Scala, Java, and Rust style guides.
All documentation and configuration conforms to the markdown and YAML style guides.
All code has been tested where possible.

wdanilo

The performance improvements are fantastic!

radeusgd

Looks good but I have a few comments.

The main function is quite complex. That is expected to some extent, because the operation it performs is not trivial, but I wonder whether we could simplify it a bit. One idea would be to move the lazy splitting of the text into text and number chunks into a separate construct (either a stateful iterator or, more in the Enso way, a lazy list). However, the only reasonable way I see to do this would be inefficient on pairs like 10...0 (a very long number) vs a (just a letter), because it would still parse the very long number unnecessarily.
So I think your approach is indeed the best in terms of performance, because it looks at the minimum number of characters needed to resolve the comparison.
I think it would be worth adding comments with a few words explaining the intent behind the helper functions (get_number and order). Explaining in more detail what the arguments represent and what the functions return would make it easier to understand what is going on here.
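
The cost of the lazy-splitting alternative can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: `lazy_chunks` and `compare_lazy` are hypothetical names, digits are ASCII only, numbers are assumed to sort before text, and a `stats` dictionary is added purely to count how many characters get scanned.

```python
from itertools import zip_longest
from typing import Dict, Iterator, Union


def lazy_chunks(text: str, stats: Dict[str, int]) -> Iterator[Union[int, str]]:
    # Lazily yield single characters, or whole numbers for digit runs.
    i = 0
    while i < len(text):
        end = i
        while end < len(text) and "0" <= text[end] <= "9":
            end += 1
        if end > i:
            # A number chunk is only known once its entire digit run is read.
            stats["chars_read"] += end - i
            yield int(text[i:end])
            i = end
        else:
            stats["chars_read"] += 1
            yield text[i]
            i += 1


def compare_lazy(a: str, b: str, stats: Dict[str, int]) -> int:
    for x, y in zip_longest(lazy_chunks(a, stats), lazy_chunks(b, stats)):
        if x is None:
            return -1  # a ran out first
        if y is None:
            return 1  # b ran out first
        if isinstance(x, int) != isinstance(y, int):
            # Assumption for this sketch: a number sorts before text.
            return -1 if isinstance(x, int) else 1
        if x != y:
            return -1 if x < y else 1
    return 0


stats = {"chars_read": 0}
long_number = "1" + "0" * 50
print(compare_lazy(long_number, "a", stats), stats["chars_read"])
# prints -1 52
```

All 51 digits are scanned and parsed even though the result is already determined by the first character of each text, which is the inefficiency described above.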

jdunkerley force-pushed the wip/jd/natural-order-181176589 branch from de53c0f to f06f40f Compare February 14, 2022 19:48

jdunkerley requested review from NedHarding and wdanilo February 14, 2022 19:48

jdunkerley marked this pull request as ready for review February 14, 2022 19:48

jdunkerley requested review from 4e6 and radeusgd as code owners February 14, 2022 19:48

wdanilo approved these changes Feb 14, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

distribution/lib/Standard/Test/0.0.0-dev/src/Faker.enso (outdated)

4e6 approved these changes Feb 15, 2022

radeusgd reviewed Feb 15, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

radeusgd reviewed Feb 15, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

radeusgd reviewed Feb 15, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

radeusgd reviewed Feb 15, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

jdunkerley and others added 5 commits February 16, 2022 09:08

Improved Natural Order

79fca92

Data generator for benchmarking

Missing Import

351f4ff

Benchmark script

Update Natural_Order.enso

0f24e8f

Restore missing ToDo

Changelog

9265790

PR Comments

9e9d70b

jdunkerley force-pushed the wip/jd/natural-order-181176589 branch from fc9696d to 9e9d70b Compare February 16, 2022 09:10

PR Comments

be0f089

radeusgd reviewed Feb 16, 2022

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Ordering/Natural_Order.enso (outdated)

jdunkerley added 2 commits February 16, 2022 12:30

Additional comments.

0bc8a9b

Correction

878c523

radeusgd approved these changes Feb 16, 2022

jdunkerley enabled auto-merge (squash) February 16, 2022 13:00

jdunkerley added 3 commits February 16, 2022 13:02

Merge branch 'develop' into wip/jd/natural-order-181176589

7ae5d13

Merge branch 'develop' into wip/jd/natural-order-181176589

561711d

Merge branch 'develop' into wip/jd/natural-order-181176589

d440ce4

jdunkerley merged commit 68b85de into develop Feb 16, 2022

jdunkerley deleted the wip/jd/natural-order-181176589 branch February 16, 2022 17:40
