Optimise English implementation #53

Merged
merged 13 commits into from
Jun 27, 2022
Conversation

@jakewilliami (Owner)
Coming back to this, there are changes I'd like to make regarding optimisation. I started these a while ago, but never pushed anything. Figured I should open a PR to track these changes.

@jakewilliami jakewilliami changed the title Optimise English implementation WIP: Optimise English implementation Jun 26, 2022
@jakewilliami (Owner Author)
I'm worried that these optimisations aren't actually optimising anything! I have had success in the past with using IOBuffer instead of string concatenation, as repeated concatenation can be slow. However, it appears that these IOBuffer changes have slowed things down and caused more allocation:

julia> @btime spelled_out($123456789, lang = $:en_UK)  # master at dc74774
  8.797 μs (204 allocations: 4.39 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"

julia> @btime spelled_out($123456789, lang = $:en_UK)  # optimisation at 22e2773
  9.906 μs (220 allocations: 5.83 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"
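For context, the concatenation-versus-buffer trade-off looks something like this (a minimal sketch; `join_concat`, `join_buffer`, and `parts` are hypothetical illustrations, not code from this package):

```julia
# Naive approach: each *= allocates a brand-new String.
function join_concat(parts::Vector{String})
    s = ""
    for p in parts
        s *= p  # allocates a fresh string every iteration
    end
    return s
end

# IOBuffer approach: append into one growable buffer, materialise once.
function join_buffer(parts::Vector{String})
    io = IOBuffer()
    for p in parts
        print(io, p)
    end
    return String(take!(io))
end
```

Both return the same string; the buffer version avoids the repeated copying of naive concatenation. As the benchmarks above show, though, constructing a fresh IOBuffer inside every small helper can erase that advantage.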

Instead of creating IOBuffers within each function, pass IOBuffers around to modifying functions and only take from them at the end of the process! This vastly reduces allocations, as we are no longer creating a new IOBuffer within each function. However, it is still not as fast as it was (though its memory usage is better).
@jakewilliami (Owner Author)
Since last commit, I figured that creating IOBuffers within each function would indeed be slow, as we are often recursing or calling functions multiple times in a single spelled_out call. Hence, I thought that the best way around this would be to only work with one IOBuffer at the top level, and have modifying functions on the buffer. So small_convert_en becomes _small_convert_en!, and large_convert_en becomes _large_convert_en!, and their first argument—an IOBuffer—is only constructed once and is modified by these functions.
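The single-buffer pattern described above could be sketched like this (simplified, hypothetical bodies; the real `_small_convert_en!` and `_large_convert_en!` live in the package source and contain the actual digit-to-word logic):

```julia
# Modifying helpers write into a caller-supplied buffer
# instead of allocating their own.
function _small_convert_en!(io::IO, n::Integer)
    print(io, n)  # stand-in for the real small-number conversion
    return io
end

function _large_convert_en!(io::IO, n::Integer)
    # Helpers recurse or call each other, all sharing one buffer.
    _small_convert_en!(io, n)
    return io
end

# One IOBuffer is constructed at the top level and drained exactly once.
function spelled_out_sketch(n::Integer)
    io = IOBuffer()
    _large_convert_en!(io, n)
    return String(take!(io))
end
```

Because the buffer is allocated once per `spelled_out` call rather than once per helper call, the allocation count no longer scales with recursion depth.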

The benchmarking results are as follows. Indeed, it uses fewer allocations. However, I wonder where it is spending the most time? It is not as fast as the others (in fact, it might even be slower than all of the other versions).

julia> @btime spelled_out(123456789, lang = :en_GB)  # optimisation at 3175b96
  10.275 μs (190 allocations: 3.92 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"

@jakewilliami (Owner Author)
For some reason, on a whim I changed write(io, ...) to print(io, ...), and it is considerably faster (with the same allocations). I'm feeling good about this branch again!

julia> @btime spelled_out(123456789, lang = :en_GB)  # optimisation at a9403ed
  10.119 μs (190 allocations: 3.92 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"

julia> @btime spelled_out(123456789, lang = :en_GB)  # optimisation at 00f458a
  9.296 μs (190 allocations: 3.92 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"
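For reference, `write` and `print` are not interchangeable on an `IOBuffer`: `write` emits raw bytes (and for an integer argument would emit its binary representation), while `print` produces the textual form. A small illustration, not the package's code:

```julia
io = IOBuffer()
write(io, "forty")   # writes the raw bytes of the string; returns a byte count
print(io, "forty")   # textual output; same bytes for a String argument
print(io, 2)         # appends the text "2" (write(io, 2) would emit 8 raw bytes)
s = String(take!(io))  # "fortyforty2"
```

For `String` arguments the two produce identical bytes, so the speedup observed here is an implementation-level difference rather than a semantic one.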

@jakewilliami (Owner Author) commented Jun 27, 2022
I don't know why, but I was being awfully liberal with using BigInts in this code. Reducing my use of these has greatly optimised things:

julia> @btime spelled_out(123456789, lang = :en_GB)  # optimisation at 7fb5af2
  6.499 μs (120 allocations: 2.70 KiB)
"one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine"
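The BigInt cost is easy to demonstrate: `BigInt`s are heap-allocated arbitrary-precision integers, so arithmetic that stays within machine `Int`s is far cheaper. An illustrative sketch (not the package's code):

```julia
# Int arithmetic: machine integers, no heap allocation per operation.
q, r = divrem(123456789, 1000)            # (123456, 789)

# The same computation with BigInt allocates on every operation.
qb, rb = divrem(big(123456789), big(1000))

# Keeping intermediates as Int, and promoting to BigInt only when a
# value might exceed typemax(Int), avoids those allocations.
```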

Instead of using a weird `reverse` method for string manipulation when printing ordinals, we use a util function. Also, use `IOBuffer` where practical for alt convert methods
@jakewilliami (Owner Author)
I think this is probably as optimised as I'm going to get it. The attached jlprof file contains profiling information (see ProfileView.jl; you will have to rename the file, as GitHub wouldn't allow me to attach the jlprof file directly). At the top of the call graph are predominantly implementations of things that I cannot change or optimise.

profview.jlprof.txt

@jakewilliami jakewilliami changed the title WIP: Optimise English implementation Optimise English implementation Jun 27, 2022
@jakewilliami jakewilliami merged commit 7e001d5 into master Jun 27, 2022
@jakewilliami jakewilliami deleted the optimisation branch June 27, 2022 15:17
jakewilliami added a commit that referenced this pull request Mar 29, 2023
Optimise English implementation