Contributor

@dkuku dkuku commented Dec 30, 2024

The benchmark against Go made me wonder whether we could get any speed gains in synthetic tests :D
This PR changes the datetime formatting functions:
they use around 55% of the memory and are about 3x faster when serializing to a string, or about 4x faster when producing iodata ;)

It also adds new *_to_iodata functions that can be used by libraries that operate on iodata, like Ecto or JSON.

defmodule CalendarStringBench do
  def run do
    datetime = ~U[2024-12-30 12:21:45.000123Z]

    Benchee.run(
      %{
        "Elixir.DateTime.to_string" => fn -> DateTime.to_string(datetime) end,
        "Calendar.ISO.datetime_to_string" => fn ->
          Calendar.ISO.naive_datetime_to_string(
            datetime.year,
            datetime.month,
            datetime.day,
            datetime.hour,
            datetime.minute,
            datetime.second,
            datetime.microsecond
          )
        end,
        "Calendar.IOISO.datetime_to_string" => fn ->
          Calendar.IOISO.naive_datetime_to_string(
            datetime.year,
            datetime.month,
            datetime.day,
            datetime.hour,
            datetime.minute,
            datetime.second,
            datetime.microsecond
          )
        end,
        "Calendar.IOISO.datetime_to_iodata" => fn ->
          Calendar.IOISO.naive_datetime_to_iodata(
            datetime.year,
            datetime.month,
            datetime.day,
            datetime.hour,
            datetime.minute,
            datetime.second,
            datetime.microsecond
          )
        end
      },
      time: 10,
      memory_time: 2,
      formatters: [
        #  {Benchee.Formatters.HTML, file: "bench/output/calendar_string.html"},
        Benchee.Formatters.Console
      ]
    )
  end
end

# Run the benchmark
CalendarStringBench.run()

Result:

Benchmarking Calendar.IOISO.datetime_to_iodata ...                                                                                            
Benchmarking Calendar.IOISO.datetime_to_string ...                                                                                            
Benchmarking Calendar.ISO.datetime_to_string ...                                                                                              
Benchmarking Elixir.DateTime.to_string ...                                                                                                    
Calculating statistics...                                                                                                                     
Formatting results...                                                                                                                         
                                                                                                                                              
Name                                        ips        average  deviation         median         99th %                                       
Calendar.IOISO.datetime_to_iodata        3.80 M      263.30 ns ±12989.04%         201 ns         350 ns                                       
Calendar.IOISO.datetime_to_string        2.86 M      350.05 ns  ±6363.30%         290 ns         471 ns                                       
Calendar.ISO.datetime_to_string          1.10 M      905.92 ns  ±3587.70%         651 ns        1352 ns                                       
Elixir.DateTime.to_string                1.04 M      958.94 ns  ±2794.13%         701 ns        1673 ns                                       
                                                                                                                                              
Comparison:                                                                                                                                   
Calendar.IOISO.datetime_to_iodata        3.80 M                                                                                               
Calendar.IOISO.datetime_to_string        2.86 M - 1.33x slower +86.75 ns
Calendar.ISO.datetime_to_string          1.10 M - 3.44x slower +642.62 ns
Elixir.DateTime.to_string                1.04 M - 3.64x slower +695.65 ns

Memory usage statistics:

Name                                 Memory usage                      
Calendar.IOISO.datetime_to_iodata           480 B                      
Calendar.IOISO.datetime_to_string           528 B - 1.10x memory usage +48 B
Calendar.ISO.datetime_to_string             896 B - 1.87x memory usage +416 B
Elixir.DateTime.to_string                   896 B - 1.87x memory usage +416 B

[
time_to_iodata_format(hour, minute, second, format),
".",
microsecond |> zero_pad(6) |> IO.iodata_to_binary() |> binary_part(0, precision)
Contributor Author

There may be a better way to do this?

Member

Pass the precision to zero pad and do it in one pass?

3 -> ["000", num]
4 -> ["0000", num]
5 -> ["00000", num]
6 -> ["000000", num]
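One possible reading of the "one pass" suggestion, as a minimal sketch: instead of padding to six digits and then slicing, drop the unwanted low-order digits numerically and pad straight to the requested precision. The module and function names below are hypothetical, and `one_pass/2` is an assumption about what such a rewrite could look like, not the code that was merged.

```elixir
defmodule MicrosecondPrecisionSketch do
  # Two-pass approach mirroring the diff: pad to 6 digits, then keep the
  # first `precision` characters.
  def two_pass(microsecond, precision) when precision in 1..6 do
    microsecond
    |> Integer.to_string()
    |> String.pad_leading(6, "0")
    |> binary_part(0, precision)
  end

  # One-pass alternative (assumption for illustration): remove the low-order
  # digits with integer division, then pad to `precision` digits.
  def one_pass(microsecond, precision) when precision in 1..6 do
    microsecond
    |> div(Integer.pow(10, 6 - precision))
    |> Integer.to_string()
    |> String.pad_leading(precision, "0")
  end
end

IO.puts(MicrosecondPrecisionSketch.one_pass(123, 4))
```

Both functions return the same digits for precisions 1 through 6; precision 0 is left out of the sketch because it produces no fractional part at all.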
Member

I don't think we would ever reach this, since byte_size is never 0. We should also check that count <= 6.

@josevalim
Member

Yes, please do complete this pull request and expose these functions. Then please change JSON.Encoder to use the iodata variants for Calendar.ISO. :) Thank you!

defp date_to_string_guarded(year, month, day, :extended) do
zero_pad(year, 4) <> "-" <> zero_pad(month, 2) <> "-" <> zero_pad(day, 2)
defp date_to_iodata_guarded(year, month, day, :extended) do
[zero_pad(year, 4), "-", zero_pad(month, 2), "-", zero_pad(day, 2)]
Member

Suggested change
[zero_pad(year, 4), "-", zero_pad(month, 2), "-", zero_pad(day, 2)]
[zero_pad(year, 4), ?-, zero_pad(month, 2), ?-, zero_pad(day, 2)]

This should save a small amount of mem, but hey, why not!


defp zero_pad(val, count) do
"-" <> zero_pad(-val, count)
["-", zero_pad(-val, count)]
Member

Same here

Suggested change
["-", zero_pad(-val, count)]
[?-, zero_pad(-val, count)]

defp time_to_string_format(hour, minute, second, :extended) do
zero_pad(hour, 2) <> ":" <> zero_pad(minute, 2) <> ":" <> zero_pad(second, 2)
defp time_to_iodata_format(hour, minute, second, :extended) do
[zero_pad(hour, 2), ":", zero_pad(minute, 2), ":", zero_pad(second, 2)]
Member

Same here (as below):

Suggested change
[zero_pad(hour, 2), ":", zero_pad(minute, 2), ":", zero_pad(second, 2)]
[zero_pad(hour, 2), ?:, zero_pad(minute, 2), ?:, zero_pad(second, 2)]

@dkuku
Contributor Author

dkuku commented Dec 30, 2024

Do we really use precisions other than 0, 3, and 6?

Name                                        ips        average  deviation         median         99th %
Calendar.IOISO.datetime_to_iodata        4.75 M      210.37 ns ±14119.07%         170 ns         281 ns
Calendar.IOISO.datetime_to_string        3.20 M      312.52 ns  ±5866.09%         271 ns         420 ns
Calendar.ISO.datetime_to_string          1.12 M      893.28 ns  ±3302.84%         671 ns        1403 ns
Elixir.DateTime.to_string                1.09 M      915.88 ns  ±2761.10%         701 ns        1353 ns

Comparison: 
Calendar.IOISO.datetime_to_iodata        4.75 M
Calendar.IOISO.datetime_to_string        3.20 M - 1.49x slower +102.15 ns
Calendar.ISO.datetime_to_string          1.12 M - 4.25x slower +682.91 ns
Elixir.DateTime.to_string                1.09 M - 4.35x slower +705.51 ns

Memory usage statistics:

Name                                 Memory usage
Calendar.IOISO.datetime_to_iodata           456 B
Calendar.IOISO.datetime_to_string           504 B - 1.11x memory usage +48 B
Calendar.ISO.datetime_to_string             896 B - 1.96x memory usage +440 B
Elixir.DateTime.to_string                   896 B - 1.96x memory usage +440 B

@josevalim
Member

Precision can be any number between 0 and 6, even though we commonly use 0, 3, and 6.
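To illustrate the point about precision, here is a small sketch using `Time` sigils. The precision is the second element of the `microsecond` tuple and follows the number of fractional digits written in the sigil; the sample values are arbitrary.

```elixir
# Each sigil carries a different microsecond precision (0, 1, 3, 5, and 6).
examples = [
  ~T[12:21:45],
  ~T[12:21:45.1],
  ~T[12:21:45.123],
  ~T[12:21:45.12345],
  ~T[12:21:45.000123]
]

for time <- examples do
  {_value, precision} = time.microsecond
  # The formatted string keeps exactly `precision` fractional digits.
  IO.puts("precision #{precision}: #{Time.to_string(time)}")
end
```

So a formatter that only special-cased 0, 3, and 6 would mishandle values like the second and fourth entries.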

@dkuku
Contributor Author

dkuku commented Dec 30, 2024

@whatyouhide I updated it, but according to benchee the memory usage does not change: it's still 456 B.
I get between 4.6M and 5.1M ips depending on the run.
When I change all the zeros from "000" to [?0, ?0, ?0], the memory usage goes up to 488 B.

Name                                        ips        average  deviation         median         99th %                                       
Calendar.IOISO.datetime_to_iodata        5.09 M      196.35 ns ±16231.76%         160 ns         270 ns                                       
                                                                                                                                              
Memory usage statistics:                                                                                                                      
                                                                                                                                              
Name                                 Memory usage                                                                                             
Calendar.IOISO.datetime_to_iodata           456 B                                                                                             
                                                                                                                                              
**All measurements for memory usage were the same**     

@dkuku
Contributor Author

dkuku commented Dec 30, 2024

> Yes, please do complete this pull request and expose these functions. Then please change JSON.Encoder to use the iodata variants for Calendar.ISO. :) Thank you!

@josevalim This is currently implemented as:

defimpl JSON.Encoder, for: [Date, Time, NaiveDateTime, DateTime, Duration] do
  def encode(value, _encoder) do
    [?", @for.to_iso8601(value), ?"]
  end
end

Would I have to create something like to_iso8601_iodata?

@josevalim
Member

No need. That’s to_string for the iso calendar!

@josevalim
Member

All of the literals point to a constant, so neither ?0 nor "0" takes extra runtime memory. The reason [?0, ?0] takes more than "00" is that it takes two entries in the list. Overall, a single character should be an integer, and multiple characters should be a binary.
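A minimal sketch of that rule of thumb, with hypothetical helper names: both shapes flatten to the same bytes, but the character version needs one more list entry per extra character, and every list entry is a cons cell built at runtime.

```elixir
defmodule LiteralShapeSketch do
  # Multiple padding characters as one binary literal: two list entries.
  def with_binary(num), do: ["00", num]

  # The same padding as separate character codes: three list entries.
  def with_chars(num), do: [?0, ?0, num]
end

IO.inspect(length(LiteralShapeSketch.with_binary("5")), label: "entries with binary")
IO.inspect(length(LiteralShapeSketch.with_chars("5")), label: "entries with chars")
```

A single separator such as `?-` or `?:` is still one entry either way, which is why the integer form is preferred there.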

@dkuku dkuku force-pushed the dk_use_iodata_when_formatting_dates branch from c2873fe to 7025133 Compare December 30, 2024 21:33
Comment on lines 1240 to 1241
iex> Calendar.ISO.time_to_iodata(2, 2, 2, {2, 6})
[[["0", "2"], 58, ["0", "2"], 58, ["0", "2"]], 46, ["00000", "2"]]
Contributor

I'm not sure if we should show the exact iodata, or a call to IO.iodata_to_binary() (which is what we usually do)
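For comparison, this is roughly what the doctest would assert if it went through `IO.iodata_to_binary/1` instead of showing the nested structure. The iodata literal is copied from the doctest above; the wrapping module is hypothetical and only there to keep the sketch self-contained.

```elixir
defmodule IodataDocSketch do
  # The exact iodata shown in the doctest under review (58 is ?:, 46 is ?.).
  def example_iodata do
    [[["0", "2"], 58, ["0", "2"], 58, ["0", "2"]], 46, ["00000", "2"]]
  end
end

# Flattening hides the internal nesting, so the doc stays valid even if the
# shape of the iodata changes later.
IO.puts(IO.iodata_to_binary(IodataDocSketch.example_iodata()))
```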


defp zero_pad(val, count) do
"-" <> zero_pad(-val, count)
[?-, zero_pad(-val, count)]
Contributor

@sabiwara sabiwara Dec 30, 2024

Perhaps a further optimization, if we're using iodata, is to use improper lists when possible, replacing the last `,` with `|`:

Suggested change
[?-, zero_pad(-val, count)]
[?- | zero_pad(-val, count)]

I haven't benchmarked it and wouldn't expect a huge difference, but in theory it should create fewer cons cells?
(we might need to use @dialyzer :no_improper_lists)
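A small sketch of the improper-list idea, assuming the padded value is already a binary (the helper names are hypothetical): the proper list ends in an empty list and needs two cons cells, while the improper list puts the binary directly in the tail and needs one.

```elixir
defmodule ImproperListSketch do
  # Proper list: [?- | [padded | []]], two cons cells.
  def proper(padded) when is_binary(padded), do: [?-, padded]

  # Improper list: the binary itself is the tail, one cons cell.
  def improper(padded) when is_binary(padded), do: [?- | padded]
end

IO.puts(IO.iodata_to_binary(ImproperListSketch.proper("05")))
IO.puts(IO.iodata_to_binary(ImproperListSketch.improper("05")))
```

Both forms are valid iodata and flatten to the same bytes; only the list structure differs.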

Contributor Author

@sabiwara it makes a difference in the memory usage:

Name                                        ips        average  deviation         median         99th %
Calendar.IOISO.datetime_to_iodata        4.89 M      204.42 ns ±19386.20%         160 ns         271 ns

Memory usage statistics:

Name                                 Memory usage
Calendar.IOISO.datetime_to_iodata           376 B

**All measurements for memory usage were the same**

def microseconds_to_iodata(microsecond, 6), do: zero_pad(microsecond, 6)

defp microseconds_to_iodata(microsecond, precision) do
def microseconds_to_iodata(microsecond, precision) do
Contributor Author

I need it in the next PR.

Contributor

@sabiwara sabiwara left a comment

Amazing, thanks @dkuku 💜

Comment on lines 1331 to 1332
Converts the given date into a iodata.
Look at date_to_string/4 for more information
Contributor

Suggested change
Converts the given date into a iodata.
Look at date_to_string/4 for more information
Converts the given date into an iodata.
See `date_to_string/4` for more information.

Comment on lines 1234 to 1236
Converts the given time into a iodata.
Look at time_to_string/5 for more information
Contributor

Suggested change
Converts the given time into a iodata.
Look at time_to_string/5 for more information
Converts the given time into an iodata.
See `time_to_string/5` for more information.

Comment on lines 1410 to 1411
Converts the given naive_datetime into a iodata.
Look at naive_datetime_to_iodata/8 for more information
Contributor

Suggested change
Converts the given naive_datetime into a iodata.
Look at naive_datetime_to_iodata/8 for more information
Converts the given naive_datetime into an iodata.
See `naive_datetime_to_iodata/8` for more information.

Comment on lines 1534 to 1535
Converts the given datetime into a iodata.
Look at datetime_to_iodata/12 for more information
Contributor

Suggested change
Converts the given datetime into a iodata.
Look at datetime_to_iodata/12 for more information
Converts the given datetime into an iodata.
See `datetime_to_iodata/12` for more information.

Co-authored-by: Jean Klingler <sabiwara@proton.me>
@josevalim josevalim merged commit 9b95af8 into elixir-lang:main Dec 31, 2024
9 checks passed
@josevalim
Member

💚 💙 💜 💛 ❤️

@dkuku
Contributor Author

dkuku commented Jan 1, 2025

There is one small micro-optimization we can make here that is still visible: a special case for zero_pad with a count of 2, which is called quite often. The math is really simple and doesn't require calling Integer.to_string.

  defp zero_pad(val, 2) when val >= 0 and val < 10 do
    [?0, val + ?0]
  end

  defp zero_pad(val, 2) when val >= 10 and val < 100 do
    tens = div(val, 10)
    [tens + ?0, val - tens * 10 + ?0]
  end
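To check the arithmetic in the two clauses above, here is a self-contained sketch. The two fast-path clauses are taken from the comment (made public so they can be called from outside); the fallback clause and the module name are assumptions added only so the example runs on its own.

```elixir
defmodule ZeroPadSketch do
  # Single digit: a literal ?0 followed by the digit's character code.
  def zero_pad(val, 2) when val >= 0 and val < 10 do
    [?0, val + ?0]
  end

  # Two digits: split into tens and ones without calling Integer.to_string.
  def zero_pad(val, 2) when val >= 10 and val < 100 do
    tens = div(val, 10)
    [tens + ?0, val - tens * 10 + ?0]
  end

  # Fallback for other widths (assumption for this sketch, not the PR's code).
  def zero_pad(val, count) when val >= 0 do
    String.pad_leading(Integer.to_string(val), count, "0")
  end
end

# Compare the fast path against a plain string-based padding for 0..99.
all_match? =
  Enum.all?(0..99, fn n ->
    IO.iodata_to_binary(ZeroPadSketch.zero_pad(n, 2)) ==
      String.pad_leading(Integer.to_string(n), 2, "0")
  end)

IO.inspect(all_match?, label: "fast path matches reference")
```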

Benchee + eprof results after applying this change:

Name                                        ips        average  deviation         median         99th %
Calendar.IOISO.datetime_to_string        4.28 M      233.68 ns  ±5211.52%         201 ns         321 ns
Calendar.ISO.datetime_to_string          3.70 M      270.57 ns  ±5605.04%         231 ns         361 ns

Comparison: 
Calendar.IOISO.datetime_to_string        4.28 M
Calendar.ISO.datetime_to_string          3.70 M - 1.16x slower +36.89 ns

Extended statistics: 

Name                                      minimum        maximum    sample size                     mode
Calendar.IOISO.datetime_to_string          160 ns    16775502 ns         5.26 M                   200 ns
Calendar.ISO.datetime_to_string            200 ns    17099465 ns         4.84 M                   230 ns

Memory usage statistics:

Name                                 Memory usage
Calendar.IOISO.datetime_to_string           464 B
Calendar.ISO.datetime_to_string             424 B - 0.91x memory usage -40 B

**All measurements for memory usage were the same**

Profiling Calendar.IOISO.datetime_to_string with eprof...

Profile results of #PID<0.1123558.0>
#                                           CALLS     % TIME µS/CALL
Total                                          21 100.0    7    0.33
anonymous fn/0 in CalendarStringBench.run/0     1  0.00    0    0.00
Calendar.IOISO.time_to_iodata_guarded/5         1  0.00    0    0.00
Calendar.IOISO.time_to_iodata_format/4          1  0.00    0    0.00
Calendar.IOISO.time_to_iodata/5                 1  0.00    0    0.00
Calendar.IOISO.naive_datetime_to_string/8       1  0.00    0    0.00
Calendar.IOISO.naive_datetime_to_string/7       1  0.00    0    0.00
Calendar.IOISO.naive_datetime_to_iodata/8       1  0.00    0    0.00
Calendar.IOISO.microseconds_to_iodata/2         1  0.00    0    0.00
Calendar.IOISO.date_to_iodata_guarded/4         1  0.00    0    0.00
Calendar.IOISO.date_to_iodata/4                 1  0.00    0    0.00
:erlang.integer_to_binary/1                     2 14.29    1    0.50
Calendar.IOISO.zero_pad/2                       7 14.29    1    0.14
:erlang.iolist_to_binary/1                      1 28.57    2    2.00
:erlang.apply/2                                 1 42.86    3    3.00

Profile done over 14 matching functions

Profiling Calendar.ISO.datetime_to_string with eprof...

Profile results of #PID<0.1123560.0>
#                                           CALLS     % TIME µS/CALL
Total                                          26 100.0    9    0.35
Calendar.ISO.time_to_iodata_guarded/5           1  0.00    0    0.00
Calendar.ISO.time_to_iodata_format/4            1  0.00    0    0.00
Calendar.ISO.time_to_iodata/5                   1  0.00    0    0.00
Calendar.ISO.naive_datetime_to_string/8         1  0.00    0    0.00
Calendar.ISO.naive_datetime_to_string/7         1  0.00    0    0.00
Calendar.ISO.naive_datetime_to_iodata/8         1  0.00    0    0.00
Calendar.ISO.microseconds_to_iodata/2           1  0.00    0    0.00
Calendar.ISO.date_to_iodata_guarded/4           1  0.00    0    0.00
Calendar.ISO.date_to_iodata/4                   1  0.00    0    0.00
anonymous fn/0 in CalendarStringBench.run/0     1 11.11    1    1.00
:erlang.apply/2                                 1 22.22    2    2.00
:erlang.iolist_to_binary/1                      1 22.22    2    2.00
:erlang.integer_to_binary/1                     7 22.22    2    0.29
Calendar.ISO.zero_pad/2                         7 22.22    2    0.29

Profile done over 14 matching functions

TIL [1, 2] is iodata but [1 | 2] is not

iex(1)> IO.iodata_to_binary([1, 2])
<<1, 2>>
iex(2)> IO.iodata_to_binary([1 | 2])
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: not an iodata term

    :erlang.iolist_to_binary([1 | 2])
    iex:2: (file)
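The difference comes from what is allowed in the tail position: the tail of an improper list in iodata must itself be a binary (or another list), while a bare integer is only valid as a list element. A minimal sketch, with a hypothetical helper that turns the error into a return value:

```elixir
defmodule IodataTailSketch do
  # Builds [head | tail] and reports whether the VM accepts it as iodata.
  def to_binary_or_error(head, tail) do
    try do
      {:ok, IO.iodata_to_binary([head | tail])}
    rescue
      ArgumentError -> :not_iodata
    end
  end
end

IO.inspect(IodataTailSketch.to_binary_or_error(1, [2]), label: "list tail")
IO.inspect(IodataTailSketch.to_binary_or_error(1, <<2>>), label: "binary tail")
IO.inspect(IodataTailSketch.to_binary_or_error(1, 2), label: "integer tail")
```

This is why the improper-list optimization only applies where the last element is already a binary or nested iodata, not where it is a character code.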

@michalmuskala
Member

michalmuskala commented Jan 6, 2025

One thing to be careful about with iodata is keeping it around for a long time: traversing iodata during GC is very, very expensive compared to binaries.
We've hit that in Jason before, where eagerly converting stuff to binaries, rather than keeping it as iodata, was beneficial.

The benchmarks here are only dealing with allocated memory, not retained memory, so they don't capture the issue. The way to capture the issue would be to encode a list of dates and return a list of encoded values (so they are kept around alive when a GC is triggered mid-iteration), rather than doing it one by one.
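A rough sketch of the kind of measurement described here, under several assumptions: it builds date-shaped iodata by hand (so it does not depend on the functions added in this PR), keeps the whole list alive, and times one forced GC of the owning process with `:timer.tc/1`. The module and function names are hypothetical, and this is not a substitute for a proper Benchee run.

```elixir
defmodule RetainedGcSketch do
  # Date-shaped nested iodata, similar in spirit to the PR's output.
  def as_iodata(i) do
    [Integer.to_string(2000 + rem(i, 25)), ?-, pad2(rem(i, 12) + 1), ?-, pad2(rem(i, 28) + 1)]
  end

  # The same value, eagerly flattened to a binary.
  def as_binary(i), do: IO.iodata_to_binary(as_iodata(i))

  defp pad2(n) when n < 10, do: [?0, n + ?0]
  defp pad2(n), do: Integer.to_string(n)

  # Runs in a fresh process so each measurement starts from a clean heap.
  def gc_micros(builder, count) do
    task =
      Task.async(fn ->
        retained = for i <- 1..count, do: builder.(i)
        {micros, true} = :timer.tc(fn -> :erlang.garbage_collect() end)
        # `retained` is used after the GC, so it is live while the GC runs.
        {micros, length(retained)}
      end)

    Task.await(task, :infinity)
  end
end

{iodata_us, _} = RetainedGcSketch.gc_micros(&RetainedGcSketch.as_iodata/1, 100_000)
{binary_us, _} = RetainedGcSketch.gc_micros(&RetainedGcSketch.as_binary/1, 100_000)
IO.puts("GC with retained iodata:   #{iodata_us} µs")
IO.puts("GC with retained binaries: #{binary_us} µs")
```

The absolute numbers will vary by machine and OTP version; the point is only that the retained values are alive during the collection, which the per-call benchmarks above do not exercise.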

It's probably also possible to rewrite this to take advantage of the new binary append optimisation in OTP 26, where the multiple binary re-allocations in the middle hopefully shouldn't happen

@dkuku
Contributor Author

dkuku commented Jan 6, 2025

Good point but without a proper benchmark it's hard to compare.
The intermediate short strings that are concatenated also need to be GCed.

@michalmuskala
Member

If the code is written to take advantage of mutable binaries, it's possible to avoid that - there will be just one binary buffer that gets appended to
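As a sketch of the "single buffer that gets appended to" shape: the accumulator below is only ever used as the left-hand side of an append, which is the pattern the binary append optimisation targets. The module name and helpers are hypothetical, the `pad/2` helper still creates small intermediate binaries, and whether the optimisation actually applies depends on the OTP version and compiler, so treat this as an illustration of the code shape rather than a measured result.

```elixir
defmodule BinaryAppendSketch do
  def date_to_string(year, month, day) do
    append_all([pad(year, 4), "-", pad(month, 2), "-", pad(day, 2)], <<>>)
  end

  # Each step appends to the accumulator and never reads it otherwise.
  defp append_all([part | rest], acc), do: append_all(rest, <<acc::binary, part::binary>>)
  defp append_all([], acc), do: acc

  # Simple string-based padding; a tuned version would write digits directly.
  defp pad(val, count), do: String.pad_leading(Integer.to_string(val), count, "0")
end

IO.puts(BinaryAppendSketch.date_to_string(2024, 12, 30))
```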

@dkuku
Contributor Author

dkuku commented Jan 7, 2025

You made me curious.
I prepared a branch and ran a benchmark.
The speed is the same as with binary concatenation, but the memory usage reported by benchee is over 3 KB (benchmark result in the PR description).
It's probably because I'm appending to the buffer.

@dkuku dkuku mentioned this pull request Jan 8, 2025