-
Notifications
You must be signed in to change notification settings - Fork 3.5k
dk_use_iodata_for_datetime_formatting #14130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dk_use_iodata_for_datetime_formatting #14130
Conversation
lib/elixir/lib/calendar/iso.ex
Outdated
[ | ||
time_to_iodata_format(hour, minute, second, format), | ||
".", | ||
microsecond |> zero_pad(6) |> IO.iodata_to_binary() |> binary_part(0, precision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there may be a better way for this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass the precision to zero pad and do it in one pass?
lib/elixir/lib/calendar/iso.ex
Outdated
3 -> ["000", num] | ||
4 -> ["0000", num] | ||
5 -> ["00000", num] | ||
6 -> ["000000", num] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we would ever reach this, since byte_size is never 0. We should also check that count <= 6
.
Yes, please do complete this pull request and expose these functions. Then please change JSON.Encoder to use the iodata variants for Calendar.ISO. :) Thank you! |
lib/elixir/lib/calendar/iso.ex
Outdated
defp date_to_string_guarded(year, month, day, :extended) do | ||
zero_pad(year, 4) <> "-" <> zero_pad(month, 2) <> "-" <> zero_pad(day, 2) | ||
defp date_to_iodata_guarded(year, month, day, :extended) do | ||
[zero_pad(year, 4), "-", zero_pad(month, 2), "-", zero_pad(day, 2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[zero_pad(year, 4), "-", zero_pad(month, 2), "-", zero_pad(day, 2)] | |
[zero_pad(year, 4), ?-, zero_pad(month, 2), ?-, zero_pad(day, 2)] |
This should save a small amount of mem, but hey, why not!
lib/elixir/lib/calendar/iso.ex
Outdated
|
||
defp zero_pad(val, count) do | ||
"-" <> zero_pad(-val, count) | ||
["-", zero_pad(-val, count)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
["-", zero_pad(-val, count)] | |
[?-, zero_pad(-val, count)] |
lib/elixir/lib/calendar/iso.ex
Outdated
defp time_to_string_format(hour, minute, second, :extended) do | ||
zero_pad(hour, 2) <> ":" <> zero_pad(minute, 2) <> ":" <> zero_pad(second, 2) | ||
defp time_to_iodata_format(hour, minute, second, :extended) do | ||
[zero_pad(hour, 2), ":", zero_pad(minute, 2), ":", zero_pad(second, 2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here (as below):
[zero_pad(hour, 2), ":", zero_pad(minute, 2), ":", zero_pad(second, 2)] | |
[zero_pad(hour, 2), ?:, zero_pad(minute, 2), ?:, zero_pad(second, 2)] |
Do we really use other precisions than 0,3 and 6?
|
Precision can be any number between 0 and 6, even though we commonly use 0, 3, and 6. |
@whatyouhide I updated it but according to benchee the memory usage does not change - it's still 456B
|
@josevalim This is currently implemented as:
I would have to create something like |
No need. That’s to_string for the iso calendar! |
All of the literals point to a constant, either ?0 or “0” take no extra runtime memory. The reason ?0, ?0 takes more than “00” is because it takes two entries in the list. Overall, a single character should be an integer, multiple should be a binary. |
c2873fe
to
7025133
Compare
lib/elixir/lib/calendar/iso.ex
Outdated
iex> Calendar.ISO.time_to_iodata(2, 2, 2, {2, 6}) | ||
[[["0", "2"], 58, ["0", "2"], 58, ["0", "2"]], 46, ["00000", "2"]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we should show the exact iodata, or a call to IO.iodata_to_binary()
(which is what we usually do)
lib/elixir/lib/calendar/iso.ex
Outdated
|
||
defp zero_pad(val, count) do | ||
"-" <> zero_pad(-val, count) | ||
[?-, zero_pad(-val, count)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a further optimization if we're using IO-data is to use improper lists when possible, replacing the last ,
by |
:
[?-, zero_pad(-val, count)] | |
[?- | zero_pad(-val, count)] |
I haven't benchmarked it and wouldn't expect it to be a huge difference, but it should create less cons cells in theory?
(we might need to use @dialyzer :no_improper_lists
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sabiwara it makes a difference in the memory usage:
Name ips average deviation median 99th %
Calendar.IOISO.datetime_to_iodata 4.89 M 204.42 ns ±19386.20% 160 ns 271 ns
Memory usage statistics:
Name Memory usage
Calendar.IOISO.datetime_to_iodata 376 B
**All measurements for memory usage were the same**
def microseconds_to_iodata(microsecond, 6), do: zero_pad(microsecond, 6) | ||
|
||
defp microseconds_to_iodata(microsecond, precision) do | ||
def microseconds_to_iodata(microsecond, precision) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need it in the next pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing, thanks @dkuku 💜
lib/elixir/lib/calendar/iso.ex
Outdated
Converts the given date into a iodata. | ||
Look at date_to_string/4 for more information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converts the given date into a iodata. | |
Look at date_to_string/4 for more information | |
Converts the given date into an iodata. | |
See `date_to_string/4` for more information. |
lib/elixir/lib/calendar/iso.ex
Outdated
Converts the given time into a iodata. | ||
Look at time_to_string/5 for more information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converts the given time into a iodata. | |
Look at time_to_string/5 for more information | |
Converts the given time into an iodata. | |
See `time_to_string/5` for more information. |
lib/elixir/lib/calendar/iso.ex
Outdated
Converts the given naive_datetime into a iodata. | ||
Look at naive_datetime_to_iodata/8 for more information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converts the given naive_datetime into a iodata. | |
Look at naive_datetime_to_iodata/8 for more information | |
Converts the given naive_datetime into an iodata. | |
See `naive_datetime_to_iodata/8` for more information. |
lib/elixir/lib/calendar/iso.ex
Outdated
Converts the given datetime into a iodata. | ||
Look at datetime_to_iodata/12 for more information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converts the given datetime into a iodata. | |
Look at datetime_to_iodata/12 for more information | |
Converts the given datetime into an iodata. | |
See `datetime_to_iodata/12` for more information. |
Co-authored-by: Jean Klingler <sabiwara@proton.me>
💚 💙 💜 💛 ❤️ |
There is one small micro optimization we can make here that is still visible: a special case for zero_pad with 2 elements, which is called quite often. The math is really simple and doesn't require calling defp zero_pad(val, 2) when val >= 0 and val < 10 do
[?0, val + ?0]
end
defp zero_pad(val, 2) when val >= 10 and val < 100 do
tens = div(val, 10)
[tens + ?0, val - tens * 10 + ?0]
end Benchee + fprof results after applying this change:
TIL iex(1)> IO.iodata_to_binary([1, 2])
<<1, 2>>
iex(2)> IO.iodata_to_binary([1 | 2])
** (ArgumentError) errors were found at the given arguments:
* 1st argument: not an iodata term
:erlang.iolist_to_binary([1 | 2])
iex:2: (file) |
One thing to be careful with iodata is when you keep it around for a long time - traversing iodata during GC is very, very expensive compared to binaries. The benchmarks here are only dealing with allocated memory, not retained memory, so they don't capture the issue. The way to capture the issue would be to encode a list of dates and return a list of encoded values (so they are kept around alive when a GC is triggered mid-iteration), rather than doing it one by one. It's probably also possible to rewrite this to take advantage of the new binary append optimisation in OTP 26, where the multiple binary re-allocations in the middle hopefully shouldn't happen |
Good point but without a proper benchmark it's hard to compare. |
If the code is written to take advantage of mutable binaries, it's possible to avoid that - there will be just one binary buffer that gets appended to |
You made me curious. |
The benchmark with go made me think if we can get any speed gains in synthetic tests :D
This pr changes datetime formatting functions:
around 55% of memory usage and 3x speedup when serializing with optional 4x when using iodata ;)
It also adds new functions
*_to_iodata
that can be used by libraries that operate on iodata like ecto or jsonresult