Add encoders to the options #28
Hello!
I'm unsure about adding this level of flexibility to the package. How does it benchmark against the previous version?
I think there is no loss here.
I've updated the PR with minor changes to maintain consistency and improve performance based on @michalmuskala's comment in #27.
Iirc using BIF map_size =:= 0 may be better, but worth measuring.
The performance is almost the same using

./erlperf 'map_size(#{a=>a,b=>b,c=>c,d=>d,e=>e}) =:= 0.' '#{a=>a,b=>b,c=>c,d=>d,e=>e} =:= #{}.'
Code                                          ||        QPS       Time   Rel
#{a=>a,b=>b,c=>c,d=>d,e=>e} =:= #{}.           1   76647 Ki      13 ns  100%
map_size(#{a=>a,b=>b,c=>c,d=>d,e=>e}) =:= 0.   1   76297 Ki      13 ns  100%

But

with_overhead(M) when is_map(M), map_size(M) =:= 0 -> M.
without_overhead(M) when M =:= #{} -> M.

BTW, the benchmark is almost the same:

./erlperf 'M = #{a=>a,b=>b,c=>c,d=>d,e=>e}, is_map(M) andalso map_size(M) =:= 0.' 'M = #{a=>a,b=>b,c=>c,d=>d,e=>e}, M =:= #{}.'
Code                                                                   ||        QPS       Time   Rel
M = #{a=>a,b=>b,c=>c,d=>d,e=>e}, M =:= #{}.                             1   77480 Ki      12 ns  100%
M = #{a=>a,b=>b,c=>c,d=>d,e=>e}, is_map(M) andalso map_size(M) =:= 0.   1   77461 Ki      12 ns  100%
FWIW, the last change in the PR does not remove the overhead.
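To make the two guard styles discussed above concrete, here is a minimal sketch. The module and function names are hypothetical and are not part of thoas. It relies on the standard Erlang rule that a guard BIF which raises inside a guard simply makes that guard fail, so `map_size/1` needs no preceding `is_map/1` check to be safe.

```erlang
-module(empty_map_guards).
-export([by_size/1, by_match/1]).

%% Guard on map_size/1 alone. For a non-map argument map_size/1 raises
%% inside the guard, the guard fails, and the next clause is tried.
by_size(M) when map_size(M) =:= 0 -> empty;
by_size(_) -> not_empty.

%% Guard by comparing against the empty map literal.
by_match(M) when M =:= #{} -> empty;
by_match(_) -> not_empty.
```

Both functions classify every term the same way; which one is faster is what the erlperf runs above try to measure.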
Looking at the code again, I think the encoders should be exposed, e.g. I need to parse a key-value pair (which is a

encode(Data) ->
    Opts = #{encoders => #{proplist => fun proplist/3}},
    thoas:encode(Data, Opts).

proplist({{_,_,_},{_,_,_}} = Date, _Escape, _Encoders) ->
    datetime(Date); % <- custom
proplist(KV, Escape, Encoders) ->
    thoas_proplist_encoder:encode(KV, Escape, Encoders). % <- new module (exposed encoder)

datetime(Date) ->
    Date.

So, a

-module(thoas_encoder).
% Just an example, I'm unsure about the types now.
-callback encode(Value, Escape, Encoders) -> Result
    when Value :: term(),
         Escape :: function(),
         Encoders :: map(),
         Result :: iodata().

What do you think? Any other suggestions?
@williamthome I'm not sure you do. Date-time is a 2-tuple, not a proplist so would hit the

Defining a behavior would definitely be helpful from a documenting-how-to-write-a-custom-encoder perspective.
Yes, you are right, my example is wrong.

proplist([{{_,_,_},{_,_,_}} = Date | T], Escape, Encoders) ->
    % loop

But it makes no sense.
Anyway, the
Thought I'd just drop this here as a potential example for documentation updates should this be accepted.

thoas:encode(MyData, #{encoders => #{unknown => fun my_encoder:encode/3}}).

%% Example custom encoder module
%% Extends the default encoder module to
%% support IP addresses and CIDR encoding
-module(my_encoder).

%% API
-export([encode/3]).

%% IPv4 address
encode({O1, O2, O3, O4} = InVal, Escape, #{binary := Encode} = Encoders)
  when is_integer(O1) andalso O1 >= 0 andalso O1 =< 255 andalso
       is_integer(O2) andalso O2 >= 0 andalso O2 =< 255 andalso
       is_integer(O3) andalso O3 >= 0 andalso O3 =< 255 andalso
       is_integer(O4) andalso O4 >= 0 andalso O4 =< 255 ->
    case inet:ntoa(InVal) of
        {error, einval} ->
            error(invalid_type);
        IpStr ->
            Encode(list_to_binary(IpStr), Escape, Encoders)
    end;
%% IPv4 CIDR
encode({{O1, O2, O3, O4} = Ip, Mask}, Escape, #{binary := Encode} = Encoders)
  when is_integer(O1) andalso O1 >= 0 andalso O1 =< 255 andalso
       is_integer(O2) andalso O2 >= 0 andalso O2 =< 255 andalso
       is_integer(O3) andalso O3 >= 0 andalso O3 =< 255 andalso
       is_integer(O4) andalso O4 >= 0 andalso O4 =< 255 andalso
       is_integer(Mask) andalso Mask >= 0 andalso Mask =< 32 ->
    case inet:ntoa(Ip) of
        {error, einval} ->
            error(invalid_type);
        IpStr ->
            MaskBin = integer_to_binary(Mask),
            IpBin = list_to_binary(IpStr),
            Encode(<<IpBin/binary, "/", MaskBin/binary>>, Escape, Encoders)
    end;
%% IPv6 address
encode({O1, O2, O3, O4, O5, O6, O7, O8} = InVal, Escape, #{binary := Encode} = Encoders)
  when is_integer(O1) andalso O1 >= 0 andalso O1 =< 65535 andalso
       is_integer(O2) andalso O2 >= 0 andalso O2 =< 65535 andalso
       is_integer(O3) andalso O3 >= 0 andalso O3 =< 65535 andalso
       is_integer(O4) andalso O4 >= 0 andalso O4 =< 65535 andalso
       is_integer(O5) andalso O5 >= 0 andalso O5 =< 65535 andalso
       is_integer(O6) andalso O6 >= 0 andalso O6 =< 65535 andalso
       is_integer(O7) andalso O7 >= 0 andalso O7 =< 65535 andalso
       is_integer(O8) andalso O8 >= 0 andalso O8 =< 65535 ->
    case inet:ntoa(InVal) of
        {error, einval} ->
            error(invalid_type);
        IpStr ->
            Encode(list_to_binary(IpStr), Escape, Encoders)
    end;
%% IPv6 CIDR
encode({{O1, O2, O3, O4, O5, O6, O7, O8} = Ip, Mask}, Escape, #{binary := Encode} = Encoders)
  when is_integer(O1) andalso O1 >= 0 andalso O1 =< 65535 andalso
       is_integer(O2) andalso O2 >= 0 andalso O2 =< 65535 andalso
       is_integer(O3) andalso O3 >= 0 andalso O3 =< 65535 andalso
       is_integer(O4) andalso O4 >= 0 andalso O4 =< 65535 andalso
       is_integer(O5) andalso O5 >= 0 andalso O5 =< 65535 andalso
       is_integer(O6) andalso O6 >= 0 andalso O6 =< 65535 andalso
       is_integer(O7) andalso O7 >= 0 andalso O7 =< 65535 andalso
       is_integer(O8) andalso O8 >= 0 andalso O8 =< 65535 andalso
       %% IPv6 prefix lengths range from 0 to 128
       is_integer(Mask) andalso Mask >= 0 andalso Mask =< 128 ->
    case inet:ntoa(Ip) of
        {error, einval} ->
            error(invalid_type);
        IpStr ->
            MaskBin = integer_to_binary(Mask),
            IpBin = list_to_binary(IpStr),
            Encode(<<IpBin/binary, "/", MaskBin/binary>>, Escape, Encoders)
    end;
encode(_InVal, _Escape, _Encoders) ->
    error(invalid_type).
The PR is now up to date with the main branch.
You don't need both I think |
I also think so. The current implementation of this PR can be with
I still think date/datetime encoding (and not decoding, strangely enough?) does not belong in the JSON library itself, since it is not part of the JSON spec (same with inet data, and whatever else we can think of...). May I suggest a structure where encoders are not based on hard-coded patterns but are passed through a sequence of modules/functions instead? Something like That way, it would be easy to add some dependency like
This way, overriding a specific encoder is not possible. Allowing users to override specific encoders gives them more control, IMHO.
You can override it if you match on the same pattern as the encoder you want to override. E.g.

-module(my_overrides).
-export([encode/1]).

encode(Integer) when is_integer(Integer) -> "foo";
encode(_) -> pass. % Or whatever the API will be...

And then use it like so
The problem with the map-based solution is that you can't add new patterns. Let's say I want to encode
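The "sequence of modules/functions" idea can be sketched roughly as follows. This is only an illustration of the proposal, not code from thoas or from this PR: the module name, the arity-1 encoder shape, and the `pass` return value are all assumptions taken from the "Or whatever the API will be" example above.

```erlang
-module(encoder_chain).
-export([encode/2, demo/0]).

%% Try each encoder in order. The first one that returns something other
%% than the atom `pass` wins; `pass` is the hypothetical "not mine" signal.
encode(Value, [Encoder | Rest]) ->
    case Encoder(Value) of
        pass -> encode(Value, Rest);
        Encoded -> Encoded
    end;
encode(Value, []) ->
    error({no_encoder, Value}).

%% Hypothetical override: every integer becomes the JSON string "foo".
override(Int) when is_integer(Int) -> <<"\"foo\"">>;
override(_) -> pass.

%% Hypothetical stand-in for a library's default encoder.
default(Int) when is_integer(Int) -> integer_to_binary(Int);
default(true) -> <<"true">>;
default(_) -> pass.

%% The override runs first, so it shadows the default for integers
%% and passes everything else through to the default encoder.
demo() ->
    Chain = [fun override/1, fun default/1],
    {encode(1, Chain), encode(true, Chain)}.
```

With this shape, new patterns are added by putting another function in the list, and an existing encoder is overridden by placing a function that matches the same pattern earlier in the list. The cost is one extra function call per encoder tried.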
Hmm, sorry if I haven't understood what you said, but you can define any custom encoder using the

thoas:encode(MyData, #{encoders => #{unknown => fun my_encoder:encode/3}}).

The default encoders will be merged with these options, so the result will be this:

#{
    atom => fun(Atom, Escape, _) -> encode_atom(Atom, Escape) end,
    binary => fun(Bin, Escape, _) -> encode_string(Bin, Escape) end,
    integer => fun(Int, _, _) -> integer(Int) end,
    float => fun(Float, _, _) -> float(Float) end,
    proplist => fun map_naive/3,
    list => fun list/3,
    map => fun map/3,
    date => fun(Date, Escape, _) -> date(Date, Escape) end,
    datetime => fun(Date, Escape, _) -> datetime(Date, Escape) end,
    unknown => fun my_encoder:encode/3 % <- CUSTOM HERE
}
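A minimal sketch of how such a merge could work, assuming the defaults live in a map and user options are layered on top with `maps:merge/2`. The module name, the two-key default map, and the identity escape function are hypothetical simplifications; the actual code in this PR may be structured differently.

```erlang
-module(encoder_opts).
-export([merge/1, encode/2]).

%% Hypothetical defaults, reduced to two keys to keep the sketch short.
defaults() ->
    #{integer => fun(Int, _Escape, _Encoders) -> integer_to_binary(Int) end,
      unknown => fun(Value, _Escape, _Encoders) -> error({invalid_type, Value}) end}.

%% maps:merge/2 keeps the value from the second map when a key is in both,
%% so user-supplied encoders override the defaults.
merge(UserEncoders) ->
    maps:merge(defaults(), UserEncoders).

%% Pick an encoder by type; anything that is not an integer goes to `unknown`.
encode(Value, UserEncoders) ->
    Encoders = merge(UserEncoders),
    Key = case is_integer(Value) of
              true -> integer;
              false -> unknown
          end,
    Encode = maps:get(Key, Encoders),
    Escape = fun(Bin) -> Bin end, % placeholder escape function
    Encode(Value, Escape, Encoders).
```

Because the lookup is by a fixed set of keys, a user can replace any existing encoder or supply `unknown`, but cannot register a new key that the dispatcher does not already know about, which is the limitation raised earlier in the thread.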
Probably not the greatest idea, but what about adding a This would allow for both cases then. You'd be able to run your custom encoder for special shapes before the data is handed to the default encoders and additionally add an encoder for Alternatively just support the
IMHO, the current implementation of the PR is the cleanest: it adds no extra overhead to the code and does not decrease performance.
Hello! After some thought I don't believe that the API of this library should be expanded in this way. My goal was to make a library as simple as possible with speeds similar to Jason, and I don't think this contributes to that. Thank you once again. If you wish to publish a JSON project do let me know and I'll add a link to the README.
Close #27

This PR adds the possibility to define custom encoders. In this way, a custom encoder can be defined in the unknown option for an unmatched value.

The original idea is from @leonardb (thanks). I found it really useful.

This PR solves the inets issue in #25.

I've not implemented the date encoders that @leonardb has implemented in #26.

Suggestions welcome.