-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use conversion to sets to check uniqueItems (performance improvement) #116
Use conversion to sets to check uniqueItems (performance improvement) #116
Conversation
f0c6c74
to
67cfcaa
Compare
There was an issue with the macro and OTP releases 20 and lower. I hope it is fixed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some issues with CI. Seems latest version of proper
requires at least OTP-20. I think the options are:
- Use older version of
proper
- Create a separate rebar.config profile and make CI to only run proper on OTP >= 20
- Drop support for OTP-19 (need confirmation from @andreineculau for that)
src/jesse_lib.erl
Outdated
normalize_and_sort_object(Value) when is_map(Value)-> | ||
Value; | ||
normalize_and_sort_object(Value) -> | ||
maps:from_list(jesse_json_path:unwrap_value(Value)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maps:from_list(jesse_json_path:unwrap_value(Value)). | |
maps:from_list([{Key, normalize_and_sort(Value)} || {Key, Value} <- jesse_json_path:unwrap_value(Value)]). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you for spotting that. I have added a test to make sure that this error doesn't come up again. The only thing with your suggestion is that Values that are maps also need this "normalize_and_sort" in their values, so my proposal in the next commit will be to apply the normalization in the is_map function.
@@ -749,26 +749,42 @@ check_unique_items(_, false, State) -> | |||
State; | |||
check_unique_items([], true, State) -> | |||
State; | |||
check_unique_items([_], true, State) -> | |||
State; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea! 👍
test/jesse_tests_draft3_SUITE.erl
Outdated
do_test("uniqueItemsPerformance", Config), | ||
End = erlang:system_time(millisecond), | ||
ElapsedMilliseconds = End - Start, | ||
?assert(ElapsedMilliseconds < 100). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this has a risk of being unstable if CI agent is not fast enough... Do you think there is enough margin for it to not be flaky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it failed in CI at least once because it took more than 100ms. I think we need to either raise the threshold or not run this test on CI. Also, would be nice to log the ElapsedMilliseconds
to see how much it have been exceeded
?assert(ElapsedMilliseconds < 100). | |
?assert(ElapsedMilliseconds < 100, [{elapsed, ElapsedMilliseconds}]). |
but I don't remember if ?assert/2
is supported on older OTP versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately assert with comment seems to have been added in OTP20
erlang/otp@d2be06f#diff-9b2a91139c4c8fff63934580389d53a634eea53927abba13e0a49a59f239a376
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a bit ugly to have this performance tests within a CT suite.
I thought it would be safe to put it with 100 ms. since in my laptop it takes below 20 ms to run the tests with the improvement and around 800 ms without the improvement. But I see now that it doesn't pass for the CI workers.
If it is ok with you, I will remove it.
I think the easiest thing is to use Proper version 1.2.0 for OTP >= 20 and 1.4.0 otherwise. I am going to assume that you mean to crete "rebar.OTPXX.config" files for each release as there are now rebar3.OTPXX files and use links for those releases that are the same, right? i am going to go ahead and try to change the Makefile in that direction. |
339fdd4
to
fde5e93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there!
@@ -1,5 +1,5 @@ | |||
%%-*- mode: erlang -*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please let's keep rebar.config
file as well (maybe having it as a symlink to rebar.OTP20.config
or other way around). I'm afraid it might cause issues when jesse
is added as a dependency if there is no rebar.config
Makefile
Outdated
@@ -69,38 +71,42 @@ test/JSON-Schema-Test-Suite/tests: | |||
.PHONY: test | |||
# Would be nice to include elvis to test, but it fails on OTP-18 | |||
# test: elvis | |||
test: eunit ct xref dialyzer cover | |||
test: eunit ct xref dialyzer cover proper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's put proper
before cover
, so coverage generated by running proper
is also included to cover report
fde5e93
to
9996923
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with it, thank you!
@andreineculau any opinion?
I'll merge, say, 25th of January if noone complains.
Merged. Thank you for your contribution and patience =) |
This pull request addresses #115
The current implementation of the check for uniqueItems condition in a JSON array seems to be inefficient, with an implementation that has approximately O(n^2) complexity on the number of elements in the array.
This proposal is to change the implementation by using a conversion to a set of the array under uniqueItems condition, and in this way eliminating duplicates efficiently. In order to do this, the format for JSON objects within the array is normalized to Erlang maps.
Implementation of normalization and is_equal is moved to jesse_lib.