Performance OTP27 on binary_to_term slower #8329
Comments
Thanks, I have a rough idea of what might have caused this. I'll have a deeper look at this later in the week. :-)
I've opened #8347, which should improve things a bit. I'm seeing slightly better results locally, but I'm still a bit puzzled as to why it was so much slower on your machine.
I've run Thomas's test on my machines:
Just to be sure, I ran the test again on a Mac M2 Pro, Ventura 13.6, with results comparable to the earlier run:
I'll try to build PR #8347 as well to see if I can relate to the above numbers.
This is tangential to the main topic here, but how come your M2 Pro is twice as fast as my M1 Pro @ThomasArts? :) I don't see you on the Erlanger slack. If you don't mind joining and chatting there for a bit, I'd appreciate that. Hopefully we can all learn something about Erlang/OTP performance on Apple Silicon (or in general, based on what we find).
Have you compiled all Erlang code you are running with the Erlang compiler in Erlang/OTP 27? The format of the type information in BEAM files is different in OTP 26 and OTP 27. If the format of type information is not the expected one, the JIT cannot do any type-guided optimizations, and will potentially emit worse code. This is just a wild guess.
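One way to check the guess above is to ask each loaded module which compiler version produced it. The sketch below is only an illustration: the module name `compiler_versions` and its functions are hypothetical, and it assumes that the `version` entry in `module_info(compile)` (the version of the compiler application) is a good enough proxy for the OTP release used to build the module.

```erlang
-module(compiler_versions).
-export([of_module/1, summary/0]).

%% Compiler application version recorded in the module's compile info,
%% or 'undefined' if the entry is missing (e.g. stripped BEAM files).
of_module(Mod) ->
    proplists:get_value(version, Mod:module_info(compile), undefined).

%% Group all currently loaded modules by the compiler version that built them.
%% More than one key in the result hints at a mix of compiler versions.
summary() ->
    lists:foldl(
      fun({Mod, _Path}, Acc) ->
              Vsn = of_module(Mod),
              maps:update_with(Vsn, fun(Ms) -> [Mod | Ms] end, [Mod], Acc)
      end,
      #{},
      code:all_loaded()).
```

Calling `maps:keys(compiler_versions:summary())` in the shell lists the distinct compiler versions among the loaded modules; a single key suggests everything was built with the same compiler.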
Yes, I compile the code with the version I use at runtime. I built from
Oh, I found the culprit. When debugging a different issue recently I set the CFLAGS to
Just as an anecdote: the issue I was debugging turned out to be a problem running a Docker image built for x86 on an ARM MacBook. Thanks to all the Docker/Rosetta2 magic that generally works, but recently we made some changes in RabbitMQ that probably triggered some bug (in Rosetta2 I guess? Not sure, too much low-level magic). Anyway, we get some crazy failures where the stacktraces reported don't match the source code, which is why I suspected that maybe GCC optimisations had something to do with that.
I'm interested in figuring out a way to tell which options were used to build ERTS. Specifically, I want
@max-au there is
1> proplists:get_value(cflags, erlang:system_info(compile_info)).
"-Werror=undef -Werror=implicit -Werror=return-type -fno-common -g -O2
-I...asdf_24.3.4.16/otp_src_24.3.4.16/erts/x86_64-apple-darwin22.6.0
-DHAVE_CONFIG_H
-Wall -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Wdeclaration-after-statement
-DUSE_THREADS -D_THREAD_SAFE -D_REENTRANT -DPOSIX_THREADS -DBEAMASM=1"
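Since a non-default CFLAGS setting turned out to be the culprit earlier in this thread, it may help to pull the optimization flags out of that string before benchmarking. This is a minimal sketch: the module name `erts_flags` and its functions are hypothetical, and it assumes the `cflags` entry is a flat string as in the shell output above.

```erlang
-module(erts_flags).
-export([cflags/0, opt_levels/0, opt_levels/1]).

%% C compiler flags ERTS was built with ("" if the entry is absent).
cflags() ->
    proplists:get_value(cflags, erlang:system_info(compile_info), "").

%% Optimization flags (-O0, -O2, -Og, ...) of the running emulator.
opt_levels() ->
    opt_levels(cflags()).

%% Same, for an arbitrary flag string: split on whitespace and keep
%% every token that starts with "-O".
opt_levels(Flags) ->
    [Tok || "-O" ++ _ = Tok <- string:lexemes(Flags, " \n\t")].
```

For the flags shown above, `erts_flags:opt_levels/0` would be expected to report `-O2`; an empty list or a lower level would be worth ruling out before comparing timings across builds.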
Describe the bug
It seems that binary_to_term got slower.
To Reproduce
I used the following simple way to generate a rather large term and then N copies of that term as a binary in a list.
On each of the copies we run binary_to_term. The expectation is that it is as fast as on OTP-26.1.2... but it isn't.
On OTP26, roughly:
But on the latest master (source-8504d0e0b8) the 10k test gets much slower!
Test program
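The reporter's test program is not reproduced here. As a rough stand-in, the sketch below follows the description above: build a fairly large term, encode it, put N copies of the binary in a list, and time binary_to_term over the list. The module name `b2t_bench`, the shape of the term, and its size are all assumptions made for illustration, so absolute numbers will not match the ones reported in this issue.

```erlang
-module(b2t_bench).
-export([run/1, make_term/1]).

%% Hypothetical stand-in for the "rather large term": a list of maps
%% mixing integers, binaries, atoms, floats and tuples.
make_term(Size) ->
    [#{id => I,
       name => integer_to_binary(I),
       tags => [a, b, c],
       pos => {I, I * 2, float(I)}}
     || I <- lists:seq(1, Size)].

%% Encode one term, build a list of N copies of the resulting binary
%% (the copies share the same underlying binary), then time decoding
%% every element. Returns the elapsed time in microseconds.
run(N) ->
    Bin = term_to_binary(make_term(1000)),
    Bins = lists:duplicate(N, Bin),
    Decode = fun(B) -> _ = binary_to_term(B), ok end,
    {Micros, ok} = timer:tc(fun() -> lists:foreach(Decode, Bins) end),
    Micros.
```

Running `b2t_bench:run(10000)` on builds of both releases gives a comparable pair of timings; repeating the call a few times helps smooth out noise from garbage collection and scheduling.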
Expected behavior
Equal speeds are expected
Affected versions
This seems to have been introduced in commit 24ef4cb for OTP27.
The last commit with faster times is: 49024e8
Additional context
This is observed when testing riak and may be related to issue: #8229