-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[elixir] Version and code updates #9198
[elixir] Version and code updates #9198
Conversation
fortunes = [additional_fortune | Repo.all(Fortune)] | ||
fortunes = | ||
[additional_fortune | Repo.all(Fortune)] | ||
|> Enum.sort_by(& &1.message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you benchmarked this one? I actually expect sort
to be more efficient, because sort_by
needs to do additional passes on the data.
@@ -29,7 +30,7 @@ defmodule HelloWeb.PageController do | |||
worlds = | |||
Stream.repeatedly(&random_id/0) | |||
|> Stream.uniq() | |||
|> Stream.map(fn idx -> Repo.get(World, idx) end) | |||
|> Stream.map(&Repo.get(World, &1)) | |||
|> Enum.take(size(params["queries"])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can optimize this in two ways:
- Wrap the whole operation on a
Repo.checkout
, so we avoid checking out the connection multiple times - Use Task.async/await or Task.async_stream so we run the queries in parallel
# If this is not sorted it sometimes generates | ||
# FAIL for http://tfb-server:8080/updates/20 | ||
# Only 20470 executed queries in the database out of roughly 20480 expected. | ||
|> Enum.sort_by(& &1.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know why we need to sort it? I can't see anything that would explain it. :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure on this. I was digging into why this intermittently failed, found the Rails version did a sort at the end, and found comments in other languages about this exact issue.
FrameworkBenchmarks/frameworks/Rust/tide/src/handlers.rs
Lines 124 to 137 in db30328
// This line is required to pass test verification otherwise we get: | |
// | |
// ``` | |
// FAIL for http://tfb-server:8080/updates/20 | |
// Only 20470 executed queries in the database out of roughly 20480 expected. | |
// PASS for http://tfb-server:8080/updates/20 | |
// Rows read: 10128/10240 | |
// FAIL for http://tfb-server:8080/updates/20 | |
// Only 10118 rows updated in the database out of roughly 10240 expected. | |
// ``` | |
// | |
// I don't know why this is the case, but I copied it from the actix | |
// implementation and here it shall stay. | |
loaded_worlds.sort_by_key(|loaded_world| loaded_world.id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing. I don't understand why but at least it is "consistent". :) I will try to dig deeper later.
And, if possible, please let me know if the parallelism or Repo.checkout changes I suggest above are allowed. I think they could improve the timing of a few endpoints. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not affiliated with this project other than enjoying Elixir wanting to make it show well. ;)
Your feedback is greatly appreciated. IMHO using Repo.checkout
and Tasks
are within both the rules of many of the tests and within spirit of the benchmarks. I can make a PR for those but won't be able to until ~7 hours from now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thank you! ❤️
After thinking a bit about this, I would probably write this as:
Stream.repeatedly(&random_id/0)
|> Stream.uniq()
|> Enum.take(size(params["queries"]))
|> Enum.sort()
|> Enum.map(fn id ->
world = Repo.get(World, id)
%{id: world.id, randomnumber: :rand.uniform(@random_max)}
end)
The first three parts of the pipeline can be abstracted into a function which we can reuse across actions:
def sample(params_count) do
Stream.repeatedly(&random_id/0)
|> Stream.uniq()
|> Enum.take(size(params_count))
end
EDIT 1: I also think we can remove the plug :accepts
from the router. :)
EDIT 2: I would probably start with the Repo.checkout version. However, I am also worried that the contention we are seeing on single query for c=512 is related to the pool, I will have to investigate that on the Ecto side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some iterations of running benchmarks and these are my observations.
- The greatest impact seems to be
Repo.checkout
where there are a lot of queries. On examples with a very large number of queries and concurrency reduced the time by 200% - There was a small but consistent reduction simply by building the ids first.
- Parallelization using
Task
seems to have had a negative effect. I'm not confident on this because the difference was in the margin of error kind of range and inconsistent. I'm guessing because the the db is so small and on the same docker container that running the queries is nearly no cost?
Putting in a new PR with the changes that have a clear benefit.
Updates the version of Erlang and Elixir for most current. There have been significant performance changes over the last year which should have a noticeable impact on performance. This didn't require and changes to the web/test code itself.
Updates the version of Phoenix and some other libraries to most current. Again there were some optimizations which should have an impact on the overall performance. This did require a small confirmation changes to how libraries are loaded.
Remove older
cowboy
server in favor of the now defaultbandit
server.Some smaller optimizations to use
sort_by
instead ofsort
.Fix problem with
update
endpoint which intermittently returned errors.Fix caching that was not working.