-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster: small performance optimisations #780
Conversation
a27b28f
to
095c6fb
Compare
Keyspace strategies were cloned and allocated into a Vec, just for creating an iterator to references to them in order to construct a `ReplicaLocator`. It was done due to lifetime issues (in order to send references to another thread, they must be 'static). The problem is solved by sending keyspaces to another thread for constructing `ReplicaLocator` and then moving it back to the calling thread afterwards. Clones are elided. Hooray!
By leveraging branched variable initialisation, cloning datacenters and racks is no longer necessary.
This one wasn't particularly hard.
There no reason why the method should be async.
095c6fb
to
54fe919
Compare
v2: rebased on |
There was a convoluted logic that can be easily expressed using itertools' `find_or_first()` iterator method.
54fe919
to
73373d2
Compare
.map(|node| node.get_working_connections()) | ||
.flatten_ok() | ||
// By an invariant `self.known_peers` is nonempty, so the returned iterator | ||
// is nonempty, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that? If all node pools are in MaybePoolConnections::Broken
state, every call to node.get_working_connections()
will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, and hence the resulting iterator will consist of only Err
variants.
flatten_ok()
does not remove any Err
variants; what it does is flattening the Ok
variants; that is, Ok(Ok(x))
becomes Ok(x)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a quite large change in semantics. Previously, if all nodes had a broken connection pool then the function returned an error; otherwise you would get a list of current connections. Now, you are putting the responsibility for handling errors to the caller. Despite this change the code seems to be working due to how the current callers are handling the errors from the iterator.
I'd rather see the old semantics, i.e. the function should either fail immediately if there are no connections or return an iterator that returns just Arc<Connection>
. Alternatively (but not preferably) you could document the meaning of what the iterator really returns.
73373d2
to
8f9e619
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a nit, LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One request, LGTM otherwise.
scylla/src/transport/session.rs
Outdated
let handles = connections_iter.map(|c| async { | ||
match c { | ||
Ok(c) => c.fetch_schema_version().await, | ||
Err(err) => Err(err), | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would and_then
instead of map
work? Similarly above, lines 868-873.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably mean "instead of match
". Then I don't think so, because the operation is done inside an async
block, which is a known limitation of functional idioms in Rust.
.map(|node| node.get_working_connections()) | ||
.flatten_ok() | ||
// By an invariant `self.known_peers` is nonempty, so the returned iterator | ||
// is nonempty, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a quite large change in semantics. Previously, if all nodes had a broken connection pool then the function returned an error; otherwise you would get a list of current connections. Now, you are putting the responsibility for handling errors to the caller. Despite this change the code seems to be working due to how the current callers are handling the errors from the iterator.
I'd rather see the old semantics, i.e. the function should either fail immediately if there are no connections or return an iterator that returns just Arc<Connection>
. Alternatively (but not preferably) you could document the meaning of what the iterator really returns.
8f9e619
to
dbcd9e9
Compare
Reworked |
`get_working_connections()` not only allocated `Vec`s for all nodes, but it even cloned them to one big `Vec`. At least the latter could be avoided by returning an iterator over connections. The new replacement method, `iter_working_connections()`, was put on `ClusterData` instead of `Cluster`, because it borrows from `ClusterData`, so lifetime issues would be brought otherwise. The two uses of `get_working_connections()`, `Session::prepare()` and `Session::check_schema_agreement()`, were updated to use `iter_working_connections()`, and `get_working_connections()` was deleted as not needed anymore.
As `join_all()` accepts any `IntoIterator`, it accepts iterators in particular. Allocating a `Vec` is unnecessary.
dbcd9e9
to
1bc5744
Compare
This is a bunch of performance-related small optimisation of the
cluster
module, mainly focused on avoiding cloning where possible.Pre-review checklist
[ ] I added relevant tests for new features and bug fixes.[ ] I have provided docstrings for the public items that I want to introduce.[ ] I have adjusted the documentation in./docs/source/
.[ ] I added appropriateFixes:
annotations to PR description.