Basic token awareness #59

Merged: 1 commit, Oct 26, 2020

Conversation

@kbr- commented Oct 23, 2020

This version picks the owner of the token range that the queried primary key's token lies in. The owner is always one of the replicas (irrespective of the replication strategy).

In the end we should calculate the entire set of replicas using the replication strategy of the queried table and if sending the query to one of them fails, try another. But taking just the owner is a start.
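A rough sketch of the owner lookup described above (illustrative types and names, not the actual code in this PR): with the ring sorted by token, the owner of the range containing the key's token is the first node whose token is greater than or equal to it, wrapping around to the first node.

    use std::net::SocketAddr;

    // `ring` is sorted by token and assumed non-empty.
    fn owner_of(ring: &[(i64, SocketAddr)], token: i64) -> SocketAddr {
        ring.iter()
            .find(|(t, _)| *t >= token)
            .map(|(_, addr)| *addr)
            // no node with a larger token: wrap around to the start of the ring
            .unwrap_or(ring[0].1)
    }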

Fixes #14
Fixes #15

I haven't yet tested whether this actually routes statements to the right replicas. But I checked manually that it does indeed choose between different replicas... just dunno if they are the right ones :)

(I wanted to test it using cqlsh-rs, but it crashes: CQL frame LZ4 uncompression failure :()

@kbr- force-pushed the kbr/ring branch 3 times, most recently from fa29fe1 to 38e6120 on October 23, 2020 20:03
@kbr- requested review from piodul, psarna and penberg and removed the request for piodul and psarna on October 23, 2020 20:11
@penberg (Contributor) commented Oct 24, 2020

@kbr- There's some bug with compression (#42), but you can disable compression from cqlsh-rs for your testing:

diff --git a/examples/cqlsh-rs.rs b/examples/cqlsh-rs.rs
index 9ca6073..c8dade9 100644
--- a/examples/cqlsh-rs.rs
+++ b/examples/cqlsh-rs.rs
@@ -11,7 +11,7 @@ async fn main() -> Result<()> {

     println!("Connecting to {} ...", uri);

-    let session = Session::connect(uri, Some(Compression::LZ4)).await?;
+    let session = Session::connect(uri, None).await?;

     let mut rl = Editor::<()>::new();
     loop {

scylla/src/transport/session.rs (three outdated review threads, resolved)
    let conn = match self.pool.write().unwrap().entry(owner) {
        Entry::Vacant(entry) => {
            let conn = Arc::new(new_conn);
            entry.insert(SharedConnection { conn: conn.clone() });
Contributor:

I think we should use to_owned() instead of clone() for the odd chance that the clone can be optimized out.

@kbr- (Author):

You mean

                entry.insert(Arc::new(new_conn)).to_owned()

?

I can do that (it even looks more elegant), but I don't see how a clone can be optimized out; I want two shared pointers to the connection after all, one in the pool and one returned to the caller.

@kbr- (Author):

I use to_owned now.
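(Side note: on an Arc both spellings do the same work, since the blanket ToOwned impl for Clone types simply calls clone(); either way the reference count is bumped. A standalone illustration, not the driver's types:)

    use std::sync::Arc;

    fn main() {
        let a = Arc::new(0u8);
        let b = a.clone();     // bumps the strong count
        let c = a.to_owned();  // `impl<T: Clone> ToOwned for T` just calls clone()
        assert_eq!(Arc::strong_count(&a), 3);
        drop((b, c));
    }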

Further resolved review threads on scylla/src/transport/session.rs, scylla/src/transport/connection.rs, and scylla/src/transport/topology.rs.
    // while continuing waiting on the previous connection
    let (peer_addrs, _) =
        match timeout(Duration::from_secs(5), query_peers(conn, false, self.port)).await {
            Err(_) => continue,
Collaborator:

Suggestion: maybe we should drop/close the connection in case an error occurs while querying? Apart from that, we could reorganize the code a little bit so that the "establish new connections" loop would always run after this one.

If we do this, this code should become resilient to connection drops without changing it too much - although it would sometimes drop the connection unnecessarily if some nodes return an UnavailableException or something. What do you think?

@kbr- (Author) commented Oct 24, 2020

> maybe we should drop/close the connection in case an error occurs while querying?

That would require modifying our errors to use different types, and dropping/closing connections only when appropriate (based on the type). We certainly should not drop a connection when, e.g., a query times out, since that might be just a temporary network partition between the driver and that particular server.

@kbr- (Author):

and of course it should be done eventually (and in Session too), but it's out of scope of this PR IMO.

@piodul (Collaborator) commented Oct 24, 2020

Well, the problem with the current approach - if I understood the implementation correctly - is that if the connection to a node is broken (e.g. the socket is closed), we won't ever reopen a connection to that node.

I suggest closing the connection from the client side every time we get an error. It will cause us to sometimes close a connection unnecessarily, e.g. when the query times out, but we will gain the ability to reconnect in case e.g. a node restarts.

I agree that we will be able to solve this problem properly once we modify our error types, but I'd argue that reopening connections would be more robust and easier to implement right now.

@kbr- (Author):

Now I'm dropping connections that return an error. But I don't drop them on timeouts.
Also I always try to establish new connections in the refresh function.
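Roughly, that gives the refresh step the following shape (a sketch only; all names here are placeholders, not this PR's actual code): drop pooled connections whose query returned an error, then always try to open connections to any known peers missing from the pool.

    use std::collections::HashMap;
    use std::net::SocketAddr;

    fn refresh<C>(
        pool: &mut HashMap<SocketAddr, C>,
        known_peers: &[SocketAddr],
        probe: impl Fn(&C) -> Result<(), ()>,
        connect: impl Fn(SocketAddr) -> Option<C>,
    ) {
        // drop connections that returned an error
        pool.retain(|_, conn| probe(conn).is_ok());

        // always try to establish connections we are missing
        for &addr in known_peers {
            if !pool.contains_key(&addr) {
                if let Some(conn) = connect(addr) {
                    pool.insert(addr, conn);
                }
            }
        }
    }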

@kbr- (Author) commented Oct 26, 2020

I addressed the comments.

However, I've encountered the following error:

Error: Error (code 8704): Unknown identifier c

I need to investigate.

In the meantime you can re-review.

@kbr- requested review from penberg and piodul on October 26, 2020 01:26
@kbr- (Author) commented Oct 26, 2020

Well, that's just your standard InvalidRequestException, which I didn't immediately recognize, complaining that the column c does not exist.

And rightly so, because in my manual testing I created a table ks.t without column c, which one of our tests then tries to query. Silly me.

In any case, it looks like there's another bug to hunt, though:

Error: Error (code 9472): No prepared statement with ID 9c2a18b7044aa640917584852bf73a40 found.

(and there's an obvious reason why this happens)

But I need to get some sleep first :)

@psarna (Contributor) commented Oct 26, 2020

@kbr- this error about prepared statements is most likely a result of the test cases running in parallel and operating on identical table names. One obvious workaround is to use unique names across test cases :) (we should have a utility for that), and you can also always run tests in a single thread. I don't remember the param now, something like --test-threads 1
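(For reference, the flag is passed to the test binary after the -- separator:)

    cargo test -- --test-threads=1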

@psarna (Contributor) commented Oct 26, 2020

Oh wait, you didn't add a test case. In that case, the reason is that we currently prepare a statement only on a single connection, while we should do that on each node, since the cache is node-local. It's weird that the driver didn't simply reprepare the statement, though...
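A sketch of what that fix might look like (hypothetical trait, not the driver's actual API): because each node keeps its own prepared-statement cache, prepare on a connection to every node rather than just one.

    // Hypothetical connection trait, just to illustrate the shape of the fix.
    trait NodeConnection {
        fn prepare(&self, query: &str) -> Result<Vec<u8>, String>; // returns the statement id
    }

    // Prepare on every node's connection; each node caches statements independently.
    fn prepare_on_all<C: NodeConnection>(conns: &[C], query: &str) -> Result<Vec<Vec<u8>>, String> {
        conns.iter().map(|c| c.prepare(query)).collect()
    }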

@@ -282,6 +294,24 @@ fn deser_cql_value(typ: &ColumnType, buf: &mut &[u8]) -> AResult<CQLValue> {
            CQLValue::BigInt(buf.read_i64::<BigEndian>()?)
        }
        Text => CQLValue::Text(str::from_utf8(buf)?.to_owned()),
        Inet => CQLValue::Inet(match buf.len() {
            4 => {
                let ret = IpAddr::from(<[u8; 4]>::try_from(&buf[0..4])?);
Collaborator:

This will panic if the frame is too short. Please use types::read_raw_bytes, which checks if the slice is long enough first (you will need to reexport it).

Collaborator:

If you change it, you should drop the buf.advance(4), because that read_raw_bytes will do it for you.

@kbr- (Author):

But we're in a match buf.len() case 4, so how could the slice be too short?

Collaborator:

Argh, you are right. I got confused and thought this was frame parsing code, and the length was read from the frame. I'll send a fix which changes it back.

@kbr- (Author):

Well, the additional check doesn't hurt, I think.

                ret
            }
            16 => {
                let ret = IpAddr::from(<[u8; 16]>::try_from(&buf[0..16])?);
Collaborator:

Ditto
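(For context, the branch under discussion boils down to roughly the following, simplified and without the driver's actual error types; because the length is matched first, the slice indexing below cannot panic:)

    use std::convert::TryFrom;
    use std::net::IpAddr;

    fn deser_inet(buf: &[u8]) -> Result<IpAddr, String> {
        match buf.len() {
            // the match arm guarantees the slice is exactly as long as needed
            4 => Ok(IpAddr::from(<[u8; 4]>::try_from(&buf[0..4]).unwrap())),
            16 => Ok(IpAddr::from(<[u8; 16]>::try_from(&buf[0..16]).unwrap())),
            n => Err(format!("invalid inet length: {}", n)),
        }
    }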

@piodul (Collaborator) commented Oct 26, 2020

@kbr- I'll allow myself to address the final review comments, so that we can merge this and unblock further PRs that would potentially conflict with it.

@penberg deleted the kbr/ring branch on October 26, 2020 09:00
@kbr- (Author) commented Oct 26, 2020

@piodul thanks!


Successfully merging this pull request may close these issues.

Token-aware routing
Cluster topology discovery