Async Cluster: Connection is created on each request #1025

Closed

kamulos opened this issue Jan 15, 2024 · 6 comments
@kamulos
Contributor

kamulos commented Jan 15, 2024

I am playing around with the async cluster timeouts and ended up in a strange situation: for each Redis request I trigger, redis-rs calls connect_and_check and creates a new connection.

I initially got into this situation by triggering a timeout, where the subsequent reconnection also led to a timeout. My request is a simple GET.

What I am observing is that in get_connection the branch with connect_and_check is taken and succeeds, but on the next request the same branch is taken again, because the connection is still None.

Is it possible that the connection created by connect_and_check in get_connection needs to be stored back into the conn_lock?

This is just a wild guess, but the issue is quite reproducible for me, so I can dig deeper if needed.

@nihohit
Contributor

nihohit commented Jan 16, 2024

Sounds like you reach this line, where the connection is created but not saved:

Some((addr, None)) => connect_and_check(&addr, core.cluster_params.clone())

I believe the solution is something like:

match connect_and_check::<C>(&addr, core.cluster_params.clone()).await {
    Ok(conn) => {
        // Store the freshly created connection back into the shared connection
        // map (wrapped as a shared future), so the next request reuses it
        // instead of reconnecting.
        let conn_clone = conn.clone();
        core.conn_lock
            .write()
            .await
            .0
            .insert(addr.clone(), async { conn_clone }.boxed().shared());
        Some((addr, conn))
    }
    Err(_) => None,
}

but I can't recreate this situation in a test, so I can't verify it.
Can you please manually test this fix, and if possible create a test case that is locally reproducible?

@kamulos
Contributor Author

kamulos commented Jan 17, 2024

Just tested your fix: it works for me. Without it I run into the situation described above most of the time; with it, never.

I am not sure how to approach a test case for it. I reproduce it with a connection that uses specific settings (both timeouts at 400ms and retries set to 1) and issues a request every 400ms. Against a cluster with one node, I then execute DEBUG SLEEP 1.5, which seems to be a good blocking time to land in that code path. Longer blocking usually leads to a panic, which #968 will hopefully address.
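For reference, this is roughly what my reproduction looks like (an untested sketch; the node address and key are placeholders, and it assumes the cluster-async feature plus the builder's connection_timeout / response_timeout / retries options):

use std::time::Duration;

use redis::cluster::ClusterClientBuilder;

#[tokio::main]
async fn main() -> redis::RedisResult<()> {
    // Single-node cluster; placeholder address. While this loop runs,
    // execute `DEBUG SLEEP 1.5` on the node to trigger the timeout.
    let client = ClusterClientBuilder::new(vec!["redis://127.0.0.1:7000/"])
        .connection_timeout(Duration::from_millis(400))
        .response_timeout(Duration::from_millis(400))
        .retries(1)
        .build()?;
    let mut conn = client.get_async_connection().await?;

    loop {
        // A simple GET every 400ms. After the induced timeout, each request
        // ends up in connect_and_check again instead of reusing the connection.
        let res: redis::RedisResult<Option<String>> = redis::cmd("GET")
            .arg("some-key")
            .query_async(&mut conn)
            .await;
        println!("{res:?}");
        tokio::time::sleep(Duration::from_millis(400)).await;
    }
}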

Not really something that lends itself to a reliable test case 🤷

@nihohit nihohit mentioned this issue Jan 18, 2024
@nihohit
Contributor

nihohit commented Jan 18, 2024

Ok, different take - does this solve your issue?
#1032

IMO both fixes are correct and relevant, but I want to be sure that what I think will work actually will work :)

@nihohit
Contributor

nihohit commented Jan 18, 2024

Managed to write the world's most artificial tests for the first fix, hurrah!
#1033

@kamulos
Contributor Author

kamulos commented Jan 19, 2024

> Ok, different take - does this solve your issue? #1032
>
> IMO both fixes are correct and relevant, but I want to be sure that what I think will work actually will work :)

Yes it does 😊 I am especially enthusiastic about this one, because it also resolves the issue of panics when I use long DEBUG SLEEP times on the server side.

Those two fixes are great, thank you!

@kamulos
Contributor Author

kamulos commented Mar 11, 2024

Fixed in release 0.25.0

@kamulos kamulos closed this as completed Mar 11, 2024