[ISS#2319]fix(DU): Reducing synRetry count for faster connect calls (… #182

Merged
merged 1 commit into from Jan 2, 2019
18 changes: 18 additions & 0 deletions lib/libzrepl/mgmt_conn.c
@@ -31,6 +31,8 @@
#include <sys/socket.h>
#include <sys/eventfd.h>
#include <sys/prctl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#include <sys/dsl_dataset.h>
#include <sys/dsl_destroy.h>
@@ -173,6 +175,7 @@ connect_to_tgt(uzfs_mgmt_conn_t *conn)
{
struct sockaddr_in istgt_addr;
int sfd, rc;
int synretries = 3;

conn->conn_last_connect = time(NULL);

@@ -185,6 +188,21 @@ connect_to_tgt(uzfs_mgmt_conn_t *conn)
if (sfd < 0)
return (-1);

/*
 * The default SYN retry count on Linux is usually 6, so an unanswered
 * connect() takes more than 2 minutes to fail: the kernel waits 1, 2,
 * 4, 8, 16, 32 and 64 seconds in turn after each SYN it sends.
 * For reasons not yet understood, even if the listener at tgt becomes
 * available in the meantime, the connection is not established until
 * these retransmissions complete and a new connect call is made, and
 * this causes the volume to go into RO state.
 * With synretries reduced to 3, the next connect call is made in less
 * than 20 seconds.
 */
if (setsockopt(sfd, IPPROTO_TCP, TCP_SYNCNT, &synretries,
sizeof (int)) < 0) {
perror("setsockopt(TCP_SYNCNT) failed");
}

rc = connect(sfd, (struct sockaddr *)&istgt_addr, sizeof (istgt_addr));
/* EINPROGRESS means that EPOLLOUT will tell us when connect is done */
if (rc != 0 && errno != EINPROGRESS) {
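The timing argument in the comment above can be checked with a small sketch. It is not part of this change: `syn_timeout_secs` and `set_syn_retries` are hypothetical helper names, and the arithmetic assumes a 1 second initial retransmission timeout that doubles after every SYN, which is the usual Linux behaviour but not something this patch controls. `TCP_SYNCNT` is Linux-specific.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Hypothetical helper: worst-case seconds before connect() gives up
 * when no SYN-ACK ever arrives. Assumes a 1 second initial timeout
 * that doubles after each SYN sent (the first SYN plus `retries`
 * retransmissions).
 */
static int
syn_timeout_secs(int retries)
{
	int total = 0, wait = 1, i;

	for (i = 0; i <= retries; i++) {
		total += wait;	/* time spent waiting after this SYN */
		wait *= 2;	/* exponential backoff */
	}
	return (total);
}

/*
 * Hypothetical helper: set TCP_SYNCNT on sfd, then read it back.
 * Returns the value the kernel reports, or -1 on error.
 */
static int
set_syn_retries(int sfd, int retries)
{
	int val = -1;
	socklen_t len = sizeof (val);

	if (setsockopt(sfd, IPPROTO_TCP, TCP_SYNCNT, &retries,
	    sizeof (retries)) < 0)
		return (-1);
	if (getsockopt(sfd, IPPROTO_TCP, TCP_SYNCNT, &val, &len) < 0)
		return (-1);
	return (val);
}
```

Under these assumptions `syn_timeout_secs(6)` gives 127 seconds and `syn_timeout_secs(3)` gives 15 seconds, which matches the "> 2 minutes" and "less than 20 seconds" figures in the comment. Reading the option back with `getsockopt` is one way to confirm that the kernel accepted the value before `connect` is called.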