Hung connections! #26

Closed
semigodking opened this issue Aug 13, 2012 · 12 comments

Comments

@semigodking

After running redsocks for a day or two, it stops serving connections and logs the error "too many open files".
After investigating, I believe this issue is caused by connections that hang in some cases.

Here is a client dump from a redsocks instance that had been running for less than a day.

Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: Dumping client list for instance 0x426288:
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: End of client list.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: Dumping client list for instance 0x426168:
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:41502->74.125.71.132:80]: client: 63 (-/W) SHUT_RD, relay: 64 (-/-) SHUT_WR, age: 19578 sec, idle: 19525 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:46380->74.125.71.132:80]: client: 57 (-/W) SHUT_RD, relay: 58 (-/-) SHUT_WR, age: 19578 sec, idle: 19525 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:60613->74.125.71.132:80]: client: 55 (-/W) SHUT_RD, relay: 56 (-/-) SHUT_WR, age: 19578 sec, idle: 19525 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:38119->74.125.71.120:80]: client: 53 (-/W) SHUT_RD, relay: 54 (-/-) SHUT_WR, age: 19578 sec, idle: 19525 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:47581->74.125.71.132:80]: client: 49 (-/W) SHUT_RD, relay: 50 (-/-) SHUT_WR, age: 19581 sec, idle: 19578 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:37708->74.125.71.132:80]: client: 47 (-/W) SHUT_RD, relay: 48 (-/-) SHUT_WR, age: 19581 sec, idle: 19578 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:47209->74.125.71.132:80]: client: 45 (-/W) SHUT_RD, relay: 46 (-/-) SHUT_WR, age: 19581 sec, idle: 19578 sec.
Aug 13 14:03:47 OpenWrt daemon.debug redsocks[3823]: [192.168.10.101:53661->74.125.71.120:80]: client: 41 (-/W) SHUT_RD, relay: 42 (-/-) SHUT_WR, age: 19581 sec, idle: 19578 sec.
root@OpenWrt:# cat /proc/net/sockstat
sockets: used 75
TCP: inuse 6 orphan 0 tw 2 alloc 46 mem 1
UDP: inuse 2
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
root@OpenWrt:#

@darkk
Owner

darkk commented Aug 13, 2012

Can you post redsocks.conf too?

@semigodking
Author

This is the correct one.

cat /etc/config/redsocks.conf
base {
    log_debug = on;
    log_info = off;
    daemon = on;
    redirector = iptables;
}

redsocks {
    local_ip = 192.168.10.2;
    local_port = 1081;
    ip = 127.0.0.1;
    //listenq = 256;
    //port = 1080;
    //type = socks5;
    port = 8081;
    type = direct;
    //type = http-connect;
    //login = gaeproxy;
    //password = gaeproxy;
}
redsocks {
    local_ip = 192.168.10.2;
    local_port = 1082;
    ip = 192.168.10.216;
    type = socks5;
    port = 9050;
    //type = http-connect;
    //login = gaeproxy;
    //password = gaeproxy;
}

Note: I extended this tool with a new 'direct' method, so that no proxy is required.

The implementation is simple. The relay subsystem does essentially nothing except call redsocks_start_relay from its read/write callbacks. It has a custom connect_relay method that connects to the destination directly. I will post my code changes here once I have access to my code.

@darkk
Owner

darkk commented Aug 14, 2012

Ok

[192.168.10.101:53661->74.125.71.120:80]: client: 41 (-/W) SHUT_RD, relay: 42 (-/-) SHUT_WR, age: 19581 sec, idle: 19578 sec.

It reads like this:

  1. The client has sent EOF and redsocks has finished reading from the client (client: SHUT_RD).
  2. redsocks has relayed that EOF to the server (relay: SHUT_WR).
  3. redsocks still has something to send to the client (client: -/W).
  4. MAYBE the server has something to send to redsocks, but redsocks is not waiting for it (relay: -/-), because it is waiting for the client to consume data first.

So it looks like a perfectly valid situation. Moreover, age: 19581 is more than the default net.ipv4.tcp_keepalive_time = 7200, so the remote end is apparently still alive. Or maybe it's actually dead, and a bug prevents redsocks from detecting that.
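
To make that reading mechanical, here is a rough sketch that parses one dump line and flags entries that have been idle longer than the keepalive time with no events enabled on the relay. The struct, the function names and the looks_hung heuristic are all hypothetical. The format string is inferred only from the lines quoted in this thread and expects the line to start at the opening bracket, with the syslog prefix already stripped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Fields recovered from one line of the client-list dump quoted above. */
struct dump_entry {
    int  client_fd, relay_fd;
    char client_events[8], relay_events[8];   /* e.g. "-/W", "-/-" */
    char client_shut[16], relay_shut[16];     /* e.g. "SHUT_RD"    */
    long age_sec, idle_sec;
};

/* Returns true only when all eight fields were parsed. */
static bool parse_dump_line(const char *line, struct dump_entry *e)
{
    assert(line != NULL && e != NULL);
    int n = sscanf(line,
        "[%*[^]]]: client: %d (%7[^)]) %15[^,], relay: %d (%7[^)]) %15[^,],"
        " age: %ld sec, idle: %ld sec",
        &e->client_fd, e->client_events, e->client_shut,
        &e->relay_fd, e->relay_events, e->relay_shut,
        &e->age_sec, &e->idle_sec);
    return n == 8;
}

/* Assumed heuristic: nothing is being waited for on the relay side and the
 * connection has been idle for longer than the keepalive time. */
static bool looks_hung(const struct dump_entry *e, long keepalive_time_sec)
{
    return strcmp(e->relay_events, "-/-") == 0 &&
           e->idle_sec > keepalive_time_sec;
}
```

With the line quoted above and a keepalive time of 7200, looks_hung reports true. That only marks the entry as suspicious; it does not prove that the peer is dead.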

Maybe that's a bug in redsocks, but there is another possible cause: a descriptor leak in your browser (I saw this in Firefox a long, long time ago). An open connection leaks into a subprocess (e.g. a PDF reader) and stays stuck there for quite a long time. See http://bugs.debian.org/410671 for details.

Is 192.168.10.101 Linux-based? Can you run sudo netstat -tanp and/or sudo lsof there to verify that there is no leak?

And regarding your note: what is the reason for implementing direct in redsocks? As far as I can see, you can use iptables DNAT for that, and it will be more lightweight than a userspace daemon.

@semigodking
Author

192.168.10.101 is a Windows machine, and it was shut down when I noticed this issue.
As for the 'direct' implementation, it is there to bypass a limitation imposed by my ISP. My ISP blocks HTTP connections to some websites if the connection is shared through a router. Interestingly, a second router behind the first-level router is not restricted. So, to bypass the restriction, I transparently proxy all HTTP/HTTPS traffic through the second-level router.
This kind of ISP behavior is common in our country.

I plan to add some code to redsocks_shutdown() to detect this case and drop the client.
The criterion could be something like relay->enabled == 0.
I will let you know the results.

@darkk
Owner

darkk commented Aug 14, 2012

How long had 192.168.10.101 been shut down? (To check whether keepalive worked or not.)

I think the better way is to use the TCP_KEEP* socket options to detect connection death.
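
A minimal sketch of what that could look like on Linux, assuming a helper named enable_keepalive (hypothetical, not an existing redsocks function). The option names SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the Linux ones; other systems may spell them differently.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn on keepalive probes for one TCP socket and
 * override the system-wide timing for that socket only.
 *   idle_sec     - idle time before the first probe is sent
 *   interval_sec - delay between unanswered probes
 *   probes       - unanswered probes before the connection is declared dead
 * Returns 0 on success, -1 on the first setsockopt() failure. */
static int enable_keepalive(int fd, int idle_sec, int interval_sec, int probes)
{
    assert(idle_sec > 0 && interval_sec > 0 && probes > 0);

    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_sec, sizeof(idle_sec)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_sec, sizeof(interval_sec)) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) != 0)
        return -1;
    return 0;
}
```

With values such as 60, 10 and 3, a silent peer would be detected after roughly 60 + 3 * 10 seconds of idleness, and the socket would then report an error to the event loop.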

Am I right that some websites are blocked when you connect like this:
[client] -> [router] -> [isp]
and are reachable when you connect like this:
[client] --redsocks--> [router1-with-redsocks] --> [router2] --> [isp]
?

I would recommend playing with iptables -j TTL instead of using redsocks in this case.

Another quick check is to boot an Ubuntu live CD on the client in the first topology and see whether the websites are reachable.

@semigodking
Author

Almost right, but with some differences.
[client] -> [router1] -> [isp] works only if [client] does PPPoE directly. MAC cloning does not help.
[client] -> [router2 - w/o redsocks] -> [router1] -> [isp] does not work for the client, but works for router2 itself when MAC cloning is enabled on router1 (cloning the MAC of router2).
[client] -> [router2 - w/ redsocks] -> [router1] -> [isp] works.

iptables -j TTL and iptables -j IPID are already applied on router2, as well as MAC cloning on router1.
router1 is too limited to do any additional work or verification on it.

@darkk
Owner

darkk commented Aug 14, 2012

Ok, I see.
Have you applied -j TCPMSS --clamp-mss-to-pmtu on router2? It may be another cause of broken connections, especially if connections are broken only to some sites in the [client] -> [router2 - w/o redsocks] -> [router1] -> [isp] topology.

@darkk
Owner

darkk commented Aug 14, 2012

BTW: keepalive is not a silver bullet: http://lkml.indiana.edu/hypermail/linux/kernel/0508.2/0757.html
Subject: 2.6.12.5 bug? per-socket TCP keepalive settings

@semigodking
Author

Here is the code for the implementation of the 'direct' method.

void redsocks_direct_connect_relay(redsocks_client *client);

static void direct_relay_init(redsocks_client *client)
{
    /* state 0: relay not started yet, state 1: relay started */
    client->state = 0;
}

static void direct_instance_fini(redsocks_instance *instance)
{
}

static void direct_read_cb(struct bufferevent *buffev, void *_arg)
{
    redsocks_client *client = _arg;
    redsocks_touch_client(client);
    /* No proxy handshake is needed: start relaying on the first event. */
    if (client->state == 0)
    {
        client->state = 1;
        redsocks_start_relay(client);
    }
}

static void direct_write_cb(struct bufferevent *buffev, void *_arg)
{
    redsocks_client *client = _arg;
    redsocks_touch_client(client);
    if (client->state == 0)
    {
        client->state = 1;
        redsocks_start_relay(client);
    }
}

relay_subsys direct_connect_subsys =
{
    .name = "direct",
    .payload_len = 0,
    .instance_payload_len = 0,
    .readcb = direct_read_cb,
    .writecb = direct_write_cb,
    .init = direct_relay_init,
    .instance_fini = direct_instance_fini,
    .connect_relay = redsocks_direct_connect_relay,
};

/* Connect straight to the original destination instead of to a proxy. */
void redsocks_direct_connect_relay(redsocks_client *client)
{
    client->relay = red_connect_relay(&client->destaddr,
            redsocks_relay_connected, redsocks_event_error, client);
    if (!client->relay) {
        redsocks_log_errno(client, LOG_ERR, "red_connect_relay");
        redsocks_drop_client(client);
    }
}

@darkk
Owner

darkk commented Aug 14, 2012

Have you verified whether TCPMSS helps?

@semigodking
Author

No, but I will read up on this option and try it later.

Here is how I understand the hung connections:

  1. The client connection is established (R/W enabled) and the relay connection is being set up (W enabled).
  2. The client's peer disconnects before the relay connection is established. EOF arrives on the client side, which shuts down RD, and the EOF is relayed to the relay side (the relay shuts down WR).
  3. Because the relay connection was not established before the EOF was relayed, the relay's event mask is now -/-: RD has not been enabled on the relay yet.
  4. The client connection is now half-closed. We need some event to close it and drop the client. The only possible event is a timeout, but I am not sure whether libevent can deliver such an event in this situation.

@semigodking
Author

The patch below works fine for me.

diff --git a/redsocks.c b/redsocks.c
index ba5eab2..fff89d3 100644
--- a/redsocks.c
+++ b/redsocks.c
@@ -395,6 +436,11 @@ static void redsocks_shutdown(redsocks_client *client, struct bufferevent *buffe
         redsocks_log_error(client, LOG_DEBUG, "both client and server disconnected");
         redsocks_drop_client(client);
     }
+    else
+    {
+        if (how == SHUT_WR && buffev == client->relay && client->relay->enabled == 0)
+            redsocks_drop_client(client);
+    }
 }
 
 // I assume that -1 is invalid errno value

darkk added a commit that referenced this issue Mar 8, 2016
EOF is forwarded only when the bi-directional connection is established.

Thanks to semigodking for describing the test-case in #26

Moreover, linux kernel may reply SYN-ACK with RST if the now-connecting
socket is brought down with shutdown(fd, SHUT_WR):

connect(26, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("11.22.33.44")}, 16) = -1 EINPROGRESS (Operation now in progress)
IP 192.168.10.254.42578 > 11.22.33.44.8080: Flags [S], seq 813066190, win 27200, options [...], length 0
epoll_ctl(3, EPOLL_CTL_ADD, 26, {EPOLLOUT, {u32=26, u64=26}}) = 0
epoll_wait(3, {{EPOLLIN, {u32=25, u64=25}}}, 32, -1) = 1
clock_gettime(CLOCK_MONOTONIC, {728135, 720450764}) = 0
gettimeofday({1457464453, 327070}, NULL) = 0
ioctl(25, FIONREAD, [0]) = 0
readv(25, [{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096}], 1) = 0
epoll_ctl(3, EPOLL_CTL_DEL, 25, {EPOLLIN, {u32=25, u64=25}}) = 0
shutdown(25, SHUT_RD) = 0
shutdown(26, SHUT_WR) = 0
epoll_ctl(3, EPOLL_CTL_DEL, 26, {EPOLLOUT, {u32=26, u64=26}}) = 0
IP 11.22.33.44.8080 > 192.168.10.254.42578: Flags [S.], seq 481785732, ack 813066191, win 65535, options [...], length 0
IP 192.168.10.254.42578 > 11.22.33.44.8080: Flags [R], seq 813066191, win 0, length 0
epoll_wait(3, ...
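
The idea in the first line of that commit message can be sketched as a small state machine. This is a guess at the shape of the logic, not the code from the commit: struct toy_relay and both handlers are hypothetical names, and wr_shut stands in for the point where shutdown(fd, SHUT_WR) would actually be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy state for the relay side of one proxied connection. */
struct toy_relay {
    bool connected;    /* connect() has completed                     */
    bool eof_pending;  /* client EOF seen while still connecting      */
    bool wr_shut;      /* stands in for shutdown(relay_fd, SHUT_WR)   */
};

/* Client sent EOF: forward it now only if the relay is already connected,
 * otherwise remember it and leave the connecting socket untouched. */
static void on_client_eof(struct toy_relay *r)
{
    assert(r != NULL);
    if (r->connected)
        r->wr_shut = true;
    else
        r->eof_pending = true;
}

/* Relay finished connecting: forward any EOF that was held back. */
static void on_relay_connected(struct toy_relay *r)
{
    assert(r != NULL);
    r->connected = true;
    if (r->eof_pending) {
        r->eof_pending = false;
        r->wr_shut = true;
    }
}
```

In this model the write side is never shut down while the socket is still connecting, so the SYN-ACK followed by RST sequence in the trace above would not be triggered by an early client EOF.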
@darkk darkk closed this as completed Mar 8, 2016