Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix refcounted FD leak in HTTPClient #2483

Merged
merged 1 commit into from Oct 15, 2020

Conversation

tchaloupka
Copy link
Contributor

I've observed that newly added condition m_conn.waitForDataEx(0.seconds) returns noMoreData even for EAGAIN and so leads to disconnects before the new request.
But there is still m_stream set (TLS) which were not destroyed and so there is FD refcount leak.
Calling disconnect that handles the stream too worked for me.

@tchaloupka
Copy link
Contributor Author

TravisCI fail seems unrelated

@s-ludwig
Copy link
Member

Thanks, this should be the right thing to do in any case and looks like it fixed a serious resource leak.

The wrong behavior of waitForDataEx(0.seconds) looks serious as well, tough. On which OS did you observe this?

@tchaloupka
Copy link
Contributor Author

I've tested it on Fedora Linux 31, kernel 5.8.11-100.fc31.x86_64

@tchaloupka
Copy link
Contributor Author

First I thought it is some problem in ConnectionPool but basically every new HTTP request over HTTPS ended up with connection close, handle leak and new connection.
I had to add some logging to socket creations, errors handling (to see actual system errno codes), refcounting increments/decrements and figured this out.

@tchaloupka
Copy link
Contributor Author

This is basically the output of each client request:

[2020-10-14 15:37:55.113 main(AxlN) trc] returning HTTPClient connection 0 of 2, addr=7FF7AC69B6E0
[2020-10-14 15:37:55.113 main(AxlN) trc] Socket 139, read 0 bytes: wouldBlock
[2020-10-14 15:37:55.113 main(AxlN) trc] HTTPClient: doRequest close(): fd=139
releaseRef(139): 1, base=0 << should be 0
addRef(143): 3
[2020-10-14 15:37:55.122 main(AxlN) trc] connectTCP: fd=143, addr=192.168.0.142:9100
...

And then the leak:

> sudo lsof -a -p <pid> | grep "protocol: TCP"
prog  797620 tomas  139u     sock                0,8       0t0 59183873 protocol: TCP

@s-ludwig s-ludwig merged commit b5afadb into vibe-d:master Oct 15, 2020
2 checks passed
@tchaloupka tchaloupka deleted the httpclient_leak branch October 15, 2020 10:55
s-ludwig added a commit to vibe-d/vibe-core that referenced this pull request Oct 15, 2020
…meout.

IOMode.once causes the read() to return with IOStatus.wouldBlock immediately, which previously resulted in erroneously reporting WaitForDataStatus.noModeData instead of timeout.

See vibe-d/vibe.d#2483
@s-ludwig
Copy link
Member

Okay, I've pushed a fix to vibe-core now: vibe-d/vibe-core#232

AFAICS, the keep-alive logic runs smoothly now, except that the retry-loop doesn't really work as intended, as a connection failure will throw an exception, causing the loop to get canceled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants