Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

https.request failed for certain URL after first visting #8368

Closed
codingfishman opened this issue Sep 1, 2016 · 14 comments
Closed

https.request failed for certain URL after first visting #8368

codingfishman opened this issue Sep 1, 2016 · 14 comments
Labels
http Issues or PRs related to the http subsystem. https Issues or PRs related to the https subsystem.

Comments

@codingfishman
Copy link

  • Version: v6.2.1
  • Platform: osx 15.3.0 Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64 x86_64
  • Subsystem: https.request

When visiting the specified URL with https.request, the first request is OK but the following request will be failed with :socket hang up after certain timeout. To demostrate it , try the code below:

const url = 'https://weixin.qq.com/zh_CN/htmledition/js/qmtool0cae9c.js';
// with url2, it works fine
const url2 = 'https://camo.githubusercontent.com/8d44df6ff63d2d5fec8336f688f5212dc5354fdb/68747470733a2f2f742e616c697061796f626a656374732e636f6d2f746673636f6d2f54314e774666586e306f58585858585858582e6a70675f343030782e6a7067';
require('https').get(url1, function(res) {
    res.on('data', () => {
    });

    res.on('end', () => {
        console.info('header in res1 is: ', res.headers);
    });

    require('https').get(url1, function(res2) {
        res2.on('data', (data) =>{
            console.info('data2 received:', data);
        });
        res2.on('end', () => {
            console.info('\nheader in res2 is: ', res2.headers);
        });
    });
});

The second request will be failed, while for other HTTPS resources, such as the url2, it works fine. And also for http request it works normally either.

@addaleax addaleax added the https Issues or PRs related to the https subsystem. label Sep 1, 2016
@bnoordhuis
Copy link
Member

Is there reason to believe it's an issue with node.js and not with the endpoints that you are trying to connect to?

I infer that you are from the PRC. Is it possible the Great Firewall is blocking your connections?

@codingfishman
Copy link
Author

Hi @bnoordhuis

It could be related to the endpoints, because when I test the request against other endpoints it works just fine. The network was not blocked, which I can confirm.

But it's still confusing, since the issue only occurs if we request the second one inside the first one, if we move the second request out, to be a parallel one, the two requests are all good.

With wireshark, the second request seems to be timeout and the endpoint close the connection, then the hang up error throws. Could there by any 'keep-alive' connections between the socket and the endpoint, and meanwhile one closed the connection incorrectly?

Thanks!

@imyller
Copy link
Member

imyller commented Sep 2, 2016

I'm able to reproduce and IMHO the problem is with Node.js socket pooling agent.

Looks like HTTP socket agent creates a identifier for the socket based on request options. When two identical nested requests are made, the completion (event: end) instructs agent to remove/close the socket. If both sockets are within the same agent (by default http.globalAgent) they collide and socket for second connection gets closed prematurely because their identifiers (name) will match.

This happens only when requests are "slow" and long enough to collide, such as the one from weixin.qq.com. With faster server such as GitHub URL above, the first request gets away before its completion can interfere with the second request.

If you make the requests setting https.get agent property to false in options, nested calls work. This makes all nested calls use their own socket pooling agent thus avoiding socket map collision:

const opts = {
    hostname: 'weixin.qq.com',
    port: 443,
    path: '/zh_CN/htmledition/js/qmtool0cae9c.js',
    agent: false
};

https.get(opts, function (res) { .. nested calls to with same opts here ... });

HTTP socket agent's handling of it's self.requests map should be improved to handle situation where multiple identical requests are processed simultaneously.

@addaleax addaleax added the http Issues or PRs related to the http subsystem. label Sep 2, 2016
@imyller
Copy link
Member

imyller commented Sep 2, 2016

I'd like to add that my analysis above is a rough simplification of what http socket agent code does to pool and manage sockets and as such it doesn't yet exactly pinpoint the bug inside _http_agent.js.

However, I think the idea is right and end result is that when connection options are identical the agent seems to errorneously close the socket of the second nested connection.

Maybe someone more familiar with HTTP Agent's code could comment on this?

@addaleax
Copy link
Member

addaleax commented Sep 2, 2016

I have spent quite a bit of time looking at the agent code when investigating #8279 and your explanation makes perfect sense to me!

Maybe someone more familiar with HTTP Agent's code could comment on this?

/cc @nodejs/http :)

@imyller
Copy link
Member

imyller commented Sep 3, 2016

Did some debugging on this and seems that the reason is not with the HTTP socket pooling agent alone, but in https TLS session cache.

HTTP socket pooling agent is indirectly involved because its _agentKey is used to determine if there is a cached session that could be reused.

So, it seems that new socket is properly created after all but https tries to reuse TLS session object that is being actively used by earlier connection.

Alternative workaround is to disable TLS session caching. This also makes nested connections work:

https.globalAgent.maxCachedSessions = 0;

@imyller
Copy link
Member

imyller commented Sep 3, 2016

I'm not sure if TLS session reuse is supported by all HTTPS servers when the first negotiated TLS session connection is still in progress.

Does Node.js support session renegotiation if server fails when attempting to reuse TLS session?

@indutny
Copy link
Member

indutny commented Sep 3, 2016

@imyller there is no support for this on our side, but TLS protocol has a way for server to ignore the incoming session id and generate a new one. Some servers may be faulty though.

@imyller
Copy link
Member

imyller commented Sep 3, 2016

RFC 2246 7.2:

In this case, other connections corresponding to the session may continue, 
but the session identifier must be invalidated, preventing the failed session 
from being used to establish new connections.

Chrome seems to invalidate session id (remove from cache), reconnect socket and renegotiate. Maybe Node.js should do the same.

@indutny
Copy link
Member

indutny commented Sep 3, 2016

@imyller we invalidate sessions on error, but I believe that user code should be responsible for doing retries in case of faulty servers. Semantics differ per application, and different people have different needs.

@codingfishman
Copy link
Author

codingfishman commented Sep 5, 2016

Thanks so much! @imyller @indutny

This really makes sense to me, but I'm not sure whether I understood it correctly:)

So the root cause is the TLS session cache, the second socket was created but still trying to use the previous one, and the previous one closed after the end event emitted, so the second socket was never actually sent out.

But does this just happened when the request are slow ? camo.githubusercontent.com could be more slow than the weixin.qq.com here, but the nested request works fine agains it. Could it be that Node cache the connection based on the server config, and it does not cache the connectoin for githubusercontent.com ?

@imyller
Copy link
Member

imyller commented Sep 5, 2016

@codingfishman In my testing two connections overlap if the server if slow because the first connection's end event + socket close does not have chance to take place before second connection requests new socket. This is the first thing I started debugging, but it proved not to be the problem. Sure, pooling agent code could be a bit more optimized - but it's functional.

So, the sockets are indeed pooled properly and overlapping sockets are not an issue. The issue with weixin.qq.com comes from two simultaneous TLS sessions sharing same cached session id. The server does not accept the cached session id for the second connection (at least not before the first TLS session/connection has ended) and hangs up the socket; and as @indutny pointed out Node.js does not automatically renegotiate connection when this happens. Some client implementations do - Node.js does not.

I suggest that you disable TLS session cache if you want to make nested connections to weixin.qq.com:

https.globalAgent.maxCachedSessions = 0;

codingfishman added a commit to alibaba/anyproxy that referenced this issue Sep 6, 2016
…e the cache, it will hang up the following request after the cached one.

refer to nodejs/node#8368
@codingfishman
Copy link
Author

@imyller Thanks for the detail, it helps a lot!

@Maheshpote
Copy link

Tried the same but getting this issue.
reason: Client network socket disconnected before secure TLS connection was established,

Could you please help me on that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. https Issues or PRs related to the https subsystem.
Projects
None yet
Development

No branches or pull requests

6 participants