Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Consul connection conflicts when using HTTP/1.1 #715

Closed
miklish opened this issue Jun 4, 2020 · 2 comments
Closed

Fix Consul connection conflicts when using HTTP/1.1 #715

miklish opened this issue Jun 4, 2020 · 2 comments

Comments

@miklish
Copy link
Collaborator

miklish commented Jun 4, 2020

When connecting to Consul using a non-HTTP/2 connection and 2 (or more) threads simultaneously need to get a url for the same service, they can potentially both end up in the ConsulClientImpl.getConnection method.

Refers to PR #716

Method call chain:

LightCluster.serviceToUrl(serviceName)
--> LightCluster.discovery
--> CommandFallbackRegistry.discover (calls super)
--> AbstractRegistry.discover
--> CommandFallbackRegistry.doDiscover
--> ConsulRegistry.discoverService
--> ConsulRegistry.lookupServiceUpdate
--> ConsulRegistry.lookupConsulService
--> ConsulClientImpl.lookupHealthService
--> ConsulClientImpl.getConnection(String cacheKey)

Currently, for each cacheKey, the getConnection method will return the same ConsulConnection regardless of the thread that requests it. This can result in both threads using the same ConsulConnection simultaneously, generating a bad state in the shared ConsulConnection. In practice, this has resulted in NPE's in the serviceToUrl method, and has also resulted in the service being unable to make subsequent requests to Consul.

The solution is to ensure that every thread has its own private ConsulConnection. The downside of this is that it can cause a large cache of ConsulConnection objects (and hence HTTP connection objects) being created. However, the connection cache size cannot grow without bound, as it will be limited by the size of the undertow worker thread pool.

Additionally, with reasonable connection timeouts in Consul and the user service, unused connections contained in cached ConsulConnection objects will eventually be closed, freeing up network resources. If a ConsulConnection object containing a closed connection is requested, a new active HTTP connection will created automatically.

The methods...

  • checkPass
  • checkFail
  • registerService
  • unregisterService
  • lookupHealthService

...have this issue, as they all call getConnection, and don't use keys that are thread-specific.

NOTE: It is recommended to simply use HTTP/2 to connect to Consul, since a connection cache is not needed with HTTP/2.

miklish added a commit that referenced this issue Jun 4, 2020
… a thread to avoid concurrent use of the connections across threads
stevehu pushed a commit that referenced this issue Jun 5, 2020
… a thread to avoid concurrent use of the connections across threads (#716)
@stevehu
Copy link
Contributor

stevehu commented Jun 5, 2020

stevehu pushed a commit that referenced this issue Jun 5, 2020
… a thread to avoid concurrent use of the connections across threads (#716)
@stevehu
Copy link
Contributor

stevehu commented Jun 5, 2020

@stevehu stevehu closed this as completed Jul 3, 2020
younggwon1 pushed a commit to younggwon1/light-4j that referenced this issue Feb 10, 2024
…ecific to a thread to avoid concurrent use of the connections across threads (networknt#716)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants