ManagedSelector can livelock under high load #1924

Closed
sbordet opened this issue Oct 27, 2017 · 8 comments
Labels: Bug (for general bugs on Jetty side), High Priority, Performance

Comments

@sbordet
Contributor

sbordet commented Oct 27, 2017

There is evidence that under high load a thread may livelock running selector actions.

@sbordet sbordet added the Bug (for general bugs on Jetty side), High Priority, and Performance labels Oct 27, 2017
sbordet added a commit that referenced this issue Oct 29, 2017
* Actually using the MAX_ACTION_PERIOD value rather than a hardcoded one.
* Waking up the selector outside the sync block.
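
For readers unfamiliar with the pattern named in the commit message above, here is a minimal sketch of waking a selector outside the synchronized block. It is not the code from the referenced commit: the class `ActionQueue`, the `selecting` flag, and the `wakeup` callback are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch; names are illustrative and not taken from Jetty.
class ActionQueue {
    private final Queue<Runnable> actions = new ArrayDeque<>();
    private final Runnable wakeup;   // e.g. selector::wakeup for a java.nio Selector
    private boolean selecting;       // true while the selector thread is about to block in select()

    ActionQueue(Runnable wakeup) {
        this.wakeup = wakeup;
    }

    // Called by the selector thread just before it blocks in select().
    synchronized void aboutToSelect() {
        selecting = true;
    }

    // Called by any thread to hand work to the selector thread.
    // Returns true if this call issued a wakeup.
    boolean submit(Runnable action) {
        boolean needsWakeup;
        synchronized (this) {
            actions.add(action);
            // Decide under the lock, so only one submitter wakes the selector.
            needsWakeup = selecting;
            selecting = false;
        }
        // The wakeup call is made after the lock is released, so other
        // submitters are not blocked while it runs.
        if (needsWakeup)
            wakeup.run();
        return needsWakeup;
    }

    synchronized int size() {
        return actions.size();
    }
}
```

With a real `java.nio.channels.Selector`, the callback could be supplied as `new ActionQueue(selector::wakeup)`.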
sbordet added a commit that referenced this issue Oct 29, 2017
Reverted CreateEndPoint to be non-blocking, as
the real issue was determined to be #1924 instead.
@sbordet
Contributor Author

sbordet commented Oct 29, 2017

@gregw I have reviewed and updated your code, see 333c22d for a description of the changes.

@gregw
Contributor

gregw commented Oct 30, 2017

@sbordet your changes LGTM!

@gregw gregw closed this as completed Nov 3, 2017
@lgangloff

lgangloff commented Nov 7, 2017

Hello,
We're facing the same issue under a high-load test with the latest snapshot, 9.4.8-20171107.091315.
We're using jetty-runner with this XML config (maxThreads and requestHeaderSize were raised):

  <Get name="ThreadPool">
    <Set name="minThreads" type="int"><SystemProperty name="threads.min" default="50"/></Set>
    <Set name="maxThreads" type="int"><SystemProperty name="threads.max" default="350"/></Set>
    <Set name="idleTimeout" type="int"><SystemProperty name="threads.timeout" default="60000"/></Set>
    <Set name="detailedDump">false</Set>
  </Get>

  <New id="httpConfig" class="org.eclipse.jetty.server.HttpConfiguration">
    <Set name="secureScheme">https</Set>
    <Set name="outputBufferSize">32768</Set>
    <Set name="requestHeaderSize">16384</Set>
    <Set name="responseHeaderSize">8192</Set>

    <Call name="addCustomizer">
      <Arg>
        <New class="org.eclipse.jetty.server.ForwardedRequestCustomizer" />
      </Arg>
    </Call>
  </New>

  <Call name="addConnector">
    <Arg>
      <New id="httpConnector" class="org.eclipse.jetty.server.ServerConnector">
        <Arg name="server"><Ref refid="Server" /></Arg>
        <Arg name="acceptors" type="int"><Property name="jetty.http.acceptors" default="-1"/></Arg>
        <Arg name="selectors" type="int"><Property name="jetty.http.selectors" default="-1"/></Arg>
        <Arg name="factories">
          <Array type="org.eclipse.jetty.server.ConnectionFactory">
            <Item>
              <New class="org.eclipse.jetty.server.HttpConnectionFactory">
                <Arg name="config"><Ref refid="httpConfig" /></Arg>
              </New>
            </Item>
          </Array>
        </Arg>
        <Set name="host"><Property name="jetty.http.host" /></Set>
        <Set name="port"><SystemProperty name="jetty.http.port" default="8080" /></Set>
        <Set name="idleTimeout"><Property name="jetty.http.idleTimeout" default="30000"/></Set>
        <Set name="soLingerTime"><Property name="jetty.http.soLingerTime" default="-1"/></Set>
        <Set name="acceptorPriorityDelta"><Property name="jetty.http.acceptorPriorityDelta" default="0"/></Set>
        <Set name="acceptQueueSize"><Property name="jetty.http.acceptQueueSize" default="0"/></Set>
      </New>
    </Arg>
  </Call>

We have 2 virtual servers, each hosting 3 Jetty instances (on ports 8080/8081/8082) with 1 webapp per instance.
Requests are load balanced with haproxy using the sticky session strategy. We store the session data in PostgreSQL with a JDBCSessionDataStore.

Here is the thread dump, which shows:

  • 1 deadlock
  • 2 threads locking on monitors

Our high-load test reaches maxThreads, and Jetty stops responding.
Note that if we disable the session persistence, everything works fine.

Thanks for your reply.

thread-dump-out.zip

@sbordet
Contributor Author

sbordet commented Nov 7, 2017

@lgangloff your issue is different from what is described in this issue, but it is indeed a bug.
Can you please open a new issue? Thanks!

@gregw
Contributor

gregw commented Nov 7, 2017

@sbordet is it a bug? I did an analysis and no deadlock was detected. I do see many threads waiting on the lock for the same session, so there is some contention on the session... Anyway, if it is a bug, it is definitely a different issue.

@lgangloff

Sorry, the fact that the threads are locking a monitor at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:362) made me think it was the same issue. Anyway, see #1947.
Thanks!

@sbordet
Contributor Author

sbordet commented Nov 7, 2017

@gregw yes, it is a bug, as we call user code while holding locks.
Not sure about the tool you used, but the thread dump from @lgangloff clearly shows a deadlock in the session code. I commented on #1947.
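
To illustrate why calling user code while holding a lock is hazardous, here is a minimal sketch contrasting the two patterns. It is not Jetty's session code: `SessionLike`, its listener list, and both setter methods are hypothetical. In the risky variant the callback runs with the lock held, so a callback that blocks on another lock owned by a thread waiting for this one can deadlock. The safer variant snapshots the listeners under the lock and invokes them after releasing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; not taken from Jetty's session implementation.
class SessionLike {
    private final Object lock = new Object();
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private String value = "";

    void addListener(Consumer<String> listener) {
        synchronized (lock) {
            listeners.add(listener);
        }
    }

    // Risky: user-supplied listeners run while the lock is held.
    void setValueHoldingLock(String newValue) {
        synchronized (lock) {
            value = newValue;
            for (Consumer<String> listener : listeners)
                listener.accept(newValue);
        }
    }

    // Safer: mutate state and copy the listeners under the lock,
    // then call user code only after the lock has been released.
    void setValue(String newValue) {
        List<Consumer<String>> snapshot;
        synchronized (lock) {
            value = newValue;
            snapshot = new ArrayList<>(listeners);
        }
        for (Consumer<String> listener : snapshot)
            listener.accept(newValue);
    }

    String getValue() {
        synchronized (lock) {
            return value;
        }
    }

    // Diagnostic helper: reports whether the calling thread holds the lock.
    boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(lock);
    }
}
```

A listener that calls `lockHeldByCurrentThread()` observes `true` when invoked via `setValueHoldingLock` and `false` when invoked via `setValue`.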

@gregw
Contributor

gregw commented Nov 12, 2017

There is some evidence that we are favouring selection too much and that we should always consume the action list as it exists initially before selecting.
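
A minimal sketch of the idea, assuming a count-based budget: each pass runs only the actions that were queued when the pass started, then proceeds to selection. This is not the code from the commits on this issue. `CountBasedActionLoop` and its methods are hypothetical, and the `select` log entry stands in for a real `Selector.select()` call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; this is not Jetty's ManagedSelector code.
class CountBasedActionLoop {
    private final Queue<Runnable> actions = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    void submit(Runnable action) {
        synchronized (this) {
            actions.add(action);
        }
    }

    // Runs only the actions that were queued when the pass started.
    // Actions submitted during the pass wait for the next pass, so the
    // pass is bounded even if actions keep re-submitting themselves.
    int runPendingActions() {
        int budget;
        synchronized (this) {
            budget = actions.size();
        }
        int ran = 0;
        while (ran < budget) {
            Runnable action;
            synchronized (this) {
                action = actions.poll();
            }
            if (action == null)
                break;
            action.run();   // run outside the lock
            ran++;
        }
        return ran;
    }

    // One loop iteration: a bounded action pass, then selection.
    void iterate() {
        int ran = runPendingActions();
        log.add("actions:" + ran);
        log.add("select");  // stand-in for Selector.select()
    }

    synchronized int pending() {
        return actions.size();
    }
}
```

With an action that re-submits itself on every run, each call to `iterate()` still runs exactly one action and then reaches the selection step, rather than spinning on the action queue.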

@gregw gregw reopened this Nov 12, 2017
gregw added a commit that referenced this issue Nov 12, 2017
Alternate implementation that is count based rather than time based.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
gregw added a commit that referenced this issue Nov 12, 2017
Signed-off-by: Greg Wilkins <gregw@webtide.com>
gregw added a commit that referenced this issue Nov 13, 2017
after review

Signed-off-by: Greg Wilkins <gregw@webtide.com>
gregw added a commit that referenced this issue Nov 14, 2017
* Issue #1924 - ManagedSelector livelock.

Alternate implementation that is count based rather than time based.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@joakime joakime changed the title from "ManagedSelector livelock" to "ManagedSelector can livelock under high load" Nov 21, 2017
@gregw gregw closed this as completed Mar 13, 2018