
JdbcChannelMessageStore: poor performance for long queue of messages #2629

Closed
kamil-gawlik opened this issue Nov 14, 2018 · 11 comments
Labels
status: waiting-for-reporter Needs a feedback from the reporter

Comments

@kamil-gawlik

Hi,
We recently faced serious performance issues with spring-integration-jdbc:5.0.7 and PostgreSQL 9.5.13.

With a DB-backed queue containing over 1.8 million messages, the average processing speed was 2 messages per second. After a long investigation we found that the problem was caused by the message-polling logic, more precisely the part that checks the size of the queue.
For millions of records, the MessageGroupQueue.size() method issued the following SQL from AbstractChannelMessageStoreQueryProvider.getCountAllMessagesInGroupQuery():

SELECT COUNT(MESSAGE_ID) from %PREFIX%CHANNEL_MESSAGE where GROUP_KEY=? and REGION=?

which could take several seconds to complete, since PostgreSQL's MVCC design forces COUNT to scan all the visible rows (see: slow counting).

Quick fix:

Our workaround consisted of overriding the PostgresChannelMessageStoreQueryProvider.getCountAllMessagesInGroupQuery() method to use a faster count, with an approach similar to the one presented here: fast count.
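
Roughly along these lines (a minimal sketch; the class name and the dummy predicates that consume the two parameters the store binds are our own, and pg_class.reltuples is only a table-wide estimate maintained by ANALYZE/autovacuum):

import org.springframework.integration.jdbc.store.channel.PostgresChannelMessageStoreQueryProvider;

public class EstimatedCountQueryProvider extends PostgresChannelMessageStoreQueryProvider {

    @Override
    public String getCountAllMessagesInGroupQuery() {
        // reltuples ignores the GROUP_KEY/REGION filter, hence "estimated";
        // the dummy predicates just consume the store's two bound parameters
        return "SELECT reltuples::bigint FROM pg_class"
                + " WHERE relname = lower('%PREFIX%CHANNEL_MESSAGE')"
                + " AND ? IS NOT NULL AND ? IS NOT NULL";
    }
}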

Improvement suggestion

I would suggest changing following part of MessageGroupQueue.poll:

try {
  while (this.size() == 0 && timeoutInNanos > 0) {
    timeoutInNanos = this.messageStoreNotEmpty.awaitNanos(timeoutInNanos);
  }
  message = this.doPoll();
}

to use a new method, this.isEmpty(), which for PostgreSQL can run the following query:
select exists(select 1 from %PREFIX%channel_message where GROUP_KEY=? and REGION=?)
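
A hypothetical sketch of that contract on the store side (isEmpty() does not exist in the framework; the getQuery() and getKey() helpers are assumed from the JdbcChannelMessageStore internals):

// hypothetical method on JdbcChannelMessageStore, not an existing API
public boolean isEmpty(Object groupId) {
    return !this.jdbcTemplate.queryForObject(
            getQuery("SELECT EXISTS(SELECT 1 FROM %PREFIX%CHANNEL_MESSAGE"
                    + " WHERE GROUP_KEY=? AND REGION=?)"),
            Boolean.class, getKey(groupId), this.region);
}

The wait loop in MessageGroupQueue.poll() could then call this instead of comparing size() to zero.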

Our workaround does not seem to be the best solution, as it returns only an estimated value.

Regards, Kamil

PS: it may be connected to issue #2628.

@artembilan
Member

Not sure why you show MessageGroupQueue, since the story should really fall only into the store implementation, the queries and the DB configuration.

It is not fully clear how that exists query can help us with the count, but what we have realized is that we need to fix the index to:

CREATE INDEX INT_CHANNEL_MSG_DATE_IDX ON INT_CHANNEL_MESSAGE (GROUP_KEY, REGION, CREATED_DATE, MESSAGE_SEQUENCE);

I believe you can do that right now on your DB and come back to us with the feedback after that.

Also, I would like to say that the framework doesn't call MessageGroupQueue.size() explicitly.

Either you have a problem in some other place, or your problem is really about that poll query, not the count...

@artembilan artembilan added the status: waiting-for-reporter Needs a feedback from the reporter label Nov 14, 2018
@artembilan
Member

Oh! Sorry, I see the size() call from the poll(long timeout, TimeUnit unit).

So, I think your idea about an extra isEmpty() contract on the message store really makes sense and will improve performance.

@garyrussell
Contributor

Setting the consumer's receive timeout to 0 will avoid that size() call.
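
For example, a sketch assuming Java configuration (the 100 ms trigger interval is arbitrary):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.scheduling.PollerMetadata;
import org.springframework.scheduling.support.PeriodicTrigger;

@Configuration
public class PollerConfig {

    // with receiveTimeout = 0 the consumer uses the non-blocking queue.poll()
    // instead of poll(timeout, unit), so the size()-based wait loop is skipped
    @Bean(name = PollerMetadata.DEFAULT_POLLER)
    public PollerMetadata defaultPoller() {
        PollerMetadata pollerMetadata = new PollerMetadata();
        pollerMetadata.setTrigger(new PeriodicTrigger(100));
        pollerMetadata.setReceiveTimeout(0);
        return pollerMetadata;
    }
}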

@artembilan
Member

Yeah... The default one is:

private volatile long receiveTimeout = 1000;

Does it help somehow in your test with Oracle, Gary?

I can install MySQL though, to be sure that we do as much testing as possible.

@garyrussell
Contributor

Yes; I am testing with Oracle now; it improved a little so far, but not a lot.

@garyrussell
Contributor

Seems to me we can replace this

while (this.queue.size() == 0 && nanos > 0) {

with a simple poll() and a check for non-null.

@artembilan
Member

Yeah... Looks like our pollMessageFromGroup() is never blocked. It does not make sense to emulate blocking with an extra size() call.

@garyrussell
Contributor

I mean this...

long nanos = TimeUnit.MILLISECONDS.toNanos(timeout);
long deadline = System.nanoTime() + nanos;
Message<?> message = this.queue.poll();
while (message == null && nanos > 0) {
	this.queueSemaphore.tryAcquire(nanos, TimeUnit.NANOSECONDS); // NOSONAR - ok to ignore result
	message = this.queue.poll();
	if (message == null) {
		nanos = deadline - System.nanoTime();
	}
}
return message;

@artembilan
Member

Looks like you are talking about the code in QueueChannel.doReceive(); that piece with the size() is not related to our MessageGroupQueue, since that one is indeed a BlockingQueue where we call its own poll(long timeout, TimeUnit unit).

But I agree that that one has to be fixed too, as we are discussing it here.
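
Applied to MessageGroupQueue.poll(long, TimeUnit), the same poll-first approach could look roughly like this (a sketch reusing the doPoll(), messageStoreNotEmpty and timeoutInNanos names from the snippet quoted above):

// poll first; only wait on the condition when nothing was retrieved,
// so no size()/COUNT query is ever issued
Message<?> message = this.doPoll();
while (message == null && timeoutInNanos > 0) {
    timeoutInNanos = this.messageStoreNotEmpty.awaitNanos(timeoutInNanos);
    message = this.doPoll();
}
return message;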

@garyrussell
Contributor

Oh, right; yes, of course.

@artembilan
Member

artembilan commented Nov 15, 2018

See JIRA https://jira.spring.io/browse/INT-4553, where we are going to remove the usage of that size(), together with some index improvements.

garyrussell added a commit to garyrussell/spring-integration that referenced this issue Nov 16, 2018
JIRA: https://jira.spring.io/browse/INT-4553
Fixes spring-projects#2628
Fixes spring-projects#2629

- Avoid `size()` calls on the MGS, use `poll()` instead.
- Optimize the indexes for the `INT_CHANNEL_MESSAGE` table.
garyrussell added a commit to garyrussell/spring-integration that referenced this issue Nov 20, 2018
JIRA: https://jira.spring.io/browse/INT-4553
Fixes spring-projects#2628
Fixes spring-projects#2629

- Avoid `size()` calls on the MGS, use `poll()` instead.
- Optimize the indexes for the `INT_CHANNEL_MESSAGE` table.
artembilan pushed a commit that referenced this issue Nov 30, 2018
JIRA: https://jira.spring.io/browse/INT-4553
Fixes #2628
Fixes #2629

- Avoid `size()` calls on the MGS, use `poll()` instead.
- Optimize the indexes for the `INT_CHANNEL_MESSAGE` table.

Avoid size call when no timeout too.

Polishing - PR Comments

Missed a doc fix

Another missed %PREFIX%

Fix underscores

Polishing; PR comments; make MGQ extendable.

Fix version in doc.

* Polishing `@since`
* Use diamonds whenever it is possible

**Cherry-pick to 5.0.x**

# Conflicts:
#	src/reference/asciidoc/jdbc.adoc
#	src/reference/asciidoc/whats-new.adoc