
Lower Performance Compared to Java Client #2103

Closed
HY1310 opened this issue Jan 7, 2022 · 5 comments

Comments

@HY1310

HY1310 commented Jan 7, 2022

Versions

          Sarama                 Kafka   Go
Version   1.30.0 (2021-09-29)    2.6     1.16
Configuration

What configuration values are you using for Sarama and Kafka?

	s.SaramaConfig = sarama.NewConfig()
	s.SaramaConfig.Version = sarama.V1_1_0_0
	s.SaramaConfig.Metadata.Timeout = time.Second * 30

	s.SaramaConfig.Producer.RequiredAcks = sarama.WaitForLocal
	s.SaramaConfig.Producer.Return.Successes = true
	s.SaramaConfig.Producer.Return.Errors = true

	s.SaramaConfig.Producer.Retry.Max = 0 // disable retry, use application level retry
	s.SaramaConfig.Producer.Flush.Bytes = 1024 * 1024
	s.SaramaConfig.Producer.Flush.Frequency = time.Millisecond
	s.SaramaConfig.Net.MaxOpenRequests = 5

	s.SaramaConfig.Producer.Flush.MaxMessages = 10
	s.SaramaConfig.Producer.Flush.Messages = 10
	s.SaramaConfig.Producer.MaxMessageBytes = 10 * 1024 * 1024
Problem Description

Client   QPS
Java     1895
Sarama   610

Test scenario:
message size - 1 KB
network latency - 20 ms

If the network latency is below 10 ms, Sarama achieves a similar QPS to the Java client, but Sarama's performance drops significantly when the network latency is high.

Looking into async_producer.go and broker.go, we found that producing to a broker is actually a synchronous invocation:

func (b *Broker) Produce(request *ProduceRequest) (*ProduceResponse, error) {
	var (
		response *ProduceResponse
		err      error
	)

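	// sendAndReceive waits for the broker's response before returning, so each
	// Produce request costs a full network round trip before the next one is sent.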
	if request.RequiredAcks == NoResponse {
		err = b.sendAndReceive(request, nil)
	} else {
		response = new(ProduceResponse)
		err = b.sendAndReceive(request, response)
	}

	if err != nil {
		return nil, err
	}

	return response, nil
}

This explains the drop in QPS. Is it possible to optimize Sarama's throughput in high-latency scenarios? Thanks a lot.
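For reference, here is a minimal sketch of the kind of measurement loop behind these numbers (the broker address, topic name, and message count are placeholders, not the actual test harness):

package main

import (
	"fmt"
	"time"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V1_1_0_0
	cfg.Producer.RequiredAcks = sarama.WaitForLocal
	cfg.Producer.Return.Successes = true
	cfg.Producer.Return.Errors = true

	// "localhost:9092" and "bench-topic" are placeholder values.
	producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		panic(err)
	}
	defer producer.Close()

	const total = 10000
	payload := make([]byte, 1024) // 1 KB message

	done := make(chan struct{})
	go func() {
		// Drain one acknowledgement (success or error) per message sent.
		for i := 0; i < total; i++ {
			select {
			case <-producer.Successes():
			case err := <-producer.Errors():
				fmt.Println("produce error:", err)
			}
		}
		close(done)
	}()

	start := time.Now()
	for i := 0; i < total; i++ {
		producer.Input() <- &sarama.ProducerMessage{
			Topic: "bench-topic",
			Value: sarama.ByteEncoder(payload),
		}
	}
	<-done

	elapsed := time.Since(start)
	fmt.Printf("sent %d messages in %s (%.0f msg/s)\n",
		total, elapsed, float64(total)/elapsed.Seconds())
}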

@dnwe
Collaborator

dnwe commented Jan 7, 2022

@HY1310 there was a proposed PR raised over Christmas that is intended to improve this particular situation. I haven’t yet had a chance to review it since returning from vacation, but if you could give it a try and see if it helps and report back that would be great!

see #2094

@HY1310
Author

HY1310 commented Jan 10, 2022

@dnwe I have tested the PR; the overall throughput of our application increased from about 500 QPS to 2500 QPS. It's awesome. I hope the PR can be merged soon.

@dnwe
Collaborator

dnwe commented Jan 23, 2022

@HY1310 PR is merged and a new version 1.31.0 has been released, enjoy!

Thanks to @slaunay for the excellent contribution

dnwe closed this as completed Jan 23, 2022
@HY1310
Author

HY1310 commented Jan 24, 2022

@dnwe @slaunay thanks a lot:)

@slaunay
Contributor

slaunay commented Jan 24, 2022

@HY1310, glad to hear you are getting better performance out of Sarama.

If you have a constant flow of records and 20 ms of latency (including the time it takes to persist records), you can send at most 1000 / 20 = 50 Produce requests per second using 1.30.0, because request pipelining was not honoured.
With config.Producer.Flush.MaxMessages = 10, that translates to up to 50 * 10 = 500 records/s written to a single broker (likely more if you write to multiple brokers).

With 1.31.0 you can now use request pipelining with up to 5 or 6 in-flight Produce requests per broker, so you could theoretically send up to (1000 / 20) * 5 = 250 Produce requests per second, or 2,500 records/s, to a single broker.

But if you generate records at a higher rate, you can actually get better throughput by accumulating more than 10 records in a Produce request.
You would need to:

  • remove the config.Producer.Flush.MaxMessages (hard bound) and possibly config.Producer.Flush.Messages (soft bound) configuration properties
  • increase lingering (config.Producer.Flush.Frequency)

By fitting/batching more records into a Produce request you will increase your throughput and also reduce the per-request overhead on the target broker (larger writes are generally better than lots of "small" writes).
The main drawback is that some records will be delayed longer, but you can control that with config.Producer.Flush.Frequency, which sets a soft upper bound on that lingering.
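For illustration, a sketch of those adjustments applied to the s.SaramaConfig object from the issue description (the 10ms linger is a placeholder value, not a recommendation):

	s.SaramaConfig.Producer.Flush.MaxMessages = 0 // 0 = no hard per-request cap
	s.SaramaConfig.Producer.Flush.Messages = 0    // drop the soft 10-message trigger
	s.SaramaConfig.Producer.Flush.Bytes = 1024 * 1024
	s.SaramaConfig.Producer.Flush.Frequency = 10 * time.Millisecond // linger longer to grow batches
	s.SaramaConfig.Net.MaxOpenRequests = 5 // request pipelining, honoured as of 1.31.0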
