-
Notifications
You must be signed in to change notification settings - Fork 16
Kafka streams config fix/tuning: hypertrace-ingester #91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #91 +/- ##
=========================================
Coverage 68.63% 68.63%
Complexity 819 819
=========================================
Files 84 84
Lines 3475 3475
Branches 367 367
=========================================
Hits 2385 2385
Misses 946 946
Partials 144 144
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
|
@laxmanchekka Can you add some description and context for this change? |
| default: | ||
| aliases: | ||
| - pinot-controller | ||
| - pinot-server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laxmanchekka This has fixed the e2e test for hypetrace-ingester based on docker-compose. Recently, I have added new alias and so, it was failing.
| consumer.session.timeout.ms = "{{ int .Values.rawSpansGrouperConfig.kafkaStreamsConfig.consumerSessionTimeoutMs }}" | ||
| # Changelog topic configs | ||
| replication.factor = "{{ int .Values.rawSpansGrouperConfig.kafkaStreamsConfig.replicationFactor }}" | ||
| topic.cleanup.policy = "delete,compact" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always wonder shouldn't it be compact,delete? what does delete,compact mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compact,delete delete,compact both are same.
It's basically a list of values. Order doesn't matter.
raw-spans-grouper/helm/values.yaml
Outdated
| - retention.bytes=4294967296 | ||
| - retention.ms=259200000 | ||
| - retention.bytes=8589934592 # default = -1 | ||
| - retention.ms=86400000 # default = 604800000 (7 days) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A retention period of 12 hours should be sufficient for this topic.
span-normalizer/helm/values.yaml
Outdated
| - retention.bytes=4294967296 | ||
| - retention.ms=259200000 | ||
| - retention.bytes=8589934592 # default = -1 | ||
| - retention.ms=86400000 # default = 604800000 (7 days) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12 hours
| schemaRegistryUrl: "http://schema-registry-service:8081" | ||
| # Core config | ||
| numStreamThreads: 2 # default = 1 | ||
| commitIntervalMs: 30000 # default = 30000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are keeping the default from kafka, does it make sense to add an if in the chart and thus not requiring us to pass a default? cc @JBAhire
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes the chart more complex by adding many if-else branches.
These are very important handpicked kafka/kafka-streams configs that needs to be tuned as per the application performance/throughput requirements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Do you think if we can put it as a commented block if a default is the same and it can be uncommented if some values need to be changed?
# default = 30000 override it for Xyz
# commitIntervalMs: 30000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah above will again need if/else block in the template which you mentioned. Got it now that this was done for keeping the template simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if/else blocks saves us the need to define the defaults ourselves because it adds noise to a already big set of settings. I don't think we will have complex if { if else } else { if else }, it is just a simple if.
| consumerMaxPollRecords: 1000 # default = 1000 (kafka streams default) | ||
| consumerSessionTimeoutMs: 10000 # default = 10000 | ||
| # Others | ||
| metricsRecordingLevel: INFO # default = INFO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as in https://github.com/hypertrace/hypertrace-ingester/pull/91/files#r549230957
| producerBufferMemory: 134217728 # default = 33554432 | ||
| # Consumer configs | ||
| consumerMaxPartitionFetchBytes: 4194304 # default = 1048576 | ||
| consumerMaxPollRecords: 1000 # default = 1000 (kafka streams default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as in https://github.com/hypertrace/hypertrace-ingester/pull/91/files#r549230957
| topics: | ||
| - name: raw-spans-from-jaeger-spans | ||
| replicationFactor: 2 | ||
| replicationFactor: 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am interested on this change 2->3. Any particular reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended and default replication factor itself is 3.
Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com>
Co-authored-by: José Carlos Chávez <jcchavezs@gmail.com>
| schemaRegistryUrl: "http://schema-registry-service:8081" | ||
| # Core config | ||
| numStreamThreads: 2 # default = 1 | ||
| commitIntervalMs: 30000 # default = 30000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah above will again need if/else block in the template which you mentioned. Got it now that this was done for keeping the template simple.
No description provided.