Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(clients): updates the tuning content for producers and consumers #9260

Merged
merged 5 commits into from Feb 22, 2024

Conversation

PaulRMellor
Copy link
Contributor

@PaulRMellor PaulRMellor commented Oct 18, 2023

Documentation

Some doc updates related to producer and consumer configuration (based on the feedback for the blog posts on developing producer and consumer applications)

  • adds an overview of the configuration options for securing access to Kafka
  • adds the key configuration considerations for producer and consumer performance
  • updates to the producer and consumer tuning content

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments. But TBH, I'm not sure this works for me. It tries to cover too much on very little space and that leads to oversimplification. So I'm not sure fro whom is this is really written. People who know Kafka will know this already. For people who do not know Kafka this does not explain enough and might just confuse them.

Copy link
Contributor

@fvaleri fvaleri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @PaulRMellor, thanks. I think that, for every tuning recommendation, we should always add the relevant metrics that the user can check to see if the configuration change has the desired effect in their use case. We should also say that it is recommended to change one configuration at a time. You can also try them using the simple example we created for the client blog posts.

Signed-off-by: prmellor <pmellor@redhat.com>
@PaulRMellor
Copy link
Contributor Author

I left some comments. But TBH, I'm not sure this works for me. It tries to cover too much on very little space and that leads to oversimplification. So I'm not sure fro whom is this is really written. People who know Kafka will know this already. For people who do not know Kafka this does not explain enough and might just confuse them.

It's a fair point. I think the original reason for adding the tuning content was to give an idea of the common ways to tune the components to satisfy specific use cases. The idea is that a user might want to know which properties would be suitable for Increasing bandwidth for high latency connections, for example. Our content provides some tips for those kinds of use cases. I imagine that kind of content is searched for a lot. Maybe we could do better on the information we provide. Federico suggests adding metrics info, for example.

@scholzj
Copy link
Member

scholzj commented Oct 31, 2023

I left some comments. But TBH, I'm not sure this works for me. It tries to cover too much on very little space and that leads to oversimplification. So I'm not sure fro whom is this is really written. People who know Kafka will know this already. For people who do not know Kafka this does not explain enough and might just confuse them.

It's a fair point. I think the original reason for adding the tuning content was to give an idea of the common ways to tune the components to satisfy specific use cases. The idea is that a user might want to know which properties would be suitable for Increasing bandwidth for high latency connections, for example. Our content provides some tips for those kinds of use cases. I imagine that kind of content is searched for a lot. Maybe we could do better on the information we provide. Federico suggests adding metrics info, for example.

TBH, I'm not sure we want to replace the Kafka docs. Maintaining this and testing this for every release would be a lot of load of work. Its not like a blog post where you can say that it was published a few years ago and might not apply anymore. And at least as my personal opinion - I think there is no way you can make this short and simple and yet understandable. You either need to document how Kafka works and how to write the client which is a lot of initial as well as ongoing work and I would argue world will be better of if that is done in Apache Kafka. Or you keep it short by making it arguably too simple and it would be either not helpful or spawn new questions we do not have capacity to answer.

Signed-off-by: prmellor <pmellor@redhat.com>
@scholzj
Copy link
Member

scholzj commented Nov 30, 2023

Discussed on the community call on 30.11.2023: We should think about what do we want to do with this content.

@scholzj scholzj modified the milestones: 0.39.0, 0.40.0 Dec 13, 2023
@PaulRMellor
Copy link
Contributor Author

@mimaison , @showuon - We're currently discussing maintenance of the (broker, producer, consumer) tuning content in the Strimzi documentation and your advice would be useful in shaping any decision.

The main points of discussion are as follows:

  1. Maintainability: Do we want to keep maintaining this type of content?
  2. Depth: Is the content sufficiently in-depth? If not, is it better to drop the content rather than invest more time in expanding and maintaining it?
  3. Inclusion in Kafka Documentation: Shouldn't this type of content be referenced from the Kafka documentation? And if there are gaps in the Kafka documentation, can that be rectified?

Do you think the tuning content is useful to have in the Strimzi documentation, and if so, how can we address the maintainability concerns? Alternatively, do you agree that it might be more appropriate to have such content in the Kafka documentation?

@mimaison
Copy link
Contributor

mimaison commented Jan 2, 2024

I think this content is fine. It brings some context around the configurations and their interactions. It stays relatively high level to give users an idea of the main concepts. Diving deeper is tricky as it often depends on specific use cases. The Kafka docs on the other side are much more a reference with just the descriptions of the configurations.

@PaulRMellor
Copy link
Contributor Author

I think this content is fine. It brings some context around the configurations and their interactions. It stays relatively high level to give users an idea of the main concepts. Diving deeper is tricky as it often depends on specific use cases. The Kafka docs on the other side are much more a reference with just the descriptions of the configurations.

Thanks @mimaison. I appreciate your perspective. Our goal is to provide practical insights into best practices and common configurations. Regarding maintenance, the content has remained relatively stable since its creation. However, if you feel there are areas we're not addressing, please let me know.

Signed-off-by: prmellor <pmellor@redhat.com>
Copy link
Contributor

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!
There's just the comment about the assignment strategy which I'm not sure whether we could make a better recommendation.

Copy link
Contributor

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I also think this doc is helpful. You can check what upstream kafka's doc about important configs:

image

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving after discussion with @PaulRMellor. If anyone has anymore comments, please share them. Otherwise we are going to merge it.

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit. LGTM.


For more stable partition assignments, consider the *sticky* and *cooperative sticky* strategies.
Sticky strategies maintain assigned partitions during rebalances, when possible.
If a consumer was previously assigned certain partitions, the sticky strategies aim to reassign those same partitions back to the same consumer after a rebalance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC it's not going to "reassign" in the sense that it revokes and reassigns, leaving the consumer not consuming until the reassignment happens. IIRC it doesn't revoke the partitions which should stick with that consumer. Just revokes the ones which will be moved to another consumer. @mimaison can you confirma that? If yes, @PaulRMellor could we make it clearer here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point. I don't think reassign is a good way of wording this. I've updated here: docs(review): edits to doc from review by PP

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You fine with this @ppatierno? Can we merge it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's fine with me. Thanks.

Signed-off-by: prmellor <pmellor@redhat.com>
@scholzj scholzj merged commit 6975ee3 into strimzi:main Feb 22, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants