Embrace auto.topic.create; trust defaults for production topics #107

Merged 10 commits on Dec 18, 2017

Conversation

solsson
Contributor

@solsson solsson commented Dec 13, 2017

This is a mindset change for us, meant to address at least one bullet in #101: we will omit the CLI args for number of replicas and partitions when creating production topics. Instead the defaults should reflect the current cluster size - and stage. Test clusters may replicate less, in order to save storage cost for example.

Producers still need to configure acks, but broker config helps here too. We should set acks to all by default (or -1 if it has to be an int), and let min.insync.replicas be configured so that single node loss is a non-event.
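
As a rough sketch of what such broker defaults could look like in server.properties, assuming a three-broker production cluster (the values below are illustrative assumptions, not taken from this PR):

```properties
# Assumed example values for a 3-broker production cluster; tune per cluster size and stage.
# Let producers and consumers create topics implicitly.
auto.create.topics.enable=true
# Defaults applied to auto-created topics and to topics created without explicit args.
default.replication.factor=3
num.partitions=1
# With acks=all on the producer, writes still succeed when one of three replicas is down.
min.insync.replicas=2
```

A test cluster could lower default.replication.factor (and min.insync.replicas with it) to save storage, while topic creation commands stay identical across stages.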

Our defaults will be for stability. If some topics need better throughput (#88, #92 ?) we'll explicitly create (or reconfigure) those.

so that we can omit replication-factor when creating topics,
as we benefit from reusing such commands or definitions across dev-qa-prod.
topics at runtime, such as splitting a stream based on some business enum.
Producers and Kafka Streams apps would otherwise need to set up an AdminClient to do that.

This reverts commit: 0681cc5
in one place. These values are critical to maintain for those,
like us, who make use of auto create topics for production data.

Also a step towards #72 and #77.
@solsson solsson added this to the v3.1 milestone Dec 13, 2017
and easy to reconfigure per topic as they grow.

We already had that, but this branch cares about grouping such conf.
It also encourages topic defaults geared towards persistent data.
config location suggested by Kafka's sample conf.
as we now encourage close scrutiny of the config file.
@solsson
Contributor Author

solsson commented Dec 13, 2017

Actually ended up replacing #77 here (cons there were outdated), fixes #72.

	log.dir = /tmp/kafka-logs
	log.dirs = /var/lib/kafka/data/topics

but this is the lesser of two evils compared to duplicate values
@solsson solsson merged commit e9e6b24 into master Dec 18, 2017
solsson added a commit that referenced this pull request Nov 28, 2018
so we want to undo #107.
It was partially based on a false assumption, as pointed out in
#101 (comment)

Topics are created not only on produce but also on consume, for example with kafkacat -C.
Typos cost us more time than it would take to automate topic creation
and run ./bin/kafka-topics.sh in a temporary pod when we haven't automated.
solsson added a commit that referenced this pull request Dec 20, 2018
solsson added a commit that referenced this pull request Oct 8, 2019
I thought that the feature would be neat for development (with replication factor 1, see #222)
but it causes just as much confusion and useless troubleshooting there,
for example race conditions between intentional topic creation and a container
starting up to produce to the topic. You actually never know which topic config you're getting.

Related: #107

The duplication is a workaround for kubernetes-sigs/kustomize#642