Provide production check list #274

@Yang-33

As a Decaton user with several years of experience, I've noticed there are several key pieces of information that beginners often overlook. Though the Javadoc itself is well written and contains no hidden traps, it would be great if we could create documentation that enables beginners to use Decaton with the confidence of an expert.

To that end, we'd like to propose adding more documentation, specifically in the form of a Production Checklist and Troubleshooting guide. While documentation can drift from the code over time, I don't believe this will create an unmanageable maintenance burden. Decaton's core fundamentals are already stable, and AI tools can assist with updates, though they are not perfect.

At a minimum, I'd like to cover the following topics. I'll also be gathering feedback from other Decaton users.

  • Metrics List: A list of key metrics, their names, and when to use them. (Related: Add a document about metrics #189)
  • Alerting: What alerts should be configured?
  • Subpartitions: When should you increase subpartitions, and when should you not?
  • Pending Records: When should you increase the pending records limit, and when should you not?
  • deferCompletion: What safety nets should be in place when using it? (alerts, logs, timeouts, ...)
  • Retry: Create a retry topic, because you will encounter failures someday. Decide between retrying forever and giving up, and use a proper retry backoff.
  • Assume at-least-once: You can't achieve exactly-once processing.
  • Select runtime: Pay attention to resource efficiency if your consumer processes a massive number of messages.
  • Use a better Kafka client: Producing may fail even with Decaton's retry, so you should handle this case.
  • Rebalancing, restarting the app: These take a long time when your configuration is wrong.
  • Ordering Guarantees: When is ordering guaranteed, and when is it not?
  • Safer chaining: Best practices for safer chaining (e.g., the producer on consumer pattern).
  • Skew: Strategies for handling a sudden spike in traffic for a specific key (e.g., ignore key, subpartitioning, PerKeyQuota).
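
To make the Retry item concrete, here is a minimal sketch of the decision it asks for: retry forever or give up after some number of attempts, with a capped exponential backoff in between. It is plain Java and does not use any Decaton API. The class name `RetryBackoffSketch`, its parameters, and the convention that a negative `maxAttempts` means "retry forever" are all assumptions made for illustration.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper, NOT part of Decaton: decides whether a failed task
// should be retried and how long to wait before the next attempt.
final class RetryBackoffSketch {
    private final long initialMillis;
    private final long maxMillis;
    private final int maxAttempts; // negative means "retry forever" (assumed convention)

    RetryBackoffSketch(Duration initial, Duration max, int maxAttempts) {
        if (initial.isNegative() || max.compareTo(initial) < 0) {
            throw new IllegalArgumentException("require 0 <= initial <= max");
        }
        this.initialMillis = initial.toMillis();
        this.maxMillis = max.toMillis();
        this.maxAttempts = maxAttempts;
    }

    // attempt is 1 for the first retry. An empty result means "give up".
    Optional<Duration> nextDelay(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        if (maxAttempts >= 0 && attempt > maxAttempts) {
            return Optional.empty();
        }
        // Double the delay on each attempt; bound the shift so it cannot overflow.
        int shift = Math.min(attempt - 1, 30);
        long delay = initialMillis > (maxMillis >> shift)
                ? maxMillis
                : Math.min(maxMillis, initialMillis << shift);
        return Optional.of(Duration.ofMillis(delay));
    }

    public static void main(String[] args) {
        RetryBackoffSketch policy =
                new RetryBackoffSketch(Duration.ofMillis(100), Duration.ofSeconds(5), 10);
        for (int attempt = 1; attempt <= 11; attempt++) {
            System.out.println(attempt + " -> " + policy.nextDelay(attempt));
        }
    }
}
```

Whichever choice is made, the give-up path presumably needs its own safety net (a log line, a metric, or an alert), since a task that stops retrying is otherwise lost silently.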
