diff --git a/README.md b/README.md
index d582e63b8..8da5e62f9 100644
--- a/README.md
+++ b/README.md
@@ -8,9 +8,8 @@
 
 ## Summary
 
-Numaflow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.
+Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices.
 
-A Numaflow Pipeline is implemented as a Kubernetes custom resource, and consists of one or more sources, data processing and sink vertices.
 
 Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platforms.
 
@@ -21,6 +20,12 @@ Numaflow installs in a few minutes and is easier and cheaper to use for simple d
 - Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
 - Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.
 
+
+## Data Integrity Guarantees:
+- Minimally provide at-least-once semantics
+- Provide exactly-once semantics for unbounded and near real-time data sources
+- Preserving order is not required
+
 ## Roadmap
 
 - Data aggregation (e.g. group-by)