doc: kill-switch when buffer is full (#1034)
Signed-off-by: Vigith Maurice <vigith@gmail.com>
vigith authored and whynowy committed Sep 14, 2023
1 parent 73db23a commit 8d5c56f
Showing 3 changed files with 40 additions and 1 deletion.
36 changes: 36 additions & 0 deletions docs/user-guide/reference/edge-tuning.md
@@ -0,0 +1,36 @@
# Edge Tuning

## Drop message onFull

This is an edge-level setting to drop messages when `buffer.isFull == true`. Even if the UDF or UDSink drops messages
due to an internal error in the user-defined code, the processing latency will spike, causing natural back pressure.
A kill switch to drop messages can help alleviate or avoid repercussions on the rest of the DAG.

This edge-level setting is configured via `onFull`. The default is `retryUntilSuccess`; the other option is
`discardLatest`.

This is a **data loss scenario**, but it can be useful for user-introduced experiments on the pipeline, such as
A/B testing, where it is acceptable for the experimental side of the DAG to lose data while production is unaffected.

### discardLatest

Setting `onFull` to `discardLatest` will silently drop the message if the edge's buffer is full.

```yaml
edges:
  - from: a
    to: b
    onFull: discardLatest
```

### retryUntilSuccess

The default setting for `onFull` is `retryUntilSuccess`, which makes sure the message is retried until it is
successfully written.

```yaml
edges:
  - from: a
    to: b
    onFull: retryUntilSuccess
```
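For context, a minimal sketch of how this setting might sit inside a full `Pipeline` spec. The pipeline name, the generator source, and the log sink are illustrative placeholders, not part of this commit:

```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: onfull-example
spec:
  vertices:
    - name: a
      source:
        generator:
          rpu: 5
          duration: 1s
    - name: b
      sink:
        log: {}
  edges:
    # Drop messages written to this edge whenever its buffer is full.
    - from: a
      to: b
      onFull: discardLatest
```

Omitting `onFull` on an edge leaves it at the default `retryUntilSuccess` behavior.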
4 changes: 3 additions & 1 deletion docs/user-guide/reference/pipeline-tuning.md
@@ -1,6 +1,8 @@
# Pipeline Tuning

For a data processing pipeline, each vertex keeps running the cycle of reading data from an Inter-Step Buffer (or data source),
processing the data, and writing to the next Inter-Step Buffers (or sinks). Some tuning is possible for this data
processing cycle.

- `readBatchSize` - How many messages to read for each cycle, defaults to `500`.
- `bufferMaxLength` - How many unprocessed messages can exist in the Inter-Step Buffer, defaults to `30000`.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -82,6 +82,7 @@ nav:
- Examples: "user-guide/user-defined-functions/reduce/examples.md"
- Reference:
- user-guide/reference/pipeline-tuning.md
- user-guide/reference/edge-tuning.md
- user-guide/reference/autoscaling.md
- user-guide/reference/conditional-forwarding.md
- user-guide/reference/join-vertex.md