From 8518baa70e1ec0dc609dbd36889e246752bc995e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Isma=C3=ABl=20Mej=C3=ADa?=
Date: Wed, 15 Feb 2017 17:45:25 +0100
Subject: [PATCH] Fix some typos and small formatting issues.

---
 src/_posts/2017-02-13-stateful-processing.md | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/_posts/2017-02-13-stateful-processing.md b/src/_posts/2017-02-13-stateful-processing.md
index b00361aac02..b3e0c63ad16 100644
--- a/src/_posts/2017-02-13-stateful-processing.md
+++ b/src/_posts/2017-02-13-stateful-processing.md
@@ -196,7 +196,7 @@ want to write a transform that maps input to output like this:
 Assigning arbitrary but unique indices to each element
+ width="180">
 The order of the elements A, B, C, D, E is arbitrary, hence their assigned indices are arbitrary, but downstream transforms just need to be OK with this.
@@ -238,9 +238,13 @@ key+window pairs, like this:
 keys and windows are independent dimensions)
 
 You can provide the opportunity for parallelism by making sure that table has
-enough columns, either via many keys in few windows - for example, a globally
-windowed stateful computation keyed by user ID - or via many windows over few
-keys - for example, a fixed windowed stateful computation over a global key.
+enough columns, either via:
+
+- Many keys in few windows for example, a globally windowed stateful computation
+  keyed by user ID.
+- Many windows over few keys for example, a fixed windowed stateful computation
+  over a global key.
+
 Caveat: all Beam runners today parallelize only over the key.
 
 Most often your mental model of state can be focused on only a single column of
@@ -444,7 +448,7 @@ outputs from the `ParDo` that will be processed downstream. If the
 output, then you cannot use a `Filter` transform to reduce data volume downstream.
 
 Stateful processing lets you address both the latency problem of side inputs
-and the cost problem of excessive uninterseting output. Here is the code, using
+and the cost problem of excessive uninteresting output. Here is the code, using
 only features I have already introduced:
 
 ```java