Word Probe is a dataflow that shows how to use state across services. This example has two services. The first service reads sentences, splits them into words, and stores each word's count in an aggregate. The second service reads words, looks them up in the aggregate, and returns their counts.
The dataflow uses the following primitives:
- map
- flat-map
- assign-key
- update-state
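The way these primitives fit together can be pictured with a small Python sketch. This is not the dataflow's actual code (the real definitions live in dataflow.yaml). The function names, the placement of each primitive, and the lowercase keying are assumptions made for illustration only.

```python
from collections import defaultdict


def split_sentence(sentence):
    """flat-map: one sentence in, zero or more words out."""
    return sentence.split()


def key_by_word(word):
    """assign-key: pick the key for a record (assumed: the lowercased word)."""
    return word.lower()


def increment_word_count(state, key):
    """update-state: bump the counter stored under the key."""
    state[key] += 1


def count_words(state, sentence):
    """First service: read a sentence and update the shared aggregate."""
    for word in split_sentence(sentence):
        increment_word_count(state, key_by_word(word))


def lookup_word(state, word):
    """Second service (map): read a word and return its count from the aggregate."""
    # .get() avoids inserting a zero entry for words that were never seen.
    return {"word": word, "count": state.get(word, 0)}


if __name__ == "__main__":
    demo_state = defaultdict(int)  # stands in for the state shared by both services
    count_words(demo_state, "the eyes reflect what is in the heart and soul")
    print(lookup_word(demo_state, "the"))
```

The state is passed in explicitly so that both functions visibly operate on the same aggregate, which is the point of the example: one service writes it, the other only reads it.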
Take a look at the dataflow.yaml to get an idea of what we're doing.
Make sure to install SDF and start a Fluvio cluster.
Use the sdf command-line tool to run the dataflow:
sdf run --ui
Use --ui to open the Studio.
For this example, we'll use the following data files: ./sample-data/sentences.txt and ./sample-data/words.txt.
# sentences.txt
behind every great man is a woman rolling her eyes
the eyes reflect what is in the heart and soul
keep your eyes on the stars and your feet on the ground
# words.txt
eyes
stars
the
Produce the data to the sentences and words topics:
fluvio produce sentences -f ./sample-data/sentences.txt
fluvio produce words -f ./sample-data/words.txt
Observe the data produced in both topics:
fluvio consume sentences -Bd
fluvio consume words -Bd
Consume from the word-counts topic to check the result:
fluvio consume word-counts -Bd -O json
You should see something like this:
{
  "count": 3,
  "word": "eyes"
}
{
  "count": 1,
  "word": "stars"
}
{
  "count": 4,
  "word": "the"
}
Use the show state command in the sdf terminal to inspect the internal state of the count-per-word aggregate:
show state count-words/count-per-word/state
You should see something like this:
Key     count
a       1
and     2
behind  1
every   1
eyes    3
feet    1
great   1
...
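To sanity-check this table, the same counts can be recomputed offline from the contents of sentences.txt. The sketch below is plain Python and independent of SDF. It assumes words are split on whitespace with no further normalization, and the helper names are hypothetical.

```python
from collections import Counter

# Contents of ./sample-data/sentences.txt, copied from this example.
SENTENCES = [
    "behind every great man is a woman rolling her eyes",
    "the eyes reflect what is in the heart and soul",
    "keep your eyes on the stars and your feet on the ground",
]


def word_counts(sentences):
    """Count every whitespace-separated word across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.split())
    return counts


def state_rows(counts):
    """Return (word, count) pairs sorted by word, like the state table."""
    return sorted(counts.items())


if __name__ == "__main__":
    for word, count in state_rows(word_counts(SENTENCES)):
        print(f"{word:<10}{count}")
```

The first rows printed should line up with the state table, and the entries for eyes, stars, and the should match the counts returned on the word-counts topic.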
Note: The dataflow stops processing records when you close the interactive editor. To resume processing, run sdf run again.
Congratulations! You've successfully built and run a dataflow!
Exit the sdf terminal and clean up. The --force flag removes the topics:
sdf clean --force