feat(flow): shared in-memory state for dataflow operator #3508
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3508      +/-   ##
==========================================
- Coverage   85.30%   85.00%   -0.31%
==========================================
  Files         904      907       +3
  Lines      150170   151406    +1236
==========================================
+ Hits       128106   128704     +598
- Misses      22064    22702     +638
LGTM
LGTM
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
#3187
What's changed and what's your intention?
This PR adds `Arrangement`, a shared in-memory key-value state that operators use to keep their state during dataflow execution.
For example, an Mfp operator with a temporal filter needs to store its future output so that it can add a row now and delete it later. To get all the updates needed in a time span, use `get_updates_in_range`.
A reduce operator needs the full state of its output so that it can query and modify existing state (by calling `apply_updates`), and it also needs a way to expire keys. To get a key's current value, use `get` with the time set to `now`.
The resulting pipeline looks like:
mfp operator -> arrange (stores future updates only, no expiry) -> reduce operator <-> arrange (full state, with key expiry time) -> output
Note the two-way arrow between the reduce operator and its arrangement: the reduce operator needs to both query and update existing state.
This state will be used when building and executing the dataflow graph.
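To make the description above more concrete, here is a minimal sketch of how such a shared state could behave. It is not the code in this PR: the field layout, the simplified `Key`/`Value` types, the `ttl` field, and the exact method signatures are all assumptions made for illustration. Only the method names `apply_updates`, `get_updates_in_range`, and `get` come from the description.

```rust
use std::collections::BTreeMap;
use std::ops::RangeInclusive;

// Simplified stand-ins (assumptions); real rows would be richer than this.
type Timestamp = i64;
type Diff = i64;
type Key = String;
type Value = i64;

/// One update: `(key, value)` changes by `diff` at time `ts`.
type Update = (Key, Value, Timestamp, Diff);

/// Hypothetical, simplified arrangement keyed by update time.
#[derive(Default)]
struct Arrangement {
    /// Updates bucketed by timestamp, so scans over a time span are cheap.
    updates: BTreeMap<Timestamp, Vec<(Key, Value, Diff)>>,
    /// Assumed expiry rule: a key whose last update is older than
    /// `now - ttl` is treated as expired. `None` means keys never expire.
    ttl: Option<Timestamp>,
}

impl Arrangement {
    /// Record a batch of updates; timestamps may lie in the future.
    fn apply_updates(&mut self, batch: Vec<Update>) {
        for (key, value, ts, diff) in batch {
            self.updates.entry(ts).or_default().push((key, value, diff));
        }
    }

    /// All updates whose timestamp falls inside `range`, in time order.
    fn get_updates_in_range(&self, range: RangeInclusive<Timestamp>) -> Vec<Update> {
        self.updates
            .range(range)
            .flat_map(|(ts, batch)| {
                batch.iter().map(move |(k, v, d)| (k.clone(), *v, *ts, *d))
            })
            .collect()
    }

    /// Current value of `key` at time `now`: sum the diffs of every update
    /// up to `now` and return a value whose accumulated count is positive.
    fn get(&self, now: Timestamp, key: &str) -> Option<Value> {
        let mut counts: BTreeMap<Value, Diff> = BTreeMap::new();
        let mut last_ts: Option<Timestamp> = None;
        for (ts, batch) in self.updates.range(..=now) {
            for (k, v, d) in batch {
                if k.as_str() == key {
                    *counts.entry(*v).or_insert(0) += *d;
                    last_ts = Some(*ts);
                }
            }
        }
        // Apply the assumed expiry rule before answering.
        if let (Some(ttl), Some(ts)) = (self.ttl, last_ts) {
            if ts < now - ttl {
                return None;
            }
        }
        counts.into_iter().find(|(_, count)| *count > 0).map(|(v, _)| v)
    }
}
```

In this sketch, the arrangement after the mfp operator would be created with `ttl: None` and fed an insertion at one timestamp plus a matching deletion at a later one, so that `get_updates_in_range` returns each when its time span comes up. The arrangement behind the reduce operator would set a `ttl`, read the current aggregate with `get(now, key)`, and then write back a retraction of the old value together with an insertion of the new one through `apply_updates`.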