dbsp: New FallbackKeyBatch and FallbackValBatch types. #1656
`cargo doc` complained about `` [`Antichain`](crate::time::Antichain) ``, saying that the explicit reference was the same as the default one. Signed-off-by: Ben Pfaff <blp@feldera.com>
I'd copy-pasted this in several places and this consolidates the implementation. Signed-off-by: Ben Pfaff <blp@feldera.com>
…ck`. This code was copy-pasted privately into two modules. It could be useful more broadly (and it will be used more broadly in upcoming commits), so this moves it into `dbsp::trace::cursor` and makes it public. Signed-off-by: Ben Pfaff <blp@feldera.com>
`Cursor::map_times` and related functions are usually the right way to work with time-diff pairs in a `Cursor`. However, a cursor interface is sometimes useful. This commit adds such an interface and implements it for the batches where it will be needed in an upcoming commit. Signed-off-by: Ben Pfaff <blp@feldera.com>
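To illustrate the difference between the two styles, here is a minimal sketch. It is not the dbsp API: `TimeDiffCursor`, `merge_time_diffs`, and the free-standing `map_times` below are hypothetical names, and time-diff pairs are modeled as a plain sorted slice. The callback style visits every pair in one pass, while the cursor style lets the caller decide when to advance, which is what makes interleaving two sequences straightforward.

```rust
// Hypothetical sketch only; these names and signatures are assumptions,
// not the interface added by this commit.

/// Callback style: visit every (time, diff) pair in order.
fn map_times<T, R>(pairs: &[(T, R)], mut logic: impl FnMut(&T, &R)) {
    for (t, r) in pairs {
        logic(t, r);
    }
}

/// Cursor style: the caller controls when to advance.
struct TimeDiffCursor<'a, T, R> {
    pairs: &'a [(T, R)],
    index: usize,
}

impl<'a, T, R> TimeDiffCursor<'a, T, R> {
    fn new(pairs: &'a [(T, R)]) -> Self {
        Self { pairs, index: 0 }
    }

    /// The current pair, or `None` once the cursor is exhausted.
    fn current(&self) -> Option<&'a (T, R)> {
        self.pairs.get(self.index)
    }

    /// Move to the next pair, if any.
    fn step(&mut self) {
        if self.index < self.pairs.len() {
            self.index += 1;
        }
    }
}

/// Interleave two time-sorted lists by stepping two cursors in lockstep.
fn merge_time_diffs<T: Ord + Clone, R: Clone>(a: &[(T, R)], b: &[(T, R)]) -> Vec<(T, R)> {
    let mut ca = TimeDiffCursor::new(a);
    let mut cb = TimeDiffCursor::new(b);
    let mut out = Vec::new();
    loop {
        match (ca.current(), cb.current()) {
            (Some(x), Some(y)) => {
                if x.0 <= y.0 {
                    out.push(x.clone());
                    ca.step();
                } else {
                    out.push(y.clone());
                    cb.step();
                }
            }
            (Some(x), None) => {
                out.push(x.clone());
                ca.step();
            }
            (None, Some(y)) => {
                out.push(y.clone());
                cb.step();
            }
            (None, None) => break,
        }
    }
    out
}
```

With callbacks alone, `merge_time_diffs` would have to buffer one side first; with two cursors it can compare the heads of both sequences directly.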
…mes. This will be needed in an upcoming commit. Signed-off-by: Ben Pfaff <blp@feldera.com>
…rger`. This merger can merge any two batch types into a third type. This is useful because the "fallback" implementations may need to merge one file or vector batch with another one to produce a third. This commit uses the merger in `FallbackIndexedWSet` and `FallbackWSet`. An upcoming commit will use it in `FallbackKeyBatch` and `FallbackValBatch` as well. Signed-off-by: Ben Pfaff <blp@feldera.com>
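The idea of merging two batches of possibly different types into a third type can be sketched as follows. This is an illustration under simplifying assumptions, not the merger from this commit: the traits `ReadBatch` and `BuildBatch`, the types `VecBatch` and `FileBatch`, and the function `generic_merge` are all hypothetical, and a batch is reduced to a sorted list of (key, weight) pairs.

```rust
// Hypothetical sketch only; none of these names come from dbsp.

/// Read-only view of a batch as key-sorted (key, weight) pairs.
trait ReadBatch {
    fn entries(&self) -> Vec<(u64, i64)>;
}

/// Construction of an output batch from key-sorted, consolidated pairs.
trait BuildBatch {
    fn from_sorted(entries: Vec<(u64, i64)>) -> Self;
}

/// Stand-in for an in-memory batch.
struct VecBatch(Vec<(u64, i64)>);

/// Stand-in for a storage-backed batch (kept in memory here for brevity).
struct FileBatch {
    rows: Vec<(u64, i64)>,
}

impl ReadBatch for VecBatch {
    fn entries(&self) -> Vec<(u64, i64)> {
        self.0.clone()
    }
}

impl ReadBatch for FileBatch {
    fn entries(&self) -> Vec<(u64, i64)> {
        self.rows.clone()
    }
}

impl BuildBatch for VecBatch {
    fn from_sorted(entries: Vec<(u64, i64)>) -> Self {
        VecBatch(entries)
    }
}

impl BuildBatch for FileBatch {
    fn from_sorted(entries: Vec<(u64, i64)>) -> Self {
        FileBatch { rows: entries }
    }
}

/// Merge any two readable batches into any buildable batch type.
/// Weights of equal keys are summed; keys whose weights cancel are dropped.
fn generic_merge<A, B, O>(a: &A, b: &B) -> O
where
    A: ReadBatch,
    B: ReadBatch,
    O: BuildBatch,
{
    let (left, right) = (a.entries(), b.entries());
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(left.len() + right.len());
    while i < left.len() && j < right.len() {
        let (lk, lw) = left[i];
        let (rk, rw) = right[j];
        if lk < rk {
            out.push((lk, lw));
            i += 1;
        } else if rk < lk {
            out.push((rk, rw));
            j += 1;
        } else {
            if lw + rw != 0 {
                out.push((lk, lw + rw));
            }
            i += 1;
            j += 1;
        }
    }
    // At most one of these tails is non-empty.
    out.extend_from_slice(&left[i..]);
    out.extend_from_slice(&right[j..]);
    O::from_sorted(out)
}
```

Because the input and output types are independent type parameters, the same function covers vector-with-file, file-with-file, and vector-with-vector merges, with the caller choosing the output type.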
These complete the set of batch types that choose between memory or storage implementations at creation time. These fix the runtime for the `galen` benchmark, which regressed when storage was introduced because it uses valbatches. Signed-off-by: Ben Pfaff <blp@feldera.com>
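A batch type that picks its backing implementation at creation time might look roughly like the sketch below. Everything here is an assumption made for illustration: the enum `FallbackBatch`, the `StorageBatch` stand-in, and especially the size-based rule with `SPILL_THRESHOLD`, since the commit message does not say how the choice is made.

```rust
// Hypothetical sketch only; the selection rule and all names are assumptions.

/// Assumed cutoff: batches with at least this many rows go to storage.
const SPILL_THRESHOLD: usize = 4;

/// Stand-in for a storage-backed batch (kept in memory here for brevity).
struct StorageBatch {
    rows: Vec<(u64, i64)>,
}

/// A batch whose representation is fixed when it is created.
enum FallbackBatch {
    Memory(Vec<(u64, i64)>),
    Storage(StorageBatch),
}

impl FallbackBatch {
    /// Choose the representation once, based on the input size.
    fn new(rows: Vec<(u64, i64)>) -> Self {
        if rows.len() < SPILL_THRESHOLD {
            FallbackBatch::Memory(rows)
        } else {
            FallbackBatch::Storage(StorageBatch { rows })
        }
    }

    /// Number of rows, regardless of where they live.
    fn len(&self) -> usize {
        match self {
            FallbackBatch::Memory(rows) => rows.len(),
            FallbackBatch::Storage(batch) => batch.rows.len(),
        }
    }

    fn is_in_memory(&self) -> bool {
        matches!(self, FallbackBatch::Memory(_))
    }
}
```

Callers see a single type and a single set of methods; only the constructor knows which variant was picked.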
blp added the DBSP core (Related to the core DBSP library), performance, storage (Persistence for internal state in DBSP operators), and rust (Pull requests that update Rust code) labels on Apr 18, 2024
Benchmark results: Nexmark, Galen
You can see that this worked from the galen benchmark results above, where the runtime fell from >2 hours to 34 seconds.
ryzhyk approved these changes on Apr 18, 2024
The individual commits in this series are meaningful and it's worth looking at
them individually.
These complete the set of batch types that choose between memory or
storage implementations at creation time. These fix the runtime for
the `galen` benchmark, which regressed when storage was introduced
because it uses valbatches.
Is this a user-visible change (yes/no): no