Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve profiling output to make storage performance easier to understand #2059

Merged
merged 4 commits into from
Jul 17, 2024

Conversation

blp
Copy link
Member

@blp blp commented Jul 17, 2024

@gz, this modifies the async spine code, mostly to simplify it a bit, but also to add some more profiling output to make it easier to understand what's happening storage-wise.

@ryzhyk, I changed the profiling output a bit to shorten paths because that makes the boxes a lot smaller and therefore it's easier to read the profiles. Would like your OK on that.

Is this a user-visible change (yes/no): no

blp added 4 commits July 16, 2024 16:41
I couldn't figure out why there was so much complication around
distinguish the kinds of batches.  I think it wasn't really necessary.
This simplifies the code a bit.

Signed-off-by: Ben Pfaff <blp@feldera.com>
I found this test confusing because it inserted a couple of batches that
should get merged but asserted that they hadn't been merged yet.

Signed-off-by: Ben Pfaff <blp@feldera.com>
A lot of the nodes in the diagrams I look at are wider than otherwise
necessary, which in turn makes the diagrams themselves wider than otherwise
necessary, because they have the string `database-stream-processor/src/`
and other long directory names in them.  This gets rid of that by
replacing each directory name in the nodes by just its first letter,
which is enough to make the file name unambiguous in practice.

Signed-off-by: Ben Pfaff <blp@feldera.com>
This makes it easier to understand what's going on with a spine while
looking at a profile.

Signed-off-by: Ben Pfaff <blp@feldera.com>
@blp blp added enhancement New feature or request DBSP core Related to the core DBSP library storage Persistence for internal state in DBSP operators rust Pull requests that update Rust code labels Jul 17, 2024
@blp blp requested review from gz and ryzhyk July 17, 2024 17:04
@blp blp self-assigned this Jul 17, 2024
Copy link
Collaborator

@gz gz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, fyi we already have a metric for merge reduction too https://github.com/feldera/feldera/blob/main/crates/dbsp/src/trace/spine_async/mod.rs#L613

@blp
Copy link
Member Author

blp commented Jul 17, 2024

lgtm, fyi we already have a metric for merge reduction too https://github.com/feldera/feldera/blob/main/crates/dbsp/src/trace/spine_async/mod.rs#L613

Oh, that's good to know (it's per-process though instead of per-spine). I hadn't taken proper notice of those metrics.

@blp blp merged commit a68ca96 into main Jul 17, 2024
5 checks passed
@blp blp deleted the profiles branch July 17, 2024 18:01
@gz
Copy link
Collaborator

gz commented Jul 17, 2024

we could make them per-spine with lables so they can be aggregates.. long term maybe it's worth doing it like this rather than having to maintain our own profiler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DBSP core Related to the core DBSP library enhancement New feature or request rust Pull requests that update Rust code storage Persistence for internal state in DBSP operators
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants