Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability of DDS to reference prior summary blobs when generating new summary #6568

Closed
vladsud opened this issue Jun 25, 2021 · 9 comments
Closed
Labels
area: runtime: summarizer design-required This issue requires design thought
Milestone

Comments

@vladsud
Copy link
Contributor

vladsud commented Jun 25, 2021

This topic came up in discussing Whiteboard current payloads and perf & bandwidth impact.
The core of the problem is that tree DDS keeps a lot of data (history) and most of it stays the same from summary to summary.
We spend a lot of CPU resources and a lot of bandwidth to re-upload same data from summary to summary.
If blob content is the same across summaries, then storage will dedup it, but it happens too late.

Summary process as a whole allows referencing blobs from prior summary.
However the way it's implemented is using full blob paths within summary, and DDS does not have that information (i.e. path of content it generates in final summary). Thus DDS can't leverage that mechanism I believe.

We need to explore how to add one.

@vladsud
Copy link
Contributor Author

vladsud commented Jun 29, 2021

@DLehenbauer - can you please help with prioritization here? Is this something that is needed soon by Whiteboard?
Does it alleviate current issues, or this alone task would not be sufficient?
Currently it is not scheduled for July, if it needs to be changed, we better know about it soon/

@vladsud vladsud added this to the July 2021 milestone Jun 29, 2021
@vladsud
Copy link
Contributor Author

vladsud commented Jun 29, 2021

@arinwt, @agarwal-navin - any chance you can take a look? It would be great if you could add your thoughts on how to properly design it, and we can see if someone else could own actual implementation. Thanks!

@arinwt
Copy link
Contributor

arinwt commented Jun 29, 2021

I think this is a reopen of #2025. I was thinking of using relative paths before and resolving them in the channel context.

@arinwt
Copy link
Contributor

arinwt commented Jun 30, 2021

After reviewing the old issue, I think most of the reason hasn't changed. The only thing I want to point out is that this conversion to path is generally the responsibility of SummarizerNode now. It would be nice to make sure we don't pass through the nodes 2+ times as we work our way up the tree.

@agarwal-navin
Copy link
Contributor

That sounds like a good plan and we already have all the functionality needed to this. We just have to add code in summarizer node to covert the "relative" handle path sent by the DDS to a full path.

@vladsud
Copy link
Contributor Author

vladsud commented Jul 1, 2021

Is this something one of you wants to take on?
Alternatively we would need a bit more detailed instructions for someone else to own implementation.

@agarwal-navin
Copy link
Contributor

Arin is working on onDemand summaries and the word ask which may be a higher priority right? I can probably pick this up in the second half of July. If not, we can add more detail to it and someone else can pick it up.

@ghost
Copy link

ghost commented Mar 28, 2022

This issue has been automatically marked as stale because it has had no activity for 180 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!

@agarwal-navin
Copy link
Contributor

agarwal-navin commented May 24, 2022

Tracked by AB#423

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: runtime: summarizer design-required This issue requires design thought
Projects
None yet
Development

No branches or pull requests

4 participants