adr: add API for retrieving data for a particular namespace only #302
Maybe we can add that the ADR would also be helpful for DAS consensus nodes that want to verify internal state messages only.
Good point, will add that! Now that you've mentioned it, I guess we could also use a modified version of Tendermint's internal gossiping mechanism for this: currently, the gossip routine pushes out random chunks of the whole (regular) block data (aka Parts), but it could instead send out only the state-relevant portions (as namespaced shares).
This could have several benefits: the gossiping is potentially faster than requesting the data from IPFS; the code already exists (though it would need to be modified); and we would relieve pressure on the proposer more quickly, shortening the window in which it is the only node on the network holding all block data. It would also buy time for doing the DA sampling in parallel, or pipelined shortly after receiving the first few shares from the gossip routine.
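To make the idea a bit more concrete, here is a rough sketch of how a modified gossip routine could select only state-relevant shares by namespace ID instead of pushing random Parts. All names here are illustrative assumptions, not the actual codebase API; only the 8-byte namespace ID prefix mirrors the real share layout.

```go
package main

import (
	"bytes"
	"fmt"
)

// NamespaceIDSize is the number of bytes prefixing each share that
// identify its namespace (8 bytes in lazyledger-core).
const NamespaceIDSize = 8

// stateRelevantShares is a hypothetical helper: instead of gossiping
// random Parts of the full block, a modified gossip routine could push
// only the shares whose namespace ID matches one of the reserved
// (state-relevant) namespaces, e.g. the one for transactions.
func stateRelevantShares(shares [][]byte, reserved [][]byte) [][]byte {
	var out [][]byte
	for _, s := range shares {
		if len(s) < NamespaceIDSize {
			continue // malformed share, skip
		}
		nid := s[:NamespaceIDSize]
		for _, r := range reserved {
			if bytes.Equal(nid, r) {
				out = append(out, s)
				break
			}
		}
	}
	return out
}

func main() {
	txNID := []byte{0, 0, 0, 0, 0, 0, 0, 1}
	msgNID := []byte{0, 0, 0, 0, 0, 0, 0, 9}
	shares := [][]byte{
		append(append([]byte{}, txNID...), 'a'),
		append(append([]byte{}, msgNID...), 'b'),
	}
	// Only the tx share survives the filter.
	fmt.Println(len(stateRelevantShares(shares, [][]byte{txNID})))
}
```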
Added use-case: 916b289
I think the idea of using the gossiping mechanism should be discussed in an ADR / issue. I'll leave this comment unresolved so that I can quickly find it when going through the open comments here.
Hmm, that's an interesting idea, and given its benefits it might be better than what I was doing. I was recently thinking about (and partially implementing) something similar, but instead of Parts, I am sending DAHeaders for nodes to catch up.
But I am concerned this is not a long-term solution and may require changes that will soon be removed, as IPFS is our main data source/sharing mechanism. Also, this is quite a sharp turn and should first be documented as an issue somewhere.
Yes, but previously we were not optimizing for the use-case of light validator nodes that need to download all state-relevant Txs. With the deferred execution model this doesn't matter much, since validators can start downloading these while the previous block is reaching consensus (the app hash is computed over the previous block's Txs). With immediate execution, however, this becomes a time-sensitive (and somewhat consensus-critical) issue. I think a simple hybrid approach could work better: gossip the data that every validator will need anyway, paired with IPFS to fetch all other data on demand (e.g. for DAS, or for downloading the whole block). We have yet to decide on a specific approach.
Just noting that this is not a decision either. I'm proposing to investigate this further, e.g. by looking into how easy it would be to change the current gossiping routine to gossip reserved Txs only. If it turns out not to be easy, we could decide to go down the other route.
I'm guessing the shares are flattened?
There might be some edge-case issues parsing the data after the fact if there are multiple messages in the same namespace, but I need to think about it a bit more.
Hmm, I think it would make sense to implement this function first. It should essentially return the raw, unparsed version. The other two methods will have to return everything already parsed, and the case you are talking about should be handled in those two.
cc @renaynay if you find time to look into this, you don't need to think about parsing anything.
Is the decision here to return a byte slice, or a shares slice? I feel it should return a shares slice, but maybe not.
It should be a shares slice. I'll update this before merging, and I also plan to incorporate some of Evan's feedback first.
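A hedged sketch of why a shares slice is preferable to a flat byte slice: a flattened `[]byte` can only be cut back into shares because shares are fixed-size, so returning `[]Share` up front avoids the round-trip and keeps share boundaries explicit. The `Share` type and the tiny 4-byte share size below are purely illustrative; real shares are much larger.

```go
package main

import "fmt"

// Share is one fixed-size chunk of namespaced block data.
// The type name is illustrative, not the actual API.
type Share []byte

// shareSize is deliberately tiny for illustration.
const shareSize = 4

// splitShares recovers a []Share from a flat []byte. This inverse
// only works because shares are fixed-size; returning []Share from
// the retrieval API directly makes this round-trip unnecessary.
func splitShares(flat []byte) []Share {
	out := make([]Share, 0, len(flat)/shareSize)
	for i := 0; i+shareSize <= len(flat); i += shareSize {
		out = append(out, Share(flat[i:i+shareSize]))
	}
	return out
}

func main() {
	shares := splitShares([]byte("aaaabbbb"))
	fmt.Println(len(shares)) // two 4-byte shares
}
```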
18f403a
I don't know which package this code will end up in, but we might want to export some of the merging functions here.
Again, not sure if it's in the scope of this ADR, but it might also be useful to add a `RetrieveRow` func that sits somewhere in between `GetLeafData` and `RetrieveBlockData`.
Yeah, I wasn't sure either. I feel like there are a few things here that will be used as a library by other repositories and projects (e.g. optimint). I think these should be put into a separate package with very clear documentation, or maybe even into a separate repo (later).
Yeah, that is even mentioned somewhere in this ADR.
Very good idea! I think this should go into ADR 002 though?
💯
whoops! nice.
#308
I am not sure whether this needs to be added, but I also consider the current state of searches for non-existing namespaces to be negative and expensive. A zero-result request still means the node downloads and traverses a non-trivial number of nodes. While selectors can make things better, the problem remains.
That is right! In the worst case, this could mean going through all row roots (e.g. say the last tree has a min/max NID range such that the namespace in question could be included) and then traversing that tree once from the root to the leaf that proves the namespace is not included. I guess this is still OK, as it is `O(squareSize)` local operations (linear search) plus reading `O(log squareSize)` nodes for the proof. Also, I think there is no great way around this unless we track the per-namespace ranges somewhere.
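The linear scan over row roots described above could look roughly like this: a namespace can only live in rows whose [min, max] NID range covers it, so absence proofs are needed only for those candidate rows. The `rowRoot` struct and its field names are assumptions for illustration, not the actual NMT types.

```go
package main

import (
	"bytes"
	"fmt"
)

// rowRoot carries the min/max namespace IDs committed to by an NMT
// row root. Field names are illustrative.
type rowRoot struct {
	minNID, maxNID []byte
}

// candidateRows is the O(squareSize) linear scan: collect the indices
// of all rows whose namespace range could contain nID. An empty
// result means the namespace provably cannot be in the square.
func candidateRows(roots []rowRoot, nID []byte) []int {
	var idxs []int
	for i, r := range roots {
		if bytes.Compare(r.minNID, nID) <= 0 && bytes.Compare(nID, r.maxNID) <= 0 {
			idxs = append(idxs, i)
		}
	}
	return idxs
}

func main() {
	roots := []rowRoot{
		{minNID: []byte{1}, maxNID: []byte{3}},
		{minNID: []byte{5}, maxNID: []byte{9}},
	}
	// NID 4 falls between the two ranges: no row can contain it.
	fmt.Println(len(candidateRows(roots, []byte{4})))
}
```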