
Enhancing Data Delivery Efficiency in chainHead_unstable_storage Calls #1600

Closed
josepot opened this issue Jan 23, 2024 · 4 comments · Fixed by #1605

Comments

@josepot
Contributor

josepot commented Jan 23, 2024

Issue Overview

While the current behavior of the chainHead_unstable_storage call in fetching descendantValues is compliant with the specification, I propose a discussion on improving the client-side experience. This is particularly relevant when dealing with large storage entries, such as Staking.Nominators on Polkadot.

Current Behavior

Contrary to my initial expectations, requesting the descendantValues of a large storage entry does not result in multiple operationStorageItems events over time. Instead, it yields a single operationStorageItems event with a vast array of items, followed immediately by an operationStorageDone event.
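As a hedged illustration of the event sequence described above, the notification shapes look roughly like the following (field names are based on my reading of the JSON-RPC interface spec and should be treated as assumptions, not authoritative definitions):

```typescript
// Approximate shapes of the chainHead follow-event notifications involved.
// Field names are assumptions based on the JSON-RPC interface spec.
interface StorageItem {
  key: string;    // hex-encoded storage key
  value?: string; // hex-encoded value, when values were requested
}

interface OperationStorageItems {
  event: "operationStorageItems";
  operationId: string;
  items: StorageItem[]; // today: one huge array; ideally: several small ones
}

interface OperationStorageDone {
  event: "operationStorageDone";
  operationId: string;
}

// The total payload is identical whether the server sends one huge `items`
// array or several small ones before `operationStorageDone` fires.
function totalItems(events: OperationStorageItems[]): number {
  return events.reduce((n, e) => n + e.items.length, 0);
}
```

The point of the issue is not the total amount of data (which is the same either way) but how it is split across notifications over time.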

On the client side, handling this extensive data in one go has proven to be challenging. The main issue I've encountered is the intensive resource requirement to decode a vast number of items simultaneously, which leads to browser performance issues. I recognize that part of this challenge stems from my approach to data handling. Optimizing my method to process such large datasets in a more efficient manner is a key area of improvement on my end.

Proposal for Progressive Data Delivery

Despite the above, I propose considering a more progressive approach to data delivery. Delivering data in smaller chunks could significantly enhance the efficiency of client-side processing. This method would allow for a more responsive user experience, as data can be presented and handled incrementally.
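Even with today's single-event behaviour, one client-side mitigation is to decode the big array in fixed-size chunks and yield to the event loop between chunks, so rendering and input handling stay responsive. A minimal sketch, where `decodeItem` is a hypothetical stand-in for the real decoder:

```typescript
// Hypothetical sketch: decode a large `items` array in fixed-size chunks,
// yielding to the event loop between chunks so the browser stays responsive.
// `RawItem` and `decodeItem` are stand-ins, not actual library APIs.
type RawItem = { key: string; value?: string };

async function decodeInChunks<T>(
  items: RawItem[],
  decodeItem: (item: RawItem) => T,
  chunkSize = 500,
): Promise<T[]> {
  const out: T[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    for (const item of items.slice(i, i + chunkSize)) {
      out.push(decodeItem(item));
    }
    // Yield so rendering and other queued tasks can run between chunks.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
  return out;
}
```

This only spreads out the decoding cost, though; the 30-50 second wait for the first notification remains, which is why progressive delivery on the server side would still be the bigger win.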

Suggestion for Specification Enhancement

In light of this, revisiting the specification to encourage or facilitate progressive data delivery for large datasets might be beneficial. Such a change could help developers handle data more effectively, leading to better performance across various client applications.

Requesting Feedback

I'm interested in your thoughts on this suggestion and any additional insights you may have.

@tomaka
Contributor

tomaka commented Jan 23, 2024

Giving the data progressively is completely intended; it's simply not 100% properly implemented in smoldot at the moment.

In light of this, revisiting the specification to encourage or facilitate progressive data delivery for large datasets might be beneficial.

You mean reverting paritytech/json-rpc-interface-spec#47 ?

@josepot
Contributor Author

josepot commented Jan 23, 2024

Giving the data progressively is completely intended

This is what I'm requesting, and what I don't currently see happening 🙂.

it's simply not 100% properly implemented in smoldot at the moment

Which is why I'm opening this issue: even if the current behaviour is "spec compliant", it's far from ideal.

Just to be clear: the issue is that I don't receive any items notifications for a long time (between 30 and 50 seconds), and then I suddenly receive a single items notification containing thousands of rows, immediately followed by the operationStorageDone event.

Ideally, I would like to receive a notification with a few hundred items every now and then and/or whenever possible, rather than waiting to receive them all in one go once the whole data-set is ready.

You mean reverting paritytech/json-rpc-interface-spec#47 ?

There is no need to revert paritytech/json-rpc-interface-spec#47.

Even if it was reverted, if smoldot behaved in the same way: wait until it has all the items and then bomb the client with thousands of events all at once, then we would be in a situation even more difficult to deal with from the client side. So, it's not about reverting that, it's about asking the server to send the data progressively whenever possible.

IMO this should also be beneficial for smoldot, as it would be able to free up memory faster.

@tomaka
Contributor

tomaka commented Jan 24, 2024

Even if it was reverted, if smoldot behaved in the same way: wait until it has all the items and then bomb the client with thousands of events all at once, then we would be in a situation even more difficult to deal with from the client side.

If you received thousands of events, you could parse each event individually and add a delay between parsing.

What you seem to want is for smoldot to wait a little bit before sending chunks of items. This is not something I will ever implement. I will implement sending items as soon as they're received, which will indeed lead to multiple small chunks, but this is in no way a guarantee that there will be a delay between these chunks.
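The pattern suggested above (many small events, with the client pacing its own parsing) could be sketched as follows. This is an assumed client-side pattern, not smoldot code: buffer incoming notifications in a queue and drain it with a pause between events, so thousands of small chunks arriving back-to-back never monopolise the main thread.

```typescript
// Hypothetical client-side sketch: drain queued notifications one at a time,
// pausing between them, so a burst of small chunks is processed gradually.
class PacedQueue<T> {
  private queue: T[] = [];
  private draining = false;

  constructor(
    private handle: (item: T) => void,
    private pauseMs = 10, // delay inserted by the client, not the server
  ) {}

  push(item: T): void {
    this.queue.push(item);
    if (!this.draining) void this.drain();
  }

  private async drain(): Promise<void> {
    this.draining = true;
    while (this.queue.length > 0) {
      this.handle(this.queue.shift()!);
      await new Promise((resolve) => setTimeout(resolve, this.pauseMs));
    }
    this.draining = false;
  }
}
```

The key point in the comment above is that any delay between chunks is the client's responsibility; the server sending items as soon as they arrive gives no timing guarantees.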

@josepot
Contributor Author

josepot commented Jan 24, 2024

Even if it was reverted, if smoldot behaved in the same way: wait until it has all the items and then bomb the client with thousands of events all at once, then we would be in a situation even more difficult to deal with from the client side.

If you received thousands of events, you could parse each event individually and add a delay between parsing.

Not really, as each callback gets invoked by the event queue. Having ~50K callbacks invoked "all at once" (one immediately after the other) in a microtask queue whose size I have no means of knowing is a lot more annoying to deal with than one callback with 50K items. However, once again, the issue is IMO the fact that the data is not being delivered progressively.

EDIT: actually, with the perf optimization that I plan on using it would be about the same. However, the whole point is that as I said in my previous comment: "it's not about reverting that, it's about asking the server to send the data progressively".

What you seem to want is for smoldot to wait a little bit before sending chunks of items.

Nope, that's not what I want.

This is not something I will ever implement.

Phew!

I will implement sending items as soon as they're received

This is exactly what I would like to have! 🙌

Once again, as I already said in my initial message:

I recognize that part of this challenge stems from my approach to data handling. Optimizing my method to process such large datasets in a more efficient manner is a key area of improvement on my end.

So, yeah, I should improve the way I handle large data-sets, no doubt. However, that doesn't change the fact that ideally smoldot should try not to hoard all the items until it's done receiving them, that's all.
