Streaming of Feeds #36

Open · wants to merge 6 commits into master

Conversation

@nugaon (Member) commented Jun 30, 2021

The proposal introduces a new method to handle feed streams.


### Feed Topic Construction

The `feed topic` has to contain the initial time (`T0`) and can optionally be prefixed/suffixed with an additional identifier so that an uploader with the same key can maintain many distinct feed streams.
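As a rough sketch, one possible encoding (an assumption, not a normative layout) packs the optional identifier into the leading bytes of a 32-byte topic and `T0` as a big-endian unsigned 64-bit integer into the trailing bytes:

```ts
// Sketch only: 32-byte topic = identifier prefix (truncated to 24 bytes) + T0 as uint64 big-endian.
// The SWIP does not fix a concrete byte layout; this merely illustrates "identifier + T0 in the topic".
function makeFeedTopic(identifier: string, t0: number): Uint8Array {
  const topic = new Uint8Array(32)
  const id = new TextEncoder().encode(identifier)
  topic.set(id.slice(0, 24), 0)                            // identifier prefix
  new DataView(topic.buffer).setBigUint64(24, BigInt(t0))  // T0 in the last 8 bytes
  return topic
}
```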

Member

Let's leave the topic alone; the topic is topical. Info related to indexing need not be hashed into anything; hashing with the topic in the SOC obfuscates it already.

```ts
function getIndexForArbitraryTime(Tx: number, T0: number, updatePeriod: number): number {
  return Math.floor((Tx - T0) / updatePeriod) // which is the `i`
}
```

Member

I think we should just use absolute time in Unix time and use the epoch grid defined as per the book, with the level as your delta (powers of 2 in seconds).


Member Author

I agree. I wanted to refer to the UNIX Epoch Time by timestamps in the text, but my phrasing is really not that concrete; I will change that.
Regarding the delta being a power of 2, I have some concerns, because it restricts the arbitrary upload time period that the uploader was free to choose in this original concept.
Suppose I want to set my delta to 86400 seconds (1 day): then I can easily set my cron job to upload feeds with this time period, but on the other hand I cannot do that with a time period that is a power of 2.

The most reasonable approach here is to check the previous (`i-1`) and the next fetch index (`i+1`), in the worst case going down to `0` and up to the last (`n`) index respectively.
This lookup can also happen in parallel, checking _n_ chunks simultaneously on both sides in order to raise the certainty of a successful hit.
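As an illustration, the lookup could look like the following sketch, assuming a hypothetical `fetchFeedUpdate(topic, index)` helper that resolves to the chunk payload or `null` when it cannot be retrieved:

```ts
// Sketch only: widen the search around the calculated index `i` towards both ends,
// returning the first update that can actually be retrieved.
async function lookupClosest(
  topic: Uint8Array,
  i: number,
  lastIndex: number,
  fetchFeedUpdate: (topic: Uint8Array, index: number) => Promise<Uint8Array | null>,
): Promise<Uint8Array | null> {
  const exact = await fetchFeedUpdate(topic, i)
  if (exact) return exact
  for (let offset = 1; i - offset >= 0 || i + offset <= lastIndex; offset++) {
    const candidates: Promise<Uint8Array | null>[] = []
    if (i - offset >= 0) candidates.push(fetchFeedUpdate(topic, i - offset))         // towards 0
    if (i + offset <= lastIndex) candidates.push(fetchFeedUpdate(topic, i + offset)) // towards n
    const results = await Promise.all(candidates)
    const hit = results.find(r => r !== null)
    if (hit) return hit
  }
  return null
}
```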

### Download Feed Stream

Member

Let's leave this aside. I think historical data should be time-indexed in a POT data structure, and the current root hash put in the feed update.


Member Author

The proposed feed indexing method and lookup is an extension of sequential/periodic feeds. Its advantage is not only the arbitrary topmost epoch base time, but also its lookup, which pulls the closest available chunk that should with certainty be retrievable, and if it is not, will surely find the quickest retrievable chunk.
This approach of the proposal could also be an extension of epoch-based feeds, but IMO that wouldn't bring any real gain, only additional constraints.


Downloading the stream is really straightforward: we should download all feed updates one by one or in parallel, starting from index `0` up to and including index `getIndexForArbitraryTime(Tp, T0, updatePeriod)`.
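A minimal sketch of such a download, reusing `getIndexForArbitraryTime` from above and the same hypothetical `fetchFeedUpdate` helper (missing updates resolve to `null`):

```ts
// Sketch only: fetch every update from index 0 up to and including the index for Tp, in parallel.
async function downloadFeedStream(
  topic: Uint8Array,
  t0: number,
  tp: number,
  updatePeriod: number,
  fetchFeedUpdate: (topic: Uint8Array, index: number) => Promise<Uint8Array | null>,
): Promise<(Uint8Array | null)[]> {
  const lastIndex = getIndexForArbitraryTime(tp, t0, updatePeriod)
  const requests: Promise<Uint8Array | null>[] = []
  for (let i = 0; i <= lastIndex; i++) {
    requests.push(fetchFeedUpdate(topic, i))
  }
  return Promise.all(requests)
}
```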

The integrity check of the stream can only happen by putting versioning metadata into the feed segments, because the content creator may not upload in every update period (despite the incentivising nature of this feed type).
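For illustration, such versioning metadata could be as simple as the following shape (an assumption; the SWIP does not define a concrete format):

```ts
// Hypothetical per-update payload: a running version number next to the content reference,
// so downloaders can detect skipped periods and verify the stream's continuity.
interface FeedStreamUpdate {
  version: number
  reference: Uint8Array // Swarm reference of the actual content of this update
}
```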

Member

This is also independent of all else, let's leave it.

It is similar to the analogy of a hard drive fragmenting because of many frequent writes within one sector (the base fragment of the epoch time lookup).
This proposed feed indexing method is the opposite at lookup:
if there is the expected amount of periodic writes, the retrieval of the data is faster and can even be _O(1)_.
Compared to epoch-based feeds, the length of the base segment is basically arbitrary, and it encourages the user to make periodic uploads and stick to this base segment instead of uploading sporadically, in exchange for better retrieval time.

Member

Hitting on a strawman here. With deltas as levels, this is subsumed under epoch-based feeds.

- the initial timestamp (`T0`) should be changed to the point from which the time period changes (`T1`)
- the time period should be changed from the old one (`Δ1`) to the new one (`Δ2`) for subsequent state uploads

If the registry is settled on the blockchain in a smart contract, it can emit a defined event on content metadata changes, to which the clients can listen.
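A minimal sketch of such client-side listening, assuming a hypothetical `FeedRegistry` contract with a `MetadataChanged` event and using ethers.js (the RPC URL, contract address, and event signature are placeholders):

```ts
import { ethers } from "ethers"

// Sketch only: subscribe to metadata-change events of a hypothetical registry contract.
const provider = new ethers.JsonRpcProvider("https://rpc.example.org") // placeholder RPC endpoint
const registry = new ethers.Contract(
  "0x0000000000000000000000000000000000000000", // placeholder registry address
  ["event MetadataChanged(bytes32 indexed topic, uint256 newT0, uint256 newUpdatePeriod)"],
  provider,
)

registry.on("MetadataChanged", (topic, newT0, newUpdatePeriod) => {
  // re-schedule polling of the feed according to the new T0 and update period
  console.log(`feed ${topic} changed: T0=${newT0}, period=${newUpdatePeriod}s`)
})
```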

Member

The whole point of feeds is that we want spontaneous, unpredictable sporadicity.


Member

Changing the update frequency on the blockchain I don't see as realistic.


Member Author

The sporadicity is unpredictable in this concept as well, because, as mentioned, the feed uploader has not necessarily uploaded in every period, or a chunk in the series may simply not be available. Therefore the spontaneous sporadicity should be greater than delta; if not, the change of delta should be acknowledged by the user.
IMO it is realistic to sort this out even with blockchain event listening on the client side.

The downside is that within the agreed update time period the content creator cannot update the state of the feed. Although it is possible to overwrite the content of a feed by uploading it again with a different payload, on the download side it is still possible to get back the old one.

Member

You could use the full 64-bit Unix nano time (nanosecond resolution), but in the current realistic network scenarios, subsecond updates would be futile.


Member Author

I think so too; there is a section where I mention this problem.

All of these are timestamps and their smallest unit is 1 second; even with this time unit it is still uncertain whether the latest update can be downloaded if Tp = Tn, because of the nature of P2P storage systems.

I think a second as the smallest unit is reasonable.

Thereby, the consumers of the feed can immediately react to the upload frequency change and poll according to the new rules.
If blockchain listening on the client side is somehow not suitable, it is possible to put the `uploading time period` and `initial timestamp` metadata into every feed stream update, so that the consumers can sync to the stream after `MAX(Δ1,Δ2)` time, provided that `MAX(Δ1,Δ2) % MIN(Δ1,Δ2) = 0` and there exist positive integers `k` and `m` such that `T0 + (k * Δ1) = T0 + (m * Δ2) = T1`.
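A minimal sketch of that compatibility check (names are illustrative):

```ts
// Sketch only: the consumer can re-sync after MAX(Δ1,Δ2) if the larger period is a multiple
// of the smaller one and T1 lies on both update grids anchored at T0.
function canSyncAfterPeriodChange(t0: number, t1: number, delta1: number, delta2: number): boolean {
  const max = Math.max(delta1, delta2)
  const min = Math.min(delta1, delta2)
  if (max % min !== 0) return false
  return (t1 - t0) % delta1 === 0 && (t1 - t0) % delta2 === 0
}
```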

Though it is stated that this approach does not address downloading the whole feed stream, it is still possible:

Member

Let's leave this aside. I think historical data should be time-indexed in a POT data structure, and the current root hash put in the feed update.


## Backwards Compatibility
<!--All SWIPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. The SWIP must explain how the author proposes to deal with these incompatibilities. SWIP submissions without a sufficient backwards compatibility treatise may be rejected outright.-->
The whole idea can be implemented on the application layer using single owner chunks, but optionally the solution can also be integrated into the P2P client.

We would need every publisher update to emit an event (PSS) and a version/height, e.g. height=1 + feed topic gets the 1st update.


Member Author

It is not related to PSS.

@crtahlin added the check-SWIP-status label (Check if the SWIP is still relevant and being pursued.) on Jun 20, 2023