Is DataObject Accepted status necessary? #5058

Open
mnaamani opened this issue Jan 23, 2024 · 1 comment
Comments

@mnaamani
Contributor

mnaamani commented Jan 23, 2024

In the runtime a data object is represented as:

pub struct DataObject<RepayableBloatBond> {
    /// Defines whether the data object was accepted by a liaison.
    pub accepted: bool,

    /// Bloat bond for storing the data object in the runtime state.
    pub state_bloat_bond: RepayableBloatBond,

    /// Object size in bytes.
    pub size: u64,

    /// Content identifier presented as base-58 encoded multihash.
    pub ipfs_content_id: Base58Multihash,
}

When a data object is created in the runtime, its accepted value is false. Any worker operating a bucket that holds a bag containing the object can flip it to true with the dispatch storage::accept_pending_data_objects().
The storage-node does this once it has processed an upload request for the object.
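
To make the flow above concrete, here is a minimal standalone sketch of the acceptance rule. It is not the pallet code: the ID aliases, the `Bag` layout, the `AcceptError` variants and the `accept_pending` function are all hypothetical simplifications, and worker authentication is left out entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical simplified ID types; the runtime uses its own generic types.
type BagId = u32;
type BucketId = u32;
type DataObjectId = u64;

/// Simplified data object: only the fields relevant to acceptance.
#[derive(Debug, Default)]
pub struct DataObject {
    pub accepted: bool,
    pub size: u64,
}

/// Simplified bag: the buckets assigned to it and the objects it contains.
#[derive(Debug, Default)]
pub struct Bag {
    pub stored_by: BTreeSet<BucketId>,
    pub objects: BTreeMap<DataObjectId, DataObject>,
}

#[derive(Debug, PartialEq)]
pub enum AcceptError {
    BagNotFound,
    BucketDoesNotHoldBag,
    ObjectNotFound,
}

/// Marks the given objects as accepted, provided the bucket holds the bag.
/// All checks run before any mutation, so a failure leaves state unchanged.
pub fn accept_pending(
    bags: &mut BTreeMap<BagId, Bag>,
    bucket: BucketId,
    bag_id: BagId,
    ids: &BTreeSet<DataObjectId>,
) -> Result<(), AcceptError> {
    let bag = bags.get_mut(&bag_id).ok_or(AcceptError::BagNotFound)?;
    if !bag.stored_by.contains(&bucket) {
        return Err(AcceptError::BucketDoesNotHoldBag);
    }
    if ids.iter().any(|id| !bag.objects.contains_key(id)) {
        return Err(AcceptError::ObjectNotFound);
    }
    for id in ids {
        if let Some(obj) = bag.objects.get_mut(id) {
            obj.accepted = true;
        }
    }
    Ok(())
}
```

Note that in this sketch the flag only ever goes from false to true, and a second bucket accepting the same object changes nothing, which is part of why the single bool carries so little information.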

PendingDataObjectsAccepted(StorageBucketId, WorkerId, BagId, BTreeSet<DataObjectId>),

This event is processed by the Query Node and Orion, although they only seem to store the ID of the last worker that made the dispatch call.

The main consumers of this state are distributor nodes and Atlas, which use it to decide whether an object is available in the storage system before even attempting to fetch it.
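
As a rough illustration of that consumer-side check, the sketch below filters a list of objects down to the ones worth requesting. `IndexedObject`, its `is_accepted` field and `fetchable_ids` are hypothetical names made up for this example, not the actual Query Node, Orion, or distributor types.

```rust
/// Hypothetical view of a data object as an indexer might expose it.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexedObject {
    pub id: u64,
    pub is_accepted: bool,
}

/// Returns the IDs worth requesting from storage: only accepted objects.
/// Objects still pending are skipped rather than fetched and failed.
pub fn fetchable_ids(objects: &[IndexedObject]) -> Vec<u64> {
    objects
        .iter()
        .filter(|o| o.is_accepted)
        .map(|o| o.id)
        .collect()
}
```

Whatever replaces the on-chain flag would need to keep a cheap check like this possible for those consumers.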

It is not clear if this property is adding any real valuable state.
Should we continue to use it? Is it contributing to state bloat?

If there are plans to add tooling for the storage lead to penalize operators that have indicated that they accepted an object but cannot produce it on request (if they are still operating a bucket that is obligated to store that object), then we can keep it, but the database schemas and mappings in QN and Orion should be updated to keep track of all operators. The storage lead also suggested that it might be valuable for a storage node to signal in a similar fashion when it has synced an object from another node. This data could help maintain a history of the replication status of an object across buckets, to help identify where and when objects are lost.
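
A sketch of what "keep track of all operators" could look like in a mapping, under stated assumptions: `AcceptedEvent` mirrors the fields of PendingDataObjectsAccepted but uses plain integers for every ID (the real types differ), and `ConfirmationIndex` is a hypothetical in-memory stand-in for the QN/Orion schema.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Mirrors the fields of the PendingDataObjectsAccepted event, with
/// hypothetical plain integer IDs in place of the runtime's types.
#[derive(Debug, Clone)]
pub struct AcceptedEvent {
    pub bucket_id: u64,
    pub worker_id: u64,
    pub bag_id: u64,
    pub object_ids: BTreeSet<u64>,
}

/// For each object, every (bucket, worker) pair that has confirmed it.
#[derive(Debug, Default)]
pub struct ConfirmationIndex {
    confirmations: BTreeMap<u64, BTreeSet<(u64, u64)>>,
}

impl ConfirmationIndex {
    /// Records one event without overwriting earlier confirmations.
    pub fn apply(&mut self, event: &AcceptedEvent) {
        for id in &event.object_ids {
            self.confirmations
                .entry(*id)
                .or_default()
                .insert((event.bucket_id, event.worker_id));
        }
    }

    /// Accepted status derived purely from the event history.
    pub fn is_accepted(&self, object_id: u64) -> bool {
        self.confirmations
            .get(&object_id)
            .map_or(false, |pairs| !pairs.is_empty())
    }

    /// Buckets that have confirmed holding the object.
    pub fn buckets_for(&self, object_id: u64) -> BTreeSet<u64> {
        self.confirmations
            .get(&object_id)
            .map(|pairs| pairs.iter().map(|(b, _)| *b).collect::<BTreeSet<u64>>())
            .unwrap_or_default()
    }
}
```

With an index like this, a sync signal from a second node would just be another confirmation for the same object, and the accepted status falls out of the history instead of needing its own stored flag.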

Maybe there is a case for this state to not be necessary on chain but only through Event data?

@bedeho
Member

bedeho commented Jan 23, 2024

It is not clear if this property is adding any real valuable state.

Having someone sign off on the fact that they did in fact get a valid upload of the data, and that it matched the hash, seems quite important. How else does anyone even determine if the upload was ever completed or even initiated? This data can of course be put in some central off-chain location.

The real issue with the storage pallet is that it does not really need to be native runtime code, as long as we are not doing any actual automated on-chain slashing or rewarding based on some proof-of-storage scheme or something. It could all just be metaprotocol stuff, which would be much more flexible, dramatically reduce fees, and avoid the whole bloat issue. But this is a big change. See here: #4940

@mnaamani changed the title from "Data Object Accepted" to "Is DataObject Accepted status necessary?" on Jan 24, 2024