Specs for Acropolis storage module #45
Would it be accurate to say that what still remains to be specified is
Perhaps we can limit the scope for Acropolis, yet still achieve much of what we want for our OKRs, by having a good number of things simply configured in the genesis state (without thinking about adding/removing/editing them), such as
Should we even depend on the Actors module? I never understood it, and Mokhtar has questioned it multiple times.
Mime key map
Data Object Type Registry module has a map
Why both DataObjectTypeConstraints and DataObjectTypes?
What does DataObjectTypeConstraints add, or vice versa? Ultimately, you are trying to establish a mapping between an off-chain data filtering rule and an on-chain tranche. Prior to this, it all lived in DataObjectTypes. Is the point that DataObjectTypeConstraints don't point to tranches, and only DataObjectTypes do?
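To make the distinction concrete, here is a minimal Rust sketch of the reading suggested above. All field names and signatures are my own illustrative assumptions, not from the spec: the constraints carry only the off-chain filtering rule, while the type binds constraints to an on-chain tranche.

```rust
/// An off-chain filtering rule: which payloads are acceptable.
/// Note: no reference to any tranche (illustrative fields only).
#[derive(Clone, Debug, PartialEq)]
struct DataObjectTypeConstraints {
    allowed_mime_types: Vec<String>,
    max_size_bytes: u64,
}

/// An on-chain type: binds a constraint set to a storage tranche.
/// Only DataObjectTypes point at tranches.
#[derive(Clone, Debug, PartialEq)]
struct DataObjectType {
    id: u64,
    constraints: DataObjectTypeConstraints,
    tranche_id: u64,
}

/// Validate against the constraints, then resolve the tranche.
fn tranche_for_upload(ty: &DataObjectType, mime: &str, size: u64) -> Option<u64> {
    let c = &ty.constraints;
    if c.allowed_mime_types.iter().any(|m| m.as_str() == mime) && size <= c.max_size_bytes {
        Some(ty.tranche_id)
    } else {
        None
    }
}
```

Under this reading, constraints are reusable filtering data and the type is the only place where a tranche is selected.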
The “submodules” 1-3 on the Structure list appear to mutually depend on each other, and none of them can sensibly be reused on its own. Therefore, organizing them as a single Substrate module seems more appropriate. Mind you, we don't have any good convention on how to apply the Substrate module abstraction, so we're just going on a case-by-case basis for now. But reusability in other runtimes seems like one plausible guiding concern.
Separate Content Directory from Storage Module
The Content Directory (CD) really should be in its own proper Substrate module, which would depend on the storage module. The CD is going to substantially grow in scope as more complicated high level business logic/state is introduced there, and none of it is relied upon by anything else in the storage module proper.
Payload based model
The model chosen here makes the set of tranches available for a given type/constraint a function of the raw data payload. Moreover, it obliges the client to deduce this.
This has a number of limitations:
A) The functional role of a data object, that is, how it is actually going to be used in user applications, may have substantial bearing on how it should be stored and distributed, and this cannot be deduced from the raw payload. The appropriate storage tranche may be sensitive to such functional dimensions, and the distribution tranches/system certainly will be as well. For example, a certain type of image may be used in the system in a way that requires it to be stored with high redundancy by highly staked actors. But an equivalent image payload, identical in every way, may be used in applications in a way that is not critical at all, and it could be stored in tranches with unstaked newcomers and low redundancy.
B) The quota system, for example on uploads, becomes blind to the functional roles of objects, and can only operate on raw size and similar metrics, which may not be as flexible as one wants.
As is, if new members are given a 1 GB upload quota, for example (which is quite low), they may choose to use it by uploading one million 1 KB images, which is both abusive and makes no sense.
C) The distribution system will be very sensitive to the role of a data object, as this is a major determinant of its future bandwidth requirement profile, hence having this explicitly represented will be very useful. <== speculative point, as we have not thought much about distribution.
D) It may be impractical in certain usage environments to even do this, e.g. a browser having to process the actual internal encoding information of a large payload.
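Point B above can be sketched in a few lines. This is a hypothetical quota check, not the spec's design, and the limits are made-up numbers: tracking an object count alongside raw bytes is the minimal fix for the "one million 1 KB images" abuse, though it still cannot distinguish functional roles.

```rust
/// Illustrative quota parameters (assumed, not from the spec).
struct Quota {
    max_bytes: u64,   // e.g. 1 GB total upload volume
    max_objects: u32, // caps abuse like one million tiny uploads
}

/// A member's current usage.
struct Usage {
    bytes: u64,
    objects: u32,
}

/// A payload-blind check: only size and count are visible,
/// never the functional role of the object.
fn upload_allowed(quota: &Quota, usage: &Usage, new_size: u64) -> bool {
    usage.bytes + new_size <= quota.max_bytes && usage.objects < quota.max_objects
}
```

Even with the count cap, this check treats a critical video and a throwaway thumbnail identically, which is the inflexibility the point is making.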
The alternative model implied by these observations is to have a set of data object types explicitly defined by the functional role of the data, with only one type applicable to a given data object. Client applications would have to be hard-coded to use the correct type for a given purpose. Likewise, the content directory (schemas) would, for example, require that the data objects it points to have a specific type.
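A minimal sketch of this role-based alternative, with entirely hypothetical role names: each data object carries exactly one explicitly declared type, and a content directory schema field accepts only objects of the type it requires.

```rust
/// Hypothetical functional roles; the real set would be defined in the spec.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FunctionalRole {
    Avatar,     // small, non-critical, low redundancy is fine
    VideoMedia, // large, high redundancy, highly staked tranches
}

/// A data object declares exactly one role at upload time.
struct DataObject {
    role: FunctionalRole,
    size: u64,
}

/// A content-directory schema field requires one specific role; client
/// applications are hard-coded to upload with the correct type.
fn schema_accepts(required: FunctionalRole, object: &DataObject) -> bool {
    object.role == required
}
```

With the role explicit, tranche selection, quotas, and distribution policy can all key off it instead of inspecting the payload.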
To summarize the discussion we had about this briefly:
We did discuss the payload-based model for a while, and came to two changes:
Does that seem to capture it?
Ok, I think this is more or less the state we should reach, with the caveat that instead of referring to the actors module as before, I'm referring to a currently non-existent "tranche staking" sub-module. If @mnaamani is up for that, great; otherwise I can put it together from the current actors module.
referenced this pull request on May 24, 2019
Ok, @mnaamani, it would be good if you could skim the data directory specs with respect to staking. I also made a modification to the staking spec that I didn't think of before: a tranche needs to be created with a matching data object type (but it should not be possible to modify that later).
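The rule just described, that a tranche is bound to a data object type at creation and the binding can never change, can be sketched as follows. Names are hypothetical; the point is simply that the type is set once in the constructor and no mutator exists.

```rust
/// A tranche bound to one data object type at creation (illustrative).
struct Tranche {
    id: u64,
    data_object_type_id: u64, // fixed at creation, never updated
}

impl Tranche {
    /// The only way to associate a type with a tranche.
    fn create(id: u64, data_object_type_id: u64) -> Self {
        Tranche { id, data_object_type_id }
    }

    /// Read-only access; deliberately no setter, so the binding
    /// cannot be modified after creation.
    fn data_object_type(&self) -> u64 {
        self.data_object_type_id
    }
}
```

In an actual runtime module the same invariant would be enforced by simply not exposing any extrinsic that updates the field.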
One could argue for a state