WIP: nomenclature #5

Open
wants to merge 1 commit into
base: master
from

6 participants
@joehand
Member

joehand commented Feb 6, 2018

This DEP is still work-in-progress.

I've added the summary/motivation and Bryan's questions. I need to put a bit of thought into how best to organize this in case we end up with a lot of terms in here.

We also have existing terms in the Dat documentation: https://docs.datproject.org/terms. I can update/consolidate those to this DEP.

I will try to collect some good examples of other nomenclature/naming convention docs as motivation, if you have suggestions.

@pfrazee

Member

pfrazee commented Feb 6, 2018

Good call. I share the question about registers.

@joehand

Member

joehand commented Feb 6, 2018

On that topic, I started adding some decision-making criteria for choosing between a few possible words (realizing now I should make a section for this):

By defining Dat nomenclature, we can ensure the writing of the wider Dat community also uses the preferred terms. ... To reduce barriers to entry, this DEP will prefer words that are less technical while conveying the same meaning.

So, though register is more technically accurate, it seems like feed or log may be preferred.

@joehand

Member

joehand commented Mar 21, 2018

Note to self from previous meeting:

discuss syncing/seeding/etc in nomenclature DEP - jhand will formalize terms that need definition

@martinheidegger

Contributor

martinheidegger commented Nov 15, 2018

Terms I'm missing a definition for:

  • Version: What is considered a version in DAT?
  • Bootstrapping: What do we mean when bootstrapping the network?
  • Sparse: DATs can be "sparsely" replicated?!
  • Checkout: What is considered a checkout?
  • Live: There are properties in code that refer to something being "live"

... and a link to the terminology used in the dat protocol book: https://github.com/datprotocol/book/blob/a3ca149853b9153c7140876d6f749ecad5c6edbb/src/ch03-01-terminology.md

@pfrazee

Member

pfrazee commented Nov 15, 2018

I'll offer some definitions here...

  • Version: Internally every dat data-structure is composed of append-only logs (hypercores). Any time an entry is appended to the log, a new version is created. The version is identified according to the semantics of the data-structure. In the case of single-writer hyperdrive, it's currently being identified by the metadata log's latest message number.
  • Bootstrapping: This is probably referring to getting connected to the discovery DHT network.
  • Sparse: Means that the data-set is only partially downloaded/replicated.
  • Checkout: Viewing a previous version of a dat.
  • Live: This one is a little vague but usually means "connected to peers and downloading updates as they come."

@martinheidegger martinheidegger referenced this pull request Nov 17, 2018

Open

History view #349

@bnewbold

Contributor

bnewbold commented Nov 17, 2018

I would define a Checkout as a folder containing files from a dat/hyperdrive feed at a specific version (which could be the most recent version; doesn't need to be "previous"). This is distinct from having the same content stored locally in SLEEP files. The terminology comes from git and git checkout.

To clarify Version, it's the integer message number of a hypercore feed. These days, with multi-writer, the term gets a bit more ambiguous because there are multiple feeds, so the version of a hyperdb overall can be an array of (feed, integer) pairs. UX/nomenclature around this will probably need an update for dat-on-multiwriter-hyperdb.
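As a rough illustration of the Version and Checkout definitions above, here is a minimal Python sketch. The `Feed` class, its methods, and `multiwriter_version` are hypothetical names invented for this example, not part of hypercore or any Dat library, and it assumes the version number equals the count of appended entries.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Feed:
    """Hypothetical stand-in for a single-writer append-only log."""
    key: str
    entries: List[bytes] = field(default_factory=list)

    def append(self, entry: bytes) -> int:
        """Append an entry; every append creates a new version."""
        self.entries.append(entry)
        return self.version

    @property
    def version(self) -> int:
        # Assumption: the version is the number of entries appended so far.
        return len(self.entries)

    def checkout(self, version: int) -> List[bytes]:
        """Return the entries visible at `version` (may be the latest one)."""
        if not 0 <= version <= self.version:
            raise ValueError(f"version {version} out of range")
        return self.entries[:version]


def multiwriter_version(feeds: List[Feed]) -> List[Tuple[str, int]]:
    """With several writers, the overall version is a list of (feed, integer) pairs."""
    return [(f.key, f.version) for f in feeds]


alice = Feed("alice")
alice.append(b"a.txt v1")
alice.append(b"a.txt v2")
bob = Feed("bob")
bob.append(b"b.txt v1")
```

Here `alice.checkout(1)` views the feed as it was after the first append, while `multiwriter_version([alice, bob])` shows why a single integer stops being enough once there is more than one feed.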

Sparse usually implies that not only is the dataset/feed only partially replicated, but that it's intentionally only partially replicated: the user only wanted, e.g., a sub-directory, or only specific versions replicated. I don't think there is clarity/terminology around the case of having "the entire most recent version of a hyperdrive/hyperdb" (e.g., full values for all keys/files at the most recent version) but not full history: is that considered Sparse? In conversation I've usually heard people refer to this as the default condition (just having the most recent version), and having the full history of the feed being a special "Full History" or "Archival" copy.
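One way to make the three cases above concrete is the Python sketch below. It is only one possible reading: the names `ReplicationState` and `classify` are hypothetical, it treats "latest version only" as distinct from Sparse (which is exactly the open question), and it ignores user intent, which the definition above says also matters.

```python
from enum import Enum
from typing import Set


class ReplicationState(Enum):
    FULL_HISTORY = "full history"  # every block of the feed is present
    LATEST_ONLY = "latest only"    # all blocks needed for the newest version
    SPARSE = "sparse"              # anything less than the newest version


def classify(downloaded: Set[int], latest_blocks: Set[int],
             total_blocks: int) -> ReplicationState:
    """Classify a local copy by which block indexes have been downloaded.

    `latest_blocks` is the (assumed known) set of block indexes needed to
    reconstruct the most recent version.
    """
    if downloaded >= set(range(total_blocks)):
        return ReplicationState.FULL_HISTORY
    if downloaded >= latest_blocks:
        return ReplicationState.LATEST_ONLY
    return ReplicationState.SPARSE
```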

I agree with pfrazee on Bootstrapping and Live.

@aral


aral commented Dec 14, 2018

Suggestion regarding key naming, to strengthen the intent and usage of the keys and remove ambiguity about what abilities various keys grant:

  • Public key → Read key
  • Secret key → Write key
  • Discovery key → Discovery key (unchanged)

This way, people new to the system will not be misled into thinking, for example, that the public key is public (where would they get such an idea?) ;P

And the keys do exactly what they say on the tin.

Example usage:

The Read Key grants read access to a DAT whereas the Write Key is required to write to a DAT. The Discovery Key is used to discover a DAT and it is derived as a hash of the Read Key. The Read Key and Write Key should both be kept secret.
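The derivation mentioned above (Discovery Key as a hash of the Read Key) can be sketched in Python as follows. The function name `discovery_key` is hypothetical, and the exact construction (BLAKE2b over the constant `hypercore`, keyed with the 32-byte Read Key) reflects my understanding of hypercore rather than anything stated in this thread, so treat it as an assumption.

```python
import hashlib


def discovery_key(read_key: bytes) -> bytes:
    """Derive a 32-byte Discovery Key from a 32-byte Read (public) Key."""
    if len(read_key) != 32:
        raise ValueError("expected a 32-byte key")
    # Assumption: keyed BLAKE2b over the constant b"hypercore",
    # using the Read Key as the hash key.
    return hashlib.blake2b(b"hypercore", digest_size=32, key=read_key).digest()
```

Because the hash is one-way, someone who only sees the Discovery Key on the network can find peers for a DAT but cannot recover the Read Key from it, which is what lets the Read Key stay secret.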

Thoughts?

@yoshuawuyts

Contributor

yoshuawuyts commented Dec 18, 2018

@aral I think that's a pretty reasonable suggestion that could remove some ambiguity. I also always mistake "secret key" for "private key", which this would also help solve.

@martinheidegger

Contributor

martinheidegger commented Jan 18, 2019

@aral I am considering working on an "encrypted DAT": a DAT that is additionally encrypted with yet another key, so that proxies/bridges can be implemented without knowing the content of the DAT. Do you have any idea what this key should be called? ;)

@aral


aral commented Jan 18, 2019

If you mean encrypting the contents of hypercores, I’d say “encryption key” does what it says on the tin.

@martinheidegger

Contributor

martinheidegger commented Jan 18, 2019

Thanks, naming is hard :-)
