2019 OKRs #2

Open
mikeal opened this Issue Nov 27, 2018 · 17 comments

mikeal commented Nov 27, 2018

It's time to start defining the goal posts we'd like to reach by the end of 2019. I've started drafting a few and would love some feedback.

Objective: Common data-structures are available in IPLD.

  • Key Result: HAMT in IPLD is standardized and available in 2 implementations
    Collections (single namespace split over many blocks) are available.
  • Key Result: Specification and implementation in one language of an efficient sorted collection (e.g. B-tree).
  • Key Result: Specification and implementation in one language of an efficient multi-dimensional collection (e.g. R-tree).

Objective: Structured indexes of IPLD graphs are available.

  • Key Result: One implementation of an index that enables any IPLD graph to be searched by secondary attributes.

Objective: Developers can easily use encrypted data-structures.

  • Key Result: Specification and implementations of a common pattern for encrypting any IPLD data-structure.

Objective: Applications are built and adopted using IPLD data-structures.

  • Key Result: IPLD is presented to 10K people.
  • Key Result: IPLD articles are viewed by 100K people.
  • Key Result: 5K unique people engage with IPLD related projects.

Objective: unixfs-v2 is widely implemented.

  • Key Result: unixfs-v2 is implemented in js-ipfs.
  • Key Result: unixfs-v2 is implemented in go-ipfs.
  • Key Result: unixfs-v2 is implemented as a stand-alone library in Node.js.

Objective: There is a replication alternative to Bitswap.

  • Key Result: Non-local blocks can be fetched by a mechanism which isn't Bitswap.
  • Key Result: System is pluggable, new ideas can be implemented easily.

vmx commented Nov 28, 2018

  • Key Result: Specification and implementation in 2 languages of an efficient B-tree (sorted collections).

I would rephrase it to "of an efficient sorted collection (e.g. B-tree)", as I'm not sure it will be a B-tree. I would also reduce it to 1 language. I think the focus should be on getting it working in one, rather than having two half-finished implementations. Once one is done, the other should be easy.

  • Key Result: Specification and implementation in 2 languages of an efficient R-tree (geo-spatial sorted collection)

I would rephrase it to "of an efficient multi-dimensional collection (e.g. R-tree)". Those collections don't need to be sorted (they could be, but likely are not). I would also reduce it to 1 language for the same reason as mentioned above.

As I'm still unsure how hard this would be, and I would rather see a future in search, I might even make this one a "nice to have" and not a key result. So perhaps a new KR could be (which doesn't really fit the objective yet):

  • Key Result: Any IPLD graph can be searched by secondary attributes

Objective: Developers can easily use encrypted data-structures.

This sounds like a huge one. Has anyone already put thought into this? I would probably move that to 2020 or change it to something like "Ideas around encrypted data-structures are noted down".

  • Key Result: unixfs-v2 is implemented as a stand-alone library in Node.js.

Isn't this automatically done with the "unixfs-v2 is implemented in js-ipfs" one?

  • What kind of goals should we have for replication?

Objective: There is an alternative to Bitswap

  • Key Result: Non-local blocks can be fetched by a mechanism which isn't Bitswap
  • Key Result: System is pluggable, new ideas can be implemented easily

mikeal commented Nov 28, 2018

@vmx

  • altered the B-Tree and R-Tree KRs.
  • added a new OKR for indexing to cover the search case.
  • added the replication OKRs you noted.

Isn't this automatically done with the "unixfs-v2 is implemented in js-ipfs" one?

The current unixfs implementation in JS isn't exactly standalone: it is a single module, but it basically requires all of IPFS in order to function. I mean something more along the lines of https://github.com/mikeal/js-unixfsv2-draft, which is currently written to the old unixfsv2 draft.

mikeal commented Nov 28, 2018

Objective: Developers can easily use encrypted data-structures.

This sounds like a huge one. Has anyone already put thought into this? I would probably move that to 2020 or change it to something like "Ideas around encrypted data-structures are noted down".

I've actually been thinking about this a lot lately and there's a model I want to try using WebAssembly. The wasm work for this is likely to be done anyway for IPFS/libp2p so it shouldn't be very hard at all to implement the approach I'm thinking about, which would be a self-implemented encryption format for any IPLD object that supports the data model.

vmx commented Nov 28, 2018

Thanks @mikeal for updating. Things look good to me!

Stebalien commented Nov 29, 2018

I've actually been thinking about this a lot lately and there's a model I want to try using WebAssembly. The wasm work for this is likely to be done anyway for IPFS/libp2p so it shouldn't be very hard at all to implement the approach I'm thinking about, which would be a self-implemented encryption format for any IPLD object that supports the data model.

So, decryption with WebAssembly is a bit tricky. We cannot, ever, ever pass private data (keys) into untrusted code. That means we can't pass our private key into a function referenced by the encrypted data. Instead, we have to make the key itself point to the decryption algorithm.

Also, there's still the structure/privacy tradeoff. We could:

  1. Encrypt outside the structure (hiding the IPLD structure)
  2. Encrypt inside, leaving unnamed "links" so we can at least fetch an entire dag.
  3. Create some transformed/encrypted DAG. You can get some really nice privacy properties here but it's a really hard problem.
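
The difference between options 1 and 2 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the node layout, the function names `encrypt_outside` and `encrypt_inside`, the use of JSON as the serialization, and the XOR keystream, which is a toy stand-in for a real cipher and is not secure.

```python
import hashlib
import json


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy only, NOT secure.

    XOR is its own inverse, so the same call also decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_outside(key: bytes, node: dict) -> bytes:
    """Option 1: encrypt the whole serialized node, so the links are hidden too."""
    return toy_cipher(key, json.dumps(node, sort_keys=True).encode())


def encrypt_inside(key: bytes, node: dict) -> dict:
    """Option 2: encrypt the node, but expose the link targets without names."""
    ciphertext = toy_cipher(key, json.dumps(node, sort_keys=True).encode())
    return {"data": ciphertext, "links": sorted(node["links"].values())}


# Hypothetical node with two named links (the link values stand in for CIDs).
node = {"title": "budget", "links": {"q1": "cid-aaa", "q2": "cid-bbb"}}
key = b"example key"

outside = encrypt_outside(key, node)  # opaque bytes: nothing to traverse
inside = encrypt_inside(key, node)    # link targets visible, names are not
```

With `outside`, a peer without the key cannot discover any children to fetch. With `inside`, the peer can follow `inside["links"]` and replicate the whole DAG, at the cost of revealing its shape.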

mikeal commented Nov 29, 2018

Safety is sort of the whole point of WASM, so I wouldn't exactly call it "untrusted." It has no access to the filesystem, to the network, to any other syscalls, or to any memory other than the slab you explicitly give it. Is there a particular scenario you're worried about that would allow the key to leak?

Stebalien commented Nov 29, 2018

Is there a particular scenario you're worried about that would allow the key to leak?

Side channels, including the human side channel.

For example, let's say I have a shared filesystem and an attacker can change the "decrypt" program. In that case, the attacker can:

  1. Create a custom "decrypt" function that, in addition to decrypting a directory, modifies the ACL on the directory to give the attacker access.
  2. Attach this custom decrypt function to a directory it can't decrypt.
  3. Convince someone with access to modify the directory. The directory will be re-encrypted with the new ACLs, giving the attacker access.

Note: the solution is simply to make the decryption key itself the program. That is, if someone gives me both the key and the decryption program at the same time, I know that they already knew the "secret" inputs to the function so there's no security issue.

mikeal commented Nov 29, 2018

I'm not following.

  • If the author of the data-structure is linking to the decryption program then it isn't open to attack by someone other than the author.
  • The decryption program could modify the data it's returning, yes, but the data and the decryption program were written into the same IPLD object, so the author of the node is in control of the ACLs anyway. Essentially, if a decryption program that is linked from the block holding the encrypted data wants to alter the decrypted data, then the return value from the decryption program is still correct by definition, since the encrypted data is unreadable and has no semantics without the decryption program.

mikeal commented Nov 29, 2018

Here's an illustration to try and make things clearer:

// Encrypted Block
{ @encryption: 
 { to: Binary(), // public key
   from: Binary(), // public key
   decrypt: Link() // link to raw node of wasm code.
 },
 data: Binary(),
 links: [
   Link() // link to another encrypted node that is referenced in the encrypted data.
 ],
}

Stebalien commented Nov 30, 2018

I can take an existing IPLD object with existing encrypted data and then create a new object with the same encrypted data but with a new program. I can then give it to you and say "I made this, please decrypt it".

mikeal commented Nov 30, 2018

Why would I decrypt a data-structure that references a specific decryption program with some random program you give me instead?

mikeal commented Nov 30, 2018

Another thing to consider: if there is a standardized interface like this for IPLD, then you're just going to ask the IPLD Interface API to decrypt things for you and pass in the private key or some kind of key manager interface. That API won't even have the option of running a decryption program that isn't referenced by the node itself.

Stebalien commented Nov 30, 2018

Why would I decrypt a data-structure that references a specific decryption program with some random program you give me instead?

Given a data structure of the form { decryptionProgram: good, data: encryptedSecret }, an attacker would create a new data structure of the form { decryptionProgram: bad, data: encryptedSecret }.
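
A minimal sketch of that substitution, under loud assumptions: `block_id` is a stand-in for a CID, the `programs` table stands in for resolving a link to wasm code, and `good`/`bad` fake the decryption by returning a fixed plaintext. None of these names come from IPLD.

```python
import hashlib
import json


def block_id(block: dict) -> str:
    """Stand-in for a CID: SHA-256 over a canonical JSON encoding."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def good(secret: str, data: str) -> dict:
    """Honest program: pretend this is the real plaintext for `data`."""
    return {"acl": ["alice"], "body": f"plaintext of {data}"}


def bad(secret: str, data: str) -> dict:
    """Attacker's program: decrypts correctly, then quietly extends the ACL."""
    plain = good(secret, data)
    plain["acl"].append("mallory")
    return plain


# Hypothetical link resolution: program name -> executable program.
programs = {"good": good, "bad": bad}


def naive_decrypt(block: dict, secret: str) -> dict:
    """Reader that trusts whichever program the block links to."""
    return programs[block["decryptionProgram"]](secret, block["data"])


original = {"decryptionProgram": "good", "data": "encryptedSecret"}
# The attacker copies the ciphertext unchanged and swaps only the program link.
forged = {**original, "decryptionProgram": "bad"}

honest_view = naive_decrypt(original, secret="key data")
forged_view = naive_decrypt(forged, secret="key data")
```

The forged block gets a different identifier, but it is still a perfectly valid block, and nothing in it tells the reader that the ciphertext was written for a different program. If the reader re-encrypts `forged_view` after editing it, the extended ACL is what gets persisted.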

Really, there's a simple solution: attach the decryption program to the key. That is:

key = {
  decrypt: ...,
  encrypt: ...,
  other: ...,
  secret: "key data"
}

You'd then call key.decrypt(key, message) to decrypt message.
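
As a rough sketch of that shape, assuming a `Key` dataclass and a toy XOR keystream in place of a real cipher (both are illustrative and not part of any IPLD or IPFS API, and the cipher is not secure):

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


def toy_cipher(secret: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Toy only, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(secret + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass(frozen=True)
class Key:
    """The key carries its own algorithms, so the message never chooses them."""
    secret: bytes
    encrypt: Callable[["Key", bytes], bytes]
    decrypt: Callable[["Key", bytes], bytes]


def xor_encrypt(key: Key, message: bytes) -> bytes:
    return toy_cipher(key.secret, message)


def xor_decrypt(key: Key, ciphertext: bytes) -> bytes:
    return toy_cipher(key.secret, ciphertext)


key = Key(secret=b"key data", encrypt=xor_encrypt, decrypt=xor_decrypt)

ciphertext = key.encrypt(key, b"hello")
plaintext = key.decrypt(key, ciphertext)  # the call shape from the comment above
```

Because the program arrives together with the secret, whoever supplied it already knew the secret, so an encrypted block has no way to redirect the reader to different code.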

@mikeal mikeal closed this Dec 19, 2018

mikeal commented Dec 19, 2018

My bad, this shouldn't be closed yet; I thought it was about the Q1 OKRs.

@mikeal mikeal reopened this Dec 19, 2018

momack2 commented Dec 20, 2018

Is anything on the graphsync/selectors spec going to make it into Q1? We have an important use case for having that implemented mid-year, so it isn't something we can put off until later quarters without consequences. I know it feels blocked right now, but unblocking it and getting it ready for Go development in Q1 seems valuable.

vmx commented Dec 20, 2018

@momack2: the Graphsync related one is the "Demo of IPLD replication without Bitswap" one.

mikeal commented Dec 20, 2018

@momack2 what use case is that? Is there a link to the details? We need to understand the use case in order to prioritize replication features that will support it.
