
coreunix, mfs, and unixfs are closely coupled to merkledag #5488

Open
rob-deutsch opened this issue Sep 19, 2018 · 13 comments

@rob-deutsch
Contributor

rob-deutsch commented Sep 19, 2018

Background

I have been hacking away on IPFS, trying to build an application on top of it, but I've found it quite challenging. In this issue I'd like to share my observations and get feedback.

I'm trying to build a filesystem-like application, so I'm trying to utilise as much of go-ipfs/core/coreunix, go-mfs, go-unixfs, and go-merkledag as possible. It's desirable for all these packages to be utilised, because they are handy building blocks for other IPFS app devs.

My observation

My most challenging requirement is to use my own custom versions of go-merkledag.ProtoNode and go-merkledag.RawNode. This is not as easy as it should be.

What I'm seeing is that go-ipfs and its dependencies are tightly coupled to the go-merkledag.ProtoNode and go-merkledag.RawNode types, and this is preventing me from using my own custom types.

I've seen this occur in two ways:

  1. Many places in the codebase specifically build ProtoNodes and RawNodes. It's tricky to have them build anything else.

  2. Both ProtoNode and RawNode are often wrapped in the restrictive go-ipld-format.Node type, and are often cast back. This is done either by explicitly casting to a type, or by a switch statement that errors if the ipld.Node isn't a ProtoNode or RawNode.

How I'm handling it

I've spent many hours digging through the codebase, trying to implement various solutions, to no significant success. 30 minutes ago I realised what my 'best' option was: use the existing IPFS codebase to generate ipld.Node interfaces, then pass them to my code, which will implement the same not-ideal type casting to ProtoNodes and RawNodes, which I can then translate into my own nodes...

This is really not ideal, because I'll end up having to reinvent a lot of wheels, such as the way that coreunix.Adder (which I'll use to build my ipld.Nodes) interacts with a Pinner, Blockstore, and maybe even DAGService.

Any thoughts?

@magik6k
Member

magik6k commented Sep 19, 2018

Hopefully in the near future, go-ipfs/core/coreunix will be mostly replaced by coreapi.Unixfs() (from core/coreapi).

Can you tell us a bit more about your use case and why you need to dive into those lower-level layers? (Just a lack of a unified interface?)

@schomatis
Contributor

Agreed with pretty much everything you say, also interested in knowing a bit more about your use case and the code you're building.

@rob-deutsch
Contributor Author

rob-deutsch commented Sep 20, 2018

Thanks for the info on coreapi.Unixfs(). I just took a look at the code in master, which seems to be a work in progress. What's the best way to get up to speed on it? Is there a specific branch or issue to take a look at? (I searched but didn't come up with much.)

The use case is a tad tricky to explain concisely, because there are two ways it can be seen: 1) the thing I'm trying to build, and 2) the exact personal problem I'm trying to solve. But I'll give it a shot...

The short story

I want to build private AWS S3-style buckets on top of IPFS. I'm naming these 'vaults'. They're basically directories of files, and they're 'private' because they're encrypted.

Motivation

I like to share files between my 3 computers (cloud server, laptop, and phone). The cloud server obtains the files, but I want to have the files available on my laptop and phone.

My current setup is: the cloud server obtains the files and caches them, I periodically rsync them to my laptop, and I delete them from the cloud server. If I ever need to get them onto my phone it's a PITA.

There are two annoyances with getting them onto my phone: 1) they could be on my cloud server (if I haven't rsync'd yet) or my laptop, and 2) regardless of where they are, getting them to my phone is a pain.

IPFS could solve this natively, but these are private files, and I don't want other peers to read them. I could achieve this by restricting which peers my nodes will send the blocks to, but that's a pain. So let's use encryption.

The solution

I COULD achieve this by just encrypting the file before making them available on IPFS, but there are a few practical problems with this (e.g. hiding the filenames) and I want something that acts with minimal effort on my part (e.g. as transparent as possible).

So my idea was to reuse everything already in IPFS, but encrypt the blocks with AES256 in my Blockstore and when I send them out over Bitswap.

The implementation

To achieve this, I want all of my IPFS nodes to have a store of secrets. Each secret will be a tuple of (friendly name, secret, fingerprint of secret). In this example, let's say all 3 of my computers have a single secret like (myvault, aes256 key, sha256 of aes256 key).

Every block would then be encrypted into something like the following format (details TBD):

fingerprint of secret aes key , aes256(nonce , length of raw block , multicodec of block , raw block)

That way I can distribute my files between my 3 computers without worrying about anyone else getting a hold of the blocks. Want to request them from one of my nodes? Fine, go ahead, I don't care. They'll do that, but they'll also be contributing to the broader DHT etc.

How the implementation has gone so far

Adding to the go-ipfs code such that it could handle these encrypted blocks was easy(ish).

Adding to the go-ipfs code such that it would generate these encrypted blocks is very tough. coreunix.Adder just steams ahead creating ProtoNodes and RawNodes, taking their CIDs, and pushing them out to the network.

I want to yell at coreunix.Adder: "just tell me the raw data you want in the raw block, let me tell you what CID you should use to get that raw data (because it's actually going to come from an encrypted block), and then of course let me push it out to the BlockService".

@magik6k
Member

magik6k commented Sep 20, 2018

Have you tried setting custom fileAdder.CidBuilder? It seems to be what you want:

type Builder interface {
	Sum(data []byte) (Cid, error)
	GetCodec() uint64
	WithCodec(uint64) Builder
}

Example implementation: https://github.com/ipfs/go-cidutil/blob/master/inline.go

Integrating the read part will likely be much trickier, involving lots of poking in go-unixfs.

@schomatis
Contributor

I COULD achieve this by just encrypting the file before making them available on IPFS, but there are a few practical problems with this (e.g. hiding the filenames) and I want something that acts with minimal effort on my part (e.g. as transparent as possible).

Also, maybe this encryption layer can be of use: https://github.com/jbenet/ipfs-senc.

@rob-deutsch
Contributor Author

rob-deutsch commented Sep 20, 2018

Have you tried setting custom fileAdder.CidBuilder?

I have considered this, but it doesn't work. Firstly, in addition to a custom fileAdder.CidBuilder, I'll also need a custom BlockService/DAGService. This can be done, but the problem is that they both somehow need to know either:

a) A nonce that's added inside the encrypted block
b) An IV at the beginning of the block

The read part was actually easy to implement. I've already done it. It just required some additions to go-ipld-format, go-merkledag and an additional package I named go-ipld-aes. I made it so that go-unixfs just thinks it's dealing with regular ProtoNode/RawNode. Of course, this too would be much more elegant if parts of the codebase were decoupled from go-merkledag.

Also, maybe this encryption layer can be of use: https://github.com/jbenet/ipfs-senc.

I didn't know about ipfs-senc. Thanks!

Unfortunately, it's not what I want. It tars an entire directory into a single file.

I want to retain all of the cool DAG functionality of "normal" IPFS. I want to achieve this by just encrypting individual blocks. The main one I need in my use case is the ability to treat it as a folder that I can add files to without deleting the old files.

@schomatis
Contributor

I want to retain all of the cool DAG functionality of "normal" IPFS. I want to achieve this by just encrypting individual blocks. The main one I need in my use case is the ability to treat it as a folder that I can add files to without deleting the old files.

I don't fully understand (but you don't need to answer this) why this needs to be implemented at the block level and not at the UnixFS/MFS layers adding some kind of encrypted file type. Proto/raw nodes (seems to me) are more about how we cut a file up for convenience of transport and storage but I would encrypt the source (file) instead of the bit streams generated from it.

@rob-deutsch
Contributor Author

It could be done at that level, but I don't think it's the right way to do it.

I'm not entirely sure what type of implementation you've got in mind, but the biggest issue I see is "how do you also encrypt the file names?"

@schomatis
Contributor

but the biggest issue I see is "how do you also encrypt the file names?"

Good point, that is stored at the DAG level; you'd also need to implement your own type of MFS directory that would store the names of its files as part of its content instead of relying on lower layers. But yes, your project sounds more like an encrypted volume ("vault" as you call it) and the current code is not prepared for it. I would be interested in taking a look at your encrypting implementation (if you can share that part of your code).

@rob-deutsch
Contributor Author

@schomatis, do you mean my decryption implementation (built into go-ipld-format) or how I plan to actually encrypt blocks?

Also, is the plan to keep IPFS using ProtoNode and RawNode, or is it planned to move everything to cbor nodes?

@schomatis
Contributor

@schomatis, do you mean my decryption implementation (built into go-ipld-format) or how I plan to actually encrypt blocks?

Both; I used the term encrypt to mean encrypt/decrypt.

Also, is the plan to keep IPFS using ProtoNode and RawNode, or is it planned to move everything to cbor nodes?

I think those nodes won't be deprecated, but I can't say for sure.

@rob-deutsch
Contributor Author

rob-deutsch commented Sep 24, 2018

My POC is available here: rob-deutsch/go-merkledag/tree/poc/decrypt.

It's not too much code, so it's all in a single commit.

I can give the following summary:

  1. Previously, dagService.Get() called ipld.Decode() directly to turn a block into an ipld.Node. I've modified dagService.Get() so that it first checks whether the block is encrypted (multicodec 0x1337 for testing purposes), and decrypts it if required before it's passed to ipld.Decode().

  2. The decryption function func DecryptBlock(rawData []byte, repo keyStore) (multicodec uint64, plaintext []byte, err error) was added in a file called aes.go. The expectation is that the block's CIDv1 will have the 0x1337 multicodec, so the first part of the decrypted block is the multicodec of the decrypted payload (e.g. dag-pb or raw).

  3. All encrypted blocks are prepended with an SHA-256 hash of the encryption key. This is basically a fingerprint that we can use to determine if we have the encryption key. go-ipfs actually passes go-merkledag an interface built on top of the Repo which implements:

type keyStore interface {
	GetByHash(mh.Multihash) ([]byte, error)
}

@schomatis schomatis added the status/deferred Conscious decision to pause or backlog label Dec 13, 2018
@schomatis
Contributor

Moving to the backlog, I don't think there's anything we can do here at the moment.

@Stebalien Stebalien removed the status/deferred Conscious decision to pause or backlog label Dec 18, 2018
@momack2 momack2 added this to Inbox in ipfs/go-ipfs May 9, 2019