Merkle File Transfer - Efficient and pain free file uploads, smart chunking for processing queues #2
After running into a fair amount of pain and frustration when uploading large files (e.g. a video to YouTube or Vimeo), watching an upload get canceled because the external HDD suddenly unmounted or the Internet connection dropped, and then having to restart it from scratch, I started looking into whether there is a better way to do this file upload thing.
I didn't have to look too far, as IPFS (and pretty much any other Merkle'lised data transfer) does chunking and upload resuming pretty well! However, IPFS has a large scope and broad goals; for this specific case we are focused only on bitswap, the exchange protocol found inside IPFS, used for a 1:1 unidirectional transfer.
Merkle File Transfer (MFT)
The goals of an MFT are:
Traditional File Uploading
Traditional file uploads (FTP, SFTP, HTTP, etc) have a very simple algorithm
This makes it extremely simple, but also extremely wasteful of resources: if a connection drops, even if only one bit is missing, the whole file has to be transferred again.
Merkle'lised File Uploading
With an MFT, you chunk and import a file into a MerkleDAG format (ref: jbenet/random-ideas#20) and send the list of chunks. With that list, the receiver can ask for the specific chunks it is still missing and avoid transferring chunks it already has.
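To make the resume behaviour concrete, here is a minimal sketch of the idea in Python. It is not bitswap and not a real MerkleDAG: it assumes fixed-size chunks and a flat list of SHA-256 chunk hashes as the "list of chunks", and every name in it (`build_manifest`, `missing_chunks`, `receive`, `reassemble`, the `limit` parameter used to simulate a dropped connection) is hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Callable, Dict, List, Optional

CHUNK_SIZE = 4  # deliberately tiny for illustration; an assumption, not a recommendation


def chunk_bytes(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_id(chunk: bytes) -> str:
    """Content address of a chunk: the hex SHA-256 of its bytes."""
    return hashlib.sha256(chunk).hexdigest()


def build_manifest(data: bytes, size: int = CHUNK_SIZE) -> List[str]:
    """Ordered list of chunk ids; this is what the sender transmits first."""
    return [chunk_id(c) for c in chunk_bytes(data, size)]


def missing_chunks(manifest: List[str], store: Dict[str, bytes]) -> List[str]:
    """Chunk ids the receiver still needs, in order, without duplicates."""
    seen = set()
    missing = []
    for cid in manifest:
        if cid not in store and cid not in seen:
            seen.add(cid)
            missing.append(cid)
    return missing


def receive(manifest: List[str], store: Dict[str, bytes],
            fetch: Callable[[str], bytes], limit: Optional[int] = None) -> int:
    """Request only the missing chunks; stop after `limit` to simulate a drop."""
    fetched = 0
    for cid in missing_chunks(manifest, store):
        if limit is not None and fetched >= limit:
            break
        chunk = fetch(cid)
        if chunk_id(chunk) != cid:  # verify the content against its hash
            raise ValueError("chunk does not match its id: " + cid)
        store[cid] = chunk
        fetched += 1
    return fetched


def reassemble(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the file once every chunk in the manifest is present."""
    return b"".join(store[cid] for cid in manifest)
```

In this sketch, a first call to `receive` with a small `limit` stands in for an interrupted upload. Calling `receive` again with the same `store` fetches only what is still missing, so nothing already received is sent twice.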
Merkle File Processing Queues
One of the use cases that can benefit greatly from MFT is Processing Queues. Depending on the type of data, we can make the chunking algorithm smarter about how it breaks the file into parts, so that the receiver can start processing chunks as soon as it receives them (e.g. chunk a video by key frames so that effects can be applied).
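A rough sketch of what content-aware chunking could look like, in Python. It assumes a toy representation where a video is a sequence of `(is_key_frame, payload)` tuples; a real implementation would need a container/codec parser to find key frames. The names `chunk_by_key_frame` and `process_as_received` and the `effect` callback are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Toy frame format, assumed for illustration only: (is_key_frame, payload).
Frame = Tuple[bool, bytes]


def chunk_by_key_frame(frames: Iterable[Frame]) -> Iterator[List[Frame]]:
    """Yield groups of frames, starting a new group at every key frame.

    Frames that appear before the first key frame end up in a leading
    group of their own.
    """
    group: List[Frame] = []
    for frame in frames:
        is_key, _payload = frame
        if is_key and group:
            yield group
            group = []
        group.append(frame)
    if group:
        yield group


def process_as_received(chunks: Iterable[List[Frame]],
                        effect: Callable[[bytes], bytes]) -> Iterator[List[bytes]]:
    """Apply `effect` to each chunk as soon as it arrives.

    Because each chunk begins at a key frame, it can be processed without
    waiting for the chunks that come after it.
    """
    for chunk in chunks:
        yield [effect(payload) for _is_key, payload in chunk]
```

Both functions are generators, so a receiver can feed chunks into `process_as_received` while later chunks are still in flight.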