Provide APIs for interacting with dats #10
Comments
soyuka
Feb 21, 2018
Owner
I wonder if it'd be possible to in fact re-use the DatArchive module. It's not exposed as a module on npm though, and it'd be a shame to write it again. Thoughts @pfrazee?
RangerMauve
Feb 21, 2018
The related source code in beaker is the web API for the web-side and Background app API for the non-web side.
RangerMauve
Feb 21, 2018
Which all seems to be wrapping over pauls-dat-api.
RangerMauve
Feb 21, 2018
Maybe the daemon could expose pauls-dat-api, using the manifest here, over a protobuf RPC.
pfrazee
Feb 21, 2018
There's a somewhat hacked-together module on npm right now: https://github.com/beakerbrowser/node-dat-archive
pauls-dat-api is the core of the logic, if you want to mimic its interface, that's where I'd start. The question I'd ask is, how do you plan to expose the API? Is the idea that dat-daemon could be embedded in a node app and then leverage DatArchive, or are you planning to have apps connect to an active daemon process and leverage DatArchive as a wrapper around RPC?
Yes but the high-level API is also neat.
pfrazee
Feb 21, 2018
@soyuka at one point I found myself needing the DatArchive interface in Beaker's background process, and I had to re-implement the interface there: https://github.com/beakerbrowser/beaker/blob/beaker-0.8/app/lib/bg/dat-archive.js. Not my proudest work, but I couldn't think of a better solution. I think I ended up getting rid of the code that required that, though.
soyuka
Feb 21, 2018
Owner
> are you planning to have apps connect to an active daemon process and leverage DatArchive as a wrapper around RPC?
This. With the help of websockets for example, it'd allow building dat-aware extensions or whatever. IMO the dat command line utility should also be decoupled through a daemon (using websockets or tcp).
Though, my first intent was to be able to have a daemon that shares dats on my own server and/or an always-on low-power computer at home. As we're seeing more interest/ideas that could wrap around an RPC, I'm all ears :).
About the last message: thanks, it might be useful.
See also initial thoughts here: dat-land/dat-desktop#434
soyuka added the enhancement label (Feb 21, 2018)
pfrazee
Feb 21, 2018
Yeah an RPC-able DatArchive daemon/backend has been on my mind. Should be useful.
martinheidegger
Feb 23, 2018
Well, Hyperdrive uses a random-access-... storage approach to collect data. A random-access-datdaemon could allow transparent communication with the dats managed by dat-daemon?
soyuka
Feb 23, 2018
Owner
IMO using hyperdrive or random-access is too low level for this library which should try to stick with dat-node and high-level stuff.
pfrazee
Feb 23, 2018
@martinheidegger Yeah, that's an interesting idea. I'd ask: 1) would the low-level semantics make it difficult to add things like permissions, if ever needed? 2) would that access level be inefficient because it puts messaging at the wrong layer? Point 2 would concern me because of how hyperdrive does lookups (against the metadata log).
RangerMauve
Feb 23, 2018
@pfrazee From the work the IPFS community has been doing on their daemon, access restrictions are being done within the browser extension.
The IPFS-companion extension has full access to the IPFS daemon, and it provides the web with an IPFS global that has access restrictions scoped to the origin being served (kinda like how browser permissions already work for webrtc and such).
RangerMauve
Feb 23, 2018
So the daemon would allow anything running locally to connect to it, and higher-level applications that require access restrictions based on some sort of criteria can then do so.
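A minimal sketch of that split, with all names assumed for illustration (none of these come from dat-daemon or ipfs-companion): the daemon trusts any local connection, while a higher layer keeps a per-origin grant table and filters requests before relaying them.

```javascript
// Hypothetical permission layer: the daemon accepts local clients;
// an extension-style layer scopes each web origin to the archives
// it was explicitly granted (names are assumptions, not real APIs).
const grants = new Map() // origin -> Set of archive keys

function grant (origin, archiveKey) {
  if (!grants.has(origin)) grants.set(origin, new Set())
  grants.get(origin).add(archiveKey)
}

function canAccess (origin, archiveKey) {
  const keys = grants.get(origin)
  return keys !== undefined && keys.has(archiveKey)
}

// The restricted layer only forwards requests the origin was granted.
function forwardToDaemon (origin, request) {
  if (!canAccess(origin, request.archiveKey)) {
    throw new Error('Permission denied for ' + origin)
  }
  return request // a real layer would relay this to the daemon socket
}

grant('https://example.com', 'abc123')
```

The point is that the daemon itself stays permission-free, matching the IPFS-companion design described above.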
RangerMauve
Mar 1, 2018
There's already a library, node-dat-archive which implements the same API as DatArchive in the beaker browser.
What about providing an RPC API that looks like this?
soyuka
Mar 2, 2018
Owner
> What about providing an RPC API that looks like this?
Totally! I have some pending work on this, but there are some edge cases to consider when trying to translate this into an RPC API.
I'm going to document my progress on the protocol this weekend. One thing that bothers me with the DatArchive API is the readFile/writeFile pair. I'd prefer the RPC to work only with streams when writing/reading. I have a proof of concept with streams and it works really well.
Also, if I want to continue to use protocol buffers, I need to structure everything. For example, say we want to readdir: what does the response look like? Do we allow some kind of "generic" data in the response protocol, or do we "type" everything?
When talking about streams there's also the same problem. If we stick with the following statement:
> The daemon receives an "Instruction" and sends an "Answer"
How would streaming work? I don't really want to wrap binary data with metadata. Therefore I've come up with file access "endpoints" (#11 (comment)). I'm still not sure that this is the best solution, but it allows us to:
- keep the "instruction"/"answer" scheme for standard instructions
- have a streaming fs interface for reads/writes
Anyway, I'm going to document the RPC protocol before I implement it :).
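To make the "type everything" option concrete, here is a small sketch using JSON in place of protobuf for brevity. The action names and message shapes are assumptions for illustration, not the actual dat-daemon protocol: each instruction carries an id so the matching answer can be correlated, and each action has a known response shape instead of generic data.

```javascript
// Hypothetical typed instruction/answer envelope (names are assumptions).
const Action = { LIST: 1, READDIR: 2 }

// Encode an instruction with an id so answers can be correlated.
function encodeInstruction (id, action, payload) {
  return JSON.stringify({ id, action, payload })
}

// Decode an answer and validate it against its declared type, rather
// than accepting "generic" data: a READDIR answer is always a list.
function decodeAnswer (raw) {
  const msg = JSON.parse(raw)
  if (msg.action === Action.READDIR && !Array.isArray(msg.payload)) {
    throw new Error('READDIR answer must be a list of entries')
  }
  return msg
}

const wire = encodeInstruction(1, Action.READDIR, { path: '/' })
// ...the daemon would answer something like:
const answer = decodeAnswer(
  JSON.stringify({ id: 1, action: Action.READDIR, payload: ['bar'] })
)
```

With protobuf the same idea becomes one message type per instruction/answer pair declared in the schema.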
RangerMauve
Mar 2, 2018
Good point about readFile/writeFile. It's a higher-level API to make things easier to work with, so it makes sense to do something with more control for the protocol.
> Do we allow some kind of "generic" data in the response protocol or do we "type" everything?
From what I've seen, people will set up multiple message types for the responses and requests. That way, if you know you're sending a `ReadDir` message, you're going to expect a `DirResults` in the response.
If you allow for "generic" data, you might as well just go with JSON or something else more dynamic, since you're losing a lot of the benefit of protobuf structured messages.
What about using websockets, but instead of having the "endpoints" in the URL, having that information in the first protobuf message sent to the daemon?
RangerMauve
Mar 2, 2018
What about using pauls-electron-rpc as a basis? It's what beaker already uses for RPC between the browser window and the node process for the DatArchive API.
soyuka
Mar 2, 2018
Owner
> If you allow for "generic" data, you might as well just go with JSON or something else more dynamic since you're losing a lot of the benefit of protobuf structured messages.
Exactly! I'm going to stick with protobuf.
> What about using websockets but instead of having the "endpoints" in the URL, having that information in the first protobuf message sent to the daemon?
Yes, I thought about this as well, but it's not that easy. Say it works like this:
- [client] => I want to write a file
- [daemon] => ok, I'm ready (and now only accepting a stream)
- [client] => sends stream
Now, how do we know that the stream ends? Is the client sending an End instruction? But what if the daemon only accepts streams without checking for more instructions?
I've prototyped something that works like this, but it's less developer-friendly and more complicated to handle than just opening a stream when you want to write/read.
Translated to javascript it's ~10 lines of client code vs only 1 with streams, for example.
Oh, and this is client => daemon writing; reading is the same, but the complexity will be on the client side.
Also, with dedicated stream channels you know that you'll only get data and no protobuf there; it's good separation of concerns imo.
> What about using pauls-electron-rpc as a basis? It's what beaker already uses for RPC between the browser window and the node process for the DatArchive API
Because I'm not fond of the abstraction proposed in this module. I dunno why. I'm still trying to reuse as much code as I can from pauls-dat-api!
RangerMauve
Mar 2, 2018
> Now, how do we know that the stream ends?
I assumed you were using one connection per request, so the socket closing would be enough.
Alternatively, one could have a content length in the first message.
Another option is to use something like multistream from IPFS, which allows you to multiplex several streams over one, along with all the stream lifecycle events.
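The content-length option can be sketched in a few lines. This is an illustration of the idea only, not the dat-daemon wire format: the first four bytes announce the payload size, so the receiver knows the stream is complete without a separate End instruction or a closed socket.

```javascript
// Hypothetical framing: a 4-byte big-endian length prefix, then raw bytes.
// The reader fires the callback once the announced byte count has arrived.
function createFrameReader (onComplete) {
  let expected = null
  const chunks = []
  let received = 0
  return function push (buf) {
    if (expected === null) {
      expected = buf.readUInt32BE(0) // the first chunk starts with the length
      buf = buf.slice(4)
    }
    chunks.push(buf)
    received += buf.length
    if (received >= expected) {
      onComplete(Buffer.concat(chunks).slice(0, expected))
    }
  }
}

// Sender side: announce the total length once, then stream chunks freely.
let result = null
const push = createFrameReader(payload => { result = payload.toString() })
const header = Buffer.alloc(4)
header.writeUInt32BE(3, 0)
push(Buffer.concat([header, Buffer.from('fo')]))
push(Buffer.from('o'))
```

The trade-off, of course, is that the sender must know the full length up front, which rules out truly open-ended streams.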
RangerMauve
Mar 2, 2018
The main worry I have with putting information in the URL is that it doesn't generalize, so it won't work for unix sockets, TCP sockets, or other arbitrary streams.
soyuka
Mar 2, 2018
Owner
> I assumed you were using one connection per request, so the socket closing would be enough.
Opening a connection for each request looks weird; usually you keep the connection open and send a bunch of requests, no?
> The main worry I have with putting information in the URL is that it doesn't scale to arbitrary streams, so it won't work for unix sockets or TCP sockets
My thought as well. Though, I could still implement simple readFile/writeFile calls that transfer the whole data in one call as a fallback. Or we could just use websockets everywhere :D.
soyuka
Mar 2, 2018
Owner
I think we also have to keep in mind that this daemon is only used to interact with local dats and will not be available from the outside. There's little interest in tcp/unix sockets, as a websocket also works well.
RangerMauve
Mar 2, 2018
If you're not opening a socket per request, how are you specifying the data you want in the URL? Is the URL just the dat archive ID?
Do you have any code I could read for this yet? I think I'm misunderstanding and confusing myself. :P
soyuka
Mar 2, 2018
Owner
Haha, sorry, this is my bad.
You're indeed having one socket per stream or live data.
On the other hand, you should have one and only one socket opened to interact with the daemon (add, list, remove, readdir, etc.)
For example (say it's a client interface in JS):

```javascript
var client = new Client() // keeps an open connection (maybe it doesn't need to stay open, but it definitely could)
var answer = await client.send({action: LIST})
// answer is a list of dats

var write = client.createWriteStream('key/path/bar') // creates a new connection
write.write('foo')
write.end() // which closes here

var readdir = await client.send({action: READDIR, path: 'path'})
assert(readdir[0] === 'bar')

var statistics = client.createFileActivityStream() // or createNetworkActivityStream(); these also open new connections
statistics.on('data', function (stats) {
  // do something
})
```
pfrazee
Mar 2, 2018
FYI, the only reason the DatArchive interface doesn't have streams yet is because I'm waiting to see if browser stream APIs stabilize more.
RangerMauve
Mar 2, 2018
What about targeting async iterators instead of actual streams for the DatArchive API?
Plus, with Firefox and Chrome both supporting WHATWG streams I doubt they'll be doing massive changes there.
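For illustration, here is what the async-iterator flavor could look like. The method names are assumptions, not the DatArchive API: Node async generators give the same for-await ergonomics that WHATWG readable streams expose.

```javascript
// Hypothetical: a readFile exposed as an async iterator of chunks.
// In a real archive each chunk would come from disk or the network.
async function * readFileChunks (chunks) {
  for (const chunk of chunks) {
    yield chunk
  }
}

// Consumers use for-await, exactly as they would over a WHATWG stream.
async function collect (iterable) {
  let out = ''
  for await (const chunk of iterable) out += chunk
  return out
}

const content = collect(readFileChunks(['he', 'llo'])) // a Promise for the full file
```

Targeting async iteration keeps the consumer code identical whether the producer is a generator, a Node stream, or a WHATWG ReadableStream.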
pfrazee
Mar 2, 2018
@RangerMauve If WHATWG streams have taken the lead then I'll probably use them. Based on https://streams.spec.whatwg.org/#example-manual-read it looks like you can do async iterators in that spec. (https://jakearchibald.com/2017/async-iterators-and-generators/ seems to confirm that)
martinheidegger
Mar 6, 2018
I have given this a little more thought, and to me it seems that the best API for interacting with dats is the same API that dats use to sync each other. Once dat-daemon returns the list of dats, the web/desktop app could simply open the dats "sparse".

```javascript
const Client = new Client()
const Dat = require('dat-node')
const HyperDrive = require('hyperdrive')    // implied by the usage below
const ram = require('random-access-memory') // implied by the usage below

const client = new Client()
client.watch({ action: LIST_DATS }, (datsDiff) => {
  removeDats(datsDiff.removed)
  addDats(datsDiff.added)
})

const dats = {}
function addDats (datKeys) {
  datKeys.forEach(datKey => {
    // open each dat sparsely so only requested data is fetched
    dats[datKey] = new Dat(new HyperDrive(ram), datKey, { sparse: true, sparseMetadata: true })
  })
}
// etc.
```

The new dats could now show what files there are, what the latest version is, what versions exist, etc., all given by the Dat API. Now: going through the bittorrent stuff for a local app talking to a local service is definitely overkill, but lucky for us, Dat doesn't really specify which transport protocol is supposed to be used. It could simply run over a tcp port that it's automatically connected to.
RangerMauve
Mar 6, 2018
Are there docs on what the protocol can do?
martinheidegger
Mar 7, 2018
@RangerMauve What do you mean? It contains all the information on everything about a dat. https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf
soyuka
Apr 17, 2018
Owner
I'm working on a high-level API that'll have the same API as DatArchive from beaker, backed by the daemon.
https://github.com/soyuka/dat-daemon/tree/master/packages/client (more work tbd). Will be closing this.
Daemon rfc: https://github.com/soyuka/dat-daemon/blob/master/rfc.md
soyuka closed this (Apr 17, 2018)
RangerMauve
Apr 17, 2018
That's awesome. Have you looked into the approach I did that accomplishes a similar goal?
Instead of having more API in the gateway, I had the gateway provide a replication stream through websockets and create a hyperdrive instance client-side.
soyuka
Apr 17, 2018
Owner
Oh nice! You should've pinged me there! I'm going to take a look at it!
RangerMauve
Apr 17, 2018
Sorry, I got really engrossed in the development and didn't think to. :D
You should check out this issue, there's gonna be a video call between some people interested in this stuff.
Oh real nice, thanks for this one :D.
RangerMauve
Feb 21, 2018
Having a daemon for easily saving dats locally is useful, but I think people would get even more out of it if the daemon could allow them to actually interact with the dats.
My end goal is to abstract interaction with the daemon using the same interface as the DatArchive API in the Beaker Browser.