Allow for pluggable types of remotes #97

Open
RangerMauve opened this Issue Mar 15, 2018 · 49 comments

RangerMauve commented Mar 15, 2018

I want to use this in conjunction with the dat archive API in BeakerBrowser.

The way it works is you have what looks like a filesystem that's distributed in a peer to peer network.
If someone has access to the URL (a public key) of your FS, they can find the data from peers in the network.
Only the owner of the filesystem (that has the private key) can write to the FS, but anyone can read and replicate it if they have the URL.

BeakerBrowser provides APIs for interacting with these Dat archives to JS in a browser context.
This means that people can make applications that are totally decentralized and don't need any third party services for storing their data.

What I'd like to be able to do is to have a dat archive be used as somebody's git history. That way other people can get the data without you having to store it in a git server.

This will also be the foundation for something like GitHub, but totally decentralized and p2p.

Ideally, what I'd like to do is set up git remotes which point to other people's dat archives, and to read the other person's dat archive directly instead of trying to make requests to a git server.

wmhilton commented Mar 16, 2018

I love this. I am working on a project in a similar vein, though it uses only WebRTC (not dat) to enable P2P git cloning.

How would a dat-based git remote work? If you can treat it like a file system then you can use isomorphic-git with it by passing a custom "fs" argument.

RangerMauve commented Mar 16, 2018

That's exactly what I was thinking:

git remote add otherperson dat://publickeyhere
git merge otherperson/master

From reading the code, it looked like the commands for working with remotes were using the GitRemoteHTTP API, so it didn't look like a custom fs argument was going to cut it.

I don't know git too well, does checking out a branch require writes to the .git folder? That might make it harder to just plug a read-only dat into git, though that could probably be overcome with an overlay-fs of some sort on top of the dat fs.
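The overlay idea mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: `createOverlayFs`, `lower`, and `upper` are hypothetical names, plain Maps stand in for the read-only dat archive and a writable local store, and the calls are synchronous for brevity. A real version would have to match whatever fs interface isomorphic-git expects.

```javascript
// Minimal overlay sketch: reads fall through to a read-only lower layer,
// writes only ever touch the writable upper layer.
// `lower` and `upper` are plain Maps standing in for real file systems.
function createOverlayFs(lower, upper) {
  return {
    readFile(path) {
      // Prefer the writable layer so local changes shadow the archive.
      if (upper.has(path)) return upper.get(path);
      if (lower.has(path)) return lower.get(path);
      throw new Error(`ENOENT: ${path}`);
    },
    writeFile(path, data) {
      // Never write to the lower (read-only) layer.
      upper.set(path, data);
    },
  };
}

// Hypothetical contents: the dat layer holds someone else's .git folder.
const datLayer = new Map([['.git/HEAD', 'ref: refs/heads/master\n']]);
const localLayer = new Map();
const overlayFs = createOverlayFs(datLayer, localLayer);
```

With this shape, anything git writes during a checkout would land in `localLayer`, while the other person's archive stays untouched.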

wmhilton commented Mar 28, 2018

So technically, since dats are just folders, depending on how you are reading the files, you wouldn't even need the concept of a "dat remote". You could just treat dat repos like local repos, that happen to sync.

That being said, dat and git are pretty similar under the hood (content-addressable, hash-based, append-only graphs), so if you're not careful about how you create "git repos over dat" you might end up with a system that does double the work (e.g. stores the history twice). But that's no worse than, say, storing your git repos in a Dropbox folder. (I've done that.)

Do you know if there's a convention for sharing git repos over dat? Is it just a matter of taking a repo and doing "dat share .git"?

RangerMauve commented Mar 28, 2018

I don't think there's been anything done for integrating dat with git yet. I think this issue is pretty close to cutting-edge in the community. :P

Can git do merges between folders?

wmhilton commented Mar 30, 2018

Can git do merges between folders?

The canonical git implementation can. Isomorphic-git cannot do anything except basic fast-forward merges at the moment, although full support for git merge is on the roadmap!

wmhilton commented Mar 30, 2018

I don't think there's been anything done for integrating dat with git yet. I think this issue is pretty close to cutting-edge in the community.

Today's cutting-edge JavaScript is next year's need-to-know framework! 😆 I'm sure it (git for dat) will happen in the next year or two, I just don't know who will make it happen. There's already a pretty amazing implementation of git for secure-scuttlebutt, which has nothing to do with dat, but I always compare them since they are the main p2p projects I follow. Ultimately my goal is to turn git into something just as decentralized as dat or ssb. It was designed to be decentralized (in fact it was designed to share code patches via email!) but that aspect got lost as centralized services like GitHub started building proprietary features and encouraging a centralized mentality (which was possibly needed to help CVS and SVN users). But I believe that by adding a very simple crypto system for signing commits, files, and directories, it will be possible to use git directly to create the peer-to-peer web that projects like Beaker are currently using dat to build. At least that's my working hypothesis.

RangerMauve commented Apr 2, 2018

I think that supporting a FS path for the remote would be sufficient to patch Dat into the workflow.
Would that be a difficult change to make? git remote add can take a path on the FS to a .git folder already, so it would work for standard git workflows.

I would then intercept requests to /dat/DATURLHERE/.git to interact with the DatArchive API and pass through everything else to regular FS commands.
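The interception described above might look something like this sketch. The `/dat/` prefix comes from the comment; `parseDatPath`, `createRoutingReader`, and the two reader callbacks are hypothetical names, and the callbacks are placeholders for the DatArchive API and a regular file system rather than calls into either.

```javascript
// Prefix used to mark paths that live inside a dat archive.
const DAT_PREFIX = '/dat/';

// Split "/dat/<key>/<rest>" into the archive key and the path inside it.
// Returns null for ordinary file system paths.
function parseDatPath(path) {
  if (!path.startsWith(DAT_PREFIX)) return null;
  const rest = path.slice(DAT_PREFIX.length);
  const slash = rest.indexOf('/');
  if (slash === -1) return { key: rest, inner: '/' };
  return { key: rest.slice(0, slash), inner: rest.slice(slash) };
}

// Build a reader that sends dat paths to `readFromDat(key, inner)` and
// everything else to `readFromLocal(path)`. Both callbacks are supplied
// by the caller.
function createRoutingReader(readFromDat, readFromLocal) {
  return function readFile(path) {
    const target = parseDatPath(path);
    return target ? readFromDat(target.key, target.inner) : readFromLocal(path);
  };
}
```

Only reads are shown here; writes, directory listings, and stat calls would need the same routing.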

RangerMauve commented Apr 2, 2018

I see that it's actually a lot of work, since this library uses the smart protocol for remotes, so you couldn't just plug in an FS.

wmhilton commented Apr 2, 2018

Why would you need a remote at all? Couldn't you just make commits directly in the dat folder?

RangerMauve commented Apr 3, 2018

I was thinking remotes were necessary for collaboration. I want to have something like GitHub, but with all your data stored in dat. You get links to your peers' git dats and can merge their branches into your own. It should also support pulling from an upstream into your own copy of somebody's repo.

Is there a way to do merges without remotes?

wmhilton commented Apr 4, 2018

Mmm.. hmmm.. I see. I was thinking of purely technical hurdles but there's a conceptual/organizational hurdle as well. You can do merges without remotes (not yet in isomorphic-git) by simply using another directory. E.g.:

cd myrepo
git pull ../yourrepo master

At least IIRC. Googling it seems to come up with recipes that use "git remote add mauve ../yourrepo" which does seem to suggest that git treats it as a special kind of remote. Actually, it looks a lot like adding a regular remote, except instead of a URL you use a file path. So maybe we DO need pluggable remotes! I'll have to figure out how git does it. It'd be awesome to have a solution for dat-ifying a git repo that worked with canonical git as well.

wmhilton commented Apr 4, 2018

Oh boy, I had forgotten this whole side of git existed:
https://git-scm.com/docs/git-remote-helpers
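For a feel of what a remote helper does, here is a rough sketch of the line-based exchange described in the git-remote-helpers documentation: git sends a command such as `capabilities` or `list`, and the helper answers with lines terminated by a blank line. `answerCommand` and the `refs` object are hypothetical names for illustration, and only these two commands are covered.

```javascript
// Answer one command line from git, following the line-based
// request/response shape described in the git-remote-helpers docs.
// `refs` maps ref names to object ids and is a stand-in for whatever the
// helper would really read from the remote.
function answerCommand(line, refs) {
  const [command] = line.trim().split(' ');
  if (command === 'capabilities') {
    // One capability per line, terminated by a blank line.
    return 'fetch\n\n';
  }
  if (command === 'list') {
    // "<value> <name>" per ref, terminated by a blank line.
    const lines = Object.entries(refs).map(([name, oid]) => `${oid} ${name}`);
    return lines.join('\n') + (lines.length ? '\n' : '') + '\n';
  }
  throw new Error(`unsupported command: ${command}`);
}
```

Because the function takes a string and returns a string, a thin CLI wrapper could drive it from STDIN for canonical git, or JavaScript code could call it directly.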

wmhilton commented Apr 4, 2018

That means git should be able to natively support dat:// URLs! Oh surely somebody has started on this...

wmhilton commented Apr 4, 2018

Awesome! One of the best people started on this: https://github.com/substack/git-dat

Doesn't look like he finished it though!

wmhilton commented Apr 4, 2018

@RangerMauve see if you can start figuring out how that would work. Meanwhile, I'll figure out how to port the git-remote-helper API to JavaScript so we can use it with isomorphic-git.

wmhilton commented Apr 4, 2018

You might find this example more useful than substack's, because it is actually complete: https://github.com/cjb/GitTorrent/blob/master/git-remote-gittorrent

gittorrent is one of my favorite projects. I don't think it deserved to die; I think it was ahead of its time.

RangerMauve commented Apr 4, 2018

Funny how our active times are 12 hours apart. :D
Would your goal be to communicate with the remote helper through a pipe from isomorphic git?
As cool as helpers for regular git are, I'm really more inclined to use something that will work with isomorphic git specifically, so that I could enable GitHub-like collaboration in Beaker Browser.

If isomorphic git interacted with helpers through a stream, existing helpers could be refactored to have a version that takes a stream instead of STDIN.
Then helpers could provide the cli bit to interact with regular git, and a js module for isomorphic git.
Smiles all around, etc.

You could probably refactor isomorphic git to use an FS git remote helper and an HTTP git remote helper to make things cleaner.

wmhilton commented Apr 6, 2018

If isomorphic git interacted with helpers through a stream, existing helpers could be refactored to have a version that takes a stream instead of STDIN.

That would be the upside: simplifying integration with existing remote helpers. The downside is that new helpers, even ones intended only for use in the browser, would have to parse a stream-based API, which is certainly an awkward way to build a JS API.

I think once I write an fs remote helper, I'll have a much better idea of what's involved.

RangerMauve commented Apr 6, 2018

I talked to substack on IRC, and it seems he's taking a different approach to version control over dat. From what I can tell, he isn't targeting git integration.

I'm currently working on groundwork for getting dat to work in browsers so that more people could make use of this stuff in the first place.

wmhilton commented Apr 10, 2018

Alright. I'm refactoring GitRemoteHTTP into a more general GitRemoteDuplexStream so that it'll be usable to clone over WebRTC and WebSocket streams. As part of that, I'm going to add pluggable remote handlers, adhering to the same rules and functions as git-remote-helpers, but without using streams for everything because that's overkill.

Remote helpers will be registered per-protocol. So maybe it would look like:

git.registerRemoteHelper('dat', DatRemoteHelper);

To use that remote helper, you would use a URL starting with dat:// or dat::.

When git encounters a URL that starts with a protocol other than http or https, it will go through its list of registered helpers until it finds one for that protocol. It will then call the "capabilities" command on that helper. Maybe that would be something like:

let remote = new DatRemoteHelper({remote, url, gitdir});
remote.on('error', console.log);
let caps = await remote.capabilities();

Here caps would be expected to be either ['connect'] for establishing a tunnel for a GitRemoteDuplexStream (e.g. with HTTP2 or WebRTC or WebSockets or UDP) or ['fetch', 'push'] for remote helpers that use their own logic (such as dat?).
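To make the proposal above a little more concrete, here is a toy sketch of per-protocol registration and lookup. Nothing here is isomorphic-git's real API: `registerRemoteHelper` mirrors the name floated in the comment, while `protocolOf`, `findRemoteHelper`, and the body of `DatRemoteHelper` are assumptions made up for illustration.

```javascript
// Hypothetical registry keyed by protocol name, mirroring the proposed
// git.registerRemoteHelper('dat', DatRemoteHelper) call.
const helpers = new Map();

function registerRemoteHelper(protocol, Helper) {
  helpers.set(protocol, Helper);
}

// Accept both "dat://..." and "dat::..." forms.
// Returns the lowercased protocol name, or null if there is none.
function protocolOf(url) {
  const match = /^([a-z][a-z0-9+.-]*)(?::\/\/|::)/i.exec(url);
  return match ? match[1].toLowerCase() : null;
}

// Find the helper class for a URL. http and https stay with the
// built-in transport, so they never resolve to a plugin.
function findRemoteHelper(url) {
  const protocol = protocolOf(url);
  if (protocol === null || protocol === 'http' || protocol === 'https') return null;
  return helpers.get(protocol) || null;
}

// Toy helper that claims to handle fetch and push itself rather than
// opening a tunnel with 'connect'.
class DatRemoteHelper {
  constructor({ url }) {
    this.url = url;
  }
  async capabilities() {
    return ['fetch', 'push'];
  }
}

registerRemoteHelper('dat', DatRemoteHelper);
```

A caller would then do something like `const Helper = findRemoteHelper(url)` and, if it is not null, construct it and inspect `await helper.capabilities()` to decide between the tunnel path and the helper's own fetch and push logic.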

But again, I'm not entirely sure how you would use dat with git. I don't think dat supports a lot of git's notions like branches and tags. I'm not opposed to it... I just feel like git by itself has most of the same features (all of them, once you add cryptographic signatures and a DHT), and with isomorphic-git you wouldn't even need a browser extension (like in Firefox) or a completely new browser (like Beaker).

RangerMauve commented Apr 10, 2018

That's awesome!
With regards to dat and git, I'm thinking of using dat as a sort of FS that's easy to share between users.

If you're planning on making a remote helper for folders, I could base the dat helper off of that since all the logic would be the same.

For example, picture a GitHub where people's repositories are just links to dat URLs where they keep their .git folder.

Checking out someone's repo doesn't require the person to have a git server hosted along with whatever authentication is needed. All you need to do is fetch the latest state of the dat repo and treat it like an FS.

The more people use a repo the more the load for fetching it is spread out across the network.

I agree that git is already great for sharing data in a decentralized fashion, but I don't see any sort of ecosystem with apps in the browser using git in a p2p fashion. I've seen git browser apps speak to HTTP git servers, but it seems to be focused on version control only and not general applications. Dat is general enough that it could be used for a bunch of types of applications now using Beaker Browser and soon in all browsers via extensions.

I really want to make something with p2p git in the browser, and Dat just looks like the easiest path for me to achieve that. If there was an alternative that was more widespread or easier to use, I'd be all for it.

wmhilton commented Apr 10, 2018

Dat is general enough that it could be used for a bunch of types of applications now using Beaker Browser and soon in all browsers via extensions.

I'll concede that dat has first-mover advantage. That's the very reason that Beaker chose dat over IPFS et al.

I really want to make something with p2p git in the browser, and Dat just looks like the easiest path for me to achieve that. If there was an alternative that was more widespread or easier to use, I'd be all for it.

Stay tuned. Stay very tuned.

RangerMauve commented Apr 11, 2018

Re: first-mover advantage.

Well, in addition to it already being popular, I like its security model. Unless someone shares a dat:// URL with you, there's no way for you to access that data, and everything is cryptographically secure and end-to-end encrypted (other than the discovery process, I guess).

This feature is pretty different from stuff like IPFS and BitTorrent.

Stay tuned

Anything I could get a sneak peek at? :D

wmhilton commented Apr 14, 2018

Anything I could get a sneak peek at?

Until I get bidirectional packfile transfer working over WebRTC, there's not a lot of p2p possible. But you can look at https://git-app-manager.now.sh/ for one potential direction I am trying to head. Not shown in the UI, there's actually some code for announcing and querying git hashes using a GunDB database. For a given SHA1 ref, it tells you who on the network has that commit. But I don't yet have a way to use that knowledge - which is where doing clone/push over WebRTC comes in.
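The announce-and-query idea can be pictured with a small in-memory index. This is a stand-in for illustration only: it does not use GunDB or reflect its API, and `createCommitIndex`, `announce`, and `whoHas` are hypothetical names.

```javascript
// In-memory stand-in for a shared database: maps a commit SHA to the set
// of peer ids that have announced they hold it.
function createCommitIndex() {
  const peersBySha = new Map();
  return {
    // Record that `peerId` has the commit identified by `sha`.
    announce(sha, peerId) {
      if (!peersBySha.has(sha)) peersBySha.set(sha, new Set());
      peersBySha.get(sha).add(peerId);
    },
    // List the peers known to have `sha`, in the order they announced.
    whoHas(sha) {
      return [...(peersBySha.get(sha) || [])];
    },
  };
}
```

The missing piece described in the comment is the step after `whoHas`: actually opening a connection to one of those peers and transferring a packfile.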

RangerMauve commented Apr 25, 2018

I noticed your recent PR #132. Does that mean the library is ready for new types of remotes?

wmhilton commented Apr 26, 2018

Almost! I started the refactoring in that PR but I haven't exposed a way to register plugins yet. Once #138 is merged I'll actually know what the exact API will look like and I'll expose that.

RangerMauve commented Apr 26, 2018

Really excited! I'm just finishing up a bunch of work to make the DatArchive API available in regular browsers, so hopefully we'll finish up around the same time and I can start working on this. :D

wmhilton commented Apr 26, 2018

I'm just finishing up a bunch of work to make the DatArchive API available in regular browsers

Wait whaaaaat? How are you doing that? That would be freaking awesome.

RangerMauve commented Apr 26, 2018

It's kinda cheating since I'm using a gateway, but users can set up a local gateway and it'll work just as well.

The gist is I create a local dat archive in the browser that replicates with the gateway via a websocket. Then the gateway serves files to the browser using http.

Check out the following:

https://github.com/RangerMauve/dat-archive-web

https://github.com/RangerMauve/dat-gateway

datproject/discussions#84

sammacbeth/dat-fox#1

https://github.com/RangerMauve/dat-polyfill

RangerMauve May 15, 2018

For now, I'm aiming at a pull-based approach to remotes. You only push to a remote if you have write access (i.e., it's your repo), and everyone else pulls from other remotes.

With centralized git you really need a central server to push all your changes to. Peers have no way to connect to each other directly, so that's the only way for them to get each other's changes.

However with decentralized git, everyone pulls data from each other directly when necessary.

If you do want to have a way for multiple people to push to the same remote, you can use the multiwriter support coming from hyperdb to allow multiple peers to "write" changes. I think whatever conflict-resolution strategy hyperdrive ends up using is what will need to be adopted by the system.

Other than that, if you want to track changes from multiple peers, you just track the multiple remotes and merge with them when you want their changes.

wmhilton May 16, 2018

Member

> However with decentralized git, everyone pulls data from each other directly when necessary.

You can do what BitTorrent does and use a tracker server or a DHT (distributed hash table). A tracker server is a kind of address book of "which peers can I get this repo from?", and with a DHT basically every peer has a copy (or partial copy) of the address book.

It is tricky.
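
To make the address book idea concrete, here is a rough sketch of a tracker as an in-memory map from a repo id to the peers that announced it. Everything in it (the `Tracker` class, the repo id, the peer addresses, the expiry time) is made up for illustration; a real tracker would be a network service, and entries expire here only because peers come and go.

```javascript
// Hypothetical in-memory tracker: repo id -> peers that said they can serve it.
class Tracker {
  constructor(ttlMs = 60000) {
    this.ttlMs = ttlMs;
    this.peersByRepo = new Map(); // repoId -> Map(peerAddress -> last announce time)
  }

  // A peer says "I can serve this repo". Re-announcing refreshes its timestamp.
  announce(repoId, peerAddress, now = Date.now()) {
    if (!this.peersByRepo.has(repoId)) this.peersByRepo.set(repoId, new Map());
    this.peersByRepo.get(repoId).set(peerAddress, now);
  }

  // Answer "which peers can I get this repo from?", dropping stale entries first.
  lookup(repoId, now = Date.now()) {
    const peers = this.peersByRepo.get(repoId);
    if (!peers) return [];
    for (const [address, lastSeen] of peers) {
      if (now - lastSeen > this.ttlMs) peers.delete(address);
    }
    return [...peers.keys()];
  }
}

const tracker = new Tracker(1000);
tracker.announce('repo-abc', 'peer-1:4000', 0);
tracker.announce('repo-abc', 'peer-2:4000', 500);
const early = tracker.lookup('repo-abc', 900);  // both entries are still fresh
const late = tracker.lookup('repo-abc', 1200);  // peer-1's entry has expired
```

A DHT would spread the same map across the peers themselves instead of keeping it on one server.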

wmhilton May 16, 2018

Member

> It is tricky.

Because the addresses in the address book are constantly changing. And now all you've done is trade one hard problem, "how do I distribute git repositories peer-to-peer", for another, "how do I distribute an address book".

Although... now that I've put that in words... it wouldn't be too hard to keep an address book in a... [wait for it] git repo!

So now you've traded "how do I distribute git repositories peer-to-peer" for "how can I distribute a single git repository and let everyone read and write to it".

wmhilton May 16, 2018

Member

[at this point someone in the audience inevitably shouts "Blockchain!" but there's probably an alternative]

RangerMauve May 16, 2018

@wmhilton Discovering collaborators isn't too hard. An existing collaborator can add new ones by adding a reference to the other person's dat within the repo dat. That way you can reference the main dat URL, but find all the other dats involved by having the application look for them.
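
A rough sketch of what "having the application look for them" could mean: start at the main dat URL and follow references to other dats until nothing new turns up. The `collaboratorsByDat` table stands in for reading a list out of each archive; how that list is stored, and whether references are followed transitively, are assumptions made for this example.

```javascript
// Hypothetical stand-in for reading a collaborator list out of each dat.
// In a real application this would be a file read from the archive.
const collaboratorsByDat = {
  'dat://main': ['dat://alice', 'dat://bob'],
  'dat://alice': ['dat://carol'],
  'dat://bob': ['dat://main'], // a back-reference must not cause a loop
  'dat://carol': [],
};

function readCollaborators(datUrl) {
  return collaboratorsByDat[datUrl] || [];
}

// Breadth-first walk over the references, starting at the main dat URL.
function discoverDats(mainUrl) {
  const seen = new Set([mainUrl]);
  const queue = [mainUrl];
  while (queue.length > 0) {
    const url = queue.shift();
    for (const next of readCollaborators(url)) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return [...seen];
}

const dats = discoverDats('dat://main');
```

Each URL in `dats` could then be tracked like a git remote.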

benwiley4000 May 16, 2018

I discussed this on the Dat IRC channel, but I'm not excited about a "one writer per repo" approach because I don't believe it will work well organizationally. I think there's a lot of value in having community agreement about which version of the code is the one that will be released regularly, and in making it a common goal to contribute back to that repo. The ability to fork is a healthy and necessary part of an open source ecosystem, but I don't want that to be the way everything happens. I want to avoid a situation where many popular forks of the same library are being actively maintained and a software project's dependency graph ends up with 8 or 10 different incompatible versions of the same package, not even on the same release path.

Even in the scenario where a community does congregate around the technical or charismatic appeal of one person's version of the code, I think it establishes an unnecessary power structure. And then, if a group has a political disagreement with that person, that can be sufficient reason to fork off and stop contributing changes to the person's repo, which means the broader community now has to decide which version to depend upon. Obviously this is a thing that happens in modern open source development, but I think it would happen much more in an ecosystem of repos where only one person ever has write access, even for very large projects.

So all that is a long-winded way to say that I would like to solve the multi-writer problem. :) I think it makes for better teamwork.

RangerMauve May 16, 2018

The thing to consider, too, is that it's not just that a single person has write access to a Dat archive. A single machine has access. So if you want to develop from multiple devices without corrupting your archive, you have to have multiwriter either through HyperDB, or through something fancy at the application level.

benwiley4000 May 16, 2018

Yeah, exactly. For hypergit (which is now just a discussion threads spec) I'm planning for everything to be stored in a hyperdb, where you can use an existing device to authorize a new one as belonging to the same user. That makes a mobile phone interface kind of a must (it's unrealistic to expect someone to have two of their computers in the same place).

RangerMauve May 16, 2018

How do you plan to do conflict resolution between people with write access? It sounds like you're going to have to just re-implement the idea of git, but using hyperdb instead of git's existing object graph stored on the FS.

benwiley4000 May 16, 2018

As @noffle pointed out, git-ssb puts people's changes on branches namespaced by writer. My idea was to do this under the hood (for the shared remote, which is stored in a multi-writer hyperdrive), but only display the most recent version when viewed through the UI, without the namespace. If we end up with competing histories for the same branch, users with write access will be prompted to resolve the conflict between the branches so a new head can be established. Does that make sense at all?
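
Here is a rough sketch of how the "one visible branch, namespaced underneath" idea might be resolved. It assumes each writer's head for a branch lives under a ref like `refs/heads/<writer>/master` (a hypothetical layout), and it reads "most recent version" as "the head whose history already contains every other writer's head", which is one interpretation rather than a settled rule. The commit ids are made up.

```javascript
// Toy commit graph: commit id -> parent ids.
const parents = {
  a: [],
  b: ['a'],
  c: ['b'],
  d: ['b'], // diverges from c
};

// True if `ancestor` is `commit` itself or reachable through its parents.
function isAncestor(ancestor, commit) {
  const stack = [commit];
  const seen = new Set();
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === ancestor) return true;
    if (seen.has(id)) continue;
    seen.add(id);
    stack.push(...(parents[id] || []));
  }
  return false;
}

// `headsByWriter` maps a writer's key to that writer's head for one branch.
function resolveBranch(headsByWriter) {
  const heads = [...new Set(Object.values(headsByWriter))];
  // A head wins if every other writer's head is already part of its history.
  const winner = heads.find((h) => heads.every((other) => isAncestor(other, h)));
  if (winner !== undefined) return { head: winner, conflict: false };
  // Otherwise report the tips nobody has built on, so someone can merge them.
  const competing = heads.filter(
    (h) => !heads.some((other) => other !== h && isAncestor(h, other))
  );
  return { head: null, conflict: true, competing };
}

const clean = resolveBranch({ alice: 'c', bob: 'b' });    // c contains b
const diverged = resolveBranch({ alice: 'c', bob: 'd' }); // needs a merge
```

In the diverged case the UI would prompt a writer to create a merge commit with both tips as parents, which then becomes the new winner.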

benwiley4000 May 16, 2018

This would also require a remote helper I guess, because we would want people to be able to push to a branch name without thinking about user hash namespaces.

RangerMauve May 16, 2018

@benwiley4000 Yeah, that's great! What happens when two users with write access try to make conflicting resolutions?

benwiley4000 May 16, 2018

@RangerMauve haha I don't know, I guess it just goes on forever until the dust settles. :)

RangerMauve Sep 12, 2018

So is this basically supported now that there's a pluggable FS implementation? :3

wmhilton Sep 13, 2018

Member

Good question! I think it would, because we can map the DatArchive API commands onto the fs plugin API... let's see:

| 'fs' plugin | DatArchive |
| --- | --- |
| fs.readFile(path[, options], callback) | readFile(path, opts?) |
| fs.writeFile(file, data[, options], callback) | writeFile(path, data, opts?) |
| fs.unlink(path, callback) | unlink(path) |
| fs.readdir(path[, options], callback) | readdir(path, opts?) |
| fs.mkdir(path[, mode], callback) | mkdir(path) |
| fs.rmdir(path, callback) | rmdir(path, opts?) |
| fs.stat(path[, options], callback) | stat(path, opts?) |
| fs.lstat(path[, options], callback) | stat(path, opts?) * |
| fs.readlink(path[, options], callback) | * |
| fs.symlink(target, path[, type], callback) | * |

* Aside from converting promises to callbacks, I think the only other tricky bit is handling symbolic links:

Modify the Stat object returned by Beaker to include a dummy isSymbolicLink() method that always returns false. That way isomorphic-git won't ever call fs.readlink. And isomorphic-git will never call fs.symlink unless you check out a git repo that includes a symlink, so you can make that a no-operation dummy function.

So yes?! Oh boy I wish I had more time to work on this! Would somebody be willing to start creating a DatArchive fs plugin? Once it got started I would help contribute, I just don't have the bandwidth to be the lead developer on that.
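
As a starting point, here is a rough sketch of such an adapter. It assumes only that `archive` exposes the promise-based methods from the table; `callbackify`, `createFs`, and the in-memory `createMemoryArchive` stand-in are names invented for this example, and the result has not been checked against what isomorphic-git actually requires of an fs plugin.

```javascript
// Turn a promise-returning function into one that takes a Node-style
// callback as its last argument.
function callbackify(fn) {
  return (...args) => {
    const callback = args.pop();
    fn(...args).then(
      (result) => callback(null, result),
      (error) => callback(error)
    );
  };
}

// `archive` is assumed to expose the promise-based methods from the table.
function createFs(archive) {
  const stat = async (path, opts) => {
    const st = await archive.stat(path, opts);
    st.isSymbolicLink = () => false; // dummy: the archive has no symlinks
    return st;
  };
  return {
    readFile: callbackify((path, opts) => archive.readFile(path, opts)),
    writeFile: callbackify((path, data, opts) => archive.writeFile(path, data, opts)),
    unlink: callbackify((path) => archive.unlink(path)),
    readdir: callbackify((path, opts) => archive.readdir(path, opts)),
    mkdir: callbackify((path) => archive.mkdir(path)), // `mode` is ignored
    rmdir: callbackify((path) => archive.rmdir(path)),
    stat: callbackify(stat),
    lstat: callbackify(stat), // same as stat, since nothing is a symlink
    readlink: callbackify(async () => {
      throw new Error('symlinks are not supported');
    }),
    symlink: callbackify(async () => {}), // no-op
  };
}

// Tiny in-memory stand-in for a DatArchive, just enough to try the adapter.
function createMemoryArchive() {
  const files = new Map();
  return {
    async readFile(path) {
      if (!files.has(path)) throw new Error('NotFound: ' + path);
      return files.get(path);
    },
    async writeFile(path, data) { files.set(path, data); },
    async unlink(path) { files.delete(path); },
    async readdir() { return [...files.keys()]; }, // flat listing, ignores path
    async mkdir() {},
    async rmdir() {},
    async stat(path) {
      if (!files.has(path)) throw new Error('NotFound: ' + path);
      return { isFile: () => true, isDirectory: () => false };
    },
  };
}

const datFs = createFs(createMemoryArchive());
datFs.writeFile('/README.md', 'hello', (err) => {
  if (err) throw err;
  datFs.lstat('/README.md', (statErr, st) => {
    if (statErr) throw statErr;
    console.log('symlink?', st.isSymbolicLink());
  });
});
```

A real plugin would likely also need to match the error shapes that callers of Node's fs expect, which this sketch does not attempt.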

millette Sep 13, 2018

@wmhilton hyperdrive's API is much closer to that of fs. Not sure if this helps...

RangerMauve Sep 13, 2018

That's great to hear! :D

Does that mean isomorphic-git supports merging from a given folder rather than a remote?

What I mean is, if I have two DatArchives, one of which is my repo, what API would I need to call to "merge" from one into the other?

I don't have time at the moment to do this, but I've been wanting to for a while and lost track of your progress. I'm definitely going to get on it sooner rather than later because I want to play around with version control in dat sometime soon anyways.

wmhilton Sep 13, 2018

Member

I'm not entirely sure what you mean. Isomorphic-Git only supports fast-forward merges at the moment, but even then, all git merges are "local" in the sense that they happen inside a single repo. 🤔

Do you mean fetch? You probably mean fetch. Does Isomorphic-Git support fetching from one repo to another using only the file system... come to think of it, no actually. Not yet. But we should add that ability.
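
For anyone unsure what "fast-forward only" means in practice, here is a toy sketch of the check. It is not isomorphic-git's code; the commit graph, the ids, and the function names are made up. Once the other repo's commits have been fetched into the local object store, a fast-forward merge is just moving the branch pointer, and it is only allowed when the current head is an ancestor of the incoming commit.

```javascript
// Toy commit graph: commit id -> parent ids.
const commitParents = { a: [], b: ['a'], c: ['b'], x: ['a'] };

// True if `target` is `from` itself or one of its ancestors.
function reaches(from, target) {
  const stack = [from];
  const seen = new Set();
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === target) return true;
    if (seen.has(id)) continue;
    seen.add(id);
    stack.push(...(commitParents[id] || []));
  }
  return false;
}

// Move `branch` to `theirs` only if no merge commit would be needed.
function fastForward(refs, branch, theirs) {
  const ours = refs[branch];
  if (reaches(ours, theirs)) return 'already up to date';
  if (!reaches(theirs, ours)) return 'not a fast-forward';
  refs[branch] = theirs;
  return 'fast-forwarded';
}

const refs = { master: 'b' };
const first = fastForward(refs, 'master', 'c');  // b is an ancestor of c
const second = fastForward(refs, 'master', 'x'); // c and x have diverged
```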

RangerMauve Sep 13, 2018

Yeah, that's exactly what I meant! I think that DatArchive integration will be blocked until you can fetch from the filesystem.
