Provide multiple dat hosting/sharing on Bunsen #40

Closed
chrisekelley opened this issue Apr 11, 2018 · 14 comments

Comments

@chrisekelley
Contributor

chrisekelley commented Apr 11, 2018

Right now, Bunsen is limited to hosting/sharing one dat at a time. How can we add support for hosting several dats at once?

Here is the route that handles a dat submission and hosts it in Bunsen: https://github.com/bunsenbrowser/bunsen/blob/master/www/nodejs-project/index.js#L78

We're currently using the dat-node library to join the network and share the dat, and mirror to download it to the filesystem.

var Dat = require('dat-node')
// - snip -
app.get('/dat/:dat', function(req, res) {
  // var name = decodeURI(req.url.split('/')[0])
  var name = '/';
  var datId = req.params.dat
  console.log("name: " + name + " dat: " + datId)
  // res.send("dat is set to " + req.params.dat);
  // var link = "778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639"

  // 1. Tell Dat where to download the files
  Dat(ram, {key: datId}, function (err, dat) {
    if (err) throw err

    var network = dat.joinNetwork()
    network.once('connection', function () {
      console.log('Connected')
    })
    dat.network.on('connection', function () {
      console.log('I connected to someone for ' + datId)
      console.log('connected to', network.connections.length, 'peers')
    })
    dat.archive.metadata.update(download)

    function download () {
      var progress = mirror({fs: dat.archive, name: '/'}, dest, function (err) {
        if (err) throw err
        console.log('Done')
      })
      progress.on('put', function (src) {
        console.log('Downloading', src.name)
      })
      progress.on('end', function () {
        // var datString = JSON.stringify(datString)
        console.log('Finished downloading ' + datId)
        fs.writeFile(datIdFile, datId, function(err) {
          if(err) {
            return console.log("Error writing datIdFile: " + err);
          }

          console.log("The file was saved!");
        });
        ondirectory(dat.archive, name, req, res, opts)
      })
    }

    console.log(`Downloading: ${dat.key.toString('hex')}\n`)
  })
});
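
One possible way to lift the one-dat-at-a-time limit, as a minimal sketch under assumptions: keep a map from dat key to an opened (or still opening) dat, so each `/dat/:dat` request reuses or creates its own entry instead of replacing a single shared one. `createDatRegistry` and `openDat` are hypothetical names, not part of dat-node; `openDat` stands in for whatever wraps the `Dat(...)` call in the route above.

```javascript
// Hypothetical helper: caches one opened dat per key so several can be hosted at once.
// `openDat(key)` is an assumed callback that returns an opened dat or a Promise for one
// (for example, a wrapper around the Dat(...) call in the route above).
function createDatRegistry (openDat) {
  const dats = new Map() // key -> Promise of an opened dat

  return {
    // Return the existing entry for this key, or open it once and remember it.
    get (key) {
      if (!dats.has(key)) {
        const pending = Promise.resolve()
          .then(() => openDat(key))
          .catch((err) => {
            dats.delete(key) // do not cache failures, so a later retry can succeed
            throw err
          })
        dats.set(key, pending)
      }
      return dats.get(key)
    },
    // List the keys that are currently hosted or being opened.
    keys () {
      return Array.from(dats.keys())
    }
  }
}
```

In the route, `registry.get(req.params.dat).then(...)` would then replace the direct `Dat(...)` call, and concurrent requests for the same key would share one archive.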
@rjcorwin
Contributor

@chrisekelley At the moment it's tempting to go with some fork of dat-gateway, because it supports multiple dat archives and @RangerMauve has been working on a new DatArchive library called dat-archive-web that gives you the DatArchive API with help from the gateway. It has some drawbacks for us, though:

  • Keeping dat archives in the gateway in memory is going to bite us on mobile.
  • When you are using an app that stores data through the DatArchive API, if the gateway discards its memory (and with it the dat) on exit, you lose your dat forever unless someone else is hosting it.
  • Because everything is stored in memory, it's not going to work well offline unless you can keep your device on and the dat-gateway in memory.
  • Because all Dat archives are served from the same domain, apps in Dat archives cannot assume that, for example, values they store in localStorage are inaccessible to other apps.

This PR solves all of those problems but introduces a new one: it never flushes dat archives that don't need to stick around, so the disk fills up unnecessarily. However, it does not yet have a DatArchive API solution.

@RangerMauve

RangerMauve commented Apr 11, 2018

I think dat-gateway persists data on disk and then clears the cache periodically. It lets you specify the TTL and number of cached archives. I would suggest using something else to signal that an archive should be persisted permanently.
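
To make the idea concrete, here is a rough sketch of a cache with a TTL and a size cap in which "pinned" keys are exempt from eviction. It is not dat-gateway's actual cache; `ArchiveCache`, `pin`, and `prune` are hypothetical names, and only the `max` and `maxAge` option names are borrowed from the gateway's constructor. In this sketch the cap applies to unpinned entries only.

```javascript
// Hypothetical sketch: an archive cache with a TTL (maxAge, in ms) and a size cap (max),
// where pinned keys are never evicted. This is not dat-gateway's implementation.
class ArchiveCache {
  constructor ({ max, maxAge }) {
    this.max = max
    this.maxAge = maxAge
    this.entries = new Map() // key -> { archive, lastUsed }
    this.pinned = new Set() // keys that must be kept permanently
  }

  set (key, archive, now = Date.now()) {
    this.entries.set(key, { archive, lastUsed: now })
    this.prune(now)
  }

  get (key, now = Date.now()) {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    entry.lastUsed = now
    return entry.archive
  }

  pin (key) { this.pinned.add(key) }
  unpin (key) { this.pinned.delete(key) }

  // Drop unpinned entries that have expired, then the least recently used
  // unpinned entries while their count exceeds the cap.
  prune (now = Date.now()) {
    const evictable = Array.from(this.entries.entries())
      .filter(([key]) => !this.pinned.has(key))
      .sort((a, b) => a[1].lastUsed - b[1].lastUsed)
    let unpinnedCount = evictable.length
    for (const [key, entry] of evictable) {
      const expired = now - entry.lastUsed > this.maxAge
      const overCap = unpinnedCount > this.max
      if (expired || overCap) {
        this.entries.delete(key)
        unpinnedCount--
      }
    }
  }
}
```

An app that creates its own archive would call `pin(key)` once, and everything that was merely browsed would still age out.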

With regards to same-origin issues, I'm planning on modifying dat-gateway to support subdomains with base32 encoded dat keys.

For example loading

dat://87ed2e3b160f261a032af03921a3bd09227d0a4cde73466c17114816cae43336

(the beakerbrowser.com dat URL) could be done with

http://11VD5OTHC3P6381ILS1P46HRQ292FK54PNJJ8PM1E4A82R5E8CPM.gateway.mauve.moe:3000

Once I have this working, you could set up a DNS wildcard that directs *.local.bunsen.com to 127.0.0.1. Then the window can load each archive through its base32 URL, which points at the local gateway.
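
As an illustration, the mapping between a dat key and a subdomain can be sketched with plain BigInt arithmetic. Treating the 64-character hex key as one large number and printing it in base 32 reproduces the 52-character label in the example URL above; a 64-character hex key would not fit in a single DNS label, which is limited to 63 characters. Whether dat-gateway ends up using exactly this encoding is an assumption, and `keyToLabel`, `labelToKey`, `gatewayUrl`, and `keyFromHost` are hypothetical helper names.

```javascript
// Convert a 64-character hex dat key to a base-32 label.
// BigInt#toString(32) uses the digits 0-9 and a-v; upper-cased to match the example URL.
function keyToLabel (hexKey) {
  return BigInt('0x' + hexKey).toString(32).toUpperCase()
}

// Convert a base-32 label back to the 64-character hex key.
function labelToKey (label) {
  let n = 0n
  for (const ch of label.toLowerCase()) {
    const digit = parseInt(ch, 32)
    if (Number.isNaN(digit)) throw new Error('Invalid base-32 character: ' + ch)
    n = n * 32n + BigInt(digit)
  }
  // Restore the leading zeros that the numeric conversion drops.
  return n.toString(16).padStart(64, '0')
}

// Hypothetical helper: build the gateway URL for a key behind a wildcard DNS entry.
function gatewayUrl (hexKey, domain, port) {
  return `http://${keyToLabel(hexKey)}.${domain}:${port}`
}

// Hypothetical helper: recover the key from a request's Host header.
function keyFromHost (host) {
  return labelToKey(host.split('.')[0])
}
```

Browsers lower-case host names, which is why `labelToKey` accepts either case.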

@rjcorwin
Contributor

I'm planning on modifying dat-gateway to support subdomains with base32 encoded dat keys.

I like the base32 idea. It makes it easy going back and forth between dat archive address and base32 encoded dat archive address. No map required.

I would suggest using something else to signal that an archive should be persisted permanently.

Agreed, there needs to be some kind of signal. Perhaps a subdomain like dat.lvh.me where you can make http calls to do this signaling or perhaps the websockets route that you are taking?

@rjcorwin
Contributor

Wait, do base32 encoded Dat Archive URLs get longer?
[Screenshot: screen shot 2018-04-11 at 5 55 59 pm]

@rjcorwin
Contributor

Would look something like...

diff --git a/index.js b/index.js
index e25d75a..4322af3 100644
--- a/index.js
+++ b/index.js
@@ -30,10 +30,9 @@ class DatGateway {
     })
     this.server = http.createServer((req, res) => {
       log('%s %s', req.method, req.url)
-      // TODO redirect /:key to /:key/
+      let address = base32.decode(req.headers.host.split('.')[0])
       let urlParts = req.url.split('/')
-      let address = urlParts[1]
-      let path = urlParts.slice(2).join('/')
+      let path = urlParts.slice(1).join('/')
       return this.resolveDat(address).then((key) => {
         return this.getDat(key)
       }).then((dat) => {
diff --git a/package.json b/package.json
index 90830fc..e9aa790 100644
--- a/package.json
+++ b/package.json
@@ -9,6 +9,7 @@
     "test": "standard && dependency-check . && mocha"
   },
   "dependencies": {
+    "base32": "0.0.6",
     "dat-link-resolve": "^2.1.0",
     "dat-node": "^3.5.6",
     "hyperdrive-http": "^4.2.2",

@RangerMauve

They should be shorter since you have twice as much information per character.

With regards to persisting data, I think that should be a Bunsen specific service and running on a separate port from the gateway.

It should provide the credential tracking that Beaker would be providing as well as tagging archives to be persisted

@RangerMauve

Yeah! Maybe fork off of my fork of dat-gateway and add the changes there.

@rjcorwin
Contributor

With regards to persisting data, I think that should be a Bunsen specific service and running on a separate port from the gateway.

Ya I could see that being a thing. That would be like self discovery? LOL it works and I never thought about doing it. If you clone a dat, share it, go offline, then in another folder clone it... It finds itself being shared on your local computer.

They should be shorter since you have twice as much information per character.

I haven't had any luck, they always end up longer :-/

Here's another approach where you hit http://dat.<some domain>/<dat archive UUID> and it redirects you to http://<shortened dat archive uuid>.<some domain>.

[Image: apr-11-2018 18-40-29]

diff --git a/index.js b/index.js
index e25d75a..00e6354 100644
--- a/index.js
+++ b/index.js
@@ -14,6 +14,8 @@ function log () {
   }
 }
 
+var map = {}
+
 module.exports =
 class DatGateway {
   constructor ({ dir, max, maxAge }) {
@@ -30,10 +32,21 @@ class DatGateway {
     })
     this.server = http.createServer((req, res) => {
       log('%s %s', req.method, req.url)
-      // TODO redirect /:key to /:key/
+      let subdomain = req.headers.host.split('.')[0]
       let urlParts = req.url.split('/')
-      let address = urlParts[1]
-      let path = urlParts.slice(2).join('/')
+      if (subdomain === 'dat') {
+        let requestedDat = urlParts[1]
+        let shortname = requestedDat.substr(0,5)
+        map[shortname] = requestedDat 
+        let protocol = (req.connection.encrypted) ? 'https' : 'http'
+        let redirectTo = `${protocol}://${req.headers.host.replace('dat', shortname)}`
+        res.writeHead(302, {
+           location: `${redirectTo}`
+        })
+        return res.end();
+      }
+      let address = map[subdomain]
+      let path = urlParts.slice(1).join('/')
       return this.resolveDat(address).then((key) => {
         return this.getDat(key)
       }).then((dat) => {
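
The routing decision in the diff above can also be sketched as a pure function, which makes two caveats easier to see: the shortname map lives only in memory, so shortnames disappear when the gateway restarts, and two archives that share their first five characters would collide. `route` and its return shape are hypothetical and not part of dat-gateway; the sketch always redirects over http for brevity.

```javascript
// Hypothetical sketch of the redirect approach as a pure function.
// `shortnames` is a Map from a short subdomain label to a full dat address.
function route (host, url, shortnames) {
  const [subdomain, ...rest] = host.split('.')
  const urlParts = url.split('/')

  if (subdomain === 'dat') {
    // Register a shortname for the requested archive and redirect to it.
    const requestedDat = urlParts[1]
    const shortname = requestedDat.slice(0, 5)
    const existing = shortnames.get(shortname)
    if (existing && existing !== requestedDat) {
      // Two different archives share the same 5-character prefix.
      return { status: 409, error: 'shortname collision' }
    }
    shortnames.set(shortname, requestedDat)
    return { status: 302, location: `http://${shortname}.${rest.join('.')}` }
  }

  // Any other subdomain is looked up as a previously registered shortname.
  const address = shortnames.get(subdomain)
  if (!address) return { status: 404, error: 'unknown shortname' }
  return { status: 200, address, path: urlParts.slice(1).join('/') }
}
```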

@rjcorwin
Contributor

@RangerMauve What direction are you thinking makes sense? Am I messing up the base32, and it actually comes in under 64 characters for a dat archive address, or does the second approach, while a bit awkward, make sense?

@RangerMauve

The problem is that the base32 library is converting between base32 and ASCII. You'll probably want to find something to convert a buffer to base32 and then use toBase32(Buffer.from(key, "hex"))
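
A small sketch of the difference, assuming a hand-rolled unpadded encoder rather than any particular library (the `toBase32` below is written for illustration, and the RFC 4648 alphabet is an arbitrary choice; any base32 alphabet gives the same lengths). Base32 carries 5 bits per character and hex carries 4, so the 32 raw bytes of a key need 52 characters, while encoding the 64 ASCII characters of the hex string needs 103.

```javascript
const ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567' // RFC 4648 base32, lower-cased

// Minimal unpadded base32 encoder for a Buffer (a sketch, not a library API).
function toBase32 (buf) {
  let bits = 0 // number of bits waiting in `value`
  let value = 0
  let out = ''
  for (const byte of buf) {
    value = (value << 8) | byte
    bits += 8
    while (bits >= 5) {
      out += ALPHABET[(value >>> (bits - 5)) & 31]
      bits -= 5
    }
    value &= (1 << bits) - 1 // keep only the bits not yet emitted
  }
  // Pad the final partial group with zero bits on the right.
  if (bits > 0) out += ALPHABET[(value << (5 - bits)) & 31]
  return out
}

const key = '87ed2e3b160f261a032af03921a3bd09227d0a4cde73466c17114816cae43336'
const fromAscii = toBase32(Buffer.from(key)) // encodes the 64 ASCII characters
const fromBytes = toBase32(Buffer.from(key, 'hex')) // encodes the 32 raw bytes
console.log(fromAscii.length, fromBytes.length) // prints 103 52
```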

@rjcorwin
Contributor

You'll probably want to find something to convert a buffer to base32 and then use toBase32(Buffer.from(key, "hex"))

Riiiight. I vaguely remember wrangling with this on an IoT project. Will give it another go. In the meantime I opened up a PR to @pfrazee's dat-gateway to get the ball rolling upstream. pfrazee/dat-gateway#6

@rjcorwin
Contributor

Moving this conversation partly over to the PR pfrazee/dat-gateway#6 (comment)

@chrisekelley
Contributor Author

@RangerMauve - I've forked your version of dat-gateway and we're now using its websocket-replication branch - I'm really excited about the possibilities you're opening up in enabling websocket support. dat-gateway is now in master and support of multiple dats is working.

"dat-gateway": "git+https://github.com/chrisekelley/dat-gateway.git#websocket-replication",

I merged @rjsteinert's PR enabling CORS support from here into my fork: pfrazee/dat-gateway#7

I'm going to close this issue now that we have multiple dat hosting enabled.

@RangerMauve

@chrisekelley Thanks! One nit in the review then we can merge it in. :D

I'm going to continue the work on DatArchiveWeb, but I think my roadmap looks like:

  • Think about persistence for dat-archive-web (particularly how to support selectArchive)
  • Get the base32 subdomain thing working
  • Work on a Chrome extension to use dat-gateway


3 participants