
Optimization: don't re-verify unchanged files #715

Merged
merged 2 commits on Apr 2, 2016

Optimization: don't re-verify unchanged files

Let the user specify known-good file modtimes. If a file's modtime is at least
that old, then the file hasn't changed and does not need to be re-verified.

This is only valid in Node when using FS backing storage, not in the browser.
dcposch committed Apr 1, 2016
commit 3756ae636c242ea9423874cdfeeb608b4ba1a507
@@ -5,12 +5,12 @@ module.exports = Torrent
var addrToIPPort = require('addr-to-ip-port')
var BitField = require('bitfield')
var ChunkStoreWriteStream = require('chunk-store-stream/write')
var cpus = require('cpus')
var debug = require('debug')('webtorrent:torrent')

feross (Member) commented on Apr 2, 2016:

This needs to be removed from package.json

var Discovery = require('torrent-discovery')
var EventEmitter = require('events').EventEmitter
var extend = require('xtend')
var extendMutable = require('xtend/mutable')
var fs = require('fs')
var FSChunkStore = require('fs-chunk-store') // browser: `memory-chunk-store`
var ImmediateChunkStore = require('immediate-chunk-store')
var inherits = require('inherits')
@@ -45,6 +45,8 @@ var PIPELINE_MAX_DURATION = 1
var RECHOKE_INTERVAL = 10000 // 10 seconds
var RECHOKE_OPTIMISTIC_DURATION = 2 // 30 seconds

var FILESYSTEM_CONCURRENCY = 2

feross (Member) commented on Apr 2, 2016:

Why 2?


var TMP = typeof pathExists.sync === 'function'
? path.join(pathExists.sync('/tmp') ? '/tmp' : os.tmpDir(), 'webtorrent')
: '/tmp/webtorrent'
@@ -99,6 +101,9 @@ function Torrent (torrentId, client, opts) {
// for cleanup
this._servers = []

// optimization: don't recheck every file if it hasn't changed
this._fileModtimes = opts.fileModtimes

if (torrentId !== null) this._onTorrentId(torrentId)
}

@@ -403,31 +408,81 @@ Torrent.prototype._onMetadata = function (metadata) {
})

self._debug('verifying existing torrent data')
parallelLimit(self.pieces.map(function (piece, index) {
if (self._fileModtimes && self._store === FSChunkStore) {
// don't verify if the files haven't been modified since we last checked
self.getFileModtimes(function (err, fileModtimes) {
if (err) return self._onError(err)

var unchanged = self.files.map(function (_, index) {
return fileModtimes[index] === self._fileModtimes[index]
}).reduce(function (a, b) {
return a && b
})

feross (Member) commented on Apr 2, 2016:

This will probably crash when there's only one file, since a default starting value was not specified for reduce.

feross (Member) commented on Apr 2, 2016:

I think arr.every() works better here.
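The suggested `every()` rewrite can be sketched with made-up modtime arrays (not the PR's actual variables). Unlike `map` + `reduce` with no initial value, which throws a TypeError on an empty array, `every()` returns `true` for an empty array and short-circuits on the first mismatch:

```javascript
// Hypothetical saved vs. current modtimes for two files
var saved = [1459468800000, 1459468900000]
var current = [1459468800000, 1459468900000]

// every() compares element-wise and stops at the first mismatch
var unchanged = current.every(function (mtime, i) {
  return mtime === saved[i]
})
console.log(unchanged) // true

// Edge case: reduce with no initial value would throw here,
// but every() is vacuously true for an empty array
console.log([].every(function () { return false })) // true
```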

if (unchanged) {
for (var index = 0; index < self.pieces.length; index++) {
self._markVerified(index)
}
self._onStore()

feross (Member) commented on Apr 2, 2016:

This could be optimized further to handle the case where only some files are modified, i.e. mark pieces that are wholly contained within unchanged files as verified, and re-verify the rest.
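The partial-reverification idea in that comment could look something like the following sketch (hypothetical numbers and helper, not code from the PR): map each file to its byte range, then keep only pieces whose bytes never overlap a changed file.

```javascript
// A piece is safe to mark verified only if no byte of it falls
// inside a file that has changed since the last check.
function verifiedPieces (fileLengths, changed, pieceLength, totalPieces) {
  // Compute each file's [start, end) byte range within the torrent
  var ranges = []
  var offset = 0
  fileLengths.forEach(function (len, i) {
    ranges.push({ start: offset, end: offset + len, changed: changed[i] })
    offset += len
  })
  var safe = []
  for (var p = 0; p < totalPieces; p++) {
    var start = p * pieceLength
    var end = Math.min(start + pieceLength, offset)
    var ok = ranges.every(function (r) {
      // an unchanged file never invalidates; a changed file
      // invalidates any piece overlapping its byte range
      return r.changed === false || end <= r.start || start >= r.end
    })
    if (ok) safe.push(p)
  }
  return safe
}

// Two files of 30 and 20 bytes, 10-byte pieces (5 pieces total).
// Only the second file changed, so pieces 3 and 4 need re-verifying.
console.log(verifiedPieces([30, 20], [false, true], 10, 5)) // [ 0, 1, 2 ]
```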

} else {
self._verifyPieces()
}
})
} else {
self._verifyPieces()
}

self.emit('metadata')
}

/*
* Gets the last modified time of every file on disk for this torrent.
* Only valid in Node, not in the browser.
*/
Torrent.prototype.getFileModtimes = function (cb) {
var self = this
var ret = []
parallelLimit(self.files.map(function (file, index) {
return function (cb) {
fs.stat(path.join(self.path, file.path), function (err, stat) {
ret[index] = stat && stat.mtime.getTime()
cb(err)
})
}
}), FILESYSTEM_CONCURRENCY, function (err) {
self._debug('done getting file modtimes')
cb(err, ret)
})
}
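The `parallelLimit(tasks, limit, cb)` shape used above caps how many `fs.stat` calls run at once (FILESYSTEM_CONCURRENCY). A minimal hand-rolled version of that pattern — a sketch, not the module webtorrent actually depends on — looks like this:

```javascript
// Run an array of task(cb) functions with at most `limit` in flight,
// collecting results in order; first error wins.
function parallelLimit (tasks, limit, cb) {
  var results = []
  var pending = tasks.length
  var next = 0
  var done = false
  if (pending === 0) return cb(null, results)
  function runNext () {
    if (done || next >= tasks.length) return
    var i = next++
    tasks[i](function (err, result) {
      if (done) return
      if (err) { done = true; return cb(err) }
      results[i] = result
      pending -= 1
      if (pending === 0) { done = true; return cb(null, results) }
      runNext() // start the next queued task
    })
  }
  for (var k = 0; k < Math.min(limit, tasks.length); k++) runNext()
}

var out = null
var tasks = [1, 2, 3].map(function (n) {
  return function (cb) { cb(null, n * 10) } // synchronous for the demo
})
parallelLimit(tasks, 2, function (err, results) { out = results })
console.log(out) // [ 10, 20, 30 ]
```

With real async tasks (like `fs.stat`), the limit keeps only two stats outstanding at a time regardless of how many files the torrent has.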

Torrent.prototype._verifyPieces = function () {
var self = this
parallelLimit(self.pieces.map(function (_, index) {
return function (cb) {
self.store.get(index, function (err, buf) {
if (err) return cb(null) // ignore error
sha1(buf, function (hash) {
if (hash === self._hashes[index]) {
if (!self.pieces[index]) return
self._debug('piece verified %s', index)
self.pieces[index] = null
self._reservations[index] = null
self.bitfield.set(index, true)
self._markVerified(index)
} else {
self._debug('piece invalid %s', index)
}
cb(null)
})
})
}
}), cpus().length, function (err) {
}), FILESYSTEM_CONCURRENCY, function (err) {
if (err) return self._onError(err)
self._debug('done verifying')
self._onStore()
})
}

self.emit('metadata')
Torrent.prototype._markVerified = function (index) {
this.pieces[index] = null
this._reservations[index] = null
this.bitfield.set(index, true)
}
