Commit

ds
shinnn committed May 21, 2018
1 parent ee2ec4f commit 3b4c47c
Showing 5 changed files with 1,353 additions and 1,505 deletions.
59 changes: 19 additions & 40 deletions README.md
@@ -22,11 +22,9 @@ const url = 'https://****.org/my-archive.tar';
 
 dlTar(url, 'my/dir').subscribe({
 	next({entry}) {
-		if (entry.bytes !== entry.header.size) {
-			return;
+		if (entry.remain === 0) {
+			console.log(`${entry.header.name}`);
 		}
-
-		console.log(`${entry.header.name}`);
 	},
 	complete() {
 		readdirSync('my/dir'); //=> ['INSTALL', 'LICENSE', 'README.md', 'bin']
@@ -46,11 +44,9 @@ dlTar(url, 'my/dir').subscribe({
 Completed.
 ```
 
-For [gzipped](https://tools.ietf.org/html/rfc1952) tar (`tar.gz`), use [`dl-tgz`](https://github.com/shinnn/dl-tgz) instead.
-
 ## Installation
 
-[Use npm.](https://docs.npmjs.com/cli/install)
+[Use](https://docs.npmjs.com/cli/install) [npm](https://docs.npmjs.com/getting-started/what-is-npm).
 
 ```
 npm install dl-tar
@@ -64,35 +60,37 @@ const dlTar = require('dl-tar');
 
 ### dlTar(*tarArchiveUrl*, *extractDir* [, *options*])
 
-*tarArchiveUrl*: `String`
-*extractDir*: `String` (a path where the archive will be extracted)
+*tarArchiveUrl*: `string`
+*extractDir*: `string` (a path where the archive will be extracted)
 *options*: `Object`
 Return: [`Observable`](https://tc39.github.io/proposal-observable/#observable) ([zenparsing's implementation](https://github.com/zenparsing/zen-observable))
 
 When the `Observable` is [subscribe](https://tc39.github.io/proposal-observable/#observable-prototype-subscribe)d, it starts to download the tar archive, extract it, and successively send extraction progress to its [`Observer`](https://github.com/tc39/proposal-observable#observer).
 
 When the [`Subscription`](https://tc39.github.io/proposal-observable/#subscription-objects) is [unsubscribe](https://tc39.github.io/proposal-observable/#subscription-prototype-unsubscribe)d, it stops downloading and extracting.
 
+It automatically unzips gzipped archives.

 #### Progress
 
 Every progress object has two properties, `entry` and `response`.
 
 ##### entry
 
-Type: `Object {bytes: <number>, header: <Object>}`
+Type: [`tar.ReadEntry`](https://github.com/npm/node-tar#class-tarreadentry-extends-minipass)
 
-`entry.header` is [a header of the entry](https://github.com/mafintosh/tar-stream#headers), and `entry.bytes` is the total size of the currently extracted entry. `bytes` is always `0` if the entry is not a file but a directory, link or symlink.
+An instance of [node-tar](https://github.com/npm/node-tar)'s [`ReadEntry`](https://github.com/npm/node-tar/blob/v4.4.2/lib/read-entry.js) object.
 
-For example, you can get the progress of each entry as a percentage with `(progress.entry.bytes / progress.entry.header.size || 0) * 100`.
+For example, you can get the progress of each entry as a percentage with `100 - progress.entry.remain / progress.entry.size * 100`.
 
 ```javascript
 dlTar('https://****.org/my-archive.tar', 'my/dir')
-.filter(progress => progress.entry.header.type === 'file')
+.filter(progress => progress.entry.type === 'File')
 .subscribe(progress => {
-	console.log(`${(progress.entry.bytes / progress.entry.header.size * 100).toFixed(1)} %`);
+	console.log(`${(100 - progress.entry.remain / progress.entry.size * 100).toFixed(1)} %`);
 
-	if (progress.entry.bytes === progress.entry.header.size) {
-		console.log(`>> OK ${progress.entry.header.name}`);
+	if (progress.entry.remain === 0) {
+		console.log(`>> OK ${progress.entry.header.path}`);
 	}
 });
 ```
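As a self-contained check of the percentage formula above, with made-up sizes (a 2048-byte entry with 512 bytes still remaining is 75 % extracted):

```javascript
// Hypothetical values standing in for entry.size and entry.remain.
const size = 2048;   // total bytes of the entry
const remain = 512;  // bytes not yet extracted

const percent = 100 - remain / size * 100;
console.log(`${percent.toFixed(1)} %`); // 75.0 %
```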
@@ -119,34 +117,15 @@ dlTar('https://****.org/my-archive.tar', 'my/dir')
 
 Type: `Object {bytes: <number>, headers: <Object>, url: <string>}`
 
-`response.url` is the final redirected URL of the request, `response.headers` is a [response header object](https://nodejs.org/api/http.html#http_message_headers) derived from [`http.IncomingMessage`](https://nodejs.org/api/http.html#http_class_http_incomingmessage), and `response.bytes` is the total content length of the downloaded archive. The `content-length` header will be converted to a `Number` if it is a `String`.
+`response.url` is the final redirected URL of the request, `response.headers` is a [response header object](https://nodejs.org/api/http.html#http_message_headers) derived from [`http.IncomingMessage`](https://nodejs.org/api/http.html#http_class_http_incomingmessage), and `response.bytes` is the total content length of the downloaded archive. The `content-length` header will be converted to a `number` if it's a `string`.
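A quick illustration with hypothetical values (the URL and all numbers are invented) showing how overall download progress can be derived from such a `response` object:

```javascript
// A made-up progress `response`, shaped like the object described above.
const response = {
	url: 'https://example.org/my-archive.tar',  // final redirected URL
	headers: {'content-length': 1048576},       // already converted to a number
	bytes: 262144                               // bytes downloaded so far
};

const downloaded = response.bytes / response.headers['content-length'] * 100;
console.log(`${downloaded.toFixed(0)} % downloaded`); // 25 % downloaded
```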

 #### Options
 
-You can pass options to [Request](https://github.com/request/request#requestoptions-callback) and [tar-fs](https://github.com/mafintosh/tar-fs)'s [`extract` method](https://github.com/mafintosh/tar-fs/blob/12968d9f650b07b418d348897cd922e2b27ec18c/index.js#L167). Note that:
-
-* [`ignore` option](https://github.com/mafintosh/tar-fs/blob/b79d82a79c5e21f6187462d7daaba1fc03cdd1de/index.js#L236) is applied before [`map` option](https://github.com/mafintosh/tar-fs/blob/b79d82a79c5e21f6187462d7daaba1fc03cdd1de/index.js#L232) modifies filenames.
-* [`strip` option](https://github.com/mafintosh/tar-fs/blob/12968d9f650b07b418d348897cd922e2b27ec18c/index.js#L47) defaults to `1`, not `0`. That means the top level directory is stripped off by default.
-* [`fs`](https://github.com/mafintosh/tar-fs/blob/e59deed830fded0e4e5beb016d2df9c7054bb544/index.js#L65) option defaults to [graceful-fs](https://github.com/isaacs/node-graceful-fs) for more stability.
-
-Additionally, you can use the following:
-
-##### tarTransform
-
-Type: [`Stream`](https://nodejs.org/api/stream.html#stream_stream)
+You can pass options to [Request](https://github.com/request/request#requestoptions-callback) and [node-tar](https://www.npmjs.com/package/tar)'s [`Unpack` constructor](https://github.com/npm/node-tar#class-tarunpack). Note that:
 
-A [transform stream](https://nodejs.org/api/stream.html#stream_class_stream_transform) to modify the archive before extraction.
-
-For example, pass [gunzip-maybe](https://github.com/mafintosh/gunzip-maybe) to this option and you can download both [gzipped](https://tools.ietf.org/html/rfc1952) and non-gzipped tar.
-
-```javascript
-const dlTar = require('dl-tar');
-const gunzipMaybe = require('gunzip-maybe');
-
-const observable = dlTar('https://github.com/nodejs/node/archive/master.tar.gz', './', {
-	tarTransform: gunzipMaybe()
-});
-```
+* `onentry` option is not supported.
+* `strict` option defaults to `true`, not `false`.
+* `strip` option defaults to `1`, not `0`. That means the top level directory is stripped off by default.
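A rough sketch of what the `strip` option does to entry paths; this is an illustration of the behavior, not node-tar's actual implementation.

```javascript
// Drop the first `count` path components from an entry path, the way
// node-tar's `strip` option does conceptually.
function strip(entryPath, count) {
	return entryPath.split('/').slice(count).join('/');
}

// With the default `strip: 1`, the top level directory disappears:
console.log(strip('my-archive-master/bin/cli.js', 1)); // bin/cli.js
console.log(strip('my-archive-master/bin/cli.js', 0)); // my-archive-master/bin/cli.js
```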

## License

214 changes: 75 additions & 139 deletions index.js
@@ -1,88 +1,83 @@
 'use strict';
 
 const {inspect} = require('util');
-const {join} = require('path');
-const streamLib = require('stream');
-
-const Transform = streamLib.Transform;
-const PassThrough = streamLib.PassThrough;
+const {Transform} = require('stream');
 
 const cancelablePump = require('cancelable-pump');
-const Extract = require('tar-stream').extract;
-const fsExtract = require('tar-fs').extract;
-const gracefulFs = require('graceful-fs');
+const {Unpack} = require('tar');
 const inspectWithKind = require('inspect-with-kind');
 const isPlainObj = require('is-plain-obj');
-const isStream = require('is-stream');
 const loadRequestFromCwdOrNpm = require('load-request-from-cwd-or-npm');
+const mkdirp = require('mkdirp');
 const Observable = require('zen-observable');
 
-class DestroyableTransform extends Transform {
-	destroy() {
-		super.unpipe();
-	}
-}
-
-class InternalExtract extends Extract {
+class InternalUnpack extends Unpack {
 	constructor(options) {
-		super();
+		super(Object.assign({strict: true, strip: 1}, options, {
+			onentry(entry) {
+				if (entry.size === 0) {
+					setImmediate(() => this.emitProgress(entry));
+					return;
+				}
+
+				if (entry.remain === 0) {
+					setImmediate(() => {
+						this.emitFirstProgress(entry);
+						this.emitProgress(entry);
+					});
+					return;
+				}
+
+				const originalWrite = entry.write.bind(entry);
+				let firstValueEmitted = false;
+
+				entry.write = data => {
+					const originalReturn = originalWrite(data);
+
+					if (!firstValueEmitted) {
+						firstValueEmitted = true;
+						this.emitFirstProgress(entry);
+					}
+
+					this.emitProgress(entry);
+					return originalReturn;
+				};
+			}
+		}));
 
-		this.cwd = options.cwd;
-		this.ignore = options.ignore;
 		this.observer = options.observer;
 		this.url = '';
 		this.responseHeaders = null;
 		this.responseBytes = 0;
 	}
 
-	emit(eventName, header, stream, originalNext) {
-		if (eventName !== 'entry') {
-			super.emit(eventName, header);
-			return;
-		}
-
-		if (this.ignore && this.ignore(join(this.cwd, join('/', header.name)), header)) {
-			stream.resume();
-			originalNext();
-
-			return;
-		}
-
-		super.emit('entry', header, stream, err => {
-			if (err) {
-				originalNext(err);
-				return;
-			}
-
-			this.observer.next({
-				entry: {
-					header,
-					bytes: header.size
-				},
-				response: {
-					url: this.url,
-					headers: this.responseHeaders,
-					bytes: this.responseBytes
-				}
-			});
-
-			originalNext();
-		});
+	emitProgress(entry) {
+		this.observer.next({
+			entry,
+			response: {
+				url: this.url,
+				headers: this.responseHeaders,
+				bytes: this.responseBytes
+			}
+		});
+	}
+
+	emitFirstProgress(entry) {
+		const originalRemain = entry.remain;
+		const originalBlockRemain = entry.blockRemain;
+
+		entry.remain = entry.size;
+		entry.blockRemain = entry.startBlockSize;
+		this.emitProgress(entry);
+		entry.remain = originalRemain;
+		entry.blockRemain = originalBlockRemain;
+	}
 }
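The `onentry` handler above reports per-chunk progress by wrapping each entry's `write` method. The same pattern in isolation, using a plain stand-in object rather than a real `tar.ReadEntry`:

```javascript
// Wrap an object's `write` method so every written chunk also notifies an
// observer callback. `entry` here is a stand-in, not a real tar entry.
function observeWrites(entry, onProgress) {
	const originalWrite = entry.write.bind(entry);

	entry.write = data => {
		const originalReturn = originalWrite(data);
		onProgress(entry);
		return originalReturn;
	};
}

const entry = {
	written: 0,
	write(data) {
		this.written += data.length;
		return true;
	}
};

const progress = [];
observeWrites(entry, ({written}) => progress.push(written));

entry.write('abc');
entry.write('de');
console.log(progress); // [ 3, 5 ]
```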

-const functionOptions = new Set(['ignore', 'map', 'mapStream']);
+const functionOptions = new Set(['filter', 'onwarn', 'transform']);
 const priorRequestOption = {encoding: null};
-const priorTarFsOption = {ignore: null};
-
-function echo(val) {
-	return val;
-}
 
 const DEST_ERROR = 'Expected a path where downloaded tar archive will be extracted';
-const TAR_TRANSFORM_ERROR = '`tarTransform` option must be a transform stream ' +
-'that modifies the downloaded tar archive before extracting';
-const MAP_STREAM_ERROR = 'The function passed to `mapStream` option must return a stream';
 const STRIP_ERROR = 'Expected `strip` option to be a non-negative integer (0, 1, ...) ' +
 'that specifies how many leading components from file names will be stripped';

@@ -163,91 +158,36 @@ module.exports = function dlTar(...args) {
 		}
 	}
 
-	if (options.tarTransform !== undefined) {
-		if (!isStream(options.tarTransform)) {
-			throw new TypeError(`${TAR_TRANSFORM_ERROR}, but got a non-stream value ${inspect(options.tarTransform)}.`);
-		}
-
-		if (!isStream.transform(options.tarTransform)) {
-			throw new TypeError(`${TAR_TRANSFORM_ERROR}, but got a ${
-				['duplex', 'writable', 'readable'].find(type => isStream[type](options.tarTransform))
-			} stream instead.`);
-		}
+	if (options.onentry !== undefined) {
+		throw new Error('`dl-tar` does not support `onentry` option.');
 	}
 
-	const extractStream = new InternalExtract({
-		cwd: dest,
-		ignore: options.ignore,
-		observer
-	});
-	const mapStream = options.mapStream || echo;
-	const fileStreams = [];
 	let ended = false;
 	let cancel;
 
-	const fsExtractStream = fsExtract(dest, Object.assign({
-		extract: extractStream,
-		fs: gracefulFs,
-		strip: 1
-	}, options, {
-		mapStream(fileStream, header) {
-			const newStream = mapStream(fileStream, header);
-
-			if (!isStream.readable(newStream)) {
-				fsExtractStream.emit(
-					'error',
-					new TypeError(`${MAP_STREAM_ERROR}${
-						isStream(newStream) ?
-						' that is readable, but returned a non-readable stream' :
-						`, but returned a non-stream value ${inspect(newStream)}`
-					}.`)
-				);
-
-				fileStreams.push(fileStream);
-				return new PassThrough();
-			}
-
-			let bytes = 0;
-			fileStreams.push(newStream);
-
-			if (header.size !== 0) {
-				observer.next({
-					entry: {header, bytes},
-					response: {
-						url: extractStream.url,
-						headers: extractStream.responseHeaders,
-						bytes: extractStream.responseBytes
-					}
-				});
-			}
-
-			return newStream.pipe(new DestroyableTransform({
-				transform(chunk, encoding, cb) {
-					bytes += chunk.length;
-
-					if (bytes !== header.size) {
-						observer.next({
-							entry: {header, bytes},
-							response: {
-								url: extractStream.url,
-								headers: extractStream.responseHeaders,
-								bytes: extractStream.responseBytes
-							}
-						});
-					}
-
-					cb(null, chunk);
-				}
-			}));
-		}
-	}, priorTarFsOption));
-
-	loadRequestFromCwdOrNpm().then(request => {
+	Promise.all([
+		loadRequestFromCwdOrNpm(),
+		new Promise((resolve, reject) => {
+			mkdirp(dest, err => {
+				if (err) {
+					reject(err);
+					return;
+				}
+
+				resolve();
+			});
+		})
+	]).then(([request]) => {
 		if (ended) {
 			return;
 		}
 
+		const unpackStream = new InternalUnpack(Object.assign(options, {
+			cwd: dest,
+			observer
+		}));
+
 		const pipe = [
 			request(Object.assign({url}, options, priorRequestOption))
 			.on('response', function(response) {
@@ -260,22 +200,18 @@ module.exports = function dlTar(...args) {
 					response.headers['content-length'] = Number(response.headers['content-length']);
 				}
 
-				extractStream.url = response.request.uri.href;
-				extractStream.responseHeaders = response.headers;
+				unpackStream.url = response.request.uri.href;
+				unpackStream.responseHeaders = response.headers;
 			}),
 			new Transform({
 				transform(chunk, encoding, cb) {
-					extractStream.responseBytes += chunk.length;
+					unpackStream.responseBytes += chunk.length;
 					cb(null, chunk);
 				}
 			}),
-			fsExtractStream
+			unpackStream
 		];
 
-		if (options.tarTransform) {
-			pipe.splice(2, 0, options.tarTransform);
-		}
-
 		cancel = cancelablePump(pipe, err => {
 			ended = true;
