charto/cget

cget

cget is a robust streaming parallel download manager with a filesystem cache and a simple API.

Features

  • Promise-based API, returns HTTP headers and a Node.js stream with contents.
  • Filesystem cache mirrors remote hosts and their directory structure.
    • Easy to bypass cget and look at cached files.
  • Stores headers in separate .header.json files.
  • Caches HTTP errors to avoid repeating failing requests.
  • Limits concurrent downloads automatically using cwait.
  • Follows and caches redirect headers.
  • Built on top of request.
  • Optionally allows streaming from file:// URLs, bypassing the cache.
  • Can add arbitrary files to the cache with any URI (URL or URN) as the key.
  • Written in TypeScript.
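
To illustrate what limiting concurrent downloads means in practice, here is a minimal sketch in plain JavaScript. It is not cwait's API and not how cget is implemented; runLimited is a hypothetical helper that takes an array of promise-returning functions and keeps at most `limit` of them running at once.

```javascript
// Hypothetical helper (not part of cget or cwait): run promise-returning
// tasks with at most `limit` of them in flight at any time.
function runLimited(tasks, limit) {
  var results = new Array(tasks.length);
  var next = 0;

  // Each worker takes the next unclaimed task, waits for it to finish,
  // then moves on to the following one until none are left.
  function worker() {
    if (next >= tasks.length) return Promise.resolve();
    var index = next++;

    return Promise.resolve()
      .then(function() { return tasks[index](); })
      .then(function(value) {
        results[index] = value;
        return worker();
      });
  }

  // Start no more workers than there are tasks.
  var workers = [];
  for (var i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }

  // Resolve with the results in the same order as the input tasks.
  return Promise.all(workers).then(function() { return results; });
}
```

With a limit of 2, this behaves roughly like the `concurrency: 2` option: a third task only starts after one of the first two has finished.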

cget is well suited to downloading and caching schema files, and is used in cxsd.

Usage

Cached downloads

var Cache = require('cget').Cache;

// Store files in "cache" subdirectory next to this script.
var basePath = require('path').join(__dirname, 'cache');

// Initialize the download cache.
var cache = new Cache(basePath, {

  // Allow up to 2 parallel downloads.
  concurrency: 2

});

// Download a web page and print some info.

cache.fetch('http://www.google.com/').then(function(result) {

  console.log('Remote address:   ' + result.address.url);
  console.log('Local cache path: ' + result.address.path);
  console.log('HTTP status code: ' + result.status + ' ' + result.message);

  console.log('Headers:');
  console.log(result.headers);

  console.log('Content:');
  result.stream.pipe(process.stdout);

});

The first run prints the downloaded file and saves it, together with its headers and any redirects, in local files such as:

  • cache/www.google.com.header.json
  • cache/www.google.<COUNTRY>/<NONCE>
  • cache/www.google.<COUNTRY>/<NONCE>.header.json

The second time it prints the exact same output, but without needing a network connection.
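
Because the cache is an ordinary directory tree, cached files can also be inspected without cget. The sketch below is inferred only from the example file names above and may not match cget's actual rules (ports, query strings and URNs are ignored). guessCachePath, guessHeaderPath and readCachedHeaders are hypothetical helpers, not part of the cget API.

```javascript
var fs = require('fs');
var path = require('path');
var url = require('url');

// Assumption: the cache path is the base path, then the host name, then the
// URL path segments. This is a guess based on the file names listed above.
function guessCachePath(basePath, remoteUrl) {
  var parsed = new url.URL(remoteUrl);
  var parts = parsed.pathname.split('/').filter(function(part) {
    return part.length > 0;
  });

  return path.join.apply(path, [basePath, parsed.hostname].concat(parts));
}

// Headers appear to live next to the content, with a .header.json suffix.
function guessHeaderPath(cachePath) {
  return cachePath + '.header.json';
}

// Read and parse a stored header file directly from disk.
// The layout of the parsed object is not specified here.
function readCachedHeaders(cachePath) {
  return JSON.parse(fs.readFileSync(guessHeaderPath(cachePath), 'utf8'));
}
```

For example, `guessHeaderPath(guessCachePath('cache', 'http://www.google.com/'))` yields the first file name in the list above on a POSIX system.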

Caching arbitrary files

The store method supports caching a string with any URI (URL or URN) as the key:

var cache = new (require('cget').Cache)();

cache.store('urn:x-inspire:specification:gmlas:GeographicalNames:3.0', 'Some data');

cache.store('http://inspire.ec.europa.eu/schemas/ad/4.0', 'More data');

License

The MIT License

Copyright (c) 2015-2017 BusFaster Ltd