Add Loki Zip Adapter #758

Closed
jlarmstrongiv opened this issue Apr 8, 2019 · 4 comments

jlarmstrongiv commented Apr 8, 2019

For those times when database size is a concern.

Sample Adapter

Initialize Database

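    // load LokiJS and the adapter (the require path assumes the file below sits next to this script)
    const loki = require('lokijs');
    const lfsza = require('./loki-fs-zip-adapter');

    // initialize adapter
    const adapter = new lfsza();
    // create database
    const db = new loki('sandbox.db', {
      adapter,
      autoload: true,
      autoloadCallback: () => {  },
    });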

loki-fs-zip-adapter.js

/*
  Loki (node) fs zip Adapter (require this script to instantiate and use it).

  This adapter will save database container and each collection to separate files inside a zip.
  It is also designed to use a destructured serialization method
  intended to lower the memory overhead of json serialization.

  This adapter utilizes ES6 generator/iterator functionality to stream output and
  uses node's readline module to stream input.  This should lower memory pressure
  in addition to individual object serializations rather than loki's default deep object
  serialization.
*/

(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    // AMD
    define([], factory);
  } else if (typeof exports === 'object') {
    // Node, CommonJS-like
    module.exports = factory();
  } else {
    // Browser globals (root is window)
    root.LokiFsZipAdapter = factory();
  }
}(this, function () {
  return (function() {

    const fs = require('fs');
    const readline = require('readline');
    const stream = require('stream');

    const path = require('path');
    const yazl = require('yazl');
    const yauzl = require('yauzl');
    const from2 = require('from2');

    /**
     * Loki structured (node) filesystem adapter class.
     *     This class fulfills the loki 'reference' abstract adapter interface which can be applied to other storage methods.
     *
     * @constructor LokiFsZipAdapter
     *
     */
    function LokiFsZipAdapter()
    {
      this.mode = 'reference';
      this.dbref = null;
      this.dirtyPartitions = [];
    }

    /**
     * Generator for constructing lines for file streaming output of db container or collection.
     *
     * @param {object=} options - output format options for use externally to loki
     * @param {int=} options.partition - can be used to only output an individual collection or db (-1)
     *
     * @returns {string|array} A custom, restructured aggregation of independent serializations.
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.generateDestructured = function*(options) {
      var idx, sidx;
      var dbcopy;

      options = options || {};

      if (!options.hasOwnProperty('partition')) {
        options.partition = -1;
      }

      // if partition is -1 we will return database container with no data
      if (options.partition === -1) {
        // instantiate lightweight clone and remove its collection data
        dbcopy = this.dbref.copy();

        for (idx = 0; idx < dbcopy.collections.length; idx++) {
          dbcopy.collections[idx].data = [];
        }

        yield dbcopy.serialize({ serializationMethod: 'normal', });

        return;
      }

      // 'partitioned' along with 'partition' of 0 or greater is a request for single collection serialization
      if (options.partition >= 0) {
        var doccount,
          docidx;

        // dbref collections have all data so work against that
        doccount = this.dbref.collections[options.partition].data.length;

        for (docidx = 0; docidx < doccount; docidx++) {
          yield JSON.stringify(this.dbref.collections[options.partition].data[docidx]);
        }
      }
    };

    /**
     * Loki persistence adapter interface function which outputs un-prototype db object reference to load from.
     *
     * @param {string} dbname - the name of the database to retrieve.
     * @param {function} callback - callback should accept string param containing db object reference.
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.loadDatabase = function(dbname, callback) {
      var self = this;
      this.dbref = null;

      // ensure filename has zip file extension
      var filename = dbname;
      if (dbname.split('.').pop() !== 'zip') {
        filename = dbname + '.zip';
      }

      fs.stat(filename, function(err, stats) {
        if (!err && stats.isFile()) {

          yauzl.open(filename, {
            lazyEntries: true,
            autoClose: false,
          }, function(error, zipFile) {
            if (error) throw new Error(error);

            var entries = [];

            zipFile.readEntry();

            zipFile.on('entry', function(entry) {
              if (/\/$/.test(entry.fileName)) {
              // Directory file names end with '/'.
              // Note that entries for directories themselves are optional.
              // An entry's fileName implicitly requires its parent directories to exist.
                zipFile.readEntry();
              } else {
                entries.push(entry);
                zipFile.readEntry();
              }
            });

            zipFile.on('end', function() {
              entries.sort(function(entryA, entryB) {
                // check and sort db container first
                if (isNaN(entryA.fileName.split('.').pop())) return -1;
                if (isNaN(entryB.fileName.split('.').pop())) return 1;
                // sort by collection number
                if (
                  +entryA.fileName.split('.').pop() <
                  +entryB.fileName.split('.').pop()
                ) {
                  return -1;
                } else {
                  return 1;
                }
              });

              // remove .zip from filename
              filename = filename.split('.').slice(0, -1).join('.');

              self.loadDatabaseFromEntries(filename, zipFile, entries, function() {
                // callback function
                zipFile.close();
                callback(self.dbref);
              });

            });
          });
        } else {
          // file does not exist, so callback with null
          callback(null);
        }
      });
    };

    /**
     * Utility method to load database from zip entries.
     *
     * @param {string} dbname - the name of the database to retrieve.
     * @param {ZipFile} zipFile - the zipFile to load the entries from
     * @param {Entry[]} entries - ordered file entries from the zip file
     * @param {function} callback - callback should accept string param containing db object reference.
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.loadDatabaseFromEntries = function(dbname, zipFile, entries, callback)
    {
      var instream,
        outstream,
        rl,
        self = this;

      zipFile.openReadStream(entries.shift(), function(error, readStream) {
        if (error) throw new Error(error);

        instream = readStream;
        outstream = new stream();
        rl = readline.createInterface(instream, outstream);

        // first, load db container component
        rl.on('line', function(line) {
          // it should be a single JSON object (a one line file)
          if (self.dbref === null && line !== '') {
            self.dbref = JSON.parse(line);
          }
        });

        // when that is done, examine its collection array to sequence loading each
        rl.on('close', function() {
          if (self.dbref.collections.length > 0) {
            self.loadNextCollection(dbname, zipFile, entries, 0, callback);
          } else {
            // no collections to load, so finish immediately
            callback();
          }
        });

      });
    };

    /**
     * Recursive function to chain loading of each collection one at a time.
     * If at some point i can determine how to make async driven generator, this may be converted to generator.
     *
     * @param {string} dbname - the name to give the serialized database within the catalog.
     * @param {ZipFile} zipFile - the zipFile to load the entries from
     * @param {Entry[]} entries - ordered file entries from the zip file
     * @param {int} collectionIndex - the ordinal position of the collection to load.
     * @param {function} callback - callback to pass to next invocation or to call when done
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.loadNextCollection = function(dbname, zipFile, entries, collectionIndex, callback) {
      var instream,
        outstream,
        rl,
        self = this,
        obj;

      zipFile.openReadStream(entries[collectionIndex], function(error, readStream) {
        if (error) throw new Error(error);

        instream = readStream;
        outstream = new stream();
        rl = readline.createInterface(instream, outstream);

        rl.on('line', function (line) {
          if (line !== '') {
            obj = JSON.parse(line);
            self.dbref.collections[collectionIndex].data.push(obj);
          }
        });

        rl.on('close', function (line) {
          instream = null;
          outstream = null;
          rl = null;
          obj = null;

          // if there are more collections, load the next one
          if (++collectionIndex < self.dbref.collections.length) {
            self.loadNextCollection(dbname, zipFile, entries, collectionIndex, callback);
          }
          // otherwise we are done, callback to loadDatabase so it can return the new db object representation.
          else {
            callback();
          }
        });

      });
    };

    /**
     * Generator for yielding sequence of dirty partition indices to iterate.
     *
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.getPartition = function*() {
      var idx,
        clen = this.dbref.collections.length;

      // since database container (partition -1) doesn't have dirty flag at db level, always save
      yield -1;

      // yield list of dirty partitions for iteration
      for (idx = 0; idx < clen; idx++) {
        // Disable if(dirty) since all files need to be saved to zip
        // if (this.dbref.collections[idx].dirty) {
        //   yield idx;
        // }
        yield idx;
      }
    };

    /**
     * Loki reference adapter interface function.  Saves structured json via loki database object reference.
     *
     * @param {string} dbname - the name to give the serialized database within the catalog.
     * @param {object} dbref - the loki database object reference to save.
     * @param {function} callback - callback passed obj.success with true or false
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.exportDatabase = function(dbname, dbref, callback)
    {
      var idx;

      this.dbref = dbref;

      // create (dirty) partition generator/iterator
      var pi = this.getPartition();

      var zipFile = new yazl.ZipFile();

      // ensure filename has .zip extension
      var filename = dbname;
      if (dbname.split('.').pop() !== 'zip') {
        filename = dbname + '.zip';
      }

      zipFile.outputStream.pipe(fs.createWriteStream(filename)).on('close', function() {
        callback(null);
      });

      this.saveNextPartition(dbname, zipFile, pi, function() {
        zipFile.end();
      });

    };

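    /**
     * Utility method for queueing one save at a time.
     *
     * @param {string} dbname - the name of the database being saved
     * @param {ZipFile} zipFile - the zip to contain the database files
     * @param {Iterator} pi - partition iterator returned by getPartition()
     * @param {function} callback - called once every partition has been queued
     * @memberof LokiFsZipAdapter
     */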
    LokiFsZipAdapter.prototype.saveNextPartition = function(dbname, zipFile, pi, callback) {
      var li;
      // default filename to dbname
      var filename = dbname;
      var self = this;
      var pinext = pi.next();

      if (pinext.done) {
        callback();
        return;
      }

      // ensure packaged db files do not have the .zip extension
      if (dbname.split('.').pop() === 'zip') {
        filename = dbname.split('.').slice(0, -1).join('.');
      }
      // db container (partition -1) uses just dbname (without zip extension) for filename,
      // otherwise append collection array index to filename
      filename = filename + ((pinext.value === -1) ? '' : ('.' + pinext.value));
      // remove folders from metadataPath
      filename = path.parse(filename).base;

      var createRstream = function(li) {
        return from2({ objectMode: true, }, function(size, next) {
          // iterate each of the lines generated by function* generateDestructured()
          var outline = li.next();
          if (!outline.done) {
            return next(null, outline.value + '\n');
          }
          return next(null, null);
        });
      };

      li = this.generateDestructured({ partition: pinext.value, });

      var rstream = createRstream(li);

      zipFile.addReadStream(rstream, filename);

      rstream.on('end', function() {
        self.saveNextPartition(dbname, zipFile, pi, callback);
      });
    };

    /**
     * deleteDatabase() - delete the database file, will throw an error if the
     * file can't be deleted
     * @param {string} dbname - the filename of the database to delete
     * @param {function} callback - the callback to handle the result
     * @memberof LokiFsZipAdapter
     */
    LokiFsZipAdapter.prototype.deleteDatabase = function deleteDatabase(dbname, callback) {
      // ensure filename has zip file extension
      var filename = dbname;
      if (dbname.split('.').pop() !== 'zip') {
        filename = dbname + '.zip';
      }

      fs.unlink(filename, function(error) {
        if (error) return callback(error);
        callback();
      });
    };

    return LokiFsZipAdapter;

  }());
}));
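
For readers skimming the adapter, here is a minimal sketch of the line-per-document layout that each collection file inside the zip ends up with. It is not the adapter itself: `docs` and `serializeDocs` are hypothetical names, and the sketch builds the whole file text in memory, whereas the adapter streams it.

```javascript
// Hypothetical stand-in for a Loki collection's data array.
const docs = [
  { name: 'odin', age: 50 },
  { name: 'thor', age: 35 },
];

// Yield one serialized document at a time, similar to what
// generateDestructured() does for a collection partition.
function* serializeDocs(data) {
  for (const doc of data) {
    yield JSON.stringify(doc);
  }
}

// Each yielded line is terminated with '\n', one document per line.
let fileText = '';
for (const line of serializeDocs(docs)) {
  fileText += line + '\n';
}

// Loading reverses the process: split into lines, skip blanks, parse each one.
const restored = fileText
  .split('\n')
  .filter((line) => line !== '')
  .map((line) => JSON.parse(line));

console.log(restored.length);
```

Because every document is serialized and parsed on its own, no single JSON string ever has to hold a whole collection, which is the memory saving the header comment describes.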

EDIT
Node 10 is adding a core readline module. Require readline from linebyline.
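
Returning to the `entries.sort` step in `loadDatabase`: loading depends on the database container entry coming first, followed by the collection files in index order. A standalone sketch of that ordering, assuming entry names follow the `name`, `name.0`, `name.1` pattern the adapter writes. `partitionIndex` and `compareEntryNames` are hypothetical helpers, not part of the adapter.

```javascript
// Hypothetical helper: -1 for the db container file, otherwise the collection index.
function partitionIndex(fileName) {
  const suffix = fileName.split('.').pop();
  // Only an all-digit suffix counts as a collection index.
  return /^\d+$/.test(suffix) ? Number(suffix) : -1;
}

// Comparator that sorts the container first, then collections numerically.
function compareEntryNames(a, b) {
  return partitionIndex(a) - partitionIndex(b);
}

const names = ['sandbox.db.10', 'sandbox.db.2', 'sandbox.db', 'sandbox.db.0'];
const sorted = names.slice().sort(compareEntryNames);

console.log(sorted);
```

Comparing the numeric suffixes rather than the raw strings keeps `sandbox.db.10` after `sandbox.db.2`. A database whose own name ends in a numeric extension would be mistaken for a collection file under this assumption.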

@jlarmstrongiv
Author

Just a few follow-up questions.

  • Does this adapter have the potential to be added to the core?

I see two main practical applications for this adapter. First, on low memory devices, storing the database in a zip file helps save space. Second, on AWS Lambdas and other serverless technologies, this adapter is perfect for read-only databases offering flexibility, negligible data transfer costs, and, of course, saving valuable space.

  • How can I make this adapter compatible with LokiDB?

I’m still using LokiJS while LokiDB is in beta. However, LokiDB is written in TypeScript with several upgrades and structural changes.

I would go looking for the fs-structured adapter in LokiDB, but I either could not find it or it has not been ported. If the fs-structured adapter is in the backlog, I can just wait and use that as a sample. If not, I may need additional help or documentation.

@stale

stale bot commented Jul 26, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 26, 2019
@jlarmstrongiv
Author

Not stale, bump.

@stale stale bot removed the wontfix label Jul 26, 2019
@stale

stale bot commented Sep 24, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
