readtable

Reads a tabular file into a DataFrame.

Installation

$ npm install dstructs-readtable

Usage

var readTable = require( 'dstructs-readtable' );

readTable( file[, options] )

Reads a tabular file synchronously and turns its contents into a DataFrame.

/*
	FILE: data.csv
		Name,Age,Slogan
		Emily,24,"I am the best"
		Bernhard,32,"How are you doing, stranger?"
		Herbert,65,"Retired guy"
		Susy,42,"Vegetarian"
*/

var out = readTable( __dirname + '/data.csv' );
// returns DataFrame with four rows and three columns

The function accepts the following options:

  • separator: String specifying the character that separates fields in the input file. Default: ,.
  • quotemark: String specifying the character used to delimit quotations. Default: ".
  • header: Boolean indicating whether the first line of the file contains the variable names. Default: true.

By default, the first line of the file is assumed to contain the variable names of the data frame. If this is not the case, set the header option to false.

/*
	FILE: data.csv
		Emily,24,"I am the best"
		Bernhard,32,"How are you doing, stranger?"
		Herbert,65,"Retired guy"
		Susy,42,"Vegetarian"
*/

var out = readTable( __dirname + '/data.csv', {
	header: false
});
// returns DataFrame with four rows and three columns
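
The effect of the header option can be sketched as follows. This is a minimal illustration, not the module's implementation: extractHeader is a hypothetical helper, and the placeholder names used when header is false are an assumption, since this README does not state which names the data frame receives in that case.

```javascript
// Hypothetical helper: separates variable names from data rows.
// `rows` is an array of already-split lines (arrays of fields).
function extractHeader( rows, header ) {
	var names = [],
		i;
	if ( header ) {
		// The first line holds the variable names; the rest is data.
		return {
			'names': rows[ 0 ],
			'data': rows.slice( 1 )
		};
	}
	// Assumption: placeholder names are the column indices as strings.
	for ( i = 0; i < rows[ 0 ].length; i++ ) {
		names.push( String( i ) );
	}
	return {
		'names': names,
		'data': rows
	};
}

var rows = [
	[ 'Emily', '24', 'I am the best' ],
	[ 'Bernhard', '32', 'How are you doing, stranger?' ]
];
var table = extractHeader( rows, false );
// Every line is kept as data: table.data.length is 2
```

With header set to true, the same input would yield one data row, because the first line is consumed as names.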

To specify a custom separator, use the separator option. If no separator option is specified, the function tries to infer the separator from the filename extension (e.g., a tsv file uses tabs to separate fields, while a csv file uses commas).

var out;
/*
	FILE: data.tsv
		Name	Age	Slogan
		Emily	24	"I am the best"
		Bernhard	32	"How are you doing, stranger?"
		Herbert	65	"Retired guy"
		Susy	42	"Vegetarian"
*/

// Without specifying separator:
out = readTable( __dirname + '/data.tsv' );
// returns DataFrame with four rows and three columns

// Explicitly specifying used separator:
out = readTable( __dirname + '/data.tsv', {
	'separator': '\t'
});
// returns DataFrame with four rows and three columns
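
One way such an inference could work is a lookup on the file extension. The sketch below is an assumption about the approach, not the module's actual code: inferSeparator and the SEPARATORS table are hypothetical names, and only the csv and tsv mappings come from this README.

```javascript
var path = require( 'path' );

// Hypothetical lookup table; only these two mappings are documented.
var SEPARATORS = {
	'.csv': ',',
	'.tsv': '\t'
};

// Hypothetical helper: an explicit separator wins, otherwise the
// separator is looked up by filename extension.
function inferSeparator( file, options ) {
	var ext;
	if ( options && typeof options.separator === 'string' ) {
		return options.separator;
	}
	ext = path.extname( file ).toLowerCase();
	// Assumption: unknown extensions fall back to the default comma.
	return SEPARATORS[ ext ] || ',';
}

var sep = inferSeparator( 'data.tsv' );
// sep is a tab character
```

An explicitly supplied separator takes precedence in this sketch, which matches the two calls in the example above producing the same result.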

Separators inside quotations are treated as literal characters and are not used to split fields. The quotemark used to delimit quotations defaults to ", but can be set via the quotemark option:

var out;
/*
	FILE: data.csv
		Name,Age,Slogan
		Emily,24,'I am the best'
		Bernhard,32,'How are you doing, stranger?'
		Herbert,65,'Retired guy'
		Susy,42,'Vegetarian'
*/

/*
	Without a custom quotemark, an error would be thrown because the
	third line of the file contains more commas than the other lines.
	With the quotemark set correctly, the comma inside the quotation
	is not treated as a separator.
*/
out = readTable( __dirname + '/data.csv', {
	'quotemark': '\''
});
// returns DataFrame with four rows and three columns
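
The quote handling described above can be illustrated with a small field splitter. This is a simplified sketch under stated assumptions, not the parser the module uses: splitLine is a hypothetical helper, it handles single-character separators and quotemarks only, and it does not support quotemarks escaped inside a quotation.

```javascript
// Hypothetical helper: splits one line into fields, ignoring
// separators that appear inside a quotation.
function splitLine( line, separator, quotemark ) {
	var fields = [],
		field = '',
		inQuote = false,
		ch,
		i;
	for ( i = 0; i < line.length; i++ ) {
		ch = line[ i ];
		if ( ch === quotemark ) {
			// Toggle the quotation state; the quotemark itself is dropped.
			inQuote = !inQuote;
		} else if ( ch === separator && !inQuote ) {
			// An unquoted separator ends the current field.
			fields.push( field );
			field = '';
		} else {
			field += ch;
		}
	}
	fields.push( field );
	return fields;
}

var line = 'Bernhard,32,\'How are you doing, stranger?\'';
var fields = splitLine( line, ',', '\'' );
// fields: [ 'Bernhard', '32', 'How are you doing, stranger?' ]
```

Calling splitLine on the same line with the default " quotemark yields four fields instead of three, which is the column mismatch the comment in the example refers to.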

readTable.async( file[, options][, callback] )

Reads a tabular file asynchronously and turns its contents into a DataFrame. Upon completion, the function invokes the supplied callback with two arguments. The first argument is an error object if an error occurred and null otherwise. If the read succeeded, the second argument is the resulting data frame.

/*
	FILE: data.csv
		Name,Age,Slogan
		Emily,24,"I am the best"
		Bernhard,32,"How are you doing, stranger?"
		Herbert,65,"Retired guy"
		Susy,42,"Vegetarian"
*/
readTable.async( __dirname + '/data.csv', function( err, res ) {
	if ( err ) {
		// Reading or parsing failed:
		throw err;
	}
	var df = res;
	// DataFrame with four rows and three columns
});

The asynchronous version accepts the same options as the synchronous version. If an options object is supplied, pass the callback function as the third argument.

/*
	FILE: data.tsv
		Name	Age	Slogan
		Emily	24	"I am the best"
		Bernhard	32	"How are you doing, stranger?"
		Herbert	65	"Retired guy"
		Susy	42	"Vegetarian"
*/
readTable.async( __dirname + '/data.tsv', {
	'separator': '\t'
}, function( err, res ) {
	if ( err ) {
		// Reading or parsing failed:
		throw err;
	}
	var df = res;
	// DataFrame with four rows and three columns
});
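
A signature of the form ( file[, options][, callback] ) is commonly supported by checking whether the second argument is a function. The sketch below shows that pattern only; resolveArgs is a hypothetical helper, and whether the module resolves its arguments this way is an assumption.

```javascript
// Hypothetical helper: resolves the optional `options` and `callback`
// arguments that follow the file path.
function resolveArgs( options, callback ) {
	if ( typeof options === 'function' ) {
		// Called as ( file, callback ): no options object was supplied.
		return {
			'options': {},
			'callback': options
		};
	}
	return {
		'options': options || {},
		'callback': callback
	};
}

function done( err, res ) {
	// Error-first callback: `err` is null on success.
}

var withoutOptions = resolveArgs( done );
var withOptions = resolveArgs( { 'separator': '\t' }, done );
// Both calls resolve `callback` to the `done` function.
```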

Examples

var readTable = require( 'dstructs-readtable' ),
	out;

// With header row:
out = readTable( __dirname + '/data.csv' );

// Without header row:
out = readTable( __dirname + '/data.csv', {
	'header': false
});

To run the example code from the top-level application directory,

$ node ./examples/index.js

Tests

Unit

Unit tests use the Mocha test framework with Chai assertions. To run the tests, execute the following command in the top-level application directory:

$ make test

All new feature development should have corresponding unit tests to validate correct functionality.

Test Coverage

This repository uses Istanbul as its code coverage tool. To generate a test coverage report, execute the following command in the top-level application directory:

$ make test-cov

Istanbul creates a ./reports/coverage directory. To access an HTML version of the report,

$ make view-cov

License

MIT license.

Copyright

Copyright © 2015. The Compute.io Authors.
