A DataFrame for JavaScript.
Crunch numbers in Node or the browser.
- Interactive performance (<100ms) on millions of rows
- Syntax similar to SQL and Pandas
- Compatible with PapaParse and BabyParse
Parse the Iris dataset (with BabyParse) and create a Frame from the result.
var baby = require('babyparse'),
Frame = require('frame');
// parse the csv file
var config = {"header": true, "dynamicTyping": true, "skipEmptyLines": true};
var iris = baby.parseFiles('iris.csv', config).data;
// create a frame from the parsed results
var frame = new Frame(iris);
Group on Species and find the average value (mean) for Sepal.Length.
var g = frame.groupby("Species");
g.mean("Sepal.Length");
{ "virginica": 6.58799, "versicolor": 5.9360, "setosa": 5.006 }
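For readers who want to see what a grouped mean computes, here is a minimal plain-JavaScript sketch over an array of row objects. It is not how the library works internally; `groupMean` is a hypothetical helper and the sample rows are made up for illustration.

```javascript
// Plain-JavaScript sketch of a grouped mean over an array of row objects.
// groupMean is a hypothetical helper, not part of the frame API.
function groupMean(rows, groupKey, valueKey) {
  var sums = {};
  var counts = {};
  rows.forEach(function (row) {
    var group = row[groupKey];
    // accumulate a running sum and count per group
    sums[group] = (sums[group] || 0) + row[valueKey];
    counts[group] = (counts[group] || 0) + 1;
  });
  var means = {};
  Object.keys(sums).forEach(function (group) {
    means[group] = sums[group] / counts[group];
  });
  return means;
}

// Made-up rows in the shape a CSV parser returns when header is true.
var sampleRows = [
  { "Species": "a", "Sepal.Length": 1 },
  { "Species": "a", "Sepal.Length": 3 },
  { "Species": "b", "Sepal.Length": 10 }
];
var sampleMeans = groupMean(sampleRows, "Species", "Sepal.Length");
console.log(sampleMeans); // { a: 2, b: 10 }
```

The result has the same shape as the output above: one key per group, mapped to that group's mean.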
Using the same grouping, find the average value for Sepal.Width.
g.mean("Sepal.Width");
{ "virginica": 2.97399, "versicolor": 2.770, "setosa": 3.4279 }
Filter by the Species value virginica, then find the average of Sepal.Length.
var f = frame.where("Species", "virginica");
f.mean("Sepal.Length");
6.58799
Get the number of rows that match the filter.
f.count();
50
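The filter, mean, and count steps above correspond roughly to the following plain-JavaScript sketch. It assumes rows stored as an array of objects, which may differ from how the library stores data; `whereEquals` and `meanOf` are hypothetical helpers and the rows are made up.

```javascript
// Plain-JavaScript sketch of an equality filter followed by mean and count.
// whereEquals and meanOf are hypothetical helpers, not part of the frame API.
function whereEquals(rows, key, value) {
  return rows.filter(function (row) {
    return row[key] === value;
  });
}

function meanOf(rows, key) {
  var total = rows.reduce(function (acc, row) {
    return acc + row[key];
  }, 0);
  // note: an empty selection yields NaN (0 / 0)
  return total / rows.length;
}

// Made-up rows for illustration only.
var toyRows = [
  { "Species": "x", "Sepal.Length": 2 },
  { "Species": "y", "Sepal.Length": 5 },
  { "Species": "x", "Sepal.Length": 4 }
];
var matched = whereEquals(toyRows, "Species", "x");
console.log(matched.length);                  // 2
console.log(meanOf(matched, "Sepal.Length")); // 3
```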
Columns can also be accessed directly (with the filter applied).
f["Species"]
["virginica", "virginica", "virginica", ..., "virginica"]
Hundreds of tests verify correctness on millions of data points (against a Pandas reference).
npm run data && npm run test
npm run bench
Typical performance on one million rows:
operation | time |
---|---|
groupby | 54ms |
where | 29ms |
sum | 5ms |
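To get a feel for timings of this kind on your own machine, a rough sketch like the one below times a sum over one million values in Node. It is not the project's benchmark script; `timeIt` is a hypothetical helper, the data is synthetic, and measured times will vary with hardware and Node version.

```javascript
// Rough timing sketch for Node (uses process.hrtime.bigint, Node 10.7+).
// timeIt is a hypothetical helper, not part of this project's benchmarks.
function timeIt(fn) {
  var start = process.hrtime.bigint();
  var result = fn();
  var elapsedNs = process.hrtime.bigint() - start;
  return { result: result, ms: Number(elapsedNs) / 1e6 };
}

// Synthetic column of one million values cycling through 0..9.
var values = new Float64Array(1000000);
for (var i = 0; i < values.length; i++) {
  values[i] = i % 10;
}

var timed = timeIt(function () {
  var total = 0;
  for (var j = 0; j < values.length; j++) {
    total += values[j];
  }
  return total;
});

console.log(timed.result);               // 4500000
console.log(timed.ms.toFixed(2) + "ms"); // elapsed time, machine dependent
```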
- compatibility with feather
- pandas
- R
- Linq
- rethinkDB
- Matlab