An interactive tool for exploring large, tabular datasets.
Latest commit c81ed73 Jul 1, 2016

Datacomb

An interactive tool for analyzing, exploring, and combing through tabular datasets. By @ChrisPolis.

Turn your data into: [image: datacomb preview, linked to the live demo]

Usage

As an htmlwidget in R

devtools::install_github('cmpolis/datacomb', subdir='pkg', ref='1.1.2');
library(datacomb);
Datacomb(iris)

In a browser, with JavaScript:

To build: $ npm install && npm run build

// Sample usage of Datacomb (see also: /demo/demo.js)

// Column definitions, meta data
var columns = [
  {
    label: 'Team',
    accessor: 'team',

    // columns that are not quantitative need a `type` flag
    type: 'discrete'
  },
  {
    label: 'Pos',
    accessor: 'pos',
    type: 'discrete',

    // explicit ordering for the values of a discrete column
    sortOrder: 'PG SG SF PF C'.split(' ')
  },
  {
    label: 'Points',
    accessor: 'pts'
  },
  {
    label: 'Minutes',
    accessor: 'mp'
  },
  {
    label: 'Pts / Min',

    // accessors can be functions, for computed columns
    accessor: function(d) { return d.pts / d.mp; },

    // an optional `format` function changes how the value is displayed in the table
    format: function(val) { return val.toFixed(3) + 'pts/min'; }
  }
];

// Initialize the interface
var myDatacomb = new Datacomb({

  // DOM element to render the table into
  el: document.getElementById('datacomb-target'),

  // array of objects, one per row
  data: [ {name: 'Hank', team: 'Liverpool', points: 3 }, { ... }, ... ],

  // column definitions (see above)
  columns: columns,

  // property used to label each row
  labelAccessor: 'name'

});
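
The column definitions above mix string accessors, function accessors, and an optional `format` function. As a rough illustration of how such a definition could be resolved against a single row, here is a minimal sketch. The helpers `cellValue` and `cellText` and the sample row are hypothetical and are not part of Datacomb's API; Datacomb's internal handling may differ.

```javascript
// Hypothetical helper (not part of Datacomb's API): resolve a column's
// value for one row, whether `accessor` is a property name or a function.
function cellValue(column, row) {
  return typeof column.accessor === 'function'
    ? column.accessor(row)
    : row[column.accessor];
}

// Hypothetical helper: text for a cell, using `format` when it is defined.
function cellText(column, row) {
  var value = cellValue(column, row);
  return column.format ? column.format(value) : String(value);
}

var pointsCol = { label: 'Points', accessor: 'pts' };
var rateCol = {
  label: 'Pts / Min',
  accessor: function(d) { return d.pts / d.mp; },
  format: function(val) { return val.toFixed(3) + ' pts/min'; }
};

// Made-up sample row, for illustration only
var sampleRow = { name: 'Player A', team: 'Team X', pos: 'PG', pts: 30, mp: 40 };

console.log(cellText(pointsCol, sampleRow)); // "30"
console.log(cellText(rateCol, sampleRow));   // "0.750 pts/min"
```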

Catalog of Interactions

Hover over rows to reveal exact values

Sort by column(s)
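
For readers curious how a multi-column sort that honors a discrete column's `sortOrder` could work, here is a minimal sketch. `compareBy` and `sortRows` are hypothetical names, the rows are made up, and this is not Datacomb's internal code.

```javascript
// Hypothetical comparator builder for a column with a string accessor.
function compareBy(column) {
  return function(a, b) {
    var va = a[column.accessor];
    var vb = b[column.accessor];
    if (column.sortOrder) {
      // Discrete column with an explicit order: compare by position in the list
      return column.sortOrder.indexOf(va) - column.sortOrder.indexOf(vb);
    }
    if (va < vb) return -1;
    if (va > vb) return 1;
    return 0;
  };
}

// Sort by several columns: later columns break ties in earlier ones.
function sortRows(rows, columns) {
  var comparators = columns.map(compareBy);
  return rows.slice().sort(function(a, b) {
    for (var i = 0; i < comparators.length; i++) {
      var result = comparators[i](a, b);
      if (result !== 0) return result;
    }
    return 0;
  });
}

var posCol = { accessor: 'pos', type: 'discrete', sortOrder: 'PG SG SF PF C'.split(' ') };
var ptsCol = { accessor: 'pts' };
var players = [
  { name: 'A', pos: 'C', pts: 10 },
  { name: 'B', pos: 'PG', pts: 25 },
  { name: 'C', pos: 'PG', pts: 12 }
];

var sorted = sortRows(players, [posCol, ptsCol]);
console.log(sorted.map(function(d) { return d.name; })); // [ 'C', 'B', 'A' ]
```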

Filter rows visually with a slider or by specifying exact bounds
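
Filtering by exact bounds amounts to keeping the rows whose value falls inside a range. The sketch below assumes inclusive bounds and a string accessor; `filterByBounds` is a hypothetical helper, not a Datacomb function.

```javascript
// Hypothetical helper: keep rows whose value lies within [min, max] (inclusive).
function filterByBounds(rows, accessor, min, max) {
  return rows.filter(function(d) {
    var value = d[accessor];
    return value >= min && value <= max;
  });
}

// Made-up rows, for illustration only
var scorers = [
  { name: 'A', pts: 5 },
  { name: 'B', pts: 12 },
  { name: 'C', pts: 20 }
];

var kept = filterByBounds(scorers, 'pts', 10, 20);
console.log(kept.map(function(d) { return d.name; })); // [ 'B', 'C' ]
```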

Click and drag to select rows to focus

Show only selected rows to analyze a subset

View distribution data for each column
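
A column's distribution can be summarized by counting values in equal-width bins. The following is only a sketch of that idea; the `histogram` helper and the choice of equal-width bins are assumptions, and Datacomb's own binning may differ.

```javascript
// Hypothetical helper: count values into `binCount` equal-width bins
// spanning the range from the smallest to the largest value.
function histogram(values, binCount) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  // Fall back to a width of 1 when all values are equal
  var width = (max - min) / binCount || 1;
  var counts = [];
  for (var i = 0; i < binCount; i++) counts.push(0);
  values.forEach(function(v) {
    // The maximum value would land one past the last bin, so clamp it
    var index = Math.min(Math.floor((v - min) / width), binCount - 1);
    counts[index] += 1;
  });
  return counts;
}

console.log(histogram([1, 2, 2, 3, 9, 10], 3)); // [ 4, 0, 2 ]
```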

View summary statistics for each column
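
As an illustration of the kind of per-column statistics such a view might show, here is a small sketch. Which statistics Datacomb actually reports is not stated here, so the set below (min, max, mean, median) and the `summarize` helper are assumptions.

```javascript
// Hypothetical helper: basic summary statistics for a non-empty numeric column.
function summarize(values) {
  var sorted = values.slice().sort(function(a, b) { return a - b; });
  var n = sorted.length;
  var sum = sorted.reduce(function(acc, v) { return acc + v; }, 0);
  var mid = Math.floor(n / 2);
  return {
    min: sorted[0],
    max: sorted[n - 1],
    mean: sum / n,
    // Average the two middle values when the count is even
    median: n % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
  };
}

console.log(summarize([4, 8, 15, 16, 23, 42]));
// { min: 4, max: 42, mean: 18, median: 15.5 }
```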

View relationships between columns by creating scatter plots of a column and all other columns

Group rows by discrete dimensions
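
Grouping by a discrete dimension means bucketing rows by the value of a column marked `type: 'discrete'`. A minimal sketch, assuming a string accessor; `groupRows` is a hypothetical helper and the rows are made up.

```javascript
// Hypothetical helper: bucket rows by the value of a discrete column.
function groupRows(rows, accessor) {
  var groups = {};
  rows.forEach(function(d) {
    var key = d[accessor];
    if (!groups[key]) groups[key] = [];
    groups[key].push(d);
  });
  return groups;
}

var roster = [
  { name: 'A', team: 'Team X' },
  { name: 'B', team: 'Team Y' },
  { name: 'C', team: 'Team X' }
];

var byTeam = groupRows(roster, 'team');
console.log(Object.keys(byTeam));     // [ 'Team X', 'Team Y' ]
console.log(byTeam['Team X'].length); // 2
```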

Contributing

Pull requests are welcome! Please open an issue first to mention or ask about your change, so we can confirm that what you are working on can be merged and is not already in progress.

$ npm install
$ npm run build
$ npm run serve
$ open http://localhost:5050/demo/

Testing

$ npm test

Resources

Blog post, demo of prototype/old version: http://www.bytemuse.com/post/data-comb-visualization/

R package (CRAN!): https://github.com/mtennekes/tabplot

Table Lens Paper: https://www.cs.ubc.ca/~tmm/courses/cpsc533c-04-fall/readings/tablelens.pdf

Demo dataset sources:

Status, project todo, notes

  • [IN PROGRESS] v1/prototype:
    • 👍 project setup: can build, test, view in browser...
    • 👍 (https://github.com/cmpolis/smart-table-scroll) table row reuse (minimize # of <.row> DOM elements)
    • 👍 table layout and properly sized bars
    • 👍 hover interaction
    • 👍 click interaction
    • 👍 drag interaction
    • 👍 filtering
    • 👍 sorting
    • 👍 scatter plots (canvas)
    • 👍 histograms
    • 👍 summary statistics
    • 👍 grouping (by discrete dimensions)
    • 👍 coloring (by discrete dimensions)
  • v2
    • ❌ expandable(full screen?) scatter plots
    • ❌ regressions in scatter plots
    • ❌ dynamic column addition, removal
    • ❌ custom column widths
    • ❌ functional column definitions from the UI, e.g. areaCol: ${height} * ${width}
    • ❌ axis labels
    • ❌ log scaling
    • ❌ quantize columns (continuous dim -> discrete dim)
    • ❌ illustrate filter response on histograms
    • ❌ illustrate filter response on scatter plots
    • ❌ close/expand groupings in table
    • 🚧 keyboard shortcuts
  • 🚧 HTMLWidget/R package
  • ❌ Serializable table configuration format. JSON?
  • ❌ Natural language/DSL mode for table configuration, querying

Released under the MIT License.