jcanny edited this page Mar 9, 2015 · 35 revisions

Table of Contents

BIDMach Library Documentation

New: Tutorials in IScala

New: API Docs


BIDMach is a very fast tool for machine learning, from small problems to terabyte scale. It is currently the fastest system for many common machine learning tasks (see the benchmarks section). In fact, on a single GPU-equipped node, BIDMach outperforms the fastest cluster systems running on up to a few hundred nodes. BIDMach also scales well: it streams data off disk, so it is not limited by memory. With a large RAID, BIDMach has run topic models with hundreds of topics on several terabytes of data. We are aware of no other system able to solve that problem at comparable scale.

BIDMach is built on a sister library called BIDMat, which provides an efficient, interactive matrix layer.

Installing and Running



BIDMach's Architecture

Machine Learning Models

Causal Inference

Data Wrangling

Data Sources