A framework to organize automated data collection-and-processing pipelines.
Erlang

Data Mill

These ... machines...perform every necessary movement of the grain, and meal, from one part of the mill to another, and from one machine to another, through all the various operations, from the time the grain is emptied from the wagoner's bag....until completely manufactured into flour...without the aid of manual labor, excepting to set the different machines in motion.

-- [Oliver Evans](http://en.wikipedia.org/wiki/Oliver_Evans), inventor of the modern science of [handling materials](http://en.wikipedia.org/wiki/Bulk_material_handling).

Description

A framework to organize automated data collection-and-processing pipelines.

Development Status

Prototyping. Not ready for use.

Architectural Overview

  • External components

  • Internal components

    • [stacker](http://en.wikipedia.org/wiki/Stacker)

      • CLIENT: lives on the target SOURCE machines and stacks their raw outputs
      • Executed via cron or as a daemon (not yet decided)
      • Executes data-collection commands, then compresses their outputs and pushes them to the SERVER via SSH or UDP (depending on the sensitivity of the data), encoding metadata in the filenames.
    • [reclaimer](http://en.wikipedia.org/wiki/Reclaimer)

      • SERVER: lives on mother-ship machine(s) and sorts through stacked outputs, dispatching processing and delivery
      • Executed as a daemon
      • Receives raw output files from the CLIENT (via SSH or UDP)
      • Dispatches processing of raw outputs through the appropriate PLUGINs
      • Delivers extracted data to final consumers for presentation and/or analysis (Graphite, Cacti, Munin, a data scientist, etc.).
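
The stacker/reclaimer flow above can be sketched roughly as follows. This is only an illustration in Python, not the project's code: the filename layout (`<host>__<collector>__<timestamp>.gz`), the function names `collect`, `pack` and `reclaim`, the `PLUGINS` registry and the `loadavg` plugin are all assumptions made for the example. Transport (SSH or UDP) and delivery to consumers are left out.

```python
import gzip
import subprocess
import time
from typing import Callable, Dict, List, Optional, Tuple

SEP = "__"   # hypothetical field separator inside filenames
EXT = ".gz"  # hypothetical extension for compressed outputs


def collect(command: List[str]) -> bytes:
    """Stacker: run one data-collection command and capture its stdout."""
    return subprocess.run(command, capture_output=True, check=True).stdout


def pack(host: str, collector: str, raw: bytes,
         timestamp: Optional[int] = None) -> Tuple[str, bytes]:
    """Stacker: compress raw output and encode the metadata in the filename."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{host}{SEP}{collector}{SEP}{ts}{EXT}", gzip.compress(raw)


# Reclaimer side: collector name -> parser for that collector's raw output.
PLUGINS: Dict[str, Callable[[bytes], dict]] = {}


def plugin(name: str):
    """Register a parser under the collector name it handles."""
    def register(fn):
        PLUGINS[name] = fn
        return fn
    return register


@plugin("loadavg")
def parse_loadavg(raw: bytes) -> dict:
    """Example plugin: parse the first three fields of /proc/loadavg."""
    one, five, fifteen = raw.decode().split()[:3]
    return {"load1": float(one), "load5": float(five), "load15": float(fifteen)}


def reclaim(filename: str, payload: bytes) -> dict:
    """Reclaimer: recover metadata from the filename, then dispatch to a plugin."""
    if not filename.endswith(EXT):
        raise ValueError(f"unexpected filename: {filename}")
    # Assumes neither the host nor the collector name contains SEP.
    host, collector, ts = filename[:-len(EXT)].split(SEP)
    data = PLUGINS[collector](gzip.decompress(payload))
    return {"host": host, "collector": collector,
            "timestamp": int(ts), "data": data}
```

Under these assumptions, a client would call something like `pack("web01", "loadavg", collect(["cat", "/proc/loadavg"]))` and push the resulting file. The server would then pass the received filename and bytes to `reclaim`, which returns a record ready to hand to a consumer. Keeping the metadata in the filename means the reclaimer can choose a plugin without opening the payload.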