Clojure Houston User Group - December 2012 meetup on Cascalog
Looking for a map reduce language - blog post that compares plain MapReduce in Java, Pig, Hive, Cascalog, and a few others
- Storm - "Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more"
- ZooKeeper - "ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services."
- HBase - Hadoop database. Modelled after Google Bigtable.
- Hive - Data warehousing on Hadoop. SQL-like querying.
- HUE - UI for Hadoop. Browser for HDFS, Hive and Impala UI, etc.
- Apache Bigtop - packaging of various Hadoop projects. Offshoot of Cloudera's distro?
- Cloudera Impala - realtime queries in Hadoop; alternative to MapReduce
- Apache Ambari - "a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters"
- The Google File System
- MapReduce: Simplified Data Processing on Large Clusters
- Bigtable: A Distributed Storage System for Structured Data
- Dremel: Interactive Analysis of Web-Scale Datasets