init

commit 270511a7f8272105fc517a3ac1862a0002de01bb 0 parents
Michael Hausenblas authored
Showing with 6,665 additions and 0 deletions.
  1. +1 −0  .gitignore
  2. +43 −0 README.md
  3. +6,621 −0 apache-big-data-cheat-sheet.graffle/data.plist
  4. BIN  apache-big-data-cheat-sheet.graffle/image1.tiff
  5. BIN  apache-big-data-cheat-sheet.graffle/image10.tiff
  6. BIN  apache-big-data-cheat-sheet.graffle/image11.tiff
  7. BIN  apache-big-data-cheat-sheet.graffle/image12.tiff
  8. BIN  apache-big-data-cheat-sheet.graffle/image13.tiff
  9. BIN  apache-big-data-cheat-sheet.graffle/image14.tiff
  10. BIN  apache-big-data-cheat-sheet.graffle/image15.tiff
  11. BIN  apache-big-data-cheat-sheet.graffle/image19.tiff
  12. BIN  apache-big-data-cheat-sheet.graffle/image2.tiff
  13. BIN  apache-big-data-cheat-sheet.graffle/image20.tiff
  14. BIN  apache-big-data-cheat-sheet.graffle/image21.tiff
  15. BIN  apache-big-data-cheat-sheet.graffle/image22.tiff
  16. BIN  apache-big-data-cheat-sheet.graffle/image23.tiff
  17. BIN  apache-big-data-cheat-sheet.graffle/image24.tiff
  18. BIN  apache-big-data-cheat-sheet.graffle/image25.tiff
  19. BIN  apache-big-data-cheat-sheet.graffle/image26.tiff
  20. BIN  apache-big-data-cheat-sheet.graffle/image27.tiff
  21. BIN  apache-big-data-cheat-sheet.graffle/image29.tiff
  22. BIN  apache-big-data-cheat-sheet.graffle/image30.tiff
  23. BIN  apache-big-data-cheat-sheet.graffle/image4.tiff
  24. BIN  apache-big-data-cheat-sheet.graffle/image5.tiff
  25. BIN  apache-big-data-cheat-sheet.graffle/image7.tiff
  26. BIN  apache-big-data-cheat-sheet.graffle/image8.tiff
  27. BIN  apache-big-data-cheat-sheet.graffle/image9.tiff
  28. BIN  apache-big-data-cheat-sheet.pdf
1  .gitignore
@@ -0,0 +1 @@
+.DS_Store
43 README.md
@@ -0,0 +1,43 @@
+# Apache Big Data projects
+
+Based on and motivated by the following resources:
+
+* Apache [project list](http://projects.apache.org/indexes/category.html)
+* Edd Dumbill's [What is Apache Hadoop?](http://strata.oreilly.com/2012/02/what-is-apache-hadoop.html)
+* Edd Dumbill's [The SMAQ stack for big data](http://strata.oreilly.com/2010/09/the-smaq-stack-for-big-data.htm)
+* My [Interactive analysis of large-scale datasets](http://webofdata.wordpress.com/2012/09/02/large-scale-interactive-analysis/) post
+
+## Top-level
+
+* Accumulo, http://accumulo.apache.org/ - a sorted, distributed key/value store
+* Cassandra, http://cassandra.apache.org/ - column-oriented database
+* Cayenne, http://cayenne.apache.org/ - object-relational mapping (ORM) and remoting services
+* CouchDB, http://couchdb.apache.org/ - NoSQL document-oriented datastore
+* Gora, http://gora.apache.org/ - provides an in-memory data model and persistence for big data
+* Hadoop, http://hadoop.apache.org/ - a distributed computing platform:
+  * HDFS - distributed redundant file system for Hadoop
+  * MapReduce - parallel computation on server clusters
+* HBase, http://hbase.apache.org/ - column-oriented database on top of Hadoop
+* Hive, http://hive.apache.org/ - data warehouse with SQL-like access
+* Flume, http://flume.apache.org/ - collection and import of log and event data
+* Lucene, http://lucene.apache.org/ - indexing
+* Mahout, http://mahout.apache.org/ - library of machine learning and data mining algorithms on top of Hadoop
+* Pig, http://pig.apache.org/ - high-level programming language for Hadoop computations
+* Oozie, http://oozie.apache.org/ - orchestration and workflow management for Hadoop
+* Solr, http://lucene.apache.org/solr/ - Lucene-based enterprise search platform
+* Sqoop, http://sqoop.apache.org/ - imports data from relational databases into Hadoop
+* Whirr, http://whirr.apache.org/ - cloud-agnostic deployment of clusters
+* ZooKeeper, http://zookeeper.apache.org/ - configuration management and coordination
+
+## Incubator
+
+* Ambari, http://incubator.apache.org/ambari/ - deployment, configuration and monitoring of Hadoop clusters
+* Blur, http://incubator.apache.org/blur/ - search platform for searching massive amounts of data in a cloud computing environment
+* Chukwa, http://incubator.apache.org/chukwa/ - log collection and analysis framework for Apache Hadoop clusters
+* Crunch, http://incubator.apache.org/crunch/ - a Java library for writing, testing, and running pipelines of MapReduce jobs
+* Drill, http://incubator.apache.org/drill/ - interactive analysis of large-scale data
+* HCatalog, http://incubator.apache.org/hcatalog/ - schema and data type sharing over Pig, Hive and MapReduce
+* Kafka, http://incubator.apache.org/kafka/ - distributed publish-subscribe messaging system
+* Mesos, http://incubator.apache.org/mesos/ - a cluster manager that provides resource sharing and isolation across cluster applications
+* S4, http://incubator.apache.org/s4/ - distributed platform for processing continuous unbounded streams of data
+* Tashi, http://incubator.apache.org/tashi/ - infrastructure for service providers to build applications harnessing cluster computing resources to efficiently access repositories of rich data
6,621 apache-big-data-cheat-sheet.graffle/data.plist
6,621 additions, 0 deletions not shown
BIN  apache-big-data-cheat-sheet.graffle/image1.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image10.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image11.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image12.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image13.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image14.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image15.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image19.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image2.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image20.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image21.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image22.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image23.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image24.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image25.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image26.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image27.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image29.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image30.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image4.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image5.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image7.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image8.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.graffle/image9.tiff
Binary file not shown
BIN  apache-big-data-cheat-sheet.pdf
Binary file not shown