Update docs reflecting 0.22

Commit 99e47aca8b17f6fa95491ccd00a93bac516af32a (1 parent: bcf0722). Brendan W. McAdams committed Mar 6, 2012
@@ -71,13 +71,17 @@ <h1 id="Building+the+Adapter">Building the Adapter</h1><p>The Mongo-Hadoop adapt
target compiles <em>ALL</em> Modules, including Streaming.
</p><ul><li>cdh3
</li><li>Maven artifact: “org.mongodb” / “mongo-hadoop_cdh3u3”
-</li></ul><h4 id="Apache+Hadoop+0.23">Apache Hadoop 0.23</h4><p>This is an alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. Hadoop 0.23 is also the basis for Cloudera’s CDH4 Beta. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2. Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
+</li></ul><h4 id="Apache+Hadoop+0.23">Apache Hadoop 0.23</h4><p>(Currently building against 0.23.1)
+</p><p>This is an alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. Hadoop 0.23 is also the basis for Cloudera’s CDH4 Beta. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2. Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
</p><ul><li>0.23
</li><li>0.23.x
-</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_0.23.0
+</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_0.23.1
</li></ul><h4 id="Cloudera+Release+4+%28Beta+1%29">Cloudera Release 4 (Beta 1)</h4><p>This is the latest beta of Cloudera’s distribution, based upon the 0.23 alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2.Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
</p><ul><li>cdh4
</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_cdh4b1”
+</li></ul><h3 id="Apache+Hadoop+0.22.0">Apache Hadoop 0.22.0</h3><p>This includes Pig 0.9.1 and Hadoop Streaming.
+</p><ul><li>0.22
+</li><li>0.22.0
</li></ul><h4 id="Apache+Hadoop+0.21.0">Apache Hadoop 0.21.0</h4><p>This includes Pig 0.9.1 and Hadoop Streaming.
</p><ul><li>0.21
</li><li>0.21.x
@@ -39,7 +39,7 @@
<h4 class="toctitle">Contents</h4>
<div class="tocbody">
<div><a href="#MongoDB%2BHadoop+Connector">MongoDB+Hadoop Connector</a></div><ol class="toc"> <li><div><a href="#Frequently+Asked+Questions">Frequently Asked Questions</a></div></li><li><div><a href="#Getting+Started">Getting Started</a></div><ol class="toc"> <li><div><a href="#Building+the+Adapter">Building the Adapter</a></div></li><li><div><a href="#Configuration+%26+Behavior">Configuration &amp; Behavior</a></div></li> </ol></li><li><div><a href="#Hadoop+Streaming+Support">Hadoop Streaming Support</a></div><ol class="toc"> <li><div><a href="#Building+Hadoop+Streaming+Support">Building Hadoop Streaming Support</a></div></li> </ol></li> </ol></div></div><h1 id="MongoDB%2BHadoop+Connector">MongoDB+Hadoop Connector</h1><p><strong>CURRENT RELEASE</strong>: 1.0.0-rc1
-</p><p>The <em>Mongo+Hadoop Connector</em> (for brevitys sake, we’ll refer to it as <em>mongo-hadoop</em> in this documentation) is a series of plugins for the <a title="Apache Hadoop" href="http://apache.hadoop.org">Apache Hadoop Platform</a> to allow connectivity to <a title="MongoDB" href="http://mongodb.org">MongoDB</a>. This connectivity takes the form of allowing both reading MongoDB data into Hadoop (for use in MapReduce jobs as well as other components of the Hadoop ecosystem), as well as writing the results of Hadoop jobs out to MongoDB. A forthcoming release will also allow for reading and writing static BSON files (ala <em>mongodump / mongorestore</em>) to allow offline batching; commonly, users find this to be a beneficial feature to run analytics against backup data.
+</p><p>The <em>Mongo+Hadoop Connector</em> (for brevitys sake, we’ll often refer to it as <em>mongo-hadoop</em> in this documentation) is a series of plugins for the <a title="Apache Hadoop" href="http://apache.hadoop.org">Apache Hadoop Platform</a> to allow connectivity to <a title="MongoDB" href="http://mongodb.org">MongoDB</a>. This connectivity takes the form of allowing both reading MongoDB data into Hadoop (for use in MapReduce jobs as well as other components of the Hadoop ecosystem), as well as writing the results of Hadoop jobs out to MongoDB. A forthcoming release will also allow for reading and writing static BSON files (ala <em>mongodump / mongorestore</em>) to allow offline batching; commonly, users find this to be a beneficial feature to run analytics against backup data.
</p><p>At this time, we support the “core” Hadoop APIs (now known as <a title="Hadoop Common" href="http://hadoop.apache.org/common/">Hadoop Common</a>), in the form of <em>mongo-hadoop-core</em>. There is additionally support for other pieces of the Hadoop Ecosystem, including <a title="Apache Pig" href="http://pig.apache.org">Pig</a> for ETL and <a title="Hadoop Streaming" href="http://hadoop.apache.org/common/docs/current/streaming.html">Streaming</a> for running Mongo+Hadoop jobs with Python (future releases will support additional scripting languages such as Ruby). Although it is not dependent upon Hadoop, we also provide a connector for the <a title="Flume" href="https://github.com/cloudera/flume/wiki">Flume</a> distributed logging system.
</p><h2 id="Support">Support</h2><p><em>mongo-hadoop</em> is dependent upon the MongoDB Java Driver — currently version 2.7.3.
</p><p>Bugs &amp; Features should be tracked and requested on the <a title="MongoDB Jira" href="https://jira.mongodb.org/browse/HADOOP/">MongoDB Jira</a>. If you have questions please email the
@@ -101,13 +101,17 @@ <h4 class="toctitle">Contents</h4>
target compiles <em>ALL</em> Modules, including Streaming.
</p><ul><li>cdh3
</li><li>Maven artifact: “org.mongodb” / “mongo-hadoop_cdh3u3”
-</li></ul><h4 id="Apache+Hadoop+0.23">Apache Hadoop 0.23</h4><p>This is an alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. Hadoop 0.23 is also the basis for Cloudera’s CDH4 Beta. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2. Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
+</li></ul><h4 id="Apache+Hadoop+0.23">Apache Hadoop 0.23</h4><p>(Currently building against 0.23.1)
+</p><p>This is an alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. Hadoop 0.23 is also the basis for Cloudera’s CDH4 Beta. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2. Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
</p><ul><li>0.23
</li><li>0.23.x
-</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_0.23.0
+</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_0.23.1
</li></ul><h4 id="Cloudera+Release+4+%28Beta+1%29">Cloudera Release 4 (Beta 1)</h4><p>This is the latest beta of Cloudera’s distribution, based upon the 0.23 alpha branch of Hadoop; despite the misleading version numbers, Apache Hadoop 0.23 is “newer” than Apache Hadoop 1.0. This target compiles <em>ALL</em> modules, including Streaming and Pig 0.9.2.Note however that we <em>do not</em> support the next-generation <a href="http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> at this time; support is planned for <em>mongo-hadoop</em> v1.1.
</p><ul><li>cdh4
</li><li>Maven Artifact: “org.mongodb” / “mongo-hadoop_cdh4b1”
+</li></ul><h3 id="Apache+Hadoop+0.22.0">Apache Hadoop 0.22.0</h3><p>This includes Pig 0.9.1 and Hadoop Streaming.
+</p><ul><li>0.22
+</li><li>0.22.0
</li></ul><h4 id="Apache+Hadoop+0.21.0">Apache Hadoop 0.21.0</h4><p>This includes Pig 0.9.1 and Hadoop Streaming.
</p><ul><li>0.21
</li><li>0.21.x
@@ -32,7 +32,7 @@
</div>
<div class="span-16 prepend-1 append-1 contents">
<h1 id="MongoDB%2BHadoop+Connector">MongoDB+Hadoop Connector</h1><p><strong>CURRENT RELEASE</strong>: 1.0.0-rc1
-</p><p>The <em>Mongo+Hadoop Connector</em> (for brevitys sake, we’ll refer to it as <em>mongo-hadoop</em> in this documentation) is a series of plugins for the <a title="Apache Hadoop" href="http://apache.hadoop.org">Apache Hadoop Platform</a> to allow connectivity to <a title="MongoDB" href="http://mongodb.org">MongoDB</a>. This connectivity takes the form of allowing both reading MongoDB data into Hadoop (for use in MapReduce jobs as well as other components of the Hadoop ecosystem), as well as writing the results of Hadoop jobs out to MongoDB. A forthcoming release will also allow for reading and writing static BSON files (ala <em>mongodump / mongorestore</em>) to allow offline batching; commonly, users find this to be a beneficial feature to run analytics against backup data.
+</p><p>The <em>Mongo+Hadoop Connector</em> (for brevitys sake, we’ll often refer to it as <em>mongo-hadoop</em> in this documentation) is a series of plugins for the <a title="Apache Hadoop" href="http://apache.hadoop.org">Apache Hadoop Platform</a> to allow connectivity to <a title="MongoDB" href="http://mongodb.org">MongoDB</a>. This connectivity takes the form of allowing both reading MongoDB data into Hadoop (for use in MapReduce jobs as well as other components of the Hadoop ecosystem), as well as writing the results of Hadoop jobs out to MongoDB. A forthcoming release will also allow for reading and writing static BSON files (ala <em>mongodump / mongorestore</em>) to allow offline batching; commonly, users find this to be a beneficial feature to run analytics against backup data.
</p><p>At this time, we support the “core” Hadoop APIs (now known as <a title="Hadoop Common" href="http://hadoop.apache.org/common/">Hadoop Common</a>), in the form of <em>mongo-hadoop-core</em>. There is additionally support for other pieces of the Hadoop Ecosystem, including <a title="Apache Pig" href="http://pig.apache.org">Pig</a> for ETL and <a title="Hadoop Streaming" href="http://hadoop.apache.org/common/docs/current/streaming.html">Streaming</a> for running Mongo+Hadoop jobs with Python (future releases will support additional scripting languages such as Ruby). Although it is not dependent upon Hadoop, we also provide a connector for the <a title="Flume" href="https://github.com/cloudera/flume/wiki">Flume</a> distributed logging system.
</p><h2 id="Support">Support</h2><p><em>mongo-hadoop</em> is dependent upon the MongoDB Java Driver — currently version 2.7.3.
</p><p>Bugs &amp; Features should be tracked and requested on the <a title="MongoDB Jira" href="https://jira.mongodb.org/browse/HADOOP/">MongoDB Jira</a>. If you have questions please email the
@@ -1,5 +1,5 @@
CACHE MANIFEST
-# Mon Mar 05 15:10:28 EST 2012
+# Tue Mar 06 13:54:47 EST 2012
MongoDB%2BHadoop+Connector.html
Frequently+Asked+Questions.html
Getting+Started.html
@@ -84,6 +84,14 @@ This is the latest beta of Cloudera's distribution, based upon the 0.23 alpha br
- cdh4
- Maven Artifact: "org.mongodb" / "mongo-hadoop_cdh4b1"
+### Apache Hadoop 0.22.0
+
+This includes Pig 0.9.1 and Hadoop Streaming.
+
+- 0.22
+- 0.22.0
+
+
#### Apache Hadoop 0.21.0
This includes Pig 0.9.1 and Hadoop Streaming.
