Backporting changes from gh-pages docs

Philip (flip) Kromer committed Oct 6, 2009
1 parent eea208e, commit fbeaeb59dfa35c6107a7bb4fff69b75dfeeddeb0
Showing with 51 additions and 16 deletions.
  1. +42 −16 INSTALL.textile
  2. +9 −0 README.textile
INSTALL.textile
@@ -1,12 +1,19 @@
---
layout: default
title: Install Notes
+collapse: false
---
h1(gemheader). {{ site.gemname }} %(small):: install%
+** "Get the code":#getcode
+** "Setup":#setup
+** "Installing and Running Wukong with Hadoop":#gethadoop
+** "Installing and Running Wukong with Datamapper, ActiveRecord, the command-line and more":#others
+
+
<notextile><div class="toggle"></notextile>
-h2. Get the code
+h2(#getcode). Get the code
Wukong is still under active development. The newest version is available via "Git":http://git-scm.com on "github:":http://github.com/mrflip/{{ site.gemname }}
@@ -22,31 +29,32 @@ pre. $ sudo gem install {{ site.gemname }} --source=http://gemcutter.org
You can instead download this project in either "zip":http://github.com/mrflip/{{ site.gemname }}/zipball/master or "tar":http://github.com/mrflip/{{ site.gemname }}/tarball/master formats.
-<notextile></div><div class="toggle"></notextile>
-
-h2. Get the Dependencies
+h3. Get the Dependencies
* Hadoop, pig
* extlib, YAML, JSON
* Optional gems: trollop, addressable/uri, htmlentities
-
<notextile></div><div class="toggle"></notextile>
-h2. Setup
+h2(#setup). Setup
-1. Allow Wukong to discover where his elephant friend lives: either
-** set a $HADOOP_HOME environment variable,
-** or create a file 'config/wukong-site.yaml' with a line that points to the top-level directory of your hadoop install: @:hadoop_home: /usr/local/share/hadoop@
-2. Add wukong's @bin/@ directory to your $PATH, so that you may use its filesystem shortcuts.
+1. Allow Wukong to discover where his elephant friend lives by setting a $HADOOP_HOME environment variable: @export HADOOP_HOME="/usr/local/share/hadoop"@
+2. Add wukong's @bin/@ directory to your $PATH if you'd like to use the "wutils":wutils.html
-<notextile></div></notextile>
+<i>(see also: "Ruby Hadoop Quickstart":http://blog.pdatasolutions.com/post/191978092/ruby-on-hadoop-quickstart)</i>
-h2. Installing and Running Wukong under Hadoop
+<notextile></div><div class="toggle"></notextile>
-Wukong is used by many in an non-Hadoop environment -- anywhere you can stream data records you can unleash its monkey power. It was developed for Hadoop, though, and we think it's actually the best (and certainly the most fun) way to use Hadoop.
+h2(#gethadoop). Installing and Running Wukong with Hadoop
-h3. Set up a Hadoop cluster
+Wukong was primarily developed for Hadoop, and we think it's the best way to use Hadoop (it's certainly the most fun!).
+
+h3. Run Wukong on the Amazon AWS EC2 Cloud
+
+h3. Hadoop Infrastructure
+
+Even if you have a bunch of machines with spare cycles, lots of RAM, and a shared filesystem... do yourself a favor and start out using the "Cloudera AMIs on Amazon's EC2 cloud.":http://www.cloudera.com/hadoop-ec2 There are an overwhelming number of fiddly little parameters and you'll be glad for the user experience before you get into server setup. If it's still mid-late 2009 when you read this, ignore prudence and jump straight to using Hadoop 0.20. It will be a) more fun, b) much more robust (trust me, at "v0.20" you want to live on the bleeding edge), and c) you won't have to suffer through migrating your HDFS two weeks after setup.
To set up hadoop, your best bet is the Cloudera AMIs on Amazon's EC2 compute cloud:
@@ -55,9 +63,27 @@ To set up hadoop, your best bet is the Cloudera AMIs on Amazon's EC2 compute cl
EC2 means anyone with a $10 bill can rent a 10-machine cluster with 1TB of distributed storage for 8 hours.
+h3. Run Wukong using Amazon AWS Elastic MapReduce
+
+AWS Elastic MapReduce saves the trouble of even setting up a cluster: click, bam, there it is.
+
+Phil Ripperger has prepared a "Ruby Hadoop Quickstart":http://blog.pdatasolutions.com/post/191978092/ruby-on-hadoop-quickstart explaining how to get started with Wukong, Hadoop and the Amazon Elastic MapReduce cloud -- it's better than anything we could put here. Thanks Phil!
+
+h3. Set up a Hadoop cluster
+
If you have a local cluster, or just want to experiment with a single-machine install, check out the Cloudera packages for both Debian/Ubuntu-based and Redhat/RPM-based Linux systems.
-h3. Run Wukong on the Amazon AWS EC2 Cloud
+h3. More Hadoop Notes
+
+I've braindumped some random notes on configuring and using hadoop "over here":hadoop-tips.html
-Phil Ripperger has prepared "instructions on getting wukong to work on the Amazon AWS cloud":http://blog.pdatasolutions.com/post/191978092/ruby-on-hadoop-quickstart that are better than anything we could put here. Thanks Phil!
+<notextile></div><div class="toggle"></notextile>
+
+h2(#others). Wukong isn't just Hadoop: Datamapper, ActiveRecord, command-line usage and more
+
+Wukong is used by many in a non-Hadoop environment -- anywhere you can stream data records, you can unleash its monkey power.
+Please see the "usage notes":usage.html#playnice for more!
+
+
+<notextile></div></notextile>
README.textile
@@ -12,6 +12,15 @@ Wukong is friends with "Hadoop":http://hadoop.apache.org/core the elephant, "Pig
The main documentation -- including tutorials and tips for working with big data -- lives on the "Wukong Pages":http://mrflip.github.com/wukong and there is some supplemental information on the "wukong wiki.":http://wiki.github.com/mrflip/wukong
+
+* "Install and set up wukong":http://mrflip.github.com/wukong/INSTALL.html
+* "Tutorial":http://mrflip.github.com/wukong/tutorial.html
+* "Usage notes":http://mrflip.github.com/wukong/usage.html
+* "Wutils":http://mrflip.github.com/wukong/wutils.html -- command-line utilities for working with data
+* Links and tips for "configuring and working with hadoop":http://mrflip.github.com/wukong/hadoop-tips.html
+* Wukong is licensed under the "Apache License":http://mrflip.github.com/wukong/LICENSE.html (same as Hadoop)
+* "More info":http://mrflip.github.com/wukong/moreinfo.html
+
h2. Install
Wukong is still under active development. The newest version is available at
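
The two Setup steps added to INSTALL.textile in this commit can be sketched as a pair of shell exports. This is a minimal sketch, not part of the commit: the Hadoop path is the example value from the install notes, while @WUKONG_DIR@ and its default location are hypothetical names chosen for illustration, so adjust both to match your system.

```shell
# Hypothetical location of your wukong checkout (an assumption, not from the docs).
WUKONG_DIR="${WUKONG_DIR:-$HOME/wukong}"

# Step 1: tell Wukong where Hadoop lives (example path from the install notes).
export HADOOP_HOME="/usr/local/share/hadoop"

# Step 2: append wukong's bin/ directory to PATH so the wutils can be found.
export PATH="$PATH:$WUKONG_DIR/bin"
```

Placing these lines in a shell startup file such as @~/.bashrc@ would make the settings persist across sessions.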
