PREREQUISITES:
- Cassandra 1.1
- Hadoop 1.0.1+
- JDK 1.6
- Scala 2.9 (if building the Scala job)

1. Create the Cassandra schema with the following command:

         cassandra-cli -h <cass_host> -f cassandra/cassandra_schema.txt

2. Populate some sample data by running the same command with cassandra/seed_data.txt in place of cassandra/cassandra_schema.txt, or add your own data.

3. Increase the max heap size for the Hadoop client in hadoop-env.sh:

         export HADOOP_CLIENT_OPTS="-Xmx1g $HADOOP_CLIENT_OPTS"

4. Copy the jars from the lib directory into $HADOOP_HOME/lib, or otherwise ensure they are on the Hadoop classpath.

5. cd into either the java or the scala directory and run the build script there to generate the jar file.

6. Execute the run script as follows:

         ./run <cass_host> <num_reducers> <keyspace> <input_column_family> <output_column_family>

   Example if running locally using the schema script above:

         ./run localhost 1 HadoopTest TestInput TestOutput
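
Steps 4 and 6 above can be sketched as two small shell helpers. This is a minimal sketch, not part of the repository: the function names copy_jars and check_run_args are hypothetical, and the rule that <num_reducers> must be a whole number is an assumption, since the README does not say what the run script accepts.

```shell
#!/bin/sh
# Sketch only: copy_jars and check_run_args are hypothetical helpers,
# not part of this repository.

# Step 4: copy every *.jar from one directory into another.
copy_jars() {
    src_dir="$1"
    dest_dir="$2"
    if [ ! -d "$src_dir" ] || [ ! -d "$dest_dir" ]; then
        echo "copy_jars: both arguments must be existing directories" >&2
        return 1
    fi
    for jar in "$src_dir"/*.jar; do
        # When nothing matches, the pattern stays literal; skip it.
        [ -e "$jar" ] || continue
        cp "$jar" "$dest_dir"/ || return 1
    done
}

# Step 6: check the five arguments that ./run expects before calling it.
check_run_args() {
    if [ "$#" -ne 5 ]; then
        echo "usage: ./run <cass_host> <num_reducers> <keyspace> <input_column_family> <output_column_family>" >&2
        return 1
    fi
    case "$2" in
        # Assumption: num_reducers is a whole number (digits only).
        ''|*[!0-9]*)
            echo "check_run_args: num_reducers must be a whole number, got '$2'" >&2
            return 1
            ;;
    esac
    return 0
}
```

With the helpers loaded, and assuming HADOOP_HOME is set and the current directory is the repository root, usage might look like:

         copy_jars lib "$HADOOP_HOME/lib"
         check_run_args localhost 1 HadoopTest TestInput TestOutput && ./run localhost 1 HadoopTest TestInput TestOutput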