Pimped Fork of DataStax Brisk Distribution
Java Shell JavaScript

This branch is 28 commits ahead, 1 commit behind riptano:master

Typo !

latest commit 46213ce616
@steeve authored
bin Correctly detect JAVA_HOME on Mac OS X.
debian Replace README.txt with README.md so deb will build again
demos Packaging fixes for renamed files, bump version to beta2
interface add thrift autogenerated Brisk class
packaging-common Allow setting the cfs replication factor on the first hadoop-enabled …
redhat picked up most recent demo changes to pig
resources Removing obsolete configuration, removing whitespaces
src/java/src Updating Brisk to Cassandra 1.0 API
test Merge branch 'master' into BRISK-230
tools added missing tokentool.py file for chef cookbook
.gitignore Ignore lib in hive, hadoop and cassandra. Also hadoop-*.jar
LICENSE.txt update lisc info
NEWS.txt Copy release notes to NEWS.txt
NOTICE.txt update lisc info
README.md Typo !
ReleaseNotes_Briskv1.0_beta1.pdf ReleaseNotes for RPMs
ReleaseNotes_Briskv1.0_beta2.pdf
ReleaseNotes_Briskv1.0_beta2.txt Reformat txt release notes
build.properties.default Changing cassandra version to 1.0
build.xml Removing trailing whitespaces

README.md

Pimped Brisk

Brisk made Hadoop play nicely with Cassandra and, more importantly, made it easy to use, which is critical for the Hadoop ecosystem. DataStax worked really hard and delivered an amazing distribution.

However, they shifted their product strategy and decided to discontinue Brisk.

Since Brisk is now discontinued, this fork is an effort to make Brisk work with the latest packages of Cassandra, Pig, and Hive while retaining Brisk's original functionality: running MR jobs without HDFS, directly on top of Cassandra.

An effort to integrate Cascading/Cascalog is also ongoing. The end goal is an awesome Cassandra-based Hadoop distribution that is easy to set up and use.

HOWEVER, this is IN NO WAY an effort to preempt DataStax in their endeavours. They were nice enough to leave the sources for people to use, which we are doing!

If you want a professional, enterprise-grade Cassandra offering, please consider reviewing their offerings at: http://www.datastax.com/products

DataStax Brisk

This package contains an HDFS-compatible layer (CFS) and a CassandraJobConf, which can be used to run MR jobs without HDFS or dedicated job/task trackers.

It also includes a Hive driver for accessing data in Cassandra, as well as a Hive metastore implementation.

Hadoop jobs and Hive are set up to work with the MR cluster.

For detailed docs please see: http://www.datastax.com/docs/0.8/brisk/index

You can also discuss Brisk on freenode #datastax-brisk

Required Setup

On Linux systems, you need to run the following as root:

    echo 1 > /proc/sys/vm/overcommit_memory

This is to avoid OOM errors when tasks are spawned.
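Before starting Brisk, it can help to confirm the setting actually took effect. The following is a minimal sketch of such a check; `check_overcommit` is a hypothetical helper name, not something shipped with Brisk, and its optional path argument exists only so the check can be pointed at another file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Brisk): report whether vm.overcommit_memory
# is 1, the value the setup step above writes.
# Usage: check_overcommit [FILE]   (FILE defaults to the real /proc entry)
check_overcommit() {
    file="${1:-/proc/sys/vm/overcommit_memory}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    value="$(cat "$file")"
    if [ "$value" = "1" ]; then
        echo "overcommit_memory=1 (ok)"
        return 0
    fi
    echo "overcommit_memory=$value (expected 1)"
    return 1
}
```

Note that a value written to `/proc` does not survive a reboot. On most Linux distributions the same setting can be made persistent by adding `vm.overcommit_memory = 1` to `/etc/sysctl.conf` and running `sysctl -p` as root.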

Getting Started

To try it out, run:

  1. compile and download all dependencies

    ant
    
  2. start cassandra with built in job/task trackers

    ./bin/brisk cassandra -t
    
  3. view jobtracker

    http://localhost:50030
    
  4. examine CassandraFS

    ./bin/brisk hadoop fs -lsr cfs:///
    
  5. start hive shell or webUI

    ./bin/brisk hive
    

    or

    ./bin/brisk hive --service hwi
    

    then open a web browser to http://localhost:9999/hwi
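The job tracker page from step 3 may not respond immediately after step 2. If you script these steps, a small retry loop can wait for it before moving on. This is only a sketch: `wait_until` is a hypothetical helper, and the example in the trailing comment assumes `curl` is installed and that the node from step 2 is already starting.

```shell
#!/bin/sh
# Hypothetical helper (not part of Brisk).
# wait_until TRIES CMD...: run CMD up to TRIES times, sleeping one second
# after each failed attempt. Returns 0 as soon as CMD succeeds, 1 otherwise.
wait_until() {
    tries="$1"
    shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example: wait up to 60 attempts for the job tracker, then list CassandraFS.
#   wait_until 60 curl -sf http://localhost:50030/ && ./bin/brisk hadoop fs -lsr cfs:///
```

The probe command is passed in as arguments, so the same loop can wait on the Hive web UI port from step 5 as well.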
