
Added failover, license, changed data model.

1 parent 65be44a commit dee783eb8c25783ac89c801c10ee7d9ae5672335 @thobbs committed Sep 8, 2010
Showing with 263 additions and 90 deletions.
  1. +20 −0 LICENSE
  2. +31 −9 README
  3. +212 −81 src/java/org/apache/cassandra/plugins/SimpleCassandraSink.java
20 LICENSE
@@ -0,0 +1,20 @@
+Everything else:
+http://www.opensource.org/licenses/mit-license.php
+
+Copyright (c) 2010 Tyler Hobbs
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this
+software and associated documentation files (the "Software"), to deal in the Software
+without restriction, including without limitation the rights to use, copy, modify, merge,
+publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
+to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or
+substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
+INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
+PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
+FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
40 README
@@ -40,17 +40,39 @@ in any terminals you will use.
Usage
-----
+
+This plugin primarily targets log storage right now.
+
The Cassandra sink requires four arguments for its constructor:
-1. A Cassandra server hostname (String)
-2. The Cassandra server port (int)
-3. A keyspace (String). For example, 'Keyspace1'.
-4. A ColumnFamily (String).
+1. A keyspace (String). For example, 'Keyspace1'.
+2. A column family name (String) for storing data in.
+3. A column family name (String) for storing indexes in.
+4. A list of Cassandra server hostname:port combinations (Strings)
-When the Cassandra sink receives an event, it does the following:
+The keyspace and both column families must already exist in Cassandra before
+the sink is used. The index column family should use a TimeUUIDType
+comparator. For example, in cassandra.yaml you would have:
-1. Creates a column where the name is a type 1 UUID (timestamp based) and the
-value is the event body.
-2. Inserts it into row "YYYYMMDD" (the current date) in the given ColumnFamily.
+ - name: FlumeIndexes
+ compare_with: TimeUUIDType
+ comment: 'Stores the v1 uuids for log events'
+
+The data storage column family can use BytesType.
+
+When the Cassandra sink receives an event, it does the following:
-As you might guess, this is primarily targets log storage right now.
+1. In the index column family:
+ a. Creates a column where the name is a type 1 UUID (timestamp based) and the
+ value is empty.
+ b. Inserts it into row "YYYYMMDDHH" (the current date and hour) in the
+ given column family.
+2. In the data column family:
+ a. Creates a column where the name is 'data' and the value is the
+ flume event body.
+ b. Inserts it into a row with a key that is the same uuid from step 1.
+
+This allows you to easily fetch all logs for a slice of time. Simply use
+something like get_slice() on the index column family to get the uuids you
+want for a particular slice of time, and then multiget the data column
+family using those uuids as the keys.
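
The layout the updated README describes can be sketched as a small in-memory model. This is only an illustration of the two-column-family idea, not the code in SimpleCassandraSink.java: the class name FlumeLayoutSketch, the methods append() and fetch(), the use of UTC for the "YYYYMMDDHH" row key, and the simplified type 1 UUID generator (a counter in place of a real clock sequence and node id) are all assumptions made for the example. Plain Java maps stand in for Cassandra, so no client API is involved.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory stand-in for the index and data column families. */
class FlumeLayoutSketch {
    // 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;
    // Assumption: the hourly row key is computed in UTC.
    static final DateTimeFormatter HOUR_KEY =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);
    static final AtomicLong SEQ = new AtomicLong();

    // Orders columns by embedded timestamp, roughly what TimeUUIDType does.
    static final Comparator<UUID> TIME_ORDER = (a, b) -> {
        int c = Long.compare(a.timestamp(), b.timestamp());
        return c != 0 ? c : a.compareTo(b);
    };

    // Index CF: row "YYYYMMDDHH" -> columns (name = type 1 UUID, value empty).
    final Map<String, TreeMap<UUID, byte[]>> indexCf = new HashMap<>();
    // Data CF: row key = the same UUID -> column 'data' holding the event body.
    final Map<UUID, Map<String, byte[]>> dataCf = new HashMap<>();

    static long toUuidTime(long unixMillis) {
        return unixMillis * 10_000L + UUID_EPOCH_OFFSET;
    }

    /** Simplified version 1 UUID; a counter replaces clock sequence and node. */
    static UUID timeUuid(long unixMillis) {
        long ts = toUuidTime(unixMillis);
        long msb = ((ts & 0xFFFFFFFFL) << 32)        // time_low
                 | (((ts >>> 32) & 0xFFFFL) << 16)   // time_mid
                 | 0x1000L                           // version 1
                 | ((ts >>> 48) & 0x0FFFL);          // time_hi
        long lsb = 0x8000000000000000L               // RFC 4122 variant bits
                 | (SEQ.incrementAndGet() & 0x3FFFFFFFFFFFFFFFL);
        return new UUID(msb, lsb);
    }

    /** Write path: one index column plus one data row per event. */
    UUID append(long unixMillis, byte[] body) {
        UUID id = timeUuid(unixMillis);
        String rowKey = HOUR_KEY.format(Instant.ofEpochMilli(unixMillis));
        indexCf.computeIfAbsent(rowKey, k -> new TreeMap<>(TIME_ORDER))
               .put(id, new byte[0]);
        Map<String, byte[]> row = new HashMap<>();
        row.put("data", body);
        dataCf.put(id, row);
        return id;
    }

    /** Read path: slice one index row by time, then look up each UUID. */
    List<byte[]> fetch(String hourKey, long fromMillis, long toMillis) {
        List<byte[]> bodies = new ArrayList<>();
        TreeMap<UUID, byte[]> row = indexCf.get(hourKey);
        if (row == null) {
            return bodies;
        }
        long from = toUuidTime(fromMillis);
        long to = toUuidTime(toMillis);
        // A real client would issue a column range query instead of scanning.
        for (UUID id : row.keySet()) {
            long ts = id.timestamp();
            if (ts >= from && ts < to) {
                bodies.add(dataCf.get(id).get("data"));
            }
        }
        return bodies;
    }
}
```

In this sketch, fetch() plays the role of get_slice() on the index row followed by a multiget on the data column family. Because each index row covers a single hour, a query spanning several hours would need one such slice per "YYYYMMDDHH" key.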