Updated for Hypertable version 0.9.7.12; Performance improvements #143

Open · wants to merge 1 commit

2 participants: Doug Judd, weston platter

Doug Judd commented:

Hi Brian,

I'm the maintainer of Hypertable. I've updated the Hypertable YCSB driver for the latest stable version of Hypertable (0.9.7.12). I also made a number of performance improvements to the driver and updated the README. Can you please pull in this commit? Thank you!

  • Doug
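
The performance changes in this commit mostly come from reusing per-request objects and caching fully qualified column names. As a rough illustration of the caching idea only (this is not the driver code itself; the class and method names here are made up), a minimal standalone sketch:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Set;

    // Illustrative sketch: cache qualified column names ("family:field") so
    // the read/scan hot path does not rebuild the same strings on every
    // request. The driver in this commit applies the same idea, alongside
    // reusing its ScanSpec, RowInterval and SerializedCellsReader objects.
    public class QualifiedColumnCache {
        private final String columnFamily;
        private final HashMap<String, String> fieldMap =
            new HashMap<String, String>();
        private final List<String> columns = new ArrayList<String>();

        public QualifiedColumnCache(String columnFamily) {
            this.columnFamily = columnFamily;
        }

        // Returns the qualified column list for one request, reusing the
        // same backing list and any previously built names.
        public List<String> columnsFor(Set<String> fields) {
            columns.clear();
            for (String field : fields) {
                String qualified = fieldMap.get(field);
                if (qualified == null) {
                    qualified = columnFamily + ":" + field;
                    fieldMap.put(field, qualified);
                }
                columns.add(qualified);
            }
            return columns;
        }
    }

As in the driver, the returned list is reused across calls, so a caller must consume it before issuing the next request.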
Commits on Oct 9, 2013
hypertable/README (84 changed lines)
@@ -1,84 +0,0 @@
-1 Install Hypertable
-
-Installation instructions for Hypertable can be found at:
-
-code.google.com/p/hypertable/wiki/HypertableManual
-
-
-
-2 Set Up YCSB
-
-Clone the YCSB git repository and compile:
-
-]$ git clone git://github.com/brianfrankcooper/YCSB.git
-]$ cd YCSB
-]$ mvn clean package
-
-
-
-3 Run Hypertable
-
-Once it has been installed, start Hypertable by running
-
-]$ ./bin/ht start all-servers hadoop
-
-if an instance of HDFS is running or
-
-]$ ./bin/ht start all-servers local
-
-if the database is backed by the local file system. YCSB accesses
-a table called 'usertable' by default. Create this table through the
-Hypertable shell by running
-
-]$ ./bin/ht shell
-hypertable> use '/ycsb';
-hypertable> create table usertable(family);
-hypertable> quit
-
-All interactions by YCSB take place under the Hypertable namespace '/ycsb'.
-Hypertable also uses an additional data grouping structure called a column
-family that must be set. YCSB doesn't offer fine grained operations on
-column families so in this example the table is created with a single
-column family named 'family' to which all column families will belong.
-The name of this column family must be passed to YCSB. The table can be
-manipulated from within the hypertable shell without interfering with the
-operation of YCSB.
-
-
-
-4 Run YCSB
-
-Make sure that an instance of Hypertable is running. To access the database
-through the YCSB shell, from the YCSB directory run:
-
-]$ ./bin/ycsb shell hypertable -p columnfamily=family
-
-where the value passed to columnfamily matches that used in the table
-creation. To run a workload, first load the data:
-
-]$ ./bin/ycsb load hypertable -P workloads/workloada -p columnfamily=family
-
-Then run the workload:
-
-]$ ./bin/ycsb run hypertable -P workloads/workloada -p columnfamily=family
-
-This example runs the core workload 'workloada' that comes packaged with YCSB.
-The state of the YCSB data in the Hypertable database can be reset by dropping
-usertable and recreating it.
-
-
-
-+ Configuration Parameters
-
-Hypertable configuration settings can be found in conf/hypertable.cfg under
-your main hypertable directory. Make sure that the constant THRIFTBROKER_PORT
-in the class HypertableClient matches the setting ThriftBroker.Port in
-hypertable.cfg.
-
-To change the amount of data returned on each call to the ThriftClient on
-a Hypertable scan, one must add a new parameter to hypertable.cfg. Include
-ThriftBroker.NextThreshold=x where x is set to the size desired in bytes.
-The default setting of this parameter is 128000.
-
-To alter the Hypertable namespace YCSB operates under, change the constant
-NAMESPACE in the class HypertableClient.
hypertable/README.md (75 changed lines)
@@ -0,0 +1,75 @@
+## Quick Start
+
+
+### Before you get started -- a brief note about the ThriftBroker
+
+The Hypertable _ThriftBroker_ process is what allows Java clients to communicate
+with Hypertable and should be installed on all YCSB client machines. The
+Hypertable YCSB driver expects there to be a ThriftBroker running on localhost,
+to which it connects. A ThriftBroker will automatically be started on all
+Hypertable machines that run either a Master or RangeServer. If you plan to run
+the ycsb command script on machines that are **not** running either a Master or
+RangeServer, then you'll need to install Hypertable on those machines and
+configure them to start ThriftBrokers. To do that, when performing step 3 of
+the installation instructions below, list these additional client machines under
+the :thriftbroker_additional role in the Capfile. For example:
+
+ role :thriftbroker_additional, ycsb-client00, ycsb-client01, ...
+
+
+### Step 1. Install Hypertable
+
+See [Hypertable Installation Guide]
+(http://hypertable.com/documentation/installation/quick_start_cluster_installation/)
+for the complete set of install instructions.
+
+
+### Step 2. Set Up YCSB
+
+Clone the YCSB git repository and compile:
+
+ git clone git://github.com/brianfrankcooper/YCSB.git
+ cd YCSB
+ mvn clean package
+
+
+### Step 3. Run Hypertable
+
+Once it has been installed, start Hypertable by running the following command in
+the directory containing your Capfile:
+
+ cap start
+
+Then create the _ycsb_ namespace and a table called _usertable_:
+
+ echo "create namespace ycsb;" | /opt/hypertable/current/bin/ht shell --batch
+ echo "use ycsb; create table usertable (family);" | /opt/hypertable/current/bin/ht shell --batch
+
+All interactions by YCSB take place under the Hypertable namespace *ycsb*.
+Hypertable also uses an additional data grouping structure called a column
+family that must be set. YCSB doesn't offer fine grained operations on column
+families so in this example the table is created with a single column family
+named _family_ to which all column families will belong. The name of this
+column family must be passed to YCSB. The table can be manipulated from within
+the hypertable shell without interfering with the operation of YCSB.
+
+
+### Step 4. Run YCSB
+
+Make sure that an instance of Hypertable is running. To access the database
+through the YCSB shell, from the YCSB directory run:
+
+ ./bin/ycsb shell hypertable -p columnfamily=family
+
+where the value passed to columnfamily matches that used in the table
+creation. To run a workload, first load the data:
+
+ ./bin/ycsb load hypertable -P workloads/workloada -p columnfamily=family
+
+Then run the workload:
+
+ ./bin/ycsb run hypertable -P workloads/workloada -p columnfamily=family
+
+This example runs the core workload _workloada_ that comes packaged with YCSB.
+The state of the YCSB data in the Hypertable database can be reset by dropping
+usertable and recreating it.
hypertable/pom.xml (4 changed lines)
@@ -20,10 +20,10 @@
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libthrift</artifactId>
- <version>${thrift.version}</version>
+ <version>0.8.0</version>
</dependency>
<dependency>
- <groupId>org.hypertable</groupId>
+ <groupId>com.hypertable</groupId>
<artifactId>hypertable</artifactId>
<version>${hypertable.version}</version>
</dependency>
hypertable/src/main/java/com/yahoo/ycsb/db/HypertableClient.java (130 changed lines)
@@ -17,7 +17,9 @@
import java.nio.ByteBuffer;
+import java.util.ArrayList;
import java.util.HashMap;
+import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.Vector;
@@ -50,6 +52,18 @@
private String _columnFamily = "";
+ private HashMap<String, String> mFieldMap= new HashMap<String, String>();
+
+ private ScanSpec mSpec = new ScanSpec();
+
+ private RowInterval mRowInterval = new RowInterval();
+
+ private List<String> mColumns = new ArrayList<String>();
+
+ private SerializedCellsReader mReader = new SerializedCellsReader(null);
+
+ private Cell mDeleteCell = new Cell();
+
public static final int OK = 0;
public static final int SERVERERROR = -1;
@@ -93,6 +107,13 @@ public void init() throws DBException
"columnfamily for Hypertable table");
throw new DBException("No columnfamily specified");
}
+
+ List<RowInterval> rowIntervals = new ArrayList<RowInterval>();
+ rowIntervals.add(mRowInterval);
+ mSpec.setRow_intervals(rowIntervals);
+ mSpec.setVersions(1);
+
+ mDeleteCell.key = new Key();
}
/**
@@ -136,20 +157,33 @@ public int read(String table, String key, Set<String> fields,
try {
if (null != fields) {
- Vector<HashMap<String, ByteIterator>> resMap =
- new Vector<HashMap<String, ByteIterator>>();
- if (0 != scan(table, key, 1, fields, resMap)) {
- return SERVERERROR;
- }
- if (!resMap.isEmpty())
- result.putAll(resMap.firstElement());
- } else {
- SerializedCellsReader reader = new SerializedCellsReader(null);
- reader.reset(connection.get_row_serialized(ns, table, key));
- while (reader.next()) {
- result.put(new String(reader.get_column_qualifier()),
- new ByteArrayByteIterator(reader.get_value()));
+
+ // Set the start row and end row to key
+ mRowInterval.setStart_row(key);
+ mRowInterval.setEnd_row(key);
+
+ String qualifiedColumn;
+ mColumns.clear();
+ for (String field : fields) {
+ qualifiedColumn = mFieldMap.get(field);
+ if (qualifiedColumn == null) {
+ qualifiedColumn = _columnFamily + ":" + field;
+ mFieldMap.put(field, qualifiedColumn);
}
+ mColumns.add(qualifiedColumn);
+ }
+ mSpec.setColumns(mColumns);
+
+ mSpec.unsetRow_limit();
+
+ mReader.reset(connection.get_cells_serialized(ns, table, mSpec));
+ }
+ else {
+ mReader.reset(connection.get_row_serialized(ns, table, key));
+ }
+ while (mReader.next()) {
+ result.put(new String(mReader.get_column_qualifier()),
+ new ByteArrayByteIterator(mReader.get_value()));
}
} catch (ClientException e) {
if (_debug) {
@@ -184,41 +218,50 @@ public int scan(String table, String startkey, int recordcount,
{
//SELECT _columnFamily:fields FROM table WHERE (ROW >= startkey)
// LIMIT recordcount MAX_VERSIONS 1;
-
- ScanSpec spec = new ScanSpec();
- RowInterval elem = new RowInterval();
- elem.setStart_inclusive(true);
- elem.setStart_row(startkey);
- spec.addToRow_intervals(elem);
- if (null != fields) {
- for (String field : fields) {
- spec.addToColumns(_columnFamily + ":" + field);
+
+ // Set the start row to startkey and clear the end row
+ mRowInterval.setStart_row(startkey);
+ mRowInterval.setEnd_row(null);
+
+ // Set the column list (null for all columns)
+ if (fields == null) {
+ mSpec.unsetColumns();
+ }
+ else {
+ String qualifiedColumn;
+ mColumns.clear();
+ for (String field : fields) {
+ qualifiedColumn = mFieldMap.get(field);
+ if (qualifiedColumn == null) {
+ qualifiedColumn = _columnFamily + ":" + field;
+ mFieldMap.put(field, qualifiedColumn);
}
+ mColumns.add(qualifiedColumn);
+ }
+ mSpec.setColumns(mColumns);
}
- spec.setVersions(1);
- spec.setRow_limit(recordcount);
- SerializedCellsReader reader = new SerializedCellsReader(null);
+ // Set the LIMIT
+ mSpec.setRow_limit(recordcount);
try {
- long sc = connection.scanner_open(ns, table, spec);
+ long sc = connection.scanner_open(ns, table, mSpec);
String lastRow = null;
boolean eos = false;
while (!eos) {
- reader.reset(connection.scanner_get_cells_serialized(sc));
- while (reader.next()) {
- String currentRow = new String(reader.get_row());
+ mReader.reset(connection.scanner_get_cells_serialized(sc));
+ while (mReader.next()) {
+ String currentRow = new String(mReader.get_row());
if (!currentRow.equals(lastRow)) {
result.add(new HashMap<String, ByteIterator>());
lastRow = currentRow;
}
result.lastElement().put(
- new String(reader.get_column_qualifier()),
- new ByteArrayByteIterator(reader.get_value()));
+ new String(mReader.get_column_qualifier()),
+ new ByteArrayByteIterator(mReader.get_value()));
}
- eos = reader.eos();
-
+ eos = mReader.eos();
if (_debug) {
System.out.println("Number of rows retrieved so far: " +
@@ -279,17 +322,14 @@ public int insert(String table, String key,
}
try {
- long mutator = connection.mutator_open(ns, table, 0, 0);
SerializedCellsWriter writer =
new SerializedCellsWriter(BUFFER_SIZE*values.size(), true);
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
- writer.add(key, _columnFamily, entry.getKey(),
- SerializedCellsFlag.AUTO_ASSIGN,
- ByteBuffer.wrap(entry.getValue().toArray()));
+ writer.add(key, _columnFamily, entry.getKey(),
+ SerializedCellsFlag.AUTO_ASSIGN,
+ ByteBuffer.wrap(entry.getValue().toArray()));
}
- connection.mutator_set_cells_serialized(mutator,
- writer.buffer(), true);
- connection.mutator_close(mutator);
+ connection.set_cells_serialized(ns, table, writer.buffer());
} catch (ClientException e) {
if (_debug) {
System.err.println("Error doing set: " + e.message);
@@ -320,13 +360,11 @@ public int delete(String table, String key)
System.out.println("Doing delete for key: "+key);
}
- Cell entry = new Cell();
- entry.key = new Key();
- entry.key.row = key;
- entry.key.flag = KeyFlag.DELETE_ROW;
-
+ mDeleteCell.key.row = key;
+ mDeleteCell.key.flag = KeyFlag.DELETE_ROW;
+
try {
- connection.set_cell(ns, table, entry);
+ connection.set_cell(ns, table, mDeleteCell);
} catch (ClientException e) {
if (_debug) {
System.err.println("Error doing delete: " + e.message);
pom.xml (2 changed lines)
@@ -55,7 +55,7 @@
<voldemort.version>0.81</voldemort.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<thrift.version>0.8.0</thrift.version>
- <hypertable.version>0.9.5.6</hypertable.version>
+ <hypertable.version>0.9.7.12</hypertable.version>
</properties>
<modules>