version conflict merged

commit cd7b37e4f28324927a0696b103579b6f049b307b 2 parents ca3e4a9 + 3d3de86
@Jae Jae authored
Showing with 64 additions and 4 deletions.
  1. +0 −2  README
  2. +54 −0 README.md
  3. +1 −1  package.json
  4. +9 −1 result_set.js
2  README
@@ -1,2 +0,0 @@
-Thrift module has been cloned from https://github.com/wadey/node-thrift/commit/25c0eb4eb85aa63cfb49a8e8c815bd57e2b8043a
-nodejs binding is generated from hive 0.6.1 CDH3B4
54 README.md
@@ -0,0 +1,54 @@
+Node Bindings for Hadoop Hive
+=============================
+
+Installation
+------------
+
+    npm install node-hive
+
+Usage
+-----
+
+    var hive = require('node-hive').for({ server:"hive.myserver" });
+
+    hive.fetch("SELECT * FROM my_table", function(err, data) {
+      data.each(function(record) {
+        console.log(record);
+      });
+    });
+
+Hive instances currently support the following functions:
+
+    hive.fetch(query, callback);
+    hive.fetchInBatch(batchSize, query, callback);
+    hive.execute(query, [callback]);
+
+Query callbacks receive two arguments...
+
+* `error` which is `true` if there was an error
+* `result` which is either a `ResultSet` or an error message depending on the state of `error`
+
+The result of a query is returned as a `ResultSet` which wraps the results with some convenience functions...
+
+* `result.rows` - The original string-based rows returned by Thrift.
+* `result.schema` - The schema returned from Hive.
+* `result.each(callback)` - Iterate through the rows, converting each one to a friendly JS object.
+* `result.headers()` - An Array of the column headers.
+* `result.toTSV(headers=false)` - Produce a TSV version of the whole ResultSet.
+
+
+See the `examples` folder for some more usage hints.
+
+
+Connections
+-----------
+
+The Hive Thrift Server currently supports only one blocking query at a time. Because Node's async model makes it natural to run several queries at once, each query runs on its own new connection, which is closed when the query completes. There is currently no support for connection pooling, since most users run a small number of long-running Hive queries, but pooling should be possible if and when it's needed.
+
+
+Notes
+-----
+
+Thrift module has been cloned from https://github.com/wadey/node-thrift/commit/25c0eb4eb85aa63cfb49a8e8c815bd57e2b8043a
+
+Node.js bindings are generated from Hive 0.6.1 CDH3B4
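
The README above says query callbacks receive `error` (which is `true` on failure) and `result` (a `ResultSet` or an error message), but its usage example never checks `error`. A minimal sketch of handling both cases follows. The `fakeHive` object and the `collect` helper are hypothetical stand-ins invented so the snippet runs without a Hive server; only the `fetch(query, callback)` shape and `result.each` come from the README.

```javascript
// Hypothetical stand-in for a node-hive instance (assumption, not the real client).
// It mimics the documented callback convention: (error, result).
var fakeHive = {
  fetch: function (query, callback) {
    if (/^\s*SELECT/i.test(query)) {
      // Success: error is false and result behaves like a ResultSet.
      callback(false, {
        rows: ['1\ta', '2\tb'],
        each: function (cb) {
          this.rows.forEach(function (r) { cb(r); });
        }
      });
    } else {
      // Failure: error is true and result is an error message.
      callback(true, 'unsupported query: ' + query);
    }
  }
};

// Hypothetical helper: adapts the (true, message) convention to the more
// common Node style of (Error, value) and gathers every record into an array.
function collect(hive, query, done) {
  hive.fetch(query, function (error, result) {
    if (error) return done(new Error(String(result)));
    var records = [];
    result.each(function (record) { records.push(record); });
    done(null, records);
  });
}

collect(fakeHive, 'SELECT * FROM my_table', function (err, records) {
  if (err) return console.error(err.message);
  console.log(records.length + ' records');
});
```

Wrapping the message in an `Error` is a design choice of this sketch, not something the library does; it keeps downstream code from having to remember that `result` changes meaning depending on `error`.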
2  package.json
@@ -3,7 +3,7 @@
"contributors":["Jae Lee <jlee@yetitrails.com>", "Antonio Terreno <antonio.terreno@gmail.com>", "Andy Kent"],
"name": "node-hive",
"description": "Node Hive Client Library",
- "version": "0.0.5",
+ "version": "0.1.1",
"homepage": "https://github.com/forward/node-hive",
"repository": {
"type": "git",
10 result_set.js
@@ -8,8 +8,16 @@ ResultSet.prototype.each = function(cb) {
var rowArray = this.rows[i].split("\t");
var row = {};
var headers = this.headers();
+ var schema = this.schema;
+ var typecast = function(column, stringValue) {
+ var found = schema.filter(function(s) { return s.name === column; });
+ var type = 'string';
+ if(found.length === 1) type = found[0].type;
+ if(type === 'double' || type === 'float' || type === 'int') return Number(stringValue);
+ return stringValue;
+ };
for(var a in rowArray) {
- row[headers[a]] = rowArray[a];
+ row[headers[a]] = typecast(headers[a], rowArray[a]);
}
cb(row);
};
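
The typecasting added in this hunk can be tried in isolation. The sketch below is a standalone rewrite, not the module's code: it assumes schema entries have the shape `{ name, type }` (as the diff's `s.name` and `found[0].type` suggest), and `rowToObject` and `NUMERIC_TYPES` are hypothetical names introduced for illustration.

```javascript
// Standalone sketch of schema-driven typecasting; not the module's actual code.
// Assumption: each schema entry looks like { name: 'col', type: 'int' }.
var NUMERIC_TYPES = ['double', 'float', 'int'];

// Convert a tab-separated string value to a Number when the schema says the
// column is numeric; unknown or ambiguous columns fall back to 'string'.
function typecast(schema, column, stringValue) {
  var found = schema.filter(function (s) { return s.name === column; });
  var type = found.length === 1 ? found[0].type : 'string';
  if (NUMERIC_TYPES.indexOf(type) !== -1) return Number(stringValue);
  return stringValue;
}

// Hypothetical helper: build one record object from a raw TSV row.
function rowToObject(schema, headers, line) {
  var row = {};
  line.split('\t').forEach(function (value, i) {
    row[headers[i]] = typecast(schema, headers[i], value);
  });
  return row;
}

var schema = [
  { name: 'id', type: 'int' },
  { name: 'score', type: 'double' },
  { name: 'label', type: 'string' }
];
var row = rowToObject(schema, ['id', 'score', 'label'], '7\t0.5\t42');
console.log(row); // { id: 7, score: 0.5, label: '42' }
```

Two behaviours are worth keeping in mind when reading the diff. `Number()` returns `NaN` for non-numeric text and `0` for an empty string, so any placeholder value in a numeric column will not survive the cast unchanged. The type check also lists only `double`, `float`, and `int`, so other Hive numeric types such as `bigint`, `smallint`, and `tinyint` would still come back as strings.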