
[IoTDB 226] Hive connector#425

Merged
qiaojialin merged 66 commits into apache:master from JackieTien97:hive-connector
Oct 29, 2019

Conversation

@JackieTien97
Contributor

No description provided.

Contributor

@samperson1997 left a comment

Hi, thanks for your contribution; everything looks good to me.
Having read the User Guide documents you edited and seen your demo based on a real production environment, I believe this will be a fabulous and useful connector. Below I propose some suggestions that do not affect functionality or user interaction, so you may choose to address them in future work.
Really looking forward to the connector tools for Hadoop 3.x and Hive 3.x!

* @return the index of blockLocation or -1 if no block could be found
*/
private int getBlockLocationIndex(BlockLocation[] blockLocations, long middle) {
private static int getBlockLocationIndex(BlockLocation[] blockLocations, long middle, Logger logger) {
Contributor

Why pass Logger logger into the method instead of using the class's own logger? Same question for other places in this file.

Contributor Author

Different callers have different loggers. In order to print the right log, I need to pass the corresponding logger to the function.
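As a rough illustration of the pattern being discussed, here is a minimal, self-contained sketch. The array-based block-location stand-in and java.util.logging are illustrative only; the PR's actual code uses Hadoop's BlockLocation and an slf4j logger.

```java
import java.util.logging.Logger;

public class BlockLocationDemo {
    private static final Logger LOCAL_LOGGER =
            Logger.getLogger(BlockLocationDemo.class.getName());

    // Simplified stand-in for getBlockLocationIndex: each entry is a
    // {start, length} pair; return the index of the entry whose byte range
    // contains 'middle', or -1 if no entry does. The caller passes its own
    // Logger so the emitted log record names the calling class, which is
    // the rationale given in the review reply above.
    static int getBlockLocationIndex(long[][] blockLocations, long middle, Logger logger) {
        for (int i = 0; i < blockLocations.length; i++) {
            long start = blockLocations[i][0];
            long length = blockLocations[i][1];
            if (middle >= start && middle < start + length) {
                return i;
            }
        }
        logger.warning("no block found containing offset " + middle);
        return -1;
    }

    public static void main(String[] args) {
        long[][] blocks = {{0, 128}, {128, 128}};
        System.out.println(getBlockLocationIndex(blocks, 130, LOCAL_LOGGER)); // 1
        System.out.println(getBlockLocationIndex(blocks, 999, LOCAL_LOGGER)); // -1
    }
}
```

The trade-off: a static helper with a Logger parameter avoids duplicating the search logic per caller while still attributing log output to the right class, at the cost of a slightly wider signature.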

* <code>Mapper</code> task.
*/
public class TSFInputSplit extends InputSplit implements Writable {
public class TSFInputSplit extends FileSplit implements Writable, org.apache.hadoop.mapred.InputSplit {
Contributor

How about adding import org.apache.hadoop.mapred.InputSplit; at the top of this file?

Suggested change
public class TSFInputSplit extends FileSplit implements Writable, org.apache.hadoop.mapred.InputSplit {
public class TSFInputSplit extends FileSplit implements Writable, InputSplit {

@@ -0,0 +1,51 @@
/**
Contributor

@samperson1997 Oct 23, 2019

We are using block-comment style for the Apache License instead of JavaDoc-style comments due to this PR. Maybe you could change all the new files' Apache License headers in the next PR, or we may face some problems when generating JavaDoc.
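The distinction the comment refers to, sketched in a minimal hypothetical class (the class name and method are illustrative, not from the PR):

```java
/*
 * Block-comment style: JavaDoc ignores a header written like this, so a
 * license placed here never appears in generated documentation.
 */
public class LicenseHeaderDemo {

    /**
     * By contrast, a slash-double-star comment like this one is a JavaDoc
     * comment and gets attached to the following element. A license header
     * written in this style at the top of a file can therefore leak into
     * the generated docs, which is the problem the reviewer anticipates.
     */
    public static String greet() {
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(greet());
    }
}
```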

import java.util.Objects;

public class TsFileDeserializer {
private static final Logger LOG = LoggerFactory.getLogger(TsFileDeserializer.class);
Contributor

Use logger instead of LOG to be consistent with other files.

continue;
}
if (columnType.getCategory() != ObjectInspector.Category.PRIMITIVE) {
throw new TsFileSerDeException("Unknown TypeInfo: " + columnType.getCategory());
Contributor

Maybe you could add logger.error here so that users can know the exact error position. Same with other exceptions below.

Suggested change
throw new TsFileSerDeException("Unknown TypeInfo: " + columnType.getCategory());
logger.error("Unknown TypeInfo: {}", columnType.getCategory());
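The pattern the reviewer is suggesting, log at the throw site and then still throw, sketched as a self-contained example. The method name and java.util.logging are illustrative; the project itself uses slf4j and TsFileSerDeException.

```java
import java.util.logging.Logger;

public class LogThenThrowDemo {
    private static final Logger logger =
            Logger.getLogger(LogThenThrowDemo.class.getName());

    // Hypothetical stand-in for the type check in TsFileDeserializer:
    // accept only a "PRIMITIVE" category, otherwise log and throw.
    static int checkCategory(String category) {
        if (!"PRIMITIVE".equals(category)) {
            // Logging before throwing pinpoints the exact error position in
            // the connector's own log, while the exception still propagates
            // so the caller sees the failure too.
            logger.severe("Unknown TypeInfo: " + category);
            throw new IllegalArgumentException("Unknown TypeInfo: " + category);
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(checkCategory("PRIMITIVE")); // 0
        try {
            checkCategory("STRUCT");
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Note that logging and then throwing the same condition can double-report the error if a caller also logs it; projects usually pick one consistent convention.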

try {
writer.write(((HDFSTSRecord)writable).convertToTSRecord());
} catch (WriteProcessException e) {
throw new IOException(String.format("Write tsfile record error %s", e));
Contributor

Maybe you could add logger.error here so that users can know the exact error position. Same with other exceptions.

Suggested change
throw new IOException(String.format("Write tsfile record error %s", e));
logger.error("Write tsfile record error: {}", e);

* One example for reading TsFile with MapReduce.
* This MR Job is used to get the result of sum("device_1.sensor_3") in the tsfile.
* The source of tsfile can be generated by <code>TsFileHelper</code>.
* @author Yuan Tian
Contributor
Don't forget to remove author here ; )

Contributor

@LeiRui left a comment

I followed the user guide and checked all the instructions mentioned. Everything works (on my Hadoop 2.7.7 & Hive 2.3.6).

Contributor

@LeiRui commented Oct 26, 2019

The Travis CI build failed. Try merging master into your PR to solve the problem.

@JackieTien97
Contributor Author

The Travis CI build failed. Try merging master into your PR to solve the problem.

[screenshot: Travis CI failure log]

It seems there are some problems with the Windows environment.

@qiaojialin qiaojialin merged commit ea8e23d into apache:master Oct 29, 2019
6 participants