[CARBONDATA-3127] Fix the HiveExample & TestCarbonSerde exception #2969

Closed
wants to merge 38 commits into from

Conversation

SteNicholas (Member) commented Dec 2, 2018

[CARBONDATA-3127] The Hive module test cases have been commented out and cannot run.
This pull request fixes the TestCarbonSerde test case failure under the Maven build. The deserializeAndSerializeLazySimple method of the TestCarbonSerde class calls serializeStartKey, which does not exist in the CarbonHiveSerDe class; the fix replaces serializeStartKey with serialize and removes the comment markers from TestCarbonSerde.
The previous hive-integration HiveExample also has an obvious bug: selecting from the HIVE_CARBON_EXAMPLE table returns no data. That is fixed in this pull request as well. The location was configured incorrectly after creating the Hive table, and the block lacked the detail info and footer offset needed when querying the Hive table, so the block's detail info, including the footer offset, is now set through the input split.
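For illustration, a minimal sketch of the corrected round-trip follows; the class wrapper and helper name are illustrative, and only serialize, deserialize, and getObjectInspector come from the Hive SerDe contract that CarbonHiveSerDe implements:

import org.apache.carbondata.hive.CarbonHiveSerDe;
import org.apache.hadoop.hive.serde2.SerDeException;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Writable;

public class SerDeRoundTripSketch {
  // Round-trips a row through the SerDe. The old test called a
  // serializeStartKey() method that no longer exists on CarbonHiveSerDe;
  // the fix calls the standard serialize() instead.
  static Object roundTrip(CarbonHiveSerDe serDe, ArrayWritable row)
      throws SerDeException {
    ObjectInspector oi = serDe.getObjectInspector();
    Writable serialized = serDe.serialize(row, oi); // was: serDe.serializeStartKey(row, oi)
    return serDe.deserialize(serialized);
  }
}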

  • Any interfaces changed?
    No
  • Any backward compatibility impacted?
    No
  • Document update required?
    No
  • Testing done
    Modified the TestCarbonSerde test.
  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    No

SteNicholas (Member, Author)

@xubo245 @zzcclp Please review the small update.

@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1609/

@CarbonDataQA
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9869/

@CarbonDataQA
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1820/

final Properties tbl = new Properties();

// Set the configuration parameters
tbl.setProperty("columns", "ashort,aint,along,adouble,adecimal,astring,alist");
xubo245 (Contributor):

Please take care with the spelling format, including case sensitivity; for example: aShort

SteNicholas (Member, Author):

@xubo245 I have already fixed the spelling format you referred to.
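For reference, the corrected property setup in consistent camelCase would look like this (a sketch of the fix, not the merged diff):

final Properties tbl = new Properties();
// Column names now use consistent camelCase, as requested in review
tbl.setProperty("columns", "aShort,aInt,aLong,aDouble,aDecimal,aString,aList");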


import java.util.Properties;

public class TestCarbonSerde extends TestCase {
xubo245 (Contributor):

Please take care with the spelling format, including case sensitivity; for example: TestCarbonSerDe

Please check the other code as well.

SteNicholas (Member, Author):

@xubo245 I have already fixed the spelling format you referred to.

arr[0] = new ShortWritable((short) 456);
arr[1] = new IntWritable(789);
arr[2] = new LongWritable(1000l);
arr[3] = new DoubleWritable((double) 5.3);
xubo245 (Contributor):

No need for the (double) cast; please remove it.

SteNicholas (Member, Author):

@xubo245 I have already updated this conversion.
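The resulting initialization, with the redundant cast dropped:

arr[3] = new DoubleWritable(5.3); // the literal is already a double, no cast needed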

SteNicholas (Member, Author)

@zzcclp Please review this update.

@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1610/

@CarbonDataQA
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9870/

@CarbonDataQA
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1821/

zzcclp (Contributor) commented Dec 3, 2018:

LGTM

arr[2] = new LongWritable(1000l);
arr[3] = new DoubleWritable(5.3);
arr[4] = new HiveDecimalWritable(HiveDecimal.create(1));
arr[5] = new Text("carbonSerde binary".getBytes("UTF-8"));
xubo245 (Contributor):

Please take care with the spelling format, including case sensitivity; for example: CarbonSerDe

SteNicholas (Member, Author):

@xubo245 Sorry, I forgot the string spelling format; please review it. Sorry for the inconvenience.

@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1640/

@CarbonDataQA
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9900/

@CarbonDataQA
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1851/

SteNicholas (Member, Author)

@zzcclp Please review this update.

xubo245 (Contributor) commented Dec 5, 2018:

LGTM
zzcclp (Contributor) commented Dec 5, 2018:

LGTM


import java.util.Properties;

public class TestCarbonSerDe extends TestCase {
xuchuanyin (Contributor):

Please refer to the other tests in Carbon, such as those in the carbon-core module. Actually, we do not need to extend TestCase here.

SteNicholas (Member, Author):

I have already removed "extends TestCase" here and updated the method's asserts.
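A minimal sketch of the reworked shape, in the JUnit 4 style used elsewhere in Carbon (the test body is elided):

import org.junit.Test;

public class TestCarbonSerDe { // no longer extends junit.framework.TestCase
  @Test
  public void testCarbonHiveSerDe() throws Exception {
    // build the SerDe and the test row here, then verify with
    // org.junit.Assert.assertEquals(...) instead of TestCase's inherited asserts
  }
}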

public void testCarbonHiveSerDe() throws Throwable {
try {
// Create the SerDe
System.out.println("test: testCarbonHiveSerDe");
xuchuanyin (Contributor):

It's not recommended to use stdout in test code; currently we use stdout only in example code, not test code. If you want to print something, use the Carbon Logger instead.

SteNicholas (Member, Author):

Yeah, I have already replaced stdout with the Logger, in the same way as the other Carbon test code.
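A sketch of the replacement, assuming Carbon's LogServiceFactory (the exact logger type it returns differs between Carbon versions):

import org.apache.carbondata.common.logging.LogServiceFactory;

public class TestCarbonSerDe {
  // Carbon's factory wraps the logging backend used across the code base
  private static final org.apache.log4j.Logger LOGGER =
      LogServiceFactory.getLogService(TestCarbonSerDe.class.getName());

  // inside the test, instead of System.out.println(...):
  // LOGGER.info("test: testCarbonHiveSerDe");
}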

System.out.println("test: testCarbonHiveSerDe - OK");

} catch (final Throwable e) {
e.printStackTrace();
xuchuanyin (Contributor):

Use the Logger instead of printing the stack trace in test code.

SteNicholas (Member, Author):

I have already removed this catch-rethrow.

deserializeAndSerializeLazySimple(serDe, arrWritable);
System.out.println("test: testCarbonHiveSerDe - OK");

} catch (final Throwable e) {
xuchuanyin (Contributor):

After observing the following procedure, I think there is no need to catch and rethrow the exception here; the test framework will handle it.

SteNicholas (Member, Author):

Yeah, I think so. I have already removed this pointless catch-rethrow from the method.
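With the wrapper gone, the test can simply declare the exception; a sketch follows (createInitializedSerDe and createArrayWritable are hypothetical helpers standing in for the test's setup code):

@Test
public void testCarbonHiveSerDe() throws Exception {
  // No try/catch wrapper: an uncaught exception fails the test and the
  // framework prints the full stack trace itself
  final CarbonHiveSerDe serDe = createInitializedSerDe();          // hypothetical helper
  deserializeAndSerializeLazySimple(serDe, createArrayWritable()); // hypothetical helper
}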

xuchuanyin (Contributor) commented Dec 5, 2018:

@SteNicholas Nice to see your work on these existing problems. It seems the previous code has some problems that your code inherits, so I suggest fixing them at the same time. Please check the above comments.

@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1670/

@CarbonDataQA
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9930/

@CarbonDataQA
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1881/

@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1672/

ajantha-bhat and others added 21 commits December 21, 2018 01:43
Problem:
Currently, carbon SDK output files (files without a metadata folder and its contents) are read by Spark using an external table with a Carbon session, but the Presto-Carbon integration doesn't support that; it can read only transactional table output files.

Solution:
Enhance Presto to read SDK output files, which widens the use cases for the Presto-Carbon integration. This is achieved by inferring the schema when the metadata folder does not exist, and by setting the read committed scope to LatestFilesReadCommittedScope when non-transactional table output files are present.

This closes #2982
Changed the two Complex Delimiters used to '\001' and '\002'.

This closes #2979
… Describe Formatted

Updated the documentation and added the column compressor used to the output of the Describe Formatted command.

This closes #2986
Support Create DDL for Map type.

This closes #2980
Fix some spelling errors

This closes #2890
…beeline to read/write data from/to S3

This PR fixes a NoClassDefFoundError when using Thrift Server and Beeline with a table on cloud storage.

This closes #2925
Changed the complex delimiter in test cases

This closes #2989
…maps

Changes in this PR:
BlockDatamap and BlockletDatamap can store information for multiple files; each file is one row in that datamap. Non-default datamaps are not organized that way, so default datamaps are distributed across threads based on the number of entries in the datamaps, while for non-default datamaps distribution is based on the number of datamaps (one datamap is treated as one record).

This PR therefore supports block pruning in multiple threads for non-default datamaps.

This closes #2949
Support withTableProperty, withLoadOption, taskNo, uniqueIdentifier, withThreadSafe, withBlockSize, withBlockletSize, localDictionaryThreshold, enableLocalDictionary, and sortBy in the carbon writer interface of the C++ SDK.

This closes #2899
…s Empty

What was the issue?
For SDK write, passing an empty array for a complex type returned null, sending the row to bad records.

What has been changed?
Added a check for an empty array; it now returns an empty array.

This closes #2983
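A hedged sketch of the kind of guard described; the method and its surroundings are illustrative, not the actual SDK converter code:

final class EmptyComplexArraySketch {
  // Previously an empty complex-type array produced null, which routed the
  // whole row to bad records; an explicit empty result keeps the row valid.
  static Object[] convertComplexArray(Object[] input) {
    if (input != null && input.length == 0) {
      return new Object[0]; // empty array is valid data, not a bad record
    }
    return input; // stand-in for the real per-element conversion
  }
}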
Problem
When some pages yield 0 rows after filtering, BlockletScanResult was still uncompressing all pages. When compression is high and a blocklet contains many pages, this page decompression takes significant time and hurts query performance.

Solution
Added a check: if the number of records after filtering is 0, the page is not decompressed.

This closes #2985
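A hedged sketch of the guard described (names are illustrative, not the actual scanned-result code):

final class PageSkipSketch {
  // filteredRowCount[i] holds the rows surviving the filter for page i
  static void scanPages(int[] filteredRowCount) {
    for (int pageId = 0; pageId < filteredRowCount.length; pageId++) {
      if (filteredRowCount[pageId] == 0) {
        continue; // nothing survived filtering: skip the costly decompression
      }
      decompressAndScan(pageId); // stand-in for the real decompress + scan work
    }
  }

  private static void decompressAndScan(int pageId) { }
}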
What was the issue?
After an SDK write, SELECT * failed for 'long_string_columns' with a trailing space.

What has been changed?
Removed the trailing space in the column name.

This closes #2988
…ult sort_scope

What are the changes proposed in this PR?
This PR is based on the discussion in the mail thread below:

http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Make-no-sort-as-default-sort-scope-and-keep-sort-columns-as-empty-by-default-td70233.html

Changes are made to make 'no_sort' the default sort_scope and to keep sort_columns empty by default.
A few issues in the 'no_sort' and empty sort_columns flow are also fixed.

This closes #2966
SDV test cases were failing because the delimiter and the complex delimiter were the same,
so the complex delimiter in the load option was changed.

This closes #3002
Replace Apache Commons Logging with carbondata log4j

This closes #2999
Added datasource test cases for Spark File Format.

This closes #2951
…bloom filter

Problem
java.lang.IllegalAccessError is thrown when querying with a bloom filter without compress on CarbonThriftServer.

Analysis
A similar problem occurred when getting/setting the BitSet in CarbonBloomFilter, and reflection was used to solve it; we could do the same here. But since we have already set the BitSet, an easier way is to call the super class method, avoiding access to it from CarbonBloomFilter.

Solution
If the bloom filter is not compressed, call the super method to test membership.

This closes #3000
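A hedged sketch of the approach; only membershipTest and the super call reflect hadoop's real BloomFilter API, the rest is illustrative:

import java.util.BitSet;
import org.apache.hadoop.util.bloom.BloomFilter;
import org.apache.hadoop.util.bloom.Key;

class CompressAwareBloomFilter extends BloomFilter {
  private boolean compress;      // illustrative flag
  private BitSet compressedBits; // illustrative storage for the compressed case

  @Override
  public boolean membershipTest(Key key) {
    if (!compress) {
      // Delegate to the parent instead of reaching into its private BitSet,
      // which is what raised java.lang.IllegalAccessError on the Thrift Server
      return super.membershipTest(key);
    }
    return testCompressed(key);
  }

  private boolean testCompressed(Key key) {
    // stand-in for the real membership test against the compressed bits
    return compressedBits != null && !compressedBits.isEmpty();
  }
}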
Problem: The global dictionary was not working for the Map data type and was giving null values.

Solution: Added a case so the global dictionary is created when the data type is a complex Map.

This closes #3006
@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1887/

@SteNicholas SteNicholas deleted the CARBONDATA-3127 branch December 20, 2018 18:25
@SteNicholas SteNicholas restored the CARBONDATA-3127 branch December 20, 2018 18:26
@CarbonDataQA
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1888/

@CarbonDataQA
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10142/

@CarbonDataQA
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2097/
