[CARBONDATA-3127] Fix the HiveExample & TestCarbonSerde exception #2969
Conversation
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1609/
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9869/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1820/
final Properties tbl = new Properties();

// Set the configuration parameters
tbl.setProperty("columns", "ashort,aint,along,adouble,adecimal,astring,alist");
Please take care with the spelling and case conventions, for example: aShort
@xubo245 I have already fixed the spelling issues you pointed out.
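The camelCase naming the review asks for can be sketched as follows. This is a minimal stand-in for the test's setup code: the class name SerDePropertiesSketch is hypothetical, and the "columns.types" value is an assumption about the column types, not taken from the PR.

```java
import java.util.Properties;

// Minimal sketch: build the table properties a Hive SerDe expects,
// with consistently camelCased column names as requested in review.
public class SerDePropertiesSketch {

    static Properties buildTableProperties() {
        final Properties tbl = new Properties();
        // "columns" and "columns.types" are the standard Hive SerDe property keys.
        tbl.setProperty("columns", "aShort,aInt,aLong,aDouble,aDecimal,aString,aList");
        // The type list below is illustrative only (assumed types).
        tbl.setProperty("columns.types",
            "smallint:int:bigint:double:decimal(10,0):string:array<int>");
        return tbl;
    }

    public static void main(String[] args) {
        System.out.println(buildTableProperties().getProperty("columns"));
    }
}
```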
import java.util.Properties;

public class TestCarbonSerde extends TestCase {
Please take care with the spelling and case conventions, for example: TestCarbonSerDe
Please check the other code as well.
@xubo245 I have already fixed the spelling issues you pointed out.
arr[0] = new ShortWritable((short) 456);
arr[1] = new IntWritable(789);
arr[2] = new LongWritable(1000l);
arr[3] = new DoubleWritable((double) 5.3);
No need for the (double) cast here; please remove it.
@xubo245 I have already updated this conversion.
@zzcclp Please review this update.
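The literal-style points raised above (no redundant cast, uppercase long suffix) can be illustrated with a tiny self-contained sketch; LiteralSketch is a hypothetical name, not code from the PR.

```java
// Minimal sketch of the literal style the review asks for:
// no redundant casts, and an uppercase suffix for long literals.
public class LiteralSketch {

    // 5.3 is already a double literal, so a (double) cast adds nothing.
    static double aDouble() {
        return 5.3;
    }

    // Prefer 1000L over 1000l: a lowercase l is easily misread as the digit 1.
    static long aLong() {
        return 1000L;
    }

    public static void main(String[] args) {
        System.out.println(aDouble() + " " + aLong());
    }
}
```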
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1610/
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9870/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1821/
LGTM
arr[2] = new LongWritable(1000l);
arr[3] = new DoubleWritable(5.3);
arr[4] = new HiveDecimalWritable(HiveDecimal.create(1));
arr[5] = new Text("carbonSerde binary".getBytes("UTF-8"));
Please take care with the spelling and case conventions, for example: CarbonSerDe
@xubo245 Sorry, I forgot the string's spelling format; please review it. Sorry for the inconvenience.
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1640/
Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9900/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1851/
@zzcclp Please review this update.
LGTM
import java.util.Properties;

public class TestCarbonSerDe extends TestCase {
Please refer to the other tests in Carbon, such as those in the carbon-core module. Actually, we do not need to extend TestCase here.
I have already removed "extends TestCase" here and updated the method's assertions.
public void testCarbonHiveSerDe() throws Throwable {
  try {
    // Create the SerDe
    System.out.println("test: testCarbonHiveSerDe");
It's not recommended to use stdout in test code. Currently we only use stdout in example code, not test code. If you want to print something, use the Carbon Logger instead.
Yeah, I have already replaced stdout with the Logger, the same way as the other Carbon test code.
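The stdout-to-logger replacement discussed above looks roughly like this. CarbonData test code would obtain its logger from the project's own logging facility; java.util.logging stands in here only to keep the sketch dependency-free, and LoggerSketch is a hypothetical class name.

```java
import java.util.logging.Logger;

// Sketch: route test diagnostics through a logger instead of System.out.
// (In CarbonData itself the project's own logging facility would be used;
// java.util.logging is a stand-in so this compiles on its own.)
public class LoggerSketch {

    private static final Logger LOGGER =
        Logger.getLogger(LoggerSketch.class.getName());

    static String startMessage() {
        return "test: testCarbonHiveSerDe";
    }

    public static void main(String[] args) {
        // Instead of System.out.println(startMessage()):
        LOGGER.info(startMessage());
    }
}
```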
    System.out.println("test: testCarbonHiveSerDe - OK");

  } catch (final Throwable e) {
    e.printStackTrace();
Use the Logger instead of printing the stack trace in test code.
I have already removed this catch-rethrow.
    deserializeAndSerializeLazySimple(serDe, arrWritable);
    System.out.println("test: testCarbonHiveSerDe - OK");

  } catch (final Throwable e) {
After observing the following procedure, I think there is no need to catch and rethrow the exception here. The test framework will handle this.
Yeah, I think so. I have already removed this pointless catch-rethrow from the method.
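The before/after shape of that change can be sketched minimally. NoCatchRethrowSketch and checkRoundTrip are hypothetical stand-ins for the real test class and its deserialize/serialize round trip; the point is only that declaring the exception lets the framework report failures with the original stack trace.

```java
// Sketch of removing a catch-rethrow from a test method:
// declare the exception and call straight through.
public class NoCatchRethrowSketch {

    // Hypothetical stand-in for deserializeAndSerializeLazySimple(...).
    static String checkRoundTrip(String value) throws Exception {
        if (value == null) {
            throw new Exception("round trip failed");
        }
        return value;
    }

    // Before: try { ... } catch (Throwable e) { e.printStackTrace(); throw e; }
    // After: no try/catch; the test framework handles any thrown exception.
    public void testRoundTrip() throws Exception {
        checkRoundTrip("carbon");
    }

    public static void main(String[] args) throws Exception {
        new NoCatchRethrowSketch().testRoundTrip();
        System.out.println("ok");
    }
}
```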
@SteNicholas Nice to see your work on these existing problems. It seems the previous code has some problems that are inherited by your code, so I suggest you fix them at the same time. Please check the above comments.
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1670/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9930/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1881/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1672/
problem: Currently, carbon SDK output files (files without a metadata folder and its contents) are read by Spark using an external table with a carbon session, but the presto-carbon integration doesn't support that; it can currently read only transactional table output files. solution: Hence we can enhance presto to read SDK output files. This will increase the use cases for the presto-carbon integration. The above scenario can be achieved by inferring the schema if the metadata folder does not exist and setting the read committed scope to LatestFilesReadCommittedScope if non-transactional table output files are present. This closes #2982
Changed the two Complex Delimiters used to '\001' and '\002'. This closes #2979
… Describe Formatted Updated Document and added column compressor used in Describe Formatted Command This closes #2986
Support Create DDL for Map type. This closes #2980
Fix some spell error This closes #2890
…beeline to read/write data from/to S3 This PR fix NoClassDefFoundError when use thriftServer and beeline to use table on cloud storage This closes #2925
Changed the complex delimiter in testcases This closes #2989
…maps Changes in this PR: BlockDatamap and blockletDatamap can store multiple files information. Each file is one row in that datamap. But non-default datamaps are not like that, so default datamaps distribution in multithread happens based on number of entries in datamaps, for non-default datamps distribution is based on number of datamaps (one datamap is considered as one record for non-default datamaps) so, supported block pruning in multi-thread for non-default datamaps in this PR. This closes #2949
interface in carbon writer of C++ SDK support withTableProperty, withLoadOption,taskNo, uniqueIdentifier, withThreadSafe,withBlockSize, withBlockletSize, localDictionaryThreshold, enableLocalDictionary, sortBy in C++ SDK support withTableProperty, withLoadOption,taskNo, uniqueIdentifier, withThreadSafe, withBlockSize, withBlockletSize, localDictionaryThreshold, enableLocalDictionary,sortBy in C++ SDK This closes #2899
…s Empty What was the issue? For SDK Write , it was going into bad record by returning null on passing empty array for Complex Type. What has been changed? Added a check for empty array. This will return an empty array. This closes #2983
Problem When some pages is giving 0 rows, then also BlockletScanResult is uncompressing all the pages. When compression is high and one blocklet contains more number of pages in this case page uncompression is taking more time and impacting query performance. Solution Added check if number of record after filtering is 0 then no need to uncompress that page This closes #2985
What was the issue? After doing SDK Write, Select * was failing for 'long_string_columns' with trailing space. What has been changed? Removed the trailing space in ColumnName. This closes #2988
…ult sort_scope What are the changes proposed in this PR? This PR is based on the below discussion that is mentioned in the mail-chain. http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Make-no-sort-as-default-sort-scope-and-keep-sort-columns-as-empty-by-default-td70233.html changes are done to make 'no_sort' as default sort_scope and keep sort_columns as 'empty' by default. Also few issues are fixed in 'no_sort' and empty sort_columns flow. This closes #2966
SDV Testcases were failing because Delimiter and Complex Delimiter was same. So changed the Complex Delimiter in Load Option. This closes #3002
replace apache common log with carbondata log4j This closes #2999
Added datasource test cases for Spark File Format. This closes #2951
…bloom filter Problem java.lang.IllegalAccessError is thrown when query on bloom filter without compress on CarbonThriftServer. Analyse similar problem was occur when get/set BitSet in CarbonBloomFilter, it uses reflection to solve. We can do it like this. Since we have set the BitSet already, another easier way is to call super class method to avoid accessing it from CarbonBloomFilter Solution if bloom filter is not compressed, call super method to test membership This closes #3000
Problem: Global Dictionary was not working for Map datatype and giving Null values. Solution:Added the case for Global Dictionary to be created in case the datatype is Complex Map. This closes #3006
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1887/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1888/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10142/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2097/
[CARBONDATA-3127] The Hive module test case has been commented out and cannot run.
This pull request fixes the TestCarbonSerde test case exception under Maven compilation. The deserializeAndSerializeLazySimple method of the TestCarbonSerde class calls an interface named serializeStartKey, which does not exist in the CarbonHiveSerDe class. The fix replaces serializeStartKey with serialize and removes the comments around TestCarbonSerde.
The previous hive-integration HiveExample had an obvious bug: selecting from the HIVE_CARBON_EXAMPLE table returned no data. I have fixed that bug in this pull request. The location was configured incorrectly after creating the Hive table, and the block lacked its detail info and footer offset when querying the Hive table. Therefore I set the block's detail info, including the footer offset, through the input split.
No
No
No
Modify TestCarbonSerde test
No