[CARBONDATA-1400] Fix bug of array column out of bound when writing carbondata file #1273
Conversation
SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/265/
SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/268/
SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/273/
Force-pushed from 75bfdcc to 7f956b1
SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/310/
Force-pushed from aaefd63 to 174814f
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/677/
retest this please
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/680/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/684/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/685/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/686/
retest this please
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/688/
```java
ByteArrayOutputStream stream = new ByteArrayOutputStream();
DataOutputStream out = new DataOutputStream(stream);
for (byte[] byteArrayDatum : byteArrayData) {
  out.writeInt(byteArrayDatum.length);
```
Can't we use a short here?
This is for backward compatibility; the old code uses 4 bytes to store the length. I use the same format for both writing and reading the variable length column page.
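For readers following along, the 4-byte length-value (LV) layout being discussed can be sketched roughly as below. This is an illustrative re-implementation, not CarbonData's actual code; the class and method names are hypothetical, and the only assumption carried over from the thread is the 4-byte length prefix per datum that `DataOutputStream.writeInt` produces:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of a length-value (LV) encoding: each byte[] datum is written as a
// 4-byte length followed by its bytes. Keeping the same 4-byte prefix on both
// the write and read paths is what preserves backward compatibility.
class LvEncodingSketch {
  static byte[] encode(byte[][] data) {
    try {
      ByteArrayOutputStream stream = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(stream);
      for (byte[] datum : data) {
        out.writeInt(datum.length);  // 4-byte length prefix (not a short)
        out.write(datum);
      }
      out.flush();
      return stream.toByteArray();
    } catch (IOException e) {
      throw new RuntimeException(e);  // cannot happen for in-memory streams
    }
  }

  static byte[][] decode(byte[] encoded, int count) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
      byte[][] data = new byte[count][];
      for (int i = 0; i < count; i++) {
        int length = in.readInt();  // read back the same 4-byte length
        data[i] = new byte[length];
        in.readFully(data[i]);
      }
      return data;
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    byte[][] input = { "abc".getBytes(), "de".getBytes() };
    byte[] encoded = encode(input);
    byte[][] decoded = decode(encoded, 2);
    System.out.println(encoded.length + " " + new String(decoded[0]) + " " + new String(decoded[1]));
  }
}
```

Using a `short` prefix would halve the per-datum overhead but cap each value at 32 KB and break readers of existing files, which is the trade-off the comment above points at.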
@jackylk tests are failing in 1.6, please check
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/711/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/713/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/717/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/22/
```java
// before:
static ColumnPage newDecimalColumnPage(byte[] lvEncodedBytes, int scale, int precision)
    throws MemoryException {
// after:
static ColumnPage newDecimalColumnPage(TableSpec.ColumnSpec columnSpec, byte[] lvEncodedBytes,
    int scale, int precision) throws MemoryException {
```
Since columnSpec is passed in, can we remove scale and precision from these methods and get them from columnSpec?
I will try, but it is a temporary solution.
I think the correct way is to make DataType a class instead of an enum, and keep precision and scale in it.
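The idea floated above, making DataType a class rather than an enum so that a decimal type carries its own precision and scale, might look roughly like this. The class and method names here are hypothetical illustrations, not CarbonData's actual API:

```java
// Hypothetical sketch: DataType as a class rather than an enum, so a decimal
// type can carry its own precision and scale instead of threading both ints
// through every method signature that touches a decimal page.
class DataTypeSketch {
  private final String name;
  private final int precision;  // meaningful for DECIMAL only
  private final int scale;      // meaningful for DECIMAL only

  private DataTypeSketch(String name, int precision, int scale) {
    this.name = name;
    this.precision = precision;
    this.scale = scale;
  }

  // Fixed types can stay as shared singletons, enum-style.
  static final DataTypeSketch INT = new DataTypeSketch("INT", 0, 0);

  // Parameterized types get a factory carrying their parameters.
  static DataTypeSketch decimal(int precision, int scale) {
    return new DataTypeSketch("DECIMAL", precision, scale);
  }

  String getName() { return name; }
  int getPrecision() { return precision; }
  int getScale() { return scale; }
}
```

With something like this, a method such as `newDecimalColumnPage` could take only the column spec and read precision and scale from the spec's data type, shrinking the signature discussed above.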
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/724/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/98/
retest this please
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/99/
@jackylk it seems compilation fails after rebase to master. please check once
Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/4/
Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder1/5/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/103/
LGTM
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/725/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/731/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/730/
[CARBONDATA-1400] Fix bug of array column out of bound when writing carbondata file

If there is a big array in the input CSV file, loading the CarbonData table may throw an ArrayIndexOutOfBoundsException because the data exceeds the page size (32,000 rows).

This PR fixes it by changing the complex column encoding to DirectCompressionEncoding.

This PR adds a test case with big-array input data.

This closes apache#1273
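The failure mode the description covers can be illustrated with a toy sketch. This is not CarbonData code; the class, method, and constant names are hypothetical. The point is only that a complex (array) column is flattened element by element into a page, so a fixed-capacity page can overflow on a single input row containing a very large array:

```java
// Illustrative sketch of the bug's failure mode: flattening array elements
// into a page with a fixed capacity of 32,000 slots overflows as soon as the
// total element count of the loaded arrays exceeds the capacity.
class PageOverflowSketch {
  static final int PAGE_SIZE = 32000;  // fixed element capacity per page

  // Flattens every element of every array into one fixed page, returning the
  // number of slots used; throws ArrayIndexOutOfBoundsException on overflow.
  static int fillFixedPage(int[][] arrayColumn) {
    int[] page = new int[PAGE_SIZE];
    int used = 0;
    for (int[] array : arrayColumn) {
      for (int element : array) {
        page[used++] = element;  // throws once used reaches PAGE_SIZE
      }
    }
    return used;
  }

  public static void main(String[] args) {
    // A single "row" whose array alone exceeds the page capacity.
    int[][] bigRow = { new int[PAGE_SIZE + 1] };
    try {
      fillFixedPage(bigRow);
      System.out.println("no overflow");
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("overflow");
    }
  }
}
```

An encoding that is not tied to the fixed per-page row count, as the PR does by switching the complex column to DirectCompressionEncoding, avoids this bound.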