[CARBONDATA-1018] Add unsafe ColumnPage implementation #1000

Merged
merged 1 commit into from
Jun 19, 2017

jackylk
Contributor

@jackylk jackylk commented Jun 6, 2017

This PR is based on #987

This PR adds UnsafeColumnPage, which can reduce the memory requirement during data loading.
Before loading, users can set the following property to enable this feature:

    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.ENABLE_LOADING_UNSAFE_COLUMN_PAGE, "true")

@asfgit

asfgit commented Jun 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/105/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2232/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2237/

@asfgit

asfgit commented Jun 6, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/110/

@asfgit

asfgit commented Jun 8, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/177/

Failed Tests: 11

carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark: 2

carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 9

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2304/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2307/

@asfgit

asfgit commented Jun 8, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/180/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2308/

@asfgit

asfgit commented Jun 9, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/181/

@asfgit

asfgit commented Jun 9, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/199/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2322/

@jackylk
Contributor Author

jackylk commented Jun 9, 2017

apache:retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2324/

@asfgit

asfgit commented Jun 9, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/201/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2334/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2452/

@asfgit

asfgit commented Jun 13, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/333/

@@ -98,56 +117,54 @@ public static ColumnPage newPage(DataType dataType, int pageSize) {
default:
throw new RuntimeException("Unsupported data dataType: " + dataType);
}
instance.stats = new ColumnPageStatsVO(dataType);
instance.nullBitSet = new BitSet(pageSize);
return instance;
}

protected static ColumnPage newBytePage(byte[] byteData) {
ColumnPage columnPage = new ColumnPage(BYTE, byteData.length);
Contributor

Does BytePage not support UnsafeColumnPage?

Contributor Author

ok, I will fix

@@ -98,56 +117,54 @@ public static ColumnPage newPage(DataType dataType, int pageSize) {
default:
throw new RuntimeException("Unsupported data dataType: " + dataType);
}
instance.stats = new ColumnPageStatsVO(dataType);
instance.nullBitSet = new BitSet(pageSize);
return instance;
}

protected static ColumnPage newBytePage(byte[] byteData) {
ColumnPage columnPage = new ColumnPage(BYTE, byteData.length);
Contributor

Does Byte not support UnsafeColumnPage?

Contributor

If it is not supported, we should add a note explaining why.

Contributor Author

OK, it is supported; I will fix this.

ColumnPage columnPage = new ColumnPage(BYTE_ARRAY, stringData.length);
columnPage.byteArrayData = stringData;
private static ColumnPage newVarLengthPage(byte[][] byteArray) {
ColumnPage columnPage = new ColumnPage(BYTE_ARRAY, byteArray.length);
Contributor

Does ByteArray not support UnsafeColumnPage?

Contributor Author

It is supported; I will update this.
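
The three threads above ask whether the byte and byte-array factory methods also honor the unsafe setting. As a rough illustration of a factory that consults the flag for every type, here is a minimal sketch. All names (`BytePageSketch`, `HeapBytePage`, `DirectBytePage`, `BytePageFactorySketch`) are hypothetical, and a direct `ByteBuffer` stands in for CarbonData's unsafe memory; this is not the code from the PR.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a column page that stores byte values.
interface BytePageSketch {
  void putByte(int rowId, byte value);
  byte getByte(int rowId);
}

// On-heap variant backed by a byte[].
final class HeapBytePage implements BytePageSketch {
  private final byte[] data;
  HeapBytePage(int pageSize) { data = new byte[pageSize]; }
  public void putByte(int rowId, byte value) { data[rowId] = value; }
  public byte getByte(int rowId) { return data[rowId]; }
}

// Off-heap variant; a direct ByteBuffer is used here in place of unsafe memory.
final class DirectBytePage implements BytePageSketch {
  private final ByteBuffer data;
  DirectBytePage(int pageSize) { data = ByteBuffer.allocateDirect(pageSize); }
  public void putByte(int rowId, byte value) { data.put(rowId, value); }
  public byte getByte(int rowId) { return data.get(rowId); }
}

final class BytePageFactorySketch {
  // The factory checks the flag, so the byte type does not silently
  // fall back to the on-heap page when unsafe loading is enabled.
  static BytePageSketch newBytePage(byte[] byteData, boolean unsafeEnabled) {
    BytePageSketch page = unsafeEnabled
        ? new DirectBytePage(byteData.length)
        : new HeapBytePage(byteData.length);
    for (int i = 0; i < byteData.length; i++) {
      page.putByte(i, byteData[i]);
    }
    return page;
  }
}
```

The same dispatch would apply to the variable-length factory; only the page implementations differ.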

@@ -508,7 +567,7 @@ public static ColumnPage decompress(Compressor compressor, DataType dataType,
}

// input byte[] is LV encoded, this function can expand it into byte[][]
- private static byte[][] deflatten(byte[] input) {
+ protected static byte[][] deflatten(byte[] input) {
Contributor

Would it be better to move deflatten to ByteUtil, alongside flatten?

Contributor

Agree with erlu.

Contributor Author

ok
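
For readers unfamiliar with the LV (length-value) encoding mentioned in the code comment above, the following sketch shows a matching `flatten`/`deflatten` pair kept side by side in one utility class. The class name `LvCodecSketch` is hypothetical, and the 4-byte big-endian length prefix is an assumption made for illustration; the actual CarbonData implementation may differ.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class LvCodecSketch {
  // Assumed layout: each value is written as a 4-byte length followed by its bytes.
  static byte[] flatten(byte[][] rows) {
    int total = 0;
    for (byte[] row : rows) {
      total += Integer.BYTES + row.length;
    }
    ByteBuffer out = ByteBuffer.allocate(total);
    for (byte[] row : rows) {
      out.putInt(row.length);
      out.put(row);
    }
    return out.array();
  }

  // Inverse of flatten: read a length, then that many bytes, until the input ends.
  static byte[][] deflatten(byte[] input) {
    ByteBuffer in = ByteBuffer.wrap(input);
    List<byte[]> rows = new ArrayList<>();
    while (in.hasRemaining()) {
      int length = in.getInt();
      byte[] row = new byte[length];
      in.get(row);
      rows.add(row);
    }
    return rows.toArray(new byte[0][]);
  }
}
```

Keeping both directions in one place makes it easier to verify that they agree on the layout, which is the point of the suggestion above.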

private void ensureMemory(int requestSize) throws MemoryException {
if (totalLength + requestSize > capacity) {
memoryBlock = UnsafeMemoryManager.reallocateMemoryWithRetry(
memoryBlock, (long)((totalLength + requestSize) * FACTOR));
Contributor

capacity should also be updated to (totalLength + requestSize) * FACTOR after the reallocation.

Contributor Author

ok
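
To make the review point concrete, here is a minimal sketch of a grow-on-demand buffer that records the new capacity after reallocating. The class `GrowableBufferSketch` and the `FACTOR` value of 1.25 are assumptions for illustration, and a plain `byte[]` stands in for the unsafe memory block; it is not the PR's `UnsafeMemoryManager` code.

```java
import java.util.Arrays;

final class GrowableBufferSketch {
  private static final double FACTOR = 1.25; // assumed growth factor

  private byte[] block;      // stand-in for the unsafe memory block
  private int capacity;
  private int totalLength;

  GrowableBufferSketch(int initialCapacity) {
    block = new byte[initialCapacity];
    capacity = initialCapacity;
  }

  private void ensureMemory(int requestSize) {
    if (totalLength + requestSize > capacity) {
      int newCapacity = (int) ((totalLength + requestSize) * FACTOR);
      block = Arrays.copyOf(block, newCapacity);
      // Record the new size. Without this, the check above would keep
      // comparing against the old capacity and reallocate on every later call.
      capacity = newCapacity;
    }
  }

  void append(byte[] bytes) {
    ensureMemory(bytes.length);
    System.arraycopy(bytes, 0, block, totalLength, bytes.length);
    totalLength += bytes.length;
  }

  int capacity() { return capacity; }
  int length() { return totalLength; }
  byte byteAt(int index) { return block[index]; }
}
```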

CarbonUnsafe.unsafe.copyMemory(bytes, CarbonUnsafe.BYTE_ARRAY_OFFSET + offset,
baseAddress, baseOffset + rowOffset[rowId], length);
}

Contributor

The method public void putBytes(int rowId, byte[] bytes) also needs to be overridden here.

Contributor Author

@jackylk jackylk Jun 15, 2017

It is implemented in VarLengthColumnPageBase, which calls the abstract method putBytesAtRow.

Contributor

Yes, I see that. I mean we should do it in unsafe.
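
The arrangement described in this thread, where the base class implements `putBytes` and delegates storage to an abstract `putBytesAtRow`, can be sketched roughly as follows. The class names are hypothetical, rows are assumed to be written in order, and a fixed-size direct `ByteBuffer` stands in for unsafe memory; the real `VarLengthColumnPageBase` may be structured differently.

```java
import java.nio.ByteBuffer;

// Hypothetical base class: putBytes does the offset bookkeeping,
// subclasses decide where the bytes are physically stored.
abstract class VarLengthPageSketch {
  protected final int[] rowOffset;
  protected int totalLength;

  VarLengthPageSketch(int pageSize) {
    rowOffset = new int[pageSize + 1];
  }

  // Assumes rows are written in order: 0, 1, 2, ...
  public void putBytes(int rowId, byte[] bytes) {
    rowOffset[rowId] = totalLength;
    putBytesAtRow(rowId, bytes);
    totalLength += bytes.length;
    rowOffset[rowId + 1] = totalLength;
  }

  public byte[] getBytes(int rowId) {
    int length = rowOffset[rowId + 1] - rowOffset[rowId];
    return readBytes(rowOffset[rowId], length);
  }

  protected abstract void putBytesAtRow(int rowId, byte[] bytes);
  protected abstract byte[] readBytes(int offset, int length);
}

// Off-heap subclass: only the storage-specific methods are overridden.
final class DirectVarLengthPageSketch extends VarLengthPageSketch {
  private final ByteBuffer memory;

  DirectVarLengthPageSketch(int pageSize, int capacityInBytes) {
    super(pageSize);
    memory = ByteBuffer.allocateDirect(capacityInBytes);
  }

  @Override
  protected void putBytesAtRow(int rowId, byte[] bytes) {
    for (int i = 0; i < bytes.length; i++) {
      memory.put(rowOffset[rowId] + i, bytes[i]);
    }
  }

  @Override
  protected byte[] readBytes(int offset, int length) {
    byte[] out = new byte[length];
    for (int i = 0; i < length; i++) {
      out[i] = memory.get(offset + i);
    }
    return out;
  }
}
```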

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2491/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2494/

@asfgit

asfgit commented Jun 15, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/387/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2509/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2521/

@asfgit

asfgit commented Jun 15, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/406/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2526/

@asfgit

asfgit commented Jun 15, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/414/

@asfgit

asfgit commented Jun 15, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/415/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2528/

@jackylk
Contributor Author

jackylk commented Jun 16, 2017

retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2546/

@asfgit

asfgit commented Jun 16, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/438/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2551/

@jackylk
Contributor Author

jackylk commented Jun 17, 2017

retest this please

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2558/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2574/

@asfgit

asfgit commented Jun 19, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/468/

@ravipesala
Contributor

LGTM

@asfgit asfgit merged commit 7359601 into apache:master Jun 19, 2017