[CARBONDATA-2760] Reduce Memory footprint and store size for local dictionary encoded columns #2529
@@ -431,6 +439,15 @@ private void ensureArraySize(int requestSize, DataType dataType) {
System.arraycopy(doubleData, 0, newArray, 0, arrayElementCount);
doubleData = newArray;
}
} else if (dataType == DataTypes.BYTE_ARRAY) {
Increasing the array size by 16 is too low; the capacity can be doubled instead, as in the ArrayList case.
Fixed
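To illustrate the suggestion above, here is a minimal sketch of growing a `byte[][]` store by doubling rather than by a fixed step of 16. The class `ByteArrayStore` and its methods are hypothetical stand-ins, not the actual CarbonData page code, and the starting capacity of 16 is an assumption.

```java
import java.util.Arrays;

// Hypothetical stand-in for a page's byte-array storage; not CarbonData code.
final class ByteArrayStore {
  // Assumed initial capacity, chosen only for illustration.
  private byte[][] data = new byte[16][];
  private int count = 0;

  // Grow geometrically: at least double, or more if the request needs it.
  void ensureCapacity(int required) {
    if (required > data.length) {
      int newLength = Math.max(data.length * 2, required);
      data = Arrays.copyOf(data, newLength);
    }
  }

  void add(byte[] value) {
    ensureCapacity(count + 1);
    data[count++] = value;
  }

  int capacity() {
    return data.length;
  }

  int size() {
    return count;
  }

  byte[] get(int index) {
    return data[index];
  }
}
```

With a fixed step of 16, adding n values triggers roughly n/16 reallocations, each copying the whole array; with doubling, the number of reallocations grows only logarithmically in n.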
@@ -201,12 +201,18 @@ public void putDouble(int rowId, double value) {
@Override
public void putBytes(int rowId, byte[] bytes) {
try {
ensureMemory(eachRowSize);
} catch (MemoryException e) {
MemoryException can be a runtime exception, so callers would not need a try/catch here.
OK, will handle this in a separate PR.
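As a rough sketch of what the reviewer's suggestion could look like, the example below uses an unchecked exception so that `putBytes` needs no try/catch around `ensureMemory`. `UncheckedMemoryException` and `PageSketch` are hypothetical names introduced only for illustration; the real `MemoryException` in CarbonData is not modified here.

```java
// Hypothetical unchecked variant; the real MemoryException is a different class.
final class UncheckedMemoryException extends RuntimeException {
  UncheckedMemoryException(String message) {
    super(message);
  }
}

// Hypothetical page with a fixed memory budget, for illustration only.
final class PageSketch {
  private final int capacity;
  private int used = 0;

  PageSketch(int capacity) {
    this.capacity = capacity;
  }

  // Throws an unchecked exception, so callers are not forced to catch it.
  void ensureMemory(int requestSize) {
    if (used + requestSize > capacity) {
      throw new UncheckedMemoryException(
          "requested " + requestSize + " bytes, only " + (capacity - used) + " left");
    }
  }

  // No try/catch is needed because the exception is unchecked.
  void putBytes(byte[] bytes) {
    ensureMemory(bytes.length);
    used += bytes.length;
  }

  int usedBytes() {
    return used;
  }
}
```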
byte[] data = new byte[totalLength];
int numberOfRows = getEndLoop();
int destOffset = 0;
for (int i = 0; i < numberOfRows; i++) {
Get the data directly as a single byte array instead of copying it row by row.
Fixed
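One possible reading of this comment, sketched under the assumption that the page already keeps its rows contiguously in one backing buffer: the per-row loop can then be replaced by a single bulk copy. `FlattenSketch` and both method names are hypothetical and not taken from the PR.

```java
import java.util.Arrays;

// Hypothetical helper contrasting a per-row copy with a single bulk copy.
final class FlattenSketch {
  // Per-row approach: copy each row into the destination at a running offset.
  static byte[] flattenPerRow(byte[][] rows, int totalLength) {
    byte[] data = new byte[totalLength];
    int destOffset = 0;
    for (byte[] row : rows) {
      System.arraycopy(row, 0, data, destOffset, row.length);
      destOffset += row.length;
    }
    return data;
  }

  // Bulk approach: assumes the rows are already stored back to back in
  // `backing`, so the first totalLength bytes can be copied in one call.
  static byte[] flattenBulk(byte[] backing, int totalLength) {
    return Arrays.copyOf(backing, totalLength);
  }
}
```

Both methods produce the same flattened array when the backing buffer holds the rows in order; the bulk version avoids the loop and the per-row bookkeeping.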
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6088/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7327/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7321/
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7332/
retest this please
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6097/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7338/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6102/
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5929/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5932/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7366/
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6127/
SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5940/
retest sdv please
SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5948/
LGTM
…ctionary encoded columns

Problem:
A local dictionary encoded page uses the unsafe variable-length column page, which internally maintains the offset of each value in another column page; because of this, the memory footprint is high.
For complex primitive string data type columns, the column page is converted to LV (length-value) format during compression even if it is encoded with dictionary values; because of this, the store size is high.

Solution:
Use the unsafe fixed-length column page for local dictionary encoded columns.
There is no need to convert to LV during query if a local dictionary is present, so use the unsafe fixed-length column page.

This closes #2529
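The memory argument in the commit message can be made concrete with a back-of-the-envelope sketch. It assumes a 4-byte offset per row in the variable-length layout and a fixed key size for dictionary-encoded values; both numbers are illustrative assumptions, and `PageFootprintSketch` is a hypothetical helper rather than CarbonData code.

```java
// Hypothetical arithmetic comparing the two page layouts; not CarbonData code.
final class PageFootprintSketch {
  // Assumption: the variable-length layout keeps one 4-byte offset per row.
  static final int ASSUMED_OFFSET_BYTES = 4;

  // Fixed-length layout: only the dictionary keys are stored.
  static long fixedLengthBytes(long rows, int keySize) {
    return rows * keySize;
  }

  // Variable-length layout: the keys plus one offset entry per row.
  static long varLengthBytes(long rows, int keySize) {
    return rows * keySize + rows * ASSUMED_OFFSET_BYTES;
  }
}
```

Under these assumptions, 1,000 rows with 3-byte keys take 3,000 bytes in the fixed-length layout and 7,000 bytes in the variable-length layout, so the offsets alone more than double the footprint.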
Why this PR?
Solution:
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Existing test cases cover this change. Tested on a 3-node setup with 135 million records.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.