
[CARBONDATA-2927] multiple issue fixes for varchar column and complex columns, row that grows more than 2MB #2706

Closed
wants to merge 2 commits

Conversation

ajantha-bhat
Member

@ajantha-bhat ajantha-bhat commented Sep 10, 2018

1. When varchar data length is more than 2MB, a buffer overflow exception occurs (thread-local row buffer).
root cause: the thread-local buffer was hardcoded to 2MB.
solution: grow the buffer dynamically based on the row size.

2. Reading data from a carbon file that contains one row of varchar data of 150MB length is very slow.
root cause: in UnsafeDMStore, the ensure-memory step grows the block by only 8KB at a time, so a lot of malloc and free cycles happen before reaching 150MB, which makes performance very slow.
solution: check once and allocate the required size directly (see the sketch after this list).

3. JVM crash when data size is more than 128MB in the unsafe sort step.
root cause: UnsafeCarbonRowPage is 128MB, so if one row has more than 128MB of data we access the block beyond the allocated size, leading to a JVM crash.
solution: validate the size before access and prompt the user to increase unsafe memory (via a carbon property).
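For fix 2, here is a minimal sketch of the idea, with hypothetical class and field names (this is not the actual UnsafeDMStore code): instead of growing the backing storage in fixed 8KB steps and paying an allocate/copy/free cycle on every step, check once and jump straight to the required capacity.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical, not CarbonData code.
class GrowableStore {
  private byte[] data = new byte[8 * 1024];
  private int usedSize = 0;

  void ensureCapacity(int requiredSize) {
    if (data.length - usedSize >= requiredSize) {
      return; // enough room already, the common fast path
    }
    // Old behaviour: grow by 8KB per call, so a 150MB value needs thousands
    // of reallocations. New behaviour: allocate the full required size once.
    data = Arrays.copyOf(data, usedSize + requiredSize);
  }
}
```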

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

  • Any interfaces changed? NA

  • Any backward compatibility impacted? NA

  • Document update required? NA

  • Testing done
    done

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/213/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8452/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/382/

@ajantha-bhat ajantha-bhat changed the title [WIP] multiple issue fixes for varchar column and complex columns, row that grows more than 2MB [CARBONDATA-2927] multiple issue fixes for varchar column and complex columns, row that grows more than 2MB Sep 11, 2018
@ajantha-bhat
Member Author

@kumarvishal09 @ravipesala : please do an in-depth review of this PR; the impact is significant.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/237/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/406/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.3/8476/

@@ -200,7 +200,7 @@ public static MemoryBlock allocateMemoryWithRetry(long taskId, long size)
}
if (baseBlock == null) {
INSTANCE.printCurrentMemoryUsage();
throw new MemoryException("Not enough memory");
throw new MemoryException("Not enough memory, increase carbon.unsafe.working.memory.in.mb");
Contributor

I think you can optimize the error message to
Not enough unsafe working memory (total: , available: , request: )
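Such a message could be built roughly like this; a sketch only, where totalMemory, memoryUsed, and requestedSize are assumed names for whatever the surrounding allocator actually tracks:

```java
// Sketch only; parameter names are assumptions, not the actual fields.
static String notEnoughMemoryMessage(long totalMemory, long memoryUsed, long requestedSize) {
  return String.format(
      "Not enough unsafe working memory (total: %d bytes, available: %d bytes, requested: %d bytes). "
          + "Increase carbon.unsafe.working.memory.in.mb",
      totalMemory, totalMemory - memoryUsed, requestedSize);
}
```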

@@ -559,7 +572,13 @@ public int writeRawRowAsIntermediateSortTempRowToUnsafeMemory(Object[] row,
return size;
}


private void validateUnsafeMemoryBlockSizeLimit(long unsafeRemainingLength, int size)
Contributor

please improve the parameter name 'size' for better readability; it seems to represent the requested size
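With that rename, the validation might read along these lines; a sketch only, assuming the method simply guards the write into the unsafe block before it happens:

```java
// Sketch only; the real method body in this PR may differ.
private void validateUnsafeMemoryBlockSizeLimit(long unsafeRemainingLength, int requestedSize)
    throws MemoryException {
  if (requestedSize > unsafeRemainingLength) {
    // Writing past the allocated page would touch unallocated memory and can
    // crash the JVM, so fail fast and point the user to the carbon property.
    throw new MemoryException(
        "Not enough memory to hold this row, increase carbon.unsafe.working.memory.in.mb");
  }
}
```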

rowBuffer.putInt((int) row[this.dictNoSortDimIdx[idx]]);
}
// convert no-dict & no-sort
for (int idx = 0; idx < this.noDictNoSortDimCnt; idx++) {
byte[] bytes = (byte[]) row[this.noDictNoSortDimIdx[idx]];
// cannot exceed default 2MB, hence no need to call ensureArraySize
Contributor

for one column it may not exceed 2MB, but what if we have lots of no-sort no-dict columns?

Contributor

2 MB is not enough for varchar and complex columns.

rowBuffer.putShort((short) bytes.length);
rowBuffer.put(bytes);
}
// convert varchar dims
for (int idx = 0; idx < this.varcharDimCnt; idx++) {
byte[] bytes = (byte[]) row[this.varcharDimIdx[idx]];
// can exceed default 2MB, hence need to call ensureArraySize
rowBuffer = UnsafeSortDataRows
Contributor

Should we call this method per row per column?
Since in most scenarios 2MB per row is enough, will calling this method here cause a performance decrease?

@@ -598,26 +625,53 @@ private void packNoSortFieldsToBytes(Object[] row, ByteBuffer rowBuffer) {
tmpValue = row[this.measureIdx[idx]];
tmpDataType = this.dataTypes[idx];
if (null == tmpValue) {
// can exceed default 2MB, hence need to call ensureArraySize
rowBuffer = UnsafeSortDataRows
.ensureArraySize(1);
Contributor

bad indent, can be moved to previous line

@@ -59,12 +60,11 @@ public UnsafeCarbonRowPage(TableFieldStat tableFieldStat, MemoryBlock memoryBloc
this.taskId = taskId;
buffer = new IntPointerBuffer(this.taskId);
this.dataBlock = memoryBlock;
// TODO Only using 98% of space for safe side.May be we can have different logic.
sizeToBeUsed = dataBlock.size() - (dataBlock.size() * 5) / 100;
sizeToBeUsed = dataBlock.size();
Contributor

Is the old comment outdated? Have you ensured the 'safe side' it mentioned?

Contributor

Please keep this code back; it reserves memory for a row in case the row exceeds the expected size

@@ -72,7 +72,7 @@

private SortParameters parameters;
private TableFieldStat tableFieldStat;
private ThreadLocal<ByteBuffer> rowBuffer;
private static ThreadLocal<ByteBuffer> rowBuffer;
Contributor

I think the 'static' here may cause problems for concurrent loading. Each loading should have its own rowBuffer.

}
} catch (Exception e) {
LOGGER
Contributor

bad indent. We can move the message to the next line and keep the method call on this line.

@@ -326,6 +335,19 @@ private void startFileBasedMerge() throws InterruptedException {
dataSorterAndWriterExecutorService.awaitTermination(2, TimeUnit.DAYS);
}

public static ByteBuffer ensureArraySize(int requestSize) {
Contributor

please add a comment stating that this method is used to grow the rowBuffer during loading.

@@ -326,6 +335,19 @@ private void startFileBasedMerge() throws InterruptedException {
dataSorterAndWriterExecutorService.awaitTermination(2, TimeUnit.DAYS);
}

public static ByteBuffer ensureArraySize(int requestSize) {
Contributor

If we increase the rowBuffer at runtime, is there a way to decrease it? Or, if there is no need to do so, how long will this rowBuffer last?
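For context, a growth helper matching the signature in this diff could look roughly like the sketch below. In this sketch the java.nio.ByteBuffer is reallocated only when the request exceeds the remaining capacity, so the common sub-2MB path is a cheap check, and the enlarged buffer simply stays in the ThreadLocal for the rest of the task instead of being shrunk back (an assumption of the sketch, not something confirmed in this thread).

```java
// Sketch only; assumes rowBuffer is the ThreadLocal<ByteBuffer> used during loading.
public static ByteBuffer ensureArraySize(int requestSize) {
  ByteBuffer buffer = rowBuffer.get();
  if (buffer.remaining() < requestSize) {
    // Grow to fit the pending write, preserving what was already written.
    ByteBuffer newBuffer = ByteBuffer.allocate(buffer.position() + requestSize);
    buffer.flip();
    newBuffer.put(buffer);
    rowBuffer.set(newBuffer);
    return newBuffer;
  }
  return buffer;
}
```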

@xuchuanyin
Contributor

@ajantha-bhat
Hi, I think the main problem may be that you set the 'rowbuffer' as static which should not be shared among different data loadings.

Besides, checking whether to increase the rowBuffer size per row per column may decrease data loading performance.

As a result, I'd like to implement this in an easier way.

We can add a table property or load option for the size of the row buffer. Just keep the previous row-buffer related code as it is. All you need to do is change the initial size of the rowBuffer based on the table property or load option.

@kumarvishal09 @ravipesala What do you think?

}

static {
rowBuffer = new ThreadLocal<ByteBuffer>() {
Contributor

Please try using a DataOutputStream backed by a ByteArrayOutputStream; it can expand dynamically.
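For reference, the suggested approach would look roughly like this generic sketch (not code from this PR). The ByteArrayOutputStream backing the DataOutputStream grows on demand, so no fixed 2MB limit or manual resize logic is needed:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Generic sketch of the suggestion; names are illustrative, not CarbonData code.
class RowWriterSketch {
  private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
  private final DataOutputStream out = new DataOutputStream(bytes);

  void writeField(byte[] value) throws IOException {
    out.writeShort(value.length);  // length prefix, as the row format above does
    out.write(value);              // the backing array expands automatically
  }

  byte[] finishRow() {
    byte[] row = bytes.toByteArray();
    bytes.reset();  // reuse the same backing buffer for the next row
    return row;
  }
}
```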

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/307/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/483/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8553/

@@ -239,6 +239,7 @@ public void addDataToStore(CarbonRow row) throws CarbonDataWriterException {
* @return false if any varchar column page cannot add one more value(2MB)
*/
private boolean isVarcharColumnFull(CarbonRow row) {
//TODO: test and remove this as now UnsafeSortDataRows can exceed 2MB
Contributor

This is because a column page cannot exceed 2GB (actually 1.67GB, since snappy cannot compress a bigger size in one run), so there is no need to add a comment here

Member Author

A row with a complex column can also grow very big, so checking only for varchar is not good.
Also, rows can now grow beyond 2MB, so we need to modify this check.
This can be handled in a separate PR.

For now there is no impact from this method: if even 2MB of space is not available, more than 2MB will never be available either, so the functionality remains the same.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/308/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/484/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8554/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/309/

this.outputStream = outputStream;
}

public void resetByteArrayOutputStream() {
Contributor

Just change name to reset

Member Author

done

outputStream.reset();
}

public int getByteArrayOutputStreamSize() {
Contributor

Just change name to getSize

Member Author

done

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/312/

@CarbonDataQA

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8558/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/488/

@ajantha-bhat
Member Author

@ravipesala : PR is ready. Please check.

@ravipesala
Contributor

@xuchuanyin This 2MB limit is causing many issues in varchar and complex columns. We cannot let the user configure these internal limits. We should have a growable stream. Besides, we had better remove this bytebuffer and write directly to unsafe memory.

@@ -239,6 +239,7 @@ public void addDataToStore(CarbonRow row) throws CarbonDataWriterException {
* @return false if any varchar column page cannot add one more value(2MB)
*/
private boolean isVarcharColumnFull(CarbonRow row) {
//TODO: test and remove this as now UnsafeSortDataRows can exceed 2MB
Member

@kevinjmh kevinjmh Sep 18, 2018

@ajantha-bhat The original implementation uses 2MB to ensure that the next varchar column value can be filled safely, because the size of a single column value won't exceed the size of a row.
If UnsafeSortDataRows can exceed 2MB (growing dynamically), then we cannot check whether we have enough space for the next value, because we are not sure how much space the next value will take. So the column page size check should run before adding the row to dataRows.

Contributor

I am not sure how we came to the conclusion of 2MB. There is no guarantee that we always sort the data, so UnsafeSortDataRows is not always used. What about the no-sort case? And if the user wants to add a 100MB varchar value, how do we support it?
Also, this is not limited to varchar; we should consider complex and string columns here as well.
@ajantha-bhat Please remove that TODO, but we need to refactor the code to keep the page size within the snappy max compressed length for complex and string datatypes as well.

Member Author

@xuchuanyin @kevinjmh @ravipesala @kumarvishal09: As per the discussion, let us handle this with a configurable page size [from 1MB to 2GB (snappy max)], split the complex child pages here, and add validation for each column based on the row.

This will be analyzed further; I will open a discussion in the community and a separate PR will be raised for this.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/334/

@CarbonDataQA

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8581/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/511/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/337/

@ajantha-bhat
Member Author

retest this please

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/339/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/516/

@CarbonDataQA

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8586/

@ajantha-bhat
Member Author

@ravipesala : PR is ready, please review.

@ravipesala
Contributor

LGTM, I am merging this PR. @ajantha-bhat Please start another discussion in the forum to support big column data up to 2GB for complex, varchar and string columns. Also make the page size configurable in MB to avoid out-of-memory errors while reading.

@xuchuanyin
Contributor

LGTM

@asfgit asfgit closed this in d1bfb74 Sep 19, 2018
@xuchuanyin
Contributor

please take care of the loading performance compared with the previous nio buffer implementation.
