New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Account memory for OrcDeletedRows - fix memory accounting for ORC ACID delete delta #9914
Conversation
@@ -244,6 +251,10 @@ private RowId getRowId(int position) | |||
long row = BIGINT.getLong(page.getBlock(ROW_ID_INDEX), i); | |||
RowId rowId = new RowId(originalTransaction, bucket, statement, row); | |||
deletedRowsBuilder.add(rowId); | |||
rowCount++; | |||
if (rowCount % 1000 == 0) { | |||
systemMemoryUsage.setBytes(rowCount * RowId.INSTANCE_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sizeOfObjectArray(rowCount) + rowCount * RowId.INSTANCE_SIZE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth doing setting this as we loop vs just doing it once at the end?
Also, can we safely ignore the return value of setBytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of doing that in the loop (I think) is that we can pause some other query for a while, allowing us to get to the end of loop without OOM. Other than that it does not change much I think.
@findepi am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should assume that building deletedRows
list can exhaust node memory.
if we set memory usage only at the end, we still do not protect ourselves against node OOM.
Also, can we safely ignore the return value of setBytes?
we cannot block here (can we?).
we could use trySetBytes
here, perhaps.
plugin/trino-hive/src/main/java/io/trino/plugin/hive/orc/OrcDeletedRows.java
Outdated
Show resolved
Hide resolved
8a18ee9
to
40446fb
Compare
AC |
It does not look like. Maybe with a significant refactoring.
I was thinking about that. But then what if it returns that we cannot allocate. Throw exception ourselves? |
yes, we would throw then. cc @sopel39 for |
@@ -244,6 +252,10 @@ private RowId getRowId(int position) | |||
long row = BIGINT.getLong(page.getBlock(ROW_ID_INDEX), i); | |||
RowId rowId = new RowId(originalTransaction, bucket, statement, row); | |||
deletedRowsBuilder.add(rowId); | |||
rowCount++; | |||
if (rowCount % 1000 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropped as data is not passed to operator context immediately.
40446fb
to
2c5a097
Compare
deletedRows = deletedRowsBuilder.build(); | ||
// not updating memory usage in the loop, as deletedRows are built, as recorded information is propagated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra space before as
2c5a097
to
e3560ed
Compare
No description provided.