RANGER-4390: creating another row batch causes null pointer exception #298

Merged
merged 1 commit into from
Mar 4, 2024

Conversation

fateh288
Contributor

Creating another row batch causes a null pointer exception: vectorizedRowBatchMap still holds references to the old batch, which gets garbage collected.

What changes were proposed in this pull request?

In log(Writer writer, Collection<AuthzAuditEvent> events), creating a new row batch causes an exception because the new batch invalidates the references held in vectorizedRowBatchMap.
Creating the new batch is also unnecessary, since a batch of the same size was already allocated in initORCAuditSchema(). Creating a batch larger than the ORC buffer size is not useful either, since the batch is written to the writer as soon as its size reaches the ORC buffer size:
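
The aliasing problem described above can be illustrated with a simplified model. This is only a sketch: `Batch` and `columnCache` are hypothetical stand-ins for the ORC row batch and vectorizedRowBatchMap, not the real classes, and the sketch shows only the stale references, not the exception itself.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleBatchReferenceSketch {
    /** Hypothetical stand-in for a row batch: one long[] column per field. */
    static final class Batch {
        final Map<String, long[]> cols = new HashMap<>();

        Batch(int capacity, String... fields) {
            for (String field : fields) {
                cols.put(field, new long[capacity]);
            }
        }
    }

    /**
     * Returns true when a value written through the cached column
     * does not show up in the batch that is currently live.
     */
    static boolean cachedColumnIsStale(boolean reallocate) {
        Batch batch = new Batch(5, "eventCount");
        // Cache built once from the first batch (assumed to mirror schema init).
        Map<String, long[]> columnCache = new HashMap<>(batch.cols);
        if (reallocate) {
            // A second batch is created; the cache still points at the first one.
            batch = new Batch(10, "eventCount");
        }
        columnCache.get("eventCount")[0] = 42L;
        return batch.cols.get("eventCount")[0] != 42L;
    }

    public static void main(String[] args) {
        System.out.println("reallocated: stale=" + cachedColumnIsStale(true));
        System.out.println("reused:      stale=" + cachedColumnIsStale(false));
    }
}
```

Reusing the batch that the cache was built from keeps both views consistent, which is the approach this change takes.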

if (batch.size == orcBufferSize) {
    writer.addRowBatch(batch);
    batch.reset();
}
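
To show why a batch never needs to be larger than the ORC buffer size, here is a minimal simulation of the flush check above. It models the batch as a plain counter rather than using the ORC API; `flushSizes` is a hypothetical helper, and the final flush of a partial batch after the loop is an assumption, not something stated in this pull request.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFlushSketch {
    /**
     * Simulates filling one reusable batch with eventCount rows and
     * returns the size of each flush, in order.
     */
    static List<Integer> flushSizes(int eventCount, int orcBufferSize) {
        List<Integer> flushed = new ArrayList<>();
        int batchSize = 0;
        for (int i = 0; i < eventCount; i++) {
            batchSize++; // one event appended to the batch
            if (batchSize == orcBufferSize) {
                flushed.add(batchSize); // stands in for writer.addRowBatch(batch)
                batchSize = 0;          // stands in for batch.reset()
            }
        }
        if (batchSize != 0) {
            // Assumed: any remainder is written once the loop ends.
            flushed.add(batchSize);
        }
        return flushed;
    }

    public static void main(String[] args) {
        // File spool buffer of 10 events, ORC buffer of 5 rows.
        System.out.println(flushSizes(10, 5));
    }
}
```

With 10 events and an ORC buffer size of 5, the batch is flushed twice at 5 rows each, so its capacity never has to exceed the ORC buffer size.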

How was this patch tested?

The unit test case TestAuditQueue.testAuditFileQueueSpoolORC failed when xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size and xasecure.audit.destination.hdfs.orc.buffersize were unequal. After the fix, the test case passes, and different values (e.g. 10 and 5 respectively, as shown below) can be used for these properties.

props = {
xasecure.audit.destination.hdfs.orc.buffersize=5,
xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=target/spool,
xasecure.audit.destination.hdfs.batch.queuetype=filequeue,
xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10,
xasecure.audit.destination.hdfs.batch.filequeue.filetype=orc,
xasecure.audit.is.enabled=true,
xasecure.audit.destination.hdfs.filename.format=%app-type%_ranger_audit.orc,
xasecure.audit.destination.hdfs=enable,
xasecure.audit.destination.hdfs.orc.stripesize=10,
xasecure.audit.destination.hdfs.dir=target/testAuditFileQueueSpoolORC,
xasecure.audit.destination.hdfs.orc.compression=none,
xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=5
}

… as vectorizedRowBatchMap has old batch references and gets garbage collected
@rameeshm rameeshm merged commit a7026dd into apache:master Mar 4, 2024
1 check passed
docherak pushed a commit to docherak/ranger that referenced this pull request Mar 14, 2024
… as vectorizedRowBatchMap has old batch references and gets garbage collected (apache#298)
mneethiraj pushed a commit that referenced this pull request Jun 20, 2024
… as vectorizedRowBatchMap has old batch references and gets garbage collected (#298)

(cherry picked from commit a7026dd)