fix: discarding results do not attempt in-memory metacache writer #11163
Conversation
Of course, all of this was tested.
Force-pushed from 4326294 to 448a138
LGTM
LGTM & tested
Optimizations include:
- do not write the metacache block if the size of the block is '0' and it is the first block, where listing is attempted for a transient prefix; this helps avoid creating lots of empty metacache entries for `minioMetaBucket`
- avoid the entire initialization sequence of `cacheCh`, `metacacheBlockWriter` if we are simply going to skip them when `discardResults` is set to true
- no need to hold write locks while writing metacache blocks: each block is unique per bucket and per prefix, and is written by a single node
Force-pushed from 448a138 to ef79de5
Description
fix: discarding results do not attempt in-memory metacache writer
Motivation and Context
Optimizations include:
- do not write the metacache block if the size of the block is '0' and it is the first block, where listing is attempted for a transient prefix; this helps avoid creating lots of empty metacache entries for `minioMetaBucket`
- avoid the entire initialization sequence of `cacheCh`, `metacacheBlockWriter` if we are simply going to skip them when `discardResults` is set to true
- no need to hold write locks while writing metacache blocks: each block is unique per bucket and per prefix, and is written by a single node
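The optimizations above can be illustrated with a short Go sketch. The type and function names below (`metacacheBlock`, `writeBlock`, `listPath`) are hypothetical stand-ins, not MinIO's actual internals; the sketch only shows the control flow the bullet points describe: skip persisting an empty first block, and never construct the cache channel or block writer when results are being discarded.

```go
package main

import "fmt"

// metacacheBlock is a simplified stand-in for a serialized listing block
// (hypothetical shape; the real type lives in MinIO's cmd package).
type metacacheBlock struct {
	data []byte
	n    int // block index; 0 means first block
}

// writeBlock sketches the first optimization: a zero-sized first block
// (listing attempted against a transient prefix) is discarded rather than
// persisted, so no empty metacache entry is created for minioMetaBucket.
func writeBlock(b metacacheBlock) (written bool) {
	if len(b.data) == 0 && b.n == 0 {
		return false // nothing listed yet: skip the write entirely
	}
	// ... persist the block here; no write lock is needed because each
	// block is unique per bucket and prefix and written by a single node.
	return true
}

// listPath sketches the second optimization: when discardResults is set,
// the cache channel and block writer are never initialized at all.
func listPath(discardResults bool) string {
	if discardResults {
		return "listed without metacache writer"
	}
	// Normal path: set up cacheCh / metacacheBlockWriter, then list.
	return "listed with metacache writer"
}

func main() {
	fmt.Println(writeBlock(metacacheBlock{data: nil, n: 0}))          // false: empty first block skipped
	fmt.Println(writeBlock(metacacheBlock{data: []byte("x"), n: 0})) // true: non-empty block written
	fmt.Println(listPath(true))
}
```

Skipping the writer setup entirely (rather than setting it up and ignoring it) avoids allocating a channel and goroutine per discarded listing, which is where the gains under high per-prefix listing concurrency come from.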
How to test this PR?
Needs a large setup with lots of files under different prefixes; tested on a Packet setup, with major gains observed for per-prefix listing.
This PR addresses performance problems observed with Spark workloads under very high concurrency.