refactor(storage): use enum for object store impl #3427
It's not necessary to use an enum for the object store, because the time spent on IO and RPC will be much greater than the cost of dynamic dispatch through a vtable.
Codecov Report
@@ Coverage Diff @@
## main #3427 +/- ##
==========================================
- Coverage 74.41% 74.39% -0.02%
==========================================
Files 771 771
Lines 108925 108933 +8
==========================================
- Hits 81052 81039 -13
- Misses 27873 27894 +21
Performance is not the main reason for introducing an enum for the object store in this PR. The reason is given in the PR description.
Generally looks good. For @Little-Wallace 's concern, we can still add |
I just think it is not necessary to maintain complex code to optimize the performance of acquiring the object store. Every time we send a request to object storage, we switch threads, cross into kernel mode, and wait several milliseconds or even several hundred milliseconds (S3 is much slower than MinIO on NVMe). All of these cost far more than a dynamic vtable call.
The logic is greatly simplified. Thanks for the PR. LGTM!
Looks like the spot instance limit is hit in CI. You need to push an empty commit to rerun it.
* refactor(storage): use enum for object store impl
* avoid using async_trait in object store trait
* add async_trait back for object store and remove the local and remote enum
* empty commit to trigger CI rerun

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
Currently, our code adds a prefix to the object store path to indicate whether we are operating on the local or the remote object store. As a result, every object store implementation (S3ObjectStore, InMemObjectStore, DiskObjectStore ...) has to know internally whether it is a local object store or not, which makes the logic weird and messy.
In this PR, we use an enum to represent ObjectStoreImpl and explicitly restrict which object stores can be used as the remote or the local object store. The path check and routing are done in this higher-level enum, so the real object store implementations don't have to know whether they are local or not. Some macros are used to reduce duplicated code.
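The routing described above can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the code in this PR: the `@local:` prefix, the variant sets, the map-backed stub stores, and the `stub_store!` and `backend_enum!` macros are all assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

// Hypothetical marker for local paths; the real prefix is not shown in the PR text.
const LOCAL_PREFIX: &str = "@local:";

/// Simplified, synchronous stand-in for the object store trait.
trait ObjectStore {
    fn upload(&mut self, path: &str, data: Vec<u8>);
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

// Map-backed stubs so the sketch runs without real disk or S3 access.
macro_rules! stub_store {
    ($($name:ident),*) => {$(
        #[derive(Default)]
        struct $name(HashMap<String, Vec<u8>>);
        impl ObjectStore for $name {
            fn upload(&mut self, path: &str, data: Vec<u8>) {
                self.0.insert(path.to_string(), data);
            }
            fn read(&self, path: &str) -> Option<Vec<u8>> {
                self.0.get(path).cloned()
            }
        }
    )*};
}
stub_store!(InMemObjectStore, DiskObjectStore, S3ObjectStore);

// Declares an enum over the listed backends and forwards each trait call to the active one.
macro_rules! backend_enum {
    ($ty:ident { $($variant:ident($backend:ty)),* }) => {
        #[allow(dead_code)]
        enum $ty { $($variant($backend)),* }
        impl ObjectStore for $ty {
            fn upload(&mut self, path: &str, data: Vec<u8>) {
                match self { $($ty::$variant(s) => s.upload(path, data)),* }
            }
            fn read(&self, path: &str) -> Option<Vec<u8>> {
                match self { $($ty::$variant(s) => s.read(path)),* }
            }
        }
    };
}
// The variant lists are where "which store may be local or remote" gets restricted (assumed sets).
backend_enum!(LocalObjectStore { InMem(InMemObjectStore), Disk(DiskObjectStore) });
backend_enum!(RemoteObjectStore { InMem(InMemObjectStore), S3(S3ObjectStore) });

#[allow(dead_code)]
enum ObjectStoreImpl {
    /// Remote store only; local paths are rejected.
    Remote(RemoteObjectStore),
    /// Prefixed paths go to `local`, all others to `remote`.
    Hybrid { local: LocalObjectStore, remote: RemoteObjectStore },
}

impl ObjectStoreImpl {
    // The path check lives here, so the backends never see the prefix.
    fn upload(&mut self, path: &str, data: Vec<u8>) {
        match (self, path.strip_prefix(LOCAL_PREFIX)) {
            (Self::Hybrid { local, .. }, Some(p)) => local.upload(p, data),
            (Self::Hybrid { remote, .. }, None) | (Self::Remote(remote), None) => {
                remote.upload(path, data)
            }
            (Self::Remote(_), Some(_)) => panic!("local path without a local object store"),
        }
    }

    fn read(&self, path: &str) -> Option<Vec<u8>> {
        match (self, path.strip_prefix(LOCAL_PREFIX)) {
            (Self::Hybrid { local, .. }, Some(p)) => local.read(p),
            (Self::Hybrid { remote, .. }, None) | (Self::Remote(remote), None) => remote.read(path),
            (Self::Remote(_), Some(_)) => None,
        }
    }
}

/// Usage: writes one local and one remote object, then reads both back.
fn demo() -> (Option<Vec<u8>>, Option<Vec<u8>>) {
    let mut store = ObjectStoreImpl::Hybrid {
        local: LocalObjectStore::Disk(DiskObjectStore::default()),
        remote: RemoteObjectStore::S3(S3ObjectStore::default()),
    };
    store.upload("@local:cache/1.sst", b"local".to_vec());
    store.upload("data/1.sst", b"remote".to_vec());
    (store.read("@local:cache/1.sst"), store.read("data/1.sst"))
}
```

In this sketch the backends receive paths with the prefix already stripped, which is the point of moving the check into the outer enum.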
Another benefit of introducing the enum is that code using ObjectStoreImpl knows which type of object store it is working with, so it may call methods of specific object stores that are not part of the object store trait to get better performance in some scenarios.
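A minimal sketch of that idea, under stated assumptions: `read_ranges` is a hypothetical backend-specific method invented for this illustration (it is not an API from this PR), the stores are map-backed stand-ins, and `requests` only simulates round trips.

```rust
use std::cell::Cell;
use std::collections::HashMap;

type Objects = HashMap<String, Vec<u8>>;

/// Copies `len` bytes at `offset` out of a stored object, if the range exists.
fn slice(objects: &Objects, path: &str, offset: usize, len: usize) -> Option<Vec<u8>> {
    objects.get(path)?.get(offset..offset + len).map(|s| s.to_vec())
}

/// Simplified, synchronous stand-in for the object store trait.
trait ObjectStore {
    fn read(&self, path: &str, offset: usize, len: usize) -> Option<Vec<u8>>;
}

struct InMemObjectStore {
    objects: Objects,
}

impl ObjectStore for InMemObjectStore {
    fn read(&self, path: &str, offset: usize, len: usize) -> Option<Vec<u8>> {
        slice(&self.objects, path, offset, len)
    }
}

/// Stand-in for an S3-backed store; `requests` counts simulated round trips.
struct S3ObjectStore {
    objects: Objects,
    requests: Cell<usize>,
}

impl ObjectStore for S3ObjectStore {
    fn read(&self, path: &str, offset: usize, len: usize) -> Option<Vec<u8>> {
        self.requests.set(self.requests.get() + 1);
        slice(&self.objects, path, offset, len)
    }
}

impl S3ObjectStore {
    /// Hypothetical method outside the trait: serves several ranges in one simulated request.
    fn read_ranges(&self, path: &str, ranges: &[(usize, usize)]) -> Option<Vec<Vec<u8>>> {
        self.requests.set(self.requests.get() + 1);
        ranges.iter().map(|&(o, l)| slice(&self.objects, path, o, l)).collect()
    }
}

#[allow(dead_code)]
enum ObjectStoreImpl {
    InMem(InMemObjectStore),
    S3(S3ObjectStore),
}

impl ObjectStoreImpl {
    fn read_many(&self, path: &str, ranges: &[(usize, usize)]) -> Option<Vec<Vec<u8>>> {
        match self {
            // The concrete type is known here, so the specialized method is reachable.
            Self::S3(s3) => s3.read_ranges(path, ranges),
            // Other backends fall back to one trait call per range.
            Self::InMem(mem) => ranges.iter().map(|&(o, l)| mem.read(path, o, l)).collect(),
        }
    }
}

/// Usage: reads two ranges from the S3 stand-in and reports how many requests were simulated.
fn demo() -> (Option<Vec<Vec<u8>>>, usize) {
    let objects = Objects::from([("a.sst".to_string(), b"hello world".to_vec())]);
    let store = ObjectStoreImpl::S3(S3ObjectStore { objects, requests: Cell::new(0) });
    let parts = store.read_many("a.sst", &[(0, 5), (6, 5)]);
    let requests = match &store {
        ObjectStoreImpl::S3(s3) => s3.requests.get(),
        _ => 0,
    };
    (parts, requests)
}
```

With a trait object, the caller would only see the trait's methods and could not reach `read_ranges` without downcasting.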
After the enum is introduced, we can also avoid using async_trait in the object store, though the original cost of async_trait is negligible compared to external IO.
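As a rough illustration of why the enum makes async_trait optional: async_trait rewrites async trait methods to return boxed futures, whereas inherent async methods on concrete types, dispatched through a match, need neither the trait nor the box. The stores below are map-backed stand-ins, and `block_on` is a tiny demo-only executor written for this sketch, not something from the PR.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct InMemObjectStore {
    objects: HashMap<String, Vec<u8>>,
}

impl InMemObjectStore {
    // Inherent async fn: no trait involved, so no async_trait and no boxed future.
    async fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// Map-backed stand-in for a disk store, so the sketch needs no file system.
struct DiskObjectStore {
    files: HashMap<String, Vec<u8>>,
}

impl DiskObjectStore {
    async fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }
}

#[allow(dead_code)]
enum ObjectStoreImpl {
    InMem(InMemObjectStore),
    Disk(DiskObjectStore),
}

impl ObjectStoreImpl {
    // Static dispatch: each arm awaits a concrete future type.
    async fn read(&self, path: &str) -> Option<Vec<u8>> {
        match self {
            Self::InMem(s) => s.read(path).await,
            Self::Disk(s) => s.read(path).await,
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Demo-only executor; it busy-polls, which is acceptable because these futures never wait.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Usage: reads one object through the enum without any trait object.
fn demo() -> Option<Vec<u8>> {
    let objects = HashMap::from([("a.sst".to_string(), vec![1, 2, 3])]);
    let store = ObjectStoreImpl::InMem(InMemObjectStore { objects });
    block_on(store.read("a.sst"))
}
```

In real code an async runtime would drive these futures; the hand-rolled executor only keeps the example free of external crates.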
Checklist
./risedev check (or alias, ./risedev c)
Refer to a related PR or issue link (optional)